CN112364943A - Federal prediction method based on federal learning - Google Patents
Federal prediction method based on federal learning
- Publication number
- CN112364943A (application CN202011456395.8A)
- Authority
- CN
- China
- Prior art keywords
- training
- round
- neural network
- cluster
- local
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
- G06N20/00—Machine learning
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/08—Learning methods
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
Abstract
The invention discloses a federated prediction method based on federated learning. By unitizing the locally updated gradient vector, the method makes the parameter change of a single participant's updated neural network model differ only in direction, not in magnitude, thereby protecting data privacy without homomorphic encryption, differential privacy, or other cryptographic techniques, and greatly reducing the communication cost between devices and the server without loss of accuracy. In addition, because data often differ widely across participants in a federated learning scenario, the performance of local participants can be improved by weighting local information more heavily: the uploaded neural network model parameters are clustered using a k-means algorithm to find similar parameters, and the aggregation weight of similar parameters is increased, so the method better fits each participant's data scenario.
Description
Technical Field
The invention relates to the technical field of federated learning, and in particular to a federated prediction method based on federated learning.
Background
In most industries, data are not shared because of industry competition, privacy and security concerns, complex administrative procedures, and similar problems; even between different departments of the same company, centralized integration of data faces significant resistance. In reality, integrating data scattered across places and organizations is almost impossible, and the cost is enormous. At the same time, countries are strengthening the protection of data security and privacy: the European Union's recently introduced General Data Protection Regulation (GDPR) shows that tightening the management of user data privacy and security is a worldwide trend. To address the two problems of data silos and privacy security, Google proposed the federated learning framework in 2016. Its design goal is to carry out efficient machine learning among multiple participants or computing nodes while guaranteeing information security during big-data exchange, protecting the privacy of terminal data and personal data, and ensuring legal compliance.
In federated learning, multiple data owners form a federation and jointly participate in training a global model. On the basis of protecting data privacy and model parameters, the participants share only encrypted model parameters or encrypted intermediate results, never the original data, so the data remain usable yet invisible, and the jointly built model can achieve better performance. As laws and regulations on data security continue to mature, more and more companies and organizations are putting privacy and security on their agendas, and more and more researchers are devoting themselves to this field.
In federated learning, a matrix $D_i$ represents the data held by data owner i; each row of the matrix is a sample and each column a feature. Some data sets also contain label data: in the financial field a label may be a user's credit, in the marketing field the user's purchasing desire, and in the education field a student's degree. Denote the feature space by X, the label space by Y, and the sample-ID space by I, so that the features X, the labels Y, and the sample IDs I together form a complete training data set (I, X, Y). In existing federated learning, parameters are aggregated by plain federated averaging, and the model obtained this way has poor prediction performance. In addition, because the shared parameters must be encrypted, using homomorphic or other encryption technologies greatly increases the communication cost between the devices and the server, so communication efficiency is low.
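For reference, the plain federated averaging criticized here can be sketched in a few lines; the NumPy representation and the sample-count weighting are illustrative assumptions, not part of the patent.

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """Plain federated averaging: each client's flattened parameter vector
    is weighted by its local sample count and the results are summed."""
    stacked = np.stack(client_params)                 # (num_clients, num_params)
    weights = np.array(client_sizes) / sum(client_sizes)
    return weights @ stacked                          # weighted average

global_params = fedavg([np.ones(4), np.zeros(4)], client_sizes=[30, 10])
```

The method below replaces this uniform aggregation with a cluster-aware one.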
Disclosure of Invention
The invention aims to solve the problem of poor prediction performance in existing federated learning, and provides a federated prediction method based on federated learning.
To solve the above problem, the invention is realized by the following technical scheme:
A federated prediction method based on federated learning comprises the following steps:
Step 1, the server initializes the parameters of a neural network model and sends them to all participants; each participant takes the initialized parameters as the round-0 model parameters of its local neural network model.
Step 2, participants that meet the conditions send the server a request to join the t-th round of training, and the server selects cK participants as training participants for round t.
Step 3, each training participant of round t trains its local neural network model on the local training data set $D_k$, updating the round-(t-1) model parameters $w_k^{t-1}$ during training by stochastic gradient descent with unit gradient vectors, to obtain the round-t upload parameters $w_k^t$ of the local neural network model.
Step 4, all training participants of round t upload the round-t parameters $w_k^t$ of their local neural network models to the server.
Step 5, the server aggregates the uploaded round-t parameters $w_k^t$ based on a k-means clustering algorithm to obtain the round-t final parameters $\bar{w}_d^t$ of the discrete cluster and the round-t final parameters $\bar{w}_i^t$ of each cluster.
Step 6, the server sends each cluster's round-t final parameters $\bar{w}_i^t$ to the training participants corresponding to that cluster's round-t clustering-point parameters, and sends the discrete cluster's round-t final parameters $\bar{w}_d^t$ to the training participants corresponding to the round-t discrete-point parameters as well as to the other participants that did not take part in round t.
Step 7, judge whether the local neural network models of all participants have converged or the preset number of training rounds has been reached: if so, go to step 8; otherwise set t = t + 1 and return to step 2.
Step 8, each participant feeds its local test data into the local neural network model and uses the model to predict on that data.
The specific process of step 3 is as follows:
Step 3.1, training participant k trains the current local neural network model on the local training data set $D_k$, updating the round-(t-1) parameters $w_k^{t-1}$ during training by stochastic gradient descent with unit gradient vectors, to obtain the first round-t parameters $w_k^{t,1}$ of the local model.
Step 3.2, training participant k first selects part of the data from $D_k$ to form a partial local training data set $B_k$, then trains the current local model on $B_k$, updating the first round-t parameters $w_k^{t,1}$ by stochastic gradient descent with unit gradient vectors, to obtain the second round-t parameters $w_k^{t,2}$.
Step 3.3, training participant k trains the current local model on $D_k$ again, updating the second round-t parameters $w_k^{t,2}$ by stochastic gradient descent with unit gradient vectors, to obtain the round-t upload parameters $w_k^t$ of the local model.
The specific process of step 5 is as follows:
Step 5.1, the server clusters all round-t upload parameters $w_k^t$ into M clusters using the k-means clustering algorithm and computes the cluster-center coordinates of each cluster.
Step 5.2, from each cluster's round-t upload parameters $w_k^t$, the server selects those whose distance to the cluster's center coordinates lies outside a preset range as round-t discrete-point parameters, and groups the round-t discrete-point parameters selected from all clusters into a new discrete cluster; the round-t upload parameters remaining in each cluster are the round-t clustering-point parameters.
Step 5.3, the server selects a certain number of round-t discrete-point parameters from the discrete cluster and averages them to obtain the round-t intermediate parameters $\tilde{w}_d^t$ of the discrete cluster; meanwhile, it averages the round-t clustering-point parameters of each cluster to obtain the round-t intermediate parameters $\tilde{w}_i^t$ of each cluster.
Step 5.4, the server computes the round-t final parameters $\bar{w}_d^t$ of the discrete cluster and the round-t final parameters $\bar{w}_i^t$ of each cluster, wherein:
α is the weight of the current cluster i, β is the weight of the other clusters j, and γ is the weight of the discrete cluster, with α + β + γ = 1 and α > β > γ; $d_{ij}$ is the Euclidean distance between the cluster-center coordinates of the current cluster i and of another cluster j; i ∈ M, j ≠ i, M is the number of clusters; k ∈ cK, cK is the number of training participants; t is the training-round index.
Compared with the prior art, the invention has the following characteristics:
1. Because data often differ widely across participants in a federated learning scenario, the performance of local participants can be improved by weighting local information more heavily. The method clusters the uploaded neural network model parameters using the k-means algorithm to find similar parameters and increases their aggregation weight, so the aggregated model is better suited to each local participant's data scenario.
2. By unitizing the locally updated gradient vector, the method ensures that the parameter change of a single participant's neural network model carries only direction information and no magnitude information, thereby protecting data privacy without homomorphic encryption, differential privacy, or other cryptographic techniques, and greatly reducing the communication cost between devices and the server without loss of accuracy.
Drawings
FIG. 1 is a schematic diagram of a federated prediction method based on federated learning.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to specific examples.
For those skilled in the art of federated learning: before federated training can be performed, the federated learning participants and the server must be identified and a federated learning environment built. Before joining federated learning, a participant prepares the local training data set to be used for training and a local neural network model determined by the actual application scenario.
Taking keyboard input prediction as an example, the participants are thousands of mobile devices (mobile phones) participating in federated learning, and the server is Alibaba Cloud or Baidu Cloud. The local training data set consists of data from input-method (e.g., Gboard) users who opted to share text fragments typed in Google applications. Each fragment is truncated to a phrase of several words, and fragments are only occasionally logged from any individual user. Before training, the logs are anonymized and stripped of personally identifiable information, and a fragment is used for training only if it begins with a sentence marker. The local neural network model is a variant of the long short-term memory (LSTM) recurrent network called the coupled input-forget gate (CIFG), used as the next-word prediction model; tied input embedding and output projection matrices are used to reduce model size and speed up training. Given a vocabulary of size V, a one-hot encoding $v \in \mathbb{R}^V$ is mapped to a dense embedding vector $d \in \mathbb{R}^D$ by $d = Wv$, where $W \in \mathbb{R}^{D \times V}$ is the embedding matrix. The output projection of the CIFG, also in $\mathbb{R}^D$, is mapped to the output vector $W^T h \in \mathbb{R}^V$. A softmax over the output vector converts the raw logits to normalized probabilities, and the model is trained with the cross-entropy loss between the output and the target labels. All virtual-keyboard users type on mobile devices (phones), but different users' keyboard data differ.
Taking bank anti-money-laundering prediction as an example, the participants are banks participating in federated learning, and the server is Alibaba Cloud or Baidu Cloud. The local training data set is the bank's business data, with four fields: the customer id, the number x1 of inconsistencies between fund source and business scope, the number x2 of large transactions, and the label data Y indicating whether money laundering is involved. The local neural network model is a fully connected network with two layers of three nodes each and a softmax function; the activation function is ReLU. In the bank anti-money-laundering task, bank A and bank B are located in different regions; because their businesses are the same, their business data share the same feature space but cover different customers.
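A minimal sketch of such a local model is shown below; the description is read here as two hidden layers of three units each, and the use of PyTorch together with all variable names are illustrative assumptions rather than part of the patent.

```python
import torch
import torch.nn as nn

class AntiMoneyLaunderingNet(nn.Module):
    """Fully connected network: two hidden layers of three nodes (ReLU)
    and a softmax output over {normal, money laundering}."""
    def __init__(self, num_features=2, num_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_features, 3), nn.ReLU(),
            nn.Linear(3, 3), nn.ReLU(),
            nn.Linear(3, num_classes), nn.Softmax(dim=1),
        )

    def forward(self, x):
        return self.net(x)

model = AntiMoneyLaunderingNet()
probs = model(torch.tensor([[1.0, 4.0]]))  # x1, x2 for one customer (made-up values)
```

In practice the softmax would usually be folded into the loss during training; it is kept in the module here only to mirror the description.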
Referring to fig. 1, a federated prediction method based on federated learning includes the following steps:
Step 1, the server initializes the parameters of a neural network model and sends them to all K participants; each participant takes the initialized parameters as the round-0 model parameters $w_k^0$ of its local neural network model, where k ∈ K and K is the number of participants.
In the keyboard input prediction example, the round-0 initial model parameters are broadcast to all mobile devices (mobile phones) participating in federated training.
In the bank money-laundering prediction example, the round-0 initial model parameters are broadcast to all bank devices participating in federated training.
Step 2, participants that meet the conditions send the server a request to join the t-th round of training, and the server selects cK participants as training participants for round t.
Participants meeting a certain condition (the condition is determined by the task of the federated training plan) may send the server a request to join the current round of training. After receiving the requests, the server selects some of the participants for the round; participants not selected may send requests again after a period of time. The server weighs the number of participants against a timeout: the round of training succeeds only if enough devices can join the current round before the timeout.
Step 3, each training participant of round t trains its local neural network model on the local training data set $D_k$, updating the round-(t-1) model parameters $w_k^{t-1}$ during training by stochastic gradient descent with unit gradient vectors, to obtain the round-t upload parameters $w_k^t$ of the local neural network model, where k ∈ cK and cK is the number of training participants.
By unitizing the locally updated gradient vector, the parameter change of a single participant's neural network model carries only direction information and no magnitude information. Even if an attacker obtains the neural network model parameters, only the gradient unit vector is revealed, and the local data cannot be reconstructed from it, so data privacy is protected. The parameters need not be encrypted when uploaded, which improves the communication efficiency between the participants and the server.
Step 3.1, training participant k trains the current local neural network model on the local training data set $D_k$, updating the round-(t-1) parameters $w_k^{t-1}$ by stochastic gradient descent with unit gradient vectors, to obtain the first round-t parameters $w_k^{t,1}$ of the local model:
$$w_k^{t,1} = w_k^{t-1} - \eta\,\frac{g_k(w_k^{t-1}; x)}{\lVert g_k(w_k^{t-1}; x) \rVert}$$
This phase is trained once. In the formula, η is the learning rate and $g_k(w; x)$ is the model parameter gradient on sample x of participant k during round-t training.
Step 3.2, training participant k first selects part of the data from the local training data set $D_k$ to form a partial local training data set $B_k$, then trains the current local model on $B_k$, updating the first round-t parameters $w_k^{t,1}$ by stochastic gradient descent with unit gradient vectors, to obtain the second round-t parameters $w_k^{t,2}$:
$$w_k^{t,2} = w_k^{t,1} - \eta\,\frac{g_k(w_k^{t,1}; x)}{\lVert g_k(w_k^{t,1}; x) \rVert}, \quad x \in B_k$$
This phase may be trained multiple times. In the formula, η is the learning rate and $g_k(w; x)$ is the model parameter gradient on sample x of participant k during round-t training.
Step 3.3, training participant k trains the current local model on the local training data set $D_k$ again, updating the second round-t parameters $w_k^{t,2}$ by stochastic gradient descent with unit gradient vectors, to obtain the round-t upload parameters $w_k^t$ of the local model:
$$w_k^{t} = w_k^{t,2} - \eta\,\frac{g_k(w_k^{t,2}; x)}{\lVert g_k(w_k^{t,2}; x) \rVert}$$
This phase is trained once. In the formula, η is the learning rate and $g_k(w; x)$ is the model parameter gradient on sample x of participant k during round-t training.
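A minimal sketch of this three-phase, unit-gradient local update is given below. The gradient oracle `grad_fn`, the toy least-squares objective, and all names are illustrative assumptions; a real participant would backpropagate through its local neural network instead.

```python
import numpy as np

def unit_sgd_step(w, grad_fn, batch, lr):
    """One SGD step that keeps only the gradient's direction (unit vector)."""
    g = grad_fn(w, batch)
    return w - lr * g / (np.linalg.norm(g) + 1e-12)  # epsilon guards against a zero gradient

def local_update(w_prev, D_k, B_k, grad_fn, lr=0.01, inner_steps=5):
    """Steps 3.1-3.3: one step on D_k, several steps on the subset B_k,
    then one more step on D_k; returns the round-t upload parameters."""
    w = unit_sgd_step(w_prev, grad_fn, D_k, lr)   # step 3.1: trained once
    for _ in range(inner_steps):                  # step 3.2: trained multiple times
        w = unit_sgd_step(w, grad_fn, B_k, lr)
    return unit_sgd_step(w, grad_fn, D_k, lr)     # step 3.3: trained once

# Illustrative usage on a toy least-squares objective.
X, y = np.random.randn(32, 4), np.random.randn(32)
grad_fn = lambda w, batch: 2 * batch[0].T @ (batch[0] @ w - batch[1]) / len(batch[1])
w_upload = local_update(np.zeros(4), (X, y), (X[:8], y[:8]), grad_fn)
```

Because only the unit vector of the gradient enters each step, the magnitude of the local update reveals nothing about the local data, which is the privacy argument made above.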
In the keyboard input prediction example, the mobile devices participating in the current round of training feed the local training data set (the input-method users' data), including log data and logged text, into the local neural network model.
In the bank money-laundering prediction example, the bank devices participating in the current round of training feed the local training data set (the bank's business data) into the local neural network model; the set contains four fields: the customer id, the number x1 of inconsistencies between fund source and business scope, the number x2 of large transactions, and the label data Y indicating whether money laundering is involved.
Step 4, all training participants of round t upload the round-t parameters $w_k^t$ of their local neural network models to the server.
After training, the server waits for each participant to return its result. If enough participants return results before the timeout, the round of training succeeds; otherwise it fails. After a successful round, the server aggregates the results using the aggregation algorithm.
Step 5, the server aggregates the uploaded round-t parameters $w_k^t$ based on the k-means clustering algorithm to obtain the round-t final parameters $\bar{w}_i^t$ of each cluster and the round-t final parameters $\bar{w}_d^t$ of the discrete cluster.
Plain federated averaging averages the parameters directly, yet data in a federated learning scenario often differ widely, and the performance of local participants can be improved by weighting local information more heavily. The uploaded neural network model parameters are therefore clustered using the k-means algorithm to find similar parameters. During aggregation, the weight of the local cluster's parameters is increased for local data, so the result better fits each participant's data scenario and improves the performance of the neural network model.
Step 5.1, the server clusters all round-t upload parameters $w_k^t$ into M clusters using the k-means clustering algorithm and computes the cluster-center coordinates $c_i$ of each cluster, where i ∈ M and M is the number of clusters.
Step 5.2, from each cluster's round-t upload parameters $w_k^t$, the server selects those whose distance to the cluster's center coordinates lies outside a preset range as round-t discrete-point parameters, and groups the round-t discrete-point parameters selected from all clusters into a new discrete cluster; the round-t upload parameters remaining in each cluster are the round-t clustering-point parameters.
When the Euclidean distance between a round-t upload parameter $w_k^t$ and the cluster-center coordinates $c_i$ is within the preset range, i.e. $\lVert w_k^t - c_i \rVert_2 \le s_i$, the parameter is called a round-t clustering-point parameter (the corresponding training participant is a normal training participant of the cluster). When the Euclidean distance is outside the preset range, i.e. $\lVert w_k^t - c_i \rVert_2 > s_i$, the parameter is called a round-t discrete-point parameter (the corresponding training participant is an abnormal training participant of the cluster). Here $s_i$ is the standard deviation of the distances from all model parameters in cluster i to the cluster-center coordinates.
Step 5.3, the server selects Q round-t discrete-point parameters from the discrete cluster and averages them to obtain the round-t intermediate parameters $\tilde{w}_d^t$ of the discrete cluster; meanwhile, it selects Q parameters from each cluster's round-t clustering-point parameters and averages them to obtain the round-t intermediate parameters $\tilde{w}_i^t$ of each cluster.
Step 5.4, the server computes the round-t final parameters $\bar{w}_d^t$ of the discrete cluster and the round-t final parameters $\bar{w}_i^t$ of each cluster, wherein:
α is the weight of the current cluster i, β is the weight of the other clusters j, and γ is the weight of the discrete cluster, with α + β + γ = 1 and α > β > γ; $d_{ij}$ is the Euclidean distance between the cluster-center coordinates of the current cluster i and of another cluster j in round-t training.
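The aggregation rule of step 5.4 appeared as an image in the original publication and is not fully recoverable from the text. A plausible reconstruction consistent with the definitions above, assuming the other clusters' intermediate parameters enter with inverse-distance weights and the discrete cluster's final parameters equal its intermediate parameters, is:

```latex
\bar{w}_i^t \;=\; \alpha\,\tilde{w}_i^t
  \;+\; \beta \sum_{j \neq i} \frac{1/d_{ij}}{\sum_{j' \neq i} 1/d_{ij'}}\,\tilde{w}_j^t
  \;+\; \gamma\,\tilde{w}_d^t,
\qquad
\bar{w}_d^t \;=\; \tilde{w}_d^t .
```

Under the same assumptions, steps 5.1 through 5.4 can be sketched in Python; the scikit-learn k-means call and all names are illustrative, and the outlier threshold follows the $s_i$ rule of step 5.2:

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_aggregate(uploads, M=3, alpha=0.6, beta=0.3, gamma=0.1):
    """Sketch of steps 5.1-5.4; `uploads` has shape (num_participants, num_params).
    alpha > beta > gamma and alpha + beta + gamma = 1, as required by step 5.4."""
    km = KMeans(n_clusters=M, n_init=10).fit(uploads)          # step 5.1
    centers, labels = km.cluster_centers_, km.labels_

    # Step 5.2: mark a parameter as a discrete point when its distance to its
    # cluster center exceeds that cluster's standard deviation of distances.
    dist = np.linalg.norm(uploads - centers[labels], axis=1)
    s = np.array([dist[labels == i].std() for i in range(M)])
    discrete = dist > s[labels]

    # Step 5.3: intermediate parameters (here: mean over all retained points).
    w_d = uploads[discrete].mean(axis=0) if discrete.any() else uploads.mean(axis=0)
    w_tilde = []
    for i in range(M):
        pts = uploads[(labels == i) & ~discrete]
        w_tilde.append(pts.mean(axis=0) if len(pts) else centers[i])
    w_tilde = np.stack(w_tilde)

    # Step 5.4: final parameters; other clusters enter with inverse-distance
    # weights (an assumption, matching the reconstruction above).
    finals = []
    for i in range(M):
        d = np.linalg.norm(centers - centers[i], axis=1)
        inv = np.where(np.arange(M) == i, 0.0, 1.0 / np.maximum(d, 1e-12))
        others = (inv / inv.sum()) @ w_tilde
        finals.append(alpha * w_tilde[i] + beta * others + gamma * w_d)
    return np.stack(finals), w_d, labels, discrete

finals, w_d, labels, discrete = cluster_aggregate(np.random.randn(20, 8))
```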
Step 6, the server sends each cluster's round-t final parameters $\bar{w}_i^t$ to the training participants corresponding to that cluster's round-t clustering-point parameters (the normal training participants), and sends the discrete cluster's round-t final parameters $\bar{w}_d^t$ to the training participants corresponding to the round-t discrete-point parameters (the abnormal training participants) as well as to the other participants that did not take part in round t.
Step 7, judge whether the local neural network models of all participants have converged or the preset number of training rounds has been reached: if so, go to step 8; otherwise set t = t + 1 and return to step 2.
Model convergence means that the loss-function value of the local neural network model no longer changes, or that its change is smaller than a set threshold.
Step 8, each participant feeds its local test data into the local neural network model and uses the model to predict on that data.
In the keyboard input prediction example, each mobile device obtains a trained coupled input-forget gate neural network model. The device takes the user's real-time keyboard input as the model's input value; the output value is the predicted next input word, which is displayed on the virtual keyboard for the user to select.
In the bank money-laundering prediction example, each bank obtains a trained anti-money-laundering neural network model. Bank business data comprising three fields, the customer id, the number x1 of inconsistencies between fund source and business scope, and the number x2 of large transactions, are used as the model's input values, and the output value predicts whether the business data are suspected of involvement in money laundering.
It should be noted that, although the above-mentioned embodiments of the present invention are illustrative, the present invention is not limited thereto, and thus the present invention is not limited to the above-mentioned embodiments. Other embodiments, which can be made by those skilled in the art in light of the teachings of the present invention, are considered to be within the scope of the present invention without departing from its principles.
Claims (5)
1. A federated prediction method based on federated learning, characterized by comprising the following steps:
Step 1, the server initializes the parameters of a neural network model and sends them to all participants; each participant takes the initialized parameters as the round-0 model parameters of its local neural network model;
Step 2, participants that meet the conditions send the server a request to join the t-th round of training, and the server selects cK participants as training participants for round t;
Step 3, each training participant of round t trains its local neural network model on the local training data set $D_k$, updating the round-(t-1) model parameters $w_k^{t-1}$ during training by stochastic gradient descent with unit gradient vectors, to obtain the round-t upload parameters $w_k^t$ of the local neural network model;
Step 4, all training participants of round t upload the round-t parameters $w_k^t$ of their local neural network models to the server;
Step 5, the server aggregates the uploaded round-t parameters $w_k^t$ based on a k-means clustering algorithm to obtain the round-t final parameters $\bar{w}_d^t$ of the discrete cluster and the round-t final parameters $\bar{w}_i^t$ of each cluster;
Step 6, the server sends each cluster's round-t final parameters $\bar{w}_i^t$ to the training participants corresponding to that cluster's round-t clustering-point parameters, and sends the discrete cluster's round-t final parameters $\bar{w}_d^t$ to the training participants corresponding to the round-t discrete-point parameters as well as to the other participants that did not take part in round t;
Step 7, judge whether the local neural network models of all participants have converged or the preset number of training rounds has been reached: if so, go to step 8; otherwise add 1 to the training-round index t and return to step 2;
Step 8, each participant feeds its local test data into the local neural network model and uses the model to predict on that data;
wherein k ∈ cK, cK is the number of training participants, and t is the training-round index.
2. The federated prediction method based on federated learning according to claim 1, characterized in that the specific process of step 3 is as follows:
Step 3.1, training participant k trains the current local neural network model on the local training data set $D_k$, updating the round-(t-1) parameters $w_k^{t-1}$ by stochastic gradient descent with unit gradient vectors, to obtain the first round-t parameters $w_k^{t,1}$ of the local model;
Step 3.2, training participant k first selects part of the data from $D_k$ to form a partial local training data set $B_k$, then trains the current local model on $B_k$, updating the first round-t parameters $w_k^{t,1}$ by stochastic gradient descent with unit gradient vectors, to obtain the second round-t parameters $w_k^{t,2}$;
Step 3.3, training participant k trains the current local model on $D_k$ again, updating the second round-t parameters $w_k^{t,2}$ by stochastic gradient descent with unit gradient vectors, to obtain the round-t upload parameters $w_k^t$ of the local model;
wherein k ∈ cK, cK is the number of training participants, and t is the training-round index.
3. The federated prediction method according to claim 2, characterized in that
in step 3.1, the local neural network model is trained once on the local training data set $D_k$; in step 3.2, the local neural network model is trained multiple times on the partial local training data set $B_k$; in step 3.3, the local neural network model is trained once on the local training data set $D_k$;
wherein k ∈ cK and cK is the number of training participants.
4. The federated prediction method based on federated learning according to claim 1, characterized in that the specific process of step 5 is as follows:
Step 5.1, the server clusters all round-t upload parameters $w_k^t$ into M clusters using the k-means clustering algorithm and computes the cluster-center coordinates of each cluster;
Step 5.2, from each cluster's round-t upload parameters $w_k^t$, the server selects those whose distance to the cluster's center coordinates lies outside a preset range as round-t discrete-point parameters, and groups the round-t discrete-point parameters selected from all clusters into a new discrete cluster; the round-t upload parameters remaining in each cluster are the round-t clustering-point parameters;
Step 5.3, the server selects a certain number of round-t discrete-point parameters from the discrete cluster and averages them to obtain the round-t intermediate parameters $\tilde{w}_d^t$ of the discrete cluster; meanwhile, it averages the round-t clustering-point parameters of each cluster to obtain the round-t intermediate parameters $\tilde{w}_i^t$ of each cluster;
Step 5.4, the server computes the round-t final parameters $\bar{w}_d^t$ of the discrete cluster and the round-t final parameters $\bar{w}_i^t$ of each cluster, wherein:
α is the weight of the current cluster i, β is the weight of the other clusters j, and γ is the weight of the discrete cluster, with α + β + γ = 1 and α > β > γ; $d_{ij}$ is the Euclidean distance between the cluster-center coordinates of the current cluster i and of another cluster j; i ∈ M, j ≠ i, M is the number of clusters; k ∈ cK, cK is the number of training participants; t is the training-round index.
5. The federated prediction method according to claim 1, characterized in that in step 7, convergence of a participant's local neural network model means that the loss-function value of the local neural network model no longer changes, or that its change is smaller than a set threshold.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011456395.8A CN112364943B (en) | 2020-12-10 | 2020-12-10 | Federal prediction method based on federal learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112364943A (en) | 2021-02-12
CN112364943B CN112364943B (en) | 2022-04-22 |
Family
ID=74536164
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011456395.8A Active CN112364943B (en) | 2020-12-10 | 2020-12-10 | Federal prediction method based on federal learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112364943B (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110399742A (en) * | 2019-07-29 | 2019-11-01 | 深圳前海微众银行股份有限公司 | A kind of training, prediction technique and the device of federation's transfer learning model |
KR20190103088A (en) * | 2019-08-15 | 2019-09-04 | 엘지전자 주식회사 | Method and apparatus for recognizing a business card using federated learning |
CN111339212A (en) * | 2020-02-13 | 2020-06-26 | 深圳前海微众银行股份有限公司 | Sample clustering method, device, equipment and readable storage medium |
CN111860832A (en) * | 2020-07-01 | 2020-10-30 | 广州大学 | Method for enhancing neural network defense capacity based on federal learning |
Cited By (41)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113094407A (en) * | 2021-03-11 | 2021-07-09 | 广发证券股份有限公司 | Anti-money laundering identification method, device and system based on horizontal federal learning |
CN113094407B (en) * | 2021-03-11 | 2022-07-19 | 广发证券股份有限公司 | Anti-money laundering identification method, device and system based on horizontal federal learning |
CN113051557A (en) * | 2021-03-15 | 2021-06-29 | 河南科技大学 | Social network cross-platform malicious user detection method based on longitudinal federal learning |
CN113051557B (en) * | 2021-03-15 | 2022-11-11 | 河南科技大学 | Social network cross-platform malicious user detection method based on longitudinal federal learning |
CN113033819A (en) * | 2021-03-25 | 2021-06-25 | 支付宝(杭州)信息技术有限公司 | Heterogeneous model-based federated learning method, device and medium |
CN113033819B (en) * | 2021-03-25 | 2022-11-11 | 支付宝(杭州)信息技术有限公司 | Heterogeneous model-based federated learning method, device and medium |
CN114118180A (en) * | 2021-04-02 | 2022-03-01 | 京东科技控股股份有限公司 | Clustering method and device, electronic equipment and storage medium |
CN112799708B (en) * | 2021-04-07 | 2021-07-13 | 支付宝(杭州)信息技术有限公司 | Method and system for jointly updating business model |
CN112799708A (en) * | 2021-04-07 | 2021-05-14 | 支付宝(杭州)信息技术有限公司 | Method and system for jointly updating business model |
CN113139600A (en) * | 2021-04-23 | 2021-07-20 | 广东安恒电力科技有限公司 | Intelligent power grid equipment anomaly detection method and system based on federal learning |
WO2022226903A1 (en) * | 2021-04-29 | 2022-11-03 | 浙江大学 | Federated learning method for k-means clustering algorithm |
CN113344220A (en) * | 2021-06-18 | 2021-09-03 | 山东大学 | User screening method, system, equipment and storage medium based on local model gradient in federated learning |
CN113487351A (en) * | 2021-07-05 | 2021-10-08 | 哈尔滨工业大学(深圳) | Privacy protection advertisement click rate prediction method, device, server and storage medium |
CN113378243B (en) * | 2021-07-14 | 2023-09-29 | 南京信息工程大学 | Personalized federal learning method based on multi-head attention mechanism |
CN113378243A (en) * | 2021-07-14 | 2021-09-10 | 南京信息工程大学 | Personalized federal learning method based on multi-head attention mechanism |
CN113469373B (en) * | 2021-08-17 | 2023-06-30 | 北京神州新桥科技有限公司 | Model training method, system, equipment and storage medium based on federal learning |
CN113469373A (en) * | 2021-08-17 | 2021-10-01 | 北京神州新桥科技有限公司 | Model training method, system, equipment and storage medium based on federal learning |
CN113837399A (en) * | 2021-10-26 | 2021-12-24 | 医渡云(北京)技术有限公司 | Federal learning model training method, device, system, storage medium and equipment |
CN113837399B (en) * | 2021-10-26 | 2023-05-30 | 医渡云(北京)技术有限公司 | Training method, device, system, storage medium and equipment for federal learning model |
CN114077901B (en) * | 2021-11-23 | 2024-05-24 | 山东大学 | User position prediction method based on clustering graph federation learning |
CN114077901A (en) * | 2021-11-23 | 2022-02-22 | 山东大学 | User position prediction framework based on clustering and used for image federation learning |
CN114611722B (en) * | 2022-03-16 | 2024-05-24 | 中南民族大学 | Safe transverse federal learning method based on cluster analysis |
CN114611722A (en) * | 2022-03-16 | 2022-06-10 | 中南民族大学 | Safe horizontal federal learning method based on cluster analysis |
CN114925744A (en) * | 2022-04-14 | 2022-08-19 | 支付宝(杭州)信息技术有限公司 | Joint training method and device |
CN114900343B (en) * | 2022-04-25 | 2023-01-24 | 西安电子科技大学 | Internet of things equipment abnormal flow detection method based on clustered federal learning |
CN114900343A (en) * | 2022-04-25 | 2022-08-12 | 西安电子科技大学 | Internet of things equipment abnormal flow detection method based on clustered federal learning |
CN115018085B (en) * | 2022-05-23 | 2023-06-16 | 郑州大学 | Data heterogeneity-oriented federal learning participation equipment selection method |
CN115018085A (en) * | 2022-05-23 | 2022-09-06 | 郑州大学 | Data heterogeneity-oriented federated learning participation equipment selection method |
CN115061909B (en) * | 2022-06-15 | 2024-08-16 | 哈尔滨理工大学 | Heterogeneous software defect prediction algorithm research based on federal reinforcement learning |
CN115061909A (en) * | 2022-06-15 | 2022-09-16 | 哈尔滨理工大学 | Heterogeneous software defect prediction algorithm research based on federal reinforcement learning |
CN116522228B (en) * | 2023-04-28 | 2024-02-06 | 哈尔滨工程大学 | Radio frequency fingerprint identification method based on feature imitation federal learning |
CN116522228A (en) * | 2023-04-28 | 2023-08-01 | 哈尔滨工程大学 | Radio frequency fingerprint identification method based on feature imitation federal learning |
CN116502709A (en) * | 2023-06-26 | 2023-07-28 | 浙江大学滨江研究院 | Heterogeneous federal learning method and device |
CN117094410B (en) * | 2023-07-10 | 2024-02-13 | 西安电子科技大学 | Model repairing method for poisoning damage federal learning |
CN117094410A (en) * | 2023-07-10 | 2023-11-21 | 西安电子科技大学 | Model repairing method for poisoning damage federal learning |
CN116665319B (en) * | 2023-07-31 | 2023-11-24 | 华南理工大学 | Multi-mode biological feature recognition method based on federal learning |
CN116665319A (en) * | 2023-07-31 | 2023-08-29 | 华南理工大学 | Multi-mode biological feature recognition method based on federal learning |
CN116701972B (en) * | 2023-08-09 | 2023-11-24 | 腾讯科技(深圳)有限公司 | Service data processing method, device, equipment and medium |
CN116701972A (en) * | 2023-08-09 | 2023-09-05 | 腾讯科技(深圳)有限公司 | Service data processing method, device, equipment and medium |
CN118362309A (en) * | 2024-06-19 | 2024-07-19 | 石家庄铁道大学 | Rolling bearing fault diagnosis method based on dynamic cluster federal learning |
CN118362309B (en) * | 2024-06-19 | 2024-09-03 | 石家庄铁道大学 | Rolling bearing fault diagnosis method based on dynamic cluster federal learning |
Also Published As
Publication number | Publication date |
---|---|
CN112364943B (en) | 2022-04-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112364943B (en) | Federal prediction method based on federal learning | |
CN112949837B (en) | Target recognition federal deep learning method based on trusted network | |
WO2021179720A1 (en) | Federated-learning-based user data classification method and apparatus, and device and medium | |
CN112733967B (en) | Model training method, device, equipment and storage medium for federal learning | |
CN109657837A (en) | Default Probability prediction technique, device, computer equipment and storage medium | |
CN110796484B (en) | Method and device for constructing customer activity degree prediction model and application method thereof | |
CN114514519A (en) | Joint learning using heterogeneous model types and architectures | |
WO2023185539A1 (en) | Machine learning model training method, service data processing method, apparatuses, and systems | |
CN112116103B (en) | Personal qualification evaluation method, device and system based on federal learning and storage medium | |
CN114580663A (en) | Data non-independent same-distribution scene-oriented federal learning method and system | |
US20230419182A1 (en) | Methods and systems for imrpoving a product conversion rate based on federated learning and blockchain | |
CN112100642B (en) | Model training method and device for protecting privacy in distributed system | |
CN113377797A (en) | Method, device and system for jointly updating model | |
CN111860865A (en) | Model construction and analysis method, device, electronic equipment and medium | |
CN114282692A (en) | Model training method and system for longitudinal federal learning | |
CN114186694A (en) | Efficient, safe and low-communication longitudinal federal learning method | |
CN113360514A (en) | Method, device and system for jointly updating model | |
CN114362948B (en) | Federated derived feature logistic regression modeling method | |
CN115310625A (en) | Longitudinal federated learning reasoning attack defense method | |
CN116187469A (en) | Client member reasoning attack method based on federal distillation learning framework | |
CN115618008A (en) | Account state model construction method and device, computer equipment and storage medium | |
Mao et al. | A novel user membership leakage attack in collaborative deep learning | |
CN114491616A (en) | Block chain and homomorphic encryption-based federated learning method and application | |
CN112101609A (en) | Prediction system, method and device for timeliness of payment of user and electronic equipment | |
CN116452322A (en) | Credit card recommendation method and device |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |