CN112364943A - Federal prediction method based on federal learning - Google Patents

Federal prediction method based on federal learning

Info

Publication number: CN112364943A
Application number: CN202011456395.8A
Authority: CN (China)
Prior art keywords: training, round, neural network, cluster, local
Legal status: Granted
Other languages: Chinese (zh)
Other versions: CN112364943B
Inventors: 李先贤, 段锦欢, 王金艳
Assignee (original and current): Guangxi Normal University
Application filed by Guangxi Normal University on 2020-12-10; priority to CN202011456395.8A
Publication of CN112364943A: 2021-02-12; publication of CN112364943B (grant): 2022-04-22
Legal status: Active

Classifications

    • G06F18/23213 — Pattern recognition; clustering techniques; non-hierarchical techniques using statistics or function optimisation, with a fixed number of clusters, e.g. k-means clustering
    • G06N20/00 — Machine learning
    • G06N3/04 — Neural networks; architecture, e.g. interconnection topology
    • G06N3/08 — Neural networks; learning methods
    • G06Q10/04 — Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"


Abstract

The invention discloses a federated prediction method based on federated learning. By unitizing the locally updated gradient vector, the parameter change of the neural network model updated by a single participant differs only in direction, not in magnitude. This protects data privacy without homomorphic encryption, differential privacy, or other encryption techniques, and greatly reduces the communication cost between devices and the server without loss of accuracy. In addition, because data in federated learning scenarios often differ greatly across participants, the performance of local participants can be improved by increasing the weight of local information: the uploaded neural network model parameters are clustered with the k-means algorithm to find similar parameters, and the aggregation weight of similar parameters is increased, so that the aggregated model better fits each participant's data scenario.

Description

Federal prediction method based on federal learning
Technical Field
The invention relates to the technical field of federated learning, and in particular to a federated prediction method based on federated learning.
Background
In most industries, because of competition, privacy and security concerns, and complex administrative procedures, data are not shared; even among different departments of the same company, centralized integration of data meets serious resistance. In reality, integrating data scattered across places and organizations is nearly impossible, or prohibitively expensive. At the same time, countries are strengthening the protection of data security and privacy: the European Union's recently introduced General Data Protection Regulation (GDPR) shows that ever-stricter management of user data privacy and security is a worldwide trend. To address the two problems of data silos and privacy security, Google proposed the federated learning framework in 2016. Its design goal is to carry out efficient machine learning among multiple participants or computing nodes while guaranteeing information security during big-data exchange, protecting terminal data and personal privacy, and ensuring legal compliance.
In federated learning, multiple data owners form a federation and jointly participate in training a global model. To protect data privacy and model parameters, the participants share only encrypted model parameters or encrypted intermediate results, never the original data, so the data remain usable yet invisible, and the jointly built model can achieve better performance. As laws and regulations on data security mature, more and more companies and organizations are prioritizing privacy and security, and more and more researchers are working in this field.
In federated learning, a matrix $D_i$ denotes the data held by each data owner i; each row of the matrix is a sample and each column a feature. Some data sets also contain label data: in finance a label may be a user's credit, in marketing the user's purchasing desire, in education a student's degree. Denote the feature space by X, the label space by Y, and the sample-ID space by I, so that the features X, labels Y, and sample IDs I form a complete training data set (I, X, Y). In existing federated learning, parameter aggregation uses plain federated averaging, and the prediction performance of the resulting model is poor. Moreover, because the shared parameters must be encrypted, homomorphic or other encryption technologies greatly increase the communication cost between devices and the server, making communication inefficient.
Disclosure of Invention
The invention aims to solve the poor prediction performance of existing federated learning by providing a federated prediction method based on federated learning.
To solve this problem, the invention is realized by the following technical scheme:
A federated prediction method based on federated learning comprises the following steps:
step 1, the server initializes the parameters of a neural network model and sends them to all participants; all participants take the initialized parameters as the round-0 model parameters of their local neural network models;
step 2, participants meeting the conditions send the server a request to participate in the t-th round of training, and the server selects cK of them as training participants for the t-th round;
step 3, each training participant of the t-th round trains its local neural network model with its local training data set $D_k$, using stochastic gradient descent with gradient unit vectors to update the round-(t-1) model parameters $w_k^{t-1}$ of the local neural network model and obtain the round-t upload model parameters $w_k^{t}$;
Step 4, uploading model parameters of the t-th round of the local neural network model by all training participants of the t-th round of training
Figure BDA0002828727710000023
Uploading to a server;
step 5, the server aggregates the round-t upload model parameters $w_k^{t}$ based on a k-means clustering algorithm to obtain the discrete cluster's round-t final model parameter $\bar{w}_d^{t}$ and each cluster's round-t final model parameter $\bar{w}_i^{t}$;
Step 6, the server enables the t-th clustering cluster final model parameter of each clustering cluster
Figure BDA0002828727710000027
Respectively sending the parameters to training participants corresponding to the t-th round clustering point model parameters of the corresponding clustering clusters, and obtaining the final model parameters of the t-th round discrete clusters of the discrete clusters
Figure BDA0002828727710000028
Transmitting the training participator corresponding to the discrete point model parameter of the t-th round of the discrete cluster and other participators which do not participate in the t-th round;
step 7, judging whether the local neural network models of all participants have converged or the preset number of training rounds has been reached: if yes, go to step 8; otherwise let t = t + 1 and return to step 2;
step 8, each participant feeds its local test data into its local neural network model and uses the model to predict on that data.
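For orientation only, the following minimal Python sketch mirrors the round structure of steps 1 to 8. The helper names (initialize_parameters, meets_conditions, local_train, aggregate, and so on) are hypothetical stand-ins for the operations described above, not part of the claimed method.

```python
import numpy as np

def run_federated_prediction(server, participants, c, max_rounds):
    # Step 1: the server initializes and distributes the model parameters.
    w0 = server.initialize_parameters()
    for p in participants:
        p.model_params = np.copy(w0)   # round-0 local parameters

    for t in range(1, max_rounds + 1):
        # Step 2: eligible participants request to join; the server picks cK of them.
        requesters = [p for p in participants if p.meets_conditions()]
        trainers = server.select(requesters, int(c * len(participants)))

        # Steps 3-4: local training with unit-gradient SGD, then upload.
        uploads = {p.id: p.local_train(p.model_params) for p in trainers}

        # Step 5: k-means-based aggregation into per-cluster and discrete-cluster
        # final parameters; `assignment` maps normal trainers to their cluster.
        cluster_final, discrete_final, assignment = server.aggregate(uploads)

        # Step 6: distribute the final parameters.
        for p in participants:
            p.model_params = (cluster_final[assignment[p.id]]
                              if p.id in assignment else discrete_final)

        # Step 7: stop on convergence (or when max_rounds is reached).
        if all(p.converged() for p in participants):
            break

    # Step 8: each participant predicts on its own local test data.
    return [p.predict(p.local_test_data) for p in participants]
```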
The specific process of step 3 is as follows:
Step 3.1, training participant k trains the current local neural network model with the local training data set $D_k$, using stochastic gradient descent with gradient unit vectors to update the round-(t-1) model parameters $w_k^{t-1}$ and obtain the first round-t model parameters $w_{k,1}^{t}$ of the local neural network model.
Step 3.2, training participant k first selects part of the data from the local training data set $D_k$ to form a partial local training data set $B_k$, then trains the current local neural network model on $B_k$, using stochastic gradient descent with gradient unit vectors to update the first round-t model parameters $w_{k,1}^{t}$ and obtain the second round-t model parameters $w_{k,2}^{t}$.
Step 3.3, training participant k trains the current local neural network model with the local training data set $D_k$, using stochastic gradient descent with gradient unit vectors to update the second round-t model parameters $w_{k,2}^{t}$ and obtain the round-t upload model parameters $w_k^{t}$.
The specific process of step 5 is as follows:
Step 5.1, the server clusters all round-t upload model parameters $w_k^{t}$ into M clusters using the k-means clustering algorithm and computes each cluster's center coordinates;
Step 5.2, from each cluster's round-t upload model parameters $w_k^{t}$, the server selects those whose distance to the cluster's center coordinates is outside a preset range as round-t discrete point model parameters, and forms a new discrete cluster from the discrete points selected across all clusters; the remaining round-t upload model parameters of each cluster serve as round-t clustering point model parameters;
Step 5.3, the server selects a certain number of the discrete cluster's round-t discrete point model parameters and averages them to obtain the discrete cluster's round-t intermediate model parameter $\tilde{w}_d^{t}$; meanwhile, it averages the round-t clustering point model parameters of each cluster to obtain that cluster's round-t intermediate model parameter $\tilde{w}_i^{t}$;
Step 5.4, the server calculates the t-th round discrete cluster final model parameter of the discrete cluster
Figure BDA0002828727710000037
And the t round clustering final model parameter of each clustering cluster
Figure BDA0002828727710000038
Wherein:
Figure BDA0002828727710000039
Figure BDA00028287277100000310
the alpha is the weight of the current cluster i, beta is the weight of other clusters j, gamma is the weight of a discrete cluster, alpha + beta + gamma is 1, and alpha > beta > gamma;
Figure BDA00028287277100000311
the Euclidean distance between the cluster center coordinate of the current cluster i and the cluster center coordinates of other cluster j is obtained; i belongs to M, j is not equal to i, and M is the number of the clustering clusters; k belongs to cK, and the cK is the number of the training participants; t is the number of training rounds.
Compared with the prior art, the invention has the following features:
1. Because data in federated learning scenarios often differ greatly across participants, the performance of local participants can be improved by increasing the weight of local information. The method clusters the uploaded neural network model parameters with the k-means algorithm to find similar parameters and increases their aggregation weight, making the result better suited to each local participant's data scenario.
2. By unitizing the locally updated gradient vector, the method makes the parameter change of a single participant's neural network model differ only in direction, not in magnitude. This protects data privacy without homomorphic encryption, differential privacy, or other encryption techniques, and greatly reduces the communication cost between devices and the server without loss of accuracy.
Drawings
FIG. 1 is a schematic diagram of the federated prediction method based on federated learning.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to specific examples.
For those skilled in the art of federated learning, before federated training can be performed, the federated learning participants and server must be identified and a federated learning environment built. Before participating in federated learning, each participant prepares the local training data set to be trained and a local neural network model chosen according to the actual application scenario.
Taking keyboard input prediction as an example, the participants are thousands of mobile devices (mobile phones) participating in federated learning, and the server is Alibaba Cloud or Baidu Cloud. The local training data set consists of data from input-method (e.g., Gboard) users who opted to share text snippets typed in Google applications. Each snippet is truncated to a phrase of several words, and snippets are only occasionally logged from any individual user. Before training, the logs are anonymized and stripped of personally identifiable information, and a snippet is used for training only when it begins with a sentence marker. The local neural network model is a variant of the long short-term memory (LSTM) recurrent neural network called the coupled input-forget gate (CIFG), used as the next-word prediction model, with tied input-embedding and output-projection matrices to reduce model size and speed up training. Given a vocabulary of size V, an embedding matrix $W \in \mathbb{R}^{D \times V}$ maps a one-hot vector $v \in \mathbb{R}^{V}$ to a dense embedding $d \in \mathbb{R}^{D}$; the output projection of the CIFG maps the hidden state $h \in \mathbb{R}^{D}$ to an output vector $W^{T} h \in \mathbb{R}^{V}$. A softmax over the output vector converts the raw logits to normalized probabilities, and the model is trained with the cross-entropy loss between output and target labels. All virtual-keyboard users use mobile devices (mobile phones), but different users type different data on the keyboard.
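As a rough illustration of the tied embedding and output projection just described (the CIFG recurrence itself is omitted), consider the following numpy sketch; the sizes V and D and all weights are illustrative assumptions:

```python
import numpy as np

V, D = 10000, 96                      # vocabulary and embedding sizes (illustrative)
rng = np.random.default_rng(0)
W = rng.normal(0, 0.02, size=(D, V))  # shared matrix: columns are word embeddings

def embed(token_id):
    # one-hot v in R^V -> dense embedding d in R^D (a column lookup in W)
    return W[:, token_id]

def next_word_probs(h):
    # hidden state h in R^D -> probabilities over the vocabulary
    logits = W.T @ h                  # tied output projection W^T h in R^V
    logits -= logits.max()            # numerical stability
    p = np.exp(logits)
    return p / p.sum()                # softmax over the output vector

def cross_entropy(h, target_id):
    # training loss between output distribution and target label
    return -np.log(next_word_probs(h)[target_id])
```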
Taking bank money-laundering prediction as an example, the participants are banks participating in federated learning, and the server is Alibaba Cloud or Baidu Cloud. The local training data set is a bank's business data with four fields: client id, the number x1 of cases where the source of funds is inconsistent with the business scope, the number x2 of large transactions, and the label Y indicating whether money laundering occurred. The local neural network model is a fully connected neural network with two layers of three nodes and a softmax output, with relu as the activation function. In this bank anti-money-laundering task, bank A and bank B are in different regions; because their business is the same, their business data share the same feature space but cover different customers.
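A minimal numpy sketch of the described bank-side network — two fully connected layers of three nodes with relu and a softmax output — might look as follows; the weights shown are random placeholders, not trained values:

```python
import numpy as np

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)   # inputs: x1, x2
W2, b2 = rng.normal(size=(3, 3)), np.zeros(3)   # second layer of three nodes
W3, b3 = rng.normal(size=(2, 3)), np.zeros(2)   # two-class softmax output

def predict(x1, x2):
    h = np.maximum(W1 @ np.array([x1, x2]) + b1, 0.0)  # layer 1, relu
    h = np.maximum(W2 @ h + b2, 0.0)                   # layer 2, relu
    z = W3 @ h + b3
    p = np.exp(z - z.max())
    return p / p.sum()   # [P(money laundering), P(not money laundering)]
```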
Referring to FIG. 1, the federated prediction method based on federated learning includes the following steps:
Step 1, the server initializes the neural network model parameters and sends them to all K participants; every participant takes the initialized parameters as the round-0 model parameters $w_k^{0}$ of its local neural network model, where k ∈ K and K is the number of participants.
In the keyboard input prediction example, the round-0 original model parameters are broadcast to all mobile devices (mobile phones) participating in federated training.
In the bank money-laundering prediction example, the round-0 original model parameters are broadcast to all bank devices participating in federated training.
Step 2, participants meeting the conditions send the server a request to participate in the t-th round of training, and the server selects cK participants as training participants for the t-th round.
Participants that meet a certain condition (determined by the task of the federated training plan) may send the server a request to join the current round. After receiving the requests, the server selects a subset of the requesters to participate; participants not selected may send requests again after some time. The server takes the number of participants and a timeout into account: the round of training succeeds only if enough devices can join before the timeout.
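A minimal sketch of this admission logic, under the assumption of simple polling with a timeout (poll_requests, min_trainers, and timeout_s are hypothetical names for mechanics the text leaves unspecified):

```python
import time

def collect_round_participants(server, c, K, min_trainers, timeout_s):
    deadline = time.time() + timeout_s
    requests = []
    while time.time() < deadline and len(requests) < int(c * K):
        requests.extend(server.poll_requests())  # eligible participants asking to join
        time.sleep(0.1)                          # avoid a busy-wait
    if len(requests) < min_trainers:
        return None                              # too few devices: this round fails
    return requests[: int(c * K)]                # the cK training participants
```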
Step 3, each training participant of the t-th round trains its local neural network model with its local training data set $D_k$, using stochastic gradient descent with gradient unit vectors to update the round-(t-1) model parameters $w_k^{t-1}$ of the local neural network model and obtain the round-t upload model parameters $w_k^{t}$, where k ∈ cK and cK is the number of training participants.
By unitizing the locally updated gradient vector, the parameter change of a single participant's neural network model differs only in direction, not in magnitude. Even if an attacker obtains the neural network model parameters, only the gradient unit vector is revealed, and the local data cannot be inferred from it, so data privacy is protected. The parameters need not be encrypted when uploaded, which improves communication efficiency between the participants and the server.
Step 3.1, training participant k trains the current local neural network model with the local training data set $D_k$, using stochastic gradient descent with gradient unit vectors to update the round-(t-1) model parameters $w_k^{t-1}$ and obtain the first round-t model parameters $w_{k,1}^{t}$. This phase is trained once:

$$w_{k,1}^{t} = w_k^{t-1} - \eta \, \frac{\nabla \ell\big(w_k^{t-1}; x\big)}{\big\| \nabla \ell\big(w_k^{t-1}; x\big) \big\|}$$

where $\eta$ is the learning rate and $\nabla \ell(\cdot\,; x)$ is the model-parameter gradient on sample x of participant k in the t-th round of training.
Step 3.2, training participant k first selects part of the data from the local training data set $D_k$ to form a partial local training data set $B_k$, then trains the current local neural network model on $B_k$, using stochastic gradient descent with gradient unit vectors to update the first round-t model parameters $w_{k,1}^{t}$ and obtain the second round-t model parameters $w_{k,2}^{t}$. This phase may be trained multiple times:

$$w_{k,2}^{t} = w_{k,1}^{t} - \eta \, \frac{\nabla \ell\big(w_{k,1}^{t}; x\big)}{\big\| \nabla \ell\big(w_{k,1}^{t}; x\big) \big\|}$$

where $\eta$ is the learning rate and $\nabla \ell(\cdot\,; x)$ is the model-parameter gradient on sample x of participant k in the t-th round of training.
Step 3.3, training participant k trains the current local neural network model with the local training data set $D_k$, using stochastic gradient descent with gradient unit vectors to update the second round-t model parameters $w_{k,2}^{t}$ and obtain the round-t upload model parameters $w_k^{t}$. This phase is trained once:

$$w_{k}^{t} = w_{k,2}^{t} - \eta \, \frac{\nabla \ell\big(w_{k,2}^{t}; x\big)}{\big\| \nabla \ell\big(w_{k,2}^{t}; x\big) \big\|}$$

where $\eta$ is the learning rate and $\nabla \ell(\cdot\,; x)$ is the model-parameter gradient on sample x of participant k in the t-th round of training.
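Putting steps 3.1 to 3.3 together, a minimal sketch of the three-phase local update with gradient unit vectors might read as follows; grad is a stand-in for the model's gradient routine, and the 50% sampling used to form B_k is an assumption, since the selection rule is not specified above:

```python
import numpy as np

def unit_sgd_step(w, batch, grad, eta):
    # SGD where the update moves by the *unit* gradient vector, so local
    # updates differ in direction only, never in magnitude.
    g = grad(w, batch)
    return w - eta * g / (np.linalg.norm(g) + 1e-12)

def local_train(w_prev, D_k, grad, eta, inner_epochs, rng):
    w = unit_sgd_step(w_prev, D_k, grad, eta)       # step 3.1: once on all of D_k
    B_k = [x for x in D_k if rng.random() < 0.5]    # step 3.2: sampled subset (assumed 50%)
    for _ in range(inner_epochs):                   #           trained multiple times
        w = unit_sgd_step(w, B_k, grad, eta)
    return unit_sgd_step(w, D_k, grad, eta)         # step 3.3: once on all of D_k
```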
In the keyboard input prediction example, the mobile devices participating in the current round feed the local training data set (the input-method users' data), including the log data and log text, into the local neural network model.
In the bank money-laundering prediction example, the bank devices participating in the current round feed the local training data set (the bank's business data) into the local neural network model; the data comprise four fields: client id, the number x1 of cases where the source of funds is inconsistent with the business scope, the number x2 of large transactions, and the label Y indicating whether money laundering occurred.
Step 4, all training participants of the t-th round upload the round-t model parameters $w_k^{t}$ of their local neural network models to the server.
The server waits for each trained participant to return its result. If enough participants return results before the timeout, the round of training succeeds; otherwise it fails. After a successful round, the server aggregates with the aggregation algorithm.
Step 5, the server aggregates the round-t upload model parameters $w_k^{t}$ based on the k-means clustering algorithm to obtain each cluster's round-t final model parameter $\bar{w}_i^{t}$ and the discrete cluster's round-t final model parameter $\bar{w}_d^{t}$.
Federated averaging directly averages the parameters, yet data in federated learning scenarios often differ greatly across participants, and the performance of local participants can be improved by increasing the weight of local information. The uploaded neural network model parameters are therefore clustered with the k-means algorithm to find similar parameters, and during aggregation the weight of the local cluster's parameters is increased for local data, which better fits each participant's data scenario and improves the performance of the neural network model.
Step 5.1, the server clusters all round-t upload model parameters $w_k^{t}$ into M clusters using the k-means clustering algorithm and computes each cluster's center coordinates $c_i^{t}$, where i ∈ M and M is the number of clusters.
Step 5.2, from each cluster's round-t upload model parameters $w_k^{t}$, the server selects those whose distance to the cluster's center coordinates is outside the preset range as round-t discrete point model parameters, and forms a new discrete cluster from the discrete points selected across all clusters; the remaining round-t upload model parameters of each cluster serve as round-t clustering point model parameters.
A round-t upload model parameter $w_k^{t}$ whose Euclidean distance to the cluster-center coordinate $c_i^{t}$ is within the preset range, i.e. at most a threshold determined by $s_i$, is called a round-t clustering point model parameter (its training participant is a normal training participant of the cluster). A round-t upload model parameter $w_k^{t}$ whose Euclidean distance to the cluster-center coordinate exceeds the preset range is called a round-t discrete point model parameter (its training participant is an abnormal training participant of the cluster). Here $s_i$ is the standard deviation of the distances from all model parameters in cluster i to the cluster-center coordinate.
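A minimal numpy sketch of steps 5.1 and 5.2 — k-means over the flattened uploads followed by the discrete-point split — is given below; the threshold multiple tau applied to s_i is an assumption, since the preset range is not fixed above:

```python
import numpy as np

def kmeans(X, M, iters=50, seed=0):
    # X: (n, p) array, one flattened upload parameter vector per row.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), M, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)                # nearest center per upload
        for i in range(M):
            if np.any(labels == i):
                centers[i] = X[labels == i].mean(axis=0)
    return centers, labels

def split_discrete(X, centers, labels, tau=2.0):
    # True -> clustering point; False -> discrete point (joins the discrete cluster).
    dist = np.linalg.norm(X - centers[labels], axis=1)
    keep = np.ones(len(X), dtype=bool)
    for i in range(len(centers)):
        in_i = labels == i
        s_i = dist[in_i].std()                   # std of distances to center in cluster i
        keep[in_i] = dist[in_i] <= tau * s_i     # tau is an assumed threshold multiple
    return keep
```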
Step 5.3, the server selects Q of the discrete cluster's round-t discrete point model parameters and averages them to obtain the discrete cluster's round-t intermediate model parameter $\tilde{w}_d^{t}$; meanwhile, it averages Q of each cluster's round-t clustering point model parameters to obtain each cluster's round-t intermediate model parameter $\tilde{w}_i^{t}$:

$$\tilde{w}_d^{t} = \frac{1}{Q} \sum_{q=1}^{Q} w_{d,q}^{t}, \qquad \tilde{w}_i^{t} = \frac{1}{Q} \sum_{q=1}^{Q} w_{i,q}^{t}$$

where $w_{d,q}^{t}$ and $w_{i,q}^{t}$ denote the selected round-t discrete point and clustering point model parameters, respectively.
Step 5.4, the server computes the discrete cluster's round-t final model parameter $\bar{w}_d^{t}$ and each cluster's round-t final model parameter $\bar{w}_i^{t}$ as weighted combinations of the intermediate model parameters: the current cluster i carries weight α, the other clusters j together carry weight β (apportioned among them according to $d_{i,j}^{t}$, the Euclidean distance between the cluster-center coordinates of the current cluster i and of the other clusters j in the t-th round of training), and the discrete cluster carries weight γ, with α + β + γ = 1 and α > β > γ.
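Since the exact aggregation formulas are not fully reproduced above, the following sketch implements one plausible reading of step 5.4: β's share is apportioned over the other clusters in inverse proportion to the center distance $d_{i,j}^{t}$, and the discrete cluster's final parameter mixes its own intermediate with the cluster intermediates. This is an assumption for illustration, not the patented formula:

```python
import numpy as np

def aggregate(cluster_mids, discrete_mid, centers, alpha=0.6, beta=0.3, gamma=0.1):
    # cluster_mids: per-cluster intermediate parameters; discrete_mid: the
    # discrete cluster's intermediate; centers: cluster-center coordinates.
    # Requires alpha + beta + gamma == 1, alpha > beta > gamma, and M >= 2.
    M = len(cluster_mids)
    finals = []
    for i in range(M):
        d = np.array([np.linalg.norm(centers[i] - centers[j])
                      for j in range(M) if j != i])
        others = [cluster_mids[j] for j in range(M) if j != i]
        lam = 1.0 / (d + 1e-12)
        lam /= lam.sum()                         # inverse-distance weights, sum to 1
        mix = sum(l * w for l, w in zip(lam, others))
        finals.append(alpha * cluster_mids[i] + beta * mix + gamma * discrete_mid)
    # Discrete-cluster final: weight gamma on its own mean, rest spread evenly
    # over the cluster intermediates (an assumed, symmetric choice).
    discrete_final = gamma * discrete_mid + (1 - gamma) * np.mean(cluster_mids, axis=0)
    return finals, discrete_final
```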
Step 6, the server sends each cluster's round-t final model parameter $\bar{w}_i^{t}$ to the training participants (normal training participants) corresponding to that cluster's round-t clustering point model parameters, and sends the discrete cluster's round-t final model parameter $\bar{w}_d^{t}$ to the training participants (abnormal training participants) corresponding to the round-t discrete point model parameters, as well as to the other participants that did not take part in round t.
Step 7, judging whether the local neural network models of all participants have converged or the preset number of training rounds has been reached: if yes, go to step 8; otherwise let t = t + 1 and return to step 2.
Model convergence means that the loss function value of the local neural network model no longer changes, or that its change is smaller than a set threshold.
Step 8, each participant feeds its local test data into its local neural network model and uses the model to predict on that data.
In the keyboard input prediction example, each mobile device obtains a trained coupled input-forget gate neural network model. The device feeds the user's real-time keyboard input characters into the model; the output is the predicted next input word, which is displayed on the virtual keyboard for the user to select.
In the bank money-laundering prediction example, each bank obtains a trained anti-money-laundering neural network model. Business data consisting of three fields (client id, the number x1 of cases where the source of funds is inconsistent with the business scope, and the number x2 of large transactions) are fed in as input, and the output predicts whether the business data are suspected of involving money laundering.
It should be noted that although the above-described embodiments of the present invention are illustrative, the invention is not limited to them. Other embodiments made by those skilled in the art in light of the teachings of the invention, without departing from its principles, are considered to be within the scope of the invention.

Claims (5)

1. A federated prediction method based on federated learning, characterized by comprising the following steps:
step 1, the server initializes the parameters of a neural network model and sends them to all participants; all participants take the initialized parameters as the round-0 model parameters of their local neural network models;
step 2, participants meeting the conditions send the server a request to participate in the t-th round of training, and the server selects cK of them as training participants for the t-th round;
step 3, each training participant of the t-th round trains its local neural network model with its local training data set $D_k$, using stochastic gradient descent with gradient unit vectors to update the round-(t-1) model parameters $w_k^{t-1}$ of the local neural network model and obtain the round-t upload model parameters $w_k^{t}$;
Step 4, uploading model parameters of the t-th round of the local neural network model by all training participants of the t-th round of training
Figure FDA0002828727700000013
Uploading to a server;
step 5, the server aggregates the round-t upload model parameters $w_k^{t}$ based on a k-means clustering algorithm to obtain the discrete cluster's round-t final model parameter $\bar{w}_d^{t}$ and each cluster's round-t final model parameter $\bar{w}_i^{t}$;
Step 6, the server enables the t-th clustering cluster final model parameter of each clustering cluster
Figure FDA0002828727700000017
Are sent to the phases respectivelyTraining participants corresponding to the t-th round clustering point model parameters of the cluster to be clustered, and final model parameters of the t-th round discrete clusters of the discrete clusters
Figure FDA0002828727700000018
Transmitting the training participator corresponding to the discrete point model parameter of the t-th round of the discrete cluster and other participators which do not participate in the t-th round;
step 7, judging whether the local neural network models of all participants have converged or the preset number of training rounds has been reached: if yes, go to step 8; otherwise add 1 to the training round number t and return to step 2;
step 8, each participant feeds its local test data into its local neural network model and uses the model to predict on that data;
where k ∈ cK, cK is the number of training participants, and t is the training round number.
2. The federated prediction method based on federated learning of claim 1, characterized in that the specific process of step 3 is as follows:
step 3.1, training participant k trains the current local neural network model with the local training data set $D_k$, using stochastic gradient descent with gradient unit vectors to update the round-(t-1) model parameters $w_k^{t-1}$ and obtain the first round-t model parameters $w_{k,1}^{t}$ of the local neural network model;
Step 3.2, training participant k first from local training data set DkSelecting a part of data to form a part of local training data set BkReuse part of the local training data set BkTraining the current local neural network model, and applying a stochastic gradient descent method to make a connection in the training processOver-gradient unit vector for updating first model parameters of t-th round of local neural network model
Figure FDA00028287277000000111
Obtaining the second model parameter of the t-th round of the local neural network model
Figure FDA0002828727700000021
Step 3.3, training participant k utilizes local training dataset DkTraining the current local neural network model, and updating the second model parameter of the t-th round of the local neural network model by using a random gradient descent method and gradient unit vectors in the training process
Figure FDA0002828727700000022
Obtaining the t round uploading model parameter of the local neural network model
Figure FDA0002828727700000023
The k belongs to cK, and the cK is the number of training participants; t is the number of training rounds.
3. The federated prediction method of claim 2, characterized in that in step 3.1 the local neural network model is trained once with the local training data set $D_k$; in step 3.2 it is trained multiple times with the partial local training data set $B_k$; and in step 3.3 it is trained once with the local training data set $D_k$; where k ∈ cK and cK is the number of training participants.
4. The federated prediction method based on federated learning of claim 1, characterized in that the specific process of step 5 is as follows:
step 5.1, the server clusters all round-t upload model parameters $w_k^{t}$ into M clusters using the k-means clustering algorithm and computes each cluster's center coordinates;
step 5.2, from each cluster's round-t upload model parameters $w_k^{t}$, the server selects those whose distance to the cluster's center coordinates is outside a preset range as round-t discrete point model parameters, and forms a new discrete cluster from the discrete points selected across all clusters; the remaining round-t upload model parameters of each cluster serve as round-t clustering point model parameters;
step 5.3, the server selects a certain number of the discrete cluster's round-t discrete point model parameters and averages them to obtain the discrete cluster's round-t intermediate model parameter $\tilde{w}_d^{t}$; meanwhile, it averages the round-t clustering point model parameters of each cluster to obtain that cluster's round-t intermediate model parameter $\tilde{w}_i^{t}$;
Step 5.4, the server calculates the t-th round discrete cluster final model parameter of the discrete cluster
Figure FDA00028287277000000210
And eachT-th round clustering final model parameter of each clustering cluster
Figure FDA00028287277000000211
Wherein:
Figure FDA00028287277000000212
Figure FDA00028287277000000213
the alpha is the weight of the current cluster i, beta is the weight of other clusters j, gamma is the weight of a discrete cluster, alpha + beta + gamma is 1, and alpha > beta > gamma;
Figure FDA00028287277000000214
the Euclidean distance between the cluster center coordinate of the current cluster i and the cluster center coordinates of other cluster j is obtained; i belongs to M, j is not equal to i, and M is the number of the clustering clusters; k belongs to cK, and the cK is the number of the training participants; t is the number of training rounds.
5. The federated prediction method based on federated learning of claim 1, characterized in that in step 7 convergence of a participant's local neural network model means that the loss function value of the local neural network model no longer changes or that its change is smaller than a set threshold.
CN202011456395.8A 2020-12-10 2020-12-10 Federal prediction method based on federal learning Active CN112364943B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011456395.8A CN112364943B (en) 2020-12-10 2020-12-10 Federal prediction method based on federal learning

Publications (2)

Publication Number Publication Date
CN112364943A 2021-02-12
CN112364943B 2022-04-22

Family

ID=74536164

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011456395.8A Active CN112364943B (en) 2020-12-10 2020-12-10 Federal prediction method based on federal learning

Country Status (1)

Country Link
CN (1) CN112364943B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110399742A (en) * 2019-07-29 2019-11-01 深圳前海微众银行股份有限公司 A kind of training, prediction technique and the device of federation's transfer learning model
KR20190103088A (en) * 2019-08-15 2019-09-04 엘지전자 주식회사 Method and apparatus for recognizing a business card using federated learning
CN111339212A (en) * 2020-02-13 2020-06-26 深圳前海微众银行股份有限公司 Sample clustering method, device, equipment and readable storage medium
CN111860832A (en) * 2020-07-01 2020-10-30 广州大学 Method for enhancing neural network defense capacity based on federal learning

Cited By (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113094407B (en) * 2021-03-11 2022-07-19 广发证券股份有限公司 Anti-money laundering identification method, device and system based on horizontal federal learning
CN113094407A (en) * 2021-03-11 2021-07-09 广发证券股份有限公司 Anti-money laundering identification method, device and system based on horizontal federal learning
CN113051557A (en) * 2021-03-15 2021-06-29 河南科技大学 Social network cross-platform malicious user detection method based on longitudinal federal learning
CN113051557B (en) * 2021-03-15 2022-11-11 河南科技大学 Social network cross-platform malicious user detection method based on longitudinal federal learning
CN113033819A (en) * 2021-03-25 2021-06-25 支付宝(杭州)信息技术有限公司 Heterogeneous model-based federated learning method, device and medium
CN113033819B (en) * 2021-03-25 2022-11-11 支付宝(杭州)信息技术有限公司 Heterogeneous model-based federated learning method, device and medium
CN112799708B (en) * 2021-04-07 2021-07-13 支付宝(杭州)信息技术有限公司 Method and system for jointly updating business model
CN112799708A (en) * 2021-04-07 2021-05-14 支付宝(杭州)信息技术有限公司 Method and system for jointly updating business model
CN113139600A (en) * 2021-04-23 2021-07-20 广东安恒电力科技有限公司 Intelligent power grid equipment anomaly detection method and system based on federal learning
WO2022226903A1 (en) * 2021-04-29 2022-11-03 浙江大学 Federated learning method for k-means clustering algorithm
CN113344220A (en) * 2021-06-18 2021-09-03 山东大学 User screening method, system, equipment and storage medium based on local model gradient in federated learning
CN113487351A (en) * 2021-07-05 2021-10-08 哈尔滨工业大学(深圳) Privacy protection advertisement click rate prediction method, device, server and storage medium
CN113378243B (en) * 2021-07-14 2023-09-29 南京信息工程大学 Personalized federal learning method based on multi-head attention mechanism
CN113378243A (en) * 2021-07-14 2021-09-10 南京信息工程大学 Personalized federal learning method based on multi-head attention mechanism
CN113469373B (en) * 2021-08-17 2023-06-30 北京神州新桥科技有限公司 Model training method, system, equipment and storage medium based on federal learning
CN113469373A (en) * 2021-08-17 2021-10-01 北京神州新桥科技有限公司 Model training method, system, equipment and storage medium based on federal learning
CN113837399A (en) * 2021-10-26 2021-12-24 医渡云(北京)技术有限公司 Federal learning model training method, device, system, storage medium and equipment
CN113837399B (en) * 2021-10-26 2023-05-30 医渡云(北京)技术有限公司 Training method, device, system, storage medium and equipment for federal learning model
CN114077901A (en) * 2021-11-23 2022-02-22 山东大学 User position prediction framework based on clustering and used for image federation learning
CN114077901B (en) * 2021-11-23 2024-05-24 山东大学 User position prediction method based on clustering graph federation learning
CN114611722B (en) * 2022-03-16 2024-05-24 中南民族大学 Safe transverse federal learning method based on cluster analysis
CN114611722A (en) * 2022-03-16 2022-06-10 中南民族大学 Safe horizontal federal learning method based on cluster analysis
CN114925744A (en) * 2022-04-14 2022-08-19 支付宝(杭州)信息技术有限公司 Joint training method and device
CN114900343B (en) * 2022-04-25 2023-01-24 西安电子科技大学 Internet of things equipment abnormal flow detection method based on clustered federal learning
CN114900343A (en) * 2022-04-25 2022-08-12 西安电子科技大学 Internet of things equipment abnormal flow detection method based on clustered federal learning
CN115018085B (en) * 2022-05-23 2023-06-16 郑州大学 Data heterogeneity-oriented federal learning participation equipment selection method
CN115018085A (en) * 2022-05-23 2022-09-06 郑州大学 Data heterogeneity-oriented federated learning participation equipment selection method
CN116522228A (en) * 2023-04-28 2023-08-01 哈尔滨工程大学 Radio frequency fingerprint identification method based on feature imitation federal learning
CN116522228B (en) * 2023-04-28 2024-02-06 哈尔滨工程大学 Radio frequency fingerprint identification method based on feature imitation federal learning
CN116502709A (en) * 2023-06-26 2023-07-28 浙江大学滨江研究院 Heterogeneous federal learning method and device
CN117094410A (en) * 2023-07-10 2023-11-21 西安电子科技大学 Model repairing method for poisoning damage federal learning
CN117094410B (en) * 2023-07-10 2024-02-13 西安电子科技大学 Model repairing method for poisoning damage federal learning
CN116665319B (en) * 2023-07-31 2023-11-24 华南理工大学 Multi-mode biological feature recognition method based on federal learning
CN116665319A (en) * 2023-07-31 2023-08-29 华南理工大学 Multi-mode biological feature recognition method based on federal learning
CN116701972B (en) * 2023-08-09 2023-11-24 腾讯科技(深圳)有限公司 Service data processing method, device, equipment and medium
CN116701972A (en) * 2023-08-09 2023-09-05 腾讯科技(深圳)有限公司 Service data processing method, device, equipment and medium

Also Published As

Publication number Publication date
CN112364943B (en) 2022-04-22

Similar Documents

Publication Publication Date Title
CN112364943B (en) Federal prediction method based on federal learning
CN112949837B (en) Target recognition federal deep learning method based on trusted network
WO2021179720A1 (en) Federated-learning-based user data classification method and apparatus, and device and medium
CN112733967B (en) Model training method, device, equipment and storage medium for federal learning
WO2021022707A1 (en) Hybrid federated learning method and architecture
CN109657837A (en) Default Probability prediction technique, device, computer equipment and storage medium
CN110796484B (en) Method and device for constructing customer activity degree prediction model and application method thereof
EP4038519A1 (en) Federated learning using heterogeneous model types and architectures
CN112039702B (en) Model parameter training method and device based on federal learning and mutual learning
WO2023185539A1 (en) Machine learning model training method, service data processing method, apparatuses, and systems
CN114580663A (en) Data non-independent same-distribution scene-oriented federal learning method and system
CN114186694B (en) Efficient, safe and low-communication longitudinal federal learning method
CN112100642B (en) Model training method and device for protecting privacy in distributed system
CN111860865A (en) Model construction and analysis method, device, electronic equipment and medium
CN113377797A (en) Method, device and system for jointly updating model
CN113360514A (en) Method, device and system for jointly updating model
CN114362948B (en) Federated derived feature logistic regression modeling method
CN115310625A (en) Longitudinal federated learning reasoning attack defense method
CN112560059A (en) Vertical federal model stealing defense method based on neural pathway feature extraction
CN116187469A (en) Client member reasoning attack method based on federal distillation learning framework
CN115618008A (en) Account state model construction method and device, computer equipment and storage medium
CN114282692A (en) Model training method and system for longitudinal federal learning
Mao et al. A novel user membership leakage attack in collaborative deep learning
CN114491616A (en) Block chain and homomorphic encryption-based federated learning method and application
US20230419182A1 (en) Methods and systems for imrpoving a product conversion rate based on federated learning and blockchain

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant