CN113361625A - Error data detection method with privacy protection in a federated learning scenario
- Publication number: CN113361625A
- Application number: CN202110696108.9A
- Authority: CN (China)
- Prior art keywords: training, test, local, local user, terminal
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/10: Pattern recognition; pre-processing; data cleansing
- G06F21/6218: Protecting access to data via a platform, e.g. using keys or access control rules, to a system of files or objects
- G06F21/6245: Protecting personal data, e.g. for financial or medical purposes
- G06N20/00: Computing arrangements based on specific computational models; machine learning
Abstract
The invention discloses a method for detecting erroneous data with privacy protection in a federated learning scenario, comprising the following steps: 1) constructing the training objective of the federated model; 2) training the federated model; 3) testing the federated model; 4) detecting clients that contain erroneous data; 5) detecting and deleting the erroneous data; 6) retraining the federated learning model; 7) testing the error-data detection result. The method can efficiently detect, in a privacy-preserving federated learning process, both the clients that hold erroneous training data and the erroneous training data itself, and can repair the errors at low cost, thereby improving the prediction accuracy of federated learning and accelerating the convergence of the federated model. The two proposed efficient detection algorithms save computational resources and communication resources respectively, so as to meet the dynamic resource constraints of federated learning.
Description
Technical Field
The invention relates to a method for detecting erroneous data with privacy protection in a federated learning scenario, and belongs to the fields of data security and data quality evaluation.
Background
In recent years, artificial intelligence has surged forward: AI has gradually entered every aspect of daily life, from face recognition, liveness detection, and criminal-case alerting, to AlphaGo defeating human Go champions such as Lee Sedol, to autonomous driving and widely deployed precision marketing. AI models are obtained by training on large amounts of high-quality data, yet in practice most enterprises, apart from a few large companies, suffer from small data volumes and poor data quality, which is insufficient to support the implementation of AI technology. Meanwhile, regulators at home and abroad are gradually strengthening data protection, so allowing data to flow freely under the premise of security compliance has become a major trend. The data held by commercial companies often carries great potential value, from the perspective of both users and enterprises. Because companies, and even departments within the same company, must weigh the exchange of interests, organizations cannot simply pool their respective data with others; even within a single company, data often exists in isolated silos. Federated learning was born in response to this situation.
Federated learning enables data providers, under the coordination of a cloud server, to complete a learning task by training a shared model locally and uploading parameter updates instead of raw data. In the federated learning process, the quality of a user's local data affects the performance of the global model: a large amount of erroneous data (e.g., mislabeled data) degrades the global model, for example through slow convergence and low test accuracy.
A series of works on data error detection already exists for centralized deep learning, including robustness and interpretability analyses of models, which fall into two broad categories: model-based interpretability analysis and data-based interpretability analysis. Model-based interpretability analysis focuses on building a more robust model by perturbing the model's hidden units. Data-based interpretability analysis traces the model's predictions back through the learning algorithm to the training data, in order to identify the training points that have the greatest impact on a given test point. The influence function value of a single data point is used to approximate the true impact that would be obtained by retraining the model after removing that point from the training data.
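For reference, the influence-function approximation used in this line of work for centralized training is commonly written as follows (the explicit notation below is supplied here for readability and does not appear in the source text):

$$I(z, z_{\mathrm{test}}) \approx -\nabla_\theta L(z_{\mathrm{test}}, \hat\theta)^{\top} \, H_{\hat\theta}^{-1} \, \nabla_\theta L(z, \hat\theta), \qquad H_{\hat\theta} = \frac{1}{n}\sum_{i=1}^{n} \nabla_\theta^2 L(z_i, \hat\theta)$$

where $\hat\theta$ denotes the trained model; a large value indicates that removing $z$ from the training data would noticeably change the loss on $z_{\mathrm{test}}$.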
However, existing model-interpretability work cannot be used directly in FL systems: 1) existing methods are designed for centralized model training and need direct access to the raw training data, whereas in a federated system local data cannot be accessed directly by any third party, in order to protect user data privacy; 2) even if local data could somehow be accessed, existing influence-function evaluations incur significant computational and communication overhead, which is unacceptable for the resource-constrained clients in a federated system.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides a method for detecting erroneous data with privacy protection in a federated learning scenario. The method aims to detect, efficiently and with privacy protection, both the clients that hold erroneous training data and the erroneous training data itself during federated learning, and to repair the errors at low cost, thereby improving the prediction accuracy of federated learning, accelerating the convergence of the federated model, saving computational and communication resources, and meeting the dynamic resource constraints of federated learning.
In order to achieve the purpose, the invention adopts the following technical scheme:
The invention relates to a method for detecting erroneous data with privacy protection in a federated learning scenario, which is applied to a terminal server and K local clients {C_k | k = 1, 2, …, K}, where C_k denotes the k-th local client, and the k-th local client C_k stores n_k training samples, denoted {z_{k,i} | i = 1, 2, …, n_k}, with z_{k,i} denoting the i-th training sample of the k-th local client C_k; the terminal server stores a test data set Z_S. The error data detection method proceeds according to the following steps:
Step 1, constructing the training objective of the federated model:

Construct the loss function L(z, θ) of the federated model by formula (1):

L(z, θ) = Σ_{k=1}^{K} (n_k / n) · F_k(θ), with n = n_1 + n_2 + … + n_K   (1)

In formula (1), θ denotes the federated model parameters after random initialization from a Gaussian distribution, z denotes the training samples of the K local clients, and F_k(θ) denotes the average loss of the k-th local client C_k:

F_k(θ) = (1/n_k) · Σ_{i=1}^{n_k} L(z_{k,i}, θ)   (2)

In formula (2), L(z_{k,i}, θ) denotes the loss of the i-th training sample of the k-th local client C_k.
Step 2, training the federated model:

Step 2.1, define and initialize the current training round t = 1; assign the global model parameters θ of federated learning to the global model parameters θ_t of the t-th round.

Step 2.2, in the t-th training round, the terminal server randomly selects m local clients and sends the global model parameters θ_t of the t-th round to the selected clients; the k-th selected local client C_k independently computes its updated model parameters θ_{t+1}^k by formula (3) and sends θ_{t+1}^k to the terminal server; the terminal server stores the updated model parameters θ_{t+1}^k and uses them as a training log:

θ_{t+1}^k = θ_t − η · ∇F_k(θ_t)   (3)

In formula (3), η denotes the learning rate; θ_{t+1}^k denotes the model parameters of the k-th selected local client C_k in the (t+1)-th round; ∇F_k(θ_t) denotes the gradient of the average loss of the k-th local client C_k in the t-th round.

Step 2.3, the terminal server aggregates by formula (4) to obtain the global model parameters θ_{t+1} of the (t+1)-th round:

θ_{t+1} = Σ_{k∈S_t} (n_k / Σ_{j∈S_t} n_j) · θ_{t+1}^k   (4)

In formula (4), S_t denotes the set of local clients selected in the t-th round.

Step 2.4, after assigning t+1 to t, return to step 2.2 and execute in sequence until the global model parameters θ_t converge, thereby obtaining the optimal global model parameters θ*.
Step 3, testing the federated model:

The terminal server inputs the test data set Z_S into the model with the optimal global model parameters θ*, and obtains the set Z of test samples mispredicted by θ*.
Step 4, detecting clients that contain erroneous data:

The terminal server computes, by formula (5), the distance D_k between the local updates of the k-th local client C_k and the global updates:

D_k = (1 / N(k)) · Σ_{t=T/2}^{T} 1_t^k · ‖θ_{t+1}^k − θ_{t+1}‖   (5)

In formula (5), 1_t^k indicates whether the k-th local client C_k was selected in the t-th round, with 1_t^k = 1 meaning selected and 1_t^k = 0 meaning not selected; N(k) denotes the number of times the k-th local client C_k was selected during rounds T/2 through T; θ_t denotes the global model parameters of the t-th round, obtained by the terminal server aggregating the training logs uploaded by the K local clients in the t-th round.

If the ratio of the distance D_k of the k-th local client C_k to the median of the distances of all local clients is greater than the set distance threshold δ, the k-th local client C_k contains erroneous data and is marked as a negative-impact client.
Step 5, detecting and deleting erroneous data:

Suppose the k-th local client C_k is a negative-impact client. The terminal server requires all local clients to report their available computational and communication resources, and at the same time estimates the computational and communication resources required by Algorithm 1 (low communication overhead, based on a differential privacy mechanism) and by Algorithm 2 (high computational efficiency, based on a differential privacy mechanism). If the computational resources of the k-th local client C_k can satisfy the requirements of Algorithm 1, Algorithm 1 is selected to compute the influence function value I_f(z_{k,i}) of the i-th training sample z_{k,i}; otherwise, Algorithm 2 is used to compute the influence function value I_f(z_{k,i}) of the i-th training sample z_{k,i}.

If the ratio of the influence function value I_f(z_{k,i}) to the median of the influence function values of all negative-impact clients is greater than the set impact threshold, the i-th training sample z_{k,i} is an erroneous sample; all training samples of the k-th local client C_k are judged in this way so as to detect all erroneous samples.

The terminal server sends a delete command to the k-th negative-impact client C_k, so that the k-th negative-impact client C_k deletes all of its own erroneous samples.
Step 6, retraining the federated learning model:

The terminal server adjusts the probability of each local client being selected according to the influence function values of all local clients, so that the terminal server cooperates with the local clients to update the global model parameters.

Step 6.1, initialize t = 1.

Step 6.2, in the t-th training round, let the K local clients have equal initial selection probabilities P_t^1, P_t^2, …, P_t^k, …, P_t^K.

Step 6.3, the terminal server selects m local clients according to the initial selection probabilities P_t^1, P_t^2, …, P_t^K to participate in the training process of steps 2.2-2.3, thereby obtaining the global model parameters θ'_t of the t-th round, aggregated from the training logs uploaded after the K local clients have deleted all of their erroneous samples, and the model parameters θ'^k_t of the k-th selected local client C_k in the t-th round after deleting all of its own erroneous samples.

Step 6.4, in the (t+1)-th training round, the terminal server updates the selection probability P_t^k of the k-th local client C_k by formula (6), obtaining the selection probability P_{t+1}^k of the k-th local client C_k in the (t+1)-th round.

In formula (6), S_t denotes the set of local clients selected by the terminal server in the t-th round.

Step 6.5, the terminal server selects m local clients according to the probabilities P_{t+1}^1, …, P_{t+1}^K of the (t+1)-th round to participate in the training process of steps 2.2-2.3.

Step 6.6, after assigning t+1 to t, return to step 6.4 and execute in sequence until the global model parameters θ'_t converge, thereby obtaining the optimal global model parameters θ*'.
Step 7, input the set Z of test samples mispredicted by the global model parameters θ* into the model with the optimal global model parameters θ*', and obtain the set Z' of test samples mispredicted by θ*'. If the number of samples in Z' does not meet the terminal server's requirement, assign θ'_t to θ_t and θ'^k_t to θ_t^k, and execute steps 4-7 again; otherwise, all erroneous data of the k-th local client C_k has been detected.
The method for detecting erroneous data with privacy protection in a federated learning scenario is further characterized in that Algorithm 1 of step 5 proceeds as follows:

During the t-th training round, the terminal server computes the gradient ∇_θ L(z_test, θ) of any test sample z_test in the test sample set Z and sends it to the k-th local client C_k.

For the j-th estimate, the k-th local client C_k randomly selects m samples from its stored training samples and, based on these m samples and the test gradient ∇_θ L(z_test, θ), performs the Taylor-series iteration of formula (7) to compute the j-th estimate s_{test,j} of the vector product s_test; Gaussian noise is then added to s_{test,j} to obtain the noised estimate s̃_{test,j}. The k-th local client C_k repeats the sample selection and estimation r_1 times, and finally sends the noised estimates to the terminal server.

s_{test,j} = ∇_θ L(z_test, θ) + (I − (H_k + λI)) · s_{test,j−1}   (7)

In formula (7), H_k denotes the Hessian matrix of the k-th local client C_k estimated on the m samples; λ denotes a threshold value chosen such that H_k + λI is a positive semi-definite matrix; I denotes the identity matrix; and ∇_θ L(z_test, θ) denotes the gradient of the test sample z_test.

The terminal server uses the averaged noised estimate s̃_test as the vector product s_test, so that the influence function value I_f(z_{k,i}) of the i-th training sample z_{k,i} is computed by formula (8):

I_f(z_{k,i}) = −s_test^T · ∇_θ L(z_{k,i}, θ)   (8)

In formula (8), s_test^T denotes the transpose of the vector product s_test.
Algorithm 2 of step 5 proceeds as follows:

During the t-th training round, the terminal server computes the gradient ∇_θ L(z_test, θ) of any test sample z_test in the test sample set Z and sends it to the k-th local client C_k.

For the j-th estimate, the k-th local client C_k randomly selects m samples from its stored training samples and, based on these m samples, computes each diagonal element h_ll of the Hessian matrix H_k together with the l-th element of the test gradient ∇_θ L(z_test, θ), and computes the j-th estimate s_{test,j} of the vector product s_test by formula (9); Gaussian noise is then added to s_{test,j} to obtain the noised estimate s̃_{test,j}. The k-th local client C_k repeats the sample selection and estimation r_2 times, and finally sends the noised estimates to the terminal server.

[s_{test,j}]_l = [∇_θ L(z_test, θ)]_l / (h_ll + λ)   (9)

The terminal server uses the averaged noised estimate s̃_test as the vector product s_test, so that the influence function value I_f(z_{k,i}) of the i-th training sample z_{k,i} is computed by formula (8).
Compared with the prior art, the invention has the beneficial effects that:
Because the invention adopts a hierarchical detection method, detection is highly efficient. Detection procedures optimized for computational resources and for communication resources are designed respectively, giving the method high adaptability. In addition, local data is never exposed to any third party during the whole detection process, and the intermediate parameters that are transmitted are perturbed by differential privacy, so the detection method protects user data privacy.
Drawings
FIG. 1 is a flow chart of a system for error data detection with privacy protection in the federated learning scenario of the present invention.
Detailed Description
In this embodiment, a method for detecting erroneous data with privacy protection in a federated learning scenario is applied, as shown in FIG. 1, to a terminal server and K local clients {C_k | k = 1, 2, …, K}, where C_k denotes the k-th local client, and the k-th local client C_k stores n_k training samples, denoted {z_{k,i} | i = 1, 2, …, n_k}, with z_{k,i} denoting the i-th training sample of the k-th local client C_k; the terminal server stores a test data set Z_S. The error data detection method proceeds according to the following steps:
Step 1, constructing the training objective of the federated model:

The objective of federated model training is to minimize the loss function shown in formula (1):

L(z, θ) = Σ_{k=1}^{K} (n_k / n) · F_k(θ), with n = n_1 + n_2 + … + n_K   (1)

In formula (1), θ denotes the federated model parameters after random initialization from a Gaussian distribution, used to map an input space X to an output space Y; z denotes the training samples of the K local clients; L(z, θ) denotes the loss function of the model θ; and F_k(θ) denotes the average loss of the k-th local client C_k, computed by formula (2):

F_k(θ) = (1/n_k) · Σ_{i=1}^{n_k} L(z_{k,i}, θ)   (2)

In formula (2), L(z_{k,i}, θ) denotes the loss of the i-th training sample of the k-th local client C_k.
Step 2, training the federated model:

Step 2.1, define and initialize the current training round t = 1; assign the global model parameters θ of federated learning to the global model parameters θ_t of the t-th round.

Step 2.2, in the t-th training round, the terminal server randomly selects m local clients and sends the global model parameters θ_t of the t-th round to the selected clients; the k-th selected local client C_k independently computes its updated model parameters θ_{t+1}^k by formula (3) and sends θ_{t+1}^k to the terminal server; the terminal server saves the local update parameters θ_{t+1}^k and uses them as a training log:

θ_{t+1}^k = θ_t − η · ∇F_k(θ_t)   (3)

In formula (3), η denotes the learning rate; θ_{t+1}^k denotes the model parameters of the k-th selected local client C_k in the (t+1)-th round; ∇F_k(θ_t) denotes the gradient of the average loss of the k-th local client C_k in the t-th round.

Step 2.3, the terminal server aggregates by formula (4) to obtain the global model parameters θ_{t+1} of the (t+1)-th round:

θ_{t+1} = Σ_{k∈S_t} (n_k / Σ_{j∈S_t} n_j) · θ_{t+1}^k   (4)

In formula (4), S_t denotes the set of local clients selected in the t-th round.

Step 2.4, after assigning t+1 to t, return to step 2.2 and execute in sequence until the global model parameters θ_t converge, thereby obtaining the optimal global model parameters θ*.
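As an illustration of steps 2.1-2.4, the following minimal Python sketch implements the local update of formula (3) and a sample-size-weighted aggregation for formula (4); the helper names (train_federated, grad_Fk) and the weighting scheme are assumptions supplied here, not taken from the patent text.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_federated(theta, clients, eta, m, rounds):
    # clients: list of (grad_Fk, n_k) pairs, where grad_Fk(theta) returns the
    # gradient of client k's average loss F_k and n_k is its sample count.
    for t in range(rounds):
        selected = rng.choice(len(clients), size=m, replace=False)  # step 2.2
        updates, sizes = [], []
        for k in selected:
            grad_Fk, n_k = clients[k]
            updates.append(theta - eta * grad_Fk(theta))  # formula (3)
            sizes.append(n_k)
        w = np.asarray(sizes, dtype=float) / sum(sizes)
        # formula (4): aggregate the selected clients' local updates.
        theta = np.sum([wi * u for wi, u in zip(w, updates)], axis=0)
    return theta
```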
Step 3, testing the federated model:

The terminal server inputs the test data set Z_S into the model with the optimal global model parameters θ*, and obtains the set Z of test samples mispredicted by θ*.
Step 4, detecting clients that contain erroneous data:

The terminal server computes, by formula (5), the distance D_k between the local updates of the k-th local client C_k and the global updates:

D_k = (1 / N(k)) · Σ_{t=T/2}^{T} 1_t^k · ‖θ_{t+1}^k − θ_{t+1}‖   (5)

In formula (5), 1_t^k indicates whether the k-th local client C_k was selected in the t-th round, with 1_t^k = 1 meaning selected and 1_t^k = 0 meaning not selected; N(k) denotes the number of times the k-th local client C_k was selected during rounds T/2 through T; θ_t denotes the global model parameters of the t-th round, obtained by the terminal server aggregating the training logs uploaded by the K local clients in the t-th round.

If the ratio of the distance D_k of the k-th local client C_k to the median of the distances of all local clients is greater than the set distance threshold δ, the k-th local client C_k contains erroneous data and is marked as a negative-impact client.
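A possible realization of step 4, under the reconstruction of formula (5) above, is sketched below; the container layout (local_logs, global_params) is hypothetical.

```python
import numpy as np

def flag_negative_clients(local_logs, global_params, delta):
    # local_logs[k]: list of (t, theta_k) pairs for the rounds t in [T/2, T]
    # in which client k was selected; global_params[t]: aggregated theta of
    # round t, recovered from the server's training logs.
    distances = {}
    for k, log in local_logs.items():
        if log:  # N(k) > 0
            distances[k] = np.mean(
                [np.linalg.norm(theta_k - global_params[t]) for t, theta_k in log]
            )  # D_k of formula (5)
    median = np.median(list(distances.values()))
    # A client whose ratio D_k / median exceeds delta is a negative-impact client.
    return {k for k, d in distances.items() if d / median > delta}
```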
Step 5, detecting and deleting erroneous data:

Suppose the k-th local client C_k is a negative-impact client. The terminal server requires all local clients to report their available computational and communication resources, and at the same time estimates the computational and communication resources required by Algorithm 1 (low communication overhead, based on a differential privacy mechanism) and by Algorithm 2 (high computational efficiency, based on a differential privacy mechanism). If the computational resources of the k-th local client C_k can satisfy the requirements of Algorithm 1, Algorithm 1 is selected to compute the influence function value I_f(z_{k,i}) of the i-th training sample z_{k,i}; otherwise, Algorithm 2 is used to compute the influence function value I_f(z_{k,i}) of the i-th training sample z_{k,i}.

If the ratio of the influence function value I_f(z_{k,i}) to the median of the influence function values of all negative-impact clients is greater than the set impact threshold, the i-th training sample z_{k,i} is an erroneous sample; all training samples of the k-th local client C_k are judged in this way so as to detect all erroneous samples.

The terminal server sends a delete command to the k-th negative-impact client C_k, so that the k-th negative-impact client C_k deletes all of its own erroneous samples.
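The median-ratio test of step 5 can be sketched as follows; the dictionary layout and the threshold name tau are illustrative assumptions.

```python
import numpy as np

def detect_error_samples(influences, tau):
    # influences: {(k, i): I_f(z_{k,i})} over all training samples of the
    # negative-impact clients; tau is the set impact threshold.
    median = np.median(list(influences.values()))
    # Samples whose influence-to-median ratio exceeds tau are erroneous and
    # are reported to their client for deletion.
    return {key for key, v in influences.items() if v / median > tau}
```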
Step 6, local client selection and global model parameter updating based on the influence function values I_f(z_{k,i}):

The terminal server adjusts the probability of each local client being selected according to the influence function values of all local clients, so that the terminal server cooperates with the local clients to update the global model parameters.

Step 6.1, initialize t = 1.

Step 6.2, in the t-th training round, let the K local clients have equal initial selection probabilities P_t^1, P_t^2, …, P_t^k, …, P_t^K.

Step 6.3, the terminal server selects m local clients according to the initial selection probabilities P_t^1, P_t^2, …, P_t^K to participate in the training process of steps 2.2-2.3, thereby obtaining the global model parameters θ'_t of the t-th round, aggregated from the training logs uploaded after the K local clients have deleted all of their erroneous samples, and the model parameters θ'^k_t of the k-th selected local client C_k in the t-th round after deleting all of its own erroneous samples.

Step 6.4, in the (t+1)-th training round, the terminal server updates the selection probability P_t^k of the k-th local client C_k by formula (6), obtaining the selection probability P_{t+1}^k of the k-th local client C_k in the (t+1)-th round.

In formula (6), S_t denotes the set of local clients selected by the terminal server in the t-th round.

Step 6.5, the terminal server selects m local clients according to the probabilities P_{t+1}^1, …, P_{t+1}^K of the (t+1)-th round to participate in the training process of steps 2.2-2.3.

Step 6.6, after assigning t+1 to t, return to step 6.4 and execute in sequence until the global model parameters θ'_t converge, thereby obtaining the optimal global model parameters θ*'.
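The exact form of formula (6) is not recoverable from the source text, so the sketch below uses one plausible stand-in rule (exponentially down-weighting the selected clients in proportion to an aggregate influence score, then renormalizing) purely to illustrate where the update sits in the retraining loop.

```python
import numpy as np

def update_selection_probs(probs, influence_scores, selected, alpha=0.5):
    # Hypothetical stand-in for formula (6): clients whose samples showed a
    # large aggregate influence (i.e., were more harmful) are selected less
    # often in round t+1. probs is the vector (P_t^1, ..., P_t^K).
    probs = probs.copy()
    for k in selected:  # only clients in S_t are re-weighted
        probs[k] *= np.exp(-alpha * influence_scores[k])
    return probs / probs.sum()  # renormalize to a probability vector
```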
Step 7, input the set Z of test samples mispredicted by the global model parameters θ* into the model with the optimal global model parameters θ*', and obtain the set Z' of test samples mispredicted by θ*'. If the number of samples in Z' does not meet the terminal server's requirement, assign θ'_t to θ_t and θ'^k_t to θ_t^k, and execute steps 4-7 again; otherwise, all erroneous data of the k-th local client C_k has been detected.
Computing I_f(z_{k,i}) exactly requires O(np² + p³) operations, where p denotes the number of global model parameters, so the computational cost is large. To reduce this cost, Algorithm 1 proceeds as follows:

During the t-th training round, the terminal server computes the gradient ∇_θ L(z_test, θ) of any test sample z_test in the test sample set Z and sends it to the k-th local client C_k.

For the j-th estimate, the k-th local client C_k randomly selects m samples from its stored training samples and, based on these m samples and the test gradient ∇_θ L(z_test, θ), performs the Taylor-series iteration of formula (7) to compute the j-th estimate s_{test,j} of the vector product s_test; Gaussian noise is then added to s_{test,j} to obtain the noised estimate s̃_{test,j}. The k-th local client C_k repeats the sample selection and estimation r_1 times, and finally sends the noised estimates to the terminal server.

s_{test,j} = ∇_θ L(z_test, θ) + (I − (H_k + λI)) · s_{test,j−1}   (7)

In formula (7), H_k denotes the Hessian matrix of the k-th local client C_k estimated on the m samples; λ denotes a threshold value chosen such that H_k + λI is a positive semi-definite matrix; I denotes the identity matrix; and ∇_θ L(z_test, θ) denotes the gradient of the test sample z_test.

The terminal server uses the averaged noised estimate s̃_test as the vector product s_test, so that the influence function value I_f(z_{k,i}) of the i-th training sample z_{k,i} is computed by formula (8):

I_f(z_{k,i}) = −s_test^T · ∇_θ L(z_{k,i}, θ)   (8)

In formula (8), s_test^T denotes the transpose of the vector product s_test.
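A minimal sketch of Algorithm 1 under the LiSSA-style reading of formula (7) above; hvp_on_minibatch, the noise scale sigma, and the recursion depth are assumptions supplied here, not the patent's API.

```python
import numpy as np

def algorithm1_s_test(grad_test, hvp_on_minibatch, lam, r1, depth, sigma, rng):
    # grad_test: the server-supplied gradient of L(z_test, theta).
    # hvp_on_minibatch(v): an m-sample estimate of the Hessian-vector
    # product H_k @ v, computed on freshly drawn local samples.
    estimates = []
    for _ in range(r1):
        s = grad_test.copy()
        for _ in range(depth):
            # formula (7): s_j = grad_test + (I - (H_k + lam*I)) s_{j-1}
            s = grad_test + s - (hvp_on_minibatch(s) + lam * s)
        estimates.append(s + rng.normal(0.0, sigma, size=s.shape))  # DP noise
    # Averaged noised estimate, which the server uses as s_test.
    return np.mean(estimates, axis=0)

def influence_value(s_test, grad_train):
    # formula (8): I_f(z_{k,i}) = -s_test^T grad L(z_{k,i}, theta)
    return -float(s_test @ grad_train)
```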
Because computing I_f(z_{k,i}) in this way incurs O(Kp² + np) communication overhead, which is large, Algorithm 2 proceeds as follows to reduce the communication overhead:

During the t-th training round, the terminal server computes the gradient ∇_θ L(z_test, θ) of any test sample z_test in the test sample set Z and sends it to the k-th local client C_k.

For the j-th estimate, the k-th local client C_k randomly selects m samples from its stored training samples and, based on these m samples, computes each diagonal element h_ll of the Hessian matrix H_k together with the l-th element of the test gradient ∇_θ L(z_test, θ), and computes the j-th estimate s_{test,j} of the vector product s_test by formula (9); Gaussian noise is then added to s_{test,j} to obtain the noised estimate s̃_{test,j}. The k-th local client C_k repeats the sample selection and estimation r_2 times, and finally sends the noised estimates to the terminal server.

[s_{test,j}]_l = [∇_θ L(z_test, θ)]_l / (h_ll + λ)   (9)

The terminal server uses the averaged noised estimate s̃_test as the vector product s_test, so that the influence function value I_f(z_{k,i}) of the i-th training sample z_{k,i} is computed by formula (8).
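Under the diagonal-Hessian reading of formula (9) above, each estimate in Algorithm 2 reduces to element-wise divisions, which is what makes it computationally cheap; the sketch below is illustrative, with diag_hessian_minibatch and sigma assumed.

```python
import numpy as np

def algorithm2_s_test(grad_test, diag_hessian_minibatch, lam, r2, sigma, rng):
    # diag_hessian_minibatch(): an m-sample estimate of diag(H_k), i.e. the
    # elements h_ll, computed on freshly drawn local samples.
    estimates = []
    for _ in range(r2):
        h = diag_hessian_minibatch()
        s = grad_test / (h + lam)  # formula (9), element-wise over rows l
        estimates.append(s + rng.normal(0.0, sigma, size=s.shape))  # DP noise
    # Averaged noised estimate, which the server uses as s_test in formula (8).
    return np.mean(estimates, axis=0)
```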
Claims (3)
1. A method for detecting erroneous data with privacy protection in a federated learning scenario, characterized in that the method is applied to a terminal server and K local clients {C_k | k = 1, 2, …, K}, where C_k denotes the k-th local client, and the k-th local client C_k stores n_k training samples, denoted {z_{k,i} | i = 1, 2, …, n_k}, with z_{k,i} denoting the i-th training sample of the k-th local client C_k; the terminal server stores a test data set Z_S; and the error data detection method proceeds according to the following steps:
Step 1, constructing the training objective of the federated model:

Construct the loss function L(z, θ) of the federated model by formula (1):

L(z, θ) = Σ_{k=1}^{K} (n_k / n) · F_k(θ), with n = n_1 + n_2 + … + n_K   (1)

In formula (1), θ denotes the federated model parameters after random initialization from a Gaussian distribution, z denotes the training samples of the K local clients, and F_k(θ) denotes the average loss of the k-th local client C_k:

F_k(θ) = (1/n_k) · Σ_{i=1}^{n_k} L(z_{k,i}, θ)   (2)

In formula (2), L(z_{k,i}, θ) denotes the loss of the i-th training sample of the k-th local client C_k.
Step 2, training the federated model:

Step 2.1, define and initialize the current training round t = 1; assign the global model parameters θ of federated learning to the global model parameters θ_t of the t-th round.

Step 2.2, in the t-th training round, the terminal server randomly selects m local clients and sends the global model parameters θ_t of the t-th round to the selected clients; the k-th selected local client C_k independently computes its updated model parameters θ_{t+1}^k by formula (3) and sends θ_{t+1}^k to the terminal server; the terminal server stores the updated model parameters θ_{t+1}^k and uses them as a training log:

θ_{t+1}^k = θ_t − η · ∇F_k(θ_t)   (3)

In formula (3), η denotes the learning rate; θ_{t+1}^k denotes the model parameters of the k-th selected local client C_k in the (t+1)-th round; ∇F_k(θ_t) denotes the gradient of the average loss of the k-th local client C_k in the t-th round.

Step 2.3, the terminal server aggregates by formula (4) to obtain the global model parameters θ_{t+1} of the (t+1)-th round:

θ_{t+1} = Σ_{k∈S_t} (n_k / Σ_{j∈S_t} n_j) · θ_{t+1}^k   (4)

In formula (4), S_t denotes the set of local clients selected in the t-th round.

Step 2.4, after assigning t+1 to t, return to step 2.2 and execute in sequence until the global model parameters θ_t converge, thereby obtaining the optimal global model parameters θ*.
Step 3, testing the federated model:

The terminal server inputs the test data set Z_S into the model with the optimal global model parameters θ*, and obtains the set Z of test samples mispredicted by θ*.
Step 4, detecting clients that contain erroneous data:

The terminal server computes, by formula (5), the distance D_k between the local updates of the k-th local client C_k and the global updates:

D_k = (1 / N(k)) · Σ_{t=T/2}^{T} 1_t^k · ‖θ_{t+1}^k − θ_{t+1}‖   (5)

In formula (5), 1_t^k indicates whether the k-th local client C_k was selected in the t-th round, with 1_t^k = 1 meaning selected and 1_t^k = 0 meaning not selected; N(k) denotes the number of times the k-th local client C_k was selected during rounds T/2 through T; θ_t denotes the global model parameters of the t-th round, obtained by the terminal server aggregating the training logs uploaded by the K local clients in the t-th round.

If the ratio of the distance D_k of the k-th local client C_k to the median of the distances of all local clients is greater than the set distance threshold δ, the k-th local client C_k contains erroneous data and is marked as a negative-impact client.
Step 5, detecting and deleting erroneous data:

Suppose the k-th local client C_k is a negative-impact client. The terminal server requires all local clients to report their available computational and communication resources, and at the same time estimates the computational and communication resources required by Algorithm 1 (low communication overhead, based on a differential privacy mechanism) and by Algorithm 2 (high computational efficiency, based on a differential privacy mechanism). If the computational resources of the k-th local client C_k can satisfy the requirements of Algorithm 1, Algorithm 1 is selected to compute the influence function value I_f(z_{k,i}) of the i-th training sample z_{k,i}; otherwise, Algorithm 2 is used to compute the influence function value I_f(z_{k,i}) of the i-th training sample z_{k,i}.

If the ratio of the influence function value I_f(z_{k,i}) to the median of the influence function values of all negative-impact clients is greater than the set impact threshold, the i-th training sample z_{k,i} is an erroneous sample; all training samples of the k-th local client C_k are judged in this way so as to detect all erroneous samples.

The terminal server sends a delete command to the k-th negative-impact client C_k, so that the k-th negative-impact client C_k deletes all of its own erroneous samples.
Step 6, retraining the federated learning model:

The terminal server adjusts the probability of each local client being selected according to the influence function values of all local clients, so that the terminal server cooperates with the local clients to update the global model parameters.

Step 6.1, initialize t = 1.

Step 6.2, in the t-th training round, let the K local clients have equal initial selection probabilities P_t^1, P_t^2, …, P_t^k, …, P_t^K.

Step 6.3, the terminal server selects m local clients according to the initial selection probabilities P_t^1, P_t^2, …, P_t^K to participate in the training process of steps 2.2-2.3, thereby obtaining the global model parameters θ'_t of the t-th round, aggregated from the training logs uploaded after the K local clients have deleted all of their erroneous samples, and the model parameters θ'^k_t of the k-th selected local client C_k in the t-th round after deleting all of its own erroneous samples.

Step 6.4, in the (t+1)-th training round, the terminal server updates the selection probability P_t^k of the k-th local client C_k by formula (6), obtaining the selection probability P_{t+1}^k of the k-th local client C_k in the (t+1)-th round.

In formula (6), S_t denotes the set of local clients selected by the terminal server in the t-th round.

Step 6.5, the terminal server selects m local clients according to the probabilities P_{t+1}^1, …, P_{t+1}^K of the (t+1)-th round to participate in the training process of steps 2.2-2.3.

Step 6.6, after assigning t+1 to t, return to step 6.4 and execute in sequence until the global model parameters θ'_t converge, thereby obtaining the optimal global model parameters θ*'.
Step 7, input the set Z of test samples mispredicted by the global model parameters θ* into the model with the optimal global model parameters θ*', and obtain the set Z' of test samples mispredicted by θ*'. If the number of samples in Z' does not meet the terminal server's requirement, assign θ'_t to θ_t and θ'^k_t to θ_t^k, and execute steps 4-7 again; otherwise, all erroneous data of the k-th local client C_k has been detected.
2. The method for detecting erroneous data with privacy protection in a federated learning scenario as claimed in claim 1, characterized in that Algorithm 1 of step 5 proceeds as follows:

During the t-th training round, the terminal server computes the gradient ∇_θ L(z_test, θ) of any test sample z_test in the test sample set Z and sends it to the k-th local client C_k.

For the j-th estimate, the k-th local client C_k randomly selects m samples from its stored training samples and, based on these m samples and the test gradient ∇_θ L(z_test, θ), performs the Taylor-series iteration of formula (7) to compute the j-th estimate s_{test,j} of the vector product s_test; Gaussian noise is then added to s_{test,j} to obtain the noised estimate s̃_{test,j}. The k-th local client C_k repeats the sample selection and estimation r_1 times, and finally sends the noised estimates to the terminal server.

s_{test,j} = ∇_θ L(z_test, θ) + (I − (H_k + λI)) · s_{test,j−1}   (7)

In formula (7), H_k denotes the Hessian matrix of the k-th local client C_k estimated on the m samples; λ denotes a threshold value chosen such that H_k + λI is a positive semi-definite matrix; I denotes the identity matrix; and ∇_θ L(z_test, θ) denotes the gradient of the test sample z_test.

The terminal server uses the averaged noised estimate s̃_test as the vector product s_test, so that the influence function value I_f(z_{k,i}) of the i-th training sample z_{k,i} is computed by formula (8):

I_f(z_{k,i}) = −s_test^T · ∇_θ L(z_{k,i}, θ)   (8)
3. The method for detecting erroneous data with privacy protection in a federated learning scenario as claimed in claim 2, characterized in that Algorithm 2 of step 5 proceeds as follows:

During the t-th training round, the terminal server computes the gradient ∇_θ L(z_test, θ) of any test sample z_test in the test sample set Z and sends it to the k-th local client C_k.

For the j-th estimate, the k-th local client C_k randomly selects m samples from its stored training samples and, based on these m samples, computes each diagonal element h_ll of the Hessian matrix H_k together with the l-th element of the test gradient ∇_θ L(z_test, θ), and computes the j-th estimate s_{test,j} of the vector product s_test by formula (9); Gaussian noise is then added to s_{test,j} to obtain the noised estimate s̃_{test,j}. The k-th local client C_k repeats the sample selection and estimation r_2 times, and finally sends the noised estimates to the terminal server.

[s_{test,j}]_l = [∇_θ L(z_test, θ)]_l / (h_ll + λ)   (9)
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202110696108.9A | 2021-06-23 | 2021-06-23 | Error data detection method with privacy protection in federated learning scene
Publications (1)

Publication Number | Publication Date
---|---
CN113361625A | 2021-09-07

Family ID: 77535777

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202110696108.9A (status: Pending) | Error data detection method with privacy protection in federated learning scene | 2021-06-23 | 2021-06-23

Country Status (1)

Country | Link
---|---
CN | CN113361625A (en)
Cited By (1)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN115881306A | 2023-02-22 | 2023-03-31 | University of Science and Technology of China | Networked ICU intelligent medical decision-making method based on federated learning and storage medium
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200034665A1 (en) * | 2018-07-30 | 2020-01-30 | DataRobot, Inc. | Determining validity of machine learning algorithms for datasets |
CN109961098A (en) * | 2019-03-22 | 2019-07-02 | 中国科学技术大学 | A kind of training data selection method of machine learning |
CN110633570A (en) * | 2019-07-24 | 2019-12-31 | 浙江工业大学 | Black box attack defense method for malicious software assembly format detection model |
EP3828777A1 (en) * | 2019-10-31 | 2021-06-02 | NVIDIA Corporation | Processor and system to train machine learning models based on comparing accuracy of model parameters |
CN111460524A (en) * | 2020-03-27 | 2020-07-28 | 鹏城实验室 | Data integrity detection method and device and computer readable storage medium |
CN112214342A (en) * | 2020-09-14 | 2021-01-12 | 德清阿尔法创新研究院 | Efficient error data detection method in federated learning scene |
CN112435230A (en) * | 2020-11-20 | 2021-03-02 | 哈尔滨市科佳通用机电股份有限公司 | Deep learning-based data set generation method and system |
Non-Patent Citations (2)

Title |
---|
HARD, A. et al.: "Training Keyword Spotting Models on Non-IID Data with Federated Learning", arXiv *
WANG, Qi et al.: "Network intrusion detection method based on big data analysis technology", Microcomputer Applications *
Legal Events

Code | Title | Description
---|---|---
PB01 | Publication | Application publication date: 2021-09-07
SE01 | Entry into force of request for substantive examination |
RJ01 | Rejection of invention patent application after publication |