CN113642715A - Differential privacy protection deep learning algorithm for self-adaptive distribution of dynamic privacy budget - Google Patents

Differential privacy protection deep learning algorithm for self-adaptive distribution of dynamic privacy budget

Info

Publication number
CN113642715A
CN113642715A
Authority
CN
China
Prior art keywords
neuron
neural network
model
layer
deep learning
Prior art date
Legal status
Pending
Application number
CN202111009795.9A
Other languages
Chinese (zh)
Inventor
张亚玲
白世博
Current Assignee
Shenzhen Hongyue Information Technology Co., Ltd.
Original Assignee
Xi'an University of Technology
Priority date
Filing date
Publication date
Application filed by Xi'an University of Technology
Priority to CN202111009795.9A
Publication of CN113642715A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules, to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Bioethics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Image Analysis (AREA)

Abstract

The invention aims to provide a differential privacy protection deep learning algorithm for self-adaptively allocating dynamic privacy budgets. First, a data set is given, a neural network NN is set up and initialized, and the neural network NN is trained with the data set to obtain a deep learning model M without privacy protection. The average feature correlation is then calculated with the LRP algorithm from the trained model M without privacy protection, and the correlation ratio is calculated from it. Finally, the neural network NN is reinitialized, the number of training iterations is set, and noise is added during training according to the correlation ratio, yielding a deep learning model DPM with differential privacy protection, so that data privacy can be protected when the model is used for prediction.

Description

Differential privacy protection deep learning algorithm for self-adaptive distribution of dynamic privacy budget
Technical Field
The invention belongs to the technical field of information security, and particularly relates to a differential privacy protection deep learning algorithm for adaptively allocating dynamic privacy budgets.
Background
With the development of internet technology, hundreds of millions of data records are generated in daily life every day, and this huge amount of data often contains potential, regular and ultimately understandable knowledge or patterns. Data Mining (DM) technology can discover and extract this useful information from such massive data and feed it back to guide business and everyday life; it mainly applies machine learning methods and statistical principles for knowledge mining, and research on and improvement of machine learning methods therefore often has an important influence on the efficiency and results of data mining. Deep Learning is a branch of machine learning: a class of algorithms that attempts to perform high-level abstraction of data using multiple processing layers that contain complex structures or consist of multiple nonlinear transformations. Deep learning has made impressive breakthroughs in many areas, including computer vision, speech recognition, image recognition, natural language processing and search recommendation. It aims to build a multi-layer network, extract complex features from the original input data and mine the knowledge structure hidden in the data.
Knowledge and patterns hidden in massive data can be discovered with data mining algorithms such as deep learning, but this usually comes at the cost of privacy. If the private data used for training are not well protected, they can be leaked through model parameters or predictions, so data mining technology with privacy-preserving properties has become an important requirement. How to effectively protect the privacy of training sample data from intrusion while applying deep neural network algorithms is therefore of great importance. Dalenius proposed the concept of privacy disclosure control, and the k-anonymity algorithm laid the foundation for anonymous privacy protection algorithms based on equivalence-class grouping, followed by l-diversity, t-closeness, (α, k)-anonymity and the like. These models improve the theory of anonymity protection against attackers with different background knowledge. However, they all share some common drawbacks: they require new designs to cope with rapidly evolving attacks, and they provide no rigorous proof quantifying the privacy protection effect. Many scholars at home and abroad have applied privacy protection techniques to various data mining methods, but these methods all depend on particular background knowledge held by the attacker and cannot provide sufficient security guarantees.
Differential Privacy (DP) is a privacy definition proposed by Dwork in 2006 for the privacy disclosure problem of statistical databases; it is a privacy protection model based on data distortion. Compared with traditional privacy protection models, the differential privacy model is defined on a solid mathematical basis, and the level of privacy protection provided by an algorithm can be controlled; at the same time, it defines the maximum background knowledge that an attacker may possess, i.e. the sum of all other information that the attacker can obtain apart from the target record.
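For reference, the ε-differential privacy definition and the Laplace mechanism on which this model relies can be stated as follows (these are the standard formulations from the differential privacy literature, restated here for completeness; they are not specific to the present invention). A randomized algorithm A satisfies ε-differential privacy if, for any two neighboring data sets D and D' differing in a single record and for any set S of possible outputs,

\Pr[\mathcal{A}(D) \in S] \le e^{\varepsilon} \cdot \Pr[\mathcal{A}(D') \in S]

For a query function f with global sensitivity \Delta f = \max_{D,D'} \lVert f(D) - f(D') \rVert_1, the Laplace mechanism

\mathcal{A}(D) = f(D) + \mathrm{Lap}\!\left(\frac{\Delta f}{\varepsilon}\right)

satisfies ε-differential privacy, where Lap(b) denotes Laplace noise with scale parameter b; a smaller privacy budget ε means more noise and stronger protection.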
Disclosure of Invention
The invention aims to provide a differential privacy protection deep learning algorithm for adaptively allocating dynamic privacy budgets, and solves the problem that in the prior art, the consumed privacy budget is too large, so that the privacy protection level is low.
The technical scheme adopted by the invention is that a differential privacy protection deep learning algorithm for self-adaptively allocating dynamic privacy budgets is implemented according to the following steps:
step 1, a data set D = {(X_1, y_1), (X_2, y_2), ..., (X_j, y_j), ..., (X_n, y_n) | j ∈ (1, n)} is given, where for a piece of data (X_j, y_j), X_j represents the data features and y_j represents the data label; a neural network NN comprising an input layer, hidden layers and an output layer is set up and initialized, and the neural network NN is trained with the data set D to obtain a deep learning model M without privacy protection;
step 2, calculating the average feature correlation R̄ with the LRP algorithm from the deep learning model M without privacy protection trained in step 1;
step 3, calculating the correlation ratio α_j from the average feature correlation R̄ obtained in step 2;
step 4, reinitializing the neural network NN, setting the number of training iterations T, and adding noise during training according to the correlation ratio α_j obtained in step 3, to obtain a deep learning model DPM with differential privacy protection, so that data privacy can be protected when the model is used for prediction.
The present invention is also characterized in that,
the step 1 is implemented according to the following steps:
step 1.1, an image data set D is given, a neural network NN comprising an input layer, three hidden layers and an output layer is set up, and the neural network parameters ω are initialized randomly;
step 1.2, randomly selecting a batch of data B from the data set D and inputting it into the neural network NN;
step 1.3, training the neural network NN with the SGD algorithm and continuously adjusting the neural network parameters ω to obtain the optimal parameters ω_M, thereby obtaining the deep learning model M without differential privacy protection, which is used for calculating the feature correlation R̄ in the subsequent steps.
The step 2 is implemented according to the following steps:
step 2.1, randomly selecting a record (X_i, y_i) from the data set D;
step 2.2, inputting the data X_i into the model M trained in step 1 to obtain the model predicted value ŷ;
step 2.3, according to the model predicted value ŷ, for each neuron p in the last hidden layer l, calculating the feature correlation R_p^(l) between the neuron p and the model output ŷ:

R_p^{(l)} = \frac{z_{pm}}{z_m + \varepsilon \cdot \mathrm{sign}(z_m)} \, R_m

wherein z_pm = a_p·ω_pm is the product of the network parameters between a neuron p in the hidden layer l and the output layer neuron m, a_p represents the value of neuron p, ω_pm represents the weight coefficient from neuron p to neuron m, i.e. the network parameter between neuron p and neuron m, z_m = Σ_p z_pm + b_m is the affine transformation from the hidden layer l to the output layer neuron m, b_m represents the bias value from the last hidden layer l to the output layer neuron m, ε is a predefined stabilizer for avoiding a zero denominator, and R_m denotes the correlation of the output layer neuron m, initialized with the model predicted value ŷ;
step 2.4, calculating the correlation decomposition information R_{q←p} of each neuron p in the last hidden layer l with respect to each neuron q in the previous hidden layer, i.e. layer l-1:

R_{q \leftarrow p} = \frac{z_{qp}}{z_p + \varepsilon \cdot \mathrm{sign}(z_p)} \, R_p^{(l)}

wherein z_qp = a_q·ω_qp is the product of the value a_q of neuron q and the network parameter ω_qp between neuron q and neuron p, z_p = Σ_q z_qp + b_p is the affine transformation from neuron q to neuron p, and b_p represents the bias value from layer l-1 to layer l;
step 2.5, adding noise that follows the Laplace distribution to the correlation decomposition information R_{q←p}:

\tilde{R}_{q \leftarrow p} = R_{q \leftarrow p} + \mathrm{Lap}\!\left(\frac{\Delta}{\varepsilon_r}\right)

wherein Lap(Δ/ε_r) denotes the Laplace noise added to the correlation decomposition information, Δ represents the global sensitivity, and ε_r represents the privacy budget used when adding noise to the correlation decomposition information;
step 2.6, calculating the feature correlation R_q^(l-1) of each neuron q in the hidden layer l-1:

R_q^{(l-1)} = \sum_p \tilde{R}_{q \leftarrow p}

repeating steps 2.3 to 2.6 until the correlations of the input features have been calculated, and finally averaging the obtained correlations to obtain the average feature correlation R̄.
Step 3 is specifically implemented according to the following steps:
the correlation ratio α_j of the j-th neuron of a given layer of the neural network within that layer is:

\alpha_j = \frac{\bar{R}_j}{\sum_k \bar{R}_k}

wherein R̄_j represents the average feature correlation of the j-th neuron in the layer, and the sum in the denominator runs over all neurons k of the same layer.
Step 4 is specifically implemented according to the following steps:
step 4.1, creating a network Net with the same structure as the network in step 1, randomly initializing its parameters to ω_0, setting the training batch size L, the number of training iterations T and the total privacy budget ε, and initializing the iteration counter t to 1;
step 4.2, randomly sampling L samples from the data set D to form the training sample set L_t of the t-th iteration;
step 4.3, feeding the L samples of the training sample set L_t of the t-th iteration into the current neural network Net to obtain the model predicted value ŷ_i of each sample, and calculating, from the model predicted value ŷ_i and the true value y_i of each sample, the model loss function L(ω_t, X_i, y_i) of each sample:

\mathcal{L}(\omega_t, X_i, y_i) = -\sum_c y_{i,c} \log \hat{y}_{i,c}

this loss function, called the cross-entropy loss function, measures the difference between the predicted value and the true value, where ω_t represents the model parameters of the t-th iteration and (X_i, y_i) represents one record in the batch L_t;
step 4.4, using the model loss function of each sample, calculating the partial derivatives with respect to the model parameters ω_t of the t-th iteration through error back propagation, thereby obtaining the L intermediate model gradients g_t of the t-th iteration;
step 4.5, calculating the privacy budget ε_t of the t-th iteration from the total privacy budget ε and the current iteration step t;
step 4.6, calculating the privacy budget ε_jt = α_j·ε_t used when adding noise to the gradient of the j-th neuron of a given layer of the neural network in the t-th iteration;
step 4.7, adding noise to the gradient g_t to obtain the noise gradient g̃_t of the intermediate model:

\tilde{g}_t = \frac{1}{L}\left(\sum_i g_t(X_i) + \mathrm{Lap}\!\left(\frac{\Delta g_t}{\varepsilon_{jt}}\right)\right)

wherein g_t(X_i) represents the gradient calculated from the record (X_i, y_i) in the sample batch L_t, and Δg_t represents the global sensitivity of the parameters ω_t at the t-th iteration;
step 4.8, updating the model parameters:

\omega_{t+1} = \omega_t - \eta_t \, \tilde{g}_t

wherein η_t represents the learning rate of the t-th iteration;
step 4.9, judging whether t is equal to T; if so, taking the optimized parameters ω_{t+1} obtained in the T-th iteration as the final parameters of the neural network Net, thereby obtaining the trained deep learning model DPM with differential privacy protection; otherwise, returning to step 4.2.
Compared with the prior art, the differential privacy protection deep learning algorithm for adaptively allocating a dynamic privacy budget has the beneficial effect that, on top of perturbing the gradient according to correlation, it further considers how strongly the noise magnitude affects model convergence at different stages of training. Training a deep neural network is a process of moving from random weights to optimal weights, i.e. from an initial model to an optimal model. In the initial stage of training the random weights are far from the optimal weights and the gradients are usually large, so adding large noise does not greatly affect the model. In the later stages of training the weights are close to the optimal weights and the gradient values are usually small, so adding the same amount of noise to the gradient at this point may cause the model to oscillate and degrade its accuracy. The invention therefore also dynamically changes the size of the privacy budget over the training phase, reducing the influence of noise on the model and further improving the practicability of the model while still providing an effective differential privacy guarantee.
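A small numerical illustration of this point follows; it is a sketch with arbitrary example values (the gradient magnitudes and the noise scale are assumptions, not values taken from the invention), showing that a fixed amount of Laplace noise distorts a large early-training gradient much less, in relative terms, than a small late-training gradient.

import numpy as np

# Illustration of the argument above: the same Laplace noise is negligible
# relative to a large early-training gradient but overwhelming relative to a
# small late-training gradient, which is why a smaller budget (more noise)
# early and a larger budget (less noise) late is preferable.
rng = np.random.default_rng(0)
noise = rng.laplace(0.0, 0.05, size=1000)          # fixed noise scale
for g in (1.0, 0.01):                               # early vs late gradient magnitude
    rel_err = np.mean(np.abs(noise) / abs(g))
    print(f"gradient magnitude {g:>5}: mean relative distortion {rel_err:.2f}")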
Drawings
FIG. 1 is a flow chart of the LRP algorithm with differential privacy protection of the present invention computing feature correlations;
FIG. 2 is a flow diagram of the adaptive gradient perturbed deep learning differential privacy preserving deep neural network of the present invention;
FIG. 3 is a schematic diagram of the forward and backward passes of the LRP algorithm used in the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
In the differential privacy protection deep learning algorithm for adaptively allocating dynamic privacy budgets, while the neural network is optimized with stochastic gradient descent (SGD) or one of its variants, the model obtains the current predicted value from its input data, uses this predicted value to calculate the prediction error of the model, and back-propagates the obtained error value to calculate the current gradient g_i; noise obeying the Laplace distribution is added to the gradient to obtain the noisy gradient g̃_i, which is then used to update the network parameters, thereby protecting the private information. After the specified number of training epochs, or once the model error is smaller than a threshold, the trained model parameters are obtained, and finally a deep neural network classifier with differential privacy protection is obtained. In the invention, self-adaptation means that when the gradient is perturbed, the privacy budget is allocated according to the relevance of each neuron to the model output: the larger the relevance, the larger the allocated privacy budget and the smaller the added noise, and vice versa; the budget allocated in this way is the dynamic privacy budget. The dynamic privacy budget means that the size of the privacy budget is changed dynamically over the course of training, according to how strongly noise affects the gradient at different stages of training. Unlike existing privacy-preserving neural network classifiers, the method can effectively improve the prediction accuracy of a deep neural network with differential privacy protection.
First, the Layer-wise Relevance Propagation (LRP) algorithm is used to calculate the relevance R_i between each neuron and the output of the model; while the relevance is being calculated, Laplace noise is added to the relevance decomposition information to protect data privacy. Secondly, at different stages of training, the dynamic privacy budget ε_t of the current stage is calculated and, based on the calculated relevances, adaptively allocated to obtain the privacy budget ε_jt of each gradient in the current stage. Finally, according to the different privacy budgets ε_jt, different amounts of noise are added to the gradients during training. This reduces the risk of leakage of the network model parameters and protects the privacy of the training data: an attacker cannot infer the model parameters and therefore cannot infer the data in the training set, achieving the purpose of privacy protection.
The invention discloses a differential privacy protection deep learning algorithm for adaptively allocating dynamic privacy budgets, which is implemented, with reference to FIGS. 1-3, according to the following steps:
Step 1, a data set D = {(X_1, y_1), (X_2, y_2), ..., (X_j, y_j), ..., (X_n, y_n) | j ∈ (1, n)} is given, where for a piece of data (X_j, y_j), X_j represents the data features and y_j represents the data label, i.e. the category to which X_j belongs; a neural network NN comprising an input layer, a hidden layer and an output layer is set up and initialized, and a deep learning model M without privacy protection is obtained by training the neural network NN with the data set D;
the step 1 is implemented according to the following steps:
step 1.1, an image data set D is given, a neural network NN comprising an input layer, three hidden layers and an output layer is set up, and the neural network parameters ω are initialized randomly;
step 1.2, randomly selecting a batch of data B from the data set D and inputting it into the neural network NN;
step 1.3, training the neural network NN with the SGD algorithm and continuously adjusting the neural network parameters ω to obtain the optimal parameters ω_M, thereby obtaining the deep learning model M without differential privacy protection, which is used for calculating the feature correlation R̄ in the subsequent steps.
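For illustration, a minimal sketch of this non-private pre-training phase is given below, assuming PyTorch, a flattened 784-dimensional image input, three fully connected hidden layers and a 10-class output; the layer widths, activation function, learning rate, epoch count and batch handling are illustrative assumptions rather than values fixed by the invention.

import torch
import torch.nn as nn

def build_nn(in_dim=784, hidden=256, out_dim=10):
    # Neural network NN of step 1.1: input layer, three hidden layers, output layer.
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, out_dim),
    )

def train_non_private(model, loader, epochs=10, lr=0.05):
    # Steps 1.2-1.3: repeatedly draw a batch B from the data set D and adjust the
    # parameters omega with SGD, yielding the non-private model M that is later
    # used only for computing the feature correlations.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for X, y in loader:                 # batch B drawn from data set D
            opt.zero_grad()
            loss = loss_fn(model(X), y)
            loss.backward()
            opt.step()
    return model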
Step 2, calculating the average feature correlation R̄ with the LRP algorithm from the deep learning model M without privacy protection trained in step 1.
The step 2 is implemented according to the following steps:
step 2.1, randomly selecting a record (X_i, y_i) from the data set D;
step 2.2, inputting the data X_i into the model M trained in step 1 to obtain the model predicted value ŷ;
step 2.3, according to the model predicted value ŷ, for each neuron p in the last hidden layer l, calculating the feature correlation R_p^(l) between the neuron p and the model output ŷ:

R_p^{(l)} = \frac{z_{pm}}{z_m + \varepsilon \cdot \mathrm{sign}(z_m)} \, R_m

wherein z_pm = a_p·ω_pm is the product of the network parameters between a neuron p in the hidden layer l and the output layer neuron m, a_p represents the value of neuron p, ω_pm represents the weight coefficient from neuron p to neuron m, i.e. the network parameter between neuron p and neuron m, z_m = Σ_p z_pm + b_m is the affine transformation from the hidden layer l to the output layer neuron m, b_m represents the bias value from the last hidden layer l to the output layer neuron m, ε is a predefined stabilizer for avoiding a zero denominator, and R_m denotes the correlation of the output layer neuron m, initialized with the model predicted value ŷ;
step 2.4, calculating the correlation decomposition information R_{q←p} of each neuron p in the last hidden layer l with respect to each neuron q in the previous hidden layer, i.e. layer l-1:

R_{q \leftarrow p} = \frac{z_{qp}}{z_p + \varepsilon \cdot \mathrm{sign}(z_p)} \, R_p^{(l)}

wherein z_qp = a_q·ω_qp is the product of the value a_q of neuron q and the network parameter ω_qp between neuron q and neuron p, z_p = Σ_q z_qp + b_p is the affine transformation from neuron q to neuron p, and b_p represents the bias value from layer l-1 to layer l;
step 2.5, adding noise that follows the Laplace distribution to the correlation decomposition information R_{q←p}:

\tilde{R}_{q \leftarrow p} = R_{q \leftarrow p} + \mathrm{Lap}\!\left(\frac{\Delta}{\varepsilon_r}\right)

wherein Lap(Δ/ε_r) denotes the Laplace noise added to the correlation decomposition information, Δ represents the global sensitivity, and ε_r represents the privacy budget used when adding noise to the correlation decomposition information;
step 2.6, calculating the feature correlation R_q^(l-1) of each neuron q in the hidden layer l-1:

R_q^{(l-1)} = \sum_p \tilde{R}_{q \leftarrow p}

repeating steps 2.3 to 2.6 until the correlations of the input features have been calculated, and finally averaging the obtained correlations to obtain the average feature correlation R̄.
The above steps are described algorithmically as follows:
Input: data set D, deep learning model M without differential privacy protection
Output: average feature correlation R̄
(1) for (X_i, y_i) ∈ D do
(2)   input X_i into the neural network to obtain the predicted value ŷ
(3)   compute the feature correlation R_p^(l) of each neuron p in the last hidden layer l
(4)   for l, ..., 1 do
(5)     compute the correlation decomposition information R_{q←p}
(6)     add Laplace noise to obtain the noisy decomposition R̃_{q←p}
(7)     compute the correlation R_q^(l-1) of each neuron q
(8)   end for
(9)   compute the average feature correlation R̄
(10) end for
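The listing above can be realized, for example, by the following NumPy sketch for a fully connected network. It assumes the trained model M is available as a list of (W, b) weight/bias arrays together with the per-layer activations of each selected record, uses the ε-rule form of LRP reconstructed in steps 2.3-2.6, and treats the noise budget ε_r, the sensitivity and the stabilizer as illustrative inputs; it is a sketch of the approach rather than the invention's exact implementation.

import numpy as np

def lrp_backward_layer(a_prev, W, b, R_next, eps_r, sensitivity=1.0, stab=1e-6, rng=None):
    # One epsilon-rule LRP step from layer l back to layer l-1 (steps 2.4-2.6):
    # z_qp = a_q * w_qp, z_p = sum_q z_qp + b_p, R_{q<-p} = z_qp / (z_p + stab*sign(z_p)) * R_p,
    # with Laplace noise Lap(sensitivity / eps_r) added to each decomposition term.
    # a_prev: activations of layer l-1, shape [q]; W: weights, shape [q, p];
    # b: biases of layer l, shape [p]; R_next: relevances of layer l, shape [p].
    rng = rng if rng is not None else np.random.default_rng()
    z = a_prev[:, None] * W                        # z_qp, shape [q, p]
    z_p = z.sum(axis=0) + b                        # affine output of each neuron p
    denom = z_p + stab * np.sign(z_p)              # stabilizer avoids a zero denominator
    R_decomp = z / denom * R_next                  # R_{q<-p}, shape [q, p]
    noise = rng.laplace(0.0, sensitivity / eps_r, size=R_decomp.shape)
    return (R_decomp + noise).sum(axis=1)          # R_q, shape [q]

def average_feature_relevance(layers, per_record_activations, eps_r):
    # Steps 2.1-2.6 repeated over several records and averaged (R-bar).
    # `layers` is a list of (W, b) pairs; `per_record_activations` holds, for each
    # record X_i, the list of layer activations ending with the prediction y-hat.
    total = None
    for acts in per_record_activations:
        R = acts[-1]                               # start from the model output
        for (W, b), a_prev in zip(reversed(layers), reversed(acts[:-1])):
            R = lrp_backward_layer(a_prev, W, b, R, eps_r)
        total = R if total is None else total + R
    return total / len(per_record_activations)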
Step 3, calculating the correlation ratio α_j from the average feature correlation R̄ obtained in step 2.
Step 3 is specifically implemented according to the following steps:
the correlation ratio α_j of the j-th neuron of a given layer of the neural network within that layer is:

\alpha_j = \frac{\bar{R}_j}{\sum_k \bar{R}_k}

wherein R̄_j represents the average feature correlation of the j-th neuron in the layer, and the sum in the denominator runs over all neurons k of the same layer.
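As a small illustration, assuming the per-layer average relevances R̄ are available as an array, the ratio α_j of each neuron in a layer might be computed as follows (taking absolute values and normalizing by the layer total is an assumption of this sketch, corresponding to the reading of the formula above):

import numpy as np

def relevance_ratios(layer_avg_relevance):
    # alpha_j: share of the layer's total average relevance attributed to neuron j;
    # absolute values keep the ratios non-negative (an assumption of this sketch,
    # since LRP relevances may be negative).
    r = np.abs(np.asarray(layer_avg_relevance, dtype=float))
    return r / r.sum()

# Example: average relevances 0.5, 1.0, 0.5 -> ratios 0.25, 0.5, 0.25
print(relevance_ratios([0.5, 1.0, 0.5]))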
Step 4, reinitializing the neural network NN, setting the number of training iterations T, and adding noise during training according to the correlation ratio α_j obtained in step 3, to obtain a deep learning model DPM with differential privacy protection, so that data privacy can be protected when the model is used for prediction.
Step 4 is specifically implemented according to the following steps:
step 4.1, creating a network Net with the same structure as the network in step 1, randomly initializing its parameters to ω_0, setting the training batch size L, the number of training iterations T and the total privacy budget ε, and initializing the iteration counter t to 1;
step 4.2, randomly sampling L samples from the data set D to form the training sample set L_t of the t-th iteration;
step 4.3, feeding the L samples of the training sample set L_t of the t-th iteration into the current neural network Net to obtain the model predicted value ŷ_i of each sample, and calculating, from the model predicted value ŷ_i and the true value y_i of each sample, the model loss function L(ω_t, X_i, y_i) of each sample:

\mathcal{L}(\omega_t, X_i, y_i) = -\sum_c y_{i,c} \log \hat{y}_{i,c}

this loss function, called the cross-entropy loss function, measures the difference between the predicted value and the true value, where ω_t represents the model parameters of the t-th iteration and (X_i, y_i) represents one record in the batch L_t;
step 4.4, using the model loss function of each sample, calculating the partial derivatives with respect to the model parameters ω_t of the t-th iteration through error back propagation, thereby obtaining the L intermediate model gradients g_t of the t-th iteration;
step 4.5, calculating the privacy budget ε_t of the t-th iteration from the total privacy budget ε and the current iteration step t;
step 4.6, calculating the privacy budget ε_jt = α_j·ε_t used when adding noise to the gradient of the j-th neuron of a given layer of the neural network in the t-th iteration;
step 4.7, adding noise to the gradient g_t to obtain the noise gradient g̃_t of the intermediate model:

\tilde{g}_t = \frac{1}{L}\left(\sum_i g_t(X_i) + \mathrm{Lap}\!\left(\frac{\Delta g_t}{\varepsilon_{jt}}\right)\right)

wherein g_t(X_i) represents the gradient calculated from the record (X_i, y_i) in the sample batch L_t, and Δg_t represents the global sensitivity of the parameters ω_t at the t-th iteration;
step 4.8, updating the model parameters:

\omega_{t+1} = \omega_t - \eta_t \, \tilde{g}_t

wherein η_t represents the learning rate of the t-th iteration;
step 4.9, judging whether t is equal to T; if so, taking the optimized parameters ω_{t+1} obtained in the T-th iteration as the final parameters of the neural network Net, thereby obtaining the trained deep learning model DPM with differential privacy protection; otherwise, returning to step 4.2.
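A condensed PyTorch-style sketch of this training phase is given below. It assumes the ratios α_j from step 3 are provided per layer as tensors over output neurons, uses a hypothetical linearly increasing schedule for ε_t (the exact formula of step 4.5 is not reproduced here), and uses elementwise gradient clipping with a bound `clip` as a simplified stand-in for the global sensitivity Δg_t; it is a sketch of the approach rather than the invention's exact implementation.

import torch
import torch.nn as nn
from torch.distributions import Laplace
from itertools import cycle

def dp_train(model, loader, alphas, total_eps, T, lr=0.05, clip=1.0):
    # Step 4: re-train the re-initialized network Net for T iterations, perturbing
    # gradients with Laplace noise whose per-neuron budget is eps_jt = alpha_j * eps_t.
    # `alphas` maps a parameter name to a 1-D tensor of ratios over that layer's
    # output neurons; `clip` bounds each gradient entry (a simplifying assumption).
    loss_fn = nn.CrossEntropyLoss()
    # Hypothetical linearly increasing schedule for eps_t; the per-iteration
    # budgets sum to the total budget total_eps.
    schedule = [total_eps * t / (T * (T + 1) / 2.0) for t in range(1, T + 1)]
    batches = cycle(loader)
    for t in range(T):
        X, y = next(batches)                         # training sample set L_t
        model.zero_grad()
        loss_fn(model(X), y).backward()              # gradients g_t via backpropagation
        eps_t = schedule[t]
        with torch.no_grad():
            for name, p in model.named_parameters():
                if p.grad is None:
                    continue
                g = p.grad.clamp(-clip, clip)        # bound each gradient entry
                if name in alphas:                   # per-output-neuron budgets eps_jt
                    scale = clip / (alphas[name] * eps_t)
                    scale = scale.view(-1, *([1] * (g.dim() - 1)))
                else:
                    scale = torch.full_like(g, clip / eps_t)
                scale = torch.broadcast_to(scale, g.shape)
                noise = Laplace(torch.zeros_like(g), scale).sample()
                p.add_(g + noise, alpha=-lr)         # omega_{t+1} = omega_t - eta_t * noisy gradient
    return model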

Claims (5)

1. The differential privacy protection deep learning algorithm for adaptively allocating the dynamic privacy budget is characterized by being implemented according to the following steps:
step 1, a data set D = {(X_1, y_1), (X_2, y_2), ..., (X_j, y_j), ..., (X_n, y_n) | j ∈ (1, n)} is given, where for a piece of data (X_j, y_j), X_j represents the data features and y_j represents the data label, i.e. the category to which X_j belongs; a neural network NN comprising an input layer, a hidden layer and an output layer is set up and initialized, and the neural network NN is trained with the data set D to obtain a deep learning model M without privacy protection;
step 2, calculating the average feature correlation R̄ with the LRP algorithm from the deep learning model M without privacy protection trained in step 1;
step 3, calculating the correlation ratio α_j from the average feature correlation R̄ obtained in step 2;
step 4, reinitializing the neural network NN, setting the number of training iterations T, and adding noise during training according to the correlation ratio α_j obtained in step 3, to obtain a deep learning model DPM with differential privacy protection, so that data privacy can be protected when the model is used for prediction.
2. The differential privacy protection deep learning algorithm for adaptively allocating dynamic privacy budgets according to claim 1, wherein the step 1 is specifically implemented according to the following steps:
step 1.1, an image data set D is given, a neural network NN comprising an input layer, three hidden layers and an output layer is set up, and the neural network parameters ω are initialized randomly;
step 1.2, randomly selecting a batch of data B from the data set D and inputting it into the neural network NN;
step 1.3, training the neural network NN with the SGD algorithm and continuously adjusting the neural network parameters ω to obtain the optimal parameters ω_M, thereby obtaining the deep learning model M without differential privacy protection, which is used for calculating the feature correlation R̄ in the subsequent steps.
3. The differential privacy protection deep learning algorithm for adaptively allocating dynamic privacy budgets according to claim 2, wherein the step 2 is specifically implemented according to the following steps:
step 2.1, randomly selecting a record (X_i, y_i) from the data set D;
step 2.2, inputting the data X_i into the model M trained in step 1 to obtain the model predicted value ŷ;
step 2.3, according to the model predicted value ŷ, for each neuron p in the last hidden layer l, calculating the feature correlation R_p^(l) between the neuron p and the model output ŷ:

R_p^{(l)} = \frac{z_{pm}}{z_m + \varepsilon \cdot \mathrm{sign}(z_m)} \, R_m

wherein z_pm = a_p·ω_pm is the product of the network parameters between a neuron p in the hidden layer l and the output layer neuron m, a_p represents the value of neuron p, ω_pm represents the weight coefficient from neuron p to neuron m, i.e. the network parameter between neuron p and neuron m, z_m = Σ_p z_pm + b_m is the affine transformation from the hidden layer l to the output layer neuron m, b_m represents the bias value from the last hidden layer l to the output layer neuron m, ε is a predefined stabilizer for avoiding a zero denominator, and R_m denotes the correlation of the output layer neuron m, initialized with the model predicted value ŷ;
step 2.4, calculating the correlation decomposition information R_{q←p} of each neuron p in the last hidden layer l with respect to each neuron q in the previous hidden layer, i.e. layer l-1:

R_{q \leftarrow p} = \frac{z_{qp}}{z_p + \varepsilon \cdot \mathrm{sign}(z_p)} \, R_p^{(l)}

wherein z_qp = a_q·ω_qp is the product of the value a_q of neuron q and the network parameter ω_qp between neuron q and neuron p, z_p = Σ_q z_qp + b_p is the affine transformation from neuron q to neuron p, and b_p represents the bias value from layer l-1 to layer l;
step 2.5, adding noise that follows the Laplace distribution to the correlation decomposition information R_{q←p}:

\tilde{R}_{q \leftarrow p} = R_{q \leftarrow p} + \mathrm{Lap}\!\left(\frac{\Delta}{\varepsilon_r}\right)

wherein Lap(Δ/ε_r) denotes the Laplace noise added to the correlation decomposition information, Δ represents the global sensitivity, and ε_r represents the privacy budget used when adding noise to the correlation decomposition information;
step 2.6, calculating the feature correlation R_q^(l-1) of each neuron q in the hidden layer l-1:

R_q^{(l-1)} = \sum_p \tilde{R}_{q \leftarrow p}

repeating steps 2.3 to 2.6 until the correlations of the input features have been calculated, and finally averaging the obtained correlations to obtain the average feature correlation R̄.
4. The differential privacy protection deep learning algorithm for adaptively allocating dynamic privacy budgets according to claim 3, wherein the step 3 is specifically implemented according to the following steps:
the correlation ratio α_j of the j-th neuron of a given layer of the neural network within that layer is:

\alpha_j = \frac{\bar{R}_j}{\sum_k \bar{R}_k}

wherein R̄_j represents the average feature correlation of the j-th neuron in the layer, and the sum in the denominator runs over all neurons k of the same layer.
5. The differential privacy protection deep learning algorithm for adaptively allocating dynamic privacy budgets according to claim 4, wherein the step 4 is specifically implemented according to the following steps:
step 4.1, creating a network Net with the same structure as the network in step 1, randomly initializing its parameters to ω_0, setting the training batch size L, the number of training iterations T and the total privacy budget ε, and initializing the iteration counter t to 1;
step 4.2, randomly sampling L samples from the data set D to form the training sample set L_t of the t-th iteration;
step 4.3, feeding the L samples of the training sample set L_t of the t-th iteration into the current neural network Net to obtain the model predicted value ŷ_i of each sample, and calculating, from the model predicted value ŷ_i and the true value y_i of each sample, the model loss function L(ω_t, X_i, y_i) of each sample:

\mathcal{L}(\omega_t, X_i, y_i) = -\sum_c y_{i,c} \log \hat{y}_{i,c}

this loss function, called the cross-entropy loss function, measures the difference between the predicted value and the true value, where ω_t represents the model parameters of the t-th iteration and (X_i, y_i) represents one record in the batch L_t;
step 4.4, using the model loss function of each sample, calculating the partial derivatives with respect to the model parameters ω_t of the t-th iteration through error back propagation, thereby obtaining the L intermediate model gradients g_t of the t-th iteration;
step 4.5, calculating the privacy budget ε_t of the t-th iteration from the total privacy budget ε and the current iteration step t;
step 4.6, calculating the privacy budget ε_jt = α_j·ε_t used when adding noise to the gradient of the j-th neuron of a given layer of the neural network in the t-th iteration;
step 4.7, adding noise to the gradient g_t to obtain the noise gradient g̃_t of the intermediate model:

\tilde{g}_t = \frac{1}{L}\left(\sum_i g_t(X_i) + \mathrm{Lap}\!\left(\frac{\Delta g_t}{\varepsilon_{jt}}\right)\right)

wherein g_t(X_i) represents the gradient calculated from the record (X_i, y_i) in the sample batch L_t, and Δg_t represents the global sensitivity of the parameters ω_t at the t-th iteration;
step 4.8, updating the model parameters:

\omega_{t+1} = \omega_t - \eta_t \, \tilde{g}_t

wherein η_t represents the learning rate of the t-th iteration;
step 4.9, judging whether t is equal to T; if so, taking the optimized parameters ω_{t+1} obtained in the T-th iteration as the final parameters of the neural network Net, thereby obtaining the trained deep learning model DPM with differential privacy protection; otherwise, returning to step 4.2.
CN202111009795.9A 2021-08-31 2021-08-31 Differential privacy protection deep learning algorithm for self-adaptive distribution of dynamic privacy budget Pending CN113642715A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111009795.9A CN113642715A (en) 2021-08-31 2021-08-31 Differential privacy protection deep learning algorithm for self-adaptive distribution of dynamic privacy budget

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111009795.9A CN113642715A (en) 2021-08-31 2021-08-31 Differential privacy protection deep learning algorithm for self-adaptive distribution of dynamic privacy budget

Publications (1)

Publication Number Publication Date
CN113642715A true CN113642715A (en) 2021-11-12

Family

ID=78424627

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111009795.9A Pending CN113642715A (en) 2021-08-31 2021-08-31 Differential privacy protection deep learning algorithm for self-adaptive distribution of dynamic privacy budget

Country Status (1)

Country Link
CN (1) CN113642715A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113961967A (en) * 2021-12-13 2022-01-21 支付宝(杭州)信息技术有限公司 Method and device for jointly training natural language processing model based on privacy protection
CN114169007A (en) * 2021-12-10 2022-03-11 西安电子科技大学 Medical privacy data identification method based on dynamic neural network
CN114548373A (en) * 2022-02-17 2022-05-27 河北师范大学 Differential privacy deep learning method based on feature region segmentation
CN114780999A (en) * 2022-06-21 2022-07-22 广州中平智能科技有限公司 Deep learning data privacy protection method, system, equipment and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109034228A (en) * 2018-07-17 2018-12-18 陕西师范大学 A kind of image classification method based on difference privacy and level relevance propagation
US20190227980A1 (en) * 2018-01-22 2019-07-25 Google Llc Training User-Level Differentially Private Machine-Learned Models
CN111091193A (en) * 2019-10-31 2020-05-01 武汉大学 Domain-adapted privacy protection method based on differential privacy and oriented to deep neural network
CN111737743A (en) * 2020-06-22 2020-10-02 安徽工业大学 Deep learning differential privacy protection method
CN111814190A (en) * 2020-08-21 2020-10-23 安徽大学 Privacy protection method based on differential privacy distributed deep learning optimization

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190227980A1 (en) * 2018-01-22 2019-07-25 Google Llc Training User-Level Differentially Private Machine-Learned Models
CN109034228A (en) * 2018-07-17 2018-12-18 陕西师范大学 A kind of image classification method based on difference privacy and level relevance propagation
CN111091193A (en) * 2019-10-31 2020-05-01 武汉大学 Domain-adapted privacy protection method based on differential privacy and oriented to deep neural network
CN111737743A (en) * 2020-06-22 2020-10-02 安徽工业大学 Deep learning differential privacy protection method
CN111814190A (en) * 2020-08-21 2020-10-23 安徽大学 Privacy protection method based on differential privacy distributed deep learning optimization

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Li Min; Li Hongjiao; Chen Jie: "Research on the Adam Optimization Algorithm under Differential Privacy Protection", Computer Applications and Software, no. 06, 12 June 2020 (2020-06-12), pages 259-264 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114169007A (en) * 2021-12-10 2022-03-11 西安电子科技大学 Medical privacy data identification method based on dynamic neural network
CN114169007B (en) * 2021-12-10 2024-05-14 西安电子科技大学 Medical privacy data identification method based on dynamic neural network
CN113961967A (en) * 2021-12-13 2022-01-21 支付宝(杭州)信息技术有限公司 Method and device for jointly training natural language processing model based on privacy protection
CN114548373A (en) * 2022-02-17 2022-05-27 河北师范大学 Differential privacy deep learning method based on feature region segmentation
CN114548373B (en) * 2022-02-17 2024-03-26 河北师范大学 Differential privacy deep learning method based on feature region segmentation
CN114780999A (en) * 2022-06-21 2022-07-22 广州中平智能科技有限公司 Deep learning data privacy protection method, system, equipment and medium
CN114780999B (en) * 2022-06-21 2022-09-27 广州中平智能科技有限公司 Deep learning data privacy protection method, system, equipment and medium

Similar Documents

Publication Publication Date Title
CN113642715A (en) Differential privacy protection deep learning algorithm for self-adaptive distribution of dynamic privacy budget
CN110321926B (en) Migration method and system based on depth residual error correction network
Kingma et al. Adam: A method for stochastic optimization
CN111737743A (en) Deep learning differential privacy protection method
CN114548373A (en) Differential privacy deep learning method based on feature region segmentation
CN113642717B (en) Convolutional neural network training method based on differential privacy
Torra et al. On a comparison between Mahalanobis distance and Choquet integral: The Choquet–Mahalanobis operator
Oloso et al. Hybrid functional networks for oil reservoir PVT characterisation
CN110837603A (en) Integrated recommendation method based on differential privacy protection
CN112836802A (en) Semi-supervised learning method, lithology prediction method and storage medium
Dhulipala et al. Active learning with multifidelity modeling for efficient rare event simulation
CN111539444A (en) Gaussian mixture model method for modified mode recognition and statistical modeling
Adesuyi et al. A layer-wise perturbation based privacy preserving deep neural networks
Yang et al. Effective surrogate gradient learning with high-order information bottleneck for spike-based machine intelligence
CN111311324B (en) User-commodity preference prediction system and method based on stable neural collaborative filtering
Ibitoye et al. Differentially private self-normalizing neural networks for adversarial robustness in federated learning
Zhou et al. Deep binarized convolutional neural network inferences over encrypted data
Nilsen et al. Epistemic uncertainty quantification in deep learning classification by the Delta method
Lin et al. Differential privacy protection over deep learning: An investigation of its impacted factors
CN114912142A (en) Data desensitization method and device, electronic equipment and storage medium
CN117313160B (en) Privacy-enhanced structured data simulation generation method and system
CN113935496A (en) Robustness improvement defense method for integrated model
Krishnamoorthy et al. Gas lift optimization under uncertainty
Springer et al. Robust parameter estimation of chaotic systems
CN116933322A (en) Face image privacy protection method based on self-adaptive differential privacy

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240603

Address after: 518000 1104, Building A, Zhiyun Industrial Park, No. 13, Huaxing Road, Henglang Community, Longhua District, Shenzhen, Guangdong Province

Applicant after: Shenzhen Hongyue Information Technology Co.,Ltd.

Country or region after: China

Address before: 710048 Shaanxi province Xi'an Beilin District Jinhua Road No. 5

Applicant before: Xi'an University of Technology

Country or region before: China