CN111241582A - Data privacy protection method and device and computer readable storage medium - Google Patents

Data privacy protection method and device and computer readable storage medium

Info

Publication number
CN111241582A
CN111241582A (application CN202010029622.2A)
Authority
CN
China
Prior art keywords
weight
model
participant
cloud server
importance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010029622.2A
Other languages
Chinese (zh)
Other versions
CN111241582B (en)
Inventor
李洪伟
丁勇
刘小源
徐国文
刘森
龚丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peng Cheng Laboratory
Original Assignee
Peng Cheng Laboratory
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peng Cheng Laboratory filed Critical Peng Cheng Laboratory
Priority to CN202010029622.2A priority Critical patent/CN111241582B/en
Publication of CN111241582A publication Critical patent/CN111241582A/en
Application granted granted Critical
Publication of CN111241582B publication Critical patent/CN111241582B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioethics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data privacy protection method, a data privacy protection device and a computer readable storage medium, wherein the data privacy protection method comprises the following steps: a participant obtains a first weight sent by a cloud server; the participant iterates a local model corresponding to the participant based on the first weight and a model training rule, and determines a second weight of each neuron in the local model after the iteration of the local model is completed; the participant determines the weight importance corresponding to the second weight based on the second weight and a weight importance algorithm; and the participant determines the disturbance weight after the second weight is interfered based on the second weight, the weight importance and a disturbance mechanism, and sends the disturbance weight to the cloud server. The method and the device improve the privacy protection level of the data shared by the participants and the accuracy of the model jointly trained by the participants and the cloud server.

Description

Data privacy protection method and device and computer readable storage medium
Technical Field
The invention relates to the technical field of Internet of things, in particular to a data privacy protection method and device and a computer readable storage medium.
Background
With the development of communication networks, a large number of internet of things devices continuously access the network and generate a large amount of data. As a mainstream method in the field of big data analysis, deep learning is being closely combined with the application of the internet of things, and the method is widely applied to multiple fields such as smart cities, smart homes, unmanned driving and the like.
Traditional centralized deep learning requires users to submit their data to a data center, where the cloud server trains on the data uniformly. However, such data can easily be abused by the model trainer, who may infer further private information about the users. Distributed deep learning allows multiple participants to jointly learn a common model without disclosing their data sets. However, in a distributed deep learning environment, sensitive information may still be leaked while data is shared between the cloud server and the participants, because the shared data is poorly protected.
The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.
Disclosure of Invention
The main object of the present invention is to provide a data privacy protection method, a data privacy protection device and a computer readable storage medium, aiming to solve the technical problem of poor data privacy protection.
In order to achieve the above object, the present invention provides a data privacy protection method, including the following steps:
the method comprises the steps that a participant obtains a first weight sent by a cloud server;
the participant iterates a local model corresponding to the participant based on the first weight and a model training rule, and determines a second weight of each neuron in the local model after the iteration of the local model is completed;
the participant determines the weight importance corresponding to the second weight based on the second weight and a weight importance algorithm;
and the participant determines the disturbance weight after the second weight is interfered based on the second weight, the weight importance and a disturbance mechanism, and sends the disturbance weight to a cloud server.
Optionally, after the step of determining, by the participant, a disturbance weight after the disturbance of the second weight based on the second weight, the weight importance, and a disturbance mechanism, and sending the disturbance weight to a cloud server, the method further includes:
the cloud server receives the disturbance weight sent by the participant;
the cloud server inputs the disturbance weight to a first target model in a model ring of the cloud server, and obtains a third weight of a second target model of the model ring, wherein the second target model is a previous model of the first target model in the model ring;
and the cloud server sends the third weight to the participant so that the participant receives the third weight, the third weight is used as the first weight, the participant iterates a local model corresponding to the participant based on the first weight and a model training rule, and a step of determining a second weight of each neuron in the local model after the iteration of the local model is completed is executed.
Optionally, after the step of the cloud server inputting the disturbance weight to the first target model in the model ring of the cloud server and obtaining the third weight of the second target model of the model ring, the method further includes:
the cloud server acquires the first target model;
the cloud server takes the first target model as the second target model, and executes the step of obtaining the third weight of the second target model of the model ring.
Optionally, before the step of inputting, by the cloud server, the disturbance weight to a first target model in a model ring of the cloud server and obtaining a third weight of a second target model of the model ring, the method further includes:
the cloud server acquires non-private data;
the cloud server initializes target models in the model ring based on the non-private data, the target models including the first target model and the second target model.
Optionally, the step of determining, by the participant, a disturbance weight after the second weight is interfered based on the second weight, the weight importance, and a disturbance mechanism, and sending the disturbance weight to a cloud server includes:
the participant normalizes the weight importance based on the weight importance and a disturbance mechanism, and determines a weight normalization result;
the participant acquires a total privacy budget, and determines a privacy budget corresponding to the weight importance based on the weight normalization result and the total privacy budget;
the participant determines a perturbation weight after the second weight is interfered based on the second weight and the privacy budget.
Optionally, the step of determining, by the participant, a perturbation weight after the interference of the second weight based on the second weight and the privacy budget includes:
and the participant perturbs the second weight based on the privacy budget and a differential privacy mechanism, and determines a perturbation weight after the second weight is interfered.
Optionally, the step of determining, by the participant, the weight importance corresponding to the second weight based on the second weight and a weight importance algorithm includes:
the participant determines neuron significance of the local model based on the second weight and a weight significance algorithm;
the participant determines a weight importance corresponding to the second weight based on the neuron importance.
Optionally, the step of determining a second weight of each neuron in the local model after iterating the local model is completed includes:
an iteration step of obtaining the local model by the participant;
and if the participant detects that the iteration step reaches the preset step, determining a second weight of each neuron in the local model after the local model is iterated.
In addition, to achieve the above object, the present invention provides a data privacy protecting apparatus, including: the system comprises a memory, a processor and a data privacy protection program stored on the memory and capable of running on the processor, wherein the data privacy protection program realizes the steps of the data privacy protection method when being executed by the processor.
In addition, to achieve the above object, the present invention also provides a computer readable storage medium, on which a data privacy protecting program is stored, and the data privacy protecting program, when executed by a processor, implements the steps of the data privacy protecting method as described above.
According to the method, a participant acquires a first weight sent by a cloud server; the participant iterates a local model corresponding to the participant based on the first weight and a model training rule, and determines a second weight of each neuron in the local model after the iteration of the local model is completed; the participant determines the weight importance corresponding to the second weight based on the second weight and a weight importance algorithm; and the participant determines the disturbance weight after the second weight is interfered based on the second weight, the weight importance and the disturbance mechanism, and sends the disturbance weight to the cloud server. By combining a weight importance algorithm with a disturbance mechanism, the data shared between the participant and the cloud server, namely the weights, are perturbed differentially: less disturbance noise is allocated to weights with higher importance, and more disturbance noise is injected into weights with low importance. In this way the privacy protection level of the data shared by the participants and the cloud server is improved, and at the same time the accuracy of the model jointly trained by the participants and the cloud server is improved.
Drawings
Fig. 1 is a schematic structural diagram of a data privacy protection apparatus in a hardware operating environment according to an embodiment of the present invention;
fig. 2 is a flowchart illustrating a first embodiment of a data privacy protection method according to the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, fig. 1 is a schematic structural diagram of a data privacy protection apparatus in a hardware operating environment according to an embodiment of the present invention.
The data privacy protection device in the embodiment of the invention can be a PC, and can also be a mobile terminal device with a display function, such as a smart phone, a tablet computer, a portable computer and the like.
As shown in fig. 1, the data privacy protecting apparatus may include: a processor 1001, such as a CPU, a network interface 1004, a user interface 1003, a memory 1005, a communication bus 1002. Wherein a communication bus 1002 is used to enable connective communication between these components. The user interface 1003 may include a Display screen (Display), an input unit such as a Keyboard (Keyboard), and the optional user interface 1003 may also include a standard wired interface, a wireless interface. The network interface 1004 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.
Optionally, the data privacy protecting apparatus may further include a camera, a Radio Frequency (RF) circuit, a sensor, an audio circuit, a WiFi module, and the like.
Those skilled in the art will appreciate that the data privacy device architecture shown in fig. 1 does not constitute a limitation of the data privacy device and may include more or fewer components than shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 1, a memory 1005, which is a kind of computer storage medium, may include therein an operating system, a network communication module, a user interface module, and a data privacy protecting program.
In the data privacy protection apparatus shown in fig. 1, the network interface 1004 is mainly used for connecting to a backend server and performing data communication with the backend server; the user interface 1003 is mainly used for connecting a client (user side) and performing data communication with the client; and the processor 1001 may be used to invoke a data privacy preserving program stored in the memory 1005.
In this embodiment, the data privacy protecting apparatus includes: the system comprises a memory 1005, a processor 1001 and a data privacy protection program stored on the memory 1005 and operable on the processor 1001, wherein when the processor 1001 calls the data privacy protection program stored in the memory 1005, the following operations are performed:
the method comprises the steps that a participant obtains a first weight sent by a cloud server;
the participant iterates a local model corresponding to the participant based on the first weight and a model training rule, and determines a second weight of each neuron in the local model after the iteration of the local model is completed;
the participant determines the weight importance corresponding to the second weight based on the second weight and a weight importance algorithm;
and the participant determines the disturbance weight after the second weight is interfered based on the second weight, the weight importance and a disturbance mechanism, and sends the disturbance weight to a cloud server.
Further, the processor 1001 may call the data privacy protection program stored in the memory 1005, and further perform the following operations:
the cloud server receives the disturbance weight sent by the participant;
the cloud server inputs the disturbance weight to a first target model in a model ring of the cloud server, and obtains a third weight of a second target model of the model ring, wherein the second target model is a previous model of the first target model in the model ring;
and the cloud server sends the third weight to the participant so that the participant receives the third weight, the third weight is used as the first weight, the participant iterates a local model corresponding to the participant based on the first weight and a model training rule, and a step of determining a second weight of each neuron in the local model after the iteration of the local model is completed is executed.
Further, the processor 1001 may call the data privacy protection program stored in the memory 1005, and further perform the following operations:
the cloud server acquires the first target model;
the cloud server takes the first target model as the second target model, and executes the step of obtaining the third weight of the second target model of the model ring.
Further, the processor 1001 may call the data privacy protection program stored in the memory 1005, and further perform the following operations:
the cloud server acquires non-private data;
the cloud server initializes target models in the model ring based on the non-private data, the target models including the first target model and the second target model.
Further, the processor 1001 may call the data privacy protection program stored in the memory 1005, and further perform the following operations:
the participant normalizes the weight importance based on the weight importance and a disturbance mechanism, and determines a weight normalization result;
the participant acquires a total privacy budget, and determines a privacy budget corresponding to the weight importance based on the weight normalization result and the total privacy budget;
the participant determines a perturbation weight after the second weight is interfered based on the second weight and the privacy budget.
Further, the processor 1001 may call the data privacy protection program stored in the memory 1005, and further perform the following operations:
and the participant perturbs the second weight based on the privacy budget and a differential privacy mechanism, and determines a perturbation weight after the second weight is interfered.
Further, the processor 1001 may call the data privacy protection program stored in the memory 1005, and further perform the following operations:
the participant determines neuron significance of the local model based on the second weight and a weight significance algorithm;
the participant determines a weight importance corresponding to the second weight based on the neuron importance.
Further, the processor 1001 may call the data privacy protection program stored in the memory 1005, and further perform the following operations:
an iteration step of obtaining the local model by the participant;
and if the participant detects that the iteration step reaches the preset step, determining a second weight of each neuron in the local model after the local model is iterated.
The invention further provides a data privacy protection method, and referring to fig. 2, fig. 2 is a schematic flow chart of a first embodiment of the data privacy protection method of the invention.
In this embodiment, the data privacy protection method includes the following steps:
the system architecture applicable to the embodiment of the invention comprises a cloud server and a plurality of participants. In the technical scheme of the embodiment, each participant trains a local model with public non-private data in advance in a pre-training stage so as to initialize the local model; then, operating a weight importance algorithm to quantify the importance of the weight in the deep learning model to the model prediction; in the privacy protection stage, a differential privacy mechanism is combined to disturb the weight value of the local model differently; the perturbed local model weight values are then uploaded to the server and a new training model is requested. Deploying a plurality of deep learning models with the same type and different parameter values in the cloud server, and receiving a local model sent by a participant and replacing one of the deep learning models; and when the server processes the model request of the client, the model uploaded by other participants is sent to the participant requesting the model.
The cloud server may be a computer or other network device. The cloud server may be an independent device, or may be a server cluster formed by a plurality of servers. Preferably, the cloud server may perform information processing by using cloud computing technology. The participants are deployed on terminals; a terminal may be an electronic device with a wireless communication function, such as a mobile phone, a tablet computer or a dedicated handheld device, or a device connected to the Internet in a wired manner, such as a personal computer (PC), a notebook computer or a server. A terminal may be an independent device, or a terminal cluster formed by a plurality of terminals. Preferably, a terminal may perform information processing by using cloud computing technology. The participants may communicate with the cloud server through the Internet, or through the Global System for Mobile Communications (GSM), the Long Term Evolution (LTE) system, or other mobile communication systems.
Step S10, the participant obtains a first weight sent by the cloud server;
In one embodiment, a plurality of deep learning models are deployed in the cloud server and form a model ring. The cloud server receives the model parameters (weights) uploaded by a participant and inputs them into one deep learning model in the model ring; when the model parameters uploaded by the next participant are received, they are input into the next deep learning model, and so on. In this way the cloud server receives the weight parameters of the local models sent by the participants in turn and inputs them into the deep learning models of the model ring in turn, so that the deep learning models in the cloud server are continuously updated. Specifically, if M deep learning models are deployed in the cloud server and the cloud server has initialized all the deep learning models on the model ring, then as soon as the model parameters sent by a participant are received, they are input into the m-th deep learning model on the model ring (m ∈ [0, M-1]), and the (m-1)-th model is then sent to that participant for the next round of training. Notably, the (m-1)-th model is the deep learning model that received the model parameters uploaded by the participant who last interacted with the cloud server. After interacting with each participant, m ← m + 1 is executed.
The cloud server acquires the weight parameters uploaded by another participant that interacts with it and inputs those weight parameters into a target model of its model ring. When a participant sends the weight parameters of its local model to the cloud server and at the same time requests new local model parameters, i.e. the first weight, the cloud server obtains the first weight from the target model and sends it to the participant, and the participant thus obtains the first weight sent by the cloud server.
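The round-robin exchange described above can be captured in a few lines. This is only a minimal sketch rather than the patented implementation: the class and method names (ModelRing, exchange) are assumptions, and each model is represented simply by its weight vector.

```python
# Minimal sketch of the cloud-server model ring; the class/method names and
# the reduction of a "model" to its weight vector are assumptions.
class ModelRing:
    def __init__(self, initial_weights):
        # M deep learning models of the negotiated architecture, pre-initialized
        # by the cloud server (e.g. trained on non-private data).
        self.models = list(initial_weights)   # length M
        self.m = 0                            # index of the next slot to fill

    def exchange(self, perturbed_weights):
        """Store a participant's perturbed weights in the m-th model and return
        the weights of the (m-1)-th model for the participant's next round."""
        M = len(self.models)
        self.models[self.m] = perturbed_weights        # input to the m-th model
        previous = self.models[(self.m - 1) % M]       # third weight from the (m-1)-th model
        self.m = (self.m + 1) % M                      # m <- m + 1 after each interaction
        return previous
```

Because the returned weights come from the slot filled by the previous upload, every participant continues training on parameters contributed by another participant, which is how the ring mixes updates across participants round by round.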
Step S20, the participant iterates a local model corresponding to the participant based on the first weight and a model training rule, and determines a second weight of each neuron in the local model after the iteration of the local model is completed;
In one embodiment, after receiving the first weight sent by the cloud server, the participant starts training the local model: it loads the first weight into the local model and trains the local model according to a given model training rule. Specifically, the first weight is input into the local model and the user data is fed in as training samples. A forward propagation pass is first performed on the local model to determine all activation values in the local model, including the outputs of the hidden layers of the neural network. A back propagation pass is then performed: for every node of every layer in the local model, a new weight and threshold are determined, reflecting how much that node influences the weight and threshold associated with the final output of the output layer. This sequence of forward and backward passes is repeated on the local model until the iteration of the local model, i.e. the training, is completed, and the second weight of the local model is finally determined and output. The training process of the local model thus consists of a forward propagation process and a backward propagation process: in forward propagation, the input pattern is processed layer by layer from the input layer through the hidden layers to the output layer, and the state of each layer of neurons only affects the state of the next layer; in backward propagation, the error signal is returned along the original connection path, and the weight and threshold of each neuron are modified so as to minimize the error.
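As an illustration of this forward/backward cycle, the sketch below trains a one-hidden-layer network with plain gradient descent until a preset number of steps is reached; the architecture, loss, learning rate and function names are assumptions made for the example and are not the training rule prescribed by the method.

```python
import numpy as np

def local_training_round(first_weights, X, y, max_steps=100, lr=0.1):
    """One local round: load the received weights, iterate forward/backward
    passes until the preset step, and return the resulting second weight."""
    W1, W2 = first_weights                     # first weight received from the cloud server
    for _ in range(max_steps):                 # iterate until the preset step s
        h = np.tanh(X @ W1)                    # forward pass: hidden-layer activations
        out = h @ W2                           # forward pass: output-layer values
        err = out - y                          # error signal at the output layer
        grad_W2 = h.T @ err / len(X)           # backward pass: output-layer gradient
        grad_W1 = X.T @ ((err @ W2.T) * (1 - h ** 2)) / len(X)  # hidden-layer gradient
        W2 -= lr * grad_W2                     # modify weights to reduce the error
        W1 -= lr * grad_W1
    return W1, W2                              # the "second weight" after iteration

# Example usage with random data, for illustration only.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(32, 4)), rng.normal(size=(32, 1))
first = (rng.normal(size=(4, 8)), rng.normal(size=(8, 1)))
second = local_training_round(first, X, y)
```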
Step S30, the participant determines a weight importance corresponding to the second weight based on the second weight and a weight importance algorithm;
In one embodiment, after the participant has iterated the local model, a weight importance algorithm is run to determine the weight importance corresponding to the second weight. Specifically, the weight importance algorithm proceeds as follows (a code sketch follows these steps):
1) Initialize the weight importance matrix: the participant sets up a γ × γ weight importance matrix Ω and initializes each of its elements Ω_{p,q} to zero, where Ω_{p,q} denotes the importance of the weight w_{p,q} between neuron a_p and neuron a_q, and γ is the total number of neurons in the model.
2) Compute the output-layer neuron importance: first, the importance I(a_p) of each output-layer neuron a_p to the model prediction is calculated. The importance of an output-layer neuron to the model prediction value is simply the output value of that neuron, i.e. the output value of the model: I(a_p) = a_p(x_i; ω), where x_i is drawn from the training samples D and ω is the second weight.
3) Compute the weight importance: the participant recurses layer by layer from back to front and computes the importance to the model prediction of each weight w_{p,q} between adjacent layers. Suppose a_p is a neuron in layer h-1 and a_q is a neuron in layer h; the importance Ω_{p,q} of the weight between a_p and a_q is computed from the importance I(a_q) of the layer-h neuron together with the output value a_p and the weight w_{p,q}, where a_p here refers to the output value of a neuron in a layer other than the output layer.
4) Compute the importance of the neurons in every layer except the output layer (i.e. hidden-layer and input-layer neuron importance): the importance I(a_p) of each neuron a_p in layer h-1 is then computed, so that the recursion can continue to the preceding layer.
5) Repeat steps 3) and 4) until the importance of every weight in the local model has been calculated, and finally determine the weight importance corresponding to the second weight. It should be understood that the neuron importance is calculated as an intermediate parameter in the computation of the importance matrix.
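The back-to-front recursion in steps 1) to 5) can be sketched as follows. Because the exact recursion formulas are not reproduced above, the proportional rule used here, splitting the importance of each layer-h neuron over its incoming weights according to the magnitude of the contribution |a_p · w_{p,q}| and summing a neuron's outgoing weight importance to obtain its own importance, is an assumption made for illustration.

```python
import numpy as np

def weight_importance(weights, activations):
    """weights[h] has shape (n_h, n_{h+1}); activations[h] are the outputs of
    layer h, with activations[0] the input and activations[-1] the model output.
    Returns one importance matrix (Omega) per weight layer."""
    # step 2: output-layer neuron importance = the neuron's output value
    neuron_imp = np.abs(activations[-1])
    importance = []
    # steps 3-4: recurse layer by layer from back to front
    for h in range(len(weights) - 1, -1, -1):
        a_prev = activations[h]                               # outputs of the previous layer
        contrib = np.abs(a_prev[:, None] * weights[h])        # |a_p * w_{p,q}| (assumed rule)
        share = contrib / (contrib.sum(axis=0, keepdims=True) + 1e-12)
        omega = share * neuron_imp                            # importance of each weight w_{p,q}
        importance.append(omega)
        neuron_imp = omega.sum(axis=1)                        # importance of previous-layer neurons
    return importance[::-1]

# Example usage on a tiny network (input 4, hidden 8, output 2), for illustration.
rng = np.random.default_rng(0)
x = rng.normal(size=4)
W = [rng.normal(size=(4, 8)), rng.normal(size=(8, 2))]
acts = [x, np.tanh(x @ W[0])]
acts.append(acts[1] @ W[1])
omegas = weight_importance(W, acts)
```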
Step S40, the participant determines a disturbance weight after the second weight is interfered based on the second weight, the weight importance and a disturbance mechanism, and sends the disturbance weight to a cloud server.
In an embodiment, after running the weight importance algorithm to determine the weight importance corresponding to the second weight, the participant runs its customized disturbance mechanism, perturbs the second weight according to the different weight importance values to obtain the disturbance weight, and sends the perturbed disturbance weight to the cloud server so that the cloud server can jointly train the models of the different participants. The weight importance determines the degree to which the second weight is perturbed: the higher the importance, the lower the degree of perturbation, so as to preserve the accuracy of the data. Specifically, the disturbance mechanism perturbs the second weight according to the weight importance as follows (a code sketch follows these steps):
1) Weight importance normalization: each element Ω_{p,q} of the weight importance matrix is normalized using the maximum value and the minimum value of the weight importance, so that all importance values are limited to a preset range, e.g. the interval [0.5, 1]; the weight importance is expressed in matrix form.
2) Privacy budget adjustment: a privacy budget ε_{p,q} is set for each weight importance Ω_{p,q} in the (normalized) weight importance matrix, drawn from the total privacy budget ε_T, so that less disturbance noise is allocated to weights with higher importance, with the aim of improving the accuracy of the model, and more disturbance noise is injected into weights with low importance, with the aim of improving the privacy protection level of the model parameters.
3) Protection of the local model parameters: at this point the model iteration step has reached the maximum step s, and the weight obtained at the maximum step is the second weight. To protect the participant's training data from being leaked or inferred, a differential privacy mechanism is used to perturb the second weight differentially, i.e. adjusted Laplace noise is added to weights of different importance, finally yielding the disturbance weight
w'_{p,q} = w_{p,q} + Lap(Δf/ε_{p,q}),
where Lap(Δf/ε_{p,q}) denotes a sample from a Laplace distribution with mean 0 whose scale is determined by the parameter Δf/ε_{p,q}, and ε_{p,q} is the adjusted privacy budget. Generally speaking, a larger privacy budget means a smaller noise value, which yields higher system accuracy but also a weaker privacy protection level. Δf is the sensitivity of the model weights; in general, given two neighboring databases D_1 and D_2 that differ in at most one record, the sensitivity of a randomized algorithm Γ is calculated as
Δf = max_{D_1, D_2} ‖Γ(D_1) − Γ(D_2)‖_1.
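The normalization, budget allocation and Laplace perturbation of steps 1) to 3) can be sketched as below. The mapping into [0.5, 1] follows the example range given in the text, and the proportional split of the total budget ε_T is an assumption consistent with "higher importance, larger budget, less noise"; function and parameter names are illustrative.

```python
import numpy as np

def perturb_weights(W, importance, eps_total, sensitivity, rng=None):
    """Return the disturbance weights w'_{p,q} = w_{p,q} + Lap(Delta_f / eps_{p,q})."""
    rng = rng or np.random.default_rng()
    # 1) min-max normalize the importance into a preset range, e.g. [0.5, 1]
    lo, hi = importance.min(), importance.max()
    norm = 0.5 + 0.5 * (importance - lo) / (hi - lo + 1e-12)
    # 2) allocate the total privacy budget eps_T in proportion to importance:
    #    higher importance -> larger eps_{p,q} -> smaller Laplace noise
    eps = eps_total * norm / norm.sum()
    # 3) add Laplace noise with mean 0 and scale Delta_f / eps_{p,q}
    noise = rng.laplace(loc=0.0, scale=sensitivity / eps)
    return W + noise

# Example: perturb a 4x8 weight matrix with total budget 1.0 and sensitivity 0.1.
rng = np.random.default_rng(0)
W, imp = rng.normal(size=(4, 8)), rng.random(size=(4, 8))
W_perturbed = perturb_weights(W, imp, eps_total=1.0, sensitivity=0.1, rng=rng)
```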
according to the data privacy protection method provided by the embodiment, a participant acquires a first weight sent by a cloud server; the participant iterates a local model corresponding to the participant based on the first weight and a model training rule, and determines a second weight of each neuron in the local model after the local model is iterated; the participant determines the weight importance corresponding to the second weight based on the second weight and a weight importance algorithm; and the participants determine the disturbance weight after the disturbance of the second weight based on the second weight, the weight importance and the disturbance mechanism, send the disturbance weight to the cloud server, disturbs the data, namely the weight, shared between the participants and the cloud server differently by combining a weight importance algorithm and the disturbance mechanism, distributes less disturbance noise to the weight with higher importance, and injects more disturbance noise to the weight with low importance, so that the privacy protection level of the data shared by the participants and the cloud server is improved, and meanwhile, the accuracy of a joint training model of the participants and the cloud server is improved.
Based on the first embodiment, a second embodiment of the data privacy protecting method according to the present invention is proposed, in this embodiment, after step S40, the method further includes:
step a, the cloud server receives the disturbance weight sent by the participant;
b, the cloud server inputs the disturbance weight to a first target model in a model ring of the cloud server, and obtains a third weight of a second target model of the model ring, wherein the second target model is a previous model of the first target model in the model ring;
and c, the cloud server sends the third weight to the participant so that the participant receives the third weight, the third weight is used as the first weight, the participant iterates a local model corresponding to the participant based on the first weight and a model training rule, and the second weight of each neuron in the local model after the iteration of the local model is finished is determined.
In one embodiment, a plurality of deep learning models are deployed in the cloud server and form a model ring. When a participant has finished iterating the local model and has perturbed the parameters of the local model to obtain the disturbance weight, the participant sends the disturbance weight to the cloud server and requests new weight parameters for the next round of joint training. The cloud server receives the disturbance weight uploaded by the participant and inputs it into the first target model in the model ring. It then obtains the weight parameters (i.e. the third weight) of the model preceding the first target model in the model ring, i.e. the second target model. The cloud server sends the weight parameters of the second target model (the third weight) to the participant for the next round of joint training; that is, the participant receives the third weight, takes the third weight as the first weight, iterates the local model corresponding to the participant based on the first weight and the model training rule, and determines the second weight of each neuron in the local model after the iteration of the local model is completed.
Further, in an embodiment, before the step of the cloud server inputting the disturbance weight to the first target model in the model ring of the cloud server and obtaining the third weight of the second target model of the model ring, the method further includes:
the cloud server acquires non-private data;
the cloud server initializes target models in the model ring based on the non-private data, the target models including the first target model and the second target model.
Further, since the participants and the cloud server jointly train the model, a plurality of participants need to be combined in order to obtain a more accurate learning model that does not overfit locally, and a deep learning network structure, such as a convolutional neural network (CNN) or a recurrent neural network (RNN), is negotiated between the participants and the cloud server in advance. After the network structure to be trained has been negotiated between the cloud server and the participants, the cloud server initializes the deep learning models in the model ring: it trains the deep learning models with public data or historical data (non-private data) to obtain a preset number of deep learning models.
Further, after a participant has finished iterating the local model and has perturbed the parameters of the local model to obtain the disturbance weight, the participant sends the disturbance weight to the cloud server and requests new weight parameters for the next round of joint training. The cloud server receives the disturbance weight uploaded by the participant, inputs it into the first target model in the model ring, and obtains the second target model of the model ring. If the second target model is still a deep learning model as initialized by the cloud server, the cloud server sends a continue-training instruction to the participant, so that the participant takes the disturbance weight as the first weight, iterates the local model corresponding to the participant based on the first weight and the model training rule, and determines the second weight of each neuron in the local model after the iteration of the local model is completed. In other words, if the second target model is a deep learning model initialized by the cloud server, the cloud server notifies the participant to continue using the current model for the next round of training. A sketch of this initialization step is given below.
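This is a short sketch of the pre-training step, under the assumption that the server has a pretrain routine that fits the negotiated architecture on the non-private data; the function name and the initialized flags are illustrative, not part of the original description.

```python
def initialize_model_ring(M, public_data, pretrain):
    """The cloud server builds M models of the negotiated architecture from
    non-private (public or historical) data before any participant interaction.
    pretrain(data) -> model weights."""
    ring = [pretrain(public_data) for _ in range(M)]
    initialized = [True] * M   # slots that still hold a server-initialized model
    return ring, initialized
```

When the slot that would be returned to a participant is still flagged as initialized, the server can instead instruct the participant to keep training with its current model, matching the continue-training instruction described above.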
Further, in an embodiment, after the step of the cloud server inputting the disturbance weight to the first target model in the model ring of the cloud server and obtaining the third weight of the second target model of the model ring, the method further includes:
step d, the cloud server acquires the first target model;
and e, the cloud server takes the first target model as the second target model, and executes the step of obtaining the third weight of the second target model of the model ring.
In an embodiment, the cloud server inputs the disturbance weight to a first target model in a model ring of the cloud server, the first target model is used as a second target model for a subsequent cloud server to obtain a third weight of the second target model of the model ring, the cloud server sends the third weight to a participant, so that the participant receives the third weight, the third weight is used as the first weight, a local model corresponding to the participant is iterated based on the first weight and a model training rule, and the second weight of each neuron in the local model after the iteration of the local model is completed is determined.
That is to say, when the cloud server receives the model parameters uploaded by the next participant, it inputs the model parameters of that participant into the next deep learning model, and so on: the cloud server receives the weight parameters of the local models sent by the participants in turn and inputs them into the deep learning models of the model ring in turn, so that the deep learning models in the cloud server are continuously updated. Specifically, if M deep learning models are deployed in the cloud server and the cloud server has initialized all the deep learning models on the model ring, then as soon as the model parameters sent by a participant are received, they are input into the m-th deep learning model on the model ring (m ∈ [0, M-1]), and the (m-1)-th model is then sent to that participant for the next round of training. Notably, the (m-1)-th model is the deep learning model that received the model parameters uploaded by the participant who last interacted with the cloud server. After interacting with each participant, m ← m + 1 is executed.
Further, in an embodiment, the step of determining, by the participant, a disturbance weight after the disturbance of the second weight based on the second weight, the weight importance, and a disturbance mechanism, and sending the disturbance weight to the cloud server includes:
f, the participants perform normalization operation on the weight importance based on the weight importance and a disturbance mechanism, and determine a weight normalization result;
In an embodiment, after the weight importance algorithm has determined the weight importance corresponding to the second weight, the participant runs its customized disturbance mechanism and normalizes each element Ω_{p,q} of the weight importance to obtain the weight normalization result. The normalization uses the maximum value and the minimum value of the weight importance and limits all the weight importance values to a preset range, e.g. the interval [0.5, 1]; the weight importance is expressed in matrix form.
Step g, the participant acquires a total privacy budget, and determines a privacy budget corresponding to the weight importance based on the weight normalization result and the total privacy budget;
In one embodiment, the participant obtains the total privacy budget ε_T and, based on the weight normalization result, sets a privacy budget ε_{p,q} for each weight importance Ω_{p,q} in the weight importance matrix, so that less disturbance noise is allocated to weights with higher importance, with the aim of improving the accuracy of the model, and more disturbance noise is injected into weights with low importance, with the aim of improving the privacy protection level of the model parameters.
And h, the participant determines the disturbance weight after the second weight is interfered based on the second weight and the privacy budget.
In one embodiment, when the model iteration step reaches the maximum step s, the weight obtained by the maximum step is the second weight, and in order to protect the training data of the participant from being leaked or speculated, the second weight is perturbed differently, that is: adding the adjusted Laplace noise to the weights with different importance to finally obtain the disturbance weight
Figure BDA0002362318580000151
The method comprises the following specific steps:
Figure BDA0002362318580000152
wherein the content of the first and second substances,
Figure BDA0002362318580000153
refers to sampling from a laplacian distribution, the distribution satisfying a mean of 0, the size of the sample value being given by a parameter
Figure BDA0002362318580000154
To decide. Wherein epsilonp,qIs an adjusted privacy budget, generally speaking, a larger privacy budget means a smaller noise value, which results in higher system accuracy and also means a weaker privacy protection level; Δ f is the sensitivity of the model weights, in general, given two neighbor databases that differ by at most one piece of data: d1And D2The sensitivity process of the stochastic algorithm Γ is calculated as follows:
Figure BDA0002362318580000155
further, in an embodiment, the step of determining, by the participant based on the second weight and the privacy budget, a perturbation weight after the interference with the second weight includes:
and i, disturbing the second weight based on the privacy budget and a differential privacy mechanism, and determining a disturbance weight after the second weight is disturbed.
In an embodiment, when the model iteration step reaches the maximum step s, the weight obtained by the maximum step is the second weight, and in order to protect the training data of the participant from being leaked or speculated, the second weight is perturbed differentially by using a differential privacy mechanism, that is: adding the adjusted Laplace noise to the weights with different importance to finally obtain the disturbance weight
Figure BDA0002362318580000156
The method comprises the following specific steps:
Figure BDA0002362318580000157
wherein the content of the first and second substances,
Figure BDA0002362318580000158
refers to sampling from a laplacian distribution, the distribution satisfying a mean of 0, the size of the sample value being given by a parameter
Figure BDA0002362318580000159
To decide. Wherein epsilonp,qIs an adjusted privacy budget, generally speaking, a larger privacy budget means a smaller noise value, which results in higher system accuracy and also means a weaker privacy protection level; Δ f is the sensitivity of the model weights, in general, given two neighbor databases that differ by at most one piece of data: d1And D2The sensitivity process of the stochastic algorithm Γ is calculated as follows:
Figure BDA00023623185800001510
in the data privacy protection method provided by this embodiment, the cloud server receives the disturbance weight sent by the participant; the cloud server inputs the disturbance weight to a first target model in a model ring of the cloud server, and obtains a third weight of a second target model of the model ring, wherein the second target model is a previous model of the first target model in the model ring; and the cloud server sends the third weight to the participant so that the participant receives the third weight, the third weight is used as the first weight, the participant iterates a local model corresponding to the participant based on the first weight and a model training rule, and the second weight of each neuron in the local model after the iteration of the local model is completed is determined.
Based on the first embodiment, a third embodiment of the data privacy protecting method of the present invention is proposed, in this embodiment, step S30 includes:
step j, the participant determines the neuron importance of the local model based on the second weight and a weight importance algorithm;
and k, determining the weight importance corresponding to the second weight by the participant based on the neuron importance.
In one embodiment, after the participant iterates the local model, a weight importance algorithm is run, first, the importance of the neuron is determined, and then, based on the importance of the neuron, the importance of the weight corresponding to the second weight is determined. It is understood that neuron significance is calculated as an intermediate parameter for calculating weight significance.
Specifically, the weight importance algorithm for determining the weight importance corresponding to the second weight proceeds as follows:
1) Initialize the weight importance matrix: the participant sets up a γ × γ weight importance matrix Ω and initializes each of its elements Ω_{p,q} to zero, where Ω_{p,q} denotes the importance of the weight w_{p,q} between neuron a_p and neuron a_q, and γ is the total number of neurons in the model.
2) Compute the output-layer neuron importance: first, the importance I(a_p) of each output-layer neuron a_p to the model prediction is calculated. The importance of an output-layer neuron to the model prediction value is simply the output value of that neuron, i.e. the output value of the model: I(a_p) = a_p(x_i; ω), where x_i is drawn from the training samples D and ω is the second weight.
3) Compute the weight importance: the participant recurses layer by layer from back to front and computes the importance to the model prediction of each weight w_{p,q} between adjacent layers. Suppose a_p is a neuron in layer h-1 and a_q is a neuron in layer h; the importance Ω_{p,q} of the weight between a_p and a_q is computed from the importance I(a_q) of the layer-h neuron together with the output value a_p and the weight w_{p,q}, where a_p here refers to the output value of a neuron in a layer other than the output layer.
4) Compute the importance of the neurons in every layer except the output layer (i.e. hidden-layer and input-layer neuron importance): the importance I(a_p) of each neuron a_p in layer h-1 is then computed, so that the recursion can continue to the preceding layer.
5) Repeat steps 3) and 4) until the importance of every weight in the local model has been calculated, and finally determine the weight importance corresponding to the second weight.
Further, in an embodiment, the step of determining the second weight of each neuron in the local model after iterating the local model includes:
step m, the participant obtains an iteration step of iterating the local model;
and n, if the participant detects that the iteration step reaches the preset step, determining second weights of all neurons in the local model after the local model is iterated.
In one embodiment, the participant's terminal detects whether the iteration of the local model has reached the preset step. When the model iteration step reaches the maximum step s, the maximum step s being the preset step, iteration of the local model is stopped, and the second weight of each neuron in the local model after the iteration of the local model is completed is determined.
In the data privacy protection method provided by this embodiment, the participant determines the neuron importance of the local model based on the second weight and a weight importance algorithm, and then determines the weight importance corresponding to the second weight based on the neuron importance. By determining the importance of the different weights through the weight importance algorithm, the data shared between the participant and the cloud server, namely the weights, can be perturbed differentially: less disturbance noise is allocated to weights with higher importance, and more disturbance noise is injected into weights with low importance. This improves the privacy protection level of the mutually shared data and at the same time improves the accuracy of the model jointly trained by the participant and the cloud server.
Furthermore, an embodiment of the present invention further provides a computer-readable storage medium, where a data privacy protection program is stored, and when executed by a processor, the data privacy protection program implements the steps of the data privacy protection method according to any one of the above.
The specific embodiment of the computer-readable storage medium of the present invention is substantially the same as the embodiments of the data privacy protecting method described above, and details are not repeated herein.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A data privacy protection method is characterized by comprising the following steps:
the method comprises the steps that a participant obtains a first weight sent by a cloud server;
the participant iterates a local model corresponding to the participant based on the first weight and a model training rule, and determines a second weight of each neuron in the local model after the iteration of the local model is completed;
the participant determines the weight importance corresponding to the second weight based on the second weight and a weight importance algorithm;
and the participant determines the disturbance weight after the second weight is interfered based on the second weight, the weight importance and a disturbance mechanism, and sends the disturbance weight to a cloud server.
2. The data privacy protection method of claim 1, wherein after the step of determining, by the participant, a perturbation weight after the second weight is interfered based on the second weight, the weight importance, and a perturbation mechanism, and sending the perturbation weight to a cloud server, the method further comprises:
the cloud server receives the disturbance weight sent by the participant;
the cloud server inputs the disturbance weight to a first target model in a model ring of the cloud server, and obtains a third weight of a second target model of the model ring, wherein the second target model is a previous model of the first target model in the model ring;
and the cloud server sends the third weight to the participant so that the participant receives the third weight, the third weight is used as the first weight, the participant iterates a local model corresponding to the participant based on the first weight and a model training rule, and a step of determining a second weight of each neuron in the local model after the iteration of the local model is completed is executed.
3. The data privacy protection method of claim 2, wherein after the step of the cloud server inputting the perturbation weight to the first target model in the model ring of the cloud server and obtaining the third weight of the second target model of the model ring, the method further comprises:
the cloud server acquires the first target model;
the cloud server takes the first target model as the second target model, and executes the step of obtaining the third weight of the second target model of the model ring.
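Claims 2 and 3 describe how the cloud server answers an upload using its model ring. The sketch below is one plausible reading only: the ModelRing class, the moving-average update and the mix parameter are hypothetical, since the claims fix only that the perturbation weight is fed to the first target model, that the returned third weight comes from the previous model in the ring, and that the first target model then plays the role of the second target model for the next exchange.

```python
import numpy as np

class ModelRing:
    """Illustrative cloud-server model ring (one reading of claims 2-3):
    several global models sit on a ring; a participant's perturbed upload is
    fed to the current ("first") target model, and the weights of the
    previous ("second") target model are returned as the third weight."""

    def __init__(self, initial_weights):
        self.models = [np.array(w, dtype=float) for w in initial_weights]
        self.idx = 0   # index of the current first target model

    def exchange(self, perturbation_weight, mix=0.5):
        first = self.idx
        # Input the perturbation weight to the first target model; a simple
        # moving average stands in for the unspecified update rule.
        self.models[first] = ((1.0 - mix) * self.models[first]
                              + mix * np.asarray(perturbation_weight, dtype=float))

        # The second target model is the previous model in the ring; its
        # weights are the third weight sent back to the participant.
        second = (first - 1) % len(self.models)
        third_weight = self.models[second].copy()

        # Claim 3 reading: for the next exchange the old first target model
        # plays the role of the second target model, so the pointer advances.
        self.idx = (first + 1) % len(self.models)
        return third_weight

# Example: a ring of three 2-dimensional models.
ring = ModelRing([np.zeros(2), np.ones(2), np.full(2, 2.0)])
print(ring.exchange(np.array([0.4, -0.2])))   # returns the previous model's weights
```

Because each participant only ever sees the weights of a model other than the one its own update went into, no single returned model directly reflects that participant's contribution.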
4. The data privacy protection method of claim 2, wherein before the step of the cloud server inputting the perturbation weight to the first target model in the model ring of the cloud server and obtaining the third weight of the second target model of the model ring, the method further comprises:
the cloud server acquires non-private data;
the cloud server initializes target models in the model ring based on the non-private data, the target models including the first target model and the second target model.
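Claim 4 requires only that the target models in the ring are initialized from non-private data before any participant weights arrive. A hedged sketch of one way to do that follows, reusing the same toy linear model as above; the random restarts and the gradient-descent pre-training are assumptions, not part of the claim.

```python
import numpy as np

def init_model_ring(num_models, X_public, y_public, lr=0.01, steps=200, seed=0):
    """Illustrative initialisation of the model ring from non-private data
    (claim 4): every target model is pre-trained on the same public dataset
    from a different random start, so no participant data is ever touched."""
    rng = np.random.default_rng(seed)
    ring_weights = []
    for _ in range(num_models):
        w = rng.normal(scale=0.1, size=X_public.shape[1])   # random start
        for _ in range(steps):                              # plain gradient descent
            grad = X_public.T @ (X_public @ w - y_public) / len(y_public)
            w -= lr * grad
        ring_weights.append(w)
    return ring_weights

# Example: initialise a ring of three models from a public toy dataset.
X_pub = np.random.default_rng(1).normal(size=(128, 2))
y_pub = X_pub @ np.array([0.5, -0.3])
print([w.round(3) for w in init_model_ring(3, X_pub, y_pub)])
```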
5. The data privacy protection method of claim 1, wherein the step of the participant determining, based on the second weight, the weight importance and the perturbation mechanism, the perturbation weight obtained by perturbing the second weight and sending the perturbation weight to the cloud server comprises:
the participant normalizes the weight importance based on the weight importance and the perturbation mechanism, and determines a weight normalization result;
the participant acquires a total privacy budget, and determines a privacy budget corresponding to the weight importance based on the weight normalization result and the total privacy budget;
the participant determines, based on the second weight and the privacy budget, the perturbation weight obtained by perturbing the second weight.
6. The data privacy protection method of claim 5, wherein the step of the participant determining, based on the second weight and the privacy budget, the perturbation weight obtained by perturbing the second weight comprises:
the participant perturbs the second weight based on the privacy budget and a differential privacy mechanism, and determines the perturbation weight obtained by perturbing the second weight.
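Claims 5 and 6 together describe an importance-aware differential-privacy step: normalize the weight importance, derive a per-weight privacy budget from the total budget, and perturb each weight under its own budget. The sketch below assumes a proportional allocation (more important weights receive a larger budget share and therefore less Laplace noise) and a fixed sensitivity; neither choice is fixed by the claims.

```python
import numpy as np

def allocate_and_perturb(second_weight, importance, total_epsilon,
                         sensitivity=1.0, rng=None):
    """Normalise importance, split the privacy budget, and perturb each weight
    with Laplace noise calibrated to its own budget (claims 5-6, as read here)."""
    if rng is None:
        rng = np.random.default_rng(0)
    second_weight = np.asarray(second_weight, dtype=float)
    importance = np.asarray(importance, dtype=float)

    # Claim 5: the weight normalisation result.
    norm_importance = importance / importance.sum()

    # Claim 5: a per-weight privacy budget derived from the total budget
    # (the proportional rule itself is an assumption).
    eps = np.maximum(total_epsilon * norm_importance, 1e-6)  # guard zero budget

    # Claim 6: differential-privacy perturbation; Laplace noise with scale
    # sensitivity / epsilon is the classic epsilon-DP mechanism.
    return second_weight + rng.laplace(0.0, sensitivity / eps)

# Example: three weights, the middle one judged twice as important.
print(allocate_and_perturb([0.2, -0.5, 0.1], [1.0, 2.0, 1.0], total_epsilon=1.0))
```

With the Laplace mechanism, noise of scale sensitivity/ε satisfies ε-differential privacy for a quantity of that sensitivity, which is why the per-weight budget share directly controls how strongly each weight is distorted.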
7. The data privacy protection method of claim 1, wherein the step of determining, by the participant, the weight importance corresponding to the second weight based on the second weight and a weight importance algorithm comprises:
the participant determines a neuron importance of the local model based on the second weight and the weight importance algorithm;
the participant determines a weight importance corresponding to the second weight based on the neuron importance.
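Claim 7 fixes only the order of operations: a neuron-level importance is computed first and then mapped back to a weight-level importance. The sketch below uses the mean absolute incoming weight as the neuron measure purely for illustration; the actual importance algorithm is not specified in the claim.

```python
import numpy as np

def neuron_and_weight_importance(weight_matrix):
    """Illustrative reading of claim 7 for one fully connected layer:
    derive a per-neuron importance first, then map it to every weight."""
    W = np.asarray(weight_matrix, dtype=float)   # shape: (inputs, neurons)

    # Neuron importance: average magnitude of the weights feeding each neuron
    # (an assumed measure, not the patent's algorithm).
    neuron_importance = np.abs(W).mean(axis=0)   # shape: (neurons,)

    # Weight importance: each weight inherits the importance of the neuron
    # it connects to (broadcast across the input dimension).
    weight_importance = np.broadcast_to(neuron_importance, W.shape).copy()
    return neuron_importance, weight_importance

# Example: a 3-input, 2-neuron layer.
W = np.array([[0.2, -1.0],
              [0.1,  0.5],
              [0.3, -0.4]])
ni, wi = neuron_and_weight_importance(W)
print(ni)   # per-neuron importance
print(wi)   # per-weight importance, same shape as W
```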
8. The data privacy protection method of any one of claims 1 to 7, wherein the step of determining the second weight of each neuron in the local model after the iteration of the local model is completed comprises:
the participant obtains the number of iteration steps of the local model;
if the participant detects that the number of iteration steps reaches a preset number of steps, the participant determines the second weight of each neuron in the local model after the iteration of the local model is completed.
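Claim 8 reduces to a stopping condition on the local training loop: the second weight is read out only once a preset number of iteration steps has been reached. A tiny sketch, with a hypothetical single-step update rule, follows.

```python
def train_until_preset_steps(weights, step_fn, preset_steps):
    """Illustrative reading of claim 8: count local iteration steps and only
    extract the second weight once the preset step count is reached.
    `step_fn` is any single-iteration update rule."""
    iteration_step = 0
    while iteration_step < preset_steps:
        weights = step_fn(weights)
        iteration_step += 1
    # Iteration complete: `weights` now plays the role of the second weight.
    return weights

# Example: a trivial one-parameter update rule halving the weight each step.
print(train_until_preset_steps(1.0, lambda w: w * 0.5, preset_steps=5))
```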
9. A data privacy protection apparatus, characterized in that the data privacy protection apparatus comprises: a memory, a processor, and a data privacy protection program stored in the memory and executable on the processor, wherein the data privacy protection program, when executed by the processor, implements the steps of the data privacy protection method according to any one of claims 1 to 8.
10. A computer-readable storage medium, on which a data privacy protection program is stored, wherein the data privacy protection program, when executed by a processor, implements the steps of the data privacy protection method according to any one of claims 1 to 8.
CN202010029622.2A 2020-01-10 2020-01-10 Data privacy protection method and device and computer readable storage medium Active CN111241582B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010029622.2A CN111241582B (en) 2020-01-10 2020-01-10 Data privacy protection method and device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010029622.2A CN111241582B (en) 2020-01-10 2020-01-10 Data privacy protection method and device and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN111241582A true CN111241582A (en) 2020-06-05
CN111241582B CN111241582B (en) 2022-06-10

Family

ID=70880828

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010029622.2A Active CN111241582B (en) 2020-01-10 2020-01-10 Data privacy protection method and device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN111241582B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109492428A (en) * 2018-10-29 2019-03-19 南京邮电大学 A kind of difference method for secret protection towards principal component analysis
CN109684855A (en) * 2018-12-17 2019-04-26 电子科技大学 A kind of combined depth learning training method based on secret protection technology
CN109871702A (en) * 2019-02-18 2019-06-11 深圳前海微众银行股份有限公司 Federal model training method, system, equipment and computer readable storage medium
CN109902506A (en) * 2019-01-08 2019-06-18 中国科学院软件研究所 A kind of local difference private data sharing method and system of more privacy budgets
CN110084380A (en) * 2019-05-10 2019-08-02 深圳市网心科技有限公司 A kind of repetitive exercise method, equipment, system and medium
CN110443063A (en) * 2019-06-26 2019-11-12 电子科技大学 The method of the federal deep learning of self adaptive protection privacy

Also Published As

Publication number Publication date
CN111241582B (en) 2022-06-10

Similar Documents

Publication Publication Date Title
WO2021047593A1 (en) Method for training recommendation model, and method and apparatus for predicting selection probability
US20230281448A1 (en) Method and apparatus for information recommendation, electronic device, computer readable storage medium and computer program product
JP7157154B2 (en) Neural Architecture Search Using Performance Prediction Neural Networks
US10474950B2 (en) Training and operation of computational models
JP7439151B2 (en) neural architecture search
CN110520871A (en) Training machine learning model
WO2019018375A1 (en) Neural architecture search for convolutional neural networks
CN109690576A (en) The training machine learning model in multiple machine learning tasks
CN111602148A (en) Regularized neural network architecture search
US11922281B2 (en) Training machine learning models using teacher annealing
CN109918684A (en) Model training method, interpretation method, relevant apparatus, equipment and storage medium
WO2021151336A1 (en) Road image target detection method based on attentional mechanism and related device
CN111667308A (en) Advertisement recommendation prediction system and method
EP4187440A1 (en) Classification model training method, hyper-parameter searching method, and device
WO2021174877A1 (en) Processing method for smart decision-based target detection model, and related device
CN106803092B (en) Method and device for determining standard problem data
CN114417174B (en) Content recommendation method, device, equipment and computer storage medium
CN110580171B (en) APP classification method, related device and product
CN115238909A (en) Data value evaluation method based on federal learning and related equipment thereof
CN116258657A (en) Model training method, image processing device, medium and electronic equipment
CN111178082A (en) Sentence vector generation method and device and electronic equipment
CN112446462A (en) Generation method and device of target neural network model
US20190324606A1 (en) Online training of segmentation model via interactions with interactive computing environment
CN114281976A (en) Model training method and device, electronic equipment and storage medium
CN111241582B (en) Data privacy protection method and device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant