CN114780999A - Deep learning data privacy protection method, system, equipment and medium - Google Patents

Deep learning data privacy protection method, system, equipment and medium

Info

Publication number
CN114780999A
CN114780999A (application number CN202210700710.XA)
Authority
CN
China
Prior art keywords
noise
privacy
data
data set
training data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210700710.XA
Other languages
Chinese (zh)
Other versions
CN114780999B (en)
Inventor
郑飞州
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Zhongping Intelligent Technology Co ltd
Original Assignee
Guangzhou Zhongping Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Zhongping Intelligent Technology Co ltd filed Critical Guangzhou Zhongping Intelligent Technology Co ltd
Priority to CN202210700710.XA priority Critical patent/CN114780999B/en
Publication of CN114780999A publication Critical patent/CN114780999A/en
Application granted granted Critical
Publication of CN114780999B publication Critical patent/CN114780999B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245Protecting personal data, e.g. for financial or medical purposes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/002D [Two Dimensional] image generation
    • G06T11/40Filling a planar surface by adding surface attributes, e.g. colour or texture
    • G06T3/04

Abstract

The present disclosure relates to a method, system, device and medium for deep learning data privacy protection, the method comprising the steps of: loading an original training data set and a deep learning model; giving privacy protection weights to the privacy information in the original training data set and constructing a privacy importance matrix; configuring the global noise strength and generator parameters for the training data to construct a noise generator; training the noise generator according to a loss function constructed from the privacy importance matrix; adding noise to all original training data in the original training data set through the noise generator to generate a noise-added data set; and training the deep learning model with the noise-added data set to form a deep learning model with privacy protection characteristics. The method constructs an objective function and a parameter training method for the noise generator, minimizes the model performance difference while maximizing the noise intensity added to the training data, and automatically balances model usability against privacy protection strength.

Description

Deep learning data privacy protection method, system, equipment and medium
Technical Field
The present disclosure relates to the field of deep learning models, and in particular, to a method, system, device, and medium for protecting deep learning data privacy.
Background
In recent years, driven by theoretical advances in deep learning and intensive research across many fields, commercial applications based on deep learning technology have reached numerous industries and created immense value. To accelerate deep learning research and application, many enterprises and research institutions publish the deep learning models they have built. Constructing a deep learning model relies on large amounts of training data, which may involve personal privacy and business secrets. Whether a published deep learning model can leak its training data has therefore attracted attention. Recent research has demonstrated that published deep learning models risk revealing training data in some scenarios. For example, a model inversion attack can use a generative network and regularization terms in the loss function to construct representative training data for a specified label, which can reveal the distribution of the training data to a certain extent, and on a face recognition model can even directly recover private photos related to the training data. As another example, in a model update attack, a published deep learning model may need to be updated continually as the distribution of the training data set shifts or expands, and such parameter updates can be inverted to recover information about the new data set used to update the model. Data leakage from deep learning models not only exposes personal privacy and business secrets but may also raise legal problems. An effective defense mechanism is therefore needed to mitigate the data leakage problems described above.
To address data leakage in deep learning models, several mainstream defensive measures currently exist, such as differential-privacy machine learning and adding noise to training data. In differential-privacy machine learning, the mainstream approach limits the variability of model outputs by adding noise to the parameters, thereby preventing differential attacks. Against model inversion attacks and model update attacks, however, these protective measures face a difficult trade-off between model availability and the strength of data privacy protection: protecting data as strongly as possible may degrade model performance (e.g., accuracy) to an unacceptable degree. In addition, they lack intelligibility: once differential privacy is introduced, it is difficult to show visually which training data information is protected, for example whether the face or the background of a picture receives more privacy protection, even though privacy concerns differ across different parts of image data or other data; providing such intelligibility would help data providers control the immediate risk of privacy disclosure. As for adding noise to training data, Chinese patent CN113889232A, 2022-01-04 ("Privacy protection method based on medical image [P]", Guorihong, Saifengun, Songho, Tengten, and WangCarichi), for example, extracts information from the personal privacy region of the original training image, transforms the pixel values of the extracted region using the complex sequence generated by iterating a Logistic chaotic system, embeds the transformation results of all text regions back into the original image, and scrambles the encrypted image as a whole, so that the personal privacy region is protected in encrypted form. The problem with this approach is that the noise-transformed training data can have an uncontrollable influence on the construction of the deep learning model, for example reducing model usability by masking too much useful information.
The present disclosure addresses the above problems and the drawbacks of the corresponding solutions. The defense scheme provided by the disclosure adds personalized noise to the training data, so the intelligibility requirement can be met: the converted data can be visualized directly, making it convenient to control and examine the privacy protection condition. Furthermore, differentiated privacy protection strength can be realized by introducing a privacy importance matrix that specifies key protection areas in the training data. The noise generator in the defense scheme is built with deep learning technology, and its loss function takes model performance into account, so that the noise intensity added to the training data can be maximized while the model performance difference is minimized.
Disclosure of Invention
The present disclosure provides a deep learning data privacy protection method, system, device, and medium, which can solve the problems mentioned in the background art. In order to solve the technical problem, the present disclosure provides the following technical solutions:
as an aspect of the embodiments of the present disclosure, a method for protecting deep learning data privacy is provided, which includes the following steps:
loading an original training data set and a deep learning model;
giving privacy protection weight to the privacy information in the original training data set, and constructing a privacy importance matrix;
configuring global noise strength and generator parameters in the training data to construct a noise generator; training the noise generator according to a loss function constructed by the privacy importance matrix;
all original training data in the original training data set are subjected to noise adding through a noise generator to generate a noise adding data set;
training the deep learning model using a noisy data set to form a deep learning model with privacy preserving features.
Optionally, the specific step of training the noise generator according to the loss function constructed by the privacy importance matrix is as follows:
selecting a group of original training data characteristics in the original training data set;
calculating a loss value of the noise generator according to the loss function;
calculating a derivative of the loss value to a parameter of a noise generator;
updating parameters of a noise generator according to the derivative;
and repeating the steps until the specified iteration times are reached.
Optionally, the loss function is as follows:
L(θ) = ||F(x + ε·M⊙G_θ(x)) − F(x)||² − ||ε·M⊙G_θ(x)||²
wherein F is the deep learning model trained with the original training data set, x is a selected training data feature, F(x) is the model output for input x after the softmax function, G is the noise generator with parameters θ, G_θ(x) represents the data noise generated from the input x, ε is the global noise strength, and M is the privacy importance matrix.
Optionally, before training the deep learning model by using the noisy data set, the method further includes the following steps: the deep learning model is trained using the raw training data.
Optionally, after all the original training data in the original training data set is subjected to noise addition by the noise generator to generate a noise-added data set, the method further includes the following steps:
and visualizing the noise data in the noise data set, and adjusting the privacy importance matrix and/or the parameters of the noise generator according to the privacy protection condition of the noise data.
Optionally, the specific steps of constructing the privacy importance matrix are as follows:
constructing a privacy importance matrix by marking weights in the features or attributes of the original training data with global key privacy protection;
or, giving privacy protection weights to part of key areas in the original training data manually to construct a corresponding privacy importance matrix.
As another aspect of the disclosed embodiments, there is provided a deep learning data privacy protection system, including:
the resource loading module loads an original training data set and a deep learning model;
the privacy importance configuration module is used for endowing privacy protection weight to the privacy information in the original training data set and constructing a privacy importance matrix;
a noise generator construction module that configures global noise strength and generator parameters in the training data to construct a noise generator; training the noise generator according to a loss function constructed by the privacy importance matrix;
the data conversion module is used for carrying out noise addition on all original training data in the original training data set through the noise generator to generate a noise addition data set;
and the model building module is used for training the deep learning model by using the noise-added data set so as to form the deep learning model with the privacy protection characteristic.
Optionally, the system further includes a data visualization module, where the data visualization module is configured to visualize the noisy data in the noisy data set, and adjust the privacy importance matrix and/or the parameter of the noise generator according to the privacy protection condition of the noisy data.
As another aspect of the embodiments of the present disclosure, an electronic device is provided, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the processor implements the above-mentioned deep learning data privacy protection method.
As another aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of the deep learning data privacy protection method described above.
According to the method, by adjusting the privacy importance matrix and the noise intensity, personalized privacy protection strength is given to different information in the training data set, realizing a personalized noise-adding function on the original data set. At the same time, because personalized noise is added to the original data, good visualization can be achieved, and the user can check the privacy protection condition and make adjustments according to the visualization result. The present disclosure constructs an objective function and a parameter training method for the noise generator in which the model performance difference is minimized while the noise intensity added to the training data is maximized, and model usability and privacy protection strength are automatically balanced by exploiting the strong expressive power of deep learning. The specific technical effects are as follows:
1) balancing model usability and privacy protection strengths
At present, some defense mechanisms mainly adjust the noise intensity manually, which makes it difficult to take model performance properly into account. The construction method of the noise generator in this disclosure can achieve a balance between model usability and privacy protection strength.
2) Satisfying differentiated privacy protection policies for data
A privacy importance matrix is provided, with which the user can give higher privacy weight to the key privacy protection features or attributes of the data; this is reflected in the privacy importance matrix and influences the noise produced by the noise generator, providing differentiated privacy protection strength at the specified features or attributes.
3) Monitoring and managing actual privacy protection condition of data
The noise generator is mainly responsible for generating a corresponding noise matrix for each input datum, which is added to the input data to form the noise-added data. The noise-added data can therefore be compared directly with the input data, making it convenient for the user to check how the input data has been changed and to judge whether the requirement is met so as to make new adjustments.
Drawings
Fig. 1 is a flowchart of a deep learning data privacy protection method according to embodiment 1 of the present disclosure;
fig. 2 is a flowchart of a specific implementation of step S30 according to embodiment 1 of the present disclosure;
fig. 3 is a schematic block diagram of a deep learning data privacy protection system.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration. Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The term "and/or" herein is merely an association relationship describing an associated object, and means that there may be three relationships, for example, a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality, for example, including at least one of A, B, C, and may mean including any one or more elements selected from the group consisting of A, B and C.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
It is understood that the above-mentioned method embodiments of the present disclosure can be combined with each other to form combined embodiments without departing from the underlying principles and logic; for reasons of space, the details are not repeated in this disclosure.
In addition, the present disclosure also provides a deep learning data privacy protection system, an electronic device, a computer-readable storage medium, and a program, which can be used to implement any one of the deep learning data privacy protection methods provided by the present disclosure, and the corresponding technical solutions and descriptions and corresponding descriptions in the methods section are not repeated.
The execution subject of the deep-learning data privacy protection method may be a computer or other apparatus capable of implementing deep-learning data privacy protection, for example, the method may be executed by a terminal device or a server or other processing device, where the terminal device may be a User Equipment (UE), a mobile device, a User terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, or the like. In some possible implementations, the deep-learning data privacy protection method may be implemented by a processor invoking computer readable instructions stored in a memory.
Example 1
As an aspect of the embodiments of the present disclosure, there is provided a deep learning data privacy protection method, as shown in fig. 1, including the following steps:
s10, loading an original training data set and a deep learning model;
s20, endowing privacy protection weight to the privacy information in the original training data set, and constructing a privacy importance matrix;
s30, configuring global noise strength and generator parameters in the training data to construct a noise generator; training the noise generator according to a loss function constructed by the privacy importance matrix;
s40, all original training data in the original training data set are subjected to noise adding through a noise generator to generate a noise adding data set;
s60, training the deep learning model by using the noise-added data set to form the deep learning model with the privacy protection characteristic.
Based on this configuration, the embodiment of the disclosure gives personalized privacy protection strength to different information in the training data set by adjusting the privacy importance matrix and the noise strength, realizes a personalized noise-adding function on the original data set, and can achieve a balance between model availability and privacy protection strength. A privacy importance matrix is provided, with which the user can give higher privacy weight to the key privacy protection features or attributes of the data; this influences the noise produced by the noise generator, providing differentiated privacy protection strength at the specified features or attributes.
The steps of the disclosed embodiments are described in detail below.
S10, loading an original training data set and a deep learning model;
This step loads the system resources, i.e., it loads into the system the resources required for constructing the deep learning model, including the original training data set, the structure of the deep learning model and the structure of the noise generator. It also completes the preprocessing of the original training data set and the initialization of the parameters of the model and the generator.
S20, giving privacy protection weight to the privacy information in the original training data set, and constructing a privacy importance matrix;
All elements of the resulting privacy importance matrix are greater than or equal to 0 and sum to 1. Two configuration modes can be used: in both, the user gives higher privacy weight to the key privacy protection features or attributes of the data and reflects this in the privacy importance matrix, thereby influencing the noise produced by the noise generator to provide differentiated privacy protection strength at the specified features or attributes.
In some embodiments, automatic configuration may be employed: the user specifies the features or attributes of the training data that require global key privacy protection, and the system automatically labels the relevant features or attributes with higher weights and constructs the corresponding privacy importance matrix.
In some embodiments, manual configuration may be used: the user manually gives higher privacy protection weights to selected key areas of the data and constructs the corresponding privacy importance matrix.
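By way of a non-limiting sketch, a privacy importance matrix for image data could be assembled as follows (Python/NumPy). The box-based marking of key regions, the particular weight values and the function name are illustrative assumptions; only the constraint that all elements are non-negative and sum to 1 is taken from this disclosure.

```python
import numpy as np

def build_privacy_importance_matrix(height, width, key_regions, key_weight=9.0, base_weight=1.0):
    """Assemble a privacy importance matrix M for an image of size (height, width).

    key_regions: list of (top, left, bottom, right) boxes needing stronger privacy
    protection (e.g. detected face areas). All elements of M are >= 0 and sum to 1,
    matching the constraint stated in this embodiment.
    """
    M = np.full((height, width), base_weight, dtype=np.float64)
    for top, left, bottom, right in key_regions:
        M[top:bottom, left:right] = key_weight   # higher weight on key privacy areas
    return M / M.sum()                           # normalize so all elements sum to 1

# Example: give an (assumed) face box in a 224x224 image a higher privacy weight.
M = build_privacy_importance_matrix(224, 224, key_regions=[(40, 60, 120, 160)])
```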
S30, configuring global noise strength and generator parameters in the training data to construct a noise generator; training the noise generator according to a loss function constructed by the privacy importance matrix;
In this embodiment, constructing the noise generator includes a resource configuration step that provides the configuration options, i.e., the hyper-parameters required for training the generator, which may include: the global noise strength ε, the number of iterations T, the parameter update rate λ, and the like.
In this embodiment, the method further includes a training step of the noise generator:
The goal of generator training is to maximize the noise intensity added to the training data while minimizing the model performance difference. Here, the model performance difference refers to the difference between the performance (e.g., accuracy) of the model trained with the original training data and that of the model trained with the noise-added training data. The constructed generator objective function (also called loss function) is shown in formula (1).
L(θ) = ||F(x + ε·M⊙G_θ(x)) − F(x)||² − ||ε·M⊙G_θ(x)||²   (1)
In formula (1), F is the deep learning model trained with the original training data set, x is a selected training data feature, F(x) is the model output for input x after the softmax function, G is the noise generator with parameters θ, G_θ(x) represents the data noise generated from the input x, ε is the global noise strength, and M is the privacy importance matrix. The first term ||F(x + ε·M⊙G_θ(x)) − F(x)||² minimizes the model performance difference by reducing the difference between the model output on the noise-added data x + ε·M⊙G_θ(x) and on the original data x; the second term −||ε·M⊙G_θ(x)||² aims to maximize the noise strength ε·M⊙G_θ(x).
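A minimal PyTorch sketch of a loss of this form is given below, following the reconstructed formula (1). The squared-error measure of the output difference, the squared norm of the noise term, and the assumption that the generator maps x to noise of the same shape (with M a torch tensor broadcastable to x) are illustrative choices rather than details fixed by this disclosure.

```python
import torch
import torch.nn.functional as F_nn

def generator_loss(model, generator, x, eps, M):
    """Loss in the spirit of formula (1): keep the model output on noise-added data
    close to its output on x (first term) while maximizing the weighted noise
    strength (second term, subtracted)."""
    noise = eps * M * generator(x)               # eps * M (element-wise) * G_theta(x)
    y_clean = F_nn.softmax(model(x), dim=-1)     # F(x) after softmax
    y_noisy = F_nn.softmax(model(x + noise), dim=-1)
    output_gap = ((y_noisy - y_clean) ** 2).sum(dim=-1).mean()  # model performance difference
    noise_strength = (noise ** 2).mean()                        # strength of the added noise
    return output_gap - noise_strength
```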
In some embodiments, as shown in fig. 2, the specific steps of training the noise generator according to the loss function constructed by the privacy importance matrix in step S30 are as follows:
S301: randomly selecting a group of original training data features x;
S302: calculating the loss value L(θ) of the noise generator according to formula (1);
S303: calculating the derivative ∂L(θ)/∂θ of the loss value with respect to the noise generator parameters;
S304: updating the parameters of the noise generator: θ ← θ − λ·∂L(θ)/∂θ;
S305: repeating steps S301, S302, S303 and S304 until the specified number of iterations T is reached, thereby completing the construction of the noise generator.
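Steps S301-S305 amount to plain gradient descent on the generator parameters. A sketch under the same assumptions as the previous block (PyTorch, batches drawn from a standard data loader, and the generator_loss function defined above) might look as follows.

```python
import torch

def train_noise_generator(model, generator, data_loader, eps, M, T, lam):
    """Steps S301-S305: repeat loss evaluation and one gradient step on the generator
    parameters theta for T iterations, with parameter update rate lambda = lam."""
    model.eval()
    for p in model.parameters():
        p.requires_grad_(False)                  # F is fixed; only G_theta is updated
    optimizer = torch.optim.SGD(generator.parameters(), lr=lam)
    data_iter = iter(data_loader)
    for _ in range(T):
        try:
            x, _ = next(data_iter)               # S301: select a group of training features x
        except StopIteration:
            data_iter = iter(data_loader)
            x, _ = next(data_iter)
        loss = generator_loss(model, generator, x, eps, M)  # S302: loss value L(theta)
        optimizer.zero_grad()
        loss.backward()                          # S303: derivative of L(theta) w.r.t. theta
        optimizer.step()                         # S304: theta <- theta - lambda * dL/dtheta
    return generator
```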
S40, all original training data in the original training data set are subjected to noise adding through a noise generator to generate a noise adding data set;
Each original training datum x is processed by the noise generator to generate the corresponding noise matrix ε·M⊙G_θ(x), forming the corresponding noise-added datum x + ε·M⊙G_θ(x).
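Under the same assumptions, the data conversion of step S40 can be sketched as follows; the data-loader format and tensor shapes are illustrative.

```python
import torch

@torch.no_grad()
def build_noisy_dataset(generator, data_loader, eps, M):
    """Step S40: pass every original training datum x through the noise generator and
    form the noise-added datum x + eps * M * G_theta(x)."""
    generator.eval()
    noisy_batches, label_batches = [], []
    for x, y in data_loader:
        noise = eps * M * generator(x)           # corresponding noise matrix
        noisy_batches.append(x + noise)          # noise-added data
        label_batches.append(y)
    return torch.cat(noisy_batches), torch.cat(label_batches)
```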
S60, training the deep learning model by using the noise-added data set to form the deep learning model with the privacy protection characteristic.
Wherein the deep learning model is trained using the noisy data in the noisy data set in S40 until the training is completed.
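The final training step itself is ordinary supervised training. One possible sketch, taking the tensors produced by the data-conversion sketch above and assuming a classification task with cross-entropy loss (an illustrative assumption, since the disclosure does not fix the task or the training loss), is:

```python
import torch
import torch.nn.functional as F_nn
from torch.utils.data import DataLoader, TensorDataset

def train_on_noisy_data(model, noisy_x, labels, epochs=10, lr=1e-3, batch_size=64):
    """Step S60: train the deep learning model on the noise-added data set."""
    loader = DataLoader(TensorDataset(noisy_x, labels), batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for xb, yb in loader:
            loss = F_nn.cross_entropy(model(xb), yb)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```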
In some embodiments, before training the deep learning model using the noisy dataset, the method further comprises the steps of: the deep learning model is trained using the raw training data.
In some embodiments, after all the original training data in the original training data set are subjected to noise addition by the noise generator to generate the noise added data set, the method further comprises the following steps:
S50, visualizing the noise-added data in the noise-added data set, and adjusting the privacy importance matrix and/or the parameters of the noise generator according to the privacy protection condition of the noise-added data. Because personalized noise is added to the original data, good visualization can be achieved, and the user can check the privacy protection condition and make adjustments according to the visualization result. The noise generator generates a corresponding noise matrix for each input datum, which is added to the input to form the noise-added datum, so the noise-added data can be compared directly with the input data; this makes it convenient for the user to check how the input data has been changed and to judge whether the requirement is met so as to make new adjustments.
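As one possible realization of step S50 for image data, the following matplotlib sketch places original and noise-added samples side by side; it assumes CHW image tensors with values roughly in [0, 1], which is an illustrative choice rather than a requirement of this disclosure.

```python
import matplotlib.pyplot as plt

def compare_samples(original, noisy, n=4):
    """Step S50 (illustrative): show original and noise-added images side by side so
    the privacy protection effect can be inspected visually."""
    fig, axes = plt.subplots(2, n, figsize=(3 * n, 6))
    for i in range(n):
        for row, batch, title in ((0, original, "original"), (1, noisy, "noise-added")):
            img = batch[i].detach().cpu().permute(1, 2, 0).clamp(0, 1).numpy()
            axes[row, i].imshow(img)
            axes[row, i].set_title(title)
            axes[row, i].axis("off")
    plt.tight_layout()
    plt.show()
```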
Example 2
As another aspect of the embodiments of the present disclosure, there is provided a deep-learning data privacy protection system 100, as shown in fig. 3, including:
the resource loading module 1 loads an original training data set and a deep learning model;
the privacy importance configuration module 2 is used for endowing privacy protection weights to the privacy information in the original training data set and constructing a privacy importance matrix;
a noise generator construction module 3, which configures global noise intensity and generator parameters in the training data to construct a noise generator; training the noise generator according to a loss function constructed by the privacy importance matrix;
the data conversion module 4 is used for carrying out noise addition on all original training data in the original training data set through a noise generator to generate a noise addition data set;
and the model building module 6 is used for training the deep learning model by using the noise-added data set to form the deep learning model with the privacy protection characteristic.
In some embodiments, the system 100 further comprises a data visualization module 5 configured to visualize the noise-added data in the noise-added data set and adjust the privacy importance matrix and/or the parameters of the noise generator according to the privacy protection condition of the noise-added data. Because personalized noise is added to the original data, good visualization can be achieved, and the user can check the privacy protection condition and make adjustments according to the visualization result. The noise generator generates a corresponding noise matrix for each input datum, which is added to the input to form the noise-added datum, so the noise-added data can be compared directly with the input data; this makes it convenient for the user to check how the input data has been changed and to judge whether the requirement is met so as to make new adjustments.
Each module of the embodiments of the present disclosure is described in detail below.
The resource loading module 1 loads an original training data set and a deep learning model;
The loading here is the loading of system resources, i.e., the module is responsible for loading into the system the resources required for constructing the deep learning model, including the original training data set, the structure of the deep learning model and the structure of the noise generator. It also completes the preprocessing of the original training data set and initializes the parameters of the model and the generator.
The privacy importance configuration module 2 is used for endowing privacy protection weights to the privacy information in the original training data set and constructing a privacy importance matrix;
All elements of the resulting privacy importance matrix are greater than or equal to 0 and sum to 1. Two configuration modes can be used: in both, the user gives higher privacy weight to the key privacy protection features or attributes of the data and reflects this in the privacy importance matrix, thereby influencing the noise produced by the noise generator to provide differentiated privacy protection strength at the specified features or attributes.
An automatic configuration mode can be adopted: the user specifies the features or attributes of the training data that require global key privacy protection, and the module automatically labels the relevant features or attributes with higher weights and constructs the corresponding privacy importance matrix.
In some embodiments a manual configuration mode may be employed: the user manually gives higher privacy protection weights to selected key areas of the data and constructs the corresponding privacy importance matrix.
A noise generator construction module 3, which configures global noise intensity and generator parameters in the training data to construct a noise generator; training the noise generator according to a loss function constructed by the privacy importance matrix;
In this embodiment, constructing the noise generator includes a resource configuration step that provides the configuration options, i.e., the hyper-parameters required for training the generator, which may include: the global noise strength ε, the number of iterations T, the parameter update rate λ, and the like.
In this embodiment, the goal of generator training is to maximize the noise intensity added to the training data while minimizing the model performance difference. Here, the model performance difference refers to the difference between the performance (e.g., accuracy) of the model trained with the original training data and that of the model trained with the noise-added training data. The constructed generator objective function (also called loss function) is shown in formula (1).
L(θ) = ||F(x + ε·M⊙G_θ(x)) − F(x)||² − ||ε·M⊙G_θ(x)||²   (1)
In formula (1), F is the deep learning model trained with the original training data set, x is a selected training data feature, F(x) is the model output for input x after the softmax function, G is the noise generator with parameters θ, G_θ(x) represents the data noise generated from the input x, ε is the global noise strength, and M is the privacy importance matrix. The first term ||F(x + ε·M⊙G_θ(x)) − F(x)||² minimizes the model performance difference by reducing the difference between the model output on the noise-added data x + ε·M⊙G_θ(x) and on the original data x; the second term −||ε·M⊙G_θ(x)||² aims to maximize the noise strength ε·M⊙G_θ(x).
In some embodiments, the specific steps of training the noise generator according to the loss function constructed by the privacy importance matrix in the noise generator construction module 3 are:
S301: randomly selecting a group of original training data features x;
S302: calculating the loss value L(θ) of the noise generator according to formula (1);
S303: calculating the derivative ∂L(θ)/∂θ of the loss value with respect to the noise generator parameters;
S304: updating the parameters of the noise generator: θ ← θ − λ·∂L(θ)/∂θ;
S305: repeating steps S301, S302, S303 and S304 until the specified number of iterations T is reached, thereby completing the construction of the noise generator.
The data conversion module 4 adds noise to all original training data in the original training data set through a noise generator to generate a noise-added data set;
Each original training datum x is processed by the noise generator to generate the corresponding noise matrix ε·M⊙G_θ(x), forming the corresponding noise-added datum x + ε·M⊙G_θ(x).
And the model building module 6 is used for training the deep learning model by using the noise-added data set to form the deep learning model with the privacy protection characteristic.
The deep learning model is trained using the noise-added data produced by the data conversion module 4 until training is complete.
In the following, as an illustration, the disclosed embodiment applies the above method to the construction of a deep learning animal classification model. In this exemplary embodiment, the purpose is to construct a deep learning animal classification model with a privacy protection effect: the model is intended to be released to the public, but its training data set is not intended to be leaked, because the data set used to train the classification model contains private information, such as the faces of animal breeders or spectators appearing in the animal images. Therefore, the method provided by the present disclosure is applied to construct the animal classification model. The specific application process is as follows:
1) a developer loads, through the resource loading module 1, the animal image data set used for training the classification model, the structure of the deep learning animal classification model and the structure of the noise generator, preprocesses the data set, and initializes the parameters of the model and the generator;
2) the classification model is trained with the original animal image data set to obtain a deep learning model trained on the original data set;
3) the developer gives higher privacy protection weight to privacy information such as human faces in the images through the privacy importance configuration module 2; all face information in the data set is located automatically using a face recognition model, and the corresponding privacy importance matrix is constructed;
4) the noise generator construction module 3 is run to complete the construction of the noise generator;
5) the original animal image data set is converted into a corresponding noise-added data set through the data conversion module 4, in which private information such as human faces carries higher-intensity noise;
6) the data visualization module 5 is used to check whether privacy information such as human faces is effectively masked in the noise-added images; if the effect is not good enough, the noise intensity of the noise generator is increased appropriately and a new noise generator is constructed;
7) the model building function of the model building module 6 is run: the deep learning animal classification model is trained with the noise-added data set, finally completing the construction of the animal classification model with the privacy protection characteristic.
Example 3
An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the deep learning data privacy protection method of embodiment 1 when executing the computer program.
Embodiment 3 of the present disclosure is merely an example, and should not bring any limitation to the function and the scope of use of the embodiments of the present disclosure.
The electronic device may be embodied in the form of a general purpose computing device, which may be, for example, a server device. Components of the electronic device may include, but are not limited to: at least one processor, at least one memory, and a bus connecting different system components (including the memory and the processor).
The buses include a data bus, an address bus, and a control bus.
The memory may include volatile memory, such as Random Access Memory (RAM) and/or cache memory, and may further include read-only memory (ROM).
The memory may also include program means having a set (at least one) of program modules including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which or some combination thereof may comprise an implementation of a network environment.
The processor executes various functional applications and data processing by executing computer programs stored in the memory.
The electronic device may also communicate with one or more external devices (e.g., keyboard, pointing device, etc.). Such communication may be through an input/output (I/O) interface. Also, the electronic device may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) via a network adapter. The network adapter communicates with other modules of the electronic device over the bus. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives, and data backup storage systems, etc.
It should be noted that although several units/modules or sub-units/modules of the electronic device are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, the features and functionality of two or more of the units/modules described above may be embodied in one unit/module, according to embodiments of the application. Conversely, the features and functions of one unit/module described above may be further divided and embodied by a plurality of units/modules.
Example 4
A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the deep-learning data privacy protection method of embodiment 1.
More specific examples that may be employed by the readable storage medium include, but are not limited to: a portable disk, hard disk, random access memory, read only memory, erasable programmable read only memory, optical storage device, magnetic storage device, or any suitable combination of the foregoing.
In a possible implementation, the present disclosure may also be implemented in the form of a program product including program code for causing a terminal device to perform the steps of implementing the deep learning data privacy protection method described in embodiment 1 when the program product is run on the terminal device.
Where program code for carrying out the disclosure is written in any combination of one or more programming languages, the program code may execute entirely on the user device, partly on the user device, as a stand-alone software package, partly on the user device and partly on a remote device or entirely on the remote device.
Although embodiments of the present disclosure have been shown and described, it will be appreciated by those skilled in the art that various changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the disclosure, the scope of which is defined in the appended claims and their equivalents.

Claims (10)

1. A deep learning data privacy protection method is characterized by comprising the following steps:
loading an original training data set and a deep learning model;
giving privacy protection weight to the privacy information in the original training data set, and constructing a privacy importance matrix;
configuring global noise strength and generator parameters in the training data to construct a noise generator; training the noise generator according to a loss function constructed by the privacy importance matrix;
all original training data in the original training data set are subjected to noise adding through a noise generator to generate a noise adding data set;
training the deep learning model using a noisy data set to form a deep learning model with privacy preserving features.
2. The method for protecting privacy of deep-learning data according to claim 1, wherein the specific steps of training the noise generator according to the loss function constructed by the privacy importance matrix are as follows:
selecting a group of original training data characteristics in the original training data set;
calculating a loss value of the noise generator according to the loss function;
calculating a derivative of the loss value to a parameter of a noise generator;
updating parameters of a noise generator according to the derivative;
and repeating the steps until the specified iteration times are reached.
3. The deep-learning data privacy preserving method of claim 1 or 2, wherein the loss function is as follows:
L(θ) = ||F(x + ε·M⊙G_θ(x)) − F(x)||² − ||ε·M⊙G_θ(x)||²
wherein F is a deep learning model trained with the original training data set, x is a selected training data feature, F(x) is the model output for input x after the softmax function, G is the noise generator with parameters θ, G_θ(x) represents the data noise generated from the input x, ε is the global noise strength, and M is the privacy importance matrix.
4. The method for protecting privacy of deep-learning data according to claim 1, wherein before training the deep-learning model using a noisy data set, further comprising the steps of: the deep learning model is trained using the raw training data.
5. The deep learning data privacy protection method of any one of claims 1-2 and 4, wherein after all original training data in an original training data set are subjected to noise addition by a noise generator to generate a noise added data set, the method further comprises the following steps:
and visualizing the noise data in the noise data set, and adjusting the privacy importance matrix and/or the parameters of the noise generator according to the privacy protection condition of the noise data.
6. The deep-learning data privacy protection method as claimed in any one of claims 1-2 and 4, wherein the specific steps of constructing the privacy importance matrix are as follows:
constructing a privacy importance matrix by marking weights in the features or attributes of the original training data with global key privacy protection;
or manually endowing privacy protection weights to partial key areas in the original training data to construct a corresponding privacy importance matrix.
7. A deep learning data privacy protection system, comprising:
the resource loading module loads an original training data set and a deep learning model;
the privacy importance configuration module is used for endowing privacy protection weight to the privacy information in the original training data set and constructing a privacy importance matrix;
a noise generator construction module that configures global noise strength and generator parameters in the training data to construct a noise generator; training the noise generator according to a loss function constructed by the privacy importance matrix;
the data conversion module is used for carrying out noise addition on all original training data in the original training data set through the noise generator to generate a noise addition data set;
and the model building module is used for training the deep learning model by using the noise-added data set so as to form the deep learning model with the privacy protection characteristic.
8. The deep-learning data privacy protection system of claim 7, further comprising a data visualization module configured to visualize the noisy data in the noisy data set and adjust the privacy importance matrix and/or parameters of the noise generator based on the privacy preserving condition of the noisy data.
9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the method for deep-learning data privacy protection of any one of claims 1 to 6 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method for deep-learning data privacy protection of any one of claims 1 to 6.
CN202210700710.XA 2022-06-21 2022-06-21 Deep learning data privacy protection method, system, equipment and medium Active CN114780999B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210700710.XA CN114780999B (en) 2022-06-21 2022-06-21 Deep learning data privacy protection method, system, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210700710.XA CN114780999B (en) 2022-06-21 2022-06-21 Deep learning data privacy protection method, system, equipment and medium

Publications (2)

Publication Number Publication Date
CN114780999A true CN114780999A (en) 2022-07-22
CN114780999B CN114780999B (en) 2022-09-27

Family

ID=82420315

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210700710.XA Active CN114780999B (en) 2022-06-21 2022-06-21 Deep learning data privacy protection method, system, equipment and medium

Country Status (1)

Country Link
CN (1) CN114780999B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210064760A1 (en) * 2019-09-03 2021-03-04 Microsoft Technology Licensing, Llc Protecting machine learning models from privacy attacks
WO2021204272A1 (en) * 2020-04-10 2021-10-14 支付宝(杭州)信息技术有限公司 Privacy protection-based target service model determination
US20210342453A1 (en) * 2020-04-29 2021-11-04 Robert Bosch Gmbh Private model utility by minimizing expected loss under noise
CN113642715A (en) * 2021-08-31 2021-11-12 西安理工大学 Differential privacy protection deep learning algorithm for self-adaptive distribution of dynamic privacy budget
CN114548373A (en) * 2022-02-17 2022-05-27 河北师范大学 Differential privacy deep learning method based on feature region segmentation

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115238827A (en) * 2022-09-16 2022-10-25 支付宝(杭州)信息技术有限公司 Privacy-protecting sample detection system training method and device
CN115238827B (en) * 2022-09-16 2022-11-25 支付宝(杭州)信息技术有限公司 Privacy-protecting sample detection system training method and device
CN116761164A (en) * 2023-08-11 2023-09-15 北京科技大学 Privacy data transmission method and system based on matrix completion
CN116761164B (en) * 2023-08-11 2023-11-14 北京科技大学 Privacy data transmission method and system based on matrix completion
CN117056979A (en) * 2023-10-11 2023-11-14 杭州金智塔科技有限公司 Service processing model updating method and device based on user privacy data
CN117056979B (en) * 2023-10-11 2024-03-29 杭州金智塔科技有限公司 Service processing model updating method and device based on user privacy data

Also Published As

Publication number Publication date
CN114780999B (en) 2022-09-27

Similar Documents

Publication Publication Date Title
CN114780999B (en) Deep learning data privacy protection method, system, equipment and medium
CN110058922B (en) Method and device for extracting metadata of machine learning task
CN110751291B (en) Method and device for realizing multi-party combined training neural network of security defense
US20210273978A1 (en) Cyber digital twin simulator for security controls requirements
US8122429B2 (en) Method, system and program product for developing a data model in a data mining system
CN110520871A (en) Training machine learning model
CN112381209B (en) Model compression method, system, terminal and storage medium
KR101089898B1 (en) Modeling directed scale-free object relationships
Zhou et al. Algorithms by design with illustrations to solid and structural mechanics/dynamics
CN112035834A (en) Countermeasure training method and device, and application method and device of neural network model
US10394987B2 (en) Adaptive bug-search depth for simple and deep counterexamples
US20210011757A1 (en) System for operationalizing high-level machine learning training enhancements from low-level primitives
WO2022128557A1 (en) Neural network confidentiality
CN107193667A (en) The update method and device of webpage authority
CN112927143A (en) Image splicing method and device, electronic equipment and storage medium
Colby et al. An evolutionary game theoretic analysis of difference evaluation functions
US20200351296A1 (en) System and method for evaluating and enhancing the security level of a network system
CN116049691A (en) Model conversion method, device, electronic equipment and storage medium
CN111950015B (en) Data open output method and device and computing equipment
JP2019185134A (en) Information processing device, learning method, and program
Shrivastava et al. Securator: A fast and secure neural processing unit
Voronin et al. ICP algorithm based on stochastic approach
Rompicharla Continuous compliance model for hybrid multi-cloud through self-service orchestrator
US20170177767A1 (en) Configuration of large scale advection diffusion models with predetermined rules
US20110115794A1 (en) Rule-based graph layout design

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant