CN114003961A - Deep neural network reasoning method with privacy protection - Google Patents
- Publication number: CN114003961A (application CN202111472835.3A)
- Authority: CN (China)
- Prior art keywords: matrix, layer, result, client, neural network
- Prior art date: 2021-12-03
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
- G06N5/04—Inference or reasoning models
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioethics (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Computer Hardware Design (AREA)
- Computer Security & Cryptography (AREA)
- Storage Device Security (AREA)
Abstract
The invention discloses a deep neural network inference method with privacy protection, comprising the following steps: the client generates a key; the client uses the key to encrypt the input data matrix and the weight matrix of the trained deep neural network model, and sends the encrypted matrices to the edge servers; the edge servers perform the linear-layer computation on the input data matrix using the received weight matrix of the deep neural network model and return the results to the client; the client verifies the returned results, accepting them if they are correct and rejecting them otherwise; for results that pass verification, the client recovers the actual output of the linear layer using the locally stored key and bias matrix; the client then computes the nonlinear layer locally and uses the result as the input of the next linear layer, repeating these steps until the final inference result is obtained. The invention saves the user's computational overhead while guaranteeing the privacy of both the user's data and the model.
Description
Technical Field
The invention relates to the technical field of information security, in particular to a deep neural network reasoning method with privacy protection.
Background
With the development of machine learning and the rise of artificial intelligence, many research fields attempt to realize artificial intelligence with machine learning algorithms, for example generative adversarial networks for image inpainting and deep learning frameworks for image recognition. However, the inference task of a complex deep neural network typically involves an enormous number of arithmetic operations: with some popular deep neural network architectures, a single visual-inspection inference requires billions of operations, making it challenging to execute these operations efficiently on resource-limited Internet-of-Things devices.
The rapid development of edge computing offers resource-constrained devices an effective way to perform complex deep neural network inference. Outsourced computation is one of the most important applications of edge computing: it allows resource-constrained users to outsource complex computations to an edge server, paying only for the computing resources they use. Depending on who provides the deep neural network model, existing work on outsourced deep neural network inference falls into two categories: 1) the user submits the data to be inferred and the cloud/edge server supplies a trained deep neural network model, a service known as inference-as-a-service; 2) both the trained model and the data to be inferred come from the same user, and the cloud/edge server supplies only computing resources. In either case, a resource-limited user can exploit the computing power of the cloud/edge server to complete the heavy computations of the deep neural network inference phase.
While users benefit from outsourcing deep neural network inference to reduce their computational and storage burden, protecting the privacy of user data and the validity of inference results is a challenging problem. Some data collected by terminal devices can be highly sensitive, such as medical diagnostic data; once leaked, it causes serious harm to the user. In addition, external factors such as hacker attacks on the cloud/edge server may invalidate the computed results. Making edge-assisted deep neural network inference safer and more efficient is therefore an urgent problem.
Two common privacy-preserving deep neural network inference techniques are homomorphic encryption and secure multi-party computation. Schemes built on these technologies offer strong security but low computational efficiency. To avoid the complexity and inefficiency of homomorphic encryption and secure multi-party computation, a dual edge-server framework has emerged that uses a lightweight encryption scheme to perform deep neural network inference efficiently under privacy protection; it greatly improves inference efficiency and markedly reduces the computing energy consumption of Internet-of-Things devices. However, it protects only the privacy of the input data, not the user's trained model. A deep neural network model is also a core asset of its supplier, since training an effective model requires heavy investment in data sets, resources, and expertise. Existing solutions thus either require time-consuming cryptographic operations or fail to protect the privacy of the trained model. How to achieve safe and efficient deep neural network inference while protecting both input data and model privacy is therefore an important issue.
Disclosure of Invention
In view of the above problems, an object of the present invention is to provide a deep neural network inference method with privacy protection, in which the user sends the data to be inferred and the trained model to edge servers; the edge servers handle the computation-heavy, time-consuming linear layers, while the user processes only the computationally light nonlinear layers and the encryption/decryption operations. This saves the user's computational overhead while guaranteeing the privacy of the user's data and model.
To solve the above technical problem, an embodiment of the present invention provides the following solutions:
a deep neural network reasoning method with privacy protection comprises the following steps:
the client generates a key;
the client encrypts an input data matrix and a weight matrix of the trained deep neural network model by using the secret key, and sends the encrypted input data matrix and the weight matrix to the first edge server and the second edge server, wherein the bias matrix of the deep neural network model is stored locally;
the first edge server and the second edge server perform linear layer calculation on the input data matrix by using the received weight matrix of the deep neural network model, and return results to the client;
the client verifies the returned results; if a result is correct the client accepts it, and if it is incorrect the client rejects it;
for results that pass verification, the client recovers the actual output of the linear layer using the locally stored key and the bias matrix;
the client computes the nonlinear layer locally and uses the result as the input of the next linear layer, repeating the above steps until the final inference result of the deep neural network model is obtained.
Preferably, the client generating the key specifically includes:
the trained deep neural network model contains Q linear layers in total; the input data matrix of the i-th linear layer (1 ≤ i ≤ Q) is denoted X_i, its weight matrix W_i, and its bias matrix B_i;
a key is generated with the KeyGen key generation algorithm, which takes a security parameter k as input and outputs a random number matrix R_i and a random number c_i as the key; each element of R_i is a k-bit random number, R_i is used to blind the weight matrix W_i and has the same size as W_i, and c_i is likewise a k-bit random number, used to blind the i-th layer's input data matrix X_i.
Preferably, encrypting the input data matrix and the weight matrix of the trained deep neural network model specifically includes:
the Input Encryption algorithm encrypts the input data matrix and the weight matrix: it takes as input the random number matrix R_i, the random number c_i, the input data matrix X_i and the weight matrix W_i, and outputs four matrices X_{i,a}, X_{i,b}, W_{i,a} and W_{i,b};
the encryption proceeds as follows: first, the random number c_i is used to construct the matrix C_i, each element of which is c_i and whose size equals that of X_i; to blind X_i, it is split into two matrices X_{i,a} and X_{i,b}; the random number matrix R_i is then used to blind the weight matrix W_i into two matrices W_{i,a} and W_{i,b}; after encryption, X_{i,a} and W_{i,a} are sent to the first edge server ES_A, and X_{i,b} and W_{i,b} are sent to the second edge server ES_B.
Preferably, during encryption the two matrices X_{i,a} and X_{i,b} satisfy the following conditions:
X_i = X_{i,a} + X_{i,b};
C_i = X_{i,a} - X_{i,b};
which simplify to:
X_{i,a} = (X_i + C_i) / 2;
X_{i,b} = (X_i - C_i) / 2;
the random number matrix R_i then blinds the weight matrix W_i into two matrices W_{i,a} and W_{i,b}:
W_{i,a} = W_i + R_i;
W_{i,b} = W_i - R_i.
Preferably, the linear-layer computation performed by the first and second edge servers on the input data matrix using the received weight matrix of the deep neural network model specifically includes:
the two edge servers run the Privacy-preserving Computation algorithm on the input data matrix: the first edge server ES_A receives X_{i,a} and W_{i,a} and computes their convolution, obtaining the result S_{i,a}; the second edge server ES_B receives X_{i,b} and W_{i,b} and computes their convolution, obtaining the result S_{i,b}; the output of the algorithm is S_{i,a} and S_{i,b}.
Preferably, the client verifying the returned result specifically includes:
the client verifies the returned result with the Verification algorithm: the client randomly selects the value at some position of S_{i,a} or S_{i,b}, then uses X_i, W_i and the locally stored key (the random number matrix R_i and the random number c_i) to compute the convolution value at the corresponding position; the client compares the two values; if they are unequal, the client rejects the returned result; if they are equal, the next step is executed.
Preferably, the client recovering the actual output result of the linear layer using the locally stored key and the bias matrix specifically includes:
the client recovers the encrypted result with the Recovery algorithm, whose inputs are the results S_{i,a} and S_{i,b} returned by the first and second edge servers; the client first uses the random number c_i to construct the matrix C_i, each element of which is c_i and whose size equals that of X_i; the client then uses C_i, the locally stored random number matrix R_i and the bias matrix B_i to recover the actual output result O_i: O_i = S_{i,a} + S_{i,b} - C_i·R_i + B_i, where O_i is the actual output of the i-th linear layer.
Preferably, the input of the (i+1)-th linear layer is X_{i+1} = NF(O_i), where NF is the activation function of the nonlinear layer; the above algorithms are executed in a loop until the final inference result Res = NF(O_Q) of the deep neural network model is obtained.
Preferably, the deep neural network model comprises an input layer, a hidden layer and an output layer, the hidden layer comprising a convolutional layer, an activation layer, a pooling layer and a fully connected layer; the convolutional and fully connected layers are linear layers, while the activation and pooling layers are nonlinear layers.
The technical scheme provided by the embodiments of the invention offers at least the following beneficial effects:
1) Resource-constrained users can achieve efficient deep neural network inference at low cost.
2) The inefficiency of cumbersome homomorphic encryption and secure multi-party computation techniques is avoided.
3) Both the privacy of the user's data to be inferred and the privacy of the user's trained deep neural network model are guaranteed.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of a deep neural network inference method with privacy protection provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of a deep neural network inference system with privacy protection provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of the basic structure of the hidden layer of a deep neural network model according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
An embodiment of the invention provides a deep neural network inference method with privacy protection. The flow of the method is shown in FIG. 1, and the system model involved is shown in FIG. 2; it comprises a client (the owner of the data and the deep neural network model) and two outsourced edge servers (a first edge server and a second edge server).
The method comprises the following steps:
the client generates a key;
the client encrypts an input data matrix and a weight matrix of the trained deep neural network model by using the secret key, and sends the encrypted input data matrix and the weight matrix to the first edge server and the second edge server, wherein the bias matrix of the deep neural network model is stored locally;
the first edge server and the second edge server perform linear layer calculation on the input data matrix by using the received weight matrix of the deep neural network model, and return results to the client;
the client verifies the returned results; if a result is correct the client accepts it, and if it is incorrect the client rejects it;
for results that pass verification, the client recovers the actual output of the linear layer using the locally stored key and the bias matrix;
the client computes the nonlinear layer locally and uses the result as the input of the next linear layer, repeating the above steps until the final inference result of the deep neural network model is obtained.
In the embodiment of the invention, the user at the client sends the data to be inferred and the trained model to the edge servers; the edge servers process the computation-heavy, time-consuming linear layers, and the user only processes the computationally light nonlinear layers and the encryption and decryption operations. The method therefore saves the user's computational cost while guaranteeing the privacy of the user's data and model.
In the embodiment of the present invention, the deep neural network model includes an input layer, a hidden layer and an output layer, where the hidden layer includes a convolutional layer, an activation layer, a pooling layer and a fully connected layer, as shown in FIG. 3; the convolutional and fully connected layers are linear layers, while the activation and pooling layers are nonlinear layers.
The convolutional layer performs feature extraction on the input data matrix and usually contains multiple convolution kernels. A convolution operation multiplies the kernel element-wise with the corresponding region of the input and sums the products; it starts at the top-left corner of the input data matrix and ends at the bottom-right corner. The matrix obtained by convolving the original matrix is called a feature map.
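To make the operation concrete, the following is a minimal single-channel, stride-1 sketch in Python (NumPy assumed; the function name conv2d and the toy values are illustrative, not taken from the patent):

```python
import numpy as np

def conv2d(x, kernel):
    """Slide the kernel over x (stride 1, no padding), multiplying
    element-wise and summing at each position -- one feature map."""
    kh, kw = kernel.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(x[r:r + kh, c:c + kw] * kernel)
    return out

x = np.arange(16, dtype=float).reshape(4, 4)  # toy input data matrix
k = np.array([[1.0, 0.0], [0.0, -1.0]])       # toy convolution kernel
print(conv2d(x, k))                           # 3x3 feature map
```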
Typically, each convolutional layer is followed by an activation layer, which enhances the model's ability to handle nonlinear problems by applying an activation function. The main activation functions are the sigmoid, tanh and ReLU functions.
The pooling layer reduces the dimensionality of each feature map while retaining the most important information. There are two common pooling operations, max pooling and average pooling; they differ in how the values within the pooling window are processed: max pooling takes the maximum of the values in the window, while average pooling takes their average.
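Both pooling modes over non-overlapping windows can be sketched as follows (NumPy assumed; pool2d and the toy feature map are illustrative):

```python
import numpy as np

def pool2d(x, size, mode="max"):
    """Split x into non-overlapping size-by-size windows and reduce each
    window to one value: its maximum (max pooling) or mean (average pooling)."""
    h, w = x.shape[0] // size, x.shape[1] // size
    windows = x[:h * size, :w * size].reshape(h, size, w, size)
    if mode == "max":
        return windows.max(axis=(1, 3))
    return windows.mean(axis=(1, 3))

fmap = np.arange(16, dtype=float).reshape(4, 4)
print(pool2d(fmap, 2, "max"))  # 2x2 map of window maxima
print(pool2d(fmap, 2, "avg"))  # 2x2 map of window averages
```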
The fully connected layer acts as the "classifier" of the whole convolutional neural network. In practice, the input to the fully connected layer is first preprocessed into vector form; its computation is otherwise similar to that of the convolutional layer.
A deep neural network is essentially a mapping from input to output; it can learn a large number of input-output mappings without requiring any precise mathematical expression relating inputs to outputs.
As an embodiment of the invention, assume a trained deep neural network model is available and contains Q linear layers (convolutional and fully connected layers) in total. The input data matrix of the i-th linear layer (1 ≤ i ≤ Q) is denoted X_i, its weight matrix W_i, and its bias matrix B_i. In the following, the subscript i refers to the i-th linear layer.
For the i-th linear layer, the client first generates a key with the KeyGen key generation algorithm, which takes a security parameter k as input and outputs a random number matrix R_i and a random number c_i as the key. Each element of R_i is a k-bit random number; R_i is used to blind the weight matrix W_i and has the same size as W_i. c_i is likewise a k-bit random number, used to blind the i-th layer's input data matrix X_i.
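A minimal sketch of what KeyGen could look like in Python follows (NumPy assumed; the patent fixes only the sizes and the k-bit length of the outputs, so the sampling details and the name keygen are illustrative assumptions):

```python
import numpy as np

def keygen(k, weight_shape, rng=np.random.default_rng()):
    """Return the layer key (R_i, c_i) for security parameter k: R_i is a
    matrix of k-bit random numbers with the same shape as W_i, and c_i is
    a single k-bit random number."""
    high = 2 ** k
    R = rng.integers(0, high, size=weight_shape).astype(float)
    c = float(rng.integers(0, high))
    return R, c
```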
Next, the Input Encryption algorithm encrypts the input data matrix and the weight matrix: it takes as input the random number matrix R_i, the random number c_i, the input data matrix X_i and the weight matrix W_i, and outputs four matrices X_{i,a}, X_{i,b}, W_{i,a} and W_{i,b}.
The encryption proceeds as follows. First, the random number c_i is used to construct the matrix C_i, each element of which is c_i and whose size equals that of X_i. To blind X_i, it is split into two matrices X_{i,a} and X_{i,b} satisfying:
X_i = X_{i,a} + X_{i,b};
C_i = X_{i,a} - X_{i,b};
which simplify to:
X_{i,a} = (X_i + C_i) / 2;
X_{i,b} = (X_i - C_i) / 2.
The random number matrix R_i then blinds the weight matrix W_i into two matrices W_{i,a} and W_{i,b}:
W_{i,a} = W_i + R_i;
W_{i,b} = W_i - R_i.
After encryption, X_{i,a} and W_{i,a} are sent to the first edge server ES_A, and X_{i,b} and W_{i,b} are sent to the second edge server ES_B.
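The Input Encryption step can be sketched as below, reusing the key (R_i, c_i) from the keygen sketch above (the helper name encrypt_inputs is illustrative):

```python
import numpy as np

def encrypt_inputs(X, W, R, c):
    """Split X and W into the shares the two edge servers receive."""
    C = np.full(X.shape, c)  # C_i: every element is c_i, same size as X_i
    X_a = (X + C) / 2        # so that X = X_a + X_b and C = X_a - X_b
    X_b = (X - C) / 2
    W_a = W + R              # W_i blinded by the random matrix R_i
    W_b = W - R
    return (X_a, W_a), (X_b, W_b)  # first pair to ES_A, second to ES_B
```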
After receiving the encrypted data, the two edge servers perform the linear-layer computation on the input data matrix using the Privacy-preserving Computation algorithm. The first edge server ES_A receives X_{i,a} and W_{i,a} and computes their convolution, obtaining the result S_{i,a}; the second edge server ES_B receives X_{i,b} and W_{i,b} and computes their convolution, obtaining the result S_{i,b}. The output of the algorithm is S_{i,a} and S_{i,b}.
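What each edge server computes on its shares can be sketched as below; a fully connected layer (a matrix product) stands in here for the patent's convolution, an assumption that keeps the demo short, since the share algebra is identical for any operation linear in both arguments:

```python
def server_linear(X_share, W_share):
    """The Privacy-preserving Computation step at one edge server:
    apply the linear layer to the received shares."""
    return X_share @ W_share  # S_a = X_a @ W_a at ES_A, S_b = X_b @ W_b at ES_B
```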
After both edge servers finish the computation, they return the results to the client. The client verifies the returned results with the Verification algorithm: the client randomly selects the value at some position of S_{i,a} or S_{i,b}, then uses X_i, W_i and the locally stored key (the random number matrix R_i and the random number c_i) to compute the convolution value at the corresponding position, and compares the two values. If they are unequal, the client rejects the returned result; if they are equal, the next step is executed.
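A minimal sketch of this spot-check under the same matrix-product assumption (verify is an illustrative helper, not the patent's code):

```python
import numpy as np

def verify(S_a, S_b, X, W, R, c, rng=np.random.default_rng()):
    """Recompute one randomly chosen entry of S_a or S_b locally from
    X, W and the key (R, c), and compare with the returned value."""
    C = np.full(X.shape, c)
    r = int(rng.integers(S_a.shape[0]))
    col = int(rng.integers(S_a.shape[1]))
    if rng.integers(2) == 0:  # check an entry of S_a
        expected = ((X + C) / 2)[r, :] @ (W + R)[:, col]
        return bool(np.isclose(S_a[r, col], expected))
    expected = ((X - C) / 2)[r, :] @ (W - R)[:, col]  # or an entry of S_b
    return bool(np.isclose(S_b[r, col], expected))
```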
For results that pass verification, the client recovers the encrypted result with the Recovery algorithm, whose inputs are the results S_{i,a} and S_{i,b} returned by the first and second edge servers. The client first uses the random number c_i to construct the matrix C_i, each element of which is c_i and whose size equals that of X_i; the client then uses C_i, the locally stored random number matrix R_i and the bias matrix B_i to recover the actual output result O_i: O_i = S_{i,a} + S_{i,b} - C_i·R_i + B_i, where O_i is the actual output of the i-th linear layer. This holds because S_{i,a} + S_{i,b} = X_{i,a}·W_{i,a} + X_{i,b}·W_{i,b} = X_i·W_i + C_i·R_i, so subtracting C_i·R_i and adding B_i yields the true layer output.
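The Recovery step under the same matrix-product assumption (a sketch; recover is an illustrative name):

```python
import numpy as np

def recover(S_a, S_b, x_shape, R, c, B):
    """O = S_a + S_b - C@R + B.  Since S_a + S_b = X@W + C@R,
    subtracting C@R and adding the bias yields the true output X@W + B."""
    C = np.full(x_shape, c)  # rebuild C_i from the locally stored c_i
    return S_a + S_b - C @ R + B
```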
The client then computes the nonlinear layer locally and uses the result as the input of the next linear layer: the input of the (i+1)-th linear layer is X_{i+1} = NF(O_i), where NF is the activation function of the nonlinear layer. The above algorithms are executed in a loop until the final inference result Res = NF(O_Q) of the deep neural network model is obtained.
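Finally, an end-to-end sketch tying the steps together, reusing the helpers above; the two-layer network, its shapes, the security parameter k = 8 and the ReLU nonlinearity are made-up demo choices, not prescribed by the patent:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 4))  # data to be inferred
Ws = [rng.standard_normal((4, 3)), rng.standard_normal((3, 2))]  # weights
Bs = [rng.standard_normal((2, 3)), rng.standard_normal((2, 2))]  # biases

for W, B in zip(Ws, Bs):
    R, c = keygen(8, W.shape, rng)                   # client: KeyGen
    (Xa, Wa), (Xb, Wb) = encrypt_inputs(X, W, R, c)  # client: encryption
    S_a = server_linear(Xa, Wa)                      # at ES_A
    S_b = server_linear(Xb, Wb)                      # at ES_B
    assert verify(S_a, S_b, X, W, R, c, rng)         # client: verification
    O = recover(S_a, S_b, X.shape, R, c, B)          # client: recovery
    assert np.allclose(O, X @ W + B)                 # matches plaintext layer
    X = np.maximum(O, 0)                             # local nonlinear layer (ReLU)

print("final inference result:", X)
```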
In summary, the deep neural network inference method provided by the invention effectively uses edge servers to process the computation-heavy, time-consuming linear layers, while the user only processes the computationally light nonlinear layers and the encryption and decryption operations. Resource-constrained users can thus achieve efficient deep neural network inference at low cost, while the privacy of both the user's input data and the deep neural network model is guaranteed.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Claims (9)
1. A deep neural network reasoning method with privacy protection is characterized by comprising the following steps:
the client generates a key;
the client encrypts an input data matrix and a weight matrix of the trained deep neural network model by using the secret key, and sends the encrypted input data matrix and the weight matrix to the first edge server and the second edge server, wherein the bias matrix of the deep neural network model is stored locally;
the first edge server and the second edge server perform linear layer calculation on the input data matrix by using the received weight matrix of the deep neural network model, and return results to the client;
the client verifies the returned results; if a result is correct the client accepts it, and if it is incorrect the client rejects it;
for results that pass verification, the client recovers the actual output of the linear layer using the locally stored key and the bias matrix;
the client computes the nonlinear layer locally and uses the result as the input of the next linear layer, repeating the above steps until the final inference result of the deep neural network model is obtained.
2. The deep neural network inference method of claim 1, wherein the client generating a key specifically comprises:
the trained deep neural network model contains Q linear layers in total; the input data matrix of the i-th linear layer (1 ≤ i ≤ Q) is denoted X_i, its weight matrix W_i, and its bias matrix B_i;
a key is generated with the KeyGen key generation algorithm, which takes a security parameter k as input and outputs a random number matrix R_i and a random number c_i as the key; each element of R_i is a k-bit random number, R_i is used to blind the weight matrix W_i and has the same size as W_i, and c_i is likewise a k-bit random number, used to blind the i-th layer's input data matrix X_i.
3. The deep neural network inference method of claim 2, wherein the encrypting the input data matrix and the weight matrix of the trained deep neural network model specifically comprises:
the Input Encryption algorithm encrypts the input data matrix and the weight matrix: it takes as input the random number matrix R_i, the random number c_i, the input data matrix X_i and the weight matrix W_i, and outputs four matrices X_{i,a}, X_{i,b}, W_{i,a} and W_{i,b};
the encryption proceeds as follows: first, the random number c_i is used to construct the matrix C_i, each element of which is c_i and whose size equals that of X_i; to blind X_i, it is split into two matrices X_{i,a} and X_{i,b}; the random number matrix R_i is then used to blind the weight matrix W_i into two matrices W_{i,a} and W_{i,b}; after encryption, X_{i,a} and W_{i,a} are sent to the first edge server ES_A, and X_{i,b} and W_{i,b} are sent to the second edge server ES_B.
4. The deep neural network inference method of claim 3, wherein during encryption the two matrices X_{i,a} and X_{i,b} satisfy the following conditions:
X_i = X_{i,a} + X_{i,b};
C_i = X_{i,a} - X_{i,b};
which simplify to:
X_{i,a} = (X_i + C_i) / 2;
X_{i,b} = (X_i - C_i) / 2;
the random number matrix R_i then blinds the weight matrix W_i into two matrices W_{i,a} and W_{i,b}:
W_{i,a} = W_i + R_i;
W_{i,b} = W_i - R_i.
5. The deep neural network inference method of claim 3, wherein the linear layer computation of the input data matrix by the first edge server and the second edge server using the received weight matrix of the deep neural network model specifically comprises:
the two edge servers run the Privacy-preserving Computation algorithm on the input data matrix: the first edge server ES_A receives X_{i,a} and W_{i,a} and computes their convolution, obtaining the result S_{i,a}; the second edge server ES_B receives X_{i,b} and W_{i,b} and computes their convolution, obtaining the result S_{i,b}; the output of the algorithm is S_{i,a} and S_{i,b}.
6. The deep neural network inference method of claim 5, wherein the client verifying the returned result specifically comprises:
the client verifies the returned result with the Verification algorithm: the client randomly selects the value at some position of S_{i,a} or S_{i,b}, then uses X_i, W_i and the locally stored key (the random number matrix R_i and the random number c_i) to compute the convolution value at the corresponding position; the client compares the two values; if they are unequal, the client rejects the returned result, and if they are equal, the next step is executed.
7. The deep neural network inference method of claim 6, wherein the recovering, by the client, the actual output result of the linear layer using the locally stored key and the bias matrix specifically comprises:
the client recovers the encrypted result with the Recovery algorithm, whose inputs are the results S_{i,a} and S_{i,b} returned by the first and second edge servers; the client first uses the random number c_i to construct the matrix C_i, each element of which is c_i and whose size equals that of X_i; the client then uses C_i, the locally stored random number matrix R_i and the bias matrix B_i to recover the actual output result O_i: O_i = S_{i,a} + S_{i,b} - C_i·R_i + B_i, where O_i is the actual output of the i-th linear layer.
8. The deep neural network inference method of claim 7, wherein the input of the (i+1)-th linear layer is X_{i+1} = NF(O_i), where NF is the activation function of the nonlinear layer; the above algorithms are executed in a loop until the final inference result Res = NF(O_Q) of the deep neural network model is obtained.
9. The deep neural network inference method of any one of claims 1-8, wherein the deep neural network model comprises an input layer, a hidden layer and an output layer, the hidden layer comprising a convolutional layer, an activation layer, a pooling layer and a fully connected layer; the convolutional and fully connected layers are linear layers, while the activation and pooling layers are nonlinear layers.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202111472835.3A | 2021-12-03 | 2021-12-03 | Deep neural network reasoning method with privacy protection (granted as CN114003961B) |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN114003961A | 2022-02-01 |
| CN114003961B | 2024-04-26 |
Family ID: 79931306
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202111472835.3A | Deep neural network reasoning method with privacy protection (CN114003961B, active) | 2021-12-03 | 2021-12-03 |
Country Status (1)
| Country | Link |
|---|---|
| CN | CN114003961B |
Cited By (4)
| Publication number | Priority date | Publication date | Title |
|---|---|---|---|
| CN115001748A | 2022-04-29 | 2022-09-02 | Model processing method and device and computer readable storage medium |
| CN115001748B | 2022-04-29 | 2023-11-03 | Model processing method and device and computer readable storage medium |
| CN115345307A | 2022-10-17 | 2022-11-15 | Secure convolutional neural network inference method and system on ciphertext images |
| CN115345307B | 2022-10-17 | 2023-02-14 | Secure convolutional neural network inference method and system on ciphertext images |
Citations (6)
| Publication number | Priority date | Publication date | Title |
|---|---|---|---|
| CN108259158A | 2018-01-11 | 2018-07-06 | Efficient privacy-preserving single-layer perceptron learning method in a cloud computing environment |
| CN108647525A | 2018-05-09 | 2018-10-12 | Verifiable privacy-preserving single-layer perceptron batch training method |
| CN109194507A | 2018-08-24 | 2019-01-11 | Non-interactive privacy-preserving neural network prediction method |
| CN111324870A | 2020-01-22 | 2020-06-23 | Outsourced convolutional neural network privacy protection system based on secure two-party computation |
| CN112152806A | 2020-09-25 | 2020-12-29 | Cloud-assisted image recognition method, apparatus and device supporting privacy protection |
| US2021/0019428A1 | 2019-07-19 | 2021-01-21 | Preservation system for preserving privacy of outsourced data in cloud based on deep convolutional neural network |
Non-Patent Citations (1)
| Title |
|---|
| 谢四江; 许世聪; 章乐: "Forward propagation method for convolutional neural networks based on homomorphic encryption" (基于同态加密的卷积神经网络前向传播方法), 计算机应用与软件 (Computer Applications and Software), No. 02, 12 February 2020 |
Also Published As
| Publication number | Publication date |
|---|---|
| CN114003961B | 2024-04-26 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |