CN113343277B

CN113343277B - Safe and efficient entrusted privacy data category prediction method

Info

Publication number: CN113343277B
Application number: CN202110752010.0A
Authority: CN
Inventors: 刘静; 张良峰
Original assignee: ShanghaiTech University
Current assignee: ShanghaiTech University
Priority date: 2021-07-02
Filing date: 2021-07-02
Publication date: 2023-12-29
Anticipated expiration: 2041-07-02
Also published as: CN113343277A

Abstract

The invention relates to a secure and efficient method for delegating class prediction on private data. A given classifier is used to predict classes on private data, yielding the class prediction results Y of multiple classifiers on multiple pieces of private data; the model parameters of the classifiers form a matrix F, and the delegator's pieces of private data form a matrix X. The invention provides a delegated computation method for private data class prediction that simultaneously achieves data privacy, public verifiability, and experimentally demonstrated efficiency. Experimental data show that when the matrix size is greater than 10000, the delegator's running time after delegating the computation is less than the time required to compute the result directly.

Description

Safe and efficient entrusted privacy data category prediction method

Technical Field

The invention relates to a method for delegating class prediction on private data.

Background

Machine learning classifiers have a wide range of real-world applications, such as medical or genomic prediction, spam detection, facial recognition, and financial computing, where large amounts of private data are involved and must be protected at prediction time. The class prediction process of most classifiers (e.g., perceptrons, support vector machines) can roughly be viewed as a matrix multiplication. The prediction process is therefore computationally expensive and places a heavy burden on devices with limited computing resources (e.g., Internet of Things clients). Delegated computation addresses this problem: it is a computing model that allows resource-limited devices to delegate computation to resource-rich devices (e.g., cloud servers).

Delegated computation spares the delegator from performing the expensive computation and improves efficiency, but it also brings challenges. The main ones are: the delegator's private information may be learned by a malicious attacker; the server may return an incorrect result, either to reduce its workload or because it colludes with a malicious attacker; and the delegator's running time after delegating the computation may exceed the time required to compute the result directly. A delegated matrix multiplication scheme that addresses all of these challenges therefore has significant practical value.

Existing delegation schemes for private data class prediction cannot address all of these challenges at once; that is, they do not simultaneously provide the following properties: 1) Data privacy: the delegator's private information is protected with semantic security; 2) Public verifiability: the verification algorithm can check the correctness of the delegated computation result and can be run by any entity; 3) Experimentally demonstrated efficiency: experimental data show that the delegator's running time after delegating the computation is less than the time required for direct computation.

Mohassel protects data privacy with existing homomorphic encryption schemes (e.g., the Goldwasser-Micali encryption scheme) and verifies the correctness of the delegated computation result with a randomized algorithm. That scheme provides data privacy and public verifiability, but its computation involves a large number of modular exponentiations, so experimentally demonstrated efficiency is not achieved. Another existing scheme delegates matrix-vector multiplication and achieves public verifiability. When data privacy is also required, theoretical analysis shows that the delegator's running time after delegation is smaller than the time required for direct computation when the matrix size is very large (greater than 250000). That scheme does not achieve experimentally demonstrated efficiency.

Disclosure of Invention

The invention aims to solve the technical problems that: the existing delegation scheme for privacy data class prediction cannot simultaneously provide data privacy, public verifiability and experiment-based high efficiency.

To solve this technical problem, the invention provides a secure and efficient method for delegating class prediction on private data. A given classifier is used to predict classes on private data, yielding the class prediction results Y of multiple classifiers on multiple pieces of private data; the model parameters of the classifiers form a matrix F, and the delegator's pieces of private data form a matrix X. The method is characterized in that computing FX yields the class prediction results of all classifiers on all pieces of private data at once, with the matrix F and the matrix X regarded as the function and the input of a matrix multiplication, and in that the method comprises the following steps:

Step 1, selecting a security parameter $\lambda$; according to $\lambda$ and the function set, the delegator obtains the public key $PK = \mathrm{lhep}$ and initializes the private key $SK$. The public parameters $\mathrm{lhep} = (p, q, n, f(x), \chi)$ are obtained by running the parameter generation algorithm of the linear homomorphic encryption (LHE) scheme on input $1^{\lambda}$, where $q$ is a prime number; $p \in \mathbb{Z}_q^{*}$ is a prime number; $f(x) = x^{n} + 1$ is a cyclotomic polynomial; and $\chi$ is a discrete Gaussian distribution over the ring $\mathbb{Z}[x]/(f(x))$ with standard deviation $r$;

Step 2, according to the public key $PK$, the private key $SK$ and the function $F$, the delegator obtains the evaluation key $EK_F = F$ and the verification key $VK_F = F$;

Step 3, according to the public key $PK$, the private key $SK$ and the input $X$, the delegator obtains the ciphertext $\sigma_X$ corresponding to the input, the verification key $VK_X$ and the decryption key $DK_X$;

Step 4, according to the evaluation key $EK_F = F$ and the ciphertext $\sigma_X$, the server computes the result ciphertext $\sigma_Y = F\sigma_X$;

Step 5, according to the verification keys $VK_F = F$ and $VK_X = \sigma_X$ and the result ciphertext $\sigma_Y$, the delegator obtains 1, indicating that the result ciphertext $\sigma_Y$ is correct, or 0, indicating that the result ciphertext $\sigma_Y$ is incorrect;

Step 6, according to the decryption key $DK_X$ and the result ciphertext $\sigma_Y$, the delegator recovers the actual computation result $Y = FX$.
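As a reading aid, the matrix dimensions implied by steps 1 to 6 can be collected in one place. This is only a sketch: the block names $\sigma_{X_i}$, $\sigma_{Y_i}$ and $Y_i$ are notation assumed here, the sizes $d \times 2n$ and $m \times n$ are inferred from the stated $d \times n$ input blocks and $m \times 2n$ result blocks, and the relation $\sigma_Y = F\sigma_X$ is inferred from the check $F(\sigma_X r) = \sigma_Y r$ used in verification.

```latex
\begin{aligned}
F &: m \times d, &
X &= [X_1 \ \cdots \ X_t], & X_i &: d \times n,\\
&&
\sigma_X &= [\sigma_{X_1} \ \cdots \ \sigma_{X_t}], & \sigma_{X_i} &: d \times 2n,\\
&&
\sigma_Y &= F\sigma_X = [\sigma_{Y_1} \ \cdots \ \sigma_{Y_t}], & \sigma_{Y_i} &: m \times 2n,\\
&&
Y &= FX = [Y_1 \ \cdots \ Y_t], & Y_i &: m \times n.
\end{aligned}
```

Under this reading, each group of $n$ plaintext values becomes $2n$ ciphertext values, so the server multiplies $F$ by a matrix with twice as many columns as $X$.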

Preferably, the step 3 specifically includes the following steps:

Step 301, obtaining an encryption key $sk$ by running the key generation algorithm of the linear homomorphic encryption LHE on the public key $PK = \mathrm{lhep}$;

Step 302, dividing the $tn$ columns of the input $X$ into $t$ blocks $X_1, X_2, \ldots, X_t$, each block $X_i$ containing $n$ columns, so that the input is written as $X = [X_1 \ \cdots \ X_t]$, where each block $X_i$ is a $d \times n$ matrix, $i \in [t]$;

Step 303, for each vector of each block $X_i$ obtained in step 302, running the encryption algorithm of the linear homomorphic encryption LHE with $sk$ as the encryption key to obtain the corresponding ciphertext, so that each block $X_i$ is encrypted as $\sigma_{X_i}$ and the input $X$ is encrypted as $\sigma_X = [\sigma_{X_1} \ \cdots \ \sigma_{X_t}]$;

Step 304, performing the following assignments: $VK_X = \sigma_X$, $DK_X = sk$, where the decryption key $DK_X$ is kept private by the delegator.

Preferably, the step 5 specifically includes the following steps:

Step 501, choosing a vector $r$ uniformly at random from the set of vectors with $2tn$ elements;

Step 502, computing $F(\sigma_X r)$ and $\sigma_Y r$; if $F(\sigma_X r)$ and $\sigma_Y r$ are equal, the delegator obtains 1; if they are not equal, the delegator obtains 0.
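The check in step 5 has the form of a randomized matrix product test. The following sketch shows why an incorrect result is rejected with high probability. It assumes, which the text above does not state explicitly, that all arithmetic is carried out modulo the prime $q$ and that the entries of $r$ are independent and uniform in $\mathbb{Z}_q$; the symbol $\Delta$ is introduced here for illustration.

```latex
\Delta = \sigma_Y - F\sigma_X \not\equiv 0 \pmod q
\;\Longrightarrow\;
\exists\, j, k:\ \Delta_{j,k} \not\equiv 0 \pmod q,
\qquad
\Pr_{r}\bigl[(\Delta r)_j \equiv 0 \pmod q\bigr] = \frac{1}{q},
\qquad\text{hence}\qquad
\Pr_{r}\bigl[F(\sigma_X r) = \sigma_Y r\bigr] \le \frac{1}{q}.
```

The middle equality holds because, with all other entries of $r$ fixed, the term $\Delta_{j,k} r_k$ is uniform in $\mathbb{Z}_q$ when $\Delta_{j,k}$ is invertible modulo the prime $q$. A correct result, with $\Delta \equiv 0$, always passes.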

Preferably, the step 6 specifically includes the following steps:

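Step 601, dividing the $2tn$ columns of the result ciphertext $\sigma_Y$ into $t$ blocks $\sigma_{Y_1}, \ldots, \sigma_{Y_t}$, each block $\sigma_{Y_i}$ containing $2n$ columns, so that the result ciphertext is written as $\sigma_Y = [\sigma_{Y_1} \ \cdots \ \sigma_{Y_t}]$, where each block $\sigma_{Y_i}$ is an $m \times 2n$ matrix, $i \in [t]$;

Step 602, for each vector of each block $\sigma_{Y_i}$, running the decryption algorithm of the linear homomorphic encryption LHE with the encryption key $sk$ as the decryption key to obtain the corresponding plaintext, so that each block $\sigma_{Y_i}$ is decrypted as $Y_i$ and the ciphertext $\sigma_Y$ is decrypted as $Y = [Y_1 \ \cdots \ Y_t]$.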

The invention provides a delegated computation method for private data class prediction that simultaneously achieves data privacy, public verifiability, and experimentally demonstrated efficiency. Experimental data show that when the matrix size is greater than 10000, the delegator's running time after delegating the computation is less than the time required to compute the result directly.

Drawings

Fig. 1 is a schematic diagram of the partitioning of an input X.

Detailed Description

The invention will be further illustrated with reference to specific examples. It is to be understood that these examples are illustrative of the present invention and are not intended to limit the scope of the present invention. Further, it is understood that various changes and modifications may be made by those skilled in the art after reading the teachings of the present invention, and such equivalents are intended to fall within the scope of the claims appended hereto.

In the delegated private data class prediction problem, the parameters of the classifier have already been obtained through training. The invention only addresses how a given classifier is used to predict classes on private data. The model parameters of one classifier can be regarded as a row vector, and the model parameters of multiple classifiers form a matrix F. One piece of the delegator's private data can be regarded as a column vector, and multiple pieces form a matrix X. Computing FX yields the class prediction results of all classifiers on all pieces of private data at once. The matrix F and the matrix X can then be regarded as the function and the input of a matrix multiplication.
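The row and column convention above can be illustrated with a minimal Python sketch. The function name `predict_scores` and the numbers are hypothetical; the sketch only computes the raw product FX and leaves out any classifier-specific post-processing such as thresholding.

```python
def predict_scores(F, X):
    """Return Y = F X, where each row of F is one classifier's parameters
    and each column of X is one piece of private data."""
    columns = list(zip(*X))  # columns of X, one per data record
    return [[sum(f * x for f, x in zip(row, col)) for col in columns]
            for row in F]


# Two classifiers (rows of F) with d = 2 parameters each.
F = [[1, -1],
     [2, 0]]
# Two data records stored as columns: (3, 2) and (1, 5).
X = [[3, 1],
     [2, 5]]

Y = predict_scores(F, X)  # Y[j][k]: score of classifier j on record k
print(Y)
```

Entry `Y[j][k]` is the inner product of classifier `j` with record `k`, so one matrix product covers every classifier and record pair.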

A security parameter $\lambda$ is selected. According to $\lambda$ and the function set, which is the set of matrices with $m$ rows and $d$ columns whose elements are drawn from an integer group ($\mathbb{Z}$ denotes the set of integers), the delegator obtains the public key $PK = \mathrm{lhep}$ and initializes the private key $SK$. The public parameters $\mathrm{lhep} = (p, q, n, f(x), \chi)$ are obtained by running the parameter generation algorithm of the linear homomorphic encryption (LHE) scheme on input $1^{\lambda}$, where $q$ is a prime number; $p \in \mathbb{Z}_q^{*}$ is a prime number, the superscript $*$ denoting the multiplicative group; $f(x) = x^{n} + 1$ is a cyclotomic polynomial; and $\chi$ is a discrete Gaussian distribution over the ring $\mathbb{Z}[x]/(f(x))$ with standard deviation $r$, where $\mathbb{Z}[x]$ denotes the polynomials with coefficients in $\mathbb{Z}$.

According to the public key $PK$, the private key $SK$ and the function $F$, the delegator obtains the evaluation key $EK_F = F$ and the verification key $VK_F = F$.

According to the public key $PK$, the private key $SK$ and the input $X$, the delegator obtains the ciphertext $\sigma_X$ corresponding to the input, the verification key $VK_X$ and the decryption key $DK_X$. The specific process is as follows:

The encryption key $sk$ is obtained by running the key generation algorithm of the linear homomorphic encryption LHE on the public key $PK = \mathrm{lhep}$. As shown in FIG. 1, the $tn$ columns of the input $X$ are divided into $t$ blocks $X_1, X_2, \ldots, X_t$, each block $X_i$ containing $n$ columns. The input can be written as $X = [X_1 \ \cdots \ X_t]$, where each block $X_i$ is a $d \times n$ matrix.
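The column partition can be sketched in a few lines of Python. The helper name `split_columns` is hypothetical, and the sketch assumes, as the text does, that the number of columns is an exact multiple of n.

```python
def split_columns(X, n):
    """Split a d x (t*n) matrix into t blocks of n columns each."""
    total = len(X[0])
    if total % n != 0:
        raise ValueError("number of columns must be a multiple of n")
    t = total // n
    # Block i keeps columns i*n .. (i+1)*n - 1 of every row.
    return [[row[i * n:(i + 1) * n] for row in X] for i in range(t)]


X = [[1, 2, 3, 4],
     [5, 6, 7, 8]]                # d = 2 rows, t*n = 4 columns
blocks = split_columns(X, n=2)    # t = 2 blocks, each 2 x 2
print(blocks)
```

Each returned block is itself a list of d rows, so the blocks can be encrypted and later reassembled independently of one another.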

For each vector of each block $X_i$, the encryption algorithm of the linear homomorphic encryption LHE is run with $sk$ as the encryption key to obtain the corresponding ciphertext. Thus each block $X_i$ is encrypted as $\sigma_{X_i}$ and the input $X$ is encrypted as $\sigma_X = [\sigma_{X_1} \ \cdots \ \sigma_{X_t}]$. The following assignments are made: $VK_X = \sigma_X$, $DK_X = sk$. The decryption key $DK_X$ is kept private by the delegator.

According to the evaluation key $EK_F = F$ and the ciphertext $\sigma_X$, the server computes the result ciphertext $\sigma_Y = F\sigma_X$.

According to the verification keys $VK_F = F$ and $VK_X = \sigma_X$ and the result ciphertext $\sigma_Y$, the delegator obtains 1 (indicating that the result ciphertext $\sigma_Y$ is correct) or 0 (indicating that the result ciphertext $\sigma_Y$ is incorrect). The verification process is as follows:

A vector $r$ is chosen uniformly at random from a set of vectors with $2tn$ elements (the length $2tn$ is chosen to match the dimensions of the matrix multiplication, and all elements involved in the multiplication come from one group). $F(\sigma_X r)$ and $\sigma_Y r$ are then computed. If the two are equal, the delegator obtains 1; if not, the delegator obtains 0.
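The verification step can be sketched in Python as follows. This is an illustration under stated assumptions rather than the patented implementation: arithmetic is taken modulo a prime `q`, the entries of `r` are drawn uniformly below `q`, and the names `mat_vec_mod` and `verify` as well as the optional `r` argument (added so the example is reproducible) are hypothetical.

```python
import secrets


def mat_vec_mod(A, v, q):
    """Matrix-vector product modulo q."""
    return [sum(a * x for a, x in zip(row, v)) % q for row in A]


def verify(F, sigma_X, sigma_Y, q, r=None):
    """Return 1 if F(sigma_X r) equals sigma_Y r modulo q, else 0."""
    if r is None:
        # Assumption: r is uniform over vectors with entries below q.
        r = [secrets.randbelow(q) for _ in range(len(sigma_X[0]))]
    left = mat_vec_mod(F, mat_vec_mod(sigma_X, r, q), q)   # F (sigma_X r)
    right = mat_vec_mod(sigma_Y, r, q)                     # sigma_Y r
    return 1 if left == right else 0


q = 101
F = [[1, 2], [3, 4]]
sigma_X = [[5, 6, 7, 1], [8, 9, 10, 2]]
sigma_Y = [[21, 24, 27, 5], [47, 54, 61, 11]]   # equals F * sigma_X
honest = verify(F, sigma_X, sigma_Y, q)

forged = [row[:] for row in sigma_Y]
forged[0][0] = 22                                # tamper with one entry
rejected = verify(F, sigma_X, forged, q, r=[1, 0, 0, 0])
print(honest, rejected)
```

The check uses three matrix-vector products instead of a full matrix-matrix product, which is what makes verification cheaper than recomputing the result.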

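According to the decryption key $DK_X$ and the result ciphertext $\sigma_Y$, the delegator recovers the actual computation result $Y = FX$. The decryption process is as follows:

The $2tn$ columns of the result ciphertext $\sigma_Y$ are divided into $t$ blocks $\sigma_{Y_1}, \ldots, \sigma_{Y_t}$, each block $\sigma_{Y_i}$ containing $2n$ columns. The result ciphertext can be written as $\sigma_Y = [\sigma_{Y_1} \ \cdots \ \sigma_{Y_t}]$, where each block $\sigma_{Y_i}$ is an $m \times 2n$ matrix. For each vector of each block $\sigma_{Y_i}$, the decryption algorithm of the linear homomorphic encryption LHE is run with $sk$ as the decryption key to obtain the corresponding plaintext. Thus each block $\sigma_{Y_i}$ is decrypted as $Y_i$ and the ciphertext $\sigma_Y$ is decrypted as $Y = [Y_1 \ \cdots \ Y_t]$. The matrix $Y$ is the class prediction result of the multiple classifiers on the multiple pieces of private data.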
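The encrypt, compute and decrypt flow can be tried end to end with a toy example. The text does not give the concrete LHE algorithms, so the sketch below substitutes a simple symmetric ring-LWE style construction chosen only for illustration. Every function name, the tiny parameters, the ternary noise used in place of the discrete Gaussian distribution, and the choice to encrypt each length-n row slice of a block are assumptions. The parameters are far too small to be secure, and results are taken modulo p.

```python
import random

# Toy parameters (hypothetical, not secure).
P = 17            # plaintext modulus p
Q = 2**31 - 1     # ciphertext modulus q (a prime)
N = 4             # ring degree n, with f(x) = x^n + 1


def polymul(a, b):
    """Multiply two polynomials modulo x^N + 1 and modulo Q."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < N:
                out[k] = (out[k] + ai * bj) % Q
            else:  # x^N = -1 in this ring
                out[k - N] = (out[k - N] - ai * bj) % Q
    return out


def small_poly(rng):
    """Coefficients in {-1, 0, 1}; a stand-in for the distribution chi."""
    return [rng.choice((-1, 0, 1)) for _ in range(N)]


def encrypt(sk, msg, rng):
    """Encrypt N values in Z_P as 2N values in Z_Q: (a*s + P*e + msg, -a)."""
    a = [rng.randrange(Q) for _ in range(N)]
    e = small_poly(rng)
    a_s = polymul(a, sk)
    c0 = [(a_s[i] + P * e[i] + msg[i]) % Q for i in range(N)]
    c1 = [(-x) % Q for x in a]
    return c0 + c1


def decrypt(sk, ct):
    """Recover N plaintext values from a fresh or linearly combined ciphertext."""
    c0, c1_s = ct[:N], polymul(ct[N:], sk)
    out = []
    for i in range(N):
        v = (c0[i] + c1_s[i]) % Q
        if v > Q // 2:          # centered representative modulo Q
            v -= Q
        out.append(v % P)       # the noise term P*e vanishes modulo P
    return out


def matmul_mod(A, B, mod):
    """Matrix product modulo mod."""
    return [[sum(x * y for x, y in zip(row, col)) % mod for col in zip(*B)]
            for row in A]


def encrypt_input(sk, X, rng):
    """Encrypt each length-N row slice of X; d x tN becomes d x 2tN."""
    return [[c for i in range(0, len(row), N)
             for c in encrypt(sk, row[i:i + N], rng)] for row in X]


def decrypt_result(sk, sigma_Y):
    """Decrypt each length-2N row slice of sigma_Y; m x 2tN becomes m x tN."""
    return [[y for i in range(0, len(row), 2 * N)
             for y in decrypt(sk, row[i:i + 2 * N])] for row in sigma_Y]


rng = random.Random(0)              # fixed seed; not a secure random source
sk = small_poly(rng)                # delegator's secret key
F = [[1, 2], [3, 4], [5, 6]]        # m = 3 classifiers, d = 2
X = [[1, 2, 3, 4, 5, 6, 7, 8],      # d = 2 rows, t*N = 8 columns
     [8, 7, 6, 5, 4, 3, 2, 1]]
sigma_X = encrypt_input(sk, X, rng)      # delegator encrypts
sigma_Y = matmul_mod(F, sigma_X, Q)      # server computes on ciphertexts
Y = decrypt_result(sk, sigma_Y)          # delegator decrypts
print(Y == matmul_mod(F, X, P))
```

Decryption is correct here because the accumulated noise stays far below q/2 for these small matrices; a real parameter choice would have to guarantee that bound for the intended matrix sizes.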

The invention is further described using medical data class prediction as an example. An Internet of Things device (e.g., a smart bracelet or a smart appliance) can collect multiple pieces of private health data. Multiple trained disease classifiers can be used to determine whether a disease is present. In this problem, the model parameters of one disease classifier can be regarded as a row vector, and multiple classifiers form a matrix F. One piece of the delegator's private health data can be regarded as a column vector, and multiple pieces form a matrix X. Delegating the computation of FX yields the class prediction results of all disease classifiers on all pieces of private data at once.

Claims

1. A secure and efficient method for delegating class prediction on private data, wherein a given classifier is used to predict classes on private data to obtain the class prediction results Y of multiple classifiers on multiple pieces of private data, the model parameters of the multiple classifiers form a matrix F, and the multiple pieces of private data of the delegator form a matrix X, characterized in that FX is computed to obtain the class prediction results of the multiple classifiers on the multiple pieces of private data at once, the matrix F and the matrix X being regarded as the function and the input of a matrix multiplication, the method comprising the following steps:

Step 1, selecting a security parameter $\lambda$; according to $\lambda$ and the function set, which is the set of matrices with $m$ rows and $d$ columns whose elements are drawn from an integer group ($\mathbb{Z}$ denotes the set of integers), the delegator obtains the public key $PK = \mathrm{lhep}$ and initializes the private key $SK$. The public parameters $\mathrm{lhep} = (p, q, n, f(x), \chi)$ are obtained by running the parameter generation algorithm of the linear homomorphic encryption (LHE) scheme on input $1^{\lambda}$, where $q$ is a prime number; $p \in \mathbb{Z}_q^{*}$ is a prime number, $\mathbb{Z}_q^{*}$ being a multiplicative group; $f(x) = x^{n} + 1$ is a cyclotomic polynomial; $n = 2^{[\log\lambda] - 1}$; and $\chi$ is a discrete Gaussian distribution over the ring $\mathbb{Z}[x]/(f(x))$ with standard deviation $r$, where $\mathbb{Z}[x]$ denotes the polynomials with coefficients in $\mathbb{Z}$;

Step 3, according to the public key $PK$, the private key $SK$ and the input $X$, the delegator obtains the ciphertext $\sigma_X$ corresponding to the input, the verification key $VK_X$ and the decryption key $DK_X$;

Step 5, according to the verification keys $VK_F = F$ and $VK_X = \sigma_X$ and the result ciphertext $\sigma_Y$, the delegator obtains 1, indicating that the result ciphertext $\sigma_Y$ is correct, or 0, indicating that the result ciphertext $\sigma_Y$ is incorrect;

2. The method for delegating privacy data class prediction in a safe and efficient manner as claimed in claim 1, wherein step 3 comprises the steps of:

Step 302, dividing the $tn$ columns of the input $X$ into $t$ blocks $X_1, X_2, \ldots, X_t$, each block $X_i$ containing $n$ columns, so that the input is written as $X = [X_1 \ \cdots \ X_t]$, where each block $X_i$ is a $d \times n$ matrix, $i \in [t]$;

Step 303, for each vector of each block $X_i$ obtained in step 302, running the encryption algorithm of the linear homomorphic encryption LHE with $sk$ as the encryption key to obtain the corresponding ciphertext, so that each block $X_i$ is encrypted as $\sigma_{X_i}$ and the input $X$ is encrypted as $\sigma_X = [\sigma_{X_1} \ \cdots \ \sigma_{X_t}]$;

3. The method for delegating privacy data class prediction in a safe and efficient manner as recited in claim 1, wherein step 5 comprises the steps of:

Step 501, choosing a vector $r$ uniformly at random from the set of vectors with $2tn$ elements;

4. The method for delegating privacy data class prediction in a safe and efficient manner as recited in claim 1, wherein step 6 comprises the steps of:

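Step 602, for each vector of each block $\sigma_{Y_i}$, running the decryption algorithm of the linear homomorphic encryption LHE with the encryption key $sk$ as the decryption key to obtain the corresponding plaintext, so that each block $\sigma_{Y_i}$ is decrypted as $Y_i$ and the ciphertext $\sigma_Y$ is decrypted as $Y = [Y_1 \ \cdots \ Y_t]$.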