CN115865307B - Data point multiplication operation method for federal learning - Google Patents

Data point multiplication operation method for federal learning

Info

Publication number
CN115865307B
Authority
CN
China
Prior art keywords
data
vector
party
data matrix
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310170136.6A
Other languages
Chinese (zh)
Other versions
CN115865307A
Inventor
冯黎明
马煜翔
刘洋
王玥
邢冰
刘文博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lanxiang Zhilian Hangzhou Technology Co ltd
Original Assignee
Lanxiang Zhilian Hangzhou Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lanxiang Zhilian Hangzhou Technology Co ltd
Priority to CN202310170136.6A
Publication of CN115865307A
Application granted
Publication of CN115865307B
Legal status: Active

Abstract

The invention discloses a data point multiplication operation method for federal learning. The method comprises the following steps: the first party preprocesses its data column vector v and homomorphically encrypts the result to obtain an encrypted data matrix V1, which it sends to the second party; the second party preprocesses its data matrix w and homomorphically encrypts the result to obtain a plurality of encrypted sub-data matrices F1, generates a random number vector Z for each encrypted sub-data matrix F1, and homomorphically encrypts it to obtain an encrypted random number vector Z1; the second party computes the Hadamard product of each encrypted sub-data matrix F1 with the encrypted data matrix V1 to obtain a corresponding matrix E, sums each matrix E by columns, adds the corresponding encrypted random number vector Z1 to obtain a vector Q, and sends the vectors Q to the first party; the first party decrypts each vector Q and extracts the sums u of the corresponding data to form the dot-product result. The invention can rapidly calculate the dot product of the data matrix and the data column vector while protecting data privacy.

Description

Data point multiplication operation method for federal learning
Technical Field
The invention relates to the technical field of federal learning, in particular to a data point multiplication operation method for federal learning.
Background
With the rapid development of the domestic internet industry and of privacy protection, more and more institutions adopt federal learning for model training. To protect data privacy, the data each party trains on is encrypted, and because of the characteristics of encrypted data and of federal learning training itself, the training process involves a large number of ciphertext matrix and vector multiplications. These ciphertext operations consume a great deal of computing time, which makes federal learning training in big-data and big-model scenarios very difficult.
At present, federal learning generally adopts homomorphic encryption algorithms to encrypt and compute on the plaintext data to be trained. Homomorphic encryption refers to an encryption function for which performing ring addition and multiplication on plaintexts and then encrypting yields a result equivalent to performing the corresponding operations directly on the ciphertexts.
For homomorphic encryption algorithms such as CKKS and BFV that use SIMD to accelerate computation, SIMD batching can encrypt multiple plaintext values into a single ciphertext. When the ciphertext computation of a matrix multiplied by a column vector is required, the existing method is to homomorphically encrypt the column vector and each row of the matrix, and then compute the dot product between each ciphertext row and the ciphertext column vector. This process requires rotation operations (Rotation), and rotating a homomorphically encrypted ciphertext takes a great deal of computation time; when the matrix dimension is large, the computational overhead caused by ciphertext rotation is unacceptable.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a data point multiplication operation method for federal learning, which can rapidly calculate the dot-product result of a data matrix held by a second party and a data column vector held by a first party without leaking either party's data and without rotation operations. It solves the problem that, when homomorphic encryption algorithms such as CKKS and BFV that use SIMD to accelerate computation are used to compute the dot product of a matrix and a vector, rotation operations occupy an extremely large amount of computation time, and it improves computational efficiency.
In order to solve the problems, the invention is realized by adopting the following technical scheme:
The invention relates to a data point multiplication operation method for federal learning, wherein a first party holds an s-dimensional data column vector v and a second party holds a data matrix w with g rows and s columns; the method comprises the following steps:
S1: the second party determines preprocessing parameters according to the number of rows and the number of columns of the data matrix w, the first party preprocesses the data column vector v according to the preprocessing parameters to obtain a data matrix V, and the second party preprocesses the data matrix w according to the preprocessing parameters to obtain a plurality of sub-data matrices F;
S2: the first party generates a public key pk and a private key sk, homomorphically encrypts the data matrix V with the public key pk to obtain an encrypted data matrix V1, and sends the public key pk and the encrypted data matrix V1 to the second party;
S3: the second party generates a corresponding random number vector Z for each sub-data matrix F, where the random numbers at the positions specified by the preprocessing parameters sum to 0;
S4: the second party homomorphically encrypts each sub-data matrix F with the public key pk to obtain a corresponding encrypted sub-data matrix F1, and homomorphically encrypts each random number vector Z with the public key pk to obtain a corresponding encrypted random number vector Z1;
S5: the second party calculates the Hadamard product of each encrypted sub-data matrix F1 and the encrypted data matrix V1 to obtain a corresponding matrix E, sums each matrix E by columns to obtain a corresponding vector R, adds the vector R to the corresponding encrypted random number vector Z1 to obtain a vector Q, and sends all the calculated vectors Q to the first party;
S6: the first party decrypts each vector Q with the private key sk to obtain a corresponding vector q, and extracts the sums u of the corresponding data from the vectors q according to the preprocessing parameters to form the dot-product result.
In this scheme, the number of rows and the number of columns of each sub-data matrix F are respectively equal to those of the data matrix V, the dimension of the random number vector Z is equal to the number of columns of the sub-data matrix F, and none of the random numbers in the random number vector Z is 0. The first party holds the s-dimensional data column vector v and the second party holds the data matrix w with g rows and s columns. The first party converts the data column vector v into the data matrix V by preprocessing, and the second party converts the data matrix w into a plurality of sub-data matrices F by preprocessing; the two parties then encrypt the plaintext data matrices they hold with the same public key pk as the encryption key and the same homomorphic encryption algorithm to obtain the corresponding encrypted data matrices, and the second party additionally generates a corresponding random number vector Z for each sub-data matrix F and encrypts it with the same public key pk and the same homomorphic encryption algorithm to obtain the encrypted random number vector Z1. In this way, by the principle of homomorphic encryption, the Hadamard product of each encrypted sub-data matrix F1 and the encrypted data matrix V1 yields a corresponding matrix E; each matrix E is summed by columns to obtain a corresponding vector R; the vector R is added to the corresponding encrypted random number vector Z1 to obtain a vector Q; the vector Q is decrypted with the private key sk to obtain a vector q; and the data at the positions specified by the preprocessing parameters in each vector q are summed to obtain the sums u of data corresponding to each vector q. Because the random numbers at the positions specified by the preprocessing parameters sum to 0, each sum u is exactly the dot product of the data column vector v with the corresponding row of the data matrix w.
No Rotation operation is performed anywhere in the computation of this scheme, so no computation time is spent on ciphertext rotation. This solves the technical problem that rotation operations occupy an extremely large amount of computation time when homomorphic encryption algorithms such as CKKS and BFV, which use SIMD (single instruction, multiple data) to accelerate computation, are used to calculate the dot product of a matrix and a column vector; it greatly improves computational efficiency, facilitates federal learning training in big-data and big-model scenarios, and protects the data privacy of both parties.
Preferably, the method in step S6 for extracting the sums u of the corresponding data from the vectors q according to the preprocessing parameters is as follows:
the data at the corresponding positions in each vector q are summed according to the preprocessing parameters to obtain the sums u of data corresponding to each vector q, and the corresponding sums u are extracted according to the preprocessing parameters to form an s-dimensional data column vector p, which is the dot-product result of the data matrix w and the data column vector v.
Preferably, in the step S1, the method for determining the preprocessing parameters by the second party according to the number of rows and the number of columns of the data matrix w is as follows:
The second party calculates the preprocessing parameters k and h by the formulas:

k = ⌈√g⌉, h = ⌈s/k⌉,

where ⌈ ⌉ denotes rounding up.
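For illustration only, the parameter computation described above can be sketched in Python; the function name compute_params and the example call are assumptions of this sketch, not part of the patent:

import math

def compute_params(g: int, s: int):
    """Preprocessing parameters: k = ceil(sqrt(g)), h = ceil(s / k),
    plus the derived padded sizes m = k*k (rows) and n = h*k (columns)."""
    k = math.isqrt(g)
    if k * k < g:              # take the ceiling of the square root
        k += 1
    h = -(-s // k)             # ceiling division: ceil(s / k)
    return k, h, k * k, h * k

# Example matching the embodiment below: g = 4 rows, s = 4 columns -> (2, 2, 4, 4)
print(compute_params(4, 4))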
Preferably, in the step S1, the method by which the first party preprocesses the data column vector v according to the preprocessing parameters to obtain the data matrix V is as follows:
M1: the first party calculates the parameter n, n = h×k; if s = n, the data column vector v is transposed to obtain a data row vector v1; if s ≠ n, the data column vector v is zero-padded until its dimension reaches n and is then transposed to obtain the data row vector v1;
v1 = [a_1, a_2, a_3, …, a_n], where 1 ≤ f ≤ n and a_f represents the f-th data item in the data row vector v1;
M2: the data matrix V is calculated from the preprocessing parameters k and h and the data row vector v1 by the formula:

V =
[ A_{1,1}  A_{1,2}  A_{1,3}  …  A_{1,k} ]
[ A_{2,1}  A_{2,2}  A_{2,3}  …  A_{2,k} ]
[   …        …        …            …   ]
[ A_{k,1}  A_{k,2}  A_{k,3}  …  A_{k,k} ]
,
where 1 ≤ i ≤ k and A_{i,1}, A_{i,2}, A_{i,3}, …, A_{i,k} are all the same row vector, namely [a_{(i-1)*h+1}, a_{(i-1)*h+2}, …, a_{i*h-1}, a_{i*h}].
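A minimal NumPy sketch of steps M1-M2 (the helper name build_V and the sample values are assumptions of this sketch):

import numpy as np

def build_V(v: np.ndarray, k: int, h: int) -> np.ndarray:
    """Zero-pad v to length n = h*k, then build the k x n matrix V whose
    i-th row is the i-th block of h entries of v1 repeated k times."""
    n = h * k
    v1 = np.zeros(n)
    v1[:v.size] = v                  # zero padding when s < n
    blocks = v1.reshape(k, h)        # block i = [a_{(i-1)h+1}, ..., a_{ih}]
    return np.tile(blocks, (1, k))   # repeat each block k times along its row

# With s = 4 and k = h = 2 the rows are [a1 a2 a1 a2] and [a3 a4 a3 a4]
print(build_V(np.array([1.0, 2.0, 3.0, 4.0]), k=2, h=2))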
Preferably, in the step S1, the second party performs preprocessing on the data matrix w according to the preprocessing parameters to obtain a plurality of sub-data matrices F, which is as follows:
N1: the second party calculates the parameter m, m = k²; if g ≠ m, zero rows are appended to the data matrix w until the number of rows reaches m;
N2: the second party calculates the parameter n, n = h×k; if s ≠ n, zero columns are appended to the data matrix w until the number of columns reaches n, finally obtaining an m×n data matrix L;
N3: the data matrix L is equally divided by rows into k sub-data matrices W of size k×n,

W =
[ b_{1,1}  b_{1,2}  …  b_{1,n} ]
[ b_{2,1}  b_{2,2}  …  b_{2,n} ]
[   …        …           …    ]
[ b_{k,1}  b_{k,2}  …  b_{k,n} ]
,
where 1 ≤ i ≤ k, 1 ≤ f ≤ n, and b_{i,f} represents the data in the i-th row and f-th column of the sub-data matrix W;
N4: each sub-data matrix W is converted into a sub-data matrix F according to the preprocessing parameters k and h, obtaining k sub-data matrices F;
the method of converting the sub data matrix W into the sub data matrix F is as follows:
Each row of the sub-data matrix W is divided into k row vectors of h data each, giving the formula:

W =
[ B_{1,1}  B_{1,2}  …  B_{1,k} ]
[ B_{2,1}  B_{2,2}  …  B_{2,k} ]
[   …        …           …    ]
[ B_{k,1}  B_{k,2}  …  B_{k,k} ]
  (1),

where 1 ≤ i ≤ k, 1 ≤ j ≤ k, B_{i,j} is the row vector in block row i and block column j of the sub-data matrix W, and B_{i,j} = [b_{i,(j-1)*h+1}, b_{i,(j-1)*h+2}, …, b_{i,j*h-1}, b_{i,j*h}];
the k×k arrangement of blocks in equation (1) is transposed (the blocks themselves are not transposed) to obtain the sub-data matrix F:

F =
[ B_{1,1}  B_{2,1}  …  B_{k,1} ]
[ B_{1,2}  B_{2,2}  …  B_{k,2} ]
[   …        …           …    ]
[ B_{1,k}  B_{2,k}  …  B_{k,k} ]
.
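A NumPy sketch of steps N1-N4 (the helper name build_F_matrices is an assumption of this sketch; the block transpose follows the description above):

import numpy as np

def build_F_matrices(w: np.ndarray, k: int, h: int):
    """Zero-pad w to m x n (m = k*k, n = h*k), split it by rows into k
    sub-matrices W of size k x n, and block-transpose each into F."""
    m, n = k * k, h * k
    L = np.zeros((m, n))
    L[:w.shape[0], :w.shape[1]] = w          # row and column zero padding
    F_list = []
    for W in np.split(L, k, axis=0):         # the k sub-data matrices W
        # View W as a k x k grid of 1 x h blocks B_{i,j} and swap the block
        # row/column indices; the blocks themselves are not transposed.
        F = W.reshape(k, k, h).transpose(1, 0, 2).reshape(k, n)
        F_list.append(F)
    return F_list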
preferably, the method for generating the corresponding random number vector Z by the second party to the sub-data matrix F in the step S3 is as follows:
The second party generates a random number vector Z = [Z_1, Z_2, …, Z_n] in which each random number is non-zero and

Z_{d*h+1} + Z_{d*h+2} + … + Z_{(d+1)*h} = 0,

where 1 ≤ f ≤ n, 0 ≤ d ≤ k−1, and Z_f represents the f-th random number in the random number vector Z.
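A sketch of the mask generation (the helper name build_Z is an assumption; it presumes h ≥ 2, since a single non-zero value cannot sum to 0 on its own):

import numpy as np

def build_Z(k: int, h: int, rng=None) -> np.ndarray:
    """Random vector Z of length n = h*k: every entry is non-zero and each
    consecutive block of h entries sums to 0 (assumes h >= 2)."""
    rng = np.random.default_rng() if rng is None else rng
    blocks = []
    for _ in range(k):
        while True:
            z = rng.uniform(-10.0, 10.0, size=h - 1)
            last = -z.sum()                 # forces the block sum to be 0
            if np.all(z != 0) and last != 0:
                break
        blocks.append(np.append(z, last))
    return np.concatenate(blocks)

Z = build_Z(k=2, h=2)
print(Z, Z[:2].sum(), Z[2:].sum())          # each block sums to 0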
Preferably, in the step S5, the formula for adding the vector R to the corresponding encrypted random number vector Z1 to obtain the vector Q is as follows:
Q = R + Z1 = [R_1+Z1_1, R_2+Z1_2, …, R_n+Z1_n], where 1 ≤ f ≤ n, R_f represents the f-th value in the vector R, and Z1_f represents the f-th encrypted random number in the encrypted random number vector Z1.
Preferably, in the step S6, the formula of decrypting the vector Q by the first party using the private key sk to obtain the corresponding vector Q is as follows:
q = DEC(sk, Q) = [r_1+Z_1, r_2+Z_2, …, r_n+Z_n],

where DEC(sk, Q) denotes decrypting the vector Q with the homomorphic encryption algorithm using the private key sk as the decryption key, 1 ≤ f ≤ n, r_f represents the result of decrypting R_f with the private key sk using the homomorphic encryption algorithm, and Z_f represents the result of decrypting Z1_f with the private key sk using the homomorphic encryption algorithm.
Preferably, in the step S6, the method for summing the data of the corresponding position in the vector q according to the preprocessing parameter by the first party to obtain the sum u of the plurality of data corresponding to the vector q is as follows:
The first party calculates k data sums u, denoted u_1, u_2, …, u_k, where for 1 ≤ i ≤ k:

u_i = (r_{(i-1)*h+1} + Z_{(i-1)*h+1}) + (r_{(i-1)*h+2} + Z_{(i-1)*h+2}) + … + (r_{i*h} + Z_{i*h}).
Preferably, the method in step S6 for extracting the sums u of the corresponding data according to the preprocessing parameters to form the s-dimensional data column vector p is as follows:
the k sub-data matrices W into which the data matrix L is divided by rows are denoted W_1, W_2, …, W_k, so that

L =
[ W_1 ]
[ W_2 ]
[  …  ]
[ W_k ]
;

the sub-data matrix F obtained by converting W_i is denoted F_i;
the vector Q corresponding to the sub-data matrix F_i is denoted Q_i;
the vector q obtained by decrypting Q_i is denoted q_i;
the k data sums u corresponding to q_i are denoted u(i)_1, u(i)_2, …, u(i)_k;
the sums u of all the data are arranged in order into a data column vector y,

y = [u(1)_1, u(1)_2, …, u(1)_k, u(2)_1, u(2)_2, …, u(2)_k, …, u(k)_1, u(k)_2, …, u(k)_k]^T;
deleting the last t values in the data column vector y to obtain a data column vector p, wherein t=m-s.
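Putting the pieces together, the following plaintext sketch checks the whole pipeline against a direct matrix-vector product. Encryption and decryption are omitted (the identity stands in for enc/DEC), and the helper names compute_params, build_V, build_F_matrices and build_Z are the assumed sketches given above, not the patent's own code:

import numpy as np

def dot_protocol_plaintext(w: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Plaintext simulation of steps S1-S6 (homomorphic encryption omitted)."""
    g, s = w.shape
    k, h, m, n = compute_params(g, s)
    V = build_V(v, k, h)                      # first party's k x n matrix
    sums = []
    for F in build_F_matrices(w, k, h):       # second party's k matrices F
        Z = build_Z(k, h)
        Q = (F * V).sum(axis=0) + Z           # Hadamard product, column sums, mask
        q = Q                                 # "decryption" is the identity here
        sums.extend(q.reshape(k, h).sum(axis=1))   # block sums cancel the mask
    y = np.array(sums)                        # m = k*k entries in total
    return y[:g]                              # drop entries from the zero-padded rows

w = np.arange(1.0, 17.0).reshape(4, 4)
v = np.array([1.0, 2.0, 3.0, 4.0])
print(dot_protocol_plaintext(w, v))           # equals w @ v for this 4 x 4 example
print(w @ v)

In this square example g = s = 4, so keeping the first g entries of y coincides with deleting the last t values described above.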
The beneficial effects of the invention are as follows: the dot-product result of the data matrix held by the second party and the data column vector held by the first party can be calculated rapidly without leaking the plaintext data of either party, so the data privacy of both parties is protected; no rotation operation is needed, which solves the problem that rotation operations occupy an extremely large amount of computation time when homomorphic encryption algorithms such as CKKS and BFV that use SIMD to accelerate computation are used to calculate the dot product of a matrix and a vector; and the computational efficiency is improved, which facilitates federal learning training in big-data and big-model scenarios.
Drawings
FIG. 1 is a flow chart of an embodiment;
FIG. 2 is a schematic diagram illustrating the conversion of the data column vector v into the data matrix V and the conversion of the data matrix w into the sub-data matrices F;
FIG. 3 is a schematic diagram of the matrices E_1 and E_2.
Detailed Description
The technical scheme of the invention is further specifically described below through examples and with reference to the accompanying drawings.
Example: in this embodiment, a first party holds an s-dimensional data column vector v and a second party holds a data matrix w with g rows and s columns; as shown in FIG. 1, the data point multiplication operation method for federal learning comprises the following steps:
S1: the second party determines the preprocessing parameters k and h according to the number of rows and the number of columns of the data matrix w, by the formulas:

k = ⌈√g⌉, h = ⌈s/k⌉,

where ⌈ ⌉ denotes rounding up;
The first party preprocesses the data column vector v according to the preprocessing parameters k and h to obtain the data matrix V; the specific steps are as follows:
M1: the first party calculates the parameter n, n = h×k; if s = n, the data column vector v is transposed to obtain a data row vector v1; if s ≠ n, the data column vector v is zero-padded until its dimension reaches n and is then transposed to obtain the data row vector v1;
v1 = [a_1, a_2, a_3, …, a_n], where 1 ≤ f ≤ n and a_f represents the f-th data item in the data row vector v1;
M2: the data matrix V is calculated from the preprocessing parameters k and h and the data row vector v1 by the formula:

V =
[ A_{1,1}  A_{1,2}  A_{1,3}  …  A_{1,k} ]
[ A_{2,1}  A_{2,2}  A_{2,3}  …  A_{2,k} ]
[   …        …        …            …   ]
[ A_{k,1}  A_{k,2}  A_{k,3}  …  A_{k,k} ]
,
where 1 ≤ i ≤ k and A_{i,1}, A_{i,2}, A_{i,3}, …, A_{i,k} are all the same row vector, namely [a_{(i-1)*h+1}, a_{(i-1)*h+2}, …, a_{i*h-1}, a_{i*h}];
The second party preprocesses the data matrix w according to the preprocessing parameters to obtain a plurality of sub-data matrices F, where the number of rows and the number of columns of each sub-data matrix F are respectively equal to those of the data matrix V; the specific steps are as follows:
N1: the second party calculates the parameter m, m = k²; if g ≠ m, zero rows are appended to the data matrix w until the number of rows reaches m;
N2: the second party calculates the parameter n, n = h×k; if s ≠ n, zero columns are appended to the data matrix w until the number of columns reaches n, finally obtaining an m×n data matrix L;
N3: the data matrix L is equally divided by rows into k sub-data matrices W of size k×n,

W =
[ b_{1,1}  b_{1,2}  …  b_{1,n} ]
[ b_{2,1}  b_{2,2}  …  b_{2,n} ]
[   …        …           …    ]
[ b_{k,1}  b_{k,2}  …  b_{k,n} ]
,
where 1 ≤ i ≤ k, 1 ≤ f ≤ n, and b_{i,f} represents the data in the i-th row and f-th column of the sub-data matrix W;
N4: each sub-data matrix W is converted into a sub-data matrix F according to the preprocessing parameters k and h, obtaining k sub-data matrices F;
the method of converting the sub data matrix W into the sub data matrix F is as follows:
Each row of the sub-data matrix W is divided into k row vectors of h data each, giving the formula:

W =
[ B_{1,1}  B_{1,2}  …  B_{1,k} ]
[ B_{2,1}  B_{2,2}  …  B_{2,k} ]
[   …        …           …    ]
[ B_{k,1}  B_{k,2}  …  B_{k,k} ]
  (1),

where 1 ≤ i ≤ k, 1 ≤ j ≤ k, B_{i,j} is the row vector in block row i and block column j of the sub-data matrix W, and B_{i,j} = [b_{i,(j-1)*h+1}, b_{i,(j-1)*h+2}, …, b_{i,j*h-1}, b_{i,j*h}];
the k×k arrangement of blocks in equation (1) is transposed (the blocks themselves are not transposed) to obtain the sub-data matrix F:

F =
[ B_{1,1}  B_{2,1}  …  B_{k,1} ]
[ B_{1,2}  B_{2,2}  …  B_{k,2} ]
[   …        …           …    ]
[ B_{1,k}  B_{2,k}  …  B_{k,k} ]
;
S2: the first party generates a public key pk and a private key sk, homomorphically encrypts the data matrix V with the public key pk to obtain an encrypted data matrix V1, and sends the public key pk and the encrypted data matrix V1 to the second party;
S3: the second party generates a corresponding random number vector Z for each sub-data matrix F; the dimension of the random number vector Z is equal to the number of columns of the sub-data matrix F, each random number in the random number vector Z is non-zero, and the random numbers at the positions determined by the preprocessing parameters k and h sum to 0;
the method for generating the corresponding random number vector Z for the sub-data matrix F by the second party is as follows:
The second party generates a random number vector Z = [Z_1, Z_2, …, Z_n] in which each random number is non-zero and

Z_{d*h+1} + Z_{d*h+2} + … + Z_{(d+1)*h} = 0,

where 1 ≤ f ≤ n, 0 ≤ d ≤ k−1, and Z_f represents the f-th random number in the random number vector Z;
S4: the second party homomorphically encrypts each sub-data matrix F with the public key pk to obtain a corresponding encrypted sub-data matrix F1, and homomorphically encrypts each random number vector Z with the public key pk to obtain a corresponding encrypted random number vector Z1;
S5: the second party calculates the Hadamard product of each encrypted sub-data matrix F1 and the encrypted data matrix V1 to obtain a corresponding matrix E, sums each matrix E by columns to obtain a corresponding vector R, adds the vector R to the corresponding encrypted random number vector Z1 to obtain a vector Q, and sends all the calculated vectors Q to the first party in order;
the formula for adding the vector R to the corresponding encrypted random number vector Z1 to obtain the vector Q is as follows:
Q = R + Z1 = [R_1+Z1_1, R_2+Z1_2, …, R_n+Z1_n], where 1 ≤ f ≤ n, R_f represents the f-th value in the vector R, and Z1_f represents the f-th encrypted random number in the encrypted random number vector Z1;
S6: the first party decrypts each vector Q with the private key sk to obtain a corresponding vector q, sums the data at the corresponding positions in each vector q according to the preprocessing parameters k and h to obtain several data sums u corresponding to each vector q, and extracts the corresponding data sums u according to the preprocessing parameters k and h to form an s-dimensional data column vector p, which is the dot-product result of the data matrix w and the data column vector v;
the formula for decrypting the vector Q by the first party by using the private key sk to obtain the corresponding vector Q is as follows:
q = DEC(sk, Q) = [r_1+Z_1, r_2+Z_2, …, r_n+Z_n],

where DEC(sk, Q) denotes decrypting the vector Q with the homomorphic encryption algorithm using the private key sk as the decryption key, 1 ≤ f ≤ n, r_f represents the result of decrypting R_f with the private key sk using the homomorphic encryption algorithm, and Z_f represents the result of decrypting Z1_f with the private key sk using the homomorphic encryption algorithm;
the method for summing the data of the corresponding position in the vector q by the first party according to the preprocessing parameters k and h to obtain the sum u of a plurality of data corresponding to the vector q is as follows:
The first party calculates k data sums u, denoted u_1, u_2, …, u_k, where for 1 ≤ i ≤ k:

u_i = (r_{(i-1)*h+1} + Z_{(i-1)*h+1}) + (r_{(i-1)*h+2} + Z_{(i-1)*h+2}) + … + (r_{i*h} + Z_{i*h}).
In step S6, the method for extracting the sums u of the corresponding data according to the preprocessing parameters k and h to form the s-dimensional data column vector p is as follows:
the k sub-data matrices W into which the data matrix L is divided by rows are denoted W_1, W_2, …, W_k, so that

L =
[ W_1 ]
[ W_2 ]
[  …  ]
[ W_k ]
;

the sub-data matrix F obtained by converting W_i is denoted F_i;
the vector Q corresponding to the sub-data matrix F_i is denoted Q_i;
the vector q obtained by decrypting Q_i is denoted q_i;
the k data sums u corresponding to q_i are denoted u(i)_1, u(i)_2, …, u(i)_k;
the sums u of all the data are arranged in order into a data column vector y,

y = [u(1)_1, u(1)_2, …, u(1)_k, u(2)_1, u(2)_2, …, u(2)_k, …, u(k)_1, u(k)_2, …, u(k)_k]^T;
deleting the last t values in the data column vector y to obtain a data column vector p, wherein t=m-s.
In this scheme, the second party calculates the preprocessing parameters k and h according to the number of rows and columns of the data matrix w, and calculates the parameters m and n, m = k², n = h×k. If g = m the number of rows is unchanged; otherwise zero rows are appended to the data matrix w until the number of rows reaches m. If s = n the number of columns is unchanged; otherwise zero columns are appended to the data matrix w until the number of columns reaches n. The resulting m×n matrix is recorded as the data matrix L; the data matrix L is equally divided by rows into k sub-data matrices W of size k×n, and each sub-data matrix W is converted into a sub-data matrix F, giving k sub-data matrices F in total.
The first party calculates the parameter n; if s = n, the data column vector v is directly transposed to obtain the data row vector v1; if s ≠ n, the data column vector v is zero-padded until its dimension reaches n and is then transposed to obtain the data row vector v1. The data matrix V of k rows and n columns is calculated from the preprocessing parameters k and h and the data row vector v1; the number of rows and the number of columns of each sub-data matrix F are respectively equal to those of the data matrix V.
The first party and the second party encrypt the plaintext data matrices they hold with the same public key pk as the encryption key and the same homomorphic encryption algorithm to obtain the corresponding encrypted data matrices. The second party also generates a corresponding random number vector Z for each sub-data matrix F and encrypts it with the same public key pk and the same homomorphic encryption algorithm to obtain the encrypted random number vector Z1; each random number in the random number vector Z is non-zero, and every group of h consecutive random numbers, taken from front to back in the random number vector Z, sums to 0.
Then the second party calculates the Hadamard product of each encrypted sub-data matrix F1 and the encrypted data matrix V1 to obtain a corresponding matrix E, sums each matrix E by columns to obtain a corresponding vector R, adds the vector R to the corresponding encrypted random number vector Z1 to obtain a vector Q, and sends all the calculated vectors Q to the first party in order. The first party decrypts each vector Q with the private key sk to obtain a corresponding vector q and sums every group of h consecutive values, from front to back, in each vector q to obtain a corresponding data sum u; summing the data at the corresponding positions in each vector q in this way gives k data sums u, and the k data sums u of a vector q are, from front to back, exactly the k values of the column vector obtained by the dot product of the corresponding sub-data matrix W with the data column vector v. Because the data matrix w was zero-padded by rows and columns at the initial stage to obtain the data matrix L, the dot product of every all-zero row of the data matrix L with the data column vector v is 0; after the sums u of all the data are arranged in order into the data column vector y, the last t values of the data column vector y correspond to the dot products of the zero-padded rows with the data column vector v, t = m−s, so the last t values of the data column vector y are simply deleted to obtain the data column vector p, which is the dot-product result of the data matrix w and the data column vector v.
No Rotation operation is performed anywhere in the computation of this scheme, so no computation time is spent on ciphertext rotation. This solves the technical problem that rotation operations occupy an extremely large amount of computation time when homomorphic encryption algorithms such as CKKS and BFV, which use SIMD (single instruction, multiple data) to accelerate computation, are used to calculate the dot product of a matrix and a column vector; it greatly improves computational efficiency, facilitates federal learning training in big-data and big-model scenarios, and protects the data privacy of both parties.
In this scheme, a homomorphic encryption algorithm such as CKKS or BFV is used to encrypt the matrices row by row; that is, the first party sends k ciphertexts when sending the encrypted data matrix V1 to the second party, and the second party sends k computed vectors Q to the first party, so the encrypted computation communicates 2k ciphertexts in total. With the existing algorithm, since the data matrix w has g rows, at least g ciphertexts are sent to the first party by the second party; because 2k = 2⌈√g⌉ is much smaller than g when g is large, the communication volume of this method is also much lower than that of the existing algorithm.
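As a purely numerical illustration of this comparison (the loop values below are arbitrary examples, not figures from the patent):

import math

def ceil_sqrt(g: int) -> int:
    k = math.isqrt(g)
    return k if k * k == g else k + 1

for g in (16, 256, 4096, 65536):
    # existing row-by-row approach: at least g ciphertexts; this method: 2k
    print(f"g = {g:6d}   existing: >= {g:6d} ciphertexts   this method: 2k = {2 * ceil_sqrt(g)}")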
Illustration:
the first party and the second party perform federal learning model training, as shown in fig. 2, the second party holds a 4-row 4-column data matrix w, the first column characteristic value of the data matrix w is age data, the second column characteristic value is gender data, the third column characteristic value is personal income data, and the fourth column characteristic value is household overall income data; the first party holds a 4-dimensional data column vector v, the first data is an age characteristic parameter, the second data is a gender characteristic parameter, the third data is a personal income characteristic parameter, and the fourth data is a family overall income characteristic parameter.
The second party calculates k = 2, h = 2, m = 4, n = 4; since the data matrix w has 4 rows and 4 columns and the dimension of the data column vector v is 4, neither the data matrix w nor the data column vector v needs zero padding. As shown in FIG. 2, the first party preprocesses the data column vector v to obtain the data matrix V, and the second party preprocesses the data matrix w, equally dividing it into sub-data matrices W_1 and W_2 and then converting the sub-data matrices W_1 and W_2 into sub-data matrices F_1 and F_2.
The first party generates a public key pk and a private key sk, homomorphically encrypts the data matrix V with the public key pk to obtain the encrypted data matrix V1, V1 = enc(V), where enc(V) denotes homomorphic encryption of the data matrix V, and sends the public key pk and the encrypted data matrix V1 to the second party.
The second party generates a corresponding random number vector Z_1 for the sub-data matrix F_1 and a corresponding random number vector Z_2 for the sub-data matrix F_2: Z_1 = [Z_{1,1}, Z_{1,2}, Z_{1,3}, Z_{1,4}], with Z_{1,1}+Z_{1,2} = 0, Z_{1,3}+Z_{1,4} = 0 and none of Z_{1,1}, Z_{1,2}, Z_{1,3}, Z_{1,4} equal to 0; Z_2 = [Z_{2,1}, Z_{2,2}, Z_{2,3}, Z_{2,4}], with Z_{2,1}+Z_{2,2} = 0, Z_{2,3}+Z_{2,4} = 0 and none of Z_{2,1}, Z_{2,2}, Z_{2,3}, Z_{2,4} equal to 0.
The second party uses the public key pk to homomorphically encrypt the sub-data matrices F_1 and F_2, obtaining the corresponding encrypted sub-data matrices F1_1 = enc(F_1) and F1_2 = enc(F_2), and uses the public key pk to homomorphically encrypt the random number vectors Z_1 and Z_2, obtaining the corresponding encrypted random number vectors Z1_1 = enc(Z_1) and Z1_2 = enc(Z_2).
The second party calculates the Hadamard product of the encrypted sub-data matrix F1_1 and the encrypted data matrix V1 to obtain the corresponding matrix E_1, and calculates the Hadamard product of the encrypted sub-data matrix F1_2 and the encrypted data matrix V1 to obtain the corresponding matrix E_2. The matrices E_1 and E_2 are shown in FIG. 3; in matrix E_1, enc(a_1*d_{1,1}) represents the product of the homomorphically encrypted ciphertext of a_1 in the data matrix V and the homomorphically encrypted ciphertext of d_{1,1} in the sub-data matrix W_1.
The second party sums the matrix E_1 by columns to obtain the corresponding vector R_1, and adds the vector R_1 to the corresponding encrypted random number vector Z1_1 to obtain the vector Q_1:

Q_1 = [enc(a_1*d_{1,1}+a_3*d_{1,3}+Z_{1,1}), enc(a_2*d_{1,2}+a_4*d_{1,4}+Z_{1,2}), enc(a_1*d_{2,1}+a_3*d_{2,3}+Z_{1,3}), enc(a_2*d_{2,2}+a_4*d_{2,4}+Z_{1,4})];

the matrix E_2 is summed by columns to obtain the corresponding vector R_2, and the vector R_2 is added to the corresponding encrypted random number vector Z1_2 to obtain the vector Q_2:

Q_2 = [enc(a_1*d_{3,1}+a_3*d_{3,3}+Z_{2,1}), enc(a_2*d_{3,2}+a_4*d_{3,4}+Z_{2,2}), enc(a_1*d_{4,1}+a_3*d_{4,3}+Z_{2,3}), enc(a_2*d_{4,2}+a_4*d_{4,4}+Z_{2,4})];

the calculated vectors Q_1 and Q_2 are sent to the first party.
The first party uses the private key sk to decrypt the vectors Q_1 and Q_2, obtaining the corresponding vectors q_1 and q_2:

q_1 = [a_1*d_{1,1}+a_3*d_{1,3}+Z_{1,1}, a_2*d_{1,2}+a_4*d_{1,4}+Z_{1,2}, a_1*d_{2,1}+a_3*d_{2,3}+Z_{1,3}, a_2*d_{2,2}+a_4*d_{2,4}+Z_{1,4}],

q_2 = [a_1*d_{3,1}+a_3*d_{3,3}+Z_{2,1}, a_2*d_{3,2}+a_4*d_{3,4}+Z_{2,2}, a_1*d_{4,1}+a_3*d_{4,3}+Z_{2,3}, a_2*d_{4,2}+a_4*d_{4,4}+Z_{2,4}].
Since h = 2, the first party sums the vector q_1 in groups of 2 values from front to back to obtain u(1)_1 and u(1)_2, and sums the vector q_2 in groups of 2 values from front to back to obtain u(2)_1 and u(2)_2:

u(1)_1 = a_1*d_{1,1}+a_3*d_{1,3}+Z_{1,1}+a_2*d_{1,2}+a_4*d_{1,4}+Z_{1,2},
u(1)_2 = a_1*d_{2,1}+a_3*d_{2,3}+Z_{1,3}+a_2*d_{2,2}+a_4*d_{2,4}+Z_{1,4},
u(2)_1 = a_1*d_{3,1}+a_3*d_{3,3}+Z_{2,1}+a_2*d_{3,2}+a_4*d_{3,4}+Z_{2,2},
u(2)_2 = a_1*d_{4,1}+a_3*d_{4,3}+Z_{2,3}+a_2*d_{4,2}+a_4*d_{4,4}+Z_{2,4}.

Since Z_{1,1}+Z_{1,2} = 0, Z_{1,3}+Z_{1,4} = 0, Z_{2,1}+Z_{2,2} = 0 and Z_{2,3}+Z_{2,4} = 0,

u(1)_1 = a_1*d_{1,1}+a_2*d_{1,2}+a_3*d_{1,3}+a_4*d_{1,4},
u(1)_2 = a_1*d_{2,1}+a_2*d_{2,2}+a_3*d_{2,3}+a_4*d_{2,4},
u(2)_1 = a_1*d_{3,1}+a_2*d_{3,2}+a_3*d_{3,3}+a_4*d_{3,4},
u(2)_2 = a_1*d_{4,1}+a_2*d_{4,2}+a_3*d_{4,3}+a_4*d_{4,4}.

Therefore, u(1)_1, u(1)_2, u(2)_1 and u(2)_2 form the 4-dimensional data column vector p, which is the dot-product result of the data matrix w and the data column vector v and is consistent with the plaintext dot product of the data matrix w and the data column vector v.
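The cancellation shown above can be checked numerically. The following sketch uses arbitrary made-up values for w, v and the masks (the actual feature values of FIG. 2 are not reproduced here), with k = h = 2 as in the example:

import numpy as np

rng = np.random.default_rng(0)
w = rng.integers(1, 10, size=(4, 4)).astype(float)   # stands in for the 4 x 4 data matrix
v = rng.integers(1, 10, size=4).astype(float)        # stands in for the 4-dim column vector

V = np.tile(v.reshape(2, 2), (1, 2))                 # k = h = 2, so V is 2 x 4
p = []
for W in np.split(w, 2, axis=0):                     # W_1 and W_2
    F = W.reshape(2, 2, 2).transpose(1, 0, 2).reshape(2, 4)   # block transpose
    Z = np.array([3.5, -3.5, -1.2, 1.2])             # each pair of masks sums to 0
    q = (F * V).sum(axis=0) + Z                      # masked column sums of F o V
    p.extend(q.reshape(2, 2).sum(axis=1))            # u(i)_1, u(i)_2: masks cancel

print(np.array(p))                                   # equals w @ v
print(w @ v)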
The calculated data column vector p is used as a training sample of the batch in logistic regression and is used for calculating a loss function later and updating the federal learning model through gradient back propagation.

Claims (6)

1. A data point multiplication operation method for federal learning, wherein a first party holds an s-dimensional data column vector v and a second party holds a data matrix w with g rows and s columns, characterized in that the method comprises the following steps:
S1: the second party determines preprocessing parameters according to the number of rows and the number of columns of the data matrix w, the first party preprocesses the data column vector v according to the preprocessing parameters to obtain a data matrix V, and the second party preprocesses the data matrix w according to the preprocessing parameters to obtain a plurality of sub-data matrices F;
S2: the first party generates a public key pk and a private key sk, homomorphically encrypts the data matrix V with the public key pk to obtain an encrypted data matrix V1, and sends the public key pk and the encrypted data matrix V1 to the second party;
S3: the second party generates a corresponding random number vector Z for each sub-data matrix F, where the random numbers at the positions specified by the preprocessing parameters sum to 0;
S4: the second party homomorphically encrypts each sub-data matrix F with the public key pk to obtain a corresponding encrypted sub-data matrix F1, and homomorphically encrypts each random number vector Z with the public key pk to obtain a corresponding encrypted random number vector Z1;
S5: the second party calculates the Hadamard product of each encrypted sub-data matrix F1 and the encrypted data matrix V1 to obtain a corresponding matrix E, sums each matrix E by columns to obtain a corresponding vector R, adds the vector R to the corresponding encrypted random number vector Z1 to obtain a vector Q, and sends all the calculated vectors Q to the first party;
S6: the first party decrypts each vector Q with the private key sk to obtain a corresponding vector q, and extracts the sums u of the corresponding data from the vectors q according to the preprocessing parameters to form the dot-product result;
the method in step S6 for extracting the sums u of the corresponding data from the vectors q according to the preprocessing parameters to form the dot-product result is as follows:
the data at the corresponding positions in each vector q are summed according to the preprocessing parameters to obtain the sums u of data corresponding to each vector q, and the corresponding sums u are extracted according to the preprocessing parameters to form an s-dimensional data column vector p, which is the dot-product result of the data matrix w and the data column vector v;
the method for determining the preprocessing parameters by the second party according to the row number and the column number of the data matrix w in the step S1 is as follows:
the second party calculates the preprocessing parameters k and h by the formulas:

k = ⌈√g⌉, h = ⌈s/k⌉,

where ⌈ ⌉ denotes rounding up;
in the step S1, the method by which the first party preprocesses the data column vector v according to the preprocessing parameters to obtain the data matrix V is as follows:
M1: the first party calculates the parameter n, n = h×k; if s = n, the data column vector v is transposed to obtain a data row vector v1; if s ≠ n, the data column vector v is zero-padded until its dimension reaches n and is then transposed to obtain the data row vector v1;
v1 = [a_1, a_2, a_3, …, a_n], where 1 ≤ f ≤ n and a_f represents the f-th data item in the data row vector v1;
M2: the data matrix V is calculated from the preprocessing parameters k and h and the data row vector v1 by the formula:

V =
[ A_{1,1}  A_{1,2}  A_{1,3}  …  A_{1,k} ]
[ A_{2,1}  A_{2,2}  A_{2,3}  …  A_{2,k} ]
[   …        …        …            …   ]
[ A_{k,1}  A_{k,2}  A_{k,3}  …  A_{k,k} ]
,
where 1 ≤ i ≤ k and A_{i,1}, A_{i,2}, A_{i,3}, …, A_{i,k} are all the same row vector, namely [a_{(i-1)*h+1}, a_{(i-1)*h+2}, …, a_{i*h-1}, a_{i*h}];
In the step S1, the second party performs preprocessing on the data matrix w according to the preprocessing parameters, and the method for obtaining a plurality of sub-data matrices F is as follows:
N1: the second party calculates the parameter m, m = k²; if g ≠ m, zero rows are appended to the data matrix w until the number of rows reaches m;
N2: the second party calculates the parameter n, n = h×k; if s ≠ n, zero columns are appended to the data matrix w until the number of columns reaches n, finally obtaining an m×n data matrix L;
N3: the data matrix L is equally divided by rows into k sub-data matrices W of size k×n,

W =
[ b_{1,1}  b_{1,2}  …  b_{1,n} ]
[ b_{2,1}  b_{2,2}  …  b_{2,n} ]
[   …        …           …    ]
[ b_{k,1}  b_{k,2}  …  b_{k,n} ]
,
where 1 ≤ i ≤ k, 1 ≤ f ≤ n, and b_{i,f} represents the data in the i-th row and f-th column of the sub-data matrix W;
N4: each sub-data matrix W is converted into a sub-data matrix F according to the preprocessing parameters k and h, obtaining k sub-data matrices F;
the method of converting the sub data matrix W into the sub data matrix F is as follows:
each row of the sub-data matrix W is divided into k row vectors of h data each, giving the formula:

W =
[ B_{1,1}  B_{1,2}  …  B_{1,k} ]
[ B_{2,1}  B_{2,2}  …  B_{2,k} ]
[   …        …           …    ]
[ B_{k,1}  B_{k,2}  …  B_{k,k} ]
  (1),

where 1 ≤ i ≤ k, 1 ≤ j ≤ k, B_{i,j} is the row vector in block row i and block column j of the sub-data matrix W, and B_{i,j} = [b_{i,(j-1)*h+1}, b_{i,(j-1)*h+2}, …, b_{i,j*h-1}, b_{i,j*h}];
the k×k arrangement of blocks in equation (1) is transposed (the blocks themselves are not transposed) to obtain the sub-data matrix F:

F =
[ B_{1,1}  B_{2,1}  …  B_{k,1} ]
[ B_{1,2}  B_{2,2}  …  B_{k,2} ]
[   …        …           …    ]
[ B_{1,k}  B_{2,k}  …  B_{k,k} ]
.
2. The method of claim 1, wherein the method for the second party to generate the corresponding random number vector Z for the sub-data matrix F in the step S3 is as follows:
the second party generates a random number vector Z = [Z_1, Z_2, …, Z_n] in which each random number is non-zero and

Z_{d*h+1} + Z_{d*h+2} + … + Z_{(d+1)*h} = 0,

where 1 ≤ f ≤ n, 0 ≤ d ≤ k−1, and Z_f represents the f-th random number in the random number vector Z.
3. The method according to claim 2, wherein the formula for adding the vector R to the corresponding encrypted random number vector Z1 in the step S5 to obtain the vector Q is as follows:
Q = R + Z1 = [R_1+Z1_1, R_2+Z1_2, …, R_n+Z1_n], where 1 ≤ f ≤ n, R_f represents the f-th value in the vector R, and Z1_f represents the f-th encrypted random number in the encrypted random number vector Z1.
4. A data point multiplication method for federal learning according to claim 3, wherein the formula of decrypting the vector Q by the first party using the private key sk in the step S6 to obtain the corresponding vector Q is as follows:
q = DEC(sk, Q) = [r_1+Z_1, r_2+Z_2, …, r_n+Z_n],

where DEC(sk, Q) denotes decrypting the vector Q with the homomorphic encryption algorithm using the private key sk as the decryption key, 1 ≤ f ≤ n, r_f represents the result of decrypting R_f with the private key sk using the homomorphic encryption algorithm, and Z_f represents the result of decrypting Z1_f with the private key sk using the homomorphic encryption algorithm.
5. The method for performing the data point multiplication operation for federal learning according to claim 4, wherein the method for summing the data at the corresponding position in the vector q according to the preprocessing parameter by the first party in the step S6 to obtain the sum u of the plurality of data corresponding to the vector q is as follows:
the first party calculates k data sums u, denoted u_1, u_2, …, u_k, where for 1 ≤ i ≤ k:

u_i = (r_{(i-1)*h+1} + Z_{(i-1)*h+1}) + (r_{(i-1)*h+2} + Z_{(i-1)*h+2}) + … + (r_{i*h} + Z_{i*h}).
6. The method for performing the data point multiplication operation for federal learning according to claim 5, wherein the method in step S6 for extracting the sums u of the corresponding data according to the preprocessing parameters to form the s-dimensional data column vector p is as follows:
the k sub-data matrices W into which the data matrix L is divided by rows are denoted W_1, W_2, …, W_k, so that

L =
[ W_1 ]
[ W_2 ]
[  …  ]
[ W_k ]
;

the sub-data matrix F obtained by converting W_i is denoted F_i;
the vector Q corresponding to the sub-data matrix F_i is denoted Q_i;
the vector q obtained by decrypting Q_i is denoted q_i;
the k data sums u corresponding to q_i are denoted u(i)_1, u(i)_2, …, u(i)_k;
the sums u of all the data are arranged in order into a data column vector y,

y = [u(1)_1, u(1)_2, …, u(1)_k, u(2)_1, u(2)_2, …, u(2)_k, …, u(k)_1, u(k)_2, …, u(k)_k]^T;
deleting the last t values in the data column vector y to obtain a data column vector p, wherein t=m-s.
CN202310170136.6A (priority date 2023-02-27, filing date 2023-02-27) — Data point multiplication operation method for federal learning — granted as CN115865307B, status Active

Priority Applications (1)

CN202310170136.6A — priority date 2023-02-27, filing date 2023-02-27 — Data point multiplication operation method for federal learning (granted as CN115865307B)

Applications Claiming Priority (1)

CN202310170136.6A — priority date 2023-02-27, filing date 2023-02-27 — Data point multiplication operation method for federal learning (granted as CN115865307B)

Publications (2)

Publication Number Publication Date
CN115865307A CN115865307A (en) 2023-03-28
CN115865307B (en) 2023-05-09

Family

ID=85659126

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310170136.6A — Data point multiplication operation method for federal learning — priority date 2023-02-27, filing date 2023-02-27 (granted as CN115865307B, Active)

Country Status (1)

Country Link
CN (1) CN115865307B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116248252B (en) * 2023-05-10 2023-07-14 蓝象智联(杭州)科技有限公司 Data dot multiplication processing method for federal learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113987559A (en) * 2021-12-24 2022-01-28 支付宝(杭州)信息技术有限公司 Method and device for jointly processing data by two parties for protecting data privacy
CN115225405A (en) * 2022-07-28 2022-10-21 上海光之树科技有限公司 Matrix decomposition method based on security aggregation and key exchange under federated learning framework
CN115392487A (en) * 2022-06-30 2022-11-25 中国人民解放军战略支援部队信息工程大学 Privacy protection nonlinear federal support vector machine training method and system based on homomorphic encryption

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008209499A (en) * 2007-02-23 2008-09-11 Toshiba Corp Aes decryption apparatus and program
CN110766128A (en) * 2018-07-26 2020-02-07 北京深鉴智能科技有限公司 Convolution calculation unit, calculation method and neural network calculation platform
CN109768864A (en) * 2019-01-14 2019-05-17 大连大学 Encryption method based on ECC and homomorphic cryptography
CN109787743B (en) * 2019-01-17 2022-06-14 广西大学 Verifiable fully homomorphic encryption method based on matrix operation
CN110324135B (en) * 2019-07-04 2022-05-31 浙江理工大学 Homomorphic encryption matrix determinant security outsourcing method based on cloud computing
US11431470B2 (en) * 2019-08-19 2022-08-30 The Board Of Regents Of The University Of Texas System Performing computations on sensitive data while guaranteeing privacy
CN112199702A (en) * 2020-10-16 2021-01-08 鹏城实验室 Privacy protection method, storage medium and system based on federal learning
CN113434878B (en) * 2021-06-25 2023-07-07 平安科技(深圳)有限公司 Modeling and application method, device, equipment and storage medium based on federal learning
CN113516253B (en) * 2021-07-02 2022-04-05 深圳市洞见智慧科技有限公司 Data encryption optimization method and device in federated learning
CN114237548B (en) * 2021-11-22 2023-07-18 南京大学 Method and system for complex point multiplication operation based on nonvolatile memory array
CN114168991B (en) * 2022-02-10 2022-05-20 北京鹰瞳科技发展股份有限公司 Method, circuit and related product for processing encrypted data
CN115169576B (en) * 2022-06-24 2024-02-09 上海富数科技有限公司 Model training method and device based on federal learning and electronic equipment
CN115643105B (en) * 2022-11-17 2023-03-10 杭州量安科技有限公司 Federal learning method and device based on homomorphic encryption and depth gradient compression

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113987559A (en) * 2021-12-24 2022-01-28 支付宝(杭州)信息技术有限公司 Method and device for jointly processing data by two parties for protecting data privacy
CN115392487A (en) * 2022-06-30 2022-11-25 中国人民解放军战略支援部队信息工程大学 Privacy protection nonlinear federal support vector machine training method and system based on homomorphic encryption
CN115225405A (en) * 2022-07-28 2022-10-21 上海光之树科技有限公司 Matrix decomposition method based on security aggregation and key exchange under federated learning framework

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Fully homomorphic encryption: Searching over encrypted cloud data; Alexander Wood; ACM Computing Surveys, Vol. 53, No. 4; full text *
Simulation research on encryption of characteristic data in Internet of Things communication; Cheng Zhiqiang, Lian Hongpeng; Computer Simulation, No. 11; full text *
Discussion on building a shared data lake for federated learning; Liu Yang; Cybersecurity and Informatization, No. 9; full text *

Also Published As

Publication number Publication date
CN115865307A (en) 2023-03-28

Similar Documents

Publication Publication Date Title
CN112989368B (en) Method and device for processing private data by combining multiple parties
US6298136B1 (en) Cryptographic method and apparatus for non-linearly merging a data block and a key
Abraham et al. Secure image encryption algorithms: A review
CN109660696B (en) New image encryption method
CN113940028B (en) Method and device for realizing white box password
CN110166223B (en) Rapid implementation method of cryptographic block cipher algorithm SM4
CN112134688B (en) Asymmetric image encryption method based on quantum chaotic mapping and SHA-3
CN110880967B (en) Method for parallel encryption and decryption of multiple messages by adopting packet symmetric key algorithm
CN115276947B (en) Private data processing method, device, system and storage medium
CN115865307B (en) Data point multiplication operation method for federal learning
Nazeer et al. Implication of genetic algorithm in cryptography to enhance security
CN111597574A (en) Parallel image encryption system and method based on spatial diffusion structure
CN115392487A (en) Privacy protection nonlinear federal support vector machine training method and system based on homomorphic encryption
CN112311524A (en) Image encryption method based on new chaotic mapping and compressed sensing
CN113076551B (en) Color image encryption method based on lifting scheme and cross-component scrambling
JP5689826B2 (en) Secret calculation system, encryption apparatus, secret calculation apparatus and method, program
CN113869499A (en) High-efficiency conversion method for unintentional neural network
Paul et al. Matrix based cryptographic procedure for efficient image encryption
Das et al. Diffusion and encryption of digital image using genetic algorithm
CN106921486A (en) The method and apparatus of data encryption
Khalaf et al. Proposed triple hill cipher algorithm for increasing the security level of encrypted binary data and its implementation using FPGA
Bajaj et al. AES algorithm for encryption
CN116248252B (en) Data dot multiplication processing method for federal learning
CN111756518B (en) Color image encryption method based on memristor hyperchaotic system
CN106961328A (en) A kind of VHE implementation methods

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant