CN111259443B

CN111259443B - Method for protecting privacy in the federated learning prediction stage based on PSI (private set intersection) technology

Info

Publication number: CN111259443B
Application number: CN202010046301.3A
Authority: CN
Inventors: 张韶峰; 单进勇
Original assignee: Bairong Yunchuang Technology Co ltd
Current assignee: Bairong Yunchuang Technology Co ltd
Priority date: 2020-01-16
Filing date: 2020-01-16
Publication date: 2022-07-01
Anticipated expiration: 2040-01-16
Also published as: CN111259443A

Abstract

A method for protecting privacy in the prediction stage of federated learning based on PSI (private set intersection) technology comprises the following steps: first, the prediction service party computes the prediction result of its own partial model; the two parties then execute an improved PSI protocol, in which the prediction service party encrypts its partial model results and the prediction demand party, using its own data, decrypts the results for the shared records; finally, the prediction demand party obtains the prediction result for each keyword id shared by the two parties. The method uses PSI technology and encrypts data with keys from a key derivation function, meets the privacy protection requirements of the federated learning prediction stage, completes the last link of privacy and security protection in federated learning, and helps federated learning be deployed in more application scenarios.

Description

Method for protecting privacy in the federated learning prediction stage based on PSI (private set intersection) technology

Technical Field

The invention relates to the fields of information security and artificial intelligence, and in particular to the application of PSI (private set intersection) technology in federated learning to protect privacy in the prediction stage; it is a method for protecting privacy in the federated learning prediction stage based on PSI technology.

Background

The rapid development of emerging technologies such as big data, cloud computing, and the Internet of Things has led to explosive growth of data, and this data is held by different organizations. On the one hand, with the promulgation of laws and regulations such as China's Cybersecurity Law and the European Union's General Data Protection Regulation (GDPR), governments' requirements for the protection of user data privacy are becoming increasingly strict, and the use and analysis of big data by traditional methods alone has already hit bottlenecks. On the other hand, each organization holding a large amount of data hopes to share data and mine its potentially huge value without revealing its own data. Data privacy and security therefore receive more and more attention in big data applications. Common privacy protection techniques include k-anonymity, differential privacy, homomorphic encryption, and secure multi-party computation.

Federated learning, proposed by Google in 2016, is regarded as one of the technologies that can resolve this dilemma and is attracting increasing attention. Federated learning is essentially a cryptography-based distributed learning technology: the participating organizations jointly train a model, the federated model, on an equal footing without revealing their own data, so that the requirements of user privacy protection, organizational data security, and government laws and regulations are met while the training effect of the model is preserved. According to the distribution characteristics of the data sets, federated learning is divided into horizontal federated learning, vertical federated learning, and federated transfer learning.

Judging from currently published code, papers, and patents, prior-art federated learning schemes focus on the training phase of the model and rarely mention how the prediction phase is implemented. Most of the prior art simply states that the federated learning participants perform joint prediction with the federated model, and assumes that the user then uses the model directly. Although the prior art considers data privacy protection in the training phase, it does not consider in depth the data security problems that may arise when the model is subsequently used. As the last link of federated learning, prediction is used most frequently in practical applications and embodies the application value of federated learning, yet most of the prior art considers only how to obtain the federated model, and few studies address the problems that arise in using it. In practice, each organization obtains only part of the model through federated learning, so every organization must still participate in the prediction stage. Moreover, in real-world scenarios the party that raises a prediction demand often does not want the other participants to learn the ids of the data to be predicted, so privacy protection is also needed in the prediction stage. For example, when a lending institution and a credit bureau cooperate to predict a borrower's credit, the lending institution does not want the other party to learn that the borrower has a borrowing need, because the credit bureau could then provide the borrower's information to other lending institutions; the lending institution therefore hopes that the borrower's information is not leaked while the borrower's credit is predicted.

Private set intersection (PSI) is a specific application problem in the field of secure multi-party computation and one of the active topics in privacy protection. It can be used for data alignment before model training in federated learning, and it can also be used to protect privacy in the prediction phase. PSI allows participants to compute the intersection of their respective data sets through underlying cryptographic techniques without revealing any of their data outside the intersection; the intersection itself may be learned by one designated participant or by all participants. PSI therefore has potential application value in scenarios such as blacklist sharing, marketing matching, similar-document detection, and private contact discovery. According to the underlying cryptographic technique, PSI protocols fall into three main types: PSI based on public-key cryptography, PSI based on garbled circuits, and PSI based on the OT (oblivious transfer) protocol. Although PSI technology is mature, there has been no study of how to apply it specifically in the prediction phase of federated learning.

Disclosure of Invention

The problem to be solved by the invention is as follows: existing federated learning technology places most of its emphasis on building the federated model, and there is little research on deploying federated learning in practice; as the last link of federated learning, prediction is used most frequently in practical application scenarios, yet the prior art rarely studies the prediction stage and does not consider its data security problems.

The technical means of the invention is as follows: in the prediction stage of federated learning, the parties exchange data using private set intersection (PSI) technology, the exchanged information is encrypted during the exchange, and the privacy of the parties participating in federated learning is protected.

The federated learning is vertical federated learning. Its participants comprise a prediction demand party Alice, a prediction service party Bob, and a trusted third party Carol. With the help of Carol, Alice and Bob obtain through federated learning their respective partial models $\mathrm{Model}_A$ and $\mathrm{Model}_B$ of the federated model. In the prediction stage, Alice and Bob use $\mathrm{Model}_A$ and $\mathrm{Model}_B$ respectively to compute prediction results and then perform merged prediction. In the merged prediction, the privacy of Alice and Bob is protected based on PSI: Bob encrypts the prediction results he sends, and Alice, after completing the set intersection with Bob based on PSI, decrypts the prediction results corresponding to the intersection to obtain the required prediction data. Alice thus obtains Bob's prediction results without revealing her own private information, and Bob provides prediction data without revealing his own private information.

In the federated learning model, a data sample $(id, f)$ consists of a keyword $id$ and a feature value $f$; the set of ids is denoted $ID$ and the set of features $F$. When Alice sends a prediction request in the prediction stage, Bob first computes the prediction results of his model $\mathrm{Model}_B$. Alice and Bob then execute a PSI protocol to obtain the masks of their respective keywords, and Bob encrypts his prediction results under keys produced by a key derivation function. Alice computes the set intersection from the PSI-transformed keywords, decrypts the prediction results of the intersection using the key derivation function, and merges them with the prediction results computed by $\mathrm{Model}_A$, finally obtaining the prediction result for each keyword id shared by the two parties.

To solve the problem of privacy protection in the federated learning prediction stage, the invention provides a solution based on PSI technology that meets the privacy protection requirements of this stage, completes the last link of privacy and security protection in federated learning, and helps federated learning be deployed in more application scenarios. The problem to be solved is that, when the prediction demand party Alice cooperates with the prediction service party Bob for prediction, Bob must not learn the ids that Alice needs predicted or their feature values, and Alice must not learn Bob's data set, in particular the feature values corresponding to the ids she shares with Bob; at the same time, Bob, without knowing which ids Alice needs, must still be able to send the results of his own model to Alice.

Research on privacy in the federated learning prediction stage is still a blank in the prior art. PSI provides a secure way for two parties to obtain intersection information, but a federated learning prediction requires not only the ids shared by Alice and Bob but also specific values associated with those ids. Privacy protection in the prediction stage therefore means both preventing Bob from learning which ids Alice submits and protecting the information that Bob provides. The invention studies the privacy and data security problems of the federated learning prediction stage, analyzes the problems that exist in this stage, and provides a solution: after the prediction demand party and the prediction service party have each computed their own prediction results from the federated model, the prediction demand party Alice, under the privacy protection method of the invention, further obtains the relevant prediction results of the prediction service party Bob and merges them into a more complete prediction result, while the privacy of Alice and Bob and the security of the exchanged data are ensured.

Drawings

Fig. 1 is a flow chart of federated learning, which can generally be divided into stages such as data alignment, training, and prediction; the prediction stage addressed by the invention is enclosed in the dashed box.

Fig. 2 is a schematic flow chart of the federated learning prediction stage of the invention, i.e. a refinement of the contents of the dashed box in Fig. 1.

Detailed Description

The invention provides a method for protecting privacy in the federated learning prediction stage based on PSI technology. As shown in Fig. 1, federated learning can be roughly divided into stages such as data alignment, training, and prediction; the invention is a technical solution for data security and privacy in the prediction stage, i.e. the part in the dashed box in Fig. 1.

The method is mainly intended for vertical federated learning. The participants comprise a prediction demand party Alice, a prediction service party Bob, and a trusted third party Carol; with the help of Carol, Alice and Bob obtain through federated learning their respective partial models $\mathrm{Model}_A$ and $\mathrm{Model}_B$ of the federated model. A data sample $(id, f)$ consists of a keyword $id$ and a feature value $f$. In the prediction stage, Alice and Bob use $\mathrm{Model}_A$ and $\mathrm{Model}_B$ respectively to compute prediction results for their own data samples and then perform merged prediction. In the merged prediction, the privacy of Alice and Bob is protected based on PSI: Alice and Bob execute a PSI protocol to obtain the masks of their respective keywords, and Bob further encrypts the prediction results he sends under keys produced by a key derivation function. Alice computes the set intersection from the PSI-transformed keywords, decrypts the prediction results of the intersection using the key derivation function, and merges them with the prediction results computed by her own $\mathrm{Model}_A$, finally obtaining the prediction result for each keyword id shared by the two parties. Alice thus obtains Bob's prediction results without revealing her own private information, and Bob provides prediction data without revealing his own private information.

The implementation of the invention is illustrated in Fig. 2. In the federated learning model, a data sample $(id, f)$ consists of a keyword $id$ and a feature value $f$; the set of ids is denoted $ID$ and the set of features $F$. The data sets that Alice and Bob use for prediction are denoted $\mathrm{Data}_A$ and $\mathrm{Data}_B$, the corresponding keyword sets $ID_A$ and $ID_B$, and the corresponding feature sets $F_A$ and $F_B$. The keyword sets of $\mathrm{Data}_A$ and $\mathrm{Data}_B$ intersect, while their feature values differ. The samples in $\mathrm{Data}_A$ and $\mathrm{Data}_B$ are $(id, f_{A,id})$ and $(id, f_{B,id})$ respectively, where $f_{A,id}$ and $f_{B,id}$ denote the feature values that Alice and Bob hold for keyword $id$. Alice needs to compute, with Bob's cooperation, the prediction results for $id \in ID_A$. When Alice sends a prediction request, Alice and Bob execute the following steps:

S1) Bob applies $\mathrm{Model}_B$ to his own data set $\mathrm{Data}_B$ and computes $\mathrm{score}_{B,id} = \mathrm{Model}_B(f_{B,id})$ for all $id \in ID_B$, forming a new data set denoted $\mathrm{Predict}_B = \{(id, \mathrm{score}_{B,id}) \mid id \in ID_B\}$.

S2) Alice and Bob execute the PSI protocol. Bob's data set $\mathrm{Predict}_B$ is transformed into $eP_B = \{(\mathrm{eid}_{B,id}, \mathrm{score}_{B,id}) \mid id \in ID_B\}$, where the same $id$ may correspond to multiple $\mathrm{eid}_{B,id}$; Alice's keyword set $ID_A$ is transformed into $eID_A = \{\mathrm{eid}_{A,id} \mid id \in ID_A\}$, where the same $id$ corresponds to only one $\mathrm{eid}_{A,id}$.

S3) For each $(\mathrm{eid}_{B,id}, \mathrm{score}_{B,id}) \in eP_B$, Bob derives a symmetric encryption key using a key derivation function KDF: $k_{id} = \mathrm{KDF}(\mathrm{eid}_{B,id}, id, iter, klen)$, where $\mathrm{eid}_{B,id}$ serves as the password, $id$ as the salt of the key, $iter$ is the number of iterations, and $klen$ is the key length of the symmetric encryption algorithm. In cryptography, a salt is a specific character string inserted at a fixed position of a password so that the result of the hash operation does not match the hash value of the original password.

S4) Bob uses $k_{id}$ from step S3) to encrypt $\mathrm{score}_{B,id}$, obtaining $c_{id} = \mathrm{Enc}(k_{id}, \mathrm{score}_{B,id})$, where $\mathrm{Enc}(\cdot,\cdot)$ is the encryption algorithm of a symmetric cipher, and thereby obtains a new data set $\mathrm{EPredict}_B = \{(\mathrm{eid}_{B,id}, c_{id}) \mid id \in ID_B\}$, which he sends to Alice. Preferably, a hash operation is also applied to $\mathrm{eid}_{B,id}$, giving $\mathrm{EPredict}_B = \{(H(\mathrm{eid}_{B,id}), c_{id}) \mid id \in ID_B\}$, where $H(\cdot)$ is a hash function; this prevents brute-force search of the data sent by Bob and further ensures data security, and Alice computes with the same hash function in step S5).

S5) From the data set $\mathrm{EPredict}_B = \{(\mathrm{eid}_{B,id}, c_{id}) \mid id \in ID_B\}$ sent by Bob in step S4), Alice computes the intersection:

$$\mathrm{EPredict}_{A\cap B} = \{(\mathrm{eid}_{A,id}, c_{id}) \mid \mathrm{eid}_{A,id} = \mathrm{eid}_{B,id},\ \mathrm{eid}_{A,id} \in eID_A,\ (\mathrm{eid}_{B,id}, c_{id}) \in \mathrm{EPredict}_B\}$$

If Bob applied the hash operation to $\mathrm{eid}_{B,id}$, Alice computes with the same hash function, and the intersection is:

$$\mathrm{EPredict}_{A\cap B} = \{(\mathrm{eid}_{A,id}, c_{id}) \mid H(\mathrm{eid}_{A,id}) = H(\mathrm{eid}_{B,id}),\ \mathrm{eid}_{A,id} \in eID_A,\ (H(\mathrm{eid}_{B,id}), c_{id}) \in \mathrm{EPredict}_B\}$$

In this step, a cuckoo filter is further preferably used to improve the efficiency of the intersection computation.

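S6) For all $(\mathrm{eid}_{A,id}, c_{id}) \in \mathrm{EPredict}_{A\cap B}$, Alice obtains the corresponding keyword in the intersection, $id_{A\cap B} \in ID_A \cap ID_B$, computes the symmetric key with the key derivation function, $k_{id,A\cap B} = \mathrm{KDF}(\mathrm{eid}_{A,id}, id_{A\cap B}, iter, klen)$, and decrypts $c_{id}$ to obtain Bob's prediction result for the intersection, $\mathrm{score}_{B,id} = \mathrm{Dec}(k_{id,A\cap B}, c_{id})$, where $\mathrm{Dec}(\cdot,\cdot)$ is the decryption algorithm of the symmetric cipher. This yields the data intersection $\mathrm{Predict}_{A\cap B} = \{(id, \mathrm{score}_{B,id}) \mid id \in ID_A \cap ID_B\}$.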

S7) From the result $\mathrm{Predict}_{A\cap B}$ obtained in step S6), Alice computes the merged prediction result: for $id \in ID_A \cap ID_B$, she computes $\mathrm{score}_{A,id} = \mathrm{Model}_A(f_{A,id})$ with $\mathrm{Model}_A$ and merges $\mathrm{score}_{A,id}$ with the $\mathrm{score}_{B,id}$ in $\mathrm{Predict}_{A\cap B}$ to obtain the final prediction result $\mathrm{score}_{id} = \mathrm{Merge}(\mathrm{score}_{A,id}, \mathrm{score}_{B,id})$, where $\mathrm{Merge}(\cdot,\cdot)$ is determined by the machine learning algorithm chosen for federated learning.

As shown in Fig. 2, the invention uses PSI technology to obtain the respective masks of Alice and Bob. The prediction service party uses its mask as the seed of a key derivation function (KDF) to obtain a symmetric encryption key and encrypts the results of its own model. The prediction demand party derives the decryption key through the same key derivation function from the intersection result and obtains the required data from the prediction service party to complete the final prediction.
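The flow summarized above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: `toy_mask` is a hypothetical stand-in for the PSI transformation (a real PSI protocol would give each party its masks without both parties holding a common key), `toy_cipher` is an insecure XOR placeholder for the symmetric cipher Enc/Dec, and PBKDF2 is used only as one example of a KDF that takes a password, a salt, an iteration count, and a key length. All function names and parameter values are illustrative.

```python
import hashlib
import hmac
import struct

ITER, KLEN = 1000, 32  # assumed KDF parameters (iter, klen)

def toy_mask(psi_key: bytes, id_: str) -> bytes:
    """Hypothetical stand-in for the PSI-transformed keyword eid."""
    return hmac.new(psi_key, id_.encode(), hashlib.sha256).digest()

def derive_key(eid: bytes, id_: str) -> bytes:
    """k_id = KDF(eid, id, iter, klen): eid is the password, id the salt."""
    return hashlib.pbkdf2_hmac("sha256", eid, id_.encode(), ITER, KLEN)

def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Insecure XOR placeholder for Enc/Dec (data up to 32 bytes).

    Applying it twice with the same key restores the input.
    """
    stream = hashlib.sha256(key).digest()
    return bytes(d ^ s for d, s in zip(data, stream))

def bob_encrypt(psi_key: bytes, predict_b: dict) -> dict:
    """Steps S3-S4: build a map H(eid) -> c_id over Bob's data set."""
    epredict_b = {}
    for id_, score in predict_b.items():
        eid = toy_mask(psi_key, id_)
        c_id = toy_cipher(derive_key(eid, id_), struct.pack(">d", score))
        epredict_b[hashlib.sha256(eid).digest()] = c_id
    return epredict_b

def alice_decrypt(psi_key: bytes, ids_a: list, epredict_b: dict) -> dict:
    """Steps S5-S6: intersect on H(eid), then decrypt only the matches."""
    predict_ab = {}
    for id_ in ids_a:
        eid = toy_mask(psi_key, id_)
        c_id = epredict_b.get(hashlib.sha256(eid).digest())
        if c_id is not None:
            plain = toy_cipher(derive_key(eid, id_), c_id)
            predict_ab[id_] = struct.unpack(">d", plain)[0]
    return predict_ab

if __name__ == "__main__":
    sent = bob_encrypt(b"demo-key", {"u1": 0.25, "u2": -1.5, "u3": 2.0})
    print(alice_decrypt(b"demo-key", ["u2", "u3", "u9"], sent))
```

A dictionary lookup plays the role of the intersection test in this sketch; the description suggests a cuckoo filter for that purpose when the data sets are large.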

The invention can use various PSI protocols. PSI based on the OT protocol enables prediction on large-scale data while protecting privacy; public-key-based PSI suits the case where Alice's data volume is small and Bob's is large. Various machine learning algorithms can also be used, so the method suits different application scenarios.

The practice of the invention is illustrated by the following specific examples.

Example 1:

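Assume that the machine learning algorithm used in the invention is logistic regression. Alice's data features are

$$x_A = (x_0, x_1, \dots, x_{d_A-1})$$

and Bob's data features are

$$x_B = (x_{d_A}, x_{d_A+1}, \dots, x_{d_A+d_B-1})$$

where $d_A$ and $d_B$ are the numbers of features owned by Alice and Bob respectively. The partial models obtained by Alice and Bob through federated learning are respectively

$$\theta_A = (\theta_{-1}, \theta_0, \dots, \theta_{d_A-1})$$

and

$$\theta_B = (\theta_{d_A}, \theta_{d_A+1}, \dots, \theta_{d_A+d_B-1})$$

where $\theta_{-1}$ is the intercept. The prediction function of logistic regression is

$$h_\theta(x) = \frac{1}{1 + e^{-\theta^{T}x}}$$

where

$$\theta^{T}x = \theta_{-1} + \sum_{i=0}^{d_A-1}\theta_i x_i + \sum_{i=d_A}^{d_A+d_B-1}\theta_i x_i$$

For a specific sample, the first part

$$\theta_{-1} + \sum_{i=0}^{d_A-1}\theta_i x_i$$

is owned by Alice, and the second part

$$\sum_{i=d_A}^{d_A+d_B-1}\theta_i x_i$$

is owned by Bob. Alice then predicts according to the following steps: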

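1. For all $id \in ID_B$, Bob computes his part of the linear predictor from his own features $f_{B,id}$ and model parameters,

$$\mathrm{score}_{B,id} = \sum_{i=d_A}^{d_A+d_B-1}\theta_i x_i$$

forming a new data set denoted $\mathrm{Predict}_B = \{(id, \mathrm{score}_{B,id}) \mid id \in ID_B\}$;

2. Alice and Bob execute an improved PSI based on the OT protocol and obtain the new data sets $eID_A = \{\mathrm{eid}_{A,id} \mid id \in ID_A\}$ and $eP_B = \{(\mathrm{eid}_{B,id}, \mathrm{score}_{B,id}) \mid id \in ID_B\}$ respectively;

3. For each $(\mathrm{eid}_{B,id}, \mathrm{score}_{B,id}) \in eP_B$, Bob derives a symmetric encryption key using a key derivation function KDF: $k_{id} = \mathrm{KDF}(\mathrm{eid}_{B,id}, id, iter, klen)$, where $\mathrm{eid}_{B,id}$ serves as the password, $id$ as the salt of the key, $iter$ is the number of iterations, and $klen$ is the key length of the symmetric encryption algorithm;

4. Bob uses $k_{id}$ from step 3 to encrypt $\mathrm{score}_{B,id}$, obtaining $c_{id} = \mathrm{Enc}(k_{id}, \mathrm{score}_{B,id})$, where $\mathrm{Enc}(\cdot,\cdot)$ is the encryption algorithm of a symmetric cipher, and obtains a new data set $\mathrm{EPredict}_B = \{(H(\mathrm{eid}_{B,id}), c_{id}) \mid id \in ID_B\}$, where $H(\cdot)$ is a hash function, which he sends to Alice;

5. From the data set $\mathrm{EPredict}_B$ sent by Bob in step 4, Alice computes the intersection:

$$\mathrm{EPredict}_{A\cap B} = \{(\mathrm{eid}_{A,id}, c_{id}) \mid H(\mathrm{eid}_{A,id}) = H(\mathrm{eid}_{B,id}),\ \mathrm{eid}_{A,id} \in eID_A,\ (H(\mathrm{eid}_{B,id}), c_{id}) \in \mathrm{EPredict}_B\}$$

A cuckoo filter is used to improve the efficiency of the intersection computation in this step;

6. For all $(\mathrm{eid}_{A,id}, c_{id}) \in \mathrm{EPredict}_{A\cap B}$, Alice obtains the corresponding $id_{A\cap B} \in ID_A \cap ID_B$, computes the symmetric key with the key derivation function, $k_{id,A\cap B} = \mathrm{KDF}(\mathrm{eid}_{A,id}, id_{A\cap B}, iter, klen)$, and decrypts $c_{id}$ to obtain the score for the intersection, $\mathrm{score}_{B,id} = \mathrm{Dec}(k_{id,A\cap B}, c_{id})$, where $\mathrm{Dec}(\cdot,\cdot)$ is the decryption algorithm of the symmetric cipher, giving the data intersection $\mathrm{Predict}_{A\cap B} = \{(id, \mathrm{score}_{B,id}) \mid id \in ID_A \cap ID_B\}$;

7. From the result $\mathrm{Predict}_{A\cap B}$ obtained in step 6, Alice computes the prediction result: for $(id, \mathrm{score}_{B,id}) \in \mathrm{Predict}_{A\cap B}$, she computes her part of the linear predictor, including the intercept $\theta_{-1}$,

$$\mathrm{score}_{A,id} = \theta_{-1} + \sum_{i=0}^{d_A-1}\theta_i x_i$$

and merges the two parts $\mathrm{score}_{A,id}$ and $\mathrm{score}_{B,id}$ to obtain the final prediction result

$$\mathrm{score}_{id} = \frac{1}{1 + e^{-(\mathrm{score}_{A,id} + \mathrm{score}_{B,id})}}$$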
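The merge in step 7 works because the linear predictor of logistic regression is a sum over features, so it can be split between the two parties and added back together before the sigmoid is applied. A minimal sketch, assuming the intercept is held by Alice as in this example (the function names `partial_score` and `merge` are illustrative, not taken from the patent):

```python
import math

def partial_score(theta, features, intercept=0.0):
    """One party's share of the linear predictor theta^T x."""
    return intercept + sum(t * x for t, x in zip(theta, features))

def merge(score_a, score_b):
    """Merge for logistic regression: sigmoid of the summed partial scores."""
    return 1.0 / (1.0 + math.exp(-(score_a + score_b)))

if __name__ == "__main__":
    # Alice holds the intercept and the first d_A = 2 features.
    score_a = partial_score([0.5, -1.0], [2.0, 1.0], intercept=0.1)
    # Bob holds the remaining d_B = 1 feature.
    score_b = partial_score([0.25], [4.0])
    print(merge(score_a, score_b))
```

Neither party needs the other's features or parameters: Bob contributes only the scalar score for each id, which is the value protected by encryption in steps 3 to 6.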

Example 2:

Logistic regression is again used for prediction, with the same notation as in Example 1; the only difference is that this example uses public-key-based PSI. Some additional symbols are needed. This embodiment is assumed to be based on an elliptic curve: $q$ is the order of the elliptic curve, $\mathbb{Z}_q^{*}$ denotes the set $\{1, 2, \dots, q-1\}$, and $H_1$ and $H_2$ are hash functions, where $H_1$ maps a message to a point on the elliptic curve. The specific steps are as follows:

1. Precomputation: for all $id \in ID_B$, Bob randomly selects $a \in \mathbb{Z}_q^{*}$, computes $s_{id} = H_2(a \cdot H_1(id))$, and sends all $s_{id}$, $id \in ID_B$, to Alice;

2. For all $id \in ID_A$, Alice randomly selects $b \in \mathbb{Z}_q^{*}$, computes $t_{id} = b \cdot H_1(id)$, and sends $t_{id}$, $id \in ID_A$, to Bob;

3. Bob performs the following steps:

3.1. For all $id \in ID_B$, based on Alice's needs, Bob computes $\mathrm{score}_{B,id}$, his part of the linear predictor as in Example 1;

3.2. For all $id \in ID_B$, Bob computes the key with the key derivation function, $k_{id} = \mathrm{KDF}(a \cdot H_1(id), id, iter, klen)$, and encrypts $\mathrm{score}_{B,id}$: $c_{id} = \mathrm{Enc}(k_{id}, \mathrm{score}_{B,id})$;

3.3. For each $t_{id}$ sent by Alice, Bob computes $t'_{id} = a \cdot t_{id} = a \cdot b \cdot H_1(id)$;

3.4. Bob sends $c_{id}$ and $t'_{id}$ to Alice, guaranteeing the one-to-one correspondence of $c_{id}$ and $t_{id}$;

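4. Alice performs the following steps:

4.1. For all $id \in ID_A$, Alice computes $t''_{id} = b^{-1} \cdot t'_{id} = a \cdot H_1(id)$ and $s'_{id} = H_2(t''_{id})$;

4.2. Alice computes $S = \{s'_{id}, id \in ID_A\} \cap \{s_{id}, id \in ID_B\}$;

4.3. For each $s'_{id} \in S$, Alice computes $k_{id} = \mathrm{KDF}(t''_{id}, id, iter, klen)$ with the key derivation function and decrypts $c_{id}$ to obtain $\mathrm{score}_{B,id} = \mathrm{Dec}(k_{id}, c_{id})$;

4.4. Alice finds the ids corresponding to the ciphertexts in the intersection and obtains $\mathrm{Predict}_{A\cap B} = \{(id, \mathrm{score}_{B,id}) \mid id \in ID_A \cap ID_B\}$;

5. From the result $\mathrm{Predict}_{A\cap B}$ obtained in step 4.4, Alice computes the prediction result: for $id \in ID_A \cap ID_B$, i.e. $(id, \mathrm{score}_{B,id}) \in \mathrm{Predict}_{A\cap B}$, she computes $\mathrm{score}_{A,id}$, her part of the linear predictor including the intercept, as in Example 1, and merges the two parts $\mathrm{score}_{A,id}$ and $\mathrm{score}_{B,id}$ to obtain the final prediction result

$$\mathrm{score}_{id} = \frac{1}{1 + e^{-(\mathrm{score}_{A,id} + \mathrm{score}_{B,id})}}$$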

This embodiment is particularly suitable when the data volume of the prediction demand party is small and that of the other party is large, and it guarantees that sensitive data such as $\mathrm{score}_{B,id}$ is not revealed. In addition, a cuckoo filter can be adopted for the hash function $H_2$ above to improve the efficiency of the intersection computation in step 4.2.
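The blinding and unblinding in steps 1, 2, 3.3, 4.1, and 4.2 can be checked with a small sketch. It is a toy under loudly stated assumptions: instead of an elliptic curve it uses the multiplicative group modulo the Mersenne prime $2^{127}-1$, so scalar multiplication $a \cdot H_1(id)$ becomes modular exponentiation $H_1(id)^a$, and Alice's scalar is drawn coprime to the group order so that $b^{-1}$ exists. This group is chosen only for readability and is not a secure parameter choice; all function names are illustrative. It requires Python 3.8 or later for `pow(b, -1, m)`.

```python
import hashlib
import math
import secrets

P = 2**127 - 1   # toy modulus (a Mersenne prime), NOT a secure choice
ORDER = P - 1    # order of the multiplicative group mod P

def h1(id_: str) -> int:
    """Stand-in for H1: hash an id to a nonzero group element."""
    return int.from_bytes(hashlib.sha256(id_.encode()).digest(), "big") % P or 1

def h2(elem: int) -> bytes:
    """Stand-in for H2: hash a group element to a byte string."""
    return hashlib.sha256(elem.to_bytes(16, "big")).digest()

def random_scalar() -> int:
    """Random exponent that is invertible modulo the group order."""
    while True:
        s = secrets.randbelow(ORDER - 1) + 1
        if math.gcd(s, ORDER) == 1:
            return s

def intersect_ids(ids_a, ids_b):
    a = random_scalar()  # Bob's secret
    b = random_scalar()  # Alice's secret
    # Step 1 (Bob): s_id = H2(a * H1(id)) for his ids, sent to Alice.
    s_set = {h2(pow(h1(i), a, P)) for i in ids_b}
    # Step 2 (Alice): t_id = b * H1(id) for her ids, sent to Bob.
    t = {i: pow(h1(i), b, P) for i in ids_a}
    # Step 3.3 (Bob): t'_id = a * t_id; Bob sees only blinded values.
    t_prime = {i: pow(v, a, P) for i, v in t.items()}
    # Step 4.1 (Alice): t''_id = b^-1 * t'_id = a * H1(id).
    b_inv = pow(b, -1, ORDER)
    t_dd = {i: pow(v, b_inv, P) for i, v in t_prime.items()}
    # Step 4.2 (Alice): keep ids whose H2(t''_id) appears in Bob's set.
    return {i for i, v in t_dd.items() if h2(v) in s_set}

if __name__ == "__main__":
    print(intersect_ids(["u2", "u3", "u9"], ["u1", "u2", "u3"]))
```

In the full protocol the unblinded value $t''_{id}$ also serves as the KDF password in step 4.3, which is why Alice can derive the decryption key only for ids she shares with Bob.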

In a specific implementation, an intermediate service provider can be introduced to extend the application scenarios. For example, in Example 2 the result computed in step 1 may be sent to the intermediate service provider, which then performs the intersection computation of step 4.2; the intermediate service provider cannot obtain any useful information.

Claims

1. A method for protecting privacy in the federated learning prediction stage based on PSI technology, characterized in that, in the prediction stage of federated learning, the parties exchange data using private set intersection (PSI) technology, the exchanged information is encrypted during the exchange, and the privacy of the federated learning participants is protected; the federated learning is vertical federated learning, its participants comprise a prediction demand party Alice, a prediction service party Bob, and a trusted third party Carol, and with the help of Carol, Alice and Bob obtain through federated learning their respective partial models $\mathrm{Model}_A$ and $\mathrm{Model}_B$ of the federated model; in the prediction stage, Alice and Bob use $\mathrm{Model}_A$ and $\mathrm{Model}_B$ respectively to compute prediction results and then perform merged prediction: in the merged prediction, the privacy of Alice and Bob is protected based on PSI, wherein Bob encrypts the prediction results he sends, and Alice, after completing the set intersection with Bob based on PSI, decrypts the prediction results corresponding to the intersection to obtain the required prediction data, so that Alice does not reveal her own private information while obtaining Bob's prediction results, and Bob does not reveal his own private information when providing the prediction data;

in the federated learning model, a data sample $(id, f)$ consists of a keyword $id$ and a feature value $f$, the set of ids is denoted $ID$ and the set of features $F$; the data sets that Alice and Bob use for prediction are denoted $\mathrm{Data}_A$ and $\mathrm{Data}_B$, the corresponding keyword sets $ID_A$ and $ID_B$, and the corresponding feature sets $F_A$ and $F_B$; the keyword sets of $\mathrm{Data}_A$ and $\mathrm{Data}_B$ intersect, while their feature values differ; the samples in $\mathrm{Data}_A$ and $\mathrm{Data}_B$ are $(id, f_{A,id})$ and $(id, f_{B,id})$ respectively, where $f_{A,id}$ and $f_{B,id}$ denote the feature values that Alice and Bob hold for keyword $id$; Alice needs to compute, with Bob's cooperation, the prediction results for $id \in ID_A$, and when Alice sends a prediction request, Alice and Bob execute the following steps:

S1) Bob applies $\mathrm{Model}_B$ to his own data set $\mathrm{Data}_B$ and computes $\mathrm{score}_{B,id} = \mathrm{Model}_B(f_{B,id})$ for all $id \in ID_B$, forming a new data set denoted $\mathrm{Predict}_B = \{(id, \mathrm{score}_{B,id}) \mid id \in ID_B\}$;

S2) Alice and Bob execute the PSI protocol; Bob's data set $\mathrm{Predict}_B$ is transformed into $eP_B = \{(\mathrm{eid}_{B,id}, \mathrm{score}_{B,id}) \mid id \in ID_B\}$, where the same $id$ corresponds to one or more $\mathrm{eid}_{B,id}$, and Alice's keyword set $ID_A$ is transformed into $eID_A = \{\mathrm{eid}_{A,id} \mid id \in ID_A\}$, where the same $id$ corresponds to only one $\mathrm{eid}_{A,id}$;

S3) for each $(\mathrm{eid}_{B,id}, \mathrm{score}_{B,id}) \in eP_B$, Bob derives a symmetric encryption key using a key derivation function KDF, $k_{id} = \mathrm{KDF}(\mathrm{eid}_{B,id}, id, iter, klen)$, where $\mathrm{eid}_{B,id}$ serves as the password, $id$ as the salt of the key, $iter$ is the number of iterations, and $klen$ is the key length of the symmetric encryption algorithm;

S4) Bob uses $k_{id}$ from step S3) to encrypt $\mathrm{score}_{B,id}$, obtaining $c_{id} = \mathrm{Enc}(k_{id}, \mathrm{score}_{B,id})$, where $\mathrm{Enc}(\cdot,\cdot)$ is the encryption algorithm of a symmetric cipher, and obtains a new data set $\mathrm{EPredict}_B = \{(\mathrm{eid}_{B,id}, c_{id}) \mid id \in ID_B\}$, which he sends to Alice;

S5) from the data set $\mathrm{EPredict}_B$ sent by Bob in step S4), Alice computes the intersection according to the PSI protocol:

$$\mathrm{EPredict}_{A\cap B} = \{(\mathrm{eid}_{A,id}, c_{id}) \mid \mathrm{eid}_{A,id} = \mathrm{eid}_{B,id},\ \mathrm{eid}_{A,id} \in eID_A,\ (\mathrm{eid}_{B,id}, c_{id}) \in \mathrm{EPredict}_B\};$$

S6) for all $(\mathrm{eid}_{A,id}, c_{id}) \in \mathrm{EPredict}_{A\cap B}$, Alice obtains the corresponding $id_{A\cap B} \in ID_A \cap ID_B$, computes the symmetric key with the key derivation function, $k_{id,A\cap B} = \mathrm{KDF}(\mathrm{eid}_{A,id}, id_{A\cap B}, iter, klen)$, and decrypts $c_{id}$ to obtain the score for the intersection, $\mathrm{score}_{B,id} = \mathrm{Dec}(k_{id,A\cap B}, c_{id})$, where $\mathrm{Dec}(\cdot,\cdot)$ is the decryption algorithm of the symmetric cipher, giving the data intersection $\mathrm{Predict}_{A\cap B} = \{(id, \mathrm{score}_{B,id}) \mid id \in ID_A \cap ID_B\}$;

S7) from the result $\mathrm{Predict}_{A\cap B}$ obtained in step S6), Alice computes the merged prediction result: for $id \in ID_A \cap ID_B$, she computes $\mathrm{score}_{A,id} = \mathrm{Model}_A(f_{A,id})$ with $\mathrm{Model}_A$ and merges $\mathrm{score}_{A,id}$ with the $\mathrm{score}_{B,id}$ in $\mathrm{Predict}_{A\cap B}$ to obtain the final prediction result $\mathrm{score}_{id} = \mathrm{Merge}(\mathrm{score}_{A,id}, \mathrm{score}_{B,id})$, where $\mathrm{Merge}(\cdot,\cdot)$ is determined by the machine learning algorithm chosen for federated learning.

2. The method for protecting privacy in the federated learning prediction stage based on PSI technology according to claim 1, characterized in that in step S4) a hash operation is also applied to $\mathrm{eid}_{B,id}$, giving $\mathrm{EPredict}_B = \{(H(\mathrm{eid}_{B,id}), c_{id}) \mid id \in ID_B\}$, where $H(\cdot)$ denotes a hash function, and in the corresponding step S5) Alice computes the intersection as:

$$\mathrm{EPredict}_{A\cap B} = \{(\mathrm{eid}_{A,id}, c_{id}) \mid H(\mathrm{eid}_{A,id}) = H(\mathrm{eid}_{B,id}),\ \mathrm{eid}_{A,id} \in eID_A,\ (H(\mathrm{eid}_{B,id}), c_{id}) \in \mathrm{EPredict}_B\}.$$

3. The method for protecting privacy in the federated learning prediction stage based on PSI technology according to claim 1, characterized in that in step S5) a cuckoo filter is used to improve the efficiency of the intersection computation.

4. The method for protecting privacy in the federated learning prediction stage based on PSI technology according to any one of claims 1 to 3, characterized in that the PSI is PSI based on the OT protocol.

5. The method for protecting privacy in the federated learning prediction stage based on PSI technology according to any one of claims 1 to 3, characterized in that the PSI is public-key-based PSI.