CN112822005B - Secure transfer learning system based on homomorphic encryption - Google Patents


Info

Publication number
CN112822005B
CN112822005B (application CN202110134461.8A)
Authority
CN
China
Prior art keywords
encrypted
sample
algorithm
encryption
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110134461.8A
Other languages
Chinese (zh)
Other versions
CN112822005A (en)
Inventor
杨旸
黄欣迪
池升恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University filed Critical Fuzhou University
Priority to CN202110134461.8A priority Critical patent/CN112822005B/en
Publication of CN112822005A publication Critical patent/CN112822005A/en
Application granted granted Critical
Publication of CN112822005B publication Critical patent/CN112822005B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 9/00 - Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L 9/008 - Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols involving homomorphic encryption
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60 - Protecting data
    • G06F 21/602 - Providing cryptographic facilities or services
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60 - Protecting data
    • G06F 21/62 - Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F 21/6218 - Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F 21/6245 - Protecting personal data, e.g. for financial or medical purposes
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 9/00 - Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L 9/08 - Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L 9/0816 - Key establishment, i.e. cryptographic processes or cryptographic protocols whereby a shared secret becomes available to two or more parties, for subsequent use
    • H04L 9/0838 - Key agreement, i.e. key establishment technique in which a shared key is derived by parties as a function of information contributed by, or associated with, each of these
    • H04L 9/0841 - Key agreement, i.e. key establishment technique in which a shared key is derived by parties as a function of information contributed by, or associated with, each of these involving Diffie-Hellman or related key agreement protocols

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Bioethics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computer Hardware Design (AREA)
  • Artificial Intelligence (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a secure transfer learning system based on homomorphic encryption. Around the privacy-disclosure problem of transfer machine learning in a cloud outsourcing scenario, the system designs encrypted TrAdaboost training and prediction algorithms based on a dual-cloud-server model (a storage cloud server and a computing cloud server). On the one hand, the source-domain data owner and the target-domain data owner of the system each upload encrypted training data to the cloud, and the cloud servers train a TrAdaboost model in a privacy-preserving manner; on the other hand, a requesting user of the system sends an encrypted data sample to the cloud server to request a secure prediction service, and the cloud server then returns an encrypted prediction classification result. The system does not leak the training and prediction request data, the trained model, the prediction results or the intermediate calculation results of users (including data owners and prediction requesters) to the cloud or to unauthorized users.

Description

Secure transfer learning system based on homomorphic encryption
Technical Field
The invention relates to a secure transfer learning system based on homomorphic encryption.
Background
Cloud computing delivers computing services (including data storage, computation, software and data analysis) to individuals and companies, providing great convenience to users and reducing software construction and operating costs. Thanks to its strong computing power and storage capacity, high reliability and on-demand service, cloud computing is widely applied in practice to big data analysis, data backup, software development and testing, management, and the like. Meanwhile, the rapid development of machine learning (including transfer learning) benefits from the support of cloud computing. Researchers and users outsource complex intelligent computing tasks to cloud servers, which perform the computation efficiently and return the results, so that users can complete machine-learning computation efficiently and stably on resource-limited personal computers or mobile devices.
In recent years, machine learning has been applied to many fields in real life, such as face and voice recognition, medical diagnosis, financial analysis, and smart home. Transfer learning incorporates the idea of "learning by analogy" into machine learning and has gained more and more attention from researchers. It aims to transfer knowledge learned on a source task to a new problem (the target task) to assist the learning of the target task. In general, transfer learning is effective when the source domain (or source task) and the target domain (or target task) are different but related. For example, a transfer learning algorithm can reuse the shallow network parameters of a high-performance pre-trained model (e.g., an AlexNet or ResNet model) and adapt the deeper layers to a user-customized classification task through training. Consider another real-world scenario: a medical research institution has collected a large amount of labeled medical data, while a clinic has only a small amount of labeled medical data and some unlabeled data from its patients, and the medical data of the two institutions are related. In this case, if a classification model is trained only on the clinic's medical data, its accuracy is likely to be low. If transfer learning is used to train on the combined medical data of the two institutions (that is, the data of the medical research institution assists the model training of the clinic), the quality of the classifier can be improved considerably. TrAdaboost, proposed in 2007, is a classical instance-based transfer learning algorithm that can be used to solve such problems. The TrAdaboost algorithm borrows the core design idea of Adaboost (an ensemble learning algorithm) and combines the large-scale labeled data of the source domain with a small amount of labeled data of the target domain to train a well-performing classifier for the target task. Specifically, the algorithm optimizes the sub-classifiers round by round by adjusting the weight values of the training samples, and the final prediction result is determined by a weighted combination of the sub-classifiers.
However, delegating the transfer learning task to a cloud server for computation introduces a serious challenge: the risk of privacy disclosure. The data and results involved in the computation may contain sensitive user information, such as personal images, financial information, and health conditions. In addition, transfer learning models are often viewed as important assets of their developers and may cause significant damage if exposed. Meanwhile, the cloud server is generally not trusted, and a system attacker may eavesdrop on the data transmission channel or attack the cloud server. It is therefore necessary to apply effective privacy protection measures to the outsourced computation task. To solve such problems, several mainstream security technologies have been used to ensure the security and privacy of cloud computing, including Secret Sharing (SS), Garbled Circuits (GCs), Differential Privacy (DP), hardware security, and Homomorphic Encryption (HE). Homomorphic encryption is characterized by the computability of ciphertexts (that is, a computation on ciphertexts decrypts to the result of the corresponding computation on the plaintexts) and provides an excellent solution for privacy-preserving cloud computing. Compared with other security technologies, homomorphic encryption can achieve a higher security level. However, homomorphic encryption has high overhead, supports only non-negative integer arithmetic, and does not naturally support nonlinear operations (such as logarithms). How to design an effective homomorphic encryption protocol that simultaneously satisfies the privacy, correctness and efficiency requirements of transfer learning in cloud computing is still an open research problem.
With the help of different security technologies, privacy protection schemes for transfer learning already exist. Some researchers have implemented Federated Transfer Learning (FTL) computation while protecting the privacy of sensitive information. Liu et al. achieved secure FTL training, prediction and cross-validation via Additive Homomorphic Encryption (AHE). In that scheme, the data samples are kept separately by the two parties (i.e., the source-domain data owner and the target-domain data owner) and parameters are encrypted before data is exchanged. Sharma et al. implemented a similar secure FTL training process via SS techniques. Both schemes can resist two threat levels, the semi-honest and the malicious model. In addition, Gao et al. proposed privacy protection schemes for joint neural-network transfer learning (including the training and prediction processes) based on AHE and SS, respectively, and, using the same techniques, designed a heterogeneous FTL system based on a secure logistic regression model. However, the above schemes involve multiple rounds of interaction between users and consume a large amount of local computing resources. The first FTL framework for wearable medical scenarios (using AHE) was proposed by Chen et al., but no detailed security design is given in that scheme. Since FTL supports only unidirectional transfer, the source-domain participants cannot benefit from the computation. To address this problem, Ma et al. devised a secure collaborative transfer learning scheme based on the SPDZ framework, in which the correctness of the results can be verified with a Message Authentication Code (MAC).
Liu et al. proposed a privacy-preserving multi-task learning scheme that uses AHE to encrypt the users' model parameters and performs the encrypted computation through cloud services. However, in that scheme all users encrypt their data with the same public key, so as soon as one user is compromised, the privacy of all users is endangered. Unlike the encryption technique adopted by Liu et al., Xie et al. and Zhang et al. designed privacy-preserving multi-task learning systems using differential privacy. Differential privacy has also been used in other secure transfer learning schemes, such as the privacy-preserving domain adaptation scheme proposed by Wang et al. and the privacy-preserving hypothesis transfer learning scheme proposed by Yao et al. Wang et al. designed a secure domain adaptation scheme based on adversarial learning, which adds Gaussian noise to the gradients to protect the confidentiality of private data. While schemes based on differential privacy have significantly lower overhead than encryption-based schemes, the accuracy of their results is impaired to varying degrees. To implement multi-party knowledge transfer, Ma et al. introduced a privacy-preserving cloud outsourcing framework for decision tree learning and prediction, whose core idea is to compute a similarity measure between decision trees over the encrypted domain.
Disclosure of Invention
The invention aims to provide a homomorphic encryption-based secure transfer learning system that addresses problems of existing secure transfer learning schemes, such as residual security risks and inefficient ciphertext computing protocols, so as to realize privacy-preserving TrAdaboost training and prediction while reducing the local overhead of users as much as possible.
In order to achieve the purpose, the technical scheme of the invention is as follows: a secure migration learning system based on homomorphic encryption, comprising: the system comprises a key generation center KGC, a cloud platform CP, a cloud service provider CSP, a source data owner SDO, a target data owner TDO and a request user RUs;
a key generation center KGC responsible for initializing cryptographic system parameters and distributing public/private key pairs of system entities;
a cloud platform CP, which is responsible for receiving and storing training data from SDO and TDO and prediction request data from RUs, and performing partial computation of the system;
the CSP interacts with the CP and provides computing service for a transfer learning algorithm for protecting privacy; in addition, the CP and CSP jointly perform decryption and re-encryption operations;
a source data owner SDO, the SDO owning the tagged sample instance from the source domain, sending the encrypted data set to the CP as a source training data set of the system;
target data owner TDO, TDO having tagged sample instances and untagged sample instances from target domains, the source and target domains involved in the system are distributed differently but have correlation; the TDO sends the encrypted data set to a CP to serve as a target training data set of the system, and the CP combines the encrypted source training data set and the target training data set to serve as a joint training data set;
and requesting users RUs: after the CP and CSP have finished building the TrAdaboost classifier for the target space, a requesting user sends an encrypted unlabeled sample from the target domain to the CP and requests the corresponding prediction computation; the encrypted prediction result returned by the CP can only be decrypted by the corresponding requesting user.
In an embodiment of the present invention, the key generation center KGC initializes the cryptosystem parameters and distributes the public/private key pairs of the system entities as follows:
(1) KGC generates the system parameters N and g of the Paillier-based homomorphic re-encryption system HRES, and generates a public/private key pair for every entity in the system. Specifically, KGC distributes the key pairs (pk_sdo, sk_sdo), (pk_tdo, sk_tdo), (pk_cp, sk_cp) and (pk_csp, sk_csp) to SDO, TDO, CP and CSP, respectively. In addition, KGC sets up a key repository to store the requesting users' public/private key pairs {(pk_ru_1, sk_ru_1), ..., (pk_ru_n_user, sk_ru_n_user)}, where n_user denotes the total number of system users, and distributes an unused key pair to each registered user; the key repository is updated by KGC in idle time or when needed;
(2) CP and CSP exchange their respective public keys and negotiate the Diffie-Hellman key PK = pk_csp^(sk_cp) = pk_cp^(sk_csp) = g^(sk_cp·sk_csp) mod N^2. Subsequently, PK is published to SDO, TDO and the subsequent requesting users RUs as the global public key of the system. By the characteristics of HRES, messages encrypted under PK can only be decrypted jointly by CP and CSP to recover the plaintext.
In an embodiment of the present invention, the Paillier-algorithm-based homomorphic re-encryption system HRES includes the following algorithms:
Key generation KeyGen algorithm: a security parameter k and two large primes p and q are given such that |p| = |q| = k, where |·| denotes bit length; then N = p·q is calculated and a group generator g is selected whose order is ord(g) = (p-1)·(q-1)/2. The public/private key pair of user i is (pk_i = g^(s_i) mod N^2, sk_i = s_i), where s_i ∈_R [1, λ(N^2)] and λ(·) denotes the Euler function. Furthermore, assume that two entities A and B possess the public/private key pairs (pk_A = g^a mod N^2, sk_A = a) and (pk_B = g^b mod N^2, sk_B = b), respectively; the public key obtained after the two carry out Diffie-Hellman negotiation is PK = pk_B^a = pk_A^b = g^(a·b) mod N^2, and the corresponding joint decryption private keys are a and b, respectively. PK is taken as the global public key of the system; the parameters g and N are published;
Encryption Enc algorithm: taking a message m ∈ Z_N and the public key pk_i as input, randomly select an integer r and compute the ciphertext [m]_(pk_i) = (T, T') = (pk_i^r·(1+m·N) mod N^2, g^r mod N^2), where T and T' are respectively the first and second elements of the ciphertext;
Decryption Dec algorithm: using the private key sk_i = s_i, the ciphertext [m]_(pk_i) = (T, T') is decrypted as m = L(T/(T')^(s_i) mod N^2), where L(u) = (u-1)/N;
Double-key encryption EncTK algorithm: the system global public key PK is selected to encrypt the message. Similar to the Enc algorithm, given a plaintext message m ∈ Z_N, the ciphertext is [[m]]_PK = (T, T') = (PK^r·(1+m·N) mod N^2, g^r mod N^2). To simplify the expression, [[m]]_PK is uniformly and briefly expressed as [[m]];
Partial decryption PDec1 algorithm using sk_A: input [[m]] = (T, T') and sk_A = a; the first stage of partial decryption computes the partially decrypted ciphertext (T^(1), T'^(1)) = (T, (T')^a mod N^2);
Partial decryption PDec2 algorithm using sk_B: input the partially decrypted ciphertext (T^(1), T'^(1)) and sk_B = b; the second stage of partial decryption computes T'^(2) = (T'^(1))^b mod N^2 and thereby obtains the plaintext message m = L(T^(1)/T'^(2) mod N^2);
Re-encryption first-stage FPRE algorithm: given the ciphertext [[m]], the private key sk_A and a user public key pk_j, the first-stage re-encryption computation is executed and outputs the partially re-encrypted ciphertext [m]^+;
Re-encryption second-stage SPRE algorithm: given the partially re-encrypted ciphertext [m]^+, the private key sk_B and the user public key pk_j, the second-stage re-encryption computation is performed to obtain the ciphertext [m]_(pk_j) of the plaintext m under the public key pk_j. The ciphertext [m]_(pk_j) can only be decrypted by user j using the private key sk_j and executing the Dec algorithm.
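The following is a minimal Python sketch of the KeyGen/EncTK/PDec1/PDec2 primitives as reconstructed above, intended only to illustrate how the two-stage decryption fits together. Parameter generation (suitable primes p, q and a generator g of the required order) is assumed to happen elsewhere, the randomness range for r is an assumption, and no message encoding or range checks are included.

```python
# Illustrative sketch of the HRES primitives described above (not a secure or
# complete implementation; parameters and helper names are assumptions).
import secrets

def L_func(u, N):                    # Paillier L-function: L(u) = (u - 1) / N
    return (u - 1) // N

def keygen(g, N, order):
    s = secrets.randbelow(order - 1) + 1
    return pow(g, s, N * N), s       # (pk_i = g^s_i mod N^2, sk_i = s_i)

def joint_pk(pk_other, my_sk, N):    # Diffie-Hellman: PK = g^(a*b) mod N^2
    return pow(pk_other, my_sk, N * N)

def enc_tk(m, PK, g, N):             # EncTK: [[m]] = (PK^r*(1+m*N), g^r) mod N^2
    r = secrets.randbelow(N // 4) + 1
    T = (pow(PK, r, N * N) * (1 + m * N)) % (N * N)
    return T, pow(g, r, N * N)

def pdec1(ct, a, N):                 # PDec1 with sk_A = a: raise T' to a
    T, Tp = ct
    return T, pow(Tp, a, N * N)

def pdec2(partial, b, N):            # PDec2 with sk_B = b: finish and recover m
    T1, Tp1 = partial
    Tp2 = pow(Tp1, b, N * N)
    return L_func(T1 * pow(Tp2, -1, N * N) % (N * N), N)
```

Under these assumptions, with CP holding a and CSP holding b and PK derived from both as above, pdec2(pdec1(enc_tk(m, PK, g, N), a, N), b, N) returns m.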
In an embodiment of the present invention, the characteristics of the Paillier-algorithm-based homomorphic re-encryption system HRES are as follows:
(1) Additive homomorphism: given m_1, m_2 ∈ Z_N, [[m_1]]·[[m_2]] = [[m_1 + m_2]];
(2) given [[m]] and a constant k ∈ Z_N, [[m]]^k = [[k·m]];
(3) [[m]]^(N-1) = [[N - m]] = [[-m]].
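A short sketch of how these three characteristics are used in the ciphertext computations below; the helper names (add, scal, neg) are illustrative and operate on the (T, T') pairs produced by an EncTK-style encryption such as the enc_tk sketch above.

```python
def add(ct1, ct2, N):     # characteristic (1): [[m1]] * [[m2]] = [[m1 + m2]]
    return (ct1[0] * ct2[0]) % (N * N), (ct1[1] * ct2[1]) % (N * N)

def scal(ct, k, N):       # characteristic (2): [[m]]^k = [[k * m]]
    return pow(ct[0], k, N * N), pow(ct[1], k, N * N)

def neg(ct, N):           # characteristic (3): [[m]]^(N-1) = [[-m]] = [[N - m]]
    return scal(ct, N - 1, N)
```

For example, combining neg([[ε_t]]) with an encryption of 1 yields [[1 - ε_t]], which is exactly how expressions such as [[1]]·[[ε_t]]^(N-1) are used in the training algorithm below.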
In an embodiment of the present invention, the training process of the TrAdaboost classifier is as follows:
The training of TrAdaboost is an R-round iterative process. First, preparation and preprocessing are carried out; then each round of TrAdaboost iterative training consists of four sub-blocks, namely sample weight vector normalization, prediction error calculation, weight adjustment rate calculation and sample weight update, which are specified as follows:
S1, algorithm preparation and preprocessing
First, SDO and TDO submit their respective encrypted data sets to the CP. Assume that SDO owns the source training data set D_S = {(x_1, y_1), ..., (x_n, y_n)} and TDO owns the target training data set D_T = {(x_(n+1), y_(n+1)), ..., (x_(n+m), y_(n+m))}. At SDO and TDO, each value contained in the feature vectors of the data set and the corresponding label value are first multiplied by the scaling factor L, that is, x_ij ← L·x_ij and y_i ← L·y_i, where 1 ≤ i ≤ n+m, 1 ≤ j ≤ d, and d denotes the dimension of the feature vector. After encryption with the system global public key PK, SDO and TDO send their respective encrypted data sets to the CP, i.e., [[D_S]]_PK = {([[x_1]]_PK, [[y_1]]_PK), ..., ([[x_n]]_PK, [[y_n]]_PK)} and [[D_T]]_PK = {([[x_(n+1)]]_PK, [[y_(n+1)]]_PK), ..., ([[x_(n+m)]]_PK, [[y_(n+m)]]_PK)}, where [[x_i]]_PK = ([[x_i1]]_PK, [[x_i2]]_PK, ..., [[x_id]]_PK) for 1 ≤ i ≤ n+m. To simplify the representation, [[·]]_PK is expressed as [[·]]. In addition, the sizes of the source and target training data sets, namely n and m, are also sent to the CP for storage. Upon receiving [[D_S]] and [[D_T]], the CP merges them into the joint training data set [[D]] = {([[x_1, y_1]]), ..., ([[x_(n+m), y_(n+m)]])}.
Subsequently, the CP initializes the sample weights of the joint training data set. The initial weight values w_i^1 are determined by the sizes of the source and target training data sets: the CP sets the initial weight of each source sample to 1/n and the initial weight of each target sample to 1/m. The encrypted sample weight vector [[w^1]] = ([[w_1^1]], ..., [[w_(n+m)^1]]) is then computed.
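A small sketch of the S1 step on the data-owner side under the conventions above: fixed-point scaling by L followed by element-wise encryption under the global public key PK. It reuses enc_tk from the earlier HRES sketch; the concrete value of L is only the example used later in the specification.

```python
SCALE_L = 10 ** 5                              # example scaling factor from the text

def encode(value, N):
    return int(round(value * SCALE_L)) % N     # negatives wrap into (N/2, N)

def encrypt_dataset(X, y, PK, g, N):
    enc_X = [[enc_tk(encode(v, N), PK, g, N) for v in row] for row in X]
    enc_y = [enc_tk(encode(label, N), PK, g, N) for label in y]
    return list(zip(enc_X, enc_y))             # [[D_S]] or [[D_T]] as sent to the CP
```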
S2, sample weight vector normalization
In the t-th iteration of training, the encrypted normalized sample weight vector [[p^t]] is obtained. First, the CP calculates the sum of all weight values over the ciphertext domain: [[sum^t]] = ∏_(i=1)^(n+m) [[w_i^t]] = [[∑_(i=1)^(n+m) w_i^t]]. Then, [[p^t]] is obtained by calling the secure division protocol SDiv n+m times: [[p_i^t]] = SDiv([[w_i^t]], [[sum^t]]), 1 ≤ i ≤ n+m.
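A sketch of the S2 step on the CP side: the encrypted weights are summed homomorphically and each weight is then divided by the sum. Here sdiv is a stand-in for the interactive SDiv protocol run with the CSP, and add is the homomorphic addition helper from the HRES characteristics sketch.

```python
def normalize_weights(enc_w, N, sdiv):
    enc_sum = enc_w[0]
    for ct in enc_w[1:]:
        enc_sum = add(ct, enc_sum, N)              # [[sum_t]] = prod_i [[w_i^t]]
    return [sdiv(ct, enc_sum) for ct in enc_w]     # [[p_i^t]] = SDiv([[w_i^t]], [[sum_t]])
```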
S3, calculating the prediction error
Assume the weak classifier [[h_t]] of the t-th round has been trained; its output on an encrypted sample [[x]] is expressed as [[h_t(x)]] and satisfies h_t(x) ∈ {0,1}. The goal of this step is to calculate the weighted prediction error of [[h_t]] on D_T. Since h_t(x_i) and y_i are both in {0,1}, |h_t(x_i) - y_i| equals the XOR of h_t(x_i) and y_i. Therefore, |h_t(x_i) - y_i| is first calculated using the secure XOR protocol SXor; because the sample weight update phase needs the error values of [[h_t]] on both the source and the target samples, the encrypted prediction error is calculated for all samples and the result is stored at the CP: [[|h_t(x_i) - y_i|]] = SXor([[h_t(x_i)]], [[y_i]]), 1 ≤ i ≤ n+m.
Next, the encrypted prediction error rate [[ε_t]] of [[h_t]] on the target samples is calculated through the secure multiplication protocol SMul and ciphertext addition, weighting each error bit [[|h_t(x_i) - y_i|]], n+1 ≤ i ≤ n+m, by the corresponding encrypted normalized sample weight [[p_i^t]].
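A sketch of the S3 step; sxor, smul and sdiv are stand-ins for the interactive SXor, SMul and SDiv protocols between CP and CSP, and add is the helper from the HRES characteristics sketch. The exact normalization used in the patent's error-rate equation is not recoverable from this extraction, so the sketch follows the standard TrAdaboost weighted-error formula on the target samples.

```python
def prediction_error(enc_h, enc_y, enc_p, n, N, sxor, smul, sdiv):
    # per-sample error bits, kept for the S5 weight update
    err = [sxor(enc_h[i], enc_y[i]) for i in range(len(enc_y))]   # [[|h_t(x_i) - y_i|]]
    num = smul(enc_p[n], err[n])                  # weighted errors on target samples
    den = enc_p[n]
    for i in range(n + 1, len(enc_y)):
        num = add(smul(enc_p[i], err[i]), num, N)
        den = add(enc_p[i], den, N)
    return err, sdiv(num, den)                    # ([[err_i]] for all i, [[eps_t]])
```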
S4, calculating the weight adjustment rate
The weight adjustment rate controls the degree to which the sample weights are updated. The adjustment rate β of the source training sample weights is constant in each iteration; therefore, the value of β only needs to be calculated once during the whole TrAdaboost training process and, since the operands involved in its calculation are all public, it can be calculated in the plaintext domain as β = 1/(1 + sqrt(2·ln n / R)), where R is the preset number of training iterations.
Before calculating the sample weight adjustment rate of D_T, i.e. β_t = ε_t/(1-ε_t) = -1 + 1/(1-ε_t), the algorithm treats three special cases separately according to the value of ε_t. First, it is determined by the following calculations whether the conditions ε_t ≥ 1/2, ε_t = 0 or ε_t = 1 are satisfied, where ex_1 = 1 indicates that ε_t ≥ 1/2 (otherwise ε_t < 1/2), ex_2 = 1 indicates that ε_t = 0, and ex_3 = 1 indicates that ε_t = 1:
[[ex_1]] = SGE([[ε_t]], [[1/2]]);
[[ex_2]] = SETest([[ε_t]]);
[[ex_3]] = SETest([[1]]·[[ε_t]]^(N-1)) = SETest([[1-ε_t]])
SGE is the greater-than-or-equal secure comparison protocol, and SETest is the secure ciphertext-plaintext equality test protocol;
Next, the algorithm calculates the value of [[β_t]]. In the regular case, [[1/(1-ε_t)]] is obtained by applying the secure reciprocal protocol SRec to [[1-ε_t]], and [[β_t]] = [[-1 + 1/(1-ε_t)]] then follows by ciphertext addition of [[-1]].
The value of [[1/(1-ε_t)]] depends on ε_t: if ε_t = 1, then 1-ε_t = 0 and the reciprocal is not defined, which is exactly the case flagged by the equality test above; if ε_t ≠ 1, the reciprocal is well defined and is computed with SRec. For this reason, whenever ε_t ≥ 1/2, ε_t = 1 or ε_t = 0, β_t is set directly to a preset constant c_1, c_2 or c_3, respectively. Finally, [[β_t]] is obtained by obliviously combining the regular value with the constants c_1, c_2 and c_3 under the encrypted indicator bits, using, among others,
[[S'_3]] = SMul([[1]]·[[ex_1]]^(N-1), [[1]]·[[ex_2]]^(N-1));
[[S''_3]] = SMul([[S'_3]], [[1]]·[[ex_3]]^(N-1));
so that S''_3 = (1-ex_1)·(1-ex_2)·(1-ex_3) equals 1 only in the regular case. In this way, neither cloud server learns which of the cases actually applies.
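The indicator bits [[ex_1]], [[ex_2]], [[ex_3]] let the servers pick between the regular value of β_t and the preset constants c_1, c_2, c_3 without learning which case holds. The selection primitive this relies on can be sketched as below; this is an assumption about the construction, not the patent's exact formulas. Here smul stands in for the SMul protocol, enc_one for a fresh encryption of 1, and add/neg are the helpers from the HRES characteristics sketch.

```python
def oblivious_select(enc_bit, enc_if_one, enc_if_zero, enc_one, N, smul):
    # [[bit*a + (1-bit)*b]] = SMul([[bit]], [[a]]) * SMul([[1-bit]], [[b]])
    enc_not_bit = add(enc_one, neg(enc_bit, N), N)      # [[1 - bit]]
    return add(smul(enc_bit, enc_if_one),
               smul(enc_not_bit, enc_if_zero), N)
```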
S5, sample weight update
The strategies for updating the weight values of the source data samples and the target data samples are different. In addition, the sample weight w_i^t needs to be updated only when the corresponding sample x_i is misclassified. Since the ciphertexts of the prediction errors [[|h_t(x_i) - y_i|]], 1 ≤ i ≤ n+m, have already been calculated, the algorithm only needs to test over the encrypted domain, by calling the SETest protocol, whether [[|h_t(x_i) - y_i|]] is equal to 0:
[[s]] = SETest([[|h_t(x_i) - y_i|]])
The sample weight vector is then updated under this encrypted flag: the weight of a misclassified source sample (1 ≤ i ≤ n) is scaled by the adjustment rate β, the weight of a misclassified target sample (n+1 ≤ i ≤ n+m) is scaled by 1/β_t, and the weights of correctly classified samples remain unchanged.
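A sketch of the S5 update for one sample, mirroring the plain-domain rule it implements; the ciphertext formulas behind the lost equation images are not reproduced. Here setest and smul stand in for the SETest and SMul protocols, enc_one for an encryption of 1, and enc_factor is [[β]] for a source sample or [[1/β_t]] for a target sample; fixed-point rescaling after the multiplications (SSDown) is omitted for brevity.

```python
def update_weight(enc_w_i, enc_err_i, enc_factor, enc_one, N, smul, setest):
    s = setest(enc_err_i)                   # [[1]] if correctly classified, else [[0]]
    kept = smul(s, enc_w_i)                 # weight kept unchanged
    not_s = add(enc_one, neg(s, N), N)      # [[1 - s]] selects the misclassified case
    scaled = smul(not_s, smul(enc_w_i, enc_factor))
    return add(kept, scaled, N)             # [[w_i^(t+1)]]
```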
in an embodiment of the present invention, a process of implementing encryption prediction by the tradoboost classifier is as follows:
RU when requesting user i When a request is made to obtain a classification label of an unlabeled sample x from a target sample space, registration is first completed on the system and a unique public/private key pair is obtained
Figure GDA00035815989200000715
And a global public key PK of the system; requesting user RU to prevent leakage of its privacy sample data i Encrypting a request sample with PK to [ x [ [ x ]]]And transmitting the data packet
Figure GDA00035815989200000716
Feeding the CP; received from RU i After requesting the data, the CP and CSP combine the encrypted weak classifier { [ [ h ] t ]]And its influence factor pair [ [ x ]]]Performing privacy-preserving TrAdaboost prediction, wherein
Figure GDA00035815989200000717
For deployment, CP and CSP are first in [ [ x ]]]Performing encrypted weak classifier computation alternately and obtaining
Figure GDA0003581598920000081
Performing a weighted prediction result calculation:
[[l t ]]=SMul([[h t (x)]],[[SNLog([β i ])]]) N-1
Figure GDA0003581598920000082
SMul is a safe multiplication protocol, and SNLog is a safe natural logarithm protocol;
subsequently, the cloud server calculates two decision parameters, namely [ [ left ] ] and [ [ right ] ], wherein:
Figure GDA0003581598920000083
Figure GDA0003581598920000084
next, the cloud server runs a greater than or equal security comparison protocol SGE to compare [ [ left ]]]And [ [ right ]]]If right ≧ left, the final classification result [ [ h ] f (x)]]Is set as [ [ 1]]](ii) a Otherwise, will [ [ h ] f (x)]]Is set to [ [0 ]]](ii) a In RU returning prediction result to requesting user i Previously, CP and CSP were RU-based i Of (2) a public key
Figure GDA0003581598920000085
Re-encryption of classification result [ [ h ] f (x)]](ii) a Receiving the result of the re-encryption
Figure GDA0003581598920000086
Thereafter, requesting user RU i Using its own private key
Figure GDA0003581598920000087
The plaintext of the result is recovered.
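A sketch of the prediction-time weighted vote carried out jointly by CP and CSP; smul, snlog, sge and halve stand in for the SMul, SNLog, SGE and secure-halving steps, and add/neg are the helpers from the HRES characteristics sketch. Which aggregated sum the patent calls [[left]] and which [[right]] is not recoverable from the equation images, so the sketch simply follows the standard TrAdaboost decision rule over the saved second-half weak classifiers.

```python
def secure_predict(enc_h_on_x, enc_betas, N, smul, snlog, sge, halve):
    vote = thresh = None
    for enc_h, enc_beta in zip(enc_h_on_x, enc_betas):
        enc_log = snlog(enc_beta)                 # [[ln(beta_t)]], beta_t in (0, 1]
        l_t = neg(smul(enc_h, enc_log), N)        # [[-h_t(x) * ln(beta_t)]]
        r_t = neg(halve(enc_log), N)              # [[-ln(beta_t) / 2]]
        vote = l_t if vote is None else add(vote, l_t, N)
        thresh = r_t if thresh is None else add(thresh, r_t, N)
    return sge(vote, thresh)                      # [[h_f(x)]]: [[1]] iff vote >= threshold
```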
In an embodiment of the present invention, the secure XOR protocol SXor is implemented as follows: for two encrypted numbers [[m_1]] and [[m_2]] with m_1, m_2 ∈ {0,1}, the ciphertext XOR operation is carried out to obtain the encrypted bit-XOR result [[u]]; if m_1 = m_2, then u = 0; otherwise, u = 1.
In an embodiment of the present invention, the secure ciphertext-plaintext equality test protocol SETest is implemented as follows: it tests the equality relationship between a ciphertext [[m]] and a plaintext k, where m is a real number in [0,1] and k = 0; if u = 1, then m = 0; if u = 0, then m ≠ 0.
In an embodiment of the present invention, the secure natural logarithm protocol SNLog is implemented as follows: it takes a ciphertext [[x]] as input and outputs the encrypted natural logarithm result [[ln(x)]]. Since the input values of the logarithm operations involved in the present system lie in (0, 1], the input value [[x]] of this protocol only needs to satisfy x ∈ (0, 1].
Compared with the prior art, the invention has the following beneficial effects:
1. Privacy-preserving TrAdaboost training. The system adopts a Paillier-based homomorphic re-encryption scheme (HRES) as its basic encryption system and allows two non-colluding cloud servers to perform privacy-preserving TrAdaboost model training on the source-domain and target-domain data sets (both in the encrypted domain). During model training, the two servers obtain no information about the private data (i.e., the training data sets, the final model results, and the intermediate calculation results). The encrypted training model parameters are stored on the cloud servers and are used subsequently to process the sample prediction requests of requesting users.
2. Privacy-preserving TrAdaboost prediction. In the system, the requesting user uploads an encrypted data sample to the cloud server, and the cloud server computes the prediction result with the pre-trained model and finally returns it to the user. Thanks to the encrypted-computation property of the additively homomorphic encryption, the two cloud servers perform the outsourced prediction computation over the encrypted domain through interaction and obtain an encrypted prediction result. Only the corresponding requesting user can decrypt the real prediction result with its own key.
3. An efficient ciphertext computing protocol. In order to further reduce the operation overhead of the system, the system designs and realizes three cipher text calculation protocols, including a safe exclusive-or protocol, a safe cipher text-plaintext equivalence test protocol and a safe natural logarithm protocol. These security protocols perform security operations on encrypted input values and output the encrypted results. At the same time, these protocols are more efficient than existing related protocols.
4. The local overhead of the user is reduced as much as possible. On one hand, a system data user only needs to encrypt or decrypt the data uploaded to the cloud end or the encrypted result returned by the cloud end, and the cloud server with strong computing power executes complex TrAdaboost training and prediction calculation. On the other hand, the system minimizes the interaction cost between the data user and the cloud server: the owner of the source domain data and the target domain data only needs to send the encrypted training data to the cloud end; and the prediction request user only needs to transmit the encrypted data sample to the cloud end and wait for the cloud server to return the prediction result.
The application is as follows: the invention provides a secure transfer learning system based on an additively homomorphic re-encryption scheme. Around the privacy-disclosure problem of transfer machine learning in a cloud outsourcing scenario, the system designs encrypted TrAdaboost training and prediction algorithms based on a dual-cloud-server model (a storage cloud server and a computing cloud server). On the one hand, the source-domain data owner and the target-domain data owner of the system each upload encrypted training data to the cloud, and the cloud servers train a TrAdaboost model in a privacy-preserving manner; on the other hand, a requesting user of the system sends an encrypted data sample to the cloud server to request a secure prediction service, and the cloud server then returns an encrypted prediction classification result. The system does not leak the training and prediction request data, the trained model, the prediction results or the intermediate calculation results of users (including data owners and prediction requesters) to the cloud or to unauthorized users.
Drawings
FIG. 1 is a system model of the present invention.
FIG. 2 is a flow chart of the system of the present invention.
Fig. 3 illustrates the secure TrAdaboost training phase of the present invention.
Fig. 4 shows the secure TrAdaboost prediction stage of the present invention.
Detailed Description
The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.
The invention provides a security transfer learning system based on homomorphic encryption, which comprises: the system comprises a key generation center KGC, a cloud platform CP, a cloud service provider CSP, a source data owner SDO, a target data owner TDO and a request user RUs;
a key generation center KGC responsible for initializing cryptographic system parameters and distributing public/private key pairs of system entities;
a cloud platform CP, which is responsible for receiving and storing training data from SDO and TDO and prediction request data from RUs, and performing partial computation of the system;
the CSP interacts with the CP and provides computing service for a migration learning algorithm for protecting privacy; in addition, the CP and CSP jointly perform decryption and re-encryption operations;
a source data owner SDO, the SDO owning a tagged sample instance from the source domain, sending the encrypted data set to the CP as a source training data set of the system;
target data owner TDO, TDO having tagged sample instances and untagged sample instances from target domains, the source and target domains involved in the system are distributed differently but have correlation; the TDO sends the encrypted data set to a CP to serve as a target training data set of the system, and the CP combines the encrypted source training data set and the target training data set to serve as a joint training data set;
and the request user RUs sends the encrypted unmarked sample from the target domain to the CP after the CP and the CSP finish the construction of the TrAdaboost classifier for the target space, requests related prediction calculation, and the encrypted prediction result returned from the CP can only be decrypted by the corresponding request user.
The following is a specific implementation of the present invention.
Fig. 1 shows the system architecture of the present invention, which includes six entities, namely, a Key Generation Center (KGC), a Cloud Platform (CP), a Cloud Service Provider (CSP), a Source Data Owner (SDO), a Target Data Owner (TDO), and Requesting Users (RUs).
1. Key Generation Center (KGC): the KGC is a trusted authority responsible for initializing cryptographic system parameters and distributing the public/private key pairs of system entities.
2. Cloud Platform (CP): the CP has powerful storage and computation capabilities, and its task is to receive and store training data from SDO and TDO and prediction request data from RU, and perform part of the computation of the system.
3. Cloud Service Provider (CSP): the CSP interacts with the CP to provide computing service for the migration learning algorithm for protecting privacy. In addition, the CP and CSP jointly perform decryption and re-encryption operations.
4. Source Data Owner (SDO): the SDO has a sufficient number of labeled sample instances (from the source domain) and sends the encrypted data samples to the CP as the source training data of the system.
5. Target Data Owner (TDO): TDO has a small number of labeled sample instances and unlabeled sample instances (from the target domain), requiring that the source and target domains involved in the system are distributed differently but have correlation. After the TDO sends the encrypted data set to the CP, the CP merges the encrypted source and target data sets as a joint training data set.
6. Requesting Users (RUs): after the cloud servers complete the construction of the TrAdaboost classifier (for the target space), a requesting user sends an encrypted unlabeled sample (from the target domain) to the CP and requests the corresponding prediction calculation. The encrypted prediction result returned from the CP can only be decrypted by the corresponding requesting user.
Table 1 lists the important symbols used in the present invention.
Table 1: symbols in the invention
1. First, some of the algorithms employed by the present invention are introduced:
1.1 TrAdaboost algorithm
The main algorithm adopted by the invention is the TrAdaboost algorithm, which is specifically explained as follows:
TrAdaboost is a classical instance-based transfer learning algorithm. Assume a large labeled source data set D_S = {(x_1, y_1), ..., (x_n, y_n)} and a small labeled target data set D_T = {(x_(n+1), y_(n+1)), ..., (x_(n+m), y_(n+m))} (labels y_i ∈ {0,1}); the two data sets have different distributions but some similarity. The TrAdaboost algorithm extends the idea of the Adaboost algorithm; its purpose is to use the joint data set D = D_S ∪ D_T to construct a good classifier for the target domain space. The algorithm sets a weight for each source and target training sample and adjusts the values of the weight vector round by round during the iterative training process. Specifically, in each round of TrAdaboost training, the weights of the instances that are favorable to the target classification task are increased, and the weights of the opposite instances are decreased. After training is finished, the algorithm has obtained R (i.e., the number of iterations of the algorithm) weak classifiers. The final classification hypothesis is determined by the weighted classification results of the weak classifiers from the second half of the training iterations. The overall procedure of the TrAdaboost algorithm is as follows (vector data is written in boldface to distinguish it from scalar data, e.g., w = (w_1, ..., w_(n+m))):
(1) Initialize the weight vector w of the training examples: the initial value of the weight vector w^1 of the joint data samples is specified. For example, the initial weights of the source samples and the target samples may be set according to the sizes of the data sets, that is, w_i^1 = 1/n for 1 ≤ i ≤ n and w_i^1 = 1/m for n+1 ≤ i ≤ n+m.
The following iterative process (i.e., stages (2) to (6)) is repeated R times, with t denoting the current iteration round:
(2) Normalize the weight vector: p^t = w^t / ∑_(i=1)^(n+m) w_i^t.
(3) Train a weak classifier: using the joint training examples (taking into account their weight distribution p^t), train the output hypothesis h_t: X → Y, where X is the instance space of D and Y = {0,1} is the set of classification labels.
(4) Calculate the prediction error of h_t on D_T: given an instance x on the D_T domain, the classifier prediction is h_t(x) and the true label is c(x). The total error of classifier h_t is calculated by combining the weighted misclassification rates:
ε_t = ∑_(i=n+1)^(n+m) w_i^t·|h_t(x_i) - c(x_i)| / ∑_(i=n+1)^(n+m) w_i^t
(5) Calculate the weight adjustment rates of the source and target training samples: the weight adjustment rate of D_S (i.e., β) depends only on n and R and thus remains constant throughout the algorithm, β = 1/(1 + sqrt(2·ln n / R)); the weight adjustment rate of D_T (i.e., β_t) is updated with ε_t in each round, β_t = ε_t/(1 - ε_t).
It is worth noting that when ε_t takes a special value, i.e., (1) ε_t ≥ 1/2, (2) ε_t = 1 or (3) ε_t = 0, the algorithm calculates β_t separately; for example, β_t may be set to a suitable constant.
(6) Update the weight vector. If an example x_i satisfies h_t(x_i) ≠ c(x_i) for i ∈ {1, ..., n}, the algorithm reduces the weight value corresponding to x_i; conversely, if an example x_i satisfies h_t(x_i) ≠ c(x_i) for i ∈ {n+1, ..., n+m}, the weight corresponding to x_i is increased; otherwise, the weight value of the sample is unchanged. That is, w_i^(t+1) = w_i^t·β^(|h_t(x_i) - c(x_i)|) for 1 ≤ i ≤ n and w_i^(t+1) = w_i^t·β_t^(-|h_t(x_i) - c(x_i)|) for n+1 ≤ i ≤ n+m.
(7) Output the final hypothesis. When the iterative training process is finished, the weak classifiers trained in the second half of the iterations (i.e., {h_t}, t = ⌈R/2⌉, ..., R) are combined to obtain the final classifier for the target domain. The weight of a weak classifier h_t depends on the value of β_t (so β_t is also called the influence factor of h_t):
h_f(x) = 1 if ∏_(t=⌈R/2⌉)^R β_t^(-h_t(x)) ≥ ∏_(t=⌈R/2⌉)^R β_t^(-1/2), and h_f(x) = 0 otherwise.
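For orientation, the plain-domain TrAdaboost loop just described can be sketched as follows. Variable names and the base_learner interface are illustrative assumptions, and the special cases of ε_t are handled here simply by clamping; the secure scheme of the invention evaluates the same steps over HRES ciphertexts.

```python
import numpy as np

def tradaboost(Xs, ys, Xt, yt, base_learner, R):
    X = np.vstack([Xs, Xt]); y = np.concatenate([ys, yt])
    n, m = len(ys), len(yt)
    w = np.concatenate([np.full(n, 1.0 / n), np.full(m, 1.0 / m)])  # stage (1)
    beta = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n) / R))               # source rate, stage (5)
    clfs, betas = [], []
    for t in range(R):
        p = w / w.sum()                               # stage (2): normalize
        h = base_learner(X, y, p)                     # stage (3): weak hypothesis h_t
        eps = np.sum(p[n:] * np.abs(h(Xt) - yt)) / np.sum(p[n:])    # stage (4)
        eps = min(max(eps, 1e-10), 0.5 - 1e-10)       # special cases clamped for simplicity
        beta_t = eps / (1.0 - eps)                    # target rate, stage (5)
        w[:n] *= beta ** np.abs(h(Xs) - ys)           # stage (6): shrink bad source weights
        w[n:] *= beta_t ** (-np.abs(h(Xt) - yt))      # stage (6): grow bad target weights
        clfs.append(h); betas.append(beta_t)
    start = R // 2                                    # second half of the iterations
    def h_f(x):                                       # stage (7): final hypothesis
        vote = sum(-np.log(betas[t]) * clfs[t](x) for t in range(start, R))
        thresh = sum(-0.5 * np.log(betas[t]) for t in range(start, R))
        return 1 if vote >= thresh else 0
    return h_f
```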
1.2 Homomorphic re-encryption system based on the Paillier algorithm
The invention utilizes a homomorphic re-encryption system (HRES) based on the Paillier algorithm as its basic cryptographic algorithm. The cryptosystem comprises the following algorithms: key generation (KeyGen), encryption (Enc), decryption (Dec), double-key encryption (EncTK), partial decryption using sk_A (PDec1), partial decryption using sk_B (PDec2), re-encryption first stage (FPRE) and re-encryption second stage (SPRE).
(1) Key generation (KeyGen): a security parameter k and two large primes p, q are given such that |p| = |q| = k. Then N = p·q is calculated and a generator g of maximal order is selected. The public/private key pair of user i is (pk_i = g^(s_i) mod N^2, sk_i = s_i), where s_i ∈_R [1, λ(N^2)] (λ(·) denotes the Euler function). Furthermore, assume that two entities A and B possess the public/private key pairs (pk_A = g^a mod N^2, sk_A = a) and (pk_B = g^b mod N^2, sk_B = b), respectively; the public key obtained after the two carry out Diffie-Hellman negotiation is PK = pk_B^a = pk_A^b = g^(a·b) mod N^2, and the corresponding joint decryption private keys are a and b, respectively. In the system, PK is used as the global public key. The parameters g and N are published.
(2) Encryption (Enc): the algorithm takes a message m ∈ Z_N and the public key pk_i as input, randomly selects an integer r, and computes the ciphertext [m]_(pk_i) = (T, T') = (pk_i^r·(1+m·N) mod N^2, g^r mod N^2).
(3) Decryption (Dec): using the private key sk_i = s_i, the ciphertext [m]_(pk_i) = (T, T') is decrypted as m = L(T/(T')^(s_i) mod N^2), where L(u) = (u-1)/N.
(4) Double-key encryption (EncTK): to avoid operations between ciphertexts under different public keys, the system global public key PK is chosen instead of the user's own public key pk_i to encrypt the message. Similar to the Enc algorithm, given a plaintext message m ∈ Z_N, the ciphertext is [[m]]_PK = (T, T') = (PK^r·(1+m·N) mod N^2, g^r mod N^2). For simplicity of expression, [[m]]_PK is hereinafter uniformly written as [[m]].
(5) Partial decryption using sk_A (PDec1): input [[m]] = (T, T') and sk_A = a; the first stage of partial decryption computes the partially decrypted ciphertext (T^(1), T'^(1)) = (T, (T')^a mod N^2).
(6) Partial decryption using sk_B (PDec2): input the partially decrypted ciphertext (T^(1), T'^(1)) and sk_B = b; the second stage of partial decryption computes T'^(2) = (T'^(1))^b mod N^2 and thereby obtains the plaintext message m = L(T^(1)/T'^(2) mod N^2).
(7) Re-encryption first stage (FPRE): given the ciphertext [[m]], the private key sk_A and a user public key pk_j, the first-stage re-encryption computation is executed and outputs the partially re-encrypted ciphertext [m]^+.
(8) Re-encryption second stage (SPRE): given the partially re-encrypted ciphertext [m]^+, the private key sk_B and the user public key pk_j, the second-stage re-encryption computation is performed to obtain the ciphertext [m]_(pk_j) of the plaintext m under the public key pk_j. The ciphertext [m]_(pk_j) can only be decrypted by user j using the private key sk_j and executing the Dec algorithm.
Further, HRES satisfies the following characteristics:
(1) Additive homomorphism: given m_1, m_2 ∈ Z_N, [[m_1]]·[[m_2]] = [[m_1 + m_2]];
(2) given [[m]] and a constant k ∈ Z_N, [[m]]^k = [[k·m]];
(3) [[m]]^(N-1) = [[N - m]] = [[-m]].
1.3 privacy protection protocol
The present invention utilizes the following protocols as basic privacy-preserving building blocks. [[·]]_PK denotes a ciphertext encrypted under the global public key PK; for ease of presentation, [[·]]_PK is uniformly written as [[·]] in the following description. It is worth mentioning that, since HRES only supports computation over the non-negative integer domain, the system performs the following preprocessing so that fractional or negative operations are also supported. First, the system uniformly multiplies plaintext data by a fixed scaling factor and rounds the result to an integer operation value; for example, with the scaling factor L set to 10^5, the original fraction 0.003343 is converted to 334. Second, since the plaintext input domain of HRES is Z_N, the scheme uses data in the range (0, N/2] to represent positive numbers and data in the range (N/2, N) to represent negative numbers. Given [[X]], the Secure Scaling-Down Protocol (SSDown) outputs [[X/L]] and the Secure Reciprocal Protocol (SRec) outputs [[1/X]]. Given [[X]] and [[Y]], the Secure Multiplication Protocol (SMul) outputs [[X·Y]], the Secure Division Protocol (SDiv) outputs [[X/Y]], and the Greater-than-or-Equal Secure Comparison Protocol (SGE) outputs [[u]] ← SGE([[X]], [[Y]]), where u = 1 if X ≥ Y and u = 0 if X < Y.
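The encoding convention just described can be sketched as a small pair of helpers; these are illustrative only, with the scaling factor L = 10^5 taken from the example in the text.

```python
L_FACTOR = 10 ** 5

def to_plaintext_field(value, N):
    x = int(value * L_FACTOR)        # e.g. 0.003343 -> 334
    return x % N                     # negative values land in (N/2, N)

def from_plaintext_field(x, N):
    if x > N // 2:
        x -= N                       # map (N/2, N) back to negatives
    return x / L_FACTOR
```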
1.4 improved ciphertext computing protocol
The invention provides three ciphertext computing protocols: a Secure XOR Protocol (SXor), a Secure Ciphertext-Plaintext Equality Test Protocol (SETest), and a Secure Natural Logarithm Protocol (SNLog).
1.4.1 secure XOR protocol
The secure XOR protocol SXor takes two encrypted numbers [[m_1]] and [[m_2]] (m_1, m_2 ∈ {0,1}) and computes the encrypted bit-XOR result [[u]]: if m_1 = m_2, then u = 0; otherwise, u = 1. The protocol description is shown in Algorithm 1.
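Algorithm 1 itself is not reproduced in this extraction. The sketch below shows one standard way such an XOR can be obtained from the additive homomorphism, namely x XOR y = x + y - 2·x·y for x, y ∈ {0,1}, with the product delegated to the SMul protocol; it is an assumption about the construction, not the patent's Algorithm 1. Here smul stands in for the interactive SMul step, and add/scal are the homomorphic helpers sketched earlier.

```python
def sxor(enc_x, enc_y, N, smul):
    enc_prod = smul(enc_x, enc_y)                    # [[x * y]]
    minus_2xy = scal(enc_prod, 2 * (N - 1), N)       # [[-2*x*y]] since 2(N-1) = -2 mod N
    return add(add(enc_x, enc_y, N), minus_2xy, N)   # [[x + y - 2*x*y]] = [[x XOR y]]
```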
1.4.2 secure ciphertext-plaintext equality test protocol
The secure ciphertext-plaintext equality test protocol tests the equality relationship between a ciphertext [[m]] and a plaintext k (where m is a real number in [0,1] and k = 0). If u = 1, then m = 0; if u = 0, then m ≠ 0. The protocol description is shown in Algorithm 2.
1.4.3 secure Natural logarithm protocol
The secure natural logarithm protocol takes the ciphertext [[x]] as input and outputs the encrypted natural logarithm result [[ln(x)]]. In particular, since the logarithm inputs involved in the system lie in the range (0, 1], the input value [[x]] of the protocol only needs to satisfy x ∈ (0, 1].
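Algorithm 3 itself is not reproduced in this extraction. The sketch below only illustrates, in the plain domain, the kind of truncated series a ciphertext natural-logarithm protocol can evaluate with SMul and public coefficients, ln(x) = -Σ_{k≥1} (1-x)^k / k for x ∈ (0, 1]; it is an assumption about the construction, not the patent's protocol.

```python
def ln_series(x, terms=60):
    u, acc, power = 1.0 - x, 0.0, 1.0
    for k in range(1, terms + 1):
        power *= u                 # (1 - x)^k
        acc += power / k
    return -acc

# e.g. ln_series(0.5) is approximately -0.6931, matching math.log(0.5)
```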
2. System flow
2.1 System overview
The system of the invention consists of the following three stages (Fig. 2): privacy-preserving weak classifier training, privacy-preserving TrAdaboost training, and privacy-preserving TrAdaboost prediction. Whether in the training or the prediction phase, the private data involved in the algorithm (e.g., the training data sets, the classifier model, the request data, the prediction results, or intermediate calculation results) cannot be obtained by other entities.
Privacy-preserving weak classifier training. In each round of iterative TrAdaboost training, the algorithm trains a weak classifier on the current source and target data sets (and their sample weight distributions). To protect privacy, the training process should be performed over the encrypted domain. Many secure base-classifier training schemes based on homomorphic encryption algorithms (especially additive homomorphism) already exist, such as secure Support Vector Machine (SVM) training or Logistic Regression (LR) training; therefore, the invention does not give an additional security algorithm for weak classifier training. After the weak classifier training is completed, the algorithm returns the encrypted model parameters for the subsequent secure TrAdaboost training calculation.
Privacy-preserving TrAdaboost training. In each round of secure TrAdaboost training, the algorithm first trains the base classifier using the joint training samples (including the source and target data sets) and their weight distribution and calculates its weighted prediction error. The algorithm then updates the misclassified sample weights for the next iteration (in a privacy-preserving manner). After each round of training, the trained encrypted weak classifier model and its influence factor are saved; the base classifiers from the second half of the iterative training will be used for the final TrAdaboost prediction.
Privacy-preserving TrAdaboost prediction. After receiving the encrypted sample from the requesting user, the CP and CSP perform the privacy-preserving TrAdaboost prediction computation in a cooperative, interactive manner. The final data classification result depends on a weighted combination of the weak classifiers' predictions. Finally, the cloud server returns the re-encrypted prediction value to the requesting user; the plaintext value of the prediction result can only be decrypted by the requesting user using the corresponding private key.
2.2, System initialization
The initialization task of the system is performed by the KGC, including generating parameters for the cryptographic system and distributing the public/private key pair to all entities in the system. In addition, the system global public key is generated by the negotiation between the CP and the CSP. The detailed description is as follows:
(1) KGC executes the KeyGen algorithm to generate the parameters of the HRES cryptosystem, e.g., N and g. At the same time, KGC generates a public/private key pair for every entity in the system. Specifically, KGC distributes the key pairs (pk_sdo, sk_sdo), (pk_tdo, sk_tdo), (pk_cp, sk_cp) and (pk_csp, sk_csp) to SDO, TDO, CP and CSP, respectively. In addition, the KGC provides a key store holding the requesting users' public/private key pairs {(pk_ru_1, sk_ru_1), ..., (pk_ru_n_user, sk_ru_n_user)}, where n_user may be specified as needed, and assigns an unused key pair to each registered user. The key store will be updated by the KGC at idle time or when needed.
(2) CP and CSP exchange their respective public keys and negotiate the Diffie-Hellman key PK = pk_csp^(sk_cp) = pk_cp^(sk_csp) = g^(sk_cp·sk_csp) mod N^2. PK is then published as the system's global public key to SDO, TDO, and the subsequent requesting users RUs. By the characteristics of HRES, messages encrypted under PK can only be decrypted jointly by CP and CSP to recover the plaintext.
2.3 secure TrAdaboost training phase
As shown in fig. 3, the training phase of the secure TrAdaboost is an R-round iterative process. The algorithm first performs preparation and preprocessing work. Subsequently, each round of TrAdaboost iterative training consists of four sub-blocks, including sample weight vector normalization, prediction error calculation, weight adjustment rate calculation and sample weight update. Algorithm 4 gives the algorithm flow of the secure TrAdaboost training scheme.
S1, algorithm preparation and preprocessing
First, the SDO and TDO submit their respective sets of encrypted training samples to the CP side. Suppose that SDO owns the source data set D_S = {(x_1, y_1), ..., (x_n, y_n)} and TDO owns the target data set D_T = {(x_{n+1}, y_{n+1}), ..., (x_{n+m}, y_{n+m})}. At the SDO and TDO ends, each value contained in the feature vectors of their data sets and the corresponding label values are first multiplied by a scaling factor L (L is specified by the system). Specifically, they execute
x_{ij} ← L · x_{ij} (1 ≤ j ≤ d) and y_i ← L · y_i,
where 1 ≤ i ≤ n+m and d represents the dimension of the feature vector. After encryption with the system global public key PK, SDO and TDO send their respective encrypted training sets to CP, i.e., [[D_S]]_PK = {([[x_1]]_PK, [[y_1]]_PK), ..., ([[x_n]]_PK, [[y_n]]_PK)} and [[D_T]]_PK = {([[x_{n+1}]]_PK, [[y_{n+1}]]_PK), ..., ([[x_{n+m}]]_PK, [[y_{n+m}]]_PK)}. In particular, [[x_i]]_PK = ([[x_{i1}]]_PK, [[x_{i2}]]_PK, ..., [[x_{id}]]_PK), where 1 ≤ i ≤ n+m. For simplicity of illustration, PK is omitted in the following description, i.e., [[·]]_PK is written as [[·]]. In addition, the sizes of the source and target data sets (i.e., n and m) are also sent to the CP for storage. Upon receiving [[D_S]] and [[D_T]], the CP merges them into a joint training data set [[D]] = {([[x_1, y_1]]), ..., ([[x_{n+m}, y_{n+m}]])}.
Subsequently, the CP initializes the sample weights of the joint training data set (the initialization strategy may be specified by TDO). The initial weight values w_i^1 are determined by the sizes of the source and target data sets, respectively: the CP sets the initial weight of each source sample to 1/n and the initial weight of each target sample to 1/m. Then the encrypted sample weight vector is calculated as
[[w^1]] = ([[w_1^1]], ..., [[w_{n+m}^1]]).
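As a plaintext-side illustration of the preparation in S1 (before any encryption), the sketch below scales features and labels by L and initializes the 1/n and 1/m weights; the rounding to integers and the helper names are assumptions for illustration.

```python
# Plaintext-side sketch of S1: scale features/labels by L, then initialize the
# sample weights (1/n for source samples, 1/m for target samples). The scaled
# integers are what would subsequently be encrypted under PK and sent to CP.
from typing import List, Tuple

def scale_dataset(data: List[Tuple[List[float], int]], L: int) -> List[Tuple[List[int], int]]:
    # multiply every feature value and the label by the scaling factor L
    return [([round(L * v) for v in x], round(L * y)) for x, y in data]

def init_weights(n: int, m: int) -> List[float]:
    return [1.0 / n] * n + [1.0 / m] * m

# toy example: n = 3 source samples, m = 2 target samples, d = 2 features
D_S = [([0.12, 0.80], 1), ([0.55, 0.31], 0), ([0.94, 0.27], 1)]
D_T = [([0.40, 0.66], 0), ([0.73, 0.09], 1)]
L = 1000
joint_dataset = scale_dataset(D_S, L) + scale_dataset(D_T, L)  # CP-side merge [[D]]
w1 = init_weights(len(D_S), len(D_T))                          # initial weight vector
```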
S2 sample weight vector normalization
In the t-th round of iterative training, the algorithm obtains the encrypted normalized sample weight vector [[p^t]]. First, the CP computes the sum of all weight values over the ciphertext domain:
[[Σ^t]] = [[Σ_{i=1}^{n+m} w_i^t]] = ∏_{i=1}^{n+m} [[w_i^t]].
Then, [[p^t]] is obtained by calling the SDiv algorithm n+m times:
[[p_i^t]] = SDiv([[w_i^t]], [[Σ^t]]), 1 ≤ i ≤ n+m.
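The S2 step computes an ordinary weight normalization; a plaintext sketch is given below, with comments noting how the same arithmetic is carried out over ciphertexts (the sum becomes a product of ciphertexts and each division one SDiv call). The helper name is illustrative.

```python
# Plaintext equivalent of S2: p_i^t = w_i^t / (w_1^t + ... + w_{n+m}^t).
# Over ciphertexts, the sum [[Sigma^t]] is obtained as the product of the
# [[w_i^t]] (additive homomorphism), and each p_i^t costs one SDiv invocation.
from typing import List

def normalize_weights(w: List[float]) -> List[float]:
    total = sum(w)                      # [[Sigma^t]] in the encrypted version
    return [wi / total for wi in w]     # n + m calls to SDiv in the encrypted version

p = normalize_weights([1/3, 1/3, 1/3, 1/2, 1/2])
assert abs(sum(p) - 1.0) < 1e-9         # normalized weights sum to 1
```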
S3, calculating prediction error
Suppose the weak classifier [[h_t]] of the t-th round has been trained; its prediction on an encrypted sample [[x]] is denoted [[h_t(x)]], and it satisfies h_t(x) ∈ {0, 1}. The algorithm aims at calculating the weighted prediction error of [[h_t]] on D_T. It is known that |h_t(x_i) − y_i| equals the XOR of h_t(x_i) and y_i. Therefore, |h_t(x_i) − y_i| is first calculated using the Sxor algorithm (since the sample weight update phase needs the error values of [[h_t]] on both the source and target samples, the encrypted prediction errors are calculated for the source and target samples in this step and the results are stored at the CP end):
[[|h_t(x_i) − y_i|]] = Sxor([[h_t(x_i)]], [[y_i]]), 1 ≤ i ≤ n+m.
Next, the encrypted prediction error rate [[ε_t]] is calculated by ciphertext multiplication and addition operations:
[[e_i]] = SMul([[p_i^t]], [[|h_t(x_i) − y_i|]]), n+1 ≤ i ≤ n+m;
[[ε_t]] = ∏_{i=n+1}^{n+m} [[e_i]] = [[Σ_{i=n+1}^{n+m} p_i^t · |h_t(x_i) − y_i|]].
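A plaintext reading of S3 is sketched below: for binary labels, |h_t(x_i) − y_i| is the XOR that Sxor evaluates over ciphertexts, and the error rate accumulates the normalized weights of the misclassified target samples. The exact normalization inside the encrypted formula is only shown as a figure in the original, so treat this as an assumed reading.

```python
# Plaintext sketch of S3: weighted prediction error of h_t on the target samples.
# For h, y in {0,1}, |h - y| equals h XOR y, which is what Sxor computes.
from typing import List

def weighted_error(preds: List[int], labels: List[int], p: List[float], n: int) -> float:
    # preds/labels cover all n + m samples; the error is accumulated over the
    # target samples i = n, ..., n+m-1 (0-based) using the normalized weights p
    return sum(p[i] * (preds[i] ^ labels[i]) for i in range(n, len(labels)))

preds  = [1, 0, 1, 0, 1]       # predictions on 3 source + 2 target samples
labels = [1, 1, 1, 0, 0]
p      = [0.2, 0.2, 0.2, 0.2, 0.2]
eps_t = weighted_error(preds, labels, p, n=3)   # only the last target sample is wrong
assert abs(eps_t - 0.2) < 1e-9
```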
S4, calculating weight adjustment rate
The weight adjustment rate controls the degree of the sample weight update. The adjustment rate β of the source training sample weights is constant in each iteration. Therefore, the value of β only needs to be calculated once during the entire TrAdaboost training process (and since the operands involved in the calculation are public, it is calculated in the plaintext domain):
β = 1 / (1 + √(2·ln n / R)),
where R is the preset number of training iterations.
Before calculating the sample weight adjustment rate for D_T (i.e., β_t = ε_t/(1 − ε_t) = −1 + 1/(1 − ε_t)), the algorithm considers the calculation under three special conditions based on the different values of ε_t. First, whether the conditions ε_t ≥ 1/2, ε_t = 1 or ε_t = 0 hold is determined by the following calculations (where ε_t is encrypted). If ex_1 = 1, the condition ε_t ≥ 1/2 holds; otherwise ε_t < 1/2. If ex_2 = 1, the condition ε_t = 0 holds; if ex_3 = 1, the condition ε_t = 1 holds.
[[ex_1]] = SGE([[ε_t]], [[1/2]]);
[[ex_2]] = SETest([[ε_t]]);
[[ex_3]] = SETest([[1]] · [[ε_t]]^{N−1}) = SETest([[1 − ε_t]]).
Next, the algorithm calculates the candidate value of [[β_t]] = [[−1 + 1/(1 − ε_t)]] by invoking the secure reciprocal protocol SRec on [[1 − ε_t]] · [[ex_3]] = [[1 − ε_t + ex_3]]; incorporating the indicator ex_3 into the operand avoids a zero denominator when ε_t = 1.
The different values of the SRec operand [[1 − ε_t + ex_3]] are discussed as follows: if ε_t = 1, then ex_3 = 1 and 1 − ε_t + ex_3 = 1, so the reciprocal equals 1; if ε_t ≠ 1, then ex_3 = 0 and 1 − ε_t + ex_3 = 1 − ε_t, so the reciprocal equals 1/(1 − ε_t).
Suppose that when ε_t ≥ 1/2 (or ε_t = 1, ε_t = 0), β_t is set directly to a constant c_1 (or c_2, c_3), e.g., c_1 = 0.5 (or c_2 = 0.4, c_3 = 0.99). Finally, [[β_t]] is calculated by combining the candidate value and the constants according to the encrypted indicators, where
[[S′_3]] = SMul([[1]] · [[ex_1]]^{N−1}, [[1]] · [[ex_2]]^{N−1});
[[S″_3]] = SMul([[S′_3]], [[1]] · [[ex_3]]^{N−1});
that is, [[S″_3]] = [[(1 − ex_1)(1 − ex_2)(1 − ex_3)]] selects the computed value −1 + 1/(1 − ε_t) only when none of the three special conditions holds, while the indicators ex_1, ex_2 and ex_3 select the corresponding constants c_1, c_2 and c_3 otherwise.
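The following plaintext sketch reproduces the S4 logic for reference: the constant β of the standard TrAdaboost formula and the guarded computation of β_t, with the special-case constants c_1, c_2, c_3 taken from the example values above. The precedence among the overlapping conditions is an assumption; the encrypted protocol realizes the same selection with the SGE/SETest indicators, SRec and SMul.

```python
# Plaintext sketch of S4: the two weight adjustment rates.
# beta is fixed for the whole run; beta_t depends on the error rate eps_t and is
# replaced by a constant when eps_t is 0, 1, or at least 1/2.
import math

def source_adjustment_rate(n: int, R: int) -> float:
    # beta = 1 / (1 + sqrt(2 ln n / R)), computed once in the plaintext domain
    return 1.0 / (1.0 + math.sqrt(2.0 * math.log(n) / R))

def target_adjustment_rate(eps_t: float, c1: float = 0.5, c2: float = 0.4,
                           c3: float = 0.99) -> float:
    # ex3 = (eps_t == 1), ex2 = (eps_t == 0), ex1 = (eps_t >= 1/2); the computed
    # value is used only when none of the three special conditions holds
    if eps_t == 1.0:
        return c2
    if eps_t == 0.0:
        return c3
    if eps_t >= 0.5:
        return c1
    return eps_t / (1.0 - eps_t)        # beta_t = -1 + 1/(1 - eps_t)

beta = source_adjustment_rate(n=100, R=20)      # about 0.6
beta_t = target_adjustment_rate(0.2)            # 0.25
```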
S5, sample weight update
It is known that the strategies for updating the weight values of the source data samples and the target data samples are different. In addition, the weight w_i^t of a sample x_i needs to be updated only when x_i is misclassified. Since the ciphertexts of the prediction errors [[|h_t(x_i) − y_i|]], 1 ≤ i ≤ n+m, have already been calculated, the algorithm only needs to test over the encrypted domain whether [[|h_t(x_i) − y_i|]] equals 0 by calling the SETest protocol:
[[s_i]] = SETest([[|h_t(x_i) − y_i|]]).
The sample weight vector is then updated by the following strategy:
w_i^{t+1} = w_i^t · β^{|h_t(x_i) − y_i|}, for i = 1, ..., n;
w_i^{t+1} = w_i^t · β_t^{−|h_t(x_i) − y_i|}, for i = n+1, ..., n+m;
over the ciphertext domain, the indicator [[s_i]] is used to keep the weight unchanged for correctly classified samples and to apply the corresponding adjustment rate for misclassified samples.
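For S5, a plaintext sketch of the update rule that the encrypted formulas appear to encode is given below; in the encrypted protocol the branch on the error bit is replaced by the indicator [[s_i]] and SMul, so that correctly classified samples keep their weight.

```python
# Plaintext sketch of S5: misclassified source samples are down-weighted by beta,
# misclassified target samples are up-weighted by 1/beta_t; correctly classified
# samples keep their current weight.
from typing import List

def update_weights(w: List[float], preds: List[int], labels: List[int],
                   n: int, beta: float, beta_t: float) -> List[float]:
    new_w = []
    for i, wi in enumerate(w):
        err = preds[i] ^ labels[i]                  # |h_t(x_i) - y_i| in {0, 1}
        if i < n:                                   # source sample
            new_w.append(wi * (beta ** err))
        else:                                       # target sample
            new_w.append(wi * (beta_t ** (-err)))
    return new_w

w_next = update_weights([0.2] * 5, [1, 0, 1, 0, 1], [1, 1, 1, 0, 0],
                        n=3, beta=0.6, beta_t=0.25)
# the misclassified source sample drops to 0.12, the misclassified target sample rises to 0.8
```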
2.4 secure TrAdaboost prediction phase
Fig. 4 depicts the algorithm flow of the privacy-preserving TrAdaboost prediction phase. When a user (denoted RU_i) requests the class label of its unlabeled sample x (x comes from the target sample space), he/she first completes registration on the system and obtains a unique public/private key pair (pk_{u_i}, sk_{u_i}) as well as the global public key PK of the system. To prevent leakage of its private sample data, the user RU_i encrypts the request sample with PK as [[x]] and transmits the request data packet to the CP end. After receiving the request data from RU_i, the CP together with the CSP performs the privacy-preserving TrAdaboost prediction on [[x]] with the encrypted weak classifiers {[[h_t]]} and their influence factors, where t = ⌈R/2⌉, ..., R. Specifically, the CP and CSP first alternately perform the encrypted weak classifier computations on [[x]] and obtain {[[h_t(x)]]}, t = ⌈R/2⌉, ..., R.
Subsequently, the cloud servers calculate two decision parameters, [[left]] and [[right]], where:
[[left]] = [[Σ_{t=⌈R/2⌉}^{R} h_t(x) · ln β_t]];
[[right]] = [[(1/2) · Σ_{t=⌈R/2⌉}^{R} ln β_t]].
Next, the cloud servers run the SGE protocol to compare [[left]] and [[right]]. If right ≥ left, the final classification result [[h_f(x)]] of the model is set to [[1]]; otherwise, [[h_f(x)]] is set to [[0]]. Before returning the prediction result to the user RU_i, the CP and CSP re-encrypt the classification result [[h_f(x)]] based on the public key pk_{u_i} of RU_i. After receiving the re-encrypted result, the requesting user RU_i recovers the plaintext of the result using its own private key sk_{u_i}. Since no entity other than RU_i can obtain the private key sk_{u_i}, the prediction result is kept secret. The pseudo code of the secure TrAdaboost prediction algorithm is given in Algorithm 5.
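In plaintext, the decision just described is the standard TrAdaboost vote over the second half of the rounds; since β_t ∈ (0, 1], ln β_t ≤ 0, which is why the result is set to 1 exactly when right ≥ left. The sketch below assumes that reading; in the encrypted protocol the logarithms come from SNLog, the products from SMul and the comparison from SGE.

```python
# Plaintext sketch of the final TrAdaboost decision used in the prediction phase.
# Only the weak classifiers of the second half of the R rounds take part.
import math
from typing import List

def final_prediction(h_preds: List[int], betas: List[float]) -> int:
    # h_preds[t] and betas[t] for t = ceil(R/2), ..., R (already restricted)
    left = sum(h * math.log(b) for h, b in zip(h_preds, betas))   # sum h_t(x)·ln(beta_t)
    right = 0.5 * sum(math.log(b) for b in betas)                 # (1/2)·sum ln(beta_t)
    return 1 if right >= left else 0    # ln(beta_t) <= 0, so this is the usual weighted vote

# three second-half rounds: predictions h_t(x) and the corresponding beta_t values
assert final_prediction([1, 1, 0], [0.25, 0.4, 0.9]) == 1
assert final_prediction([0, 0, 1], [0.25, 0.4, 0.9]) == 0
```

Equivalently, each weak classifier enters the vote with influence ln(1/β_t), so rounds with a smaller target error carry more weight.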
The above are preferred embodiments of the present invention; all changes made according to the technical solution of the present invention that produce equivalent effects and do not depart from its scope fall within the protection scope of the present invention.

Claims (6)

1. A secure transfer learning system based on homomorphic encryption, comprising: a key generation center KGC, a cloud platform CP, a cloud service provider CSP, a source data owner SDO, a target data owner TDO and requesting users RUs;
a key generation center KGC responsible for initializing cryptographic system parameters and distributing public/private key pairs of system entities;
a cloud platform CP responsible for receiving and storing training data from SDO and TDO and prediction request data from RUs, and performing partial computation of the system;
a cloud service provider CSP, which interacts with the CP and provides computing services for the privacy-preserving transfer learning algorithm; in addition, the CP and CSP jointly perform the decryption and re-encryption operations;
a source data owner SDO, which owns labeled sample instances from the source domain and sends its encrypted data set to the CP as the source training data set of the system;
a target data owner TDO, which owns labeled sample instances and unlabeled sample instances from the target domain, the source domain and the target domain involved in the system being differently distributed but correlated; the TDO sends its encrypted data set to the CP as the target training data set of the system, and the CP merges the encrypted source training data set and target training data set into a joint training data set;
and requesting users RUs, which, after the CP and CSP finish constructing the TrAdaboost classifier for the target space, send encrypted unlabeled samples from the target domain to the CP and request the related prediction calculation; the encrypted prediction result returned from the CP can only be decrypted by the corresponding requesting user.
2. The secure transfer learning system based on homomorphic encryption according to claim 1, wherein the key generation center KGC initializes the cryptosystem parameters and distributes the public/private key pairs of the system entities as follows:
(1) KGC generates the system parameters N and g of the Paillier-algorithm-based homomorphic re-encryption system HRES, and KGC generates respective public/private key pairs for all entities in the system, specifically: KGC distributes the key pairs (pk_sdo, sk_sdo), (pk_tdo, sk_tdo), (pk_cp, sk_cp) and (pk_csp, sk_csp) to SDO, TDO, CP and CSP, respectively; in addition, the KGC sets up a key repository to store the requesting users' public/private key pairs {(pk_{u_1}, sk_{u_1}), ..., (pk_{u_{n_user}}, sk_{u_{n_user}})}, wherein n_user is the total number of system users, and an unused key pair is distributed to each registered user; the key repository is updated by KGC in idle time or when needed;
(2) CP and CSP exchange their respective public keys and negotiate the Diffie-Hellman key PK = (pk_csp)^{sk_cp} mod N^2 = (pk_cp)^{sk_csp} mod N^2; subsequently, PK is published to SDO, TDO and the subsequent requesting users RUs as the global public key of the system; according to the characteristics of HRES, messages encrypted under PK can only be decrypted jointly by CP and CSP to recover the plaintext.
3. The secure transfer learning system based on homomorphic encryption according to claim 2, wherein the HRES comprises the following algorithms:
key generation KeyGen algorithm: given a security parameter k, two large primes p and q are chosen such that ||p|| = ||q|| = k, wherein ||·|| denotes the bit length; then N = pq is calculated, and a group generator g is selected with order ord(g) = (p−1)(q−1)/2; the public/private key pair of user i is (pk_i = g^{s_i} mod N^2, sk_i = s_i), wherein s_i ∈_R [1, λ(N^2)] and λ(·) denotes the Euler function; furthermore, assume that two entities A and B possess the public/private key pairs (pk_A = g^a mod N^2, sk_A = a) and (pk_B = g^b mod N^2, sk_B = b), respectively; the public key obtained after they perform the Diffie-Hellman negotiation is PK = (pk_B)^a mod N^2 = (pk_A)^b mod N^2 = g^{ab} mod N^2, and the corresponding joint decryption private keys are a and b, respectively; in the system, PK is taken as the global public key of the system; the parameters g and N are published;
encryption Enc algorithm: taking a message m ∈ Z_N and the public key pk_i as input, a random number r ∈ {1, ..., N−1} is selected (Z_N is the integer set {0, 1, ..., N−1}), and the ciphertext [m]_{pk_i} = (T, T') is obtained by computing T = (pk_i)^r · (1 + mN) mod N^2 and T' = g^r mod N^2, wherein T and T' are respectively the first and second elements of the ciphertext;
decryption Dec algorithm: using the private key sk_i = s_i, the ciphertext [m]_{pk_i} = (T, T') is decrypted as m = L(T / (T')^{s_i} mod N^2), wherein L(u) = (u − 1)/N;
the EncTK algorithm with double keys is as follows: the system global public key PK is selected to encrypt the message; similar to the Enc algorithm, given a plaintext message m ∈ Z_N, the ciphertext [m]_PK = (T, T') is obtained; for simplicity of expression, [m]_PK is uniformly denoted as [[m]];
Using sk A Partial decryption PDec1 algorithm: input the method
Figure FDA00035815989100000219
And sk A A, the first stage of partial decryption:
Figure FDA0003581598910000028
using sk B Partial decryption PDec2 algorithm: inputting partially decrypted ciphertext
Figure FDA0003581598910000029
And sk B B, the second stage of partial decryption is performed, so as to obtain the plaintext information m:
Figure FDA00035815989100000210
m=L(T (1) /T′ (2) mod N 2 )
re-encryption first stage FPRE algorithm: given a ciphertext [[m]] = (T, T'), the private key sk_A = a and a user public key pk_j, the first-stage re-encryption computation is executed to obtain a partially re-encrypted ciphertext;
re-encryption second stage SPRE algorithm: given the partially re-encrypted ciphertext, the private key sk_B = b and the user public key pk_j, the second-stage re-encryption computation is executed to obtain the ciphertext [m]_{pk_j} of the plaintext m under the public key pk_j;
the ciphertext [m]_{pk_j} can only be decrypted by the user j using the private key sk_j and executing the Dec algorithm.
4. The secure transfer learning system based on homomorphic encryption according to claim 3, wherein the Paillier-algorithm-based homomorphic re-encryption system HRES has the following characteristics:
(1) additive homomorphism: given m_1, m_2 ∈ Z_N, there is [[m_1]] · [[m_2]] = [[m_1 + m_2]];
(2) given [[m]] and a constant k ∈ Z_N, there is [[m]]^k = [[k · m]];
(3) [[m]]^{N−1} = [[−m]] = [[N − m]].
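To make the HRES algorithms of claim 3 and the properties of claim 4 concrete, the following toy, non-secure sketch can be used; the ciphertext form (T, T') = (pk^r · (1 + mN) mod N^2, g^r mod N^2) and the simplified generator choice are assumptions in the style of the Paillier-based re-encryption schemes this system builds on, since the patent's own formulas are reproduced here only from figures.

```python
# Toy sketch of the HRES operations of claims 3-4: key generation, encryption,
# single-key decryption, joint two-stage decryption by CP and CSP (PDec1/PDec2),
# and the three homomorphic properties. Tiny parameters; correctness only.
import math
import random

p, q = 1019, 1031                              # toy primes (real HRES: large primes)
N, N2 = p * q, (p * q) ** 2
a0 = random.randrange(2, N)
while math.gcd(a0, N) != 1:
    a0 = random.randrange(2, N)
g = pow(a0, 2 * N, N2)                         # simplified group element g

def L_func(u):                                 # L(u) = (u - 1) / N
    return (u - 1) // N

def keygen():
    s = random.randrange(1, N // 2)            # private key s_i
    return pow(g, s, N2), s                    # pk_i = g^{s_i} mod N^2

def enc(pk, m):                                # Enc / EncTK: [m] = (T, T')
    r = random.randrange(1, N)
    return (pow(pk, r, N2) * (1 + m * N)) % N2, pow(g, r, N2)

def dec(sk, c):                                # Dec: m = L(T / (T')^{s_i} mod N^2)
    T, Tp = c
    return L_func((T * pow(Tp, -sk, N2)) % N2)

pk_cp, a = keygen()                            # CP's key pair (sk_cp = a)
pk_csp, b = keygen()                           # CSP's key pair (sk_csp = b)
PK = pow(pk_csp, a, N2)                        # global key PK = g^{ab} mod N^2
assert PK == pow(pk_cp, b, N2)
assert dec(a, enc(pk_cp, 5)) == 5              # single-key Enc/Dec round trip

def pdec1(c):                                  # PDec1 with sk_cp = a
    T, Tp = c
    return T, pow(Tp, a, N2)

def pdec2(c):                                  # PDec2 with sk_csp = b -> plaintext
    T, T1 = c
    return L_func((T * pow(T1, -b, N2)) % N2)

m1, m2, k = 37, 55, 7
c1, c2 = enc(PK, m1), enc(PK, m2)
assert pdec2(pdec1(c1)) == m1                                  # joint decryption
c_sum = (c1[0] * c2[0] % N2, c1[1] * c2[1] % N2)               # property (1)
assert pdec2(pdec1(c_sum)) == m1 + m2
c_km = (pow(c1[0], k, N2), pow(c1[1], k, N2))                  # property (2)
assert pdec2(pdec1(c_km)) == k * m1
c_neg = (pow(c1[0], N - 1, N2), pow(c1[1], N - 1, N2))         # property (3)
assert pdec2(pdec1(c_neg)) == N - m1                           # [[m]]^{N-1} = [[N-m]]
```

Because the trapdoor of PK is split between a and b, neither CP nor CSP can recover a plaintext alone, which matches the statement that PK-encrypted messages can only be decrypted by CP and CSP jointly. (Requires Python 3.8+ for pow with a negative exponent.)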
5. The homomorphic encryption-based secure transfer learning system according to claim 2, wherein the training process of the TrAdaboost classifier is as follows:
the training of TrAdaboost is an R-round iterative process; preparation and preprocessing are carried out first, and then each round of TrAdaboost iterative training consists of four sub-blocks, including sample weight vector normalization, prediction error calculation, weight adjustment rate calculation and sample weight update, specifically as follows:
s1, algorithm preparation and preprocessing
First, SDO and TDO submit their respective encrypted data sets to the CP; suppose that SDO owns the source training data set D_S = {(x_1, y_1), ..., (x_n, y_n)} and TDO owns the target training data set D_T = {(x_{n+1}, y_{n+1}), ..., (x_{n+m}, y_{n+m})}; at the SDO and TDO ends, each value contained in the feature vectors of the data sets and the corresponding label values are first multiplied by a scaling factor L, specifically: x_{ij} ← L · x_{ij} (1 ≤ j ≤ d) and y_i ← L · y_i are executed, wherein 1 ≤ i ≤ n+m and d denotes the dimension of the feature vector; after encryption with the system global public key PK, SDO and TDO send the respective encrypted data sets to the CP, i.e., [[D_S]]_PK = {([[x_1]]_PK, [[y_1]]_PK), ..., ([[x_n]]_PK, [[y_n]]_PK)} and [[D_T]]_PK = {([[x_{n+1}]]_PK, [[y_{n+1}]]_PK), ..., ([[x_{n+m}]]_PK, [[y_{n+m}]]_PK)}, wherein [[x_i]]_PK = ([[x_{i1}]]_PK, ..., [[x_{id}]]_PK) and 1 ≤ i ≤ n+m; for simplicity of presentation, [[·]]_PK is written as [[·]]; in addition, the sizes of the source training data set and the target training data set, namely n and m, are also respectively sent to the CP for storage; upon receiving [[D_S]] and [[D_T]], the CP merges them into the joint training data set [[D]] = {([[x_1, y_1]]), ..., ([[x_{n+m}, y_{n+m}]])};
subsequently, the CP initializes the sample weights of the joint training data set; the initial weight values w_i^1 are determined by the sizes of the source and target training data sets, respectively; the CP sets the initial weight of each source sample to 1/n and the initial weight of each target sample to 1/m; then the encrypted sample weight vector [[w^1]] = ([[w_1^1]], ..., [[w_{n+m}^1]]) is calculated;
S2 sample weight vector normalization
in the t-th round of iterative training, the encrypted normalized sample weight vector [[p^t]] is obtained; first, the CP computes the sum of all weight values over the ciphertext domain: [[Σ^t]] = [[Σ_{i=1}^{n+m} w_i^t]] = ∏_{i=1}^{n+m} [[w_i^t]]; then [[p^t]] is obtained by calling the secure division protocol SDiv n+m times: [[p_i^t]] = SDiv([[w_i^t]], [[Σ^t]]), 1 ≤ i ≤ n+m;
S3, calculating prediction error
suppose the weak classifier [[h_t]] of the t-th round has been trained; its prediction on an encrypted sample [[x]] is denoted [[h_t(x)]], and it satisfies h_t(x) ∈ {0, 1}; the algorithm aims at calculating the weighted prediction error of [[h_t]] on D_T; it is known that |h_t(x_i) − y_i| equals the exclusive-or of h_t(x_i) and y_i; therefore, |h_t(x_i) − y_i| is first calculated using the secure exclusive-or protocol Sxor; since the sample weight update phase needs the error values of [[h_t]] on the source and target samples, the encrypted prediction errors are calculated for both the source and target samples in this step and the results are stored at the CP; the secure exclusive-or protocol Sxor is implemented as follows: for two encrypted numbers [[m_1]] and [[m_2]] with m_1, m_2 ∈ {0, 1}, the ciphertext exclusive-or operation is realized to obtain the encrypted bit exclusive-or result [[u]]; if m_1 = m_2, then u = 0; otherwise, u = 1;
[[|h_t(x_i) − y_i|]] = Sxor([[h_t(x_i)]], [[y_i]]), 1 ≤ i ≤ n+m;
next, the encrypted prediction error rate [[ε_t]] is calculated by the secure multiplication protocol SMul and the addition operation:
[[e_i]] = SMul([[p_i^t]], [[|h_t(x_i) − y_i|]]), n+1 ≤ i ≤ n+m;
[[ε_t]] = ∏_{i=n+1}^{n+m} [[e_i]] = [[Σ_{i=n+1}^{n+m} p_i^t · |h_t(x_i) − y_i|]];
S4, calculating weight adjustment rate
the weight adjustment rate controls the degree of the sample weight update; the adjustment rate β of the source training sample weights is constant in each iteration; therefore, the value of β only needs to be calculated once during the whole TrAdaboost training process, and since the operands involved in the calculation are all public, it can be calculated in the plaintext domain: β = 1/(1 + √(2·ln n / R)), wherein R is the preset number of training iterations;
before calculating the sample weight adjustment rate of D_T, i.e., β_t = ε_t/(1 − ε_t) = −1 + 1/(1 − ε_t), the algorithm performs the calculation under three special conditions according to the different values of ε_t; first, whether the conditions ε_t ≥ 1/2, ε_t = 1 or ε_t = 0 hold is determined by the following calculations; if ex_1 = 1, the condition ε_t ≥ 1/2 holds, otherwise ε_t < 1/2; if ex_2 = 1, the condition ε_t = 0 holds; if ex_3 = 1, the condition ε_t = 1 holds;
[[ex_1]] = SGE([[ε_t]], [[1/2]]);
[[ex_2]] = SETest([[ε_t]]);
[[ex_3]] = SETest([[1]] · [[ε_t]]^{N−1}) = SETest([[1 − ε_t]]);
wherein SGE is the greater-than-or-equal secure comparison protocol and SETest is the secure ciphertext-plaintext equality test protocol; the secure ciphertext-plaintext equality test protocol SETest is implemented as follows: for a ciphertext [[m]], it tests the equality relationship between m and a plaintext k, wherein m is a real number in [0, 1] and k = 0, and it outputs an encrypted indicator [[u]]; if u = 1, it indicates that m = 0; if u = 0, it indicates that m ≠ 0;
next, the algorithm calculates the candidate value of [[β_t]] = [[−1 + 1/(1 − ε_t)]] by invoking the secure reciprocal protocol SRec on [[1 − ε_t]] · [[ex_3]] = [[1 − ε_t + ex_3]], wherein incorporating the indicator ex_3 avoids a zero denominator when ε_t = 1; SRec is the secure reciprocal protocol;
discussion of the related Art
Figure FDA0003581598910000053
Different value results of (a): if e is t When 1, there is ex 2 1 and
Figure FDA0003581598910000054
thus, it is possible to provide
Figure FDA0003581598910000055
If e is equal to t Not equal to 1, then there is ex 2 0 and
Figure FDA0003581598910000056
thus, it is possible to provide
Figure FDA0003581598910000057
suppose that when ε_t ≥ 1/2 or ε_t = 1, ε_t = 0, β_t is set directly to the constant c_1 or c_2, c_3, respectively; finally, [[β_t]] is calculated by combining the candidate value and the constants according to the encrypted indicators, wherein
[[S′_3]] = SMul([[1]] · [[ex_1]]^{N−1}, [[1]] · [[ex_2]]^{N−1});
[[S″_3]] = SMul([[S′_3]], [[1]] · [[ex_3]]^{N−1});
that is, [[S″_3]] = [[(1 − ex_1)(1 − ex_2)(1 − ex_3)]] selects the computed value −1 + 1/(1 − ε_t) only when none of the three special conditions holds, while the indicators ex_1, ex_2 and ex_3 select the corresponding constants c_1, c_2 and c_3 otherwise;
S5, sample weight updating
it is known that the strategies for updating the weight values of the source data samples and the target data samples are different; in addition, the weight w_i^t of a sample x_i needs to be updated only when x_i is misclassified; since the ciphertexts of the prediction errors [[|h_t(x_i) − y_i|]], 1 ≤ i ≤ n+m, have been calculated, the algorithm only needs to test over the encrypted domain whether [[|h_t(x_i) − y_i|]] equals 0 by calling the SETest protocol:
[[s_i]] = SETest([[|h_t(x_i) − y_i|]]);
the sample weight vector is then updated by the following strategy:
for i = 1, ..., n: w_i^{t+1} = w_i^t · β^{|h_t(x_i) − y_i|};
for i = n+1, ..., n+m: w_i^{t+1} = w_i^t · β_t^{−|h_t(x_i) − y_i|};
over the ciphertext domain, the indicator [[s_i]] keeps the weight unchanged for correctly classified samples and applies the corresponding adjustment rate for misclassified samples.
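For reference, the plaintext functionality of the secure sub-protocols named in this claim can be sketched as follows; each protocol computes the corresponding function below, but over HRES ciphertexts through interaction between CP and CSP, and the signatures here are illustrative only.

```python
# Plaintext functionality reference for the sub-protocols of claim 5. The real
# protocols take ciphertexts [[.]] as inputs/outputs and are run jointly by CP
# and CSP; only the computed functions are shown here.
def SDiv(a: float, b: float) -> float:      # secure division
    return a / b

def Sxor(m1: int, m2: int) -> int:          # secure exclusive-or, m1, m2 in {0, 1}
    return m1 ^ m2

def SMul(a: float, b: float) -> float:      # secure multiplication
    return a * b

def SGE(a: float, b: float) -> int:         # secure >= comparison: 1 iff a >= b
    return 1 if a >= b else 0

def SETest(m: float) -> int:                # secure equality-to-zero test: 1 iff m == 0
    return 1 if m == 0 else 0

def SRec(x: float) -> float:                # secure reciprocal
    return 1.0 / x

# e.g. the three indicators of step S4 for an error rate eps_t:
eps_t = 0.3
ex1, ex2, ex3 = SGE(eps_t, 0.5), SETest(eps_t), SETest(1 - eps_t)
assert (ex1, ex2, ex3) == (0, 0, 0)         # no special case: beta_t = eps_t/(1-eps_t)
```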
6. The homomorphic encryption-based secure transfer learning system according to claim 5, wherein the TrAdaboost classifier implements the encrypted prediction as follows:
when a requesting user RU_i requests the classification label of an unlabeled sample x from the target sample space, it first completes registration on the system and obtains a unique public/private key pair (pk_{u_i}, sk_{u_i}) and the global public key PK of the system; to prevent leakage of its private sample data, the requesting user RU_i encrypts the request sample with PK as [[x]] and transmits the request data packet to the CP;
after receiving the request data from RU_i, the CP and the CSP jointly perform the privacy-preserving TrAdaboost prediction on [[x]] with the encrypted weak classifiers {[[h_t]]} and their influence factors, wherein t = ⌈R/2⌉, ..., R; specifically, the CP and CSP first alternately perform the encrypted weak classifier computations on [[x]] and obtain {[[h_t(x)]]}, t = ⌈R/2⌉, ..., R;
the weighted prediction result is then calculated as
[[ln β_t]] = SNLog([[β_t]]);
[[h_t(x) · ln β_t]] = SMul([[h_t(x)]], [[ln β_t]]), t = ⌈R/2⌉, ..., R;
wherein SMul is the secure multiplication protocol and SNLog is the secure natural logarithm protocol; the secure natural logarithm protocol SNLog is implemented as follows: it takes a ciphertext as input and outputs the encrypted result of its natural logarithm operation; the input value range of the logarithm operations involved in the system is (0, 1], so the protocol input only needs to satisfy this range;
subsequently, the cloud servers calculate two decision parameters, i.e., [[left]] and [[right]], wherein:
[[left]] = [[Σ_{t=⌈R/2⌉}^{R} h_t(x) · ln β_t]];
[[right]] = [[(1/2) · Σ_{t=⌈R/2⌉}^{R} ln β_t]];
next, the cloud servers run the greater-than-or-equal secure comparison protocol SGE to compare [[left]] and [[right]]; if right ≥ left, the final classification result [[h_f(x)]] is set to [[1]]; otherwise, [[h_f(x)]] is set to [[0]];
before returning the prediction result to the requesting user RU_i, the CP and CSP re-encrypt the classification result [[h_f(x)]] based on the public key pk_{u_i} of RU_i; after receiving the re-encrypted result, the requesting user RU_i recovers the plaintext of the result using its own private key sk_{u_i}.
CN202110134461.8A 2021-02-01 2021-02-01 Secure transfer learning system based on homomorphic encryption Active CN112822005B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110134461.8A CN112822005B (en) 2021-02-01 2021-02-01 Secure transfer learning system based on homomorphic encryption

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110134461.8A CN112822005B (en) 2021-02-01 2021-02-01 Secure transfer learning system based on homomorphic encryption

Publications (2)

Publication Number Publication Date
CN112822005A CN112822005A (en) 2021-05-18
CN112822005B true CN112822005B (en) 2022-08-12

Family

ID=75860845

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110134461.8A Active CN112822005B (en) 2021-02-01 2021-02-01 Secure transfer learning system based on homomorphic encryption

Country Status (1)

Country Link
CN (1) CN112822005B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113032838B (en) * 2021-05-24 2021-10-29 易商征信有限公司 Label prediction model generation method, prediction method, model generation device, system and medium based on privacy calculation
CN113421122A (en) * 2021-06-25 2021-09-21 创络(上海)数据科技有限公司 First-purchase user refined loss prediction method under improved transfer learning framework
CN113472805B (en) * 2021-07-14 2022-11-18 中国银行股份有限公司 Model training method and device, storage medium and electronic equipment
CN113938266B (en) * 2021-09-18 2024-03-26 桂林电子科技大学 Junk mail filter training method and system based on integer vector homomorphic encryption
CN113783898B (en) * 2021-11-12 2022-06-10 湖南大学 Renewable hybrid encryption method
CN114219306B (en) * 2021-12-16 2022-11-15 蕴硕物联技术(上海)有限公司 Method, apparatus, medium for establishing welding quality detection model
CN114915399A (en) * 2022-05-11 2022-08-16 国网福建省电力有限公司 Energy big data security system based on homomorphic encryption
CN115051816B (en) * 2022-08-17 2022-11-08 北京锘崴信息科技有限公司 Privacy protection-based cloud computing method and device and financial data cloud computing method and device
CN116402505B (en) * 2023-05-11 2023-09-01 蓝象智联(杭州)科技有限公司 Homomorphic encryption-based graph diffusion method, homomorphic encryption-based graph diffusion device and storage medium
CN117290659B (en) * 2023-11-24 2024-04-02 华信咨询设计研究院有限公司 Data tracing method based on regression analysis


Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103036884B (en) * 2012-12-14 2015-09-16 中国科学院上海微系统与信息技术研究所 A kind of data guard method based on homomorphic cryptography and system
CN105488422B (en) * 2015-11-19 2019-01-11 上海交通大学 Editing distance computing system based on homomorphic cryptography private data guard

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108259158A (en) * 2018-01-11 2018-07-06 西安电子科技大学 Efficient and secret protection individual layer perceptron learning method under a kind of cloud computing environment
CN109255444A (en) * 2018-08-10 2019-01-22 深圳前海微众银行股份有限公司 Federal modeling method, equipment and readable storage medium storing program for executing based on transfer learning
CN110008717A (en) * 2019-02-26 2019-07-12 东北大学 Support the decision tree classification service system and method for secret protection

Also Published As

Publication number Publication date
CN112822005A (en) 2021-05-18

Similar Documents

Publication Publication Date Title
CN112822005B (en) Secure transfer learning system based on homomorphic encryption
Ding et al. Encrypted data processing with homomorphic re-encryption
Xu et al. Privacy-preserving federated deep learning with irregular users
Mandal et al. PrivFL: Practical privacy-preserving federated regressions on high-dimensional data over mobile networks
CN108712260B (en) Multi-party deep learning computing agent method for protecting privacy in cloud environment
Li et al. Outsourced privacy-preserving classification service over encrypted data
Liu et al. An efficient privacy-preserving outsourced calculation toolkit with multiple keys
US11165558B2 (en) Secured computing
González-Serrano et al. Training support vector machines with privacy-protected data
US20160020898A1 (en) Privacy-preserving ridge regression
Barbosa et al. Labeled homomorphic encryption: scalable and privacy-preserving processing of outsourced data
CN109992979A (en) A kind of ridge regression training method calculates equipment, medium
CN111581648B (en) Method of federal learning to preserve privacy in irregular users
CN113434898A (en) Non-interactive privacy protection logistic regression federal training method and system
Wang et al. Privacy preserving computations over healthcare data
Zhang et al. Privacy-preserving multikey computing framework for encrypted data in the cloud
Zhu et al. Enhanced federated learning for edge data security in intelligent transportation systems
Liu et al. DHSA: efficient doubly homomorphic secure aggregation for cross-silo federated learning
CN111159727B (en) Multi-party cooperation oriented Bayes classifier safety generation system and method
Zhao et al. A privacy preserving homomorphic computing toolkit for predictive computation
Chen et al. Cryptanalysis and improvement of DeepPAR: Privacy-preserving and asynchronous deep learning for industrial IoT
Liu et al. Efficient and Privacy-Preserving Logistic Regression Scheme based on Leveled Fully Homomorphic Encryption
Wang et al. DPP: Data Privacy-Preserving for Cloud Computing based on Homomorphic Encryption
CN113114454B (en) Efficient privacy outsourcing k-means clustering method
Wang et al. Secure outsourced calculations with homomorphic encryption

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant