CN109902742B - Sample completion method, terminal, system and medium based on encrypted transfer learning - Google Patents


Info

Publication number: CN109902742B (granted); earlier publication CN109902742A
Application number: CN201910153223.4A
Authority: CN (China)
Original language: Chinese (zh)
Legal status: Active
Prior art keywords: feature, completion, terminal, encryption, sample
Inventors: 刘洋 (Liu Yang), 康焱 (Kang Yan), 陈天健 (Chen Tianjian), 杨强 (Yang Qiang)
Assignee (original and current): WeBank Co Ltd
Prosecution history: application filed by WeBank Co Ltd with priority to CN201910153223.4A; published as CN109902742A; application granted and published as CN109902742B.

Landscapes

  • Storage Device Security (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a sample completion method, terminal, system and medium based on encrypted transfer learning. The method comprises the following steps: determining the feature intersection of the two parties' samples, training the first sample based on the intersection to obtain a first feature model, and encrypting the model and sending it to a second terminal; receiving a second encrypted feature model sent by the second terminal, and predicting first encrypted completion features for the features missing from the first sample according to that model; operating on the first encrypted completion features according to a first preset operation rule to obtain first completion features; receiving a second encrypted label model sent by the second terminal, and predicting a first encrypted completion label for the label missing from the first sample based on that model, the initial features of the first sample and the first completion features; and operating on the first encrypted completion label according to a third preset operation rule to obtain the first completion label. The invention completes the features and labels of each party's sample data while ensuring that no party's private data is disclosed.

Description

Sample completion method, terminal, system and medium based on encrypted transfer learning
Technical Field
The invention relates to the technical field of data processing, and in particular to a sample completion method, terminal, system and medium based on encrypted transfer learning.
Background
In the field of artificial intelligence, the traditional data processing mode is for one party to collect data, transfer it to another party for processing, cleaning and modeling, and finally sell the resulting model to a third party. However, as regulations become more complete and supervision more stringent, an operator may violate the law if the data leaves its collector or if users are unaware of the specific use of the model. Data thus exists in isolated islands. The direct way to bridge these islands would be to aggregate the data in one party for processing, but since the law no longer permits operators to aggregate data crudely, doing so is now likely to be illegal.
To resolve this dilemma, distributed machine learning algorithms have been proposed, but they often cannot be used because part of the data lacks features or labels. For example, a horizontal federated learning algorithm usually requires every participant to have the same feature dimensions, so features not owned by all participants generally have to be discarded; in distributed machine learning based on supervised learning, the samples of all participants must be labeled, so unlabeled data must likewise be discarded. These situations waste a great deal of data and make the sample data used for training unevenly distributed, which reduces the generalization ability of the trained model.
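The discarded-data problem described above can be made concrete with a small sketch (the column names are invented for illustration): a plain horizontal scheme can use only the intersection of the two parties' feature sets, and everything outside it is wasted.

```python
# Hypothetical feature sets held by two participants.
party_a = {"age", "income", "credit_score"}
party_b = {"age", "income", "purchase_history"}

# A naive horizontal federated scheme trains only on the shared columns...
shared = party_a & party_b
# ...and the non-shared columns would simply be discarded.
discarded = (party_a | party_b) - shared

print(sorted(shared))     # ['age', 'income']
print(sorted(discarded))  # ['credit_score', 'purchase_history']
```

Sample completion instead predicts the missing columns, so that no party has to throw data away.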
Therefore, when the participants in distributed machine learning come from different organizations, how to complete each party's features and labels while guaranteeing every party's data security and privacy is a problem that urgently needs to be solved.
Disclosure of Invention
The main object of the present invention is to provide a sample completion method, terminal, system and medium based on encrypted transfer learning, aiming to solve the technical problem that existing distributed machine learning algorithms waste data when samples have missing features or missing labels.
To achieve the above object, the present invention provides a sample completion method based on encrypted transfer learning, applied to a first terminal and comprising the following steps:
determining a feature intersection of a first sample of the first terminal and a second sample of a second terminal, training the initial features of the first sample based on the feature intersection to obtain a first feature model, and encrypting the first feature model and sending it to the second terminal, so that the second terminal predicts the features missing from the second sample to obtain second encrypted completion features and operates on the second encrypted completion features according to a second preset operation rule to obtain second completion features;
receiving a second encrypted feature model sent by the second terminal, and predicting the features missing from the first sample according to the second encrypted feature model to obtain first encrypted completion features, wherein the second encrypted feature model is obtained by the second terminal training the initial features of the second sample based on the feature intersection;
operating on the first encrypted completion features according to a first preset operation rule to obtain first completion features;
receiving a second encrypted label model sent by the second terminal, and predicting the label missing from the first sample based on the second encrypted label model, the initial features of the first sample and the first completion features to obtain a first encrypted completion label, wherein the second encrypted label model is obtained by the second terminal training on the initial features of the second sample, the second completion features and the initial label of the second sample;
and operating on the first encrypted completion label according to a third preset operation rule to obtain a first completion label.
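Setting the encryption aside for a moment, the feature-completion idea behind these steps — each party trains a model from the shared (intersection) features to its exclusive features, and the other party uses that model to predict its own missing columns — can be sketched in plain numpy. All data, shapes and the least-squares model class here are assumptions for illustration; the patent does not fix the model type:

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared (intersection) features, which both parties hold for their own samples.
shared_a = rng.normal(size=(100, 3))   # party A's samples, shared columns
shared_b = rng.normal(size=(80, 3))    # party B's samples, shared columns

# Party B also holds an exclusive feature column that party A is missing.
true_w = np.array([1.0, -2.0, 0.5])
exclusive_b = shared_b @ true_w        # B's exclusive column (noise-free toy)

# B trains a "feature model" mapping shared features to its exclusive feature
# (least squares stands in for the unspecified model class).
w_hat, *_ = np.linalg.lstsq(shared_b, exclusive_b, rcond=None)

# A applies B's model to predict (complete) its own missing column.
completed_a = shared_a @ w_hat
print(completed_a.shape)               # (100,)
```

In the actual protocol the model is exchanged in encrypted form, so the prediction is produced as an encrypted completion feature rather than in the clear.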
Optionally, the step of operating on the first encrypted completion features according to a first preset operation rule to obtain first completion features comprises:
adding a first random mask to the first encrypted completion features to obtain first encrypted mask features, and sending the first encrypted mask features to the second terminal so that the second terminal decrypts them to obtain first mask features;
and on receiving the first mask features sent by the second terminal, subtracting the first random mask from the first mask features to obtain the first completion features.
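The first preset operation rule above is a standard mask-and-decrypt trick: it works when the completion features are encrypted under an additively homomorphic scheme whose key the second terminal holds, because the first terminal can add a random mask inside the ciphertext, and the decrypting party then sees only the masked value. A toy sketch with textbook Paillier (tiny fixed primes, not secure, purely illustrative):

```python
import math
import random

# Textbook Paillier with tiny fixed primes -- illustration only, NOT secure.
p, q = 100003, 100019
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)                      # valid because we use g = n + 1

def enc(m):
    r = random.randrange(1, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def dec(c):
    return ((pow(c, lam, n2) - 1) // n * mu) % n

def add(c1, c2):                          # Enc(a) * Enc(b) = Enc(a + b)
    return (c1 * c2) % n2

# Terminal 1 holds an encrypted completion feature it cannot decrypt;
# terminal 2 holds the key. Terminal 1 masks the value before asking
# terminal 2 to decrypt it.
feature = 42
c_feature = enc(feature)                  # what terminal 1 received
mask = random.randrange(1, 10**6)         # the first random mask
c_masked = add(c_feature, enc(mask))      # first encrypted mask feature

masked_plain = dec(c_masked)              # terminal 2 sees only 42 + mask
completion = (masked_plain - mask) % n    # terminal 1 removes the mask
assert completion == feature
```

The same pattern, with a third random mask, underlies the label operation rule described below.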
Optionally, the step of operating on the first encrypted completion label according to a third preset operation rule to obtain a first completion label comprises:
adding a third random mask to the first encrypted completion label to obtain a third encrypted mask label, and sending the third encrypted mask label to the second terminal so that the second terminal decrypts it to obtain a third mask label;
and on receiving the third mask label sent by the second terminal, subtracting the third random mask from the third mask label to obtain the first completion label.
Optionally, the step of determining the feature intersection, training and encrypting the first feature model, sending it to the second terminal so that the second terminal predicts the features missing from the second sample to obtain second encrypted completion features, and operating on the second encrypted completion features according to a second preset operation rule to obtain second completion features comprises:
determining a feature intersection of a first sample of the first terminal and a second sample of a second terminal, training the initial features of the first sample based on the feature intersection to obtain a first feature model, and encrypting the first feature model and sending it to the second terminal, so that the second terminal predicts the features missing from the second sample to obtain second encrypted completion features, adds a second random mask to the second encrypted completion features to obtain second encrypted mask features, and sends the second encrypted mask features to the first terminal for the first terminal to decrypt into second mask features; on receiving the second mask features returned by the first terminal, the second terminal subtracts the second random mask from them to obtain the second completion features.
The invention further provides a sample completion method based on encrypted transfer learning, applied to a second terminal and comprising the following steps:
the second terminal receives an encrypted first feature model sent by the first terminal, predicts the features missing from a second sample according to the encrypted first feature model to obtain second encrypted completion features, and operates on the second encrypted completion features according to a second preset operation rule to obtain second completion features, wherein the first feature model is obtained by the first terminal determining the feature intersection of the first sample of the first terminal and the second sample of the second terminal and training the initial features of the first sample based on the feature intersection;
training the initial features of the second sample based on the feature intersection to obtain a second encrypted feature model, and sending the second encrypted feature model to the first terminal, so that the first terminal predicts the features missing from the first sample according to the second encrypted feature model to obtain first encrypted completion features and operates on them according to a first preset operation rule to obtain first completion features;
and training on the initial features of the second sample, the second completion features and the initial label of the second sample to obtain a second encrypted label model, and sending the second encrypted label model to the first terminal, so that the first terminal predicts the label missing from the first sample based on the second encrypted label model, the initial features of the first sample and the first completion features to obtain a first encrypted completion label and operates on it according to a third preset operation rule to obtain the first completion label.
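The label-completion step can likewise be sketched in the clear. The logistic-regression model, the data and all shapes below are invented stand-ins — the patent leaves the label model class unspecified — but the flow matches the claim: the second terminal trains on its initial features, completed features and labels, and the first terminal applies that model to its own stacked features:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1 / (1 + np.exp(-np.clip(z, -30, 30)))

# Party B's design matrix: its initial features stacked with its completed features.
initial_b = rng.normal(size=(200, 2))
completed_b = rng.normal(size=(200, 1))
X_b = np.hstack([initial_b, completed_b])
y_b = (X_b @ np.array([1.0, -1.0, 2.0]) > 0).astype(float)   # B's initial labels (toy)

# B trains a "label model" -- logistic regression by gradient descent here,
# purely as a stand-in for the unspecified model class.
w = np.zeros(3)
for _ in range(2000):
    w -= 0.1 * X_b.T @ (sigmoid(X_b @ w) - y_b) / len(y_b)

# Party A stacks its initial and completed features the same way and uses
# B's model to predict (complete) its own missing labels.
X_a = np.hstack([rng.normal(size=(50, 2)), rng.normal(size=(50, 1))])
y_a_completed = (sigmoid(X_a @ w) > 0.5).astype(float)
print(y_a_completed.shape)   # (50,)
```

In the protocol itself the label model travels encrypted, and the predictions come back as a first encrypted completion label that still has to be unmasked.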
Optionally, the step of operating on the second encrypted completion features according to a second preset operation rule to obtain second completion features comprises:
adding a second random mask to the second encrypted completion features to obtain second encrypted mask features, and sending the second encrypted mask features to the first terminal so that the first terminal decrypts them to obtain second mask features;
and on receiving the second mask features sent by the first terminal, subtracting the second random mask from them to obtain the second completion features.
Optionally, the step of training the second encrypted feature model, sending it to the first terminal so that the first terminal predicts the features missing from the first sample to obtain first encrypted completion features, and operating on the first encrypted completion features according to a first preset operation rule to obtain first completion features comprises:
training the initial features of the second sample based on the feature intersection to obtain a second encrypted feature model, and sending the second encrypted feature model to the first terminal, so that the first terminal predicts the features missing from the first sample according to the second encrypted feature model to obtain first encrypted completion features, adds a first random mask to them to obtain first encrypted mask features, and sends the first encrypted mask features to the second terminal for the second terminal to decrypt into first mask features; on receiving the first mask features returned by the second terminal, the first terminal subtracts the first random mask from them to obtain the first completion features.
Optionally, the step of training the second encrypted label model, sending it to the first terminal so that the first terminal predicts the label missing from the first sample to obtain a first encrypted completion label, and operating on the first encrypted completion label according to a third preset operation rule to obtain a first completion label comprises:
training on the initial features of the second sample, the second completion features and the initial label of the second sample to obtain a second encrypted label model, and sending the second encrypted label model to the first terminal, so that the first terminal predicts the label missing from the first sample based on the second encrypted label model, the initial features of the first sample and the first completion features to obtain a first encrypted completion label, adds a third random mask to it to obtain a third encrypted mask label, and sends the third encrypted mask label to the second terminal for the second terminal to decrypt into a third mask label; on receiving the third mask label returned by the second terminal, the first terminal subtracts the third random mask from it to obtain the first completion label.
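For a linear label model, the first terminal can even evaluate the second terminal's encrypted model without ever seeing its weights: under Paillier, raising a ciphertext to a plaintext power multiplies the underlying plaintext, so an encrypted dot product is possible. The integer weights and features below are illustrative assumptions (real systems would use a fixed-point encoding), and the tiny-prime Paillier is again not secure:

```python
import math
import random

# Textbook Paillier, tiny fixed primes -- illustration only, NOT secure.
p, q = 100003, 100019
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)

def enc(m):
    return (pow(n + 1, m % n, n2) * pow(random.randrange(1, n), n, n2)) % n2

def dec(c):
    return ((pow(c, lam, n2) - 1) // n * mu) % n

# Terminal 2's label model: integer weights, sent to terminal 1 encrypted.
weights = [3, -1, 2]
enc_w = [enc(w) for w in weights]

# Terminal 1 evaluates the linear model on its own sample in ciphertext:
# Enc(w)^x = Enc(w * x), and multiplying ciphertexts adds plaintexts.
features = [5, 4, 1]                       # terminal 1's (integer) sample
c_score = 1
for cw, x in zip(enc_w, features):
    c_score = (c_score * pow(cw, x, n2)) % n2

# Mask, then let terminal 2 decrypt -- it sees only the masked score.
mask = random.randrange(1, 10**6)
masked = dec((c_score * enc(mask)) % n2)
score = (masked - mask) % n
print(score)   # 3*5 + (-1)*4 + 2*1 = 13
```

The masked decryption at the end is exactly the third preset operation rule: the key holder decrypts, but only the first terminal can remove the mask.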
In addition, to achieve the above object, the present invention further provides a terminal, the terminal being a first terminal and comprising: a memory, a processor, and a sample completion program based on encrypted transfer learning that is stored in the memory and executable on the processor, wherein the sample completion program, when executed by the processor, implements the steps of the sample completion method based on encrypted transfer learning described above.
The present invention further provides a terminal, the terminal being a second terminal and comprising: a memory, a processor, and a sample completion program based on encrypted transfer learning that is stored in the memory and executable on the processor, wherein the sample completion program, when executed by the processor, implements the steps of the sample completion method based on encrypted transfer learning described above.
The invention also provides a sample completion system based on encrypted transfer learning, the system comprising at least one first terminal and at least one second terminal as described above.
In addition, to achieve the above object, the present invention further provides a computer storage medium storing a sample completion program based on encrypted transfer learning, which, when executed by a processor, implements the steps of the sample completion method based on encrypted transfer learning described above.
In the method, the first terminal determines a feature intersection of its first sample and a second sample of a second terminal, trains the initial features of the first sample based on the feature intersection to obtain a first feature model, and encrypts the first feature model and sends it to the second terminal, so that the second terminal predicts the features missing from the second sample to obtain second encrypted completion features and operates on them according to a second preset operation rule to obtain second completion features; the first terminal receives a second encrypted feature model sent by the second terminal and predicts the features missing from the first sample according to it to obtain first encrypted completion features, the second encrypted feature model being obtained by the second terminal training the initial features of the second sample based on the feature intersection; the first terminal operates on the first encrypted completion features according to a first preset operation rule to obtain first completion features; the first terminal receives a second encrypted label model sent by the second terminal and predicts the label missing from the first sample based on it, the initial features of the first sample and the first completion features to obtain a first encrypted completion label, the second encrypted label model being obtained by the second terminal training on the initial features of the second sample, the second completion features and the initial label of the second sample; and the first terminal operates on the first encrypted completion label according to a third preset operation rule to obtain a first completion label.
The invention thus completes the features and labels of each party's sample data while ensuring that no party's private data is disclosed.
Drawings
FIG. 1 is a schematic diagram of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a first embodiment of the sample completion method based on encrypted transfer learning according to the present invention;
FIG. 3 is a schematic diagram of a second embodiment of the sample completion method based on encrypted transfer learning according to the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, fig. 1 is a schematic terminal structure diagram of a hardware operating environment according to an embodiment of the present invention.
It should be noted that, the terminal in the embodiment of the present invention may be a terminal device such as a smart phone, a personal computer, and a server, and is not limited herein.
As shown in FIG. 1, the terminal may include: a processor 1001 (e.g., a CPU), a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002, wherein the communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display (Display) and an input unit such as a keyboard (Keyboard), and optionally may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory), and may optionally be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the terminal structure shown in FIG. 1 does not constitute a limitation of the terminal, which may include more or fewer components than illustrated, combine some components, or arrange the components differently.
As shown in FIG. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a sample completion program based on encrypted transfer learning. The operating system is a program that manages and controls the hardware and software resources of the terminal and supports the running of the sample completion program and other software or programs.
In the terminal shown in FIG. 1, the user interface 1003 is mainly used for data communication with each terminal; the network interface 1004 is mainly used for connecting to and communicating with a background server; and the processor 1001 may be configured to call the sample completion program based on encrypted transfer learning stored in the memory 1005 and perform the following operations:
determining a feature intersection of a first sample of the first terminal and a second sample of a second terminal, training the initial features of the first sample based on the feature intersection to obtain a first feature model, and encrypting the first feature model and sending it to the second terminal, so that the second terminal predicts the features missing from the second sample to obtain second encrypted completion features and operates on the second encrypted completion features according to a second preset operation rule to obtain second completion features;
receiving a second encrypted feature model sent by the second terminal, and predicting the features missing from the first sample according to the second encrypted feature model to obtain first encrypted completion features, wherein the second encrypted feature model is obtained by the second terminal training the initial features of the second sample based on the feature intersection;
operating on the first encrypted completion features according to a first preset operation rule to obtain first completion features;
receiving a second encrypted label model sent by the second terminal, and predicting the label missing from the first sample based on the second encrypted label model, the initial features of the first sample and the first completion features to obtain a first encrypted completion label, wherein the second encrypted label model is obtained by the second terminal training on the initial features of the second sample, the second completion features and the initial label of the second sample;
and operating on the first encrypted completion label according to a third preset operation rule to obtain a first completion label.
Further, the processor 1001 may be configured to call the sample completion program based on encrypted transfer learning stored in the memory 1005 and perform the following steps:
adding a first random mask to the first encrypted completion features to obtain first encrypted mask features, and sending the first encrypted mask features to the second terminal so that the second terminal decrypts them to obtain first mask features;
and on receiving the first mask features sent by the second terminal, subtracting the first random mask from the first mask features to obtain the first completion features.
Further, the processor 1001 may be configured to call the sample completion program based on encrypted transfer learning stored in the memory 1005 and perform the following steps:
adding a third random mask to the first encrypted completion label to obtain a third encrypted mask label, and sending the third encrypted mask label to the second terminal so that the second terminal decrypts it to obtain a third mask label;
and on receiving the third mask label sent by the second terminal, subtracting the third random mask from the third mask label to obtain the first completion label.
Further, the processor 1001 may be configured to call the sample completion program based on encrypted transfer learning stored in the memory 1005 and perform the following steps:
determining a feature intersection of a first sample of the first terminal and a second sample of a second terminal, training the initial features of the first sample based on the feature intersection to obtain a first feature model, and encrypting the first feature model and sending it to the second terminal, so that the second terminal predicts the features missing from the second sample to obtain second encrypted completion features, adds a second random mask to the second encrypted completion features to obtain second encrypted mask features, and sends the second encrypted mask features to the first terminal for the first terminal to decrypt into second mask features; on receiving the second mask features returned by the first terminal, the second terminal subtracts the second random mask from them to obtain the second completion features.
Further, the processor 1001 may be configured to call the sample completion program based on encrypted transfer learning stored in the memory 1005 and perform the following steps:
the second terminal receives an encrypted first feature model sent by the first terminal, predicts the features missing from a second sample according to the encrypted first feature model to obtain second encrypted completion features, and operates on the second encrypted completion features according to a second preset operation rule to obtain second completion features, wherein the first feature model is obtained by the first terminal determining the feature intersection of the first sample of the first terminal and the second sample of the second terminal and training the initial features of the first sample based on the feature intersection;
training the initial features of the second sample based on the feature intersection to obtain a second encrypted feature model, and sending the second encrypted feature model to the first terminal, so that the first terminal predicts the features missing from the first sample according to the second encrypted feature model to obtain first encrypted completion features and operates on them according to a first preset operation rule to obtain first completion features;
and training on the initial features of the second sample, the second completion features and the initial label of the second sample to obtain a second encrypted label model, and sending the second encrypted label model to the first terminal, so that the first terminal predicts the label missing from the first sample based on the second encrypted label model, the initial features of the first sample and the first completion features to obtain a first encrypted completion label and operates on it according to a third preset operation rule to obtain the first completion label.
Further, the processor 1001 may be configured to call the sample completion program based on encrypted transfer learning stored in the memory 1005 and perform the following steps:
adding a second random mask to the second encrypted completion features to obtain second encrypted mask features, and sending the second encrypted mask features to the first terminal so that the first terminal decrypts them to obtain second mask features;
and on receiving the second mask features sent by the first terminal, subtracting the second random mask from them to obtain the second completion features.
Further, the processor 1001 may be configured to call the sample completion program based on encrypted transfer learning stored in the memory 1005 and perform the following steps:
training the initial features of the second sample based on the feature intersection to obtain a second encrypted feature model, and sending the second encrypted feature model to the first terminal, so that the first terminal predicts the features missing from the first sample according to the second encrypted feature model to obtain first encrypted completion features, adds a first random mask to them to obtain first encrypted mask features, and sends the first encrypted mask features to the second terminal for the second terminal to decrypt into first mask features; on receiving the first mask features returned by the second terminal, the first terminal subtracts the first random mask from them to obtain the first completion features.
Further, the processor 1001 may be configured to call the sample completion program based on encrypted transfer learning stored in the memory 1005 and perform the following steps:
training on the initial features of the second sample, the second completion features and the initial label of the second sample to obtain a second encrypted label model, and sending the second encrypted label model to the first terminal, so that the first terminal predicts the label missing from the first sample based on the second encrypted label model, the initial features of the first sample and the first completion features to obtain a first encrypted completion label, adds a third random mask to it to obtain a third encrypted mask label, and sends the third encrypted mask label to the second terminal for the second terminal to decrypt into a third mask label; on receiving the third mask label returned by the second terminal, the first terminal subtracts the third random mask from it to obtain the first completion label.
In the technical solution provided by the present invention, the terminal calls the sample completion program based on encryption transfer learning stored in the memory 1005 through the processor 1001 to implement the following steps: determining a feature intersection of a first sample of the first terminal and a second sample of the second terminal, training the initial features of the first sample based on the feature intersection to obtain a first feature model, and encrypting the first feature model and sending it to the second terminal, so that the second terminal predicts the missing features of the second sample to obtain a second encrypted completion feature and operates on the second encrypted completion feature according to a second preset operation rule to obtain a second completion feature; receiving a second encrypted feature model sent by the second terminal, and predicting the missing features of the first sample according to the second encrypted feature model to obtain a first encrypted completion feature, wherein the second encrypted feature model is obtained by the second terminal training the initial features of the second sample based on the feature intersection; operating on the first encrypted completion feature according to a first preset operation rule to obtain a first completion feature; receiving a second encrypted labeling model sent by the second terminal, and predicting the labels missing from the first sample based on the second encrypted labeling model, the initial features of the first sample and the first completion feature to obtain a first encrypted completion label, wherein the second encrypted labeling model is obtained by the second terminal training according to the initial features of the second sample, the second completion feature and the initial labels of the second sample; and operating on the first encrypted completion label according to a third preset operation rule to obtain a first completion label. The invention thus completes the features and labels of each party's sample data while ensuring that no party's data privacy is revealed.
In addition, an embodiment of the present invention further provides a terminal, where the terminal is a first terminal, and the first terminal includes: a memory, a processor, and a sample completion program based on encryption transfer learning that is stored in the memory and executable on the processor, where the sample completion program based on encryption transfer learning, when executed by the processor, implements the steps of the sample completion method based on encryption transfer learning described above.
For the method implemented when the sample completion program based on encryption transfer learning running on the processor is executed, reference may be made to the embodiments of the sample completion method based on encryption transfer learning of the present invention, and details are not repeated here.
In addition, an embodiment of the present invention further provides a terminal, where the terminal is a second terminal, and the second terminal includes: a memory, a processor, and a sample completion program based on encryption transfer learning that is stored in the memory and executable on the processor, where the sample completion program based on encryption transfer learning, when executed by the processor, implements the steps of the sample completion method based on encryption transfer learning described above.
For the method implemented when the sample completion program based on encryption transfer learning running on the processor is executed, reference may be made to the embodiments of the sample completion method based on encryption transfer learning of the present invention, and details are not repeated here.
In addition, the embodiment of the present invention further provides a sample completion system based on encryption migration learning, where the sample completion system based on encryption migration learning includes at least one first terminal and at least one second terminal.
In addition, an embodiment of the present invention further provides a computer-readable storage medium, where a sample completion program based on encryption transfer learning is stored on the storage medium, and the sample completion program based on encryption transfer learning, when executed by a processor, implements the steps of the sample completion method based on encryption transfer learning described above.
For the method implemented when the sample completion program based on encryption transfer learning running on the processor is executed, reference may be made to the embodiments of the sample completion method based on encryption transfer learning of the present invention, and details are not repeated here.
Based on the above structure, various embodiments of the sample completion method based on the encryption transfer learning are proposed.
Referring to fig. 2, fig. 2 is a flowchart illustrating a first embodiment of a sample completion method based on encryption transfer learning according to the present invention.
While a logical order is shown in the flow chart, in some cases, the steps shown or described may be performed in a different order than that shown.
The first embodiment of the present invention is a sample completion method based on encryption transfer learning, which is applied to a first terminal, and the first terminal and a second terminal in the embodiment of the present invention may be terminal devices such as a smart phone, a personal computer, and a server, and are not limited specifically herein.
The sample completion method based on the encryption transfer learning in the embodiment comprises the following steps:
step S1, determining a feature intersection of a first sample of the first terminal and a second sample of a second terminal, training an initial feature of the first sample based on the feature intersection to obtain a first feature model, encrypting and sending the first feature model to the second terminal so that the second terminal can predict the missing feature of the second sample to obtain a second encryption completion feature, and calculating the second encryption completion feature according to a second preset calculation rule to obtain a second completion feature;
in the field of artificial intelligence, the traditional data processing mode is that one party collects data, transfers it to another party for processing, cleaning and modeling, and finally sells the model to a third party. However, as regulations become more complete and supervision becomes more stringent, an operator may violate the law if the data leaves its collector or if the user is unaware of the specific use of the model. Data thus exists in isolated islands, and the most direct way to dissolve these islands is to aggregate the data at one party for processing. However, doing so is now likely to be illegal, because the law does not allow operators to crudely aggregate data.
To resolve this dilemma, distributed machine learning algorithms have been proposed, but they often cannot be used because features or labels of the data are missing. For example, a horizontal federated learning algorithm usually requires that the feature dimensions of all participants be the same, so features not owned by every participant generally can only be discarded; in distributed machine learning based on supervised learning, the samples of all participants need to be labeled, so unlabeled data likewise usually has to be discarded. These situations waste a large amount of data and leave the sample data used for training unevenly distributed, thereby reducing the generalization ability of the trained model.
Therefore, when each participant in distributed machine learning comes from different organizations, how to complement the characteristics and labels of each party under the condition of ensuring the data security and privacy of each party is a problem to be solved urgently. In order to solve this problem, various embodiments of the sample completion method based on the encryption migration learning of the present invention are proposed.
The invention is based on transfer learning, which refers to the process of applying a model learned in an old domain to a new domain by exploiting the similarity among data, tasks or models. The core problem of transfer learning is to find the similarity between the new problem and the original problem, so that learned knowledge can be smoothly transferred and applied to the new problem, thereby realizing knowledge transfer.
In this embodiment, the sample dimensions of the first sample at the first terminal and the second sample at the second terminal are different, the characteristic dimensions are partially overlapped, and the label of the first sample is missing.
First, the first terminal determines the overlapping part of the feature dimensions of the first sample and the second sample. Based on this intersection, it trains a function mapping model from the overlapping part to the non-overlapping part of the first sample, namely the first feature model, encrypts the first feature model with a preset encryption algorithm, and sends it to the second terminal. After receiving the encrypted first feature model, the second terminal uses it to predict the missing features of the second sample to obtain a second encrypted completion feature, and then operates on the second encrypted completion feature according to a second preset operation rule to obtain the second completion feature.
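As a concrete but deliberately simplified illustration of this step, the sketch below trains one such feature-mapping model in the clear, using ridge regression from the overlapping features to a single missing feature dimension. All data, shapes and the linear relationship are invented for the example; in the real protocol the trained model would only ever be exchanged in encrypted form.

```python
import numpy as np

rng = np.random.default_rng(0)

# Party A's local data (synthetic): features in the intersection, plus one
# A-specific feature that party B is missing.
X_shared = rng.normal(size=(200, 3))                 # overlapping features
w_true = np.array([1.5, -2.0, 0.5])                  # hidden relationship
x_specific = X_shared @ w_true + 0.01 * rng.normal(size=200)

# Train the feature-mapping model f: X_shared -> x_specific by ridge
# regression, i.e. least squares with an L2 penalty lambda, matching the
# regularized objective described in the text.
lam = 0.1
d = X_shared.shape[1]
w = np.linalg.solve(X_shared.T @ X_shared + lam * np.eye(d),
                    X_shared.T @ x_specific)

# Party B would apply the (encrypted) model to its own overlapping features
# to fill in the missing dimension; here it is done in the clear.
X_shared_B = rng.normal(size=(5, 3))
completed = X_shared_B @ w
print(w)
```

The per-feature models of the patent are exactly this, repeated once for each non-overlapping feature dimension.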
The preset encryption algorithm is a homomorphic encryption algorithm.
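Since homomorphic encryption is what allows one terminal to run a model on data it cannot read, the following self-contained sketch implements a textbook Paillier cryptosystem with toy parameters to show the additive homomorphism the protocol relies on. Real deployments use primes of roughly 1024 bits and a vetted library rather than hand-rolled code.

```python
import math
import random

# Textbook Paillier cryptosystem (toy parameters, illustration only).
p, q = 293, 433
n = p * q
n2 = n * n
g = n + 1                        # standard choice that simplifies decryption
lam = math.lcm(p - 1, q - 1)     # Carmichael function of n
mu = pow(lam, -1, n)             # modular inverse; valid because g = n + 1

def encrypt(m):
    """Enc(m) = g^m * r^n mod n^2 for a random r in Z_n^*."""
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    """Dec(c) = L(c^lam mod n^2) * mu mod n, with L(x) = (x - 1) // n."""
    return ((pow(c, lam, n2) - 1) // n) * mu % n

# Additive homomorphism: multiplying ciphertexts adds the plaintexts.
a, b = 1234, 5678
c_sum = (encrypt(a) * encrypt(b)) % n2
assert decrypt(c_sum) == a + b
```

This additive property is also what makes the random-mask exchanges in the later steps possible: a mask can be added to a ciphertext without decrypting it.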
Step S2, receiving a second encrypted feature model sent by a second terminal, predicting the missing features of the first sample according to the second encrypted feature model to obtain a first encrypted completion feature, wherein the second encrypted feature model is obtained by training the initial features of the second sample based on the feature intersection by the second terminal;
meanwhile, the second terminal determines the overlapping part of the feature dimensions of the first sample and the second sample, and trains a function mapping model from the overlapping part to the non-overlapping part of the second sample, namely the second feature model. It encrypts the second feature model with the preset encryption algorithm to obtain the second encrypted feature model and sends it to the first terminal. After receiving the second encrypted feature model, the first terminal uses it to predict the missing features of the first sample to obtain the first encrypted completion feature.
Step S3, calculating the first encryption completion characteristic according to a first preset calculation rule to obtain a first completion characteristic;
after the first terminal carries out prediction completion on the missing features to obtain first encryption completion features, the first terminal carries out operation on the first encryption completion features according to a first preset operation rule to obtain unencrypted first completion features, and at this time, feature completion on the first terminal is completed.
Step S4, receiving a second encrypted annotation model sent by the second terminal, predicting the label missing from the first sample based on the second encrypted annotation model, the initial feature of the first sample and the first completion feature to obtain a first encrypted completion label, wherein the second encrypted annotation model is obtained by the second terminal according to the initial feature of the second sample, the second completion feature and the initial label training of the second sample;
after the second terminal has operated on the second encrypted completion feature according to the second preset operation rule to obtain the second completion feature, completing its own feature completion, and the first terminal has operated on the first encrypted completion feature according to the first preset operation rule to obtain the unencrypted first completion feature, completing its own feature completion, the second terminal trains, according to the initial features of the second sample, the second completion feature and the initial labels of the second sample, a function mapping model from the features of the second sample to its labels, namely the second labeling model. It encrypts the second labeling model with the preset encryption algorithm to obtain the second encrypted labeling model and sends it to the first terminal, which predicts the labels missing from the first sample based on the second encrypted labeling model, the initial features of the first sample and the first completion feature to obtain the first encrypted completion label.
Step S5, operating on the first encrypted completion label according to a third preset operation rule to obtain a first completion label.
After the first terminal carries out prediction completion on the missing marks to obtain first encryption completion marks, the first terminal carries out operation on the first encryption completion marks according to a third preset operation rule to obtain unencrypted first completion marks, and thus feature completion and mark completion of the first terminal and feature completion of the second terminal are completed.
In the embodiment, the characteristic completion and the marking completion of the sample data of each party are realized on the premise of ensuring that the data privacy of each party is not disclosed.
Further, in the second embodiment of the sample completion method based on the encryption migration learning of the present invention, the step S3 includes:
step S31, adding a first random mask to the first encrypted padding feature to obtain a first encrypted mask feature, and sending the first encrypted mask feature to the second terminal, so that the second terminal decrypts the first encrypted mask feature to obtain a first mask feature;
step S32, receiving the first mask feature sent by the second terminal, and subtracting the first random mask from the first mask feature to obtain a first completion feature.
After the first terminal has predicted its missing features to obtain the first encrypted completion feature, it adds a first random mask to the first encrypted completion feature to obtain the first encrypted mask feature and sends it to the second terminal. After receiving the first encrypted mask feature, the second terminal decrypts it to obtain the first mask feature and sends the first mask feature to the first terminal. After receiving the first mask feature sent by the second terminal, the first terminal subtracts the first random mask from it to obtain the unencrypted first completion feature, and the feature completion of the first terminal is complete.
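The round trip described above can be sketched end to end with a stand-in cipher. Below, a secret additive pad known only to the second terminal plays the role of the homomorphic encryption, since, like Paillier, adding a constant to the "ciphertext" adds it to the plaintext. Every name and number here is illustrative, and the pad is of course not a secure cipher.

```python
import random

# Stand-in for an additively homomorphic cipher: the "ciphertext" is the
# plaintext shifted by a key held only by the second terminal. Adding a
# constant to the ciphertext adds it to the plaintext, which is the only
# property the masking protocol needs. NOT secure encryption.
MOD = 2**61 - 1
key = random.randrange(MOD)                     # second terminal's secret

def hom_encrypt(x):
    return (x + key) % MOD

def hom_add(c, delta):                          # homomorphic addition
    return (c + delta) % MOD

def hom_decrypt(c):                             # second terminal only
    return (c - key) % MOD

# First terminal: holds an encrypted completed feature it cannot decrypt.
completed_feature = 7341                        # the true (hidden) value
c = hom_encrypt(completed_feature)              # produced under B's key

mask = random.randrange(MOD)                    # first random mask
c_masked = hom_add(c, mask)                     # first encrypted mask feature
# -> c_masked is sent to the second terminal

# Second terminal: decrypts, sees only a masked value, sends it back.
masked_plain = hom_decrypt(c_masked)            # first mask feature

# First terminal: removes its mask to recover the completed feature.
recovered = (masked_plain - mask) % MOD
assert recovered == completed_feature
```

Neither side learns anything new: the second terminal only ever sees the feature shifted by a random mask, and the first terminal never sees the decryption key.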
Further, the step S5 includes:
step S51, adding a third random mask to the first encrypted completion label to obtain a third encrypted mask label, and sending the third encrypted mask label to the second terminal, so that the second terminal decrypts the third encrypted mask label to obtain a third mask label;
step S52, receiving the third mask label sent by the second terminal, and subtracting the third random mask from the third mask label to obtain a first completion label.
After the first terminal has predicted its missing labels to obtain the first encrypted completion label, it adds a third random mask to the first encrypted completion label to obtain the third encrypted mask label and sends it to the second terminal. After receiving the third encrypted mask label, the second terminal decrypts it to obtain the third mask label and sends the third mask label to the first terminal. After receiving the third mask label sent by the second terminal, the first terminal subtracts the third random mask from it to obtain the unencrypted first completion label, and the label completion of the first terminal is complete.
Further, the step S1 includes:
step S11, determining a feature intersection of the first sample of the first terminal and the second sample of the second terminal, training the initial features of the first sample based on the feature intersection to obtain a first feature model, and sending the first feature model to the second terminal in encrypted form, so that the second terminal predicts the missing features of the second sample to obtain a second encrypted completion feature, adds a second random mask to the second encrypted completion feature to obtain a second encrypted mask feature, and sends the second encrypted mask feature to the first terminal; the first terminal decrypts the second encrypted mask feature to obtain a second mask feature and returns it, and the second terminal, upon receiving the second mask feature sent by the first terminal, subtracts the second random mask from the second mask feature to obtain a second completion feature.
After the second terminal receives the encrypted first feature model sent by the first terminal, it uses the encrypted first feature model to predict the missing features of the second sample to obtain the second encrypted completion feature. The second terminal then adds a second random mask to the second encrypted completion feature to obtain the second encrypted mask feature and sends it to the first terminal. After receiving the second encrypted mask feature, the first terminal decrypts it to obtain the second mask feature and sends the second mask feature to the second terminal. After receiving the second mask feature sent by the first terminal, the second terminal subtracts the second random mask from it to obtain the unencrypted second completion feature, and the feature completion of the second terminal is complete.
To aid understanding, an example is now given. As shown in FIG. 3, the sample dimensions of party A and party B are different and their feature dimensions partially overlap. Party B holds known data $(X_B, Y_B)$, where $X_B$ consists of two parts: a shared part $X_B^c$ (the feature dimensions also held by A) and a B-specific part $X_B^u$. Party A holds only the feature data $X_A$, where $X_A$ likewise consists of a shared part $X_A^c$ and an A-specific part $X_A^u$. The system consists of party A and party B. First, A and B determine their feature intersection, i.e. the shared feature dimensions covered by $X_A^c$ and $X_B^c$.

On the A side, let $X_A^u = \{x_{A,1}, \ldots, x_{A,n}\}$ be a set of $n$ features. For each feature $x_{A,i} \in X_A^u$, A trains a function mapping model $f_i^A$ from $X_A^c$ to $x_{A,i}$, obtained by minimizing the following objective function:

$$\min_{\theta_i^A} \; L\left(f_i^A(X_A^c; \theta_i^A),\, x_{A,i}\right) + \lambda \left\lVert \theta_i^A \right\rVert_F^2$$

Similarly, on the B side, let $X_B^u = \{x_{B,1}, \ldots, x_{B,m}\}$ be a set of $m$ features. For each feature $x_{B,j} \in X_B^u$, B trains a function mapping model $f_j^B$ from $X_B^c$ to $x_{B,j}$, obtained by minimizing the following objective function:

$$\min_{\theta_j^B} \; L\left(f_j^B(X_B^c; \theta_j^B),\, x_{B,j}\right) + \lambda \left\lVert \theta_j^B \right\rVert_F^2$$

A and B then perform feature completion. A sends its encrypted models $[[f^A]]$ to B. B uses $[[f^A]]$ to predict its missing features (the A-specific dimensions of B's samples), obtaining $[[\hat{X}_B]]$, adds a random mask matrix $M_B$ to obtain $[[\hat{X}_B + M_B]]$, and sends it to A. After receiving $[[\hat{X}_B + M_B]]$, A decrypts it to obtain $\hat{X}_B + M_B$ and sends it back to B, and B subtracts $M_B$ to obtain $\hat{X}_B$. Similarly, B sends its encrypted models $[[f^B]]$ to A. A uses $[[f^B]]$ to predict its missing features, obtaining $[[\hat{X}_A]]$, adds a random mask matrix $M_A$ to obtain $[[\hat{X}_A + M_A]]$, and sends it to B. After receiving $[[\hat{X}_A + M_A]]$, B decrypts it to obtain $\hat{X}_A + M_A$ and sends it back to A, and A subtracts $M_A$ to obtain $\hat{X}_A$.

Subsequently, with the feature data of both parties completed, B trains on its data $(X_B, Y_B)$ a function mapping model $g_B : X_B \to Y_B$ from features to labels, obtained by minimizing the following objective function:

$$\min_{\theta} \; L\left(g_B(X_B; \theta),\, Y_B\right) + \lambda \left\lVert \theta \right\rVert_F^2$$

B sends the encrypted model $[[g_B]]$ to A. A uses $[[g_B]]$ to predict its missing labels $Y_A$, obtaining $[[\hat{Y}_A]]$, adds a random mask matrix $M_A'$ to obtain $[[\hat{Y}_A + M_A']]$, and sends it to B. After receiving $[[\hat{Y}_A + M_A']]$, B decrypts it to obtain $\hat{Y}_A + M_A'$ and sends it back to A, and A subtracts $M_A'$ to obtain $\hat{Y}_A$, completing the label completion of party A.

Here $L$ denotes a loss function, $\theta$ the model parameters, $\lambda$ the regularization coefficient, and $\lVert \cdot \rVert_F^2$ the Frobenius norm (the sum of squares of the parameters).
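The label-completion step can likewise be sketched in the clear, ignoring the encryption and masking and using purely synthetic data: party B trains an L2-regularized logistic-regression label model on its (completed) features and labels, and party A applies it to predict its missing labels. The data, dimensions and hyperparameters below are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Party B: completed features X_B and known labels Y_B (synthetic).
X_B = rng.normal(size=(400, 4))
w_hidden = np.array([2.0, -1.0, 0.5, 1.5])          # hidden labeling rule
Y_B = (X_B @ w_hidden > 0).astype(float)

# Train the label model g_B: X_B -> Y_B with L2-regularized logistic
# regression by plain gradient descent (lam is the regularization
# coefficient of the objective in the text).
lam, lr = 0.01, 0.5
w = np.zeros(4)
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(X_B @ w)))            # sigmoid predictions
    grad = X_B.T @ (p - Y_B) / len(Y_B) + lam * w   # regularized gradient
    w -= lr * grad

# Party A: applies the model (in the real protocol, in encrypted form)
# to its own completed features to predict its missing labels.
X_A = rng.normal(size=(100, 4))
Y_A_pred = (1.0 / (1.0 + np.exp(-(X_A @ w))) > 0.5).astype(float)
Y_A_true = (X_A @ w_hidden > 0).astype(float)
accuracy = (Y_A_pred == Y_A_true).mean()
print(accuracy)
```

In the full protocol this prediction happens under homomorphic encryption, and the masked round trip of the preceding steps is what turns the encrypted predictions into usable labels for party A.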
In this embodiment, masks are added while the first terminal and the second terminal perform feature completion and label completion. Feature completion is thus performed on each party's sample data without revealing either party's data privacy, label completion is performed for the first terminal whose labels are missing after the feature completion, and the privacy of the data interaction is further improved.
Further, a third embodiment of the sample completion method based on encryption transfer learning according to the present invention is provided. In this embodiment, the sample completion method based on encryption transfer learning is applied to a second terminal and includes the following steps:
step C1, the second terminal receives an encrypted first feature model sent by the first terminal, predicts the missing features of the second sample according to the encrypted first feature model to obtain a second encrypted completion feature, and operates on the second encrypted completion feature according to a second preset operation rule to obtain a second completion feature, wherein the first feature model is obtained by the first terminal determining a feature intersection of a first sample of the first terminal and the second sample of the second terminal and training the initial features of the first sample based on the feature intersection;
in this embodiment, the sample dimensions of the first sample at the first terminal and the second sample at the second terminal are different, the characteristic dimensions are partially overlapped, and the label of the first sample is missing.
First, the first terminal determines the overlapping part of the feature dimensions of the first sample and the second sample. Based on this intersection, it trains a function mapping model from the overlapping part to the non-overlapping part of the first sample, namely the first feature model, encrypts the first feature model with a preset encryption algorithm, and sends it to the second terminal. After receiving the encrypted first feature model, the second terminal uses it to predict the missing features of the second sample to obtain a second encrypted completion feature, and then operates on the second encrypted completion feature according to a second preset operation rule to obtain the second completion feature.
The preset encryption algorithm is a homomorphic encryption algorithm.
Step C2, training the initial features of the second sample based on the feature intersection to obtain a second encrypted feature model, sending the second encrypted feature model to the first terminal, so that the first terminal can predict the missing features of the first sample according to the second encrypted feature model to obtain a first encrypted completion feature, and calculating the first encrypted completion feature according to a first preset calculation rule to obtain a first completion feature;
meanwhile, the second terminal determines the overlapping part of the feature dimensions of the first sample and the second sample, and trains a function mapping model from the overlapping part to the non-overlapping part of the second sample, namely the second feature model. It encrypts the second feature model with the preset encryption algorithm to obtain the second encrypted feature model and sends it to the first terminal. After receiving the second encrypted feature model, the first terminal uses it to predict the missing features of the first sample to obtain the first encrypted completion feature. The first terminal then operates on the first encrypted completion feature according to the first preset operation rule to obtain the unencrypted first completion feature, and the feature completion of the first terminal is complete.
And step C3, training according to the initial features of the second sample, the second completion features and the initial labels of the second sample to obtain a second encrypted label model, sending the second encrypted label model to the first terminal, predicting the labels missing from the first sample by the first terminal based on the second encrypted label model, the initial features of the first sample and the first completion features to obtain first encrypted completion labels, and calculating the first encrypted completion labels according to a third preset operation rule to obtain first completion labels.
After the second terminal has operated on the second encrypted completion feature according to the second preset operation rule to obtain the second completion feature, completing its own feature completion, and the first terminal has operated on the first encrypted completion feature according to the first preset operation rule to obtain the unencrypted first completion feature, completing its own feature completion, the second terminal trains, according to the initial features of the second sample, the second completion feature and the initial labels of the second sample, a function mapping model from the features of the second sample to its labels, namely the second labeling model. It encrypts the second labeling model with the preset encryption algorithm to obtain the second encrypted labeling model and sends it to the first terminal, which predicts the labels missing from the first sample based on the second encrypted labeling model, the initial features of the first sample and the first completion feature to obtain the first encrypted completion label.
After the first terminal carries out prediction completion on the missing marks to obtain first encryption completion marks, the first terminal carries out operation on the first encryption completion marks according to a third preset operation rule to obtain unencrypted first completion marks, and thus feature completion and mark completion of the first terminal and feature completion of the second terminal are completed.
Further, the step of calculating the second encryption completion characteristic according to a second preset calculation rule to obtain a second completion characteristic includes:
step C11, adding a second random mask to the second encryption completion feature to obtain a second encryption mask feature, and sending the second encryption mask feature to the first terminal, so that the first terminal decrypts the second encryption mask feature to obtain a second mask feature;
step C12, when receiving the second mask feature sent by the first terminal, subtracting the second random mask from the second mask feature to obtain a second completion feature.
Specifically, after the second terminal receives the encrypted first feature model sent by the first terminal, it uses the encrypted first feature model to predict the missing features of the second sample to obtain the second encrypted completion feature. The second terminal then adds a second random mask to the second encrypted completion feature to obtain the second encrypted mask feature and sends it to the first terminal. After receiving the second encrypted mask feature, the first terminal decrypts it to obtain the second mask feature and sends the second mask feature to the second terminal. After receiving the second mask feature sent by the first terminal, the second terminal subtracts the second random mask from it to obtain the unencrypted second completion feature, and the feature completion of the second terminal is complete.
Further, the step C2 includes:
step C21, training the initial features of the second sample based on the feature intersection to obtain a second encrypted feature model, and sending the second encrypted feature model to the first terminal, so that the first terminal predicts the missing features of the first sample according to the second encrypted feature model to obtain a first encrypted completion feature, adds a first random mask to the first encrypted completion feature to obtain a first encrypted mask feature, and sends the first encrypted mask feature to the second terminal; the second terminal decrypts the first encrypted mask feature to obtain a first mask feature and returns it, and the first terminal, upon receiving the first mask feature sent by the second terminal, subtracts the first random mask from the first mask feature to obtain a first completion feature.
Specifically, after the first terminal predicts the missing features and obtains a first encryption completion feature, it adds a first random mask to the first encryption completion feature to obtain a first encryption mask feature and sends the first encryption mask feature to the second terminal. Upon receiving it, the second terminal decrypts the first encryption mask feature to obtain a first mask feature and returns it to the first terminal. The first terminal then subtracts the first random mask from the first mask feature to obtain the unencrypted first completion feature, which completes the feature completion of the first terminal.
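Abstracting away the encryption, the feature-model step itself amounts to fitting a predictor from the shared (intersecting) features to a party's own features, then applying that predictor to the other party's shared features. The one-variable least-squares fit below is a hedged toy illustration: the data, the function names, and the choice of a linear model are assumptions for exposition, not the model form specified by the patent.

```python
def fit_linear(xs, ys):
    """Ordinary least squares for a single predictor: ys ~ slope*xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Second terminal's samples: a feature in the intersection plus its own
# extra feature (toy data where own = 2 * shared + 1).
shared = [1.0, 2.0, 3.0, 4.0]
own = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_linear(shared, own)

# The first terminal only holds the shared feature; the exchanged model
# predicts the feature it is missing.
first_terminal_shared = 5.0
predicted_missing = slope * first_terminal_shared + intercept
assert abs(predicted_missing - 11.0) < 1e-9
```

In the patent's protocol this training and prediction would happen on encrypted model parameters, with the masked-decryption exchange used to recover the plaintext completion value afterwards.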
Further, the step C3 includes:
step C31, training a second encryption labeling model according to the initial features of the second sample, the second completion feature and the initial labels of the second sample, and sending the second encryption labeling model to the first terminal, so that the first terminal predicts the labels missing from the first sample based on the second encryption labeling model, the initial features of the first sample and the first completion feature to obtain a first encryption completion label, adds a third random mask to the first encryption completion label to obtain a third encryption mask label, and sends the third encryption mask label to the second terminal for the second terminal to decrypt the third encryption mask label to obtain a third mask label; upon receiving the third mask label sent by the second terminal, the first terminal subtracts the third random mask from the third mask label to obtain a first completion label.
Specifically, after the first terminal predicts the missing labels and obtains a first encryption completion label, it adds a third random mask to the first encryption completion label to obtain a third encryption mask label and sends the third encryption mask label to the second terminal. Upon receiving it, the second terminal decrypts the third encryption mask label to obtain a third mask label and returns it to the first terminal. The first terminal then subtracts the third random mask from the third mask label to obtain the unencrypted first completion label, which completes the label completion of the first terminal.
In this embodiment, feature completion and label completion are thus performed on each party's sample data without disclosing any party's private data.
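Putting the embodiment together, the message pattern is identical in all three rounds (second completion feature, first completion feature, first completion label): add a fresh random mask, let the key holder decrypt, then subtract the mask. The sketch below uses identity functions for encryption and decryption purely to trace the arithmetic of the three rounds; all names and sample values are illustrative assumptions.

```python
import random

def mask_exchange(encrypted_value, decrypt):
    """One masked-decryption round: add a fresh random mask, have the key
    holder decrypt, then strip the mask locally (toy, integer arithmetic)."""
    mask = random.randrange(10**6, 10**9)
    masked_plain = decrypt(encrypted_value + mask)  # key holder sees only value + mask
    return masked_plain - mask

# Identity stand-ins for enc/dec so the three rounds can be followed
# end to end; a real deployment would use homomorphic encryption.
enc = lambda x: x
decrypt = lambda x: x

# Round 1 (second random mask): second terminal recovers its completion feature.
second_completion_feature = mask_exchange(enc(7), decrypt)
# Round 2 (first random mask): first terminal recovers its completion feature.
first_completion_feature = mask_exchange(enc(12), decrypt)
# Round 3 (third random mask): first terminal recovers its completion label.
first_completion_label = mask_exchange(enc(1), decrypt)

assert (second_completion_feature, first_completion_feature,
        first_completion_label) == (7, 12, 1)
```

Because each round draws a fresh mask, the decrypting party learns nothing about the underlying feature or label values in any of the three exchanges.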
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (12)

1. A sample completion method based on encryption migration learning is characterized in that the sample completion method based on encryption migration learning is applied to a first terminal, and comprises the following steps:
determining a feature intersection of a first sample of the first terminal and a second sample of a second terminal, training an initial feature of the first sample based on the feature intersection to obtain a first feature model, and encrypting the first feature model and sending the encrypted first feature model to the second terminal, so that the second terminal predicts the features missing from the second sample according to the encrypted first feature model to obtain a second encryption completion feature and performs an operation on the second encryption completion feature according to a second preset operation rule to obtain a second completion feature;
receiving a second encryption feature model sent by the second terminal, predicting the missing features of the first sample according to the second encryption feature model to obtain first encryption completion features, wherein the second encryption feature model is obtained by training initial features of the second sample based on the feature intersection by the second terminal;
performing an operation on the first encryption completion feature according to a first preset operation rule to obtain a first completion feature;
receiving a second encryption labeling model sent by the second terminal, and predicting the labels missing from the first sample based on the second encryption labeling model, the initial features of the first sample and the first completion feature to obtain a first encryption completion label, wherein the second encryption labeling model is obtained by the second terminal through training according to the initial features of the second sample, the second completion feature and the initial labels of the second sample; and
performing an operation on the first encryption completion label according to a third preset operation rule to obtain a first completion label.
2. The sample completion method based on encryption migration learning according to claim 1, wherein the step of performing an operation on the first encryption completion feature according to the first preset operation rule to obtain the first completion feature comprises:
adding a first random mask to the first encryption completion feature to obtain a first encryption mask feature, and sending the first encryption mask feature to the second terminal so that the second terminal decrypts the first encryption mask feature to obtain a first mask feature;
and receiving a first mask feature sent by the second terminal, and subtracting the first random mask from the first mask feature to obtain a first completion feature.
3. The sample completion method based on encryption migration learning according to claim 1, wherein the step of performing an operation on the first encryption completion label according to the third preset operation rule to obtain the first completion label comprises:
adding a third random mask to the first encryption completion label to obtain a third encryption mask label, and sending the third encryption mask label to the second terminal, so that the second terminal decrypts the third encryption mask label to obtain a third mask label;
and receiving a third mask label sent by the second terminal, and subtracting the third random mask from the third mask label to obtain a first completion label.
4. The sample completion method based on encryption migration learning according to claim 1, wherein the step of determining a feature intersection of a first sample of the first terminal and a second sample of the second terminal, training an initial feature of the first sample based on the feature intersection to obtain a first feature model, and encrypting and sending the first feature model to the second terminal, so that the second terminal predicts the missing features of the second sample to obtain a second encryption completion feature and performs an operation on the second encryption completion feature according to a second preset operation rule to obtain a second completion feature comprises:
determining a feature intersection of a first sample of the first terminal and a second sample of a second terminal, training an initial feature of the first sample based on the feature intersection to obtain a first feature model, encrypting and sending the first feature model to the second terminal so that the second terminal predicts a missing feature of the second sample to obtain a second encrypted completion feature, adding a second random mask to the second encrypted completion feature to obtain a second encrypted mask feature, sending the second encrypted mask feature to the first terminal so that the first terminal decrypts the second encrypted mask feature to obtain a second mask feature, and subtracting the second random mask from the second mask feature to obtain a second completion feature when the second mask feature sent by the first terminal is received.
5. A sample completion method based on encryption migration learning is characterized in that the sample completion method based on encryption migration learning is applied to a second terminal, and comprises the following steps:
the second terminal receives an encrypted first feature model sent by the first terminal, predicts the missing features of a second sample according to the encrypted first feature model to obtain a second encryption completion feature, and performs an operation on the second encryption completion feature according to a second preset operation rule to obtain a second completion feature, wherein the first feature model is obtained by the first terminal determining a feature intersection of a first sample of the first terminal and the second sample of the second terminal and training the initial features of the first sample based on the feature intersection;
training the initial features of the second sample based on the feature intersection to obtain a second encryption feature model, and sending the second encryption feature model to the first terminal, so that the first terminal predicts the missing features of the first sample according to the second encryption feature model to obtain a first encryption completion feature and performs an operation on the first encryption completion feature according to a first preset operation rule to obtain a first completion feature; and
training a second encryption labeling model according to the initial features of the second sample, the second completion feature and the initial labels of the second sample, and sending the second encryption labeling model to the first terminal, so that the first terminal predicts the labels missing from the first sample based on the second encryption labeling model, the initial features of the first sample and the first completion feature to obtain a first encryption completion label and performs an operation on the first encryption completion label according to a third preset operation rule to obtain a first completion label.
6. The sample completion method based on encryption migration learning according to claim 5, wherein the step of performing an operation on the second encryption completion feature according to the second preset operation rule to obtain the second completion feature comprises:
adding a second random mask to the second encryption completion characteristic to obtain a second encryption mask characteristic, and sending the second encryption mask characteristic to the first terminal so that the first terminal decrypts the second encryption mask characteristic to obtain a second mask characteristic;
and when a second mask feature sent by the first terminal is received, subtracting the second random mask from the second mask feature to obtain a second completion feature.
7. The sample completion method based on encryption migration learning according to claim 5, wherein the step of training the initial features of the second sample based on the feature intersection to obtain a second encryption feature model, sending the second encryption feature model to the first terminal, so that the first terminal predicts the missing features of the first sample according to the second encryption feature model to obtain a first encryption completion feature, and performs an operation on the first encryption completion feature according to a first preset operation rule to obtain a first completion feature comprises:
training the initial features of the second sample based on the feature intersection to obtain a second encryption feature model, and sending the second encryption feature model to the first terminal, so that the first terminal predicts the missing features of the first sample according to the second encryption feature model to obtain a first encryption completion feature, adds a first random mask to the first encryption completion feature to obtain a first encryption mask feature, and sends the first encryption mask feature to the second terminal for the second terminal to decrypt the first encryption mask feature to obtain a first mask feature; and, upon receiving the first mask feature sent by the second terminal, the first terminal subtracts the first random mask from the first mask feature to obtain a first completion feature.
8. The sample completion method based on encryption migration learning according to claim 5, wherein the step of training a second encryption labeling model according to the initial features of the second sample, the second completion feature and the initial labels of the second sample, sending the second encryption labeling model to the first terminal, so that the first terminal predicts the labels missing from the first sample based on the second encryption labeling model, the initial features of the first sample and the first completion feature to obtain a first encryption completion label, and performs an operation on the first encryption completion label according to a third preset operation rule to obtain a first completion label comprises:
training a second encryption labeling model according to the initial features of the second sample, the second completion feature and the initial labels of the second sample, and sending the second encryption labeling model to the first terminal, so that the first terminal predicts the labels missing from the first sample based on the second encryption labeling model, the initial features of the first sample and the first completion feature to obtain a first encryption completion label, adds a third random mask to the first encryption completion label to obtain a third encryption mask label, and sends the third encryption mask label to the second terminal for the second terminal to decrypt the third encryption mask label to obtain a third mask label; and, upon receiving the third mask label sent by the second terminal, the first terminal subtracts the third random mask from the third mask label to obtain a first completion label.
9. A terminal, characterized in that the terminal comprises: a memory, a processor, and a sample completion program based on encryption migration learning that is stored on the memory and executable on the processor, wherein the sample completion program, when executed by the processor, implements the steps of the sample completion method based on encryption migration learning according to any one of claims 1 to 4.
10. A terminal, characterized in that the terminal comprises: a memory, a processor, and a sample completion program based on encryption migration learning that is stored on the memory and executable on the processor, wherein the sample completion program, when executed by the processor, implements the steps of the sample completion method based on encryption migration learning according to any one of claims 5 to 8.
11. A sample completion system based on encryption migration learning, characterized in that the system comprises: at least one first terminal and at least one second terminal, wherein the first terminal is the terminal of claim 9 and the second terminal is the terminal of claim 10.
12. A computer storage medium, characterized in that a sample completion program based on encryption migration learning is stored on the storage medium, and the sample completion program, when executed by a processor, implements the steps of the sample completion method based on encryption migration learning according to any one of claims 1 to 8.
CN201910153223.4A 2019-02-28 2019-02-28 Sample completion method, terminal, system and medium based on encryption migration learning Active CN109902742B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910153223.4A CN109902742B (en) 2019-02-28 2019-02-28 Sample completion method, terminal, system and medium based on encryption migration learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910153223.4A CN109902742B (en) 2019-02-28 2019-02-28 Sample completion method, terminal, system and medium based on encryption migration learning

Publications (2)

Publication Number Publication Date
CN109902742A CN109902742A (en) 2019-06-18
CN109902742B true CN109902742B (en) 2021-07-16

Family

ID=66945893

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910153223.4A Active CN109902742B (en) 2019-02-28 2019-02-28 Sample completion method, terminal, system and medium based on encryption migration learning

Country Status (1)

Country Link
CN (1) CN109902742B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110334815A (en) * 2019-07-10 2019-10-15 深圳前海微众银行股份有限公司 Label complementing method, terminal, device and storage medium based on cross validation
CN113781082B (en) * 2020-11-18 2023-04-07 京东城市(北京)数字科技有限公司 Method and device for correcting regional portrait, electronic equipment and readable storage medium
CN117156070B (en) * 2023-11-01 2024-01-02 江苏惟妙纺织科技有限公司 Intelligent parameter regulation and control method and system for embroidery machine

Citations (7)

Publication number Priority date Publication date Assignee Title
CN102881019A (en) * 2012-10-08 2013-01-16 江南大学 Fuzzy clustering image segmenting method with transfer learning function
CN107241182A (en) * 2017-06-29 2017-10-10 电子科技大学 A kind of secret protection hierarchy clustering method based on vectorial homomorphic cryptography
CN109101806A (en) * 2018-08-17 2018-12-28 浙江捷尚视觉科技股份有限公司 A kind of privacy portrait data mask method based on Style Transfer
CN109143199A (en) * 2018-11-09 2019-01-04 大连东软信息学院 Sea clutter small target detecting method based on transfer learning
CN109165725A (en) * 2018-08-10 2019-01-08 深圳前海微众银行股份有限公司 Neural network federation modeling method, equipment and storage medium based on transfer learning
CN109165515A (en) * 2018-08-10 2019-01-08 深圳前海微众银行股份有限公司 Model parameter acquisition methods, system and readable storage medium storing program for executing based on federation's study
CN109255444A (en) * 2018-08-10 2019-01-22 深圳前海微众银行股份有限公司 Federal modeling method, equipment and readable storage medium storing program for executing based on transfer learning

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
IL250948B (en) * 2017-03-05 2021-04-29 Verint Systems Ltd System and method for applying transfer learning to identification of user actions

Patent Citations (7)

Publication number Priority date Publication date Assignee Title
CN102881019A (en) * 2012-10-08 2013-01-16 江南大学 Fuzzy clustering image segmenting method with transfer learning function
CN107241182A (en) * 2017-06-29 2017-10-10 电子科技大学 A kind of secret protection hierarchy clustering method based on vectorial homomorphic cryptography
CN109165725A (en) * 2018-08-10 2019-01-08 深圳前海微众银行股份有限公司 Neural network federation modeling method, equipment and storage medium based on transfer learning
CN109165515A (en) * 2018-08-10 2019-01-08 深圳前海微众银行股份有限公司 Model parameter acquisition methods, system and readable storage medium storing program for executing based on federation's study
CN109255444A (en) * 2018-08-10 2019-01-22 深圳前海微众银行股份有限公司 Federal modeling method, equipment and readable storage medium storing program for executing based on transfer learning
CN109101806A (en) * 2018-08-17 2018-12-28 浙江捷尚视觉科技股份有限公司 A kind of privacy portrait data mask method based on Style Transfer
CN109143199A (en) * 2018-11-09 2019-01-04 大连东软信息学院 Sea clutter small target detecting method based on transfer learning

Non-Patent Citations (2)

Title
"Secure Federated Transfer Learning";Yang Liu 等;《Secure Federated Transfer Learning》;20181208;全文 *
"多源迁移学习算法研究";严海锐;《中国优秀硕士学位论文全文数据库-信息科技辑》;20170215;第2017年卷(第2期);I140-277 *

Also Published As

Publication number Publication date
CN109902742A (en) 2019-06-18

Similar Documents

Publication Publication Date Title
CN109492420B (en) Model parameter training method, terminal, system and medium based on federal learning
CN109902742B (en) Sample completion method, terminal, system and medium based on encryption migration learning
CN109255444B (en) Federal modeling method and device based on transfer learning and readable storage medium
CN110633805A (en) Longitudinal federated learning system optimization method, device, equipment and readable storage medium
CN110110229B (en) Information recommendation method and device
WO2020248537A1 (en) Model parameter determination method and apparatus based on federated learning
CN109241770B (en) Information value calculation method and device based on homomorphic encryption and readable storage medium
CN113627085B (en) Transverse federal learning modeling optimization method, equipment and medium
CN109639643B (en) Block chain-based client manager information sharing method, electronic device and readable storage medium
CN112287372B (en) Method and apparatus for protecting clipboard privacy
CN106487747A (en) User identification method, system, device and processing method, device
CN109391611B (en) User personal information encryption authorization method, device, equipment and readable storage medium
CN114338016B (en) Hazardous waste block chain supervision system and method based on group key negotiation
CN113254947B (en) Vehicle data protection method, system, equipment and storage medium
CN111274611A (en) Data desensitization method, device and computer readable storage medium
CN111368196A (en) Model parameter updating method, device, equipment and readable storage medium
CN112785002A (en) Model construction optimization method, device, medium, and computer program product
CN108388806B (en) Thing networking safety is consolidated and data rights and interests protection device based on block chain
CN110969261B (en) Encryption algorithm-based model construction method and related equipment
CN111464655A (en) Block chain-based Internet of things data management method and system
CN111739190B (en) Vehicle diagnostic file encryption method, device, equipment and storage medium
CN111310047B (en) Information recommendation method, device and equipment based on FM model and storage medium
CN112329057A (en) Document management method, device, equipment and computer readable storage medium
CN111447206A (en) JS resource encryption transmission method and device, server and storage medium
CN116644472A (en) Data encryption and data decryption methods and devices, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant