CN112668046A - Feature crossing method, apparatus, computer-readable storage medium, and program product

Publication number
CN112668046A
Authority
CN
China
Prior art keywords
feature
ciphertext
crossing
party
result
Prior art date
Legal status
Pending
Application number
CN202011552619.5A
Other languages
Chinese (zh)
Inventor
衣志昊
刘洋
陈天健
Current Assignee
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date
Filing date
Publication date
Application filed by WeBank Co Ltd
Priority to CN202011552619.5A
Publication of CN112668046A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a feature crossing method, an apparatus, a computer-readable storage medium, and a program product. The method is applied to a trusted party that performs feature crossing and takes part in joint training of a model, and comprises: generating a public key and a private key for homomorphic encryption; distributing the public key to a first party and a second party that perform feature crossing; obtaining at least one ciphertext feature crossing result from the second party, the ciphertext feature crossing result being obtained according to a first feature of the first party, a second feature of the second party, and the public key; and homomorphically decrypting the at least one ciphertext feature crossing result based on the private key to obtain at least one plaintext feature crossing result. In this way, categorical features can be crossed within a vertical federated learning framework while data privacy is protected.

Description

Feature crossing method, apparatus, computer-readable storage medium, and program product
Technical Field
The present application relates to the field of artificial intelligence, and in particular, but not exclusively, to a feature crossing method, an apparatus, a computer-readable storage medium, and a program product.
Background
Federated learning is a recent privacy-preserving technique that can effectively combine the data of multiple parties for model training while the data never leaves each party's local environment.
In vertical federated learning, different parties jointly train a machine learning model, and modeling often requires crossing features held by different parties. If two categorical features are held by different parties, however, crossing them is difficult because data privacy must be protected.
Disclosure of Invention
The embodiments of the present application provide a feature crossing method, an apparatus, a computer-readable storage medium, and a computer program product that can cross categorical features within a vertical federated learning framework while protecting data privacy.
The technical solutions of the embodiments of the present application are implemented as follows:
An embodiment of the present application provides a feature crossing method applied to a trusted party that performs feature crossing, the trusted party being used for joint training of a model, the method comprising:
generating a public key and a private key for homomorphic encryption;
distributing the public key to a first party and a second party that perform feature crossing;
obtaining at least one ciphertext feature crossing result from the second party, the ciphertext feature crossing result being obtained according to a first feature of the first party, a second feature of the second party, and the public key;
and homomorphically decrypting the at least one ciphertext feature crossing result based on the private key to obtain at least one plaintext feature crossing result.
An embodiment of the present application provides a feature crossing method applied to a first party that performs feature crossing, the first party being used for joint training of a model, the method comprising:
obtaining at least one first feature and a public key for feature crossing;
encoding the at least one first feature to obtain a first encoded value corresponding to the at least one first feature;
homomorphically encrypting the first encoded value corresponding to the at least one first feature based on the public key to obtain at least one first ciphertext feature;
and sending the at least one first ciphertext feature to a second party that performs feature crossing, so that the second party performs feature crossing based on the first ciphertext feature.
An embodiment of the present application provides a feature crossing method applied to a second party that performs feature crossing, the second party being used for joint training of a model, the method comprising:
obtaining at least one second feature, at least one first ciphertext feature, and a public key for feature crossing, the at least one first ciphertext feature being obtained by a first party that performs feature crossing based on at least one first feature and the public key;
homomorphically encrypting the at least one second feature based on the public key to obtain at least one second ciphertext feature;
performing feature crossing based on the at least one first ciphertext feature and the at least one second ciphertext feature to obtain at least one ciphertext feature crossing result;
and sending the at least one ciphertext feature crossing result to a trusted party that performs feature crossing, so that the trusted party determines a plaintext feature crossing result based on the ciphertext feature crossing result.
An embodiment of the present application provides a feature crossing method applied to an active party that performs feature crossing, the active party being used for joint training of a model, the method comprising:
determining the samples used for feature crossing according to the first party and the second party that perform feature crossing;
obtaining label information of the samples;
homomorphically encrypting the label information of the samples to obtain ciphertext label information;
sending the ciphertext label information to a trusted party that performs feature crossing, so that the trusted party determines, based on the ciphertext label information, the number of pieces of ciphertext label information corresponding to a plaintext feature crossing result;
and determining the information value of the plaintext feature crossing result based on the number of pieces of ciphertext label information sent by the trusted party.
An embodiment of the present application provides a feature crossing apparatus applied to a trusted party that performs feature crossing, the trusted party being used for joint training of a model, the apparatus comprising:
a first generation module, configured to generate a public key and a private key for homomorphic encryption;
a first sending module, configured to distribute the public key to a first party and a second party that perform feature crossing;
a first obtaining module, configured to obtain at least one ciphertext feature crossing result from the second party, the ciphertext feature crossing result being obtained according to a first feature of the first party, a second feature of the second party, and the public key;
and a decryption module, configured to homomorphically decrypt the at least one ciphertext feature crossing result based on the private key to obtain at least one plaintext feature crossing result.
An embodiment of the present application provides a feature crossing apparatus applied to a first party that performs feature crossing, the first party being used for joint training of a model, the apparatus comprising:
a fourth obtaining module, configured to obtain at least one first feature and a public key for feature crossing;
an encoding module, configured to encode the at least one first feature to obtain a first encoded value corresponding to the at least one first feature;
a second encryption module, configured to homomorphically encrypt the first encoded value corresponding to the at least one first feature based on the public key to obtain at least one first ciphertext feature;
and a fourth sending module, configured to send the at least one first ciphertext feature to a second party that performs feature crossing, so that the second party performs feature crossing based on the first ciphertext feature.
An embodiment of the present application provides a feature crossing apparatus applied to a second party that performs feature crossing, the second party being used for joint training of a model, the apparatus comprising:
a second obtaining module, configured to obtain at least one second feature, at least one first ciphertext feature, and a public key for feature crossing, the at least one first ciphertext feature being obtained by a first party that performs feature crossing based on at least one first feature and the public key;
a first encryption module, configured to homomorphically encrypt the at least one second feature based on the public key to obtain at least one second ciphertext feature;
a feature crossing module, configured to perform feature crossing based on the at least one first ciphertext feature and the at least one second ciphertext feature to obtain at least one ciphertext feature crossing result;
and a second sending module, configured to send the at least one ciphertext feature crossing result to a trusted party that performs feature crossing, so that the trusted party determines a plaintext feature crossing result based on the ciphertext feature crossing result.
An embodiment of the present application provides a feature crossing apparatus applied to an active party that performs feature crossing, the active party being used for joint training of a model, the apparatus comprising:
a second determining module, configured to determine the samples used for feature crossing according to the first party and the second party that perform feature crossing;
a fifth obtaining module, configured to obtain label information of the samples;
a third encryption module, configured to homomorphically encrypt the label information of the samples to obtain ciphertext label information;
a fifth sending module, configured to send the ciphertext label information to a trusted party that performs feature crossing, so that the trusted party determines, based on the ciphertext label information, the number of pieces of ciphertext label information corresponding to a plaintext feature crossing result;
and a third determining module, configured to determine the information value of the plaintext feature crossing result based on the number of pieces of ciphertext label information sent by the trusted party.
An embodiment of the present application provides a feature crossing device, comprising:
a memory for storing executable instructions;
and a processor configured to implement the method provided by the embodiments of the present application when executing the executable instructions stored in the memory.
An embodiment of the present application provides a computer-readable storage medium storing executable instructions that, when executed by a processor, implement the method provided by the embodiments of the present application.
An embodiment of the present application provides a computer program product comprising a computer program that, when executed by a processor, implements the method provided by the embodiments of the present application.
The embodiments of the present application have the following beneficial effects:
In the feature crossing method provided by the embodiments of the present application, the trusted party generates a public key and a private key for homomorphic encryption; distributes the public key to a first party and a second party that perform feature crossing; obtains at least one ciphertext feature crossing result from the second party, the ciphertext feature crossing result being obtained according to the first feature of the first party, the second feature of the second party, and the public key; and homomorphically decrypts the at least one ciphertext feature crossing result based on the private key to obtain at least one plaintext feature crossing result. First, feature crossing is achieved without revealing the original data of the first party or the second party. Second, because the ciphertext feature crossing result is delivered to the trusted party, no party can infer another party's original data, so data privacy is protected. Third, because encryption and decryption are homomorphic, the plaintext feature crossing result is identical to the result of applying the same processing to the unencrypted original data, so the crossing result is correct. Categorical features can therefore be crossed within a vertical federated learning framework while data privacy is protected.
Drawings
Fig. 1 is a schematic network architecture diagram of a feature crossing method according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a feature crossing device provided in an embodiment of the present application;
Fig. 3 is a schematic flowchart of an implementation of the feature crossing method provided in an embodiment of the present application;
Fig. 4 is a schematic flowchart of another implementation of the feature crossing method provided in an embodiment of the present application;
Fig. 5 is a schematic flowchart of another implementation of the feature crossing method provided in an embodiment of the present application;
Fig. 6 is a schematic flowchart of another implementation of the feature crossing method provided in an embodiment of the present application;
Fig. 7 is a schematic flowchart of another implementation of the feature crossing method provided in an embodiment of the present application;
Fig. 8 is a schematic diagram of the participants in a categorical feature crossing method provided in an embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings. The described embodiments should not be regarded as limiting the present application, and all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.
In the following description, the terms "first", "second", and "third" are used only to distinguish similar objects and do not denote a particular order. Where permitted, the specific order or sequence may be interchanged, so that the embodiments of the present application described herein can be practiced in an order other than that shown or described.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
Before the embodiments of the present application are described in further detail, the terms and expressions involved in the embodiments are explained; the following explanations apply to them.
1) Federated Learning, an emerging foundational artificial intelligence technology, is designed to carry out efficient machine learning among multiple parties or computing nodes while guaranteeing information security during big-data exchange, protecting terminal data and personal data privacy, and ensuring legal compliance.
2) Vertical Federated Learning: when two data sets share many users but few user features, the data sets are split vertically (that is, along the feature dimension), and the portion of the data in which the users are the same but the user features differ is taken out to train the machine learning model.
3) Homomorphic Encryption is a cryptographic technique based on the computational complexity theory of hard mathematical problems. When homomorphically encrypted data is processed to produce an output and that output is decrypted, the result is the same as the output obtained by processing the unencrypted original data in the same way.
4) Binning divides a continuous range of values into several intervals, each of which is treated as one category. This process of converting continuous values into discrete values is commonly referred to as binning.
5) Information Value (IV) is mainly used to encode input variables and to assess their predictive power in binary classification problems in machine learning. The magnitude of a feature variable's IV indicates how strong its predictive power is.
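Terms 4) and 5) can be illustrated with a minimal Python sketch. It assumes the commonly used definition IV = Σ (p_i − q_i) · ln(p_i / q_i), where p_i and q_i are the shares of positive and negative samples in bin i; the source does not state the formula. The function names, the cut points, the sample data, and the `floor` smoothing for empty counts are all hypothetical.

```python
import math
from bisect import bisect_right


def assign_bin(value, edges):
    """Return the bin index of value; k sorted cut points give k + 1 bins."""
    return bisect_right(edges, value)


def information_value(bins, labels, floor=0.5):
    """IV = sum over bins of (p_i - q_i) * ln(p_i / q_i).

    p_i and q_i are the shares of all positive (label 1) and negative
    (label 0) samples that fall in bin i. Zero counts are replaced by
    `floor` so the logarithm stays defined (an assumed convention).
    """
    pos, neg = {}, {}
    for b, y in zip(bins, labels):
        target = pos if y == 1 else neg
        target[b] = target.get(b, 0) + 1
    total_pos, total_neg = sum(pos.values()), sum(neg.values())
    iv = 0.0
    for b in set(pos) | set(neg):
        p = max(pos.get(b, 0), floor) / total_pos
        q = max(neg.get(b, 0), floor) / total_neg
        iv += (p - q) * math.log(p / q)
    return iv


# Binning: continuous ages become discrete bin indices.
ages = [15, 22, 37, 41, 58, 63]
age_bins = [assign_bin(a, [18, 40, 60]) for a in ages]
print(age_bins)  # [0, 1, 1, 2, 2, 3]

# IV of a two-bin feature against binary labels.
bins = [0, 0, 0, 0, 1, 1, 1, 1]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
print(round(information_value(bins, labels), 4))  # 1.0986
```

In this toy data each bin holds three quarters of one class and one quarter of the other, so the IV equals ln 3.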
An exemplary application of the apparatus that implements the embodiments of the present application is described below. The apparatus provided in the embodiments may be implemented as a terminal device, and the exemplary application below covers that case.
Fig. 1 is a schematic diagram of a network architecture of the feature crossing method according to an embodiment of the present application. As shown in Fig. 1, the network architecture includes at least a trusted party C 100, a party A 200, a party B 300, an active party D 400, and a network 500. In one exemplary application, trusted party C 100, party A 200, party B 300, and active party D 400 are participants in vertical federated learning who jointly train a machine learning model. Each of them may be a client, for example a device of a participating institution such as a bank or a hospital that stores user feature data. The client may be any device with a model training function, such as a notebook computer, a tablet computer, a desktop computer, or a dedicated training device. Trusted party C 100 is connected to party A 200, party B 300, and active party D 400 through network 500, and party A 200 is connected to party B 300 through network 500. Network 500 may be a wide area network, a local area network, or a combination of the two, and carries data over wireless or wired links.
Trusted party C 100 first generates a first public key and a first private key for additive homomorphic encryption and distributes the first public key to party A 200 and party B 300. Party B 300 receives the first public key sent by trusted party C 100, encodes its i-th feature into an encoded value, where i is an integer from 1 to N and N is the number of features X_B held by party B 300, homomorphically encrypts the encoded value with the first public key to obtain a first ciphertext feature, and sends the first ciphertext feature to party A 200. Party A 200 receives the first public key sent by trusted party C 100 and the first ciphertext feature sent by party B 300, encodes its j-th feature into an encoded value, where j is an integer from 1 to M and M is the number of features X_A held by party A 200, homomorphically encrypts the encoded value with the first public key to obtain a second ciphertext feature, determines a ciphertext feature crossing result based on the first ciphertext feature and the second ciphertext feature, and sends the ciphertext feature crossing result to trusted party C 100. Trusted party C 100 receives the ciphertext feature crossing result sent by party A 200, decrypts it with the first private key to obtain the cross feature, and then computes the IV value of the cross feature jointly with active party D 400.
Trusted party C 100 does not need the original data of party A 200 or party B 300, so categorical features can be crossed without revealing any party's original data. Because the cross feature is obtained through trusted party C 100, no party can infer another party's original data, and data privacy is protected. Because encryption and decryption are homomorphic, the plaintext feature crossing result is identical to the result of applying the same processing to the unencrypted original data, so the crossing result is correct. Categorical features can therefore be crossed within a vertical federated learning framework while data privacy is protected.
The apparatus provided in the embodiments of the present application may be implemented as hardware or a combination of hardware and software, and various exemplary implementations of the apparatus provided in the embodiments of the present application are described below.
Fig. 2 shows an exemplary structure of the feature crossing device, taking trusted party C 100 as an example. Other exemplary structures can be foreseen, so the structure described here should not be regarded as limiting; for example, some components described below may be omitted, or components not described below may be added to meet the specific requirements of some applications.
The feature crossing device 100 shown in Fig. 2 includes at least one processor 110, a memory 140, at least one network interface 120, and a user interface 130. The components of the feature crossing device 100 are coupled together by a bus system 150, which enables connection and communication among them. In addition to a data bus, the bus system 150 includes a power bus, a control bus, and a status signal bus. For clarity of illustration, however, all of these buses are labeled as bus system 150 in Fig. 2.
The user interface 130 may include a display, a keyboard, a mouse, a touch-sensitive pad, a touch screen, and the like.
The memory 140 may be either volatile memory or nonvolatile memory, and may also include both volatile and nonvolatile memory. The nonvolatile Memory may be a Read Only Memory (ROM). The volatile Memory may be a Random Access Memory (RAM). The memory 140 described in embodiments herein is intended to comprise any suitable type of memory.
The memory 140 in the embodiments of the present application is capable of storing data to support the operation of the feature crossing apparatus 100. Examples of such data include: any computer program, such as an operating system and an application program, for operating on the feature crossing device 100. The operating system includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, and is used for implementing various basic services and processing hardware-based tasks. The application program may include various application programs.
As an example of the method provided by the embodiment of the present application implemented by software, the method provided by the embodiment of the present application may be directly embodied as a combination of software modules executed by the processor 110, the software modules may be located in a storage medium located in the memory 140, and the processor 110 reads executable instructions included in the software modules in the memory 140, and completes the method provided by the embodiment of the present application in combination with necessary hardware (for example, including the processor 110 and other components connected to the bus 150).
By way of example, the Processor 110 may be an integrated circuit chip having Signal processing capabilities, such as a general purpose Processor, a Digital Signal Processor (DSP), or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like, wherein the general purpose Processor may be a microprocessor or any conventional Processor or the like.
The feature crossing method provided by the embodiment of the present application will be described in conjunction with exemplary applications and implementations of the terminal provided by the embodiment of the present application.
Fig. 3 is a schematic flow chart of an implementation of the feature crossing method provided in the embodiment of the present application, which is applied to a trusted party C of the network architecture shown in fig. 1, and will be described with reference to the steps shown in fig. 3.
In step S301, a public key and a private key for homomorphic encryption are generated.
In the embodiments of the present application, the trusted party generates the public key and the private key for homomorphic encryption, so the other parties do not need to send their original data to the trusted party, and the privacy of their original data is protected.
To distinguish these keys from the public key and private key generated by the active party, the public key generated by the trusted party is hereinafter referred to as the first public key, and the private key generated by the trusted party is hereinafter referred to as the first private key.
Homomorphic encryption is a cryptographic technique based on the computational complexity theory of hard mathematical problems. When homomorphically encrypted data is processed to produce an output and that output is decrypted, the result is the same as the output obtained by processing the unencrypted original data in the same way. The generated first public key and first private key may be used for any of additive, multiplicative, mixed multiplicative, subtractive, division, algebraic (also called fully), or arithmetic homomorphic encryption. Here, fully homomorphic encryption means that the encryption function is both additively and multiplicatively homomorphic.
In some embodiments, the trusted party may generate the first public key and the first private key for additive homomorphic encryption, or the trusted party may generate the first public key and the first private key for multiplicative homomorphic encryption, or the trusted party may generate the first public key and the first private key for fully homomorphic encryption. Compared with the generation of the first public key and the first private key for the full homomorphic encryption, the generation of the first public key and the first private key for the addition homomorphic encryption can improve the operation efficiency.
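As an illustration of what an additively homomorphic scheme allows, the two identities below hold for the Paillier cryptosystem. Paillier is used here only as an assumed example; the source does not name a specific scheme. The symbol n denotes the public modulus, Enc and Dec denote encryption and decryption, and k is a plaintext integer.

```latex
\begin{aligned}
\mathrm{Dec}\bigl(\mathrm{Enc}(a)\cdot\mathrm{Enc}(b) \bmod n^{2}\bigr) &= (a + b) \bmod n,\\
\mathrm{Dec}\bigl(\mathrm{Enc}(a)^{k} \bmod n^{2}\bigr) &= (k \cdot a) \bmod n.
\end{aligned}
```

Sums of encrypted values and products with known integers can thus be computed on ciphertexts alone, and each costs only modular multiplication or exponentiation.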
Step S302, the public key is distributed to a first party and a second party which carry out feature crossing.
Here, the first party and the second party are different parties that perform feature crossing. A feature held by the first party is referred to as a first feature, and a feature held by the second party is referred to as a second feature. The first party encrypts the first encoded value corresponding to the first feature based on the first public key to obtain a first ciphertext feature, and the second party encrypts the second encoded value corresponding to the second feature based on the first public key to obtain a second ciphertext feature. The second party then crosses the first ciphertext feature of the first party with its own second ciphertext feature to obtain a ciphertext feature crossing result.
For example, the first feature is "male", its first encoded value is 1, and encrypting the first encoded value 1 with the first public key yields a first ciphertext feature of 5. The second feature is "juvenile", its second encoded value is 2, and encrypting the second encoded value 2 with the first public key yields a second ciphertext feature of 7. The second party crosses the first ciphertext feature 5 with the second ciphertext feature 7 to obtain a ciphertext feature crossing result, for example 32.
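The encoding step in this example can be sketched in plaintext as follows. The source does not say at this point how two encoded values are combined, so the rule `first_code * base + second_code` is an assumption, chosen because it needs only addition and multiplication by a known integer. The function names, category lists, and `base` are hypothetical.

```python
def build_encoder(categories):
    """Assign each category an integer code starting at 1."""
    return {category: code for code, category in enumerate(categories, start=1)}


def cross_codes(first_code, second_code, base):
    """Combine two codes into one integer; requires 0 <= second_code < base."""
    if not 0 <= second_code < base:
        raise ValueError("second_code must lie in [0, base)")
    return first_code * base + second_code


def split_cross(cross, base):
    """Recover the pair (first_code, second_code) from a crossed code."""
    return divmod(cross, base)


gender_codes = build_encoder(["male", "female"])            # first party
age_codes = build_encoder(["child", "juvenile", "adult"])   # second party
base = len(age_codes) + 1                                   # codes 1..3 are all < 4

cross = cross_codes(gender_codes["male"], age_codes["juvenile"], base)
print(cross)                     # 1 * 4 + 2 = 6
print(split_cross(cross, base))  # (1, 2)
```

Under this assumed rule every pair of categories maps to a distinct integer, so a decrypted crossing result identifies the combined category unambiguously.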
Step S303, obtaining at least one ciphertext feature cross result from the second party.
And the ciphertext feature cross result is obtained according to the first feature of the first party, the second feature of the second party and the public key.
After performing feature crossing to obtain the ciphertext feature crossing result (i.e., the ciphertext cross feature), the second party sends it to the trusted party. In the embodiment of the application, the ciphertext feature crossing result is obtained by the trusted party, so that none of the parties can reverse-engineer the original data, and data privacy is protected.
Step S304, the at least one ciphertext feature crossover result is homomorphic decrypted based on the private key, and at least one plaintext feature crossover result is obtained.
The trusted party homomorphically decrypts the received ciphertext feature crossing result with the first private key that it generated, obtaining the plaintext feature crossing result. Because decryption uses the first private key of the homomorphic encryption scheme, the at least one plaintext feature crossing result is consistent with the feature crossing result that would be obtained by applying the same processing to the unencrypted original data, without the original data being revealed, which ensures that the feature crossing result is correct. Therefore, feature crossing of category features can be realized based on a longitudinal federated learning framework while protecting data privacy.
The feature crossing method provided by the embodiment of the application is applied to a trusted party performing feature crossing, where the trusted party is used for joint training of a model. The trusted party generates a public key and a private key for homomorphic encryption; distributes the public key to a first party and a second party that perform feature crossing; obtains at least one ciphertext feature crossing result from the second party, the ciphertext feature crossing result being obtained according to the first feature of the first party, the second feature of the second party, and the public key; and homomorphically decrypts the at least one ciphertext feature crossing result based on the private key to obtain at least one plaintext feature crossing result. With the embodiment of the application, feature crossing can be realized without revealing the original data of the first party and the second party. Because the ciphertext feature crossing result is obtained by the trusted party, none of the parties can reverse-engineer the original data, and data privacy is protected. Because encryption and decryption use homomorphic encryption, the at least one plaintext feature crossing result is consistent with the result of applying the same processing to the unencrypted original data, which ensures that the feature crossing result is correct. Therefore, feature crossing of category features can be realized based on a longitudinal federated learning framework while protecting data privacy.
In some embodiments, after obtaining the at least one plaintext feature crossing result in step S304 in the embodiment shown in fig. 3, the trusted participant may further determine the information value of the at least one plaintext feature crossing result in combination with the active participant. After step S304, the method may further include the steps of:
Step S305, binning the at least one plaintext feature crossing result to obtain at least one binning result.
In this embodiment of the present application, the same plaintext feature crossing result in the at least one plaintext feature crossing result may be classified into one bin, for example, if there are 3 plaintext feature crossing results in the at least one plaintext feature crossing result being 5, the 3 plaintext feature crossing results are classified into one bin.
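The equal-value binning described above can be sketched as a simple grouping of sample indices by their plaintext crossing result. The function name `bin_by_value` and the sample data are hypothetical and chosen only to mirror the example in the text.

```python
from collections import defaultdict


def bin_by_value(cross_results):
    """Group sample indices so that identical crossing results share a bin."""
    bins = defaultdict(list)
    for sample_idx, value in enumerate(cross_results):
        bins[value].append(sample_idx)
    return dict(bins)


# Three samples have the crossing result 5, so they fall into the same bin.
bins = bin_by_value([5, 8, 5, 2, 5])
```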
Step S306, acquiring, from the active party that participates in feature crossing and holds the mark information, ciphertext mark information of the plaintext feature crossing results in each binning result.
Here, the mark information includes a first mark y and a second mark 1-y, and the ciphertext mark information includes a first ciphertext mark [[y]] and a second ciphertext mark [[1-y]].
Step S307, determining the number of the ciphertext mark information corresponding to the plaintext feature crossing result based on the ciphertext mark information.
In some embodiments, this may be implemented as: determining the number of first ciphertext marks corresponding to the plaintext feature crossing result based on the first ciphertext marks; and determining the number of second ciphertext marks corresponding to the plaintext feature crossing result based on the second ciphertext marks.
Continuing the binning example above, suppose that among the 3 plaintext feature crossing results equal to 5, the first and the third correspond to the first ciphertext mark [[y]], and the second corresponds to the second ciphertext mark [[1-y]]. Then, for the plaintext feature crossing results in this binning result, the number of first ciphertext marks [[y]]_k is 2, and the number of second ciphertext marks [[1-y]]_k is 1.
Step S308, sending the number of ciphertext marks to the active party.
The trusted party sends the determined number of first ciphertext marks [[y]]_k and number of second ciphertext marks [[1-y]]_k to the active party, so that the active party determines the information value of the plaintext feature crossing result based on the number of ciphertext marks.
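One way to read the counting in steps S306 to S308 is that the trusted party homomorphically adds the encrypted marks of the samples in each bin, so the counts themselves stay encrypted until the active party decrypts them. The sketch below follows that reading with a toy Paillier scheme; the scheme, the helper names, the bin contents, and the labels are all assumptions for illustration, and the parameters are not secure.

```python
import math
import secrets


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy Paillier key pair from two fixed Mersenne primes (demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    return n, (lam, pow(lam, -1, n))


def encrypt(n, m):
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) % n2 * pow(r, n, n2) % n2


def decrypt(n, priv, c):
    lam, mu = priv
    return (pow(c, lam, n * n) - 1) // n * mu % n


def add_all(n, ciphertexts):
    """Homomorphic sum: multiply ciphertexts modulo n^2."""
    acc = 1  # trivial ciphertext of 0
    for c in ciphertexts:
        acc = acc * c % (n * n)
    return acc


# Active party: second key pair, encrypted marks y and 1-y per sample.
n_d, priv_d = keygen()
labels = [1, 0, 1, 1]
enc_y = [encrypt(n_d, y) for y in labels]
enc_not_y = [encrypt(n_d, 1 - y) for y in labels]

# Trusted party: bins map a plaintext crossing result to sample indices.
bins = {5: [0, 1, 2], 9: [3]}
enc_counts = {
    value: (add_all(n_d, [enc_y[i] for i in idx]),
            add_all(n_d, [enc_not_y[i] for i in idx]))
    for value, idx in bins.items()
}

# Active party: decrypt the per-bin counts with the second private key.
counts = {value: (decrypt(n_d, priv_d, cy), decrypt(n_d, priv_d, cn))
          for value, (cy, cn) in enc_counts.items()}
```

For the bin with crossing result 5 this reproduces the counts in the example above: two first marks and one second mark.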
According to the feature crossing method provided by the embodiment of the application, the credible participant can determine the information value of at least one plaintext feature crossing result by combining the active participants, and can efficiently screen out the crossing features with strong interpretability and good effect based on the information value, so that the model fitting capacity and the interpretability under a longitudinal federated framework are improved, and the model effect is further improved.
Based on the foregoing embodiments, an embodiment of the present application further provides a feature crossing method, fig. 4 is a schematic diagram of another implementation flow of the feature crossing method provided in the embodiment of the present application, and is applied to a participant B in a network architecture shown in fig. 1, as shown in fig. 4, the feature crossing method includes the following steps:
Step S401, at least one first feature and a public key for feature crossing are obtained.
Here, the first feature is a category feature in which the first party (i.e., party B) performs feature crossing. The public key is obtained from the trusted party, namely the public key is a first public key which is generated by the trusted party and used for homomorphic encryption.
Step S402, encoding the at least one first feature to obtain a first encoded value corresponding to the at least one first feature.
In the embodiments of the present application, the first feature is denoted as X_B, and the number of first features X_B included by the first party is denoted as N. Encoding the i-th first feature yields a first encoded value that is an integer between 1 and N; that is, the first encoded values obtained by encoding the N first features X_B are integers within the interval [1, N].
In some embodiments, step S402 may be implemented as the following steps:
Step S4021, when an unencoded target first feature exists among the at least one first feature, obtaining the number of encodings already performed.
Step S4022, encoding the target first feature based on the number of encodings already performed to obtain a first encoded value of the target first feature.
For example, suppose an unencoded target first feature exists among the at least one first feature and the number of encodings already performed is 5. The target first feature is encoded based on this count of 5, giving it a first encoded value of 6. The count is then updated to 6, and it is judged again whether an unencoded target first feature still exists. These steps are repeated until no unencoded target first feature remains, thereby obtaining the first encoded values of the N first features.
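The count-based encoding of steps S4021 and S4022 can be sketched as follows: each value that has not been seen yet receives the current count plus one, so N distinct values map onto the integers 1 to N. The function name `encode_features` and the sample values are hypothetical.

```python
def encode_features(features):
    """Assign each distinct feature value the next integer code, starting at 1."""
    codes = {}
    for feature in features:
        if feature not in codes:            # an unencoded target feature exists
            encoded_count = len(codes)      # number of encodings already performed
            codes[feature] = encoded_count + 1
    return codes


stage_codes = encode_features(
    ["infant", "juvenile", "young", "juvenile", "middle-aged", "old"]
)
```

With five distinct values the codes fall in the interval [1, 5], and a value encountered when the count is 5 would receive the code 6, as in the example above.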
Step S403, homomorphic encrypting the first encoded value corresponding to the at least one first feature based on the public key to obtain at least one first ciphertext feature.
The first encoded value corresponding to the i-th first feature is homomorphically encrypted based on the first public key to obtain the i-th first ciphertext feature.
Here, additive homomorphic encryption may be applied to the first encoded value corresponding to the i-th first feature based on the first public key, which can improve operation efficiency.
Step S404, sending the at least one first ciphertext feature to a second party performing feature interleaving.
The first party sends the first ciphertext feature, determined based on its first feature and the first public key, to the second party, so that the second party performs feature crossing based on the first ciphertext feature. Because the first party sends only the first ciphertext feature to the second party, the second party cannot obtain the first feature of the first party, and data privacy is protected.
Based on the foregoing embodiments, an embodiment of the present application further provides a feature crossing method, and fig. 5 is a schematic diagram of a further implementation flow of the feature crossing method provided in the embodiment of the present application, which is applied to a participant a in a network architecture shown in fig. 1, as shown in fig. 5, the feature crossing method includes the following steps:
Step S501, at least one second feature, at least one first ciphertext feature and a public key for feature crossing are obtained.
Here, the second feature is a category feature in which the second party (i.e., party a) performs feature crossing. The at least one first ciphertext feature may be obtained from the first party, the at least one first ciphertext feature being derived by the first party performing the feature crossing based on the at least one first feature and the public key. The public key is obtained from the trusted party, namely the public key is a first public key which is generated by the trusted party and used for homomorphic encryption.
Step S502, homomorphic encryption is carried out on the at least one second characteristic based on the public key to obtain at least one second ciphertext characteristic.
Step S503, performing feature crossing based on the at least one first ciphertext feature and the at least one second ciphertext feature to obtain at least one ciphertext feature crossing result.
Here, feature crossing is performed based on the at least one first ciphertext feature and the at least one second ciphertext feature to obtain at least one ciphertext feature crossing result, where M is the number of second features X_A included by the second party.
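The exact crossing formula is not recoverable from this text, so the sketch below uses one plausible rule as an assumption: the cross code is N times the second party's code plus the first party's code, computed entirely on ciphertexts with a toy Paillier scheme. The scheme, the rule, the helper names, and the sample codes are all hypothetical, and the parameters are not secure.

```python
import math
import secrets


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy Paillier key pair from two fixed Mersenne primes (demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    return n, (lam, pow(lam, -1, n))


def encrypt(n, m):
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) % n2 * pow(r, n, n2) % n2


def decrypt(n, priv, c):
    lam, mu = priv
    return (pow(c, lam, n * n) - 1) // n * mu % n


def cross(n, ct_a, ct_b, num_b_values):
    """Assumed rule: Enc(N * code_a + code_b), using only additive operations."""
    n2 = n * n
    return pow(ct_a, num_b_values, n2) * ct_b % n2


# Trusted party: generates the first key pair and distributes n.
n, priv = keygen()

# First party (B): codes for sex, N = 2 (1 = male, 2 = female).
N = 2
enc_b = [encrypt(n, code) for code in [1, 2, 1]]

# Second party (A): codes for life stage, then crossing on ciphertexts.
enc_a = [encrypt(n, code) for code in [2, 2, 5]]
enc_cross = [cross(n, a, b, N) for a, b in zip(enc_a, enc_b)]

# Trusted party: decrypts the crossing results with the first private key.
plain_cross = [decrypt(n, priv, c) for c in enc_cross]
```

Under this assumed rule every pair of codes maps to a distinct integer, because the first party's code lies in [1, N]; a juvenile (code 2) male (code 1) sample decrypts to 5.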
Step S504, sending the at least one ciphertext feature crossing result to the trusted party performing feature crossing.
The second party sends the obtained at least one ciphertext feature crossing result to the trusted party, so that the trusted party can determine the plaintext feature crossing result based on the ciphertext feature crossing result. In actual implementation, the trusted party may homomorphically decrypt the at least one ciphertext feature crossing result based on the private key to obtain at least one plaintext feature crossing result.
In the embodiment of the application, the second party sends the ciphertext feature cross result to the trusted party, so that the trusted party cannot obtain the first feature of the first party and the second feature of the second party, and data privacy can be protected.
In some embodiments, step S502 "homomorphic encrypting the at least one second feature based on the public key to obtain at least one second ciphertext feature" in the embodiment shown in fig. 5 may be implemented as the following steps:
Step S5021, encoding the at least one second feature to obtain a second encoded value corresponding to the at least one second feature.
In the embodiments of the present application, the second feature is denoted as X_A, and the number of second features X_A included by the second party is denoted as M. Encoding the j-th second feature yields a second encoded value that is an integer between 1 and M; that is, the second encoded values obtained by encoding the M second features X_A are integers within the interval [1, M].
In some embodiments, step S5021 may be implemented as the following steps:
Step S50211, when an unencoded target second feature exists among the at least one second feature, obtaining the number of encodings already performed.
Step S50212, encoding the target second feature based on the number of encodings already performed to obtain a second encoded value of the target second feature.
For example, suppose an unencoded target second feature exists among the at least one second feature and the number of encodings already performed is 5. The target second feature is encoded based on this count of 5, giving it a second encoded value of 6. The count is then updated to 6, and it is judged again whether an unencoded target second feature still exists. These steps are repeated until no unencoded target second feature remains, thereby obtaining the second encoded values of the M second features.
Step S5022, homomorphic encrypting is performed on a second coded value corresponding to the at least one second feature based on the public key, so as to obtain at least one second ciphertext feature.
The second encoded value corresponding to the j-th second feature is homomorphically encrypted based on the first public key to obtain the j-th second ciphertext feature.
Here, additive homomorphic encryption may be applied to the second encoded value corresponding to the j-th second feature based on the first public key, which can improve operation efficiency.
Based on the foregoing embodiments, an embodiment of the present application further provides a feature crossing method, and fig. 6 is a schematic diagram of another implementation flow of the feature crossing method provided in the embodiment of the present application, and is applied to an active participant D in a network architecture shown in fig. 1, as shown in fig. 6, the feature crossing method includes the following steps:
Step S601, determining a sample for performing feature crossing according to the first party and the second party for performing feature crossing.
Step S602, obtaining the marking information of the sample.
Here, the mark information includes a first mark and a second mark.
For example, the mark information indicates whether the sample has an overdue record. If an overdue record exists, the mark information of the sample is determined to be the first mark, for example, the mark y is 1; if no overdue record exists, the mark information of the sample is determined to be the second mark, for example, the mark 1-y is 0.
Step S603, performing homomorphic encryption on the tag information of the sample to obtain ciphertext tag information.
In some embodiments, before step S603, the method further comprises step S61 of generating a second public key and a second private key for homomorphic encryption. Step S603 includes: and carrying out homomorphic encryption on the mark information of the sample based on the second public key to obtain ciphertext mark information.
In some embodiments, the active participant may generate the second public key and the second private key for additive homomorphic encryption, or the active participant may generate the second public key and the second private key for multiplicative homomorphic encryption, or the active participant may generate the second public key and the second private key for fully homomorphic encryption. Compared with the generation of the second public key and the second private key for the full homomorphic encryption, the generation of the second public key and the second private key for the addition homomorphic encryption can improve the operation efficiency.
In some embodiments, the tag information includes a first tag and a second tag, and the ciphertext tag information includes a first ciphertext tag and a second ciphertext tag, and the step S603 may be implemented by:
Step S6031, homomorphically encrypting the first mark and the second mark of the sample based on the second public key, respectively, to obtain a first ciphertext mark and a second ciphertext mark.
The first mark y and the second mark 1-y are homomorphically encrypted; the resulting first ciphertext mark is denoted [[y]], and the resulting second ciphertext mark is denoted [[1-y]].
Step S6032, determine the first ciphertext tag and the second ciphertext tag as ciphertext tag information.
Step S604, sending the ciphertext mark information to the trusted party performing feature crossing.
The active party sends the first ciphertext mark [[y]] and the second ciphertext mark [[1-y]] to the trusted party as the ciphertext mark information, so that the trusted party determines, based on the ciphertext mark information, the number of ciphertext marks corresponding to the plaintext feature crossing result. The trusted party determines the number of first ciphertext marks [[y]]_k corresponding to the plaintext feature crossing result based on the first ciphertext mark [[y]], and the number of second ciphertext marks [[1-y]]_k corresponding to the plaintext feature crossing result based on the second ciphertext mark [[1-y]]. The trusted party then sends the number of first ciphertext marks [[y]]_k and the number of second ciphertext marks [[1-y]]_k to the active party.
Step S605, determining the information value of the plaintext feature crossing result based on the number of ciphertext marks sent by the trusted party.
And the active participant performs homomorphic decryption on the number of the ciphertext mark information based on the second private key to obtain the number of plaintext mark information, and further calculates the information value of the plaintext feature crossing result based on the number of the plaintext mark information.
In some embodiments, the step S605 "determining the information value of the plaintext feature interleaving result based on the number of the ciphertext flag information sent by the trusted participant" may be implemented by:
Step S6051, receiving the number of the first ciphertext tags and the number of the second ciphertext tags sent by the trusted party.
Step S6052, decrypting the number of the first ciphertext marks and the number of the second ciphertext marks based on the second private key, respectively, to obtain the number of the first plaintext marks and the number of the second plaintext marks.
Step S6053, based on the number of the first plaintext flag and the number of the second plaintext flag, determines the information value of the plaintext feature crossing result.
For example, the active party homomorphically decrypts the number of first ciphertext marks [[y]]_k based on the second private key to obtain the number of first plaintext marks y_k, and homomorphically decrypts the number of second ciphertext marks [[1-y]]_k based on the second private key to obtain the number of second plaintext marks n_k. The number of first plaintext marks y_k and the number of second plaintext marks n_k are substituted into the calculation formula of the IV value to obtain the information value of the plaintext feature crossing result.
In the embodiment of the present application, the calculation formula of the IV value is shown in the following formula (1):
[Formula (1): calculation of the IV value]
In the formula, WOE_k is the weight of evidence.
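As an illustration of step S6053, the sketch below computes the information value from the per-bin counts y_k and n_k using the conventional definition: WOE_k is the logarithm of the ratio between bin k's share of all first marks and its share of all second marks, and IV sums the difference of the two shares multiplied by WOE_k. It is an assumption that formula (1) takes this conventional form; the function name `information_value` and the counts are hypothetical, and bins with a zero count are rejected rather than smoothed.

```python
import math


def information_value(bin_counts):
    """IV = sum_k (y_k/Y - n_k/N) * WOE_k, with WOE_k = ln((y_k/Y) / (n_k/N))."""
    total_y = sum(y for y, _ in bin_counts)
    total_n = sum(n for _, n in bin_counts)
    iv = 0.0
    for y_k, n_k in bin_counts:
        if y_k == 0 or n_k == 0:
            raise ValueError("WOE is undefined for a bin with a zero count")
        share_y = y_k / total_y
        share_n = n_k / total_n
        woe_k = math.log(share_y / share_n)
        iv += (share_y - share_n) * woe_k
    return iv


# (y_k, n_k) per bin: number of first and second plaintext marks.
iv = information_value([(2, 1), (1, 3)])
```

A larger IV indicates that the crossed feature separates the two marks more strongly, which is the basis for screening cross features.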
According to the feature crossing method provided by the embodiment of the application, the active party can determine the information value of at least one plaintext feature crossing result by combining the credible parties, and the crossing features with strong interpretability and good effect can be efficiently screened out based on the information value, so that the model fitting capacity and interpretability under a longitudinal federated framework are improved, and the model effect is further improved.
Based on the foregoing embodiments, an embodiment of the present application further provides a feature crossing method, and fig. 7 is a schematic diagram of a further implementation flow of the feature crossing method provided in the embodiment of the present application, which is applied to the network architecture shown in fig. 1, as shown in fig. 7, the feature crossing method includes the following steps:
Step S701, the trusted party generates a first public key and a first private key for homomorphic encryption.
Step S702, the trusted party distributes the first public key to the first party and the second party performing the feature crossing.
In step S703, the first party obtains at least one first feature for performing feature crossing.
The first participant is participant B in the network architecture shown in fig. 1.
Step S704, the first participant encodes the at least one first feature to obtain a first encoded value corresponding to the at least one first feature.
Step S705, the first participant homomorphically encrypts the first encoded value corresponding to the at least one first feature based on the first public key to obtain at least one first ciphertext feature.
Step S706, the first party sends the at least one first ciphertext feature to the second party performing feature interleaving.
In step S707, the second party acquires at least one second feature for feature crossing.
Step S708, the second party encodes the at least one second feature to obtain a second encoded value corresponding to the at least one second feature.
Step S709, the second participant homomorphically encrypts the second encoded value corresponding to the at least one second feature based on the first public key to obtain at least one second ciphertext feature.
Step S710, the second party performs feature crossing based on the at least one first ciphertext feature and the at least one second ciphertext feature to obtain at least one ciphertext feature crossing result.
Step S711, the second party sends the at least one ciphertext feature interleaving result to the trusted party.
Step S712, the trusted party decrypts the at least one ciphertext feature cross result based on the first private key to obtain at least one plaintext feature cross result.
Step S713, the trusted party bins the at least one plaintext feature crossing result to obtain at least one binning result.
Step S714, the active party determines a sample for performing feature crossing according to the first party and the second party performing feature crossing.
Step S715, the active party acquires the mark information of the sample.
Step S716, the active party generates a second public key and a second private key for homomorphic encryption.
Step S717, the active party homomorphically encrypts the mark information of the sample based on the second public key to obtain ciphertext mark information.
Step S718, the active party sends the ciphertext mark information to the trusted party.
Step S719, the trusted party determines the number of ciphertext marks corresponding to the plaintext feature crossing result based on the ciphertext mark information.
Step S720, the trusted party sends the number of ciphertext marks to the active party.
Step S721, the active participant performs homomorphic decryption on the number of the ciphertext tag information based on the second private key, to obtain the number of plaintext tag information.
Step S722, the active participant determines the information value of the plaintext feature crossing result based on the number of the plaintext flag information.
The feature crossing method provided by the embodiment of the application can realize feature crossing without revealing the original data of the first party and the second party. Because the ciphertext feature crossing result is obtained by the trusted party, none of the parties can reverse-engineer the original data, and data privacy is protected. Because encryption and decryption use homomorphic encryption, the at least one plaintext feature crossing result is consistent with the result of applying the same processing to the unencrypted original data, which ensures that the feature crossing result is correct. Therefore, feature crossing of category features can be realized based on a longitudinal federated learning framework while protecting data privacy.
Next, an exemplary application of the embodiment of the present application in a practical application scenario will be described.
In the related art, longitudinal federated learning is generally realized by different participants jointly training a machine learning model, where the participant holding the labels (usually only one) is called the active party, and participants without labels are called passive parties. When modeling with longitudinal federated learning, the features of different participants often need to be crossed to find good feature combinations and thereby improve the model effect. The value range of a category attribute feature is usually a set, so the value range after crossing two category attribute features is usually the Cartesian product of the two sets. For example, crossing human life stage (infant, juvenile, young, middle-aged, old) with sex yields (infant male, juvenile male, young male, middle-aged male, old male, infant female, juvenile female, young female, middle-aged female, old female). However, under the longitudinal federated framework, if the two category attribute features are distributed among different participants, they are often difficult to cross because data privacy must be protected. In addition, the longitudinal federated framework lacks an effective mechanism for screening out feature combinations with strong interpretability and good effect. These pain points affect the model fitting ability and interpretability under the longitudinal federated framework and in turn the model effect, so that the longitudinal federated learning framework in the related art has difficulty crossing category attribute features.
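When both features are available in plaintext to a single party, the crossed value range is just the Cartesian product of the two sets, as the following sketch shows. The variable names are hypothetical; the values are those of the example above.

```python
from itertools import product

life_stages = ["infant", "juvenile", "young", "middle-aged", "old"]
sexes = ["male", "female"]

# Each crossed category pairs one life stage with one sex.
crossed = [f"{stage} {sex}" for sex, stage in product(sexes, life_stages)]
```

The difficulty addressed by the application is obtaining the same ten crossed categories when the two lists are held by different participants that may not exchange them in plaintext.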
The embodiment of the application provides a category feature crossing method based on a longitudinal federated learning framework. Assume that participant A has a category feature X_A, participant B has a category feature X_B, and participant D has the label Y. When crossing X_A and X_B, in order not to reveal the original data of participant A and participant B, a trusted participant C is introduced to obtain the crossed features of participant A and participant B.
Fig. 8 is a schematic diagram of participants of a category feature crossing method provided in an embodiment of the present application, and as shown in fig. 8, the method includes the following steps:
Step S801, participant C generates a public key and a private key for additive homomorphic encryption, and distributes the public key to participant A and participant B.
Step S802, participant B receives the public key sent by participant C, encodes the values of X_B as numbers, homomorphically encrypts the encoded values with the public key, and sends the resulting ciphertexts to participant A.
Where N is the size of the value-range set of X_B.
Step S803, participant A receives the public key sent by participant C and the ciphertexts sent by participant B, encodes the values of X_A as numbers, homomorphically encrypts the encoded values with the public key, and sends the resulting ciphertext cross features to participant C.
Where M is the size of the value-range set of X_A.
Step S804, participant C receives the ciphertext cross features sent by participant A, decrypts them with the private key to obtain the plaintext cross features, and computes the IV value of the cross features jointly with participant D, which holds the label Y.
The method provided by the embodiment of the application realizes crossing of category features without revealing the participants' original data. Because participant A and participant B would each be able to reverse-engineer the other party's original data after obtaining the cross feature, the invention introduces the trusted participant C to obtain the cross feature, so that no participant can reverse-engineer the original data. Encryption relies only on additive homomorphic encryption, which avoids fully homomorphic encryption and ensures operation efficiency. The category attributes are encoded directly, and the codes of the cross features are then derived from these codes, so the algorithm is simple, easy to implement, and efficient.
The following continues with an exemplary structure of the feature crossing apparatus provided in the embodiments of the present application when it is implemented as software modules. In some embodiments, as shown in fig. 2, the feature crossing apparatus 90 stored in the memory 140 is applied to a trusted participant performing feature crossing, where the trusted participant is used for joint training of a model, and the software modules in the feature crossing apparatus 90 may include:
a first generating module 91 for generating a public key and a private key for homomorphic encryption;
a first sending module 92, configured to distribute the public key to the first and second parties performing feature crossing;
a first obtaining module 93, configured to obtain at least one ciphertext feature crossing result from the second party, where the ciphertext feature crossing result is obtained according to the first feature of the first party, the second feature of the second party, and the public key;
a decryption module 94, configured to perform homomorphic decryption on the at least one ciphertext feature crossing result based on the private key, so as to obtain at least one plaintext feature crossing result.
In some embodiments, the feature crossing device 90 may further include:
a binning module, configured to bin the at least one plaintext feature crossing result to obtain at least one binning result;
a third obtaining module, configured to obtain, from an active participant that performs feature crossing and holds mark information, ciphertext mark information of the plaintext feature crossing results in each binning result;
a first determining module, configured to determine, based on the ciphertext mark information, the number of ciphertext mark information corresponding to the plaintext feature crossing result;
and a third sending module, configured to send the number of ciphertext mark information to the active participant, so that the active participant determines the information value of the plaintext feature crossing result based on that number.
In some embodiments, the mark information comprises a first mark and a second mark, and the ciphertext mark information comprises a first ciphertext mark and a second ciphertext mark;
the first determining module is further configured to:
determining the number of first ciphertext marks corresponding to the plaintext feature crossing result based on the first ciphertext marks;
and determining the number of second ciphertext marks corresponding to the plaintext feature crossing result based on the second ciphertext marks.
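How the trusted participant can count marks it cannot read is easiest to see in code. The sketch below is illustrative only and rests on assumptions not stated in the source: the first mark is taken to be 1 for a positive sample and the second mark 1 for a negative sample, the marks are encrypted under the active participant's own additively homomorphic key, and `ToyPaillier` is a hypothetical toy class with an insecure key size.

```python
import math
import random
from functools import reduce

class ToyPaillier:
    """Toy additively homomorphic scheme (Paillier with g = n + 1)."""

    def __init__(self, p=2**31 - 1, q=2**61 - 1):
        self.n = p * q                          # public part
        self.n2 = self.n * self.n
        self._phi = (p - 1) * (q - 1)           # private part
        self._mu = pow(self._phi, -1, self.n)   # private part

    def encrypt(self, m):
        r = random.randrange(1, self.n)
        while math.gcd(r, self.n) != 1:
            r = random.randrange(1, self.n)
        return (1 + m * self.n) * pow(r, self.n, self.n2) % self.n2

    def decrypt(self, c):
        return (pow(c, self._phi, self.n2) - 1) // self.n * self._mu % self.n

    def add(self, c1, c2):
        # Needs only the public modulus, so a party without the private key can do it.
        return c1 * c2 % self.n2

# Active participant: holds the labels and its own key pair.
active = ToyPaillier()
labels = [1, 0, 0, 1, 0, 1]
enc_first = [active.encrypt(y) for y in labels]        # assumed: 1 if positive
enc_second = [active.encrypt(1 - y) for y in labels]   # assumed: 1 if negative

# Trusted participant: knows which samples fall into which bin of the
# plaintext crossing result, and sums the ciphertext marks per bin.
bins = {"bin0": [0, 1, 2], "bin1": [3, 4, 5]}
enc_counts = {
    name: (reduce(active.add, [enc_first[i] for i in ids]),
           reduce(active.add, [enc_second[i] for i in ids]))
    for name, ids in bins.items()
}

# Active participant: decrypts the per-bin counts it receives back.
counts = {name: (active.decrypt(c1), active.decrypt(c2))
          for name, (c1, c2) in enc_counts.items()}
print(counts)
```

In this sketch the trusted participant only multiplies ciphertexts, so it learns neither individual labels nor the per-bin counts.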
In some embodiments, the first generating module 91 is further configured to:
generating a public key and a private key for additive homomorphic encryption;
or generating a public key and a private key for multiplicative homomorphic encryption.
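For contrast with the additive case, the following Python sketch shows what multiplicative homomorphism means, using textbook RSA without padding and the classic toy parameters p = 61, q = 53. The choice of RSA is an assumption made only for illustration; the source does not name a multiplicative scheme or describe how the crossing step would use one.

```python
# Textbook RSA (no padding), toy parameters: n = 61 * 53, e = 17, d = 2753.
# Illustration only; unpadded RSA with a key this small is not secure.
n, e, d = 61 * 53, 17, 2753

def encrypt(m):
    return pow(m, e, n)

def decrypt(c):
    return pow(c, d, n)

# Multiplying two ciphertexts yields a ciphertext of the product of the plaintexts.
c = encrypt(7) * encrypt(6) % n
product = decrypt(c)
print(product)
```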
Based on the foregoing embodiments, an embodiment of the present application further provides a feature crossing apparatus, which is applied to a first participant performing feature crossing, where the first participant is used to perform joint training on a model, and the apparatus at least includes:
a fourth obtaining module, configured to obtain at least one first feature and a public key for feature crossing;
an encoding module, configured to encode the at least one first feature to obtain a first encoded value corresponding to the at least one first feature;
a second encryption module, configured to homomorphically encrypt, based on the public key, the first encoded value corresponding to the at least one first feature to obtain at least one first ciphertext feature;
and a fourth sending module, configured to send the at least one first ciphertext feature to a second party performing feature crossing, so that the second party performs feature crossing based on the first ciphertext feature.
In some embodiments, the encoding module may be further configured to:
when a target first feature that has not yet been encoded exists among the at least one first feature, obtaining the number of encodings performed so far;
and encoding the target first feature based on the number of encodings performed so far to obtain a first encoded value of the target first feature.
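A minimal Python sketch of the encoding rule above, assuming that the number of encodings performed so far is used directly as the code of a newly seen category, so codes run 0, 1, 2, and so on. The class name `CountingEncoder` is hypothetical.

```python
class CountingEncoder:
    """Assigns each new category the count of categories encoded so far."""

    def __init__(self):
        self.codes = {}  # category value -> integer code

    def encode(self, value):
        if value not in self.codes:
            # Not encoded yet: the number of encodings so far becomes its code.
            self.codes[value] = len(self.codes)
        return self.codes[value]

encoder = CountingEncoder()
encoded = [encoder.encode(v) for v in ["red", "blue", "red", "green"]]
print(encoded)             # one code per sample
print(len(encoder.codes))  # size of the value set seen so far
```

The size of the resulting code table is the size of the feature's value set, which is the quantity the description denotes by M or N.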
Based on the foregoing embodiments, an embodiment of the present application further provides a feature crossing apparatus, which is applied to a second participant performing feature crossing, where the second participant is used to perform joint training on a model, and the apparatus at least includes:
the second acquisition module is used for acquiring at least one second feature, at least one first ciphertext feature and a public key for feature crossing; the at least one first ciphertext feature is obtained by a first party performing feature crossing based on the at least one first feature and the public key;
a first encryption module, configured to homomorphically encrypt the at least one second feature based on the public key to obtain at least one second ciphertext feature;
a feature crossing module, configured to perform feature crossing based on the at least one first ciphertext feature and the at least one second ciphertext feature to obtain at least one ciphertext feature crossing result;
and a second sending module, configured to send the at least one ciphertext feature crossing result to a trusted party performing feature crossing, so that the trusted party determines a plaintext feature crossing result based on the ciphertext feature crossing result.
In some embodiments, the first encryption module is further configured to:
encoding the at least one second feature to obtain a second encoded value corresponding to the at least one second feature;
and homomorphically encrypting, based on the public key, the second encoded value corresponding to the at least one second feature to obtain at least one second ciphertext feature.
In some embodiments, the first encryption module is further configured to:
when a target second feature that has not yet been encoded exists among the at least one second feature, obtaining the number of encodings performed so far;
and encoding the target second feature based on the number of encodings performed so far to obtain a second encoded value of the target second feature.
Based on the foregoing embodiments, an embodiment of the present application further provides a feature crossing apparatus, which is applied to an active participant for performing feature crossing, where the active participant is used to perform joint training on a model, and the apparatus at least includes:
the second determining module is used for determining a sample for carrying out feature crossing according to the first party and the second party for carrying out feature crossing;
a fifth obtaining module, configured to obtain mark information of the sample;
a third encryption module, configured to homomorphically encrypt the mark information of the sample to obtain ciphertext mark information;
a fifth sending module, configured to send the ciphertext mark information to a trusted party performing feature crossing, so that the trusted party determines, based on the ciphertext mark information, the number of ciphertext mark information corresponding to a plaintext feature crossing result;
and a third determining module, configured to determine the information value of the plaintext feature crossing result based on the number of ciphertext mark information sent by the trusted participant.
In some embodiments, the mark information comprises a first mark and a second mark, the apparatus further comprising:
the second generation module is used for generating a second public key and a second private key for homomorphic encryption;
correspondingly, the third encryption module may be further configured to:
encrypting the first mark and the second mark of the sample respectively based on the second public key to obtain a first ciphertext mark and a second ciphertext mark;
and determining the first ciphertext mark and the second ciphertext mark as ciphertext mark information.
In some embodiments, the third determining module is further configured to:
receiving the number of first ciphertext marks and the number of second ciphertext marks sent by the trusted party;
respectively decrypting the number of the first ciphertext marks and the number of the second ciphertext marks based on the second private key to obtain the number of the first plaintext marks and the number of the second plaintext marks;
and determining the information value of the plaintext feature crossing result based on the number of the first plaintext marks and the number of the second plaintext marks.
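Once the per-bin counts are decrypted, the information value can be computed locally. The Python sketch below uses the standard IV definition, IV = sum over bins of (p_i - n_i) * ln(p_i / n_i), where p_i and n_i are each bin's share of all positive and all negative samples. It assumes the first and second plaintext mark counts are the positive and negative counts per bin; the source does not spell out its formula, and the function name is hypothetical.

```python
import math

def information_value(bin_counts):
    """bin_counts: list of (positive_count, negative_count), one pair per bin."""
    total_pos = sum(pos for pos, _ in bin_counts)
    total_neg = sum(neg for _, neg in bin_counts)
    iv = 0.0
    for pos, neg in bin_counts:
        if pos == 0 or neg == 0:
            # ln(0) is undefined; merge bins or smooth the counts beforehand.
            raise ValueError("a bin has no positive or no negative samples")
        pos_rate = pos / total_pos
        neg_rate = neg / total_neg
        iv += (pos_rate - neg_rate) * math.log(pos_rate / neg_rate)
    return iv

# Two bins: (1 positive, 2 negative) and (2 positive, 1 negative).
iv = information_value([(1, 2), (2, 1)])
print(round(iv, 4))
```

Bins whose positive and negative shares are equal contribute nothing, so a cross feature that does not separate the classes has an IV near zero.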
It should be noted that the above description of the feature crossing apparatus embodiments is similar to the description of the method embodiments and has the same beneficial effects. For technical details not disclosed in the apparatus embodiments of the present application, refer to the description of the method embodiments of the present application.
Embodiments of the present application provide a storage medium having stored therein executable instructions, which when executed by a processor, will cause the processor to perform the methods provided by embodiments of the present application, for example, the methods as illustrated in fig. 3 to 8.
In some embodiments, the storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM; or may be various devices including one or any combination of the above memories.
In some embodiments, executable instructions may be written in any form of programming language (including compiled or interpreted languages), in the form of programs, software modules, scripts or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, executable instructions may correspond, but do not necessarily have to correspond, to files in a file system, and may be stored in a portion of a file that holds other programs or data, such as in one or more scripts in a HyperText Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
By way of example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.
The above description is only an example of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present application are included in the protection scope of the present application.

Claims (12)

1. A feature crossing method, applied to a trusted participant performing feature crossing, wherein the trusted participant is used for joint training of a model, the method comprising:
generating a public key and a private key for homomorphic encryption;
distributing the public key to a first party and a second party which perform feature crossing;
obtaining at least one ciphertext feature crossing result from the second party, the ciphertext feature crossing result being obtained according to the first feature of the first party, the second feature of the second party, and the public key;
and performing homomorphic decryption on the at least one ciphertext feature crossing result based on the private key to obtain at least one plaintext feature crossing result.
2. The method of claim 1, further comprising:
binning the at least one plaintext feature crossing result to obtain at least one binning result;
acquiring, from an active participant that performs feature crossing and holds mark information, ciphertext mark information of the plaintext feature crossing results in each binning result;
determining the number of ciphertext mark information corresponding to the plaintext feature crossing result based on the ciphertext mark information;
and sending the number of the ciphertext mark information to the active party so that the active party determines the information value of the plaintext feature crossing result based on the number of the ciphertext mark information.
3. The method of claim 2, wherein the mark information comprises a first mark and a second mark, and wherein the ciphertext mark information comprises a first ciphertext mark and a second ciphertext mark;
correspondingly, the determining the number of ciphertext mark information corresponding to the plaintext feature crossing result based on the ciphertext mark information includes:
determining the number of first ciphertext marks corresponding to the plaintext feature crossing result based on the first ciphertext marks;
and determining the number of second ciphertext marks corresponding to the plaintext feature crossing result based on the second ciphertext marks.
4. The method of claim 1, wherein generating a public key and a private key for homomorphic encryption comprises:
generating a public key and a private key for additive homomorphic encryption;
or generating a public key and a private key for multiplicative homomorphic encryption.
5. A feature crossing method, applied to a second participant performing feature crossing, wherein the second participant is used for joint training of a model, the method comprising:
obtaining at least one second feature, at least one first ciphertext feature and a public key for feature crossing; the at least one first ciphertext feature is obtained by a first party performing feature crossing based on the at least one first feature and the public key;
homomorphically encrypting the at least one second feature based on the public key to obtain at least one second ciphertext feature;
performing feature crossing based on the at least one first ciphertext feature and the at least one second ciphertext feature to obtain at least one ciphertext feature crossing result;
and sending the at least one ciphertext feature crossing result to a trusted party performing feature crossing, so that the trusted party determines a plaintext feature crossing result based on the ciphertext feature crossing result.
6. The method of claim 5, wherein the homomorphic encrypting the at least one second feature based on the public key to obtain at least one second ciphertext feature comprises:
encoding the at least one second feature to obtain a second encoded value corresponding to the at least one second feature;
and homomorphically encrypting, based on the public key, the second encoded value corresponding to the at least one second feature to obtain at least one second ciphertext feature.
7. The method according to claim 6, wherein said encoding the at least one second feature to obtain a second encoded value corresponding to the at least one second feature comprises:
when a target second feature that has not yet been encoded exists among the at least one second feature, obtaining the number of encodings performed so far;
and encoding the target second feature based on the number of encodings performed so far to obtain a second encoded value of the target second feature.
8. A feature crossing apparatus, applied to a trusted participant performing feature crossing, wherein the trusted participant is used for joint training of a model, the apparatus comprising:
the first generation module is used for generating a public key and a private key for homomorphic encryption;
the first sending module is used for distributing the public key to a first party and a second party which perform feature crossing;
a first obtaining module, configured to obtain at least one ciphertext feature crossing result from the second party, where the ciphertext feature crossing result is obtained according to the first feature of the first party, the second feature of the second party, and the public key;
and a decryption module, configured to homomorphically decrypt the at least one ciphertext feature crossing result based on the private key to obtain at least one plaintext feature crossing result.
9. An apparatus for feature crossing applied to a second participant performing feature crossing for joint training of a model, the apparatus comprising:
the second acquisition module is used for acquiring at least one second feature, at least one first ciphertext feature and a public key for feature crossing; the at least one first ciphertext feature is obtained by a first party performing feature crossing based on the at least one first feature and the public key;
a first encryption module, configured to homomorphically encrypt the at least one second feature based on the public key to obtain at least one second ciphertext feature;
a feature crossing module, configured to perform feature crossing based on the at least one first ciphertext feature and the at least one second ciphertext feature to obtain at least one ciphertext feature crossing result;
and a second sending module, configured to send the at least one ciphertext feature crossing result to a trusted party performing feature crossing, so that the trusted party determines a plaintext feature crossing result based on the ciphertext feature crossing result.
10. A feature crossing apparatus, characterized in that the apparatus comprises:
a memory for storing executable instructions;
a processor for implementing the method of any one of claims 1 to 4 or claims 5 to 7 when executing executable instructions stored in the memory.
11. A computer-readable storage medium having stored thereon executable instructions for causing a processor, when executed, to implement the method of any one of claims 1 to 4 or claims 5 to 7.
12. A computer program product comprising a computer program, characterized in that the computer program realizes the method of any of claims 1 to 4 or claims 5 to 7 when executed by a processor.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011552619.5A CN112668046A (en) 2020-12-24 2020-12-24 Feature interleaving method, apparatus, computer-readable storage medium, and program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011552619.5A CN112668046A (en) 2020-12-24 2020-12-24 Feature interleaving method, apparatus, computer-readable storage medium, and program product

Publications (1)

Publication Number Publication Date
CN112668046A true CN112668046A (en) 2021-04-16

Family

ID=75408500

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011552619.5A Pending CN112668046A (en) 2020-12-24 2020-12-24 Feature interleaving method, apparatus, computer-readable storage medium, and program product

Country Status (1)

Country Link
CN (1) CN112668046A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113392422A (en) * 2021-08-16 2021-09-14 华控清交信息科技(北京)有限公司 Data processing method and device and data processing device
CN113542228A (en) * 2021-06-18 2021-10-22 腾讯科技(深圳)有限公司 Data transmission method and device based on federal learning and readable storage medium
CN113657615A (en) * 2021-09-02 2021-11-16 京东科技信息技术有限公司 Method and device for updating federal learning model
CN113657614A (en) * 2021-09-02 2021-11-16 京东科技信息技术有限公司 Method and device for updating federal learning model
CN114398671A (en) * 2021-12-30 2022-04-26 翼健(上海)信息科技有限公司 Privacy calculation method, system and readable storage medium based on feature engineering IV value

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113542228A (en) * 2021-06-18 2021-10-22 腾讯科技(深圳)有限公司 Data transmission method and device based on federal learning and readable storage medium
CN113392422A (en) * 2021-08-16 2021-09-14 华控清交信息科技(北京)有限公司 Data processing method and device and data processing device
CN113657615A (en) * 2021-09-02 2021-11-16 京东科技信息技术有限公司 Method and device for updating federal learning model
CN113657614A (en) * 2021-09-02 2021-11-16 京东科技信息技术有限公司 Method and device for updating federal learning model
CN113657615B (en) * 2021-09-02 2023-12-05 京东科技信息技术有限公司 Updating method and device of federal learning model
CN113657614B (en) * 2021-09-02 2024-03-01 京东科技信息技术有限公司 Updating method and device of federal learning model
CN114398671A (en) * 2021-12-30 2022-04-26 翼健(上海)信息科技有限公司 Privacy calculation method, system and readable storage medium based on feature engineering IV value


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination