CN114564730A - Symmetric encryption-based federated grouping statistic calculation method, device and medium

Publication number
CN114564730A
CN114564730A CN202210163226.8A CN202210163226A CN114564730A CN 114564730 A CN114564730 A CN 114564730A CN 202210163226 A CN202210163226 A CN 202210163226A CN 114564730 A CN114564730 A CN 114564730A
Authority
CN
China
Prior art keywords
user
participant
information
party
intersection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210163226.8A
Other languages
Chinese (zh)
Inventor
朱帆
孟丹
傅致晖
李晓林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Bodun Xiyan Technology Co ltd
Original Assignee
Hangzhou Bodun Xiyan Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Bodun Xiyan Technology Co ltd
Priority to CN202210163226.8A
Publication of CN114564730A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0618 Block ciphers, i.e. encrypting groups of characters of a plain text message using fixed encryption transformation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08 Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0861 Generation of secret information including derivation or calculation of cryptographic keys or passwords

Abstract

The disclosure provides a federated grouping statistic calculation method, device and medium based on symmetric encryption. The method comprises: grouping a first user set according to grouping information to obtain a plurality of user group sets; determining the federated intersection of each user group set and a second user set through a secure intersection algorithm; determining, based on the federated intersection, statistic information of the second user features belonging to the federated intersection; encrypting, by the second participant, the statistic information according to predetermined key information and an encryption algorithm, and sending the encrypted result to the first participant; and decrypting, by the first participant, the encrypted result with the same key information as the second participant to determine the statistic information of each user group set. The grouping information and the feature information held by different participants are never transmitted, directly or indirectly, to other organizations, so the data privacy of each participant is fully protected and related regulatory requirements can be met.

Description

Symmetric encryption-based federated grouping statistic calculation method, device and medium
Technical Field
The disclosure relates to the technical field of privacy-preserving computation, and in particular to a federated grouping statistic calculation method, device and medium based on symmetric encryption.
Background
In federated learning, it is common for participant A to hold data grouping information while participant B holds data feature information. To calculate statistics over the feature information held by participant B, that feature information must be grouped according to the grouping information of participant A, and the statistics of each group must then be calculated, so as to meet the needs of subsequent federated modeling or calculation.
To meet the data privacy requirements of federated learning, participant A must not reveal the data grouping information to participant B, and participant B must not reveal the feature information to participant A.
Therefore, when the grouping information and the feature information are held by different participants, the data privacy of every participant must be protected at the same time. How to ensure, when grouping statistics are calculated in a federated learning scenario, that the grouping information and feature information of any participant are not directly or indirectly exchanged with or disclosed to other participants remains a core problem to be solved by those skilled in the art.
The information disclosed in this background section of the application is only for enhancement of understanding of the general background of the application and should not be taken as an acknowledgement or any form of suggestion that this information forms the prior art that is already known to a person skilled in the art.
Disclosure of Invention
The embodiments of the disclosure provide a federated grouping statistic calculation method, device and medium based on symmetric encryption. When grouping statistics are calculated in a federated learning scenario, the grouping information and feature information of any participant are not directly or indirectly exchanged with or disclosed to other participants, so the data privacy of each participant is fully protected and related regulatory requirements can be met.
In a first aspect of the embodiments of the present disclosure,
a federated grouping statistic calculation method based on symmetric encryption is provided,
the method being applied to a plurality of participants, the participants including a first participant and a second participant, the first participant storing user grouping information and the second participant storing user feature information, the method comprising:
grouping a first user set corresponding to the first participant according to the user grouping information to obtain a plurality of user group sets;
the first participant determining, through a secure intersection algorithm, a federated intersection of each user group set and a second user set of the second participant;
the second participant determining, based on the federated intersection, the second user features belonging to the intersection, and calculating user feature statistics within the intersection to obtain statistic information of each user group set, wherein the second user features comprise feature information of the second users corresponding to the second user set;
the second participant encrypting the statistic information of each user group set according to predetermined key information and an encryption algorithm, and sending the encrypted result to the first participant;
and the first participant decrypting the encrypted result with the same key information as the second participant to determine the statistic information of each user group set.
In an optional embodiment, before determining the federated intersection of each user group set and the second user set, the method further comprises:
the first participant and the second participant using the same data desensitization algorithm to desensitize, respectively, each user group set corresponding to the first participant and the second user set corresponding to the second participant.
In an optional embodiment,
determining the federated intersection of each user group set and the second user set comprises:
the first participant performing pairwise private comparison between each first element in each user group set and each second element in the second user set of the second participant, and thereby determining the federated intersection of each user group set and the second user set.
In an optional embodiment,
before the first participant performs pairwise private comparison between the first element in each user group set and the second element in the second user set of the second participant, the method further comprises:
the first participant and the second participant encoding the first element and the second element, respectively, into binary codes of length n to obtain a first binary code and a second binary code;
wherein n is a coding length jointly negotiated by the first participant and the second participant.
In an optional embodiment,
the first participant performing pairwise private comparison between the first element in each user group set and the second element in the second user set of the second participant comprises:
the first participant randomly generating, for each bit in the first binary code, two first random binary strings of the same length;
the first participant selecting, according to the value of each bit in the first binary code, the corresponding random binary string from the two first random binary strings to obtain a first selected binary string;
the second participant running, for each bit in the second binary code, an oblivious transfer protocol with the first participant over the two first random binary strings to obtain a second selected binary string;
the first participant XORing together all elements of the first selected binary string to obtain a first XOR binary string;
the second participant XORing together all elements of the second selected binary string to obtain a second XOR binary string;
the first participant sending the first XOR binary string to the second participant, and the second participant sending the second XOR binary string to the first participant;
the pairwise private comparison being achieved by comparing the first XOR binary string with the second XOR binary string.
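The comparison steps above can be simulated locally as a minimal sketch. The oblivious transfer is replaced by a plain function call, so the sketch shows only the data flow and offers no privacy; the names `encode`, `xor_all`, `oblivious_transfer` and `private_equal`, the 16-bit coding length and the 16-byte string length are all assumptions made for illustration and do not come from the disclosure.

```python
import secrets

N_BITS = 16    # coding length n negotiated by both participants (assumed value)
STR_LEN = 16   # byte length of every random binary string (assumed value)


def encode(element, n=N_BITS):
    """Encode a non-negative integer below 2**n as n bits, most significant first."""
    return [(element >> (n - 1 - i)) & 1 for i in range(n)]


def xor_all(strings):
    """XOR a list of equal-length byte strings into one byte string."""
    acc = bytes(len(strings[0]))
    for s in strings:
        acc = bytes(a ^ b for a, b in zip(acc, s))
    return acc


def oblivious_transfer(pair, choice_bit):
    """Local stand-in for 1-out-of-2 oblivious transfer (NOT secure).

    A real protocol hides choice_bit from the sender and hides the
    unchosen string from the receiver.
    """
    return pair[choice_bit]


def private_equal(x, y):
    """Return True if the first participant's x equals the second participant's y."""
    x_bits, y_bits = encode(x), encode(y)
    # First participant: two random strings for every bit position.
    pairs = [(secrets.token_bytes(STR_LEN), secrets.token_bytes(STR_LEN))
             for _ in x_bits]
    # First participant selects one string per position according to its own bits.
    first_selected = [pair[bit] for pair, bit in zip(pairs, x_bits)]
    # Second participant obtains one string per position through oblivious transfer.
    second_selected = [oblivious_transfer(pair, bit)
                       for pair, bit in zip(pairs, y_bits)]
    # Each side XORs its strings; the two results are exchanged and compared.
    return xor_all(first_selected) == xor_all(second_selected)
```

If the two elements agree on every bit, both sides select identical strings and the XOR results match. If they differ on any bit, the results match only when the random strings at the differing positions happen to cancel, which for 16-byte strings is negligibly unlikely.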
In an optional embodiment,
before the second participant encrypts the statistic information according to the predetermined key information and the encryption algorithm, the method further comprises:
the first participant and the second participant determining a first private key and a second private key, respectively;
the first participant and the second participant determining and disclosing a first public key and a second public key, respectively, from the first private key and the second private key, based on a public key generation algorithm and global parameters disclosed in advance;
the first participant determining a shared key based on the first private key, the first public key, and the global parameters;
the second participant determining the shared key based on the second private key, the second public key, and the global parameters.
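The source does not name the public key generation algorithm. As a hedged illustration, the sketch below assumes a Diffie-Hellman-style agreement, in which each participant combines its own private key with the public key disclosed by the other participant. The modulus `P`, the base `G` and all function names are assumptions chosen for readability; the parameters are far too small for real use.

```python
import hashlib
import secrets

# Global parameters disclosed in advance (assumed values).
# 2**127 - 1 is a Mersenne prime; real deployments use much larger groups.
P = 2**127 - 1
G = 3


def generate_private_key():
    """Pick a random private exponent in the range [2, P - 2]."""
    return secrets.randbelow(P - 3) + 2


def public_key(private_key):
    """Derive the public key G^private_key mod P."""
    return pow(G, private_key, P)


def shared_key(own_private, peer_public):
    """Derive a 32-byte symmetric key from the Diffie-Hellman shared secret."""
    secret = pow(peer_public, own_private, P)
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()


first_private = generate_private_key()
second_private = generate_private_key()
first_public = public_key(first_private)     # disclosed by the first participant
second_public = public_key(second_private)   # disclosed by the second participant

first_shared = shared_key(first_private, second_public)
second_shared = shared_key(second_private, first_public)
```

Both participants arrive at the same 32-byte value without ever transmitting it, which is what allows the later encryption and decryption steps to use "the same key information".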
In a second aspect of the embodiments of the present disclosure,
there is provided a federated grouping statistic calculation method based on symmetric encryption, applied to a first participant, the method comprising:
grouping a first user set corresponding to the first participant according to user grouping information to obtain a plurality of user group sets;
determining, through a secure intersection algorithm, the federated intersection of each user group set and the obtained second user set of the second participant;
acquiring the encrypted information sent by the second participant, decrypting the encrypted information with the same key as the second participant, and determining the statistic information of each user group set,
wherein the encrypted information sent by the second participant is determined according to the federated intersection, a predetermined key, the statistic information and an encryption algorithm.
In a third aspect of the embodiments of the present disclosure,
there is provided a federated grouping statistic calculation method based on symmetric encryption, applied to a second participant, the method comprising:
determining, through a secure intersection algorithm, a federated intersection of the second user set of the second participant and each user group set of the first participant;
determining, based on the federated intersection, the second user features belonging to elements of the federated intersection, and calculating user feature statistics within the federated intersection to obtain statistic information of each user group set, wherein the second user features comprise feature information of the second users corresponding to the second user set;
and the second participant encrypting the statistic information of each user group set according to predetermined key information and an encryption algorithm, and sending the encrypted result to the first participant so that the first participant can determine the statistic information of each user group set.
In a fourth aspect of the embodiments of the present disclosure,
there is provided a federated grouping statistic calculation apparatus based on symmetric encryption,
the apparatus being applied to a plurality of participants, the participants including a first participant storing user grouping information and a second participant storing user feature information, the apparatus comprising:
a first unit, configured to group a first user set corresponding to the first participant according to the user grouping information to obtain a plurality of user group sets;
a second unit, configured for the first participant to determine, through a secure intersection algorithm, a federated intersection of each user group set and a second user set of the second participant;
a third unit, configured for the second participant to determine, based on the federated intersection, the second user features belonging to the intersection, and to calculate user feature statistics within the intersection to obtain statistic information of each user group set, wherein the second user features comprise feature information of the second users corresponding to the second user set;
a fourth unit, configured for the second participant to encrypt the statistic information of each user group set according to predetermined key information and an encryption algorithm, and to send the encrypted result to the first participant;
a fifth unit, configured for the first participant to decrypt the encrypted result with the same key information as the second participant and determine the statistic information of each user group set.
In a fifth aspect of the embodiments of the present disclosure,
there is provided a federated grouping statistic calculation apparatus based on symmetric encryption, the apparatus being applied to a first participant, the apparatus comprising:
a sixth unit, configured to group a first user set corresponding to the first participant according to user grouping information to obtain a plurality of user group sets;
a seventh unit, configured to determine, through a secure intersection algorithm, the federated intersection of each user group set and the obtained second user set of the second participant;
an eighth unit, configured to acquire the encrypted information sent by the second participant, decrypt the encrypted information with the same key as the second participant, and determine the statistic information of each user group set,
wherein the encrypted information sent by the second participant is determined according to the federated intersection, a predetermined key, the statistic information and an encryption algorithm.
In a sixth aspect of the embodiments of the present disclosure,
there is provided a federated grouping statistic calculation apparatus based on symmetric encryption, the apparatus being applied to a second participant, the apparatus comprising:
a ninth unit, configured to determine, through a secure intersection algorithm, a federated intersection of the second user set of the second participant and each user group set of the first participant;
a tenth unit, configured to determine, based on the federated intersection, the second user features belonging to elements of the federated intersection, and to calculate user feature statistics within the intersection to obtain statistic information of each user group set, wherein the second user features comprise feature information of the second users corresponding to the second user set;
an eleventh unit, configured for the second participant to encrypt the statistic information of each user group set according to predetermined key information and an encryption algorithm, and to send the encrypted result to the first participant so that the first participant can determine the statistic information of each user group set.
In a seventh aspect of the embodiments of the present disclosure,
provided is an electronic device including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to invoke the memory-stored instructions to perform the method of any of the preceding claims.
In an eighth aspect of the embodiments of the present disclosure,
there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the method of any of the preceding claims.
The embodiments of the disclosure provide a federated grouping statistic calculation method based on symmetric encryption,
the method being applied to a plurality of participants, the participants including a first participant and a second participant, the first participant storing user grouping information and the second participant storing user feature information, the method comprising:
grouping a first user set corresponding to the first participant according to the user grouping information to obtain a plurality of user group sets;
Grouping the first user information of the first participant allows the intersection of each group with the second user information of the second participant to be determined accurately, which reduces the amount of computation, yields finer-grained analysis data, and meets the needs of subsequent modeling or calculation.
The first participant determines, through a secure intersection algorithm, the federated intersection of each user group set and a second user set of the second participant;
The secure intersection algorithm allows two parties, each holding a set, to jointly compute the intersection of the two sets: one or both parties obtain the correct intersection, and neither learns anything about the other party's set beyond the intersection.
The second participant determines, based on the federated intersection, the second user features belonging to the intersection, and calculates user feature statistics within the intersection to obtain statistic information of each user group set, wherein the second user features comprise feature information of the second users corresponding to the second user set;
the second participant encrypts the statistic information of each user group set according to predetermined key information and an encryption algorithm, and sends the encrypted result to the first participant;
and the first participant decrypts the encrypted result with the same key information as the second participant and determines the statistic information of each user group set.
By combining a symmetric encryption algorithm with a secure intersection algorithm, the first participant and the second participant make full use of public information to encrypt and decrypt, ensuring that when grouping statistics are calculated in a federated learning scenario, the grouping information and feature information of any participant are not directly or indirectly exchanged with or disclosed to other participants.
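The overall flow summarized above can be sketched end to end under strong simplifications. In this sketch a plain set intersection stands in for the secure intersection algorithm, and a toy XOR keystream stands in for a real symmetric cipher such as AES; neither stand-in is secure. The function names, the sample data and the choice of count and mean as statistics are assumptions made for illustration only.

```python
import hashlib
import json


def keystream_xor(key, data):
    """Toy symmetric cipher: XOR data with a SHA-256 counter keystream.

    Applying it twice with the same key restores the input. Illustration
    only; a real deployment would use a vetted cipher instead.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + counter).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def group_statistics(group_sets, features):
    """Second participant: count and mean of one feature per group intersection."""
    stats = {}
    for group_id, members in group_sets.items():
        common = members & set(features)   # stand-in for the secure intersection
        values = [features[user] for user in sorted(common)]
        stats[group_id] = {
            "count": len(values),
            "mean": sum(values) / len(values) if values else None,
        }
    return stats


# Hypothetical data: group sets of the first participant, features of the second.
group_sets = {"group_1": {"u1", "u2", "u9"}, "group_2": {"u2", "u3"}}
features = {"u1": 10.0, "u2": 20.0, "u3": 60.0, "u4": 5.0}
key = hashlib.sha256(b"key agreed in advance").digest()

# Second participant encrypts the statistics; first participant decrypts them.
plaintext = json.dumps(group_statistics(group_sets, features)).encode()
ciphertext = keystream_xor(key, plaintext)
recovered = json.loads(keystream_xor(key, ciphertext).decode())
```

Only aggregated values per group cross the boundary between the participants; the per-user feature values never leave the second participant in this sketch.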
Drawings
FIG. 1 is a schematic flow chart of a first federated grouping statistic calculation method based on symmetric encryption according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart of a second federated grouping statistic calculation method based on symmetric encryption according to an embodiment of the present disclosure;
FIG. 3 is a schematic flow chart of a third federated grouping statistic calculation method based on symmetric encryption according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a first federated grouping statistic calculation device based on symmetric encryption according to an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of a second federated grouping statistic calculation device based on symmetric encryption according to an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of a third federated grouping statistic calculation device based on symmetric encryption according to an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the technical solutions in the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present disclosure and in the drawings described above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein.
It should be understood that, in various embodiments of the present disclosure, the sequence numbers of the processes do not mean the execution sequence, and the execution sequence of the processes should be determined by the functions and the inherent logic of the processes, and should not constitute any limitation on the implementation process of the embodiments of the present disclosure.
It should be understood that in the present disclosure, "including" and "having" and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in the present disclosure, "plurality" means two or more. "And/or" merely describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "Comprises A, B and C" and "comprises A, B, C" mean that all of A, B and C are comprised; "comprises A, B or C" means that one of A, B and C is comprised; "comprises A, B and/or C" means that any one, any two, or all three of A, B and C are comprised.
It should be understood that in this disclosure, "B corresponding to A", "A corresponds to B", or "B corresponds to A" means that B is associated with A and that B can be determined from A. Determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information. The matching of A and B means that the similarity of A and B is greater than or equal to a preset threshold.
As used herein, "if" may be interpreted as "when", "upon", "in response to determining", or "in response to detecting", depending on the context.
The technical solution of the present disclosure is explained in detail below with specific examples. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.
FIG. 1 is a schematic flow chart of a first federated grouping statistic calculation method based on symmetric encryption according to an embodiment of the present disclosure. The method is applied to a plurality of participants, including a first participant that stores user grouping information and a second participant that stores user feature information. As shown in FIG. 1, the method includes:
S101, grouping a first user set corresponding to the first participant according to the user grouping information to obtain a plurality of user group sets;
S102, the first participant determining, through a secure intersection algorithm, the federated intersection of each user group set and a second user set of the second participant;
Illustratively, to satisfy both data privacy protection and federated calculation, the embodiments of the disclosure calculate federated grouping statistics based on private set intersection and symmetric encryption. Private set intersection (PSI) means that parties each holding data obtain the intersection of the data held by both without revealing any additional information.
Private set intersection is highly practical in real scenarios. For example, when a new user registers with a social application, private set intersection is performed between the user's address book contacts and the application's registered users. This reveals which contacts have already registered, enabling friend discovery and recommendation for the new user, while no information about users outside the intersection is disclosed to either the user or the application, so the privacy of other users is protected.
It should be noted that the embodiments of the present disclosure may be applied to a plurality of participants. For convenience of description, the embodiments are described with two participants, where P1 denotes the first participant and P2 denotes the second participant.
P1 holds the data grouping information, illustratively as shown in Table 1 below:
Table 1: Group IDs and user ID sets of P1
Group ID    Group user ID set
group_1     {user_id_1_1, user_id_1_5, …}
group_2     {user_id_1_3, user_id_1_10, …}
group_3     {user_id_1_5, user_id_1_n, …}
group_t     {user_id_1_2, user_id_1_12, …}
Participant P1 has n users {user_id_1_1, user_id_1_2, …, user_id_1_n}, divided into t groups group_1, group_2, …, group_t, whose user ID sets are {user_id_1_1, user_id_1_5, …}, {user_id_1_3, user_id_1_10, …}, …, {user_id_1_2, user_id_1_12, …}, respectively. The user ID sets of different groups are allowed to intersect; in the example above, group_1 and group_3 both contain user_id_1_5.
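The grouping step can be pictured as building one set of user IDs per group. The following minimal sketch assumes the grouping information arrives as (user ID, group ID) pairs; that input shape and the name `build_group_sets` are assumptions, since the disclosure does not fix a storage format.

```python
from collections import defaultdict


def build_group_sets(assignments):
    """Turn (user_id, group_id) pairs into a mapping of group_id to user ID set."""
    group_sets = defaultdict(set)
    for user_id, group_id in assignments:
        group_sets[group_id].add(user_id)
    return dict(group_sets)


# A user may appear in more than one group, as user_id_1_5 does here.
assignments = [
    ("user_id_1_1", "group_1"),
    ("user_id_1_5", "group_1"),
    ("user_id_1_3", "group_2"),
    ("user_id_1_10", "group_2"),
    ("user_id_1_5", "group_3"),
]
group_sets = build_group_sets(assignments)
overlap = group_sets["group_1"] & group_sets["group_3"]
```

Using sets keeps duplicate assignments harmless and makes the overlap between groups directly computable.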
P2 holds the data feature information, illustratively as shown in Table 2 below:
Table 2: User IDs and data feature information
User ID        feature_1    feature_2    feature_j
user_id_2_1    f_1_1        f_2_1        f_j_1
user_id_2_2    f_1_2        f_2_2        f_j_2
user_id_2_3    f_1_3        f_2_3        f_j_3
user_id_2_m    f_1_m        f_2_m        f_j_m
Participant P2 has m users {user_id_2_1, user_id_2_2, …, user_id_2_m} and holds the feature data [feature_1, feature_2, …, feature_j] of these m users, where the feature data of each user has j dimensions.
It is to be understood that tables 1 and 2 above are merely illustrative of P1 and P2 and are not intended to limit P1 and P2 in practice. Illustratively, for ease of understanding, P1 and P2 are further illustrated with specific data:
For example, suppose P1 has 2000 users together with their grouping information. The users of P1 may be divided into 20 groups, and the corresponding user grouping information is shown in Table 3 below:
Table 3: User grouping information of P1
Group ID    Group user ID set
group_1     {A_1, A_100, …}
group_2     {A_3, A_20, …}
group_3     {A_5, A_2000, …}
group_20    {A_2, A_120, …}
As can be seen, P1 has 20 groups, group_1 through group_20; taking group_1 as an example, the corresponding user ID set is {A_1, A_100, …}. Grouping the first user information of the first participant allows the intersection of each group with the second user information of the second participant to be determined accurately, which reduces the amount of computation, yields finer-grained analysis data, and meets the needs of subsequent modeling or calculation.
Illustratively, suppose P2 has 1000 users together with their feature information, as shown in Table 4 below:
Table 4: User IDs and 10-dimensional feature information of P2
[Table 4 is provided only as an image in the original publication and is not reproduced here.]
According to the user grouping information of P1 and the second user information of P2, the intersection information of the user grouping information and the second user information can be determined through a security intersection algorithm.
In practical applications, the first user information and the second user information include partial data related to privacy, and in order to ensure privacy security, data desensitization may be performed on the first user information and the second user information.
In an optional implementation, before determining the intersection information of the user grouping information and the second user information, the method of the present disclosure further includes:
the first participant and the second participant respectively performing data desensitization on the first user information corresponding to the first participant and the second user information corresponding to the second participant, using the same data desensitization algorithm.
Data desensitization is performed on the user IDs in the user grouping information and in the second user information. P1 and P2 may desensitize the original user IDs using the same desensitization rule, for example the SHA-256 hash function. It should be noted that this desensitization algorithm is only exemplary; the embodiments of the present disclosure do not specifically limit the desensitization algorithm.
After desensitization, the local user ID set of P1 is PA_User = {en_A_1, en_A_2, …, en_A_2000}, and the local user ID set of P2 is PB_User = {en_B_1, en_B_2, …, en_B_1000}. For example, if the user ID is 'admin', after data desensitization with SHA-256 the user ID is converted into:
8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918, so that the user's original ID is not leaked.
PA_User is then grouped according to the original grouping information group_1, group_2, …, group_20, giving the desensitized user ID set of each group: PA_group_1 = {en_A_1, en_A_100, …}, PA_group_2 = {en_A_3, en_A_20, …}, …, PA_group_20 = {en_A_2, en_A_120, …}.
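The desensitization and regrouping described above can be sketched in a few lines of Python. This is a minimal illustration only: the helper name `desensitize`, the dictionary `groups`, and the sample IDs are hypothetical, and SHA-256 over the UTF-8 bytes of the ID is assumed as the shared desensitization rule.

```python
import hashlib


def desensitize(user_id: str) -> str:
    """Return the SHA-256 hex digest of a user ID (UTF-8 encoded)."""
    return hashlib.sha256(user_id.encode("utf-8")).hexdigest()


# Hypothetical grouping information held by P1 (IDs are illustrative only).
groups = {
    "group_1": ["A_1", "A_100"],
    "group_2": ["A_3", "A_20"],
}

# Desensitized user ID set of each group, playing the role of PA_group_1, PA_group_2, ...
pa_groups = {
    name: {desensitize(uid) for uid in ids} for name, ids in groups.items()
}

admin_digest = desensitize("admin")
print(admin_digest)
```

Because both participants apply the same deterministic rule, equal original IDs map to equal digests, which is what allows the later intersection step to operate on desensitized IDs.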
In an alternative embodiment, the method for determining the federal intersection of each user group set with the second user set includes:
the first participant performing pairwise privacy comparison between the first element in each user group set and the second element in the second user set of the second participant, and determining the federal intersection of each user group set and the second user set.
In an alternative embodiment, before the first participant performs pairwise privacy comparison between the first element in each user group set and the second element in the second user set of the second participant, the method further comprises:
the first participant and the second participant respectively encoding the first element and the second element into binary codes of length X, obtaining a first binary code and a second binary code respectively;
wherein X is the coding length negotiated and agreed by the first participant and the second participant.
In an optional embodiment, the method for the first participant to perform pairwise privacy comparison between the first element in each user grouping set and the second element in the second user set of the second participant includes:
the first participant randomly generates two first random binary strings with the same length for each bit in the first binary code;
the first participant selects a corresponding random binary string from the two first random binary strings according to each digit value in the first binary code to obtain a first selected binary string;
the second participant runs the oblivious transfer (OT) protocol with the first participant on each bit in the second binary code, using that bit to choose between the two first random binary strings, to obtain a second selected binary string;
the first participant performs exclusive-or operation on each element in the first selected binary string to obtain a first exclusive-or binary string;
the second participant performs exclusive-or operation on each element in the second selected binary string to obtain a second exclusive-or binary string;
the first participant sending the first xor binary string to a second participant, the second participant sending the second xor binary string to the first participant;
a pairwise privacy comparison is achieved by comparing the first xor binary string and the second xor binary string.
For example, the first participant may randomly generate two random binary strings of the same length for each bit in the first binary code, so that each first binary code corresponds to 2 × X random binary strings. The first binary code is generated by encoding a first element of the first participant into a binary code of length X, and the first element of the first participant can be used to look up the grouping information of the corresponding user.
Illustratively, the second participant may run the oblivious transfer protocol on each bit in the second binary code with the first random binary strings of the corresponding bit position to determine a second selected binary string. The second binary code is generated by the second participant encoding a second element into a binary code of length X, and the second element of the second participant can be used to index the feature information of the corresponding user.
Optionally, the first binary code of the first participant may be 1101, and the second binary code of the second participant may be 1010, and it should be noted that the above specific values are merely exemplary, and the specific values of the global parameter, the first binary code, and the second binary code are not limited in this disclosure.
The first participant can randomly generate two first random binary strings with the same length for each bit of 1101, for example, the two random binary strings corresponding to each bit of 1101 are [110,010], [100,110], [001,010], [011,111 ];
the second participant may run the oblivious transfer protocol on each bit of 1010 with the corresponding first random binary strings of the first participant to obtain a second selected binary string, for example:
OT([110,010],1)=010;
OT([100,110],0)=100;
OT([001,010],1)=010;
OT([011,111],0)=011;
wherein OT denotes an oblivious transfer protocol operation;
the second participant thus obtains 010, 100, 010, 011 as the second selected binary string and performs an exclusive-or operation over its elements, that is, 010 ⊕ 100 ⊕ 010 ⊕ 011 = 111, so the second exclusive-or binary string is 111;
the first participant selects 010, 110, 001, 111 from [110,010], [100,110], [001,010], [011,111] according to each bit of 1101 as the first selected binary string and performs an exclusive-or operation over its elements, that is, 010 ⊕ 110 ⊕ 001 ⊕ 111 = 010, so the first exclusive-or binary string is 010;
as can be seen from the above, the first exclusive-or binary string is 010 and the second exclusive-or binary string is 111. Since the two are not equal, it can be determined that the two original values are not equal, and one privacy comparison is thus completed.
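The worked example above can be replayed with a short Python sketch. It is a plain simulation under stated assumptions: `ot_select` is a hypothetical stand-in that simply indexes into the pair, whereas a real oblivious transfer protocol would deliver the chosen string without revealing the choice bit to the sender; the function names and `random_pairs` helper are illustrative and not taken from the disclosure.

```python
from functools import reduce
import secrets


def random_pairs(x, width):
    """Generate, for each of x bit positions, two random bit strings of equal length."""
    def rand():
        return format(secrets.randbits(width), f"0{width}b")
    return [(rand(), rand()) for _ in range(x)]


def xor_all(strings):
    """XOR equal-length bit strings together and return the result as a bit string."""
    width = len(strings[0])
    value = reduce(lambda acc, s: acc ^ int(s, 2), strings, 0)
    return format(value, f"0{width}b")


def ot_select(pair, choice_bit):
    """Stand-in for 1-out-of-2 oblivious transfer: returns pair[choice_bit]."""
    return pair[int(choice_bit)]


def private_compare(code_1, code_2, pairs):
    """Return (first XOR string, second XOR string, whether they are equal)."""
    first_selected = [pair[int(b)] for pair, b in zip(pairs, code_1)]
    second_selected = [ot_select(pair, b) for pair, b in zip(pairs, code_2)]
    first_xor = xor_all(first_selected)
    second_xor = xor_all(second_selected)
    return first_xor, second_xor, first_xor == second_xor


# Values taken from the worked example: codes 1101 and 1010, four string pairs.
pairs = [("110", "010"), ("100", "110"), ("001", "010"), ("011", "111")]
first_xor, second_xor, equal = private_compare("1101", "1010", pairs)
print(first_xor, second_xor, equal)
```

When the two codes are identical, both sides select the same strings and the exclusive-or results necessarily match. With very short random strings, two different codes could by chance yield the same result, so longer strings make such collisions less likely.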
S103, the second participant determines second user features belonging to the intersection based on the federal intersection, and calculates user feature statistics in the intersection to obtain statistic information of each user grouping set;
the second user characteristics comprise characteristic information of a second user corresponding to the second user set information.
Illustratively, a statistic is a quantity computed from sample data and used to estimate and infer the numerical characteristics of a population; common statistics include the sample mean, the sample variance, and the like. Grouping statistics refers to computing descriptive statistics after grouping the samples, such as the sum, mean, or maximum within each group.
Alternatively, the second participant may calculate statistic information for the second user features belonging to the intersection obtained by the secure intersection algorithm; for example, the maximum value, the sum, or the average value of a feature over all users within the intersection may be calculated.
It should be noted that a federal intersection is the intersection of the second user set with one user group set; that is, each user group set obtains its own federal intersection, and the statistic information calculated on that basis is the statistic information corresponding to that user group set.
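As a rough illustration of step S103, the following Python sketch computes per-group statistics on the second participant's side. The data, the single feature column, and the function name `group_statistics` are all hypothetical; the federal intersections are assumed to have been produced already by the intersection step.

```python
from statistics import mean

# Hypothetical desensitized IDs and one feature column held by P2.
p2_features = {"en_1": 3.0, "en_2": 5.0, "en_3": 10.0, "en_4": 1.0}

# Hypothetical federal intersections, one per user group set of P1.
intersections = {
    "group_1": {"en_1", "en_3"},
    "group_2": {"en_2"},
    "group_3": set(),
}


def group_statistics(features, intersections):
    """Compute count, sum, mean and max of the feature within each intersection."""
    stats = {}
    for group, ids in intersections.items():
        values = [features[i] for i in sorted(ids)]
        if not values:
            stats[group] = None  # empty intersection: nothing to aggregate
            continue
        stats[group] = {
            "count": len(values),
            "sum": sum(values),
            "mean": mean(values),
            "max": max(values),
        }
    return stats


stats = group_statistics(p2_features, intersections)
print(stats)
```

Only these aggregated values, not the individual feature values, would then be encrypted and sent to the first participant.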
S104, the second party encrypts the statistic information of each user group set according to predetermined key information and an encryption algorithm, and sends an encrypted result to the first party;
S105, the first party decrypts the encrypted result according to the same key information as the second party, and determines the statistic information of each user group set.
For example, the embodiments of the present disclosure may adopt a symmetric encryption algorithm, that is, a single-key cryptosystem in which the same key is used for both encryption and decryption. "Symmetric" means that both parties use the same key to encrypt and to decrypt.
Before data transmission, the sender and the receiver agree on a secret key in advance, and each stores the key. In the symmetric encryption process, the sender processes the plaintext (the original data) with the encryption algorithm and the key, turning it into ciphertext that is then sent out. After receiving the ciphertext, the receiver must decrypt it with the same key and the inverse of the same algorithm to recover the readable plaintext.
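The round trip of steps S104 and S105 can be sketched as follows. The disclosure does not name a specific cipher, so this example substitutes a toy stream cipher built from SHA-256 in counter mode purely to show that one shared key both encrypts and decrypts; it is an assumption for illustration, not a vetted construction, and a real deployment would use a standard cipher from an audited library. All names (`keystream`, `encrypt`, `decrypt`, `shared_key`) are hypothetical.

```python
import hashlib
import json
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from key and nonce (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Prefix a fresh 16-byte nonce and XOR the plaintext with the keystream."""
    nonce = secrets.token_bytes(16)
    ks = keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ k for p, k in zip(plaintext, ks))


def decrypt(key: bytes, message: bytes) -> bytes:
    """Split off the nonce and XOR with the same keystream to recover the plaintext."""
    nonce, body = message[:16], message[16:]
    ks = keystream(key, nonce, len(body))
    return bytes(c ^ k for c, k in zip(body, ks))


# Hypothetical pre-agreed key and statistic information of one user group set.
shared_key = hashlib.sha256(b"pre-agreed secret").digest()
stats = {"group_1": {"mean": 6.5, "max": 10.0}}

ciphertext = encrypt(shared_key, json.dumps(stats).encode("utf-8"))   # second participant
recovered = json.loads(decrypt(shared_key, ciphertext).decode("utf-8"))  # first participant
print(recovered)
```

The sketch provides confidentiality only in the toy sense shown; it does not authenticate the message, which a production scheme would normally also do.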
In an alternative embodiment, before the second party encrypts the statistic information according to the predetermined key information and the encryption algorithm, the method further includes:
the first and second parties determine first and second private keys, respectively;
the first party and the second party respectively determine and disclose a first public key and a second public key according to the first private key and the second private key and based on a public key generation algorithm and global parameters which are disclosed in advance;
the first party determining a shared key based on the first private key, the first public key, and the global parameter;
the second party determines the shared key based on the second private key, the second public key, and the global parameter.
Illustratively, the first participant and the second participant may determine a first private key and a second private key in advance, where the first private key may be denoted XA and the second private key may be denoted XB. In addition, global parameters may be disclosed in advance; the global parameters may include a prime number q and an integer a, where a is a primitive root of q.
Optionally, the first participant and the second participant may use the same public key generation algorithm: the first participant computes its public key as YA = a^XA mod q, and the second participant computes its public key as YB = a^XB mod q.
The first participant generates the shared key as K = YB^XA mod q. Likewise, the second participant generates the shared key as K = YA^XB mod q. The two formulas produce the same result, since both equal a^(XA·XB) mod q, so the two parties in effect exchange the same key.
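The key agreement formulas above can be checked numerically with a small Python sketch. The parameter values q = 353 and a = 3 and the private keys 97 and 233 are toy values assumed here for illustration only; real global parameters would be far larger. The helper `is_primitive_root` is likewise hypothetical and only verifies the stated condition on a.

```python
def prime_factors(n):
    """Return the set of prime factors of n by trial division."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def is_primitive_root(a, q):
    """True if a generates the multiplicative group modulo the prime q."""
    return all(pow(a, (q - 1) // p, q) != 1 for p in prime_factors(q - 1))


q, a = 353, 3        # toy global parameters, disclosed in advance
XA, XB = 97, 233     # hypothetical private keys of the two participants

YA = pow(a, XA, q)   # first public key:  YA = a^XA mod q
YB = pow(a, XB, q)   # second public key: YB = a^XB mod q

K_first = pow(YB, XA, q)    # first participant:  K = YB^XA mod q
K_second = pow(YA, XB, q)   # second participant: K = YA^XB mod q
print(K_first == K_second)
```

Each participant combines its own private key with the other participant's public key, so the private keys themselves are never transmitted.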
The embodiment of the disclosure provides a method for calculating federal grouping statistics based on symmetric encryption,
the method is applied to a plurality of participants, the participants including a first participant and a second participant, the first participant storing user group information and the second participant storing user characteristic information, the method comprising:
grouping a first user set corresponding to a first participant according to user grouping information to obtain a plurality of user grouping sets;
by grouping the first user information corresponding to the first participant, the intersection of each group of information and the second user information of the second participant can be accurately determined, so that the calculation amount is reduced, more detailed analysis data can be obtained, and the requirements of subsequent modeling or calculation can be met.
The first participant determines the federal intersection of each user group set and a second user set of the second participant through a secure intersection algorithm;
the secure intersection algorithm allows two parties, each holding its own set, to jointly compute the intersection of the two sets, such that one or both parties obtain the correct intersection while learning nothing about the other party's set beyond the intersection.
The second participant determines second user characteristics belonging to the intersection based on the intersection information, and calculates user characteristic statistics in the intersection to obtain statistic information of each user grouping set, wherein the second user characteristics comprise the characteristic information of a second user corresponding to the second user set;
the second party encrypts the statistic information of each user grouping set according to the predetermined key information and an encryption algorithm and sends the encrypted result to the first party;
and the first party decrypts the encrypted result according to the same key information as the second party, and determines the statistic information of each user group set.
Through the symmetric encryption algorithm and the secure intersection algorithm, the first participant and the second participant make full use of public information for encryption and decryption, and ensure that neither party's grouping information or feature information is directly or indirectly exchanged with or disclosed to the other party when grouping statistics are calculated in a federated learning scenario.
In addition, the symmetric encryption algorithm of the embodiments of the disclosure offers fast encryption and decryption, simple key management, and a small amount of computation.
Fig. 2 is a schematic flowchart of a second method for calculating federal packet statistics based on symmetric encryption according to an embodiment of the present disclosure. As shown in Fig. 2, the method is applied to a first participant and includes:
S201, grouping a first user set corresponding to the first participant according to user grouping information to obtain a plurality of user group sets;
S202, determining, through a secure intersection algorithm, the federal intersection of each user group set and the obtained second user set of the second participant;
S203, obtaining the encrypted information sent by the second participant, decrypting the encrypted information according to the same key information as the second participant, and determining the statistic information of each user group set,
wherein the encrypted information sent by the second participant is determined according to the federal intersection, a predetermined key, the statistic information, and an encryption algorithm.
In an alternative embodiment, the method further comprises:
the first participant determining a first private key;
the first participant determining and disclosing a first public key according to the first private key, based on a public key generation algorithm and global parameters that are disclosed in advance;
determining a shared key for the symmetric encryption algorithm based on the first public key, the first private key, and the global parameters.
It can be understood that beneficial effects of the method corresponding to the embodiment of fig. 2 in the present disclosure may refer to beneficial effects of the method corresponding to the embodiment of fig. 1, which are not described herein again.
Fig. 3 is a schematic flowchart of a third method for calculating federal packet statistics based on symmetric encryption according to an embodiment of the present disclosure. As shown in Fig. 3, the method is applied to a second participant and includes:
S301, determining, through a secure intersection algorithm, the federal intersection of a second user set of the second participant and each user group set of the first participant;
S302, determining, based on the federal intersection, second user features of the elements belonging to the federal intersection, and calculating user feature statistics within the federal intersection to obtain statistic information of each user group set, wherein the second user features comprise feature information of the second users corresponding to the second user set;
S303, the second participant encrypting the statistic information of each user group set according to predetermined key information and an encryption algorithm, and sending the encrypted result to the first participant so that the first participant can determine the statistic information of each user group set.
In an alternative embodiment, the method further comprises:
the second participant determining a second private key;
the second participant determining and disclosing a second public key according to the second private key, based on a public key generation algorithm and global parameters that are disclosed in advance;
determining a shared key for the symmetric encryption algorithm based on the second public key, the second private key, and the global parameters.
It can be understood that, for the beneficial effects of the method corresponding to the embodiment of fig. 3 in the present disclosure, reference may be made to the beneficial effects of the method corresponding to the embodiment of fig. 1, which are not described herein again.
Fig. 4 is a schematic structural diagram of a first symmetric encryption-based federal packet statistic calculation apparatus according to an embodiment of the present disclosure. As shown in Fig. 4, the apparatus is applied to a plurality of participants, the participants including a first participant and a second participant, the first participant storing user grouping information and the second participant storing user feature information, and the apparatus includes:
a first unit 41, configured to group a first user set corresponding to the first participant according to user grouping information to obtain a plurality of user group sets;
a second unit 42, configured to determine, by the first participant and through a secure intersection algorithm, the federal intersection of each user group set and a second user set of the second participant;
a third unit 43, configured to determine, by the second participant and based on the federal intersection, second user features belonging to the intersection, and to calculate user feature statistics within the intersection to obtain statistic information of each user group set, where the second user features include feature information of the second users corresponding to the second user set;
a fourth unit 44, configured to encrypt, by the second participant, the statistic information of each user group set according to predetermined key information and an encryption algorithm, and send the encrypted result to the first participant;
a fifth unit 45, configured to decrypt the encrypted result according to the same key information as that of the second participant, and determine the statistic information corresponding to each user group set.
It can be understood that beneficial effects of the apparatus corresponding to the embodiment of fig. 4 in the present disclosure can refer to beneficial effects of the method corresponding to the embodiment of fig. 1, which are not described herein again.
Fig. 5 is a schematic structural diagram of a second symmetric encryption-based federal packet statistic calculation apparatus according to an embodiment of the present disclosure. As shown in Fig. 5, the apparatus is applied to a first participant and includes:
a sixth unit 51, configured to group a first user set corresponding to the first participant according to user grouping information to obtain a plurality of user group sets;
a seventh unit 52, configured to determine, through a secure intersection algorithm, the federal intersection of each user group set and the obtained second user set of the second participant;
an eighth unit 53, configured to acquire the encrypted information sent by the second participant, decrypt the encrypted information according to the same key as the second participant, and determine the statistic information corresponding to each user group set,
wherein the encrypted information sent by the second participant is determined according to the federal intersection, a predetermined key, the statistic information, and an encryption algorithm.
It can be understood that beneficial effects of the apparatus corresponding to the embodiment of fig. 5 in the present disclosure may refer to beneficial effects of the method corresponding to the embodiment of fig. 1, which are not described herein again.
Fig. 6 is a schematic structural diagram of a third symmetric encryption-based federal packet statistic calculation apparatus according to an embodiment of the present disclosure. As shown in Fig. 6, the apparatus is applied to a second participant and includes:
a ninth unit 61, configured to determine, through a secure intersection algorithm, the federal intersection of the second user set of the second participant and each user group set of the first participant;
a tenth unit 62, configured to determine, based on the federal intersection, second user features of the elements belonging to the federal intersection, and to calculate user feature statistics within the federal intersection to obtain statistic information of each user group set, where the second user features include feature information of the second users corresponding to the second user set;
an eleventh unit 63, configured to encrypt, by the second participant, the statistic information of each user group set according to predetermined key information and an encryption algorithm, and send the encrypted result to the first participant so that the first participant can determine the statistic information corresponding to each user group set.
It can be understood that beneficial effects of the apparatus corresponding to the embodiment of fig. 6 in the present disclosure may refer to beneficial effects of the method corresponding to the embodiment of fig. 1, which are not described herein again.
In a seventh aspect of the embodiments of the present disclosure,
provided is an electronic device including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to invoke the memory-stored instructions to perform the method of any of the preceding claims.
In an eighth aspect of the embodiments of the present disclosure,
there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the method of any of the preceding claims.
The present disclosure also provides a program product comprising execution instructions stored in a readable storage medium. The at least one processor of the device may read the execution instructions from the readable storage medium, and the execution of the execution instructions by the at least one processor causes the device to implement the methods provided by the various embodiments described above.
The readable storage medium may be a computer storage medium or a communication medium. Communication media include any medium that facilitates transfer of a computer program from one place to another. Computer storage media can be any available media that can be accessed by a general-purpose or special-purpose computer. For example, a readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium. Of course, the readable storage medium may also be an integral part of the processor. The processor and the readable storage medium may reside in an application-specific integrated circuit (ASIC). Additionally, the ASIC may reside in user equipment. Of course, the processor and the readable storage medium may also reside as discrete components in a communication device. The readable storage medium may be a read-only memory (ROM), a random-access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In the above embodiments of the terminal or the server, it should be understood that the Processor may be a Central Processing Unit (CPU), other general-purpose processors, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the present disclosure may be embodied directly in a hardware processor, or in a combination of the hardware and software modules within the processor.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present disclosure, and not for limiting the same; while the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present disclosure.

Claims (10)

1. A method for calculating federal group statistic based on symmetric encryption, the method being applied to a plurality of participants, the participants including a first participant and a second participant, the first participant storing user group information, the second participant storing user characteristic information, the method comprising:
grouping a first user set corresponding to a first participant according to user grouping information to obtain a plurality of user grouping sets;
the first participant determines the federal intersection of each user group set and a second user set of the second participant through a secure intersection algorithm;
the second participant determines second user features belonging to the intersection based on the federal intersection, and calculates user feature statistics in the intersection to obtain statistic information of each user grouping set, wherein the second user features comprise feature information of second users corresponding to the second user sets;
the second party encrypts the statistic information of each user grouping set according to the predetermined key information and an encryption algorithm and sends the encrypted result to the first party;
and the first party decrypts the encrypted result according to the same key information as the second party, and determines the statistic information of each user group set.
2. The method of claim 1, wherein prior to determining the federal intersection of each of the user group sets with the second user set, the method further comprises:
and the first participant and the second participant adopt the same data desensitization algorithm to perform data desensitization on each user group set corresponding to the first participant and the second user set corresponding to the second participant respectively.
3. The method of claim 1, wherein the determining the federal intersection of each user group set with the second user set comprises:
and the first participant performs pairwise privacy comparison on the first element in each user grouping set and the second element in the second user set of the second participant, and determines the federal intersection of each user grouping set and the second user set.
4. The method of claim 3, wherein before the first participant performs pairwise privacy comparison between the first element in each user group set and the second element in the second user set of the second participant, the method further comprises:
the first participant and the second participant respectively encode the first element and the second element into binary codes with the length of X, and respectively obtain a first binary code and a second binary code;
wherein X is the determined coding length negotiated by the first party and the second party.
5. The method of claim 4, wherein the method for the first participant to perform pairwise privacy comparison between the first element in each user grouping set and the second element in the second user set of the second participant comprises:
the first participant randomly generates two first random binary strings with the same length for each bit in the first binary code;
the first participant selects a corresponding random binary string from the two first random binary strings according to each digit value in the first binary code to obtain a first selected binary string;
the second participant runs the oblivious transfer protocol with the first participant on each bit in the second binary code, using that bit to choose between the two first random binary strings, to obtain a second selected binary string;
the first participant performs exclusive-or operation on each element in the first selected binary string to obtain a first exclusive-or binary string;
the second participant performs exclusive-or operation on each element in the second selected binary string to obtain a second exclusive-or binary string;
the first participant sending the first xor binary string to a second participant, the second participant sending the second xor binary string to the first participant;
pairwise privacy comparisons are achieved by comparing the first XOR binary string and the second XOR binary string.
6. The method of any of claims 1-5, wherein prior to the second party encrypting the statistics information according to predetermined key information and an encryption algorithm, the method further comprises:
the first and second parties determine first and second private keys, respectively;
the first participant and the second participant respectively determine and disclose a first public key and a second public key according to the first private key and the second private key and based on a public key generation algorithm and global parameters which are disclosed in advance;
the first party determining a shared key based on the first private key, the first public key, and the global parameter;
the second party determines the shared key based on the second private key, the second public key, and the global parameter.
7. A method for calculating federal packet statistics based on symmetric encryption, the method being applied to a first party, the method comprising:
grouping a first user set corresponding to a first participant according to user grouping information to obtain a plurality of user grouping sets;
determining, through a secure intersection algorithm, the federal intersection of each user group set and the obtained second user set of the second participant;
and acquiring encrypted information sent by the second party, decrypting the encrypted information according to a key which is the same as that of the second party, and determining statistic information of each user group set, wherein the encrypted information sent by the second party is determined according to the federal intersection, a predetermined key, the statistic information and an encryption algorithm.
8. A method for calculating federal packet statistics based on symmetric encryption, the method being applied to a second party, the method comprising:
determining, through a safety intersection algorithm, a federal intersection of the second user set of the second party with each user grouping set of the first party;
determining, based on the federal intersection, the second user characteristics of the elements belonging to the federal intersection, and calculating user characteristic statistics over the federal intersection to obtain statistic information of each user grouping set, wherein the second user characteristics comprise characteristic information of the second users corresponding to the second user set;
and encrypting the statistic information of each user grouping set according to predetermined key information and an encryption algorithm, and sending the encrypted result to the first party, so that the first party determines the statistic information corresponding to each user grouping set.
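The second party's steps in claim 8, together with the first party's decryption, can be sketched as below. The claim does not name the statistics or the encryption algorithm, so this sketch assumes a count and a mean per group, and it uses a toy SHA-256 counter keystream as a stand-in for a symmetric cipher. The toy cipher provides no integrity protection and is not suitable for real use; all function names, the sample data and the key derivation are hypothetical.

```python
import hashlib
import json
import secrets


def keystream(key, nonce, length):
    """Toy keystream: SHA-256 over key, nonce and a running counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def encrypt(key, plaintext):
    """XOR the plaintext with the keystream; prepend the random nonce."""
    nonce = secrets.token_bytes(16)
    stream = keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def decrypt(key, message):
    """Reverse of encrypt: split off the nonce, regenerate the keystream."""
    nonce, body = message[:16], message[16:]
    stream = keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


def group_statistics(intersections, features):
    """Count and mean of the second party's feature over each intersection."""
    stats = {}
    for label, users in intersections.items():
        values = [features[user] for user in sorted(users)]
        mean = sum(values) / len(values) if values else None
        stats[label] = {"count": len(values), "mean": mean}
    return stats


# Hypothetical inputs: per-group intersections and second-party features.
intersections = {"A": {"u2"}, "B": {"u3", "u4"}, "C": set()}
features = {"u2": 10.0, "u3": 20.0, "u4": 40.0, "u9": 5.0}
shared_key = hashlib.sha256(b"value agreed in the key exchange").digest()

# Second party: compute statistics, serialize, encrypt, send.
stats = group_statistics(intersections, features)
ciphertext = encrypt(shared_key, json.dumps(stats, sort_keys=True).encode())

# First party: decrypt with the same key and read the statistics.
recovered = json.loads(decrypt(shared_key, ciphertext).decode())
print(recovered)
```

Because both sides hold the same key, the first party recovers the per-group statistics while the raw feature values never leave the second party.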
9. An electronic device, comprising:
a processor; a memory for storing processor-executable instructions;
wherein the processor is configured to invoke the memory-stored instructions to perform the method of any of claims 1 to 6, and/or 7, and/or 8.
10. A computer-readable storage medium having computer program instructions stored thereon, which, when executed by a processor, implement the method of any one of claims 1 to 6, and/or 7, and/or 8.
CN202210163226.8A 2022-02-22 2022-02-22 Symmetric encryption-based federal packet statistic calculation method, device and medium Pending CN114564730A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210163226.8A CN114564730A (en) 2022-02-22 2022-02-22 Symmetric encryption-based federal packet statistic calculation method, device and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210163226.8A CN114564730A (en) 2022-02-22 2022-02-22 Symmetric encryption-based federal packet statistic calculation method, device and medium

Publications (1)

Publication Number Publication Date
CN114564730A true CN114564730A (en) 2022-05-31

Family

ID=81713335

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210163226.8A Pending CN114564730A (en) 2022-02-22 2022-02-22 Symmetric encryption-based federal packet statistic calculation method, device and medium

Country Status (1)

Country Link
CN (1) CN114564730A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023241262A1 (en) * 2022-06-14 2023-12-21 胜斗士(上海)科技技术发展有限公司 Data intersection method and apparatus, device, and medium
CN117579273A (en) * 2024-01-12 2024-02-20 蓝象智联(杭州)科技有限公司 Private collection intersection solving method and system without exposing intersection ID
CN117579273B (en) * 2024-01-12 2024-04-30 蓝象智联(杭州)科技有限公司 Private collection intersection solving method and system without exposing intersection ID

Similar Documents

Publication Publication Date Title
KR102116877B1 (en) New cryptographic systems using pairing with errors
EP1834438B1 (en) Cryptography related to keys
CN111510281B (en) Homomorphic encryption method and device
CN111049650B (en) SM2 algorithm-based collaborative decryption method, device, system and medium
CN111162906B (en) Collaborative secret sharing method, device, system and medium based on oblivious transfer algorithm
US20100046755A1 (en) Cryptography related to keys with signature
CN112906030B (en) Data sharing method and system based on multi-party homomorphic encryption
KR20060052556A (en) Methods, devices and systems for generating anonymous public keys in a secure communication system
Peng Danger of using fully homomorphic encryption: A look at Microsoft SEAL
CN115580396B (en) Tight trace query system and method
KR101407220B1 (en) A method of efficient secure function evaluation using resettable tamper-resistant hardware tokens
JP2006210964A (en) Method and device for transferring information by elgamal encryption
CN114564730A (en) Symmetric encryption-based federal packet statistic calculation method, device and medium
Li et al. Cryptographic algorithms for privacy-preserving online applications.
Véron Code based cryptography and steganography
CN112398646A (en) Identity-based encryption method and system with short public parameters on ideal lattice
CN114362912A (en) Identification password generation method based on distributed key center, electronic device and medium
CN114465708B (en) Privacy data processing method, device, system, electronic equipment and storage medium
US20130058483A1 (en) Public key cryptosystem and technique
CN116681141A (en) Federal learning method, terminal and storage medium for privacy protection
Peng et al. On the security of fully homomorphic encryption for data privacy in Internet of Things
CN111984932B (en) Two-party data packet statistics method, device and system
Basu et al. Secured hierarchical secret sharing using ECC based signcryption
CN111835825A (en) Method suitable for transmitting messages between two intelligent Internet of things system communication parties
Han et al. Attribute-based data transfer with filtering scheme in cloud computing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination