CN115225405B - Matrix decomposition method based on secure aggregation and key exchange under a federated learning framework


Info

Publication number
CN115225405B
CN115225405B (application CN202210899003.8A)
Authority
CN
China
Prior art keywords
client
matrix
gradient
federal learning
key
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210899003.8A
Other languages
Chinese (zh)
Other versions
CN115225405A (en)
Inventor
夏长达
张子扬
夏家骏
张佳辰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Light Tree Technology Co ltd
Original Assignee
Shanghai Light Tree Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Light Tree Technology Co ltd filed Critical Shanghai Light Tree Technology Co ltd
Priority to CN202210899003.8A priority Critical patent/CN115225405B/en
Priority to CN202310620692.9A priority patent/CN116545734A/en
Priority to CN202310622218.XA priority patent/CN116545735A/en
Publication of CN115225405A publication Critical patent/CN115225405A/en
Application granted granted Critical
Publication of CN115225405B publication Critical patent/CN115225405B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H04L63/061 — Network architectures or network communication protocols for network security, supporting key management in a packet data network, for key exchange, e.g. in peer-to-peer networks
    • G06F17/16 — Digital computing or data processing equipment or methods; complex mathematical operations: matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G06N20/00 — Computing arrangements based on specific computational models: machine learning
    • H04L9/0838 — Key establishment: key agreement, i.e. a shared key is derived by parties as a function of information contributed by, or associated with, each of them
    • H04L9/0869 — Generation of secret information, including derivation or calculation of cryptographic keys or passwords, involving random numbers or seeds
    • H04L9/0891 — Revocation or update of secret information, e.g. encryption key update or rekeying
    • H04L9/40 — Network security protocols
    • H04L2209/08 — Randomization, e.g. dummy operations or using noise
    • Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management
    • Y02D30/50 — Reducing energy consumption in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computer Hardware Design (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Storage Device Security (AREA)

Abstract

The invention discloses a matrix decomposition method based on secure aggregation and key exchange under a federated learning framework. By securely aggregating the gradient of the item matrix I of the matrix decomposition under the federated learning framework, it provides a new idea for federated learning to enhance data security. Using the locally held user matrix $U_X$ and the securely aggregated gradient $\nabla_I$ as the training signal of the recommendation model (i.e., the federated learning model), the training samples are effectively utilized while user data is guaranteed never to leave the local device, making the training process of the recommendation model safer. Masking and noising the gradient effectively avoids the leakage of source data information that exposing the true gradient of the data would cause. Compared with the homomorphic encryption techniques adopted in the background art, the secure-aggregation-based gradient summarization has lower computational complexity for gradient encryption and decryption and a faster computation speed, which improves the training speed of the recommendation model.

Description

Matrix decomposition method based on secure aggregation and key exchange under a federated learning framework
Technical Field
The invention relates to the technical field of information processing, and in particular to a matrix decomposition method based on secure aggregation and key exchange under a federated learning framework.
Background
Existing secure matrix decomposition algorithms are mainly distributed matrix decomposition algorithms that ensure the security of the transmitted information through encryption techniques such as Paillier homomorphic encryption, thereby preventing leakage of users' local data. Their implementation mainly comprises the following steps:
1. The server initializes the item matrix I, each client locally initializes its own user matrix U, the public key is shared between the server and the clients, and the private key is shared only among the clients;
2. The server encrypts I with the public key to obtain the ciphertext $C_I$ and broadcasts it to all clients;
3. Each client, upon receiving $C_I$, decrypts it with the local private key to obtain the real item matrix I, calculates the gradient of the U it holds, updates U, then calculates the gradient G of I after the update and encrypts it to obtain the ciphertext $C_G$;
4. The server collects the ciphertexts $C_G$ and updates the encrypted item matrix, $C_I \leftarrow C_I - C_G$, then broadcasts the updated $C_I$ to all clients;
5. Steps 3-4 are repeated until the algorithm converges.
As can be seen from steps 1-5, the existing scheme ensures that user data does not leave the local device, and homomorphic encryption keeps the server from ever obtaining the plaintext of the gradients during training, so the original data cannot be inferred back from a single gradient. However, the repeated encryption and decryption that homomorphic encryption requires makes training inefficient; and if homomorphic encryption is removed and the plaintext gradients of individual clients are summarized directly, the original data can be inferred back after several training steps, so the security of the local data cannot be guaranteed. How to resolve this dilemma of the existing secure matrix decomposition algorithms has therefore become an urgent problem in the industry.
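For concreteness, a minimal sketch of the homomorphic step of this background scheme follows, written against the open-source python-paillier (`phe`) library; operating on a single scalar entry and the exact ciphertext update shown are illustrative assumptions rather than a definitive implementation:

```python
# Minimal sketch of the background scheme's homomorphic step (illustrative only).
# Requires the open-source `phe` (python-paillier) library.
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

# Server side: encrypt an entry of the item matrix I and broadcast C_I.
I_entry = 0.37
C_I = public_key.encrypt(I_entry)

# Client side: decrypt with the shared private key, compute a gradient
# entry G locally, and send back its ciphertext C_G.
I_plain = private_key.decrypt(C_I)
G_entry = 0.05  # stand-in for a locally computed gradient entry
C_G = public_key.encrypt(G_entry)

# Server side: update the ciphertext under additive homomorphism,
# never seeing the plaintext gradient.
C_I = C_I - C_G

# Verification only (in the scheme, the private key stays with the clients).
assert abs(private_key.decrypt(C_I) - (I_entry - G_entry)) < 1e-9
```

The per-round keypair operations and ciphertext arithmetic above are exactly the overhead that the secure-aggregation approach of this invention avoids.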
Disclosure of Invention
The invention aims to make the training process of the recommendation model more efficient while ensuring that local data is not leaked during model training, and to this end provides a matrix decomposition method based on secure aggregation and key exchange under a federated learning framework.
To achieve this purpose, the invention adopts the following technical scheme:
the method for matrix decomposition based on secure aggregation and key exchange under the federal learning framework comprises the following steps:
s1, recording a dispatcher of a federal learning framework as a server, and each participating trainer as a client, wherein the server broadcasts an initialized embedded matrix I of an article to each client;
s2, each client X calculates an embedding matrix U related to the respective local user by using the embedding matrix I X Gradient of (2)
Figure BDA0003770190730000021
And utilize->
Figure BDA0003770190730000022
Updating an embedded matrix U of a local user X
S3, each client X uses the locally updated U X Calculating a gradient generated to the embedding matrix I
Figure BDA0003770190730000023
S4, updating the gradient by adopting a key exchange method
Figure BDA0003770190730000024
And is about->
Figure BDA0003770190730000025
Summarizing to obtain->
Figure BDA0003770190730000026
After that, use->
Figure BDA0003770190730000027
Updating the embedded matrix I;
s5, repeating the steps S2-S4 until the termination condition of federal learning is reached.
Preferably, in step S2, the gradient $\nabla_{U_X^{(i)}}$ of the embedding vector $U_X^{(i)}$ of local user i in the embedding matrix $U_X$ is calculated by the following formula (1):

$$\nabla_{U_X^{(i)}} = \sum_{j:\, M_X^{(i,j)}\ \mathrm{exists}} -2\,\big(M_X^{(i,j)} - U_X^{(i)} I_j^{T}\big)\, I_j \tag{1}$$

in formula (1), L is the loss function of client X for the federated learning, $L = \big\|M_X - U_X I^T\big\|_F^2$ restricted to the observed entries of $M_X$, plus the regularization terms;
$M_X$ denotes the scoring matrix at client X;
$I^T$ is the matrix transpose of I;
$\|\cdot\|_F$ is the Frobenius norm of a matrix;
$I_j \in R^{1\times k}$, the embedding vector of item j common to all clients, is the j-th row of the embedding matrix $I = [I_1, I_2, \dots, I_j, \dots, I_d] \in R^{d\times k}$;
$I_j^T$ denotes the vector transpose of $I_j$;
$M_X^{(i,j)}$ denotes the score of user i owned by client X on item j (the missing entries, for which user i has no actual score on item j, are to be predicted after modeling is completed);
$j: M_X^{(i,j)}\ \mathrm{exists}$ denotes the items j actually scored by user i owned by client X;
$\sum_{j:\, M_X^{(i,j)}\ \mathrm{exists}}$ denotes summation over the index j of the items j actually scored by user i owned by client X.
Preferably, in step S2, the local user embedding matrix of each client X is updated by the following formula (2):

$$U_X^{(i)} \leftarrow U_X^{(i)} - \eta\,\big(\nabla_{U_X^{(i)}} + 2\lambda_U U_X^{(i)}\big) \tag{2}$$

in formula (2), $\lambda_U$ denotes the regularization parameter of $U_X$, and $\eta$ denotes the learning rate.
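A minimal NumPy sketch of formulas (1) and (2) on one client follows; the function name, the boolean mask of observed scores, and the default values of the learning rate eta and lam_U are illustrative assumptions:

```python
import numpy as np

def update_local_users(U_X, I, M_X, observed, eta=0.01, lam_U=0.1):
    """One pass of formulas (1)-(2): gradient of each user embedding, then descent.
    M_X: (m_X, d) scoring matrix; observed: boolean (m_X, d) mask of real scores;
    U_X: (m_X, k) user embeddings; I: (d, k) item embeddings."""
    for i in range(U_X.shape[0]):
        js = np.where(observed[i])[0]              # items j actually scored by user i
        err = M_X[i, js] - U_X[i] @ I[js].T        # residuals M_X^(i,j) - U_X^(i) I_j^T
        grad = -2.0 * err @ I[js]                  # formula (1)
        U_X[i] -= eta * (grad + 2.0 * lam_U * U_X[i])  # formula (2)
    return U_X
```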
preferably, in step S3, the embedding vector I of the associated item j in the embedding matrix I j Corresponding gradient
Figure BDA00037701907300000219
Calculated by the following formula (3):
Figure BDA0003770190730000031
in the formula (3),
Figure BDA0003770190730000032
representation->
Figure BDA0003770190730000033
Is the j-th row of (2);
Figure BDA0003770190730000034
an embedded vector I representing the item j common to all the clients j Is a vector transpose of (2);
Figure BDA0003770190730000035
representing the embedding matrix U X An embedded vector of a related local user i;
Figure BDA0003770190730000036
a score representing the user i locally owned by the client X with respect to the item j;
i:
Figure BDA0003770190730000037
exists means those users i owned by the client X who have an over-scoring behavior on item j;
Figure BDA0003770190730000038
representing the summation of those users i owned by the client X who have scored the item j with respect to the token i.
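A matching NumPy sketch of formula (3) follows, under the same illustrative assumptions as the previous sketch; each client only accumulates rows for items its own users have scored:

```python
import numpy as np

def item_gradient(U_X, I, M_X, observed):
    """Formula (3): client X's gradient on the shared item matrix I.
    Returns grad_I_X of shape (d, k); rows for items unscored at X stay zero."""
    grad_I_X = np.zeros_like(I)
    for j in range(I.shape[0]):
        users = np.where(observed[:, j])[0]        # users i at X who scored item j
        if users.size == 0:
            continue
        err = M_X[users, j] - U_X[users] @ I[j]    # residuals for item j
        grad_I_X[j] = -2.0 * err @ U_X[users]      # formula (3)
    return grad_I_X
```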
Preferably, in step S4, the key exchange method adopted to update the gradient $\nabla_I^X$ specifically comprises the following steps:
S41, each client X locally generates a private key $s_X$ and a public key $p_X$; the server exchanges the public keys generated by the clients, and each client X obtains a corresponding set of exchanged public keys, denoted $C_X$;
S42, according to $C_X$ and the locally generated private key $s_X$ of each client X, a key agreement between the client X and every other client Y is generated, denoted key_agreement(X, Y);
S43, the client X generates a mask, denoted mask(X, Y), using the locally generated key_agreement(X, Y) as a seed, and updates the gradient $\nabla_I^X$ obtained in step S3.
Preferably, in step S41, $C_X$ is expressed by the following expression (4):

$$C_X = \{p_1, \dots, p_X, \dots, p_N\} \qquad \text{expression (4)}$$

in expression (4), $p_X = g^{s_X}\,\%\,p$ denotes the public key locally generated by the client X;
p denotes a prime number, agreed in advance by all clients;
g denotes a primitive root modulo p, agreed in advance by all clients;
% p denotes the modulo operation with respect to the prime p;
$\{p_1, \dots, p_X, \dots, p_N\}$ denotes the set of locally generated public keys of all N clients received by the server.
Preferably, in step S42, key_agreement(X, Y) is generated by:
the client X takes the public key $p_Y$ of the client Y out of its exchanged public key set $C_X$;
the client X generates key_agreement(X, Y) from the public key $p_Y$ and the locally generated private key $s_X$.
Preferably, the generation formula of key_agreement(X, Y) is expressed as the following formula (5):

$$\mathrm{key\_agreement}(X, Y) = (p_Y)^{s_X}\,\%\,p \tag{5}$$

in formula (5), $(p_Y)^{s_X}$ denotes $p_Y$ raised to the power $s_X$;
p denotes the prime number agreed in advance by all clients;
% p denotes the modulo operation with respect to the prime p.
Preferably, in step S43, the gradient $\nabla_I^X$ is updated by the following formula (6):

$$\tilde{\nabla}_I^X = \nabla_I^X + \sum_{Y \in \{1,2,\dots,N\}\setminus\{X\}} a(X, Y)\,\mathrm{mask}(X, Y) \tag{6}$$

in formula (6), a(X, Y) is 1 or -1: with the clients numbered {1, 2, …, X, …, N}, a(X, Y) equals 1 if the number of client X is greater than the number of client Y, and equals -1 otherwise;
$\sum_{Y \in \{1,2,\dots,N\}\setminus\{X\}}$ denotes summation over the index Y, over all clients Y other than X. Because mask(X, Y) is derived by both members of the pair from the same shared seed while a(X, Y) = -a(Y, X), the masks cancel pairwise when the masked gradients are summed.
Preferably, in step S4, the summarized gradient $\nabla_I$ is expressed by the following formula (7):

$$\nabla_I = \sum_{X=1}^{N} \tilde{\nabla}_I^X \tag{7}$$

in step S4, the method of updating the embedding matrix I is expressed by the following formula (8):

$$I \leftarrow I - \eta\,\big(\nabla_I + 2\lambda_I I\big) \tag{8}$$

in formula (8), $\lambda_I$ denotes the regularization parameter of the embedding matrix I.
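A NumPy sketch of formulas (6)-(8) follows, demonstrating that the pairwise masks cancel in the summation of formula (7); the seed derivation from the pair of client numbers is an illustrative stand-in for the seed derived from key_agreement(X, Y):

```python
import numpy as np

N, d, k = 3, 5, 4
rng = np.random.default_rng(0)
grads = [rng.normal(size=(d, k)) for _ in range(N)]    # each client's grad on I

def seed(X, Y):
    # Stand-in for a seed derived from key_agreement(X, Y); both members of
    # the pair derive the same value, so they generate the same mask.
    return (min(X, Y) * 1000003 + max(X, Y)) % 2**32

def masked(X, g):
    out = g.copy()
    for Y in range(N):
        if Y != X:
            mask = np.random.default_rng(seed(X, Y)).normal(size=g.shape)
            out += mask if X > Y else -mask            # a(X, Y) of formula (6)
    return out

agg = sum(masked(X, grads[X]) for X in range(N))       # formula (7)
assert np.allclose(agg, sum(grads))                    # masks cancel pairwise

eta, lam_I = 0.01, 0.1
I = rng.normal(size=(d, k))
I -= eta * (agg + 2 * lam_I * I)                       # formula (8)
```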
Preferably, the gradient $\nabla_I^X$ generated in step S3 is noised before proceeding to step S4; the noising method for the gradient $\nabla_I^X$ is expressed by the following formula (9):

$$\nabla_I^X \leftarrow \nabla_I^X + n_X \tag{9}$$

in formula (9), $n_X$ denotes Gaussian noise.
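A short sketch of formula (9) follows; the noise scale sigma is an illustrative assumption that would in practice be set by the differential-privacy budget:

```python
import numpy as np

sigma = 0.01                                  # illustrative noise scale
grad_I_X = np.ones((5, 4))                    # stand-in for the gradient on I
n_X = np.random.default_rng().normal(scale=sigma, size=grad_I_X.shape)
grad_I_X = grad_I_X + n_X                     # formula (9)
```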
The invention has the following beneficial effects:
1. Using the locally held $U_X$ and the securely aggregated gradient $\nabla_I$ as the training signal of the recommendation model ensures that user data never leaves the local device, making the training process of the recommendation model safer.
2. Masking and noising the gradient effectively avoids the leakage of source data information that exposing the true gradient of the data would cause.
3. Compared with the homomorphic encryption techniques adopted in the background art, the secure-aggregation-based gradient summarization has lower computational complexity for gradient encryption and decryption and a faster computation speed, which improves the training speed of the recommendation model.
4. The recommendation model is trained under the federated learning framework with the matrix decomposition algorithm provided by this application; during model training the participants do not need to exchange local data, which more effectively ensures that local data is not leaked.
Drawings
In order to more clearly illustrate the technical solution of the embodiments of the present invention, the drawings that are required to be used in the embodiments of the present invention will be briefly described below. It is evident that the drawings described below are only some embodiments of the present invention and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.
FIG. 1 is a diagram of steps in implementing a matrix decomposition method based on secure aggregation and key exchange in a federal learning framework according to an embodiment of the present invention;
fig. 2 is a flow chart of a matrix decomposition method based on secure aggregation and key exchange under the federal learning framework provided by an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is further described below by the specific embodiments with reference to the accompanying drawings.
The drawings are for illustrative purposes only; they are schematic rather than physical and are not to be construed as limiting this patent. For the purpose of better illustrating the embodiments of the invention, certain elements of the drawings may be omitted, enlarged or reduced, and they do not represent the size of the actual product. It will be appreciated by those skilled in the art that certain well-known structures in the drawings, and descriptions thereof, may be omitted.
The same or similar reference numbers in the drawings of the embodiments of the invention correspond to the same or similar components. In the description of the invention, it should be understood that terms such as "upper", "lower", "left", "right", "inner" and "outer" indicate orientations or positional relationships based on those shown in the drawings; they are used only for convenience in describing the invention and simplifying the description, and do not indicate or imply that the apparatus or elements referred to must have a specific orientation or be constructed and operated in a specific orientation. The terms describing positional relationships in the drawings are therefore merely for exemplary illustration and should not be construed as limiting this patent; the specific meaning of the above terms can be understood by those of ordinary skill in the art according to the specific circumstances.
In the description of the invention, unless explicitly stated and limited otherwise, the term "coupled" and the like should be interpreted broadly: as indicating a relationship of components, it may mean fixedly coupled, detachably coupled, or integrally formed; mechanically or electrically connected; directly connected or indirectly connected through an intermediate medium; or communication between, or an interaction relationship of, two parts. The specific meaning of the above terms in the invention can be understood by those of ordinary skill in the art in light of the specific circumstances.
Taking three clients A, B, C as an example, the following describes how the matrix decomposition method based on secure aggregation and key exchange provided in this embodiment is implemented under the federated learning framework:
Record the dispatcher in the federated learning framework as the server and each participating training party as a client. Let M denote the scoring matrix (for example, the matrix of ratings that a number of IMDb users give to movies, containing missing entries that need to be predicted and filled in), let $U_A$, $U_B$, $U_C$ denote the embedding matrices of the local users of clients A, B, C (the local users are represented numerically by these matrices), and let I denote the embedding matrix of the items (the common items are represented numerically by this matrix). As shown in Fig. 2, the specific implementation steps of the matrix decomposition method based on secure aggregation and key exchange under the federated learning framework provided in this embodiment are as follows:
1. The parties agree on the embedding dimension (the embedding dimension specifies the dimensionality of the space used to represent users and items numerically); the server initializes the item embedding matrix I according to the embedding dimension, and clients A, B, C initialize their own local user embedding matrices $U_A$, $U_B$, $U_C$ according to the embedding dimension;
2. The server broadcasts the embedding matrix I to clients A, B, C;
3. client A calculates U using the embedding matrix I A Gradient of (2)
Figure BDA0003770190730000061
Then updating the embedded matrix U of the local user A
Figure BDA0003770190730000062
Wherein->
Figure BDA0003770190730000063
m A Representing the total number of users of client A, I j An embedded vector representing an item j common to all clients,/->
Figure BDA0003770190730000064
Representation I j Is transposed by the vector of>
Figure BDA0003770190730000065
Representing the score of user i owned by client A with respect to item j, j: ∈>
Figure BDA0003770190730000066
exists means those items j, < >, > which are actually evaluated too much by the user i owned by the client a>
Figure BDA0003770190730000067
Representing those items j that are actually scored too much by user i owned by client a, summed with respect to token i; u (U) A The updating mode of (a) is as follows: />
Figure BDA0003770190730000068
λ U U indicator A Is used for regularization parameters of (a);
gradient corresponding to client B, C respectively
Figure BDA0003770190730000069
Is calculated by (a) and updating U respectively B 、U C The method of (1) is the same as the client A and will not be described in detail here;
4. client A uses locally updated U A Calculating gradients generated for the user on the embedding matrix I
Figure BDA00037701907300000610
Wherein->
Figure BDA00037701907300000611
d represents the total number of common things, i: the total number of the common things>
Figure BDA00037701907300000612
exists represent those users i owned by client a who have scored actions on item j,
Figure BDA00037701907300000613
summing those users i owned by client a who have a scoring behavior for item j with respect to token i;
gradient corresponding to client B, C respectively
Figure BDA00037701907300000614
The calculation method of (1) is the same as the client A and will not be described in detail here;
to avoid exposing the true gradients, the corresponding gradients for each client are preferably noisy, more preferably client A, B, C by differential privacy techniques
Figure BDA00037701907300000615
Respectively plus Gaussian noise n A 、n B 、n C . Taking client A as an example, n A Representing the generated random matrix (size and +.>
Figure BDA00037701907300000616
The same) and (II)>
Figure BDA00037701907300000617
Updated to->
Figure BDA00037701907300000618
5. Clients A, B, C locally generate their own public and private keys; $p_A$, $p_B$, $p_C$ denote the locally generated public keys of clients A, B, C, and $s_A$, $s_B$, $s_C$ denote their locally generated private keys. Taking client A as an example, the private key $s_A$ is a locally generated random number (with value smaller than p), and the public key (computed from the private key $s_A$) is $p_A = g^{s_A}\,\%\,p$, where g denotes the generator (a primitive root modulo p; it can be chosen small, e.g. simply taken as 2), $g^{s_A}$ denotes g raised to the power $s_A$, p is a large prime (2048 bits is typical), and % p denotes the modulo operation with respect to p; the g and p of all clients are agreed in advance;
6. service side collectionAll public keys p A 、p B 、p C And the public key sent to the client A is p B 、p C The public key sent to client B is p A 、p C The public key sent to the client C is p A 、p B
7. Client a is based on public key p B 、p C And a locally generated private key s A Generating a key_agreement (A, B) of the client B and a key_agreement (A, C) of the client C; client B is based on public key p A 、p C And private key s B Generating a key_agreement (A, B) of the client A and a key_agreement (B, C) of the client C; client C is based on public key p A 、p B And its own private key s C Generate key_agreement (a, C) with client a and key_agreement (B, C) with client B. Taking the example of the client a as the example,
Figure BDA0003770190730000073
Figure BDA0003770190730000074
respectively represent p B S of (2) A Power of the power of p C S of (2) A To power,% p represents modulo p.
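A sketch of steps 5-7 above for the three clients follows, again with toy group parameters (an assumption; a 2048-bit p in practice); it checks that both endpoints of every pair derive the same key agreement:

```python
import secrets

p = 2**127 - 1   # toy prime; a large (e.g. 2048-bit) prime in practice
g = 2

# Step 5: each client locally generates a private key and the public key g^s % p.
priv = {name: secrets.randbelow(p - 2) + 1 for name in "ABC"}
pub = {name: pow(g, s, p) for name, s in priv.items()}

# Step 6: the server collects the public keys and sends each client the others'.
received = {name: {o: pub[o] for o in pub if o != name} for name in pub}

# Step 7: each pair derives key_agreement(X, Y) = (p_Y)^(s_X) % p symmetrically.
for X, Y in [("A", "B"), ("A", "C"), ("B", "C")]:
    assert pow(received[X][Y], priv[X], p) == pow(received[Y][X], priv[Y], p)
```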
8. Client A generates mask(A, B) using the local key_agreement(A, B) as a seed and mask(A, C) using the local key_agreement(A, C) as a seed, and updates its gradient to $\tilde{\nabla}_I^A = \nabla_I^A - \mathrm{mask}(A, B) - \mathrm{mask}(A, C)$; client B generates mask(A, B) using the local key_agreement(A, B) as a seed and mask(B, C) using the local key_agreement(B, C) as a seed, and updates its gradient to $\tilde{\nabla}_I^B = \nabla_I^B + \mathrm{mask}(A, B) - \mathrm{mask}(B, C)$; client C generates mask(A, C) using the local key_agreement(A, C) as a seed and mask(B, C) using the local key_agreement(B, C) as a seed, and updates its gradient to $\tilde{\nabla}_I^C = \nabla_I^C + \mathrm{mask}(A, C) + \mathrm{mask}(B, C)$ (with the clients numbered A = 1, B = 2, C = 3, the sign of each mask follows a(X, Y) of formula (6)). Taking client A as an example, mask(A, B) is a random matrix of the same size as $\nabla_I^A$ generated with key_agreement(A, B) as the seed (generated directly by calling an open-source library function with the seed parameter); because clients A and B generate mask(A, B) from the same seed and apply it with opposite signs, it cancels in the summation;
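A sketch of the mask generation in step 8 follows; folding the large agreement value modulo 2^32 into a library seed is an illustrative choice of "open-source library function":

```python
import numpy as np

key_agreement_AB = 123456789          # stand-in for the derived agreement value
shape = (5, 4)                        # same size as the gradient on I

# Clients A and B call the same library function with the same seed,
# so each obtains the identical random matrix mask(A, B).
mask_at_A = np.random.default_rng(key_agreement_AB % 2**32).normal(size=shape)
mask_at_B = np.random.default_rng(key_agreement_AB % 2**32).normal(size=shape)
assert np.array_equal(mask_at_A, mask_at_B)
```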
9. The server sums the masked gradients; the pairwise masks cancel, giving $\nabla_I = \tilde{\nabla}_I^A + \tilde{\nabla}_I^B + \tilde{\nabla}_I^C = \nabla_I^A + \nabla_I^B + \nabla_I^C$. It then updates I as $I \leftarrow I - \eta\,(\nabla_I + 2\lambda_I I)$, where $\lambda_I$ denotes the regularization parameter of the embedding matrix I;
10. Steps 2-9 are repeated until the maximum number of training rounds of the federated recommendation model is reached or the algorithm converges.
Briefly, the matrix decomposition method based on secure aggregation and key exchange under the federated learning framework provided in this embodiment, as shown in Fig. 1, comprises the steps of:
S1, record the dispatcher of the federated learning framework as the server and each participating training party as a client; the server broadcasts an initialized item embedding matrix I to each client;
S2, each client X uses the embedding matrix I to calculate the gradient $\nabla_{U_X}$ of its own local-user embedding matrix $U_X$, and uses $\nabla_{U_X}$ to update the local-user embedding matrix $U_X$;
S3, each client X uses the locally updated $U_X$ to calculate the gradient $\nabla_I^X$ it generates on the embedding matrix I;
S4, the gradients $\nabla_I^X$ are updated by a key exchange method and summarized to obtain $\nabla_I$, after which $\nabla_I$ is used to update the embedding matrix I;
S5, steps S2-S4 are repeated until the termination condition of the federated learning is reached.
In conclusion, securely aggregating the gradient of the item matrix I of the matrix decomposition under the federated learning framework provides a new idea for federated learning to enhance data security. Using the locally held $U_X$ and the securely aggregated gradient $\nabla_I$ as the training signal of the recommendation model (i.e., the federated learning model) ensures that user data never leaves the local device, making the training process of the recommendation model safer. Masking and noising the gradient effectively avoids the leakage of source data information that exposing the true gradient of the data would cause. Compared with the homomorphic encryption techniques adopted in the background art, the secure-aggregation-based gradient summarization has lower computational complexity for gradient encryption and decryption and a faster computation speed, which is beneficial to improving the training speed of the recommendation model.
It should be understood that the above description is merely illustrative of the preferred embodiments of the invention and of the technical principles employed. It will be apparent to those skilled in the art that various modifications, equivalents and variations can be made to the invention; such modifications, made without departing from the spirit of the invention, are intended to fall within its scope. In addition, some terms used in the specification and claims of the present application are not limiting but are merely for convenience of description.

Claims (10)

1. A matrix factorization method based on secure aggregation and key exchange under a federated learning framework, comprising the following steps:
S1, record the dispatcher of the federated learning framework as the server and each participating training party as a client; the server broadcasts an initialized item embedding matrix I to each client;
S2, each client X uses the embedding matrix I to calculate the gradient $\nabla_{U_X}$ of its own local-user embedding matrix $U_X$, and uses $\nabla_{U_X}$ to update the local-user embedding matrix $U_X$;
S3, each client X uses the locally updated $U_X$ to calculate the gradient $\nabla_I^X$ it generates on the embedding matrix I;
S4, the clients X, in conjunction with the server, update the gradients $\nabla_I^X$ by a key exchange method, summarize the $\nabla_I^X$ to obtain $\nabla_I$, and then use $\nabla_I$ to update the embedding matrix I;
S5, steps S2-S4 are repeated until the termination condition of the federated learning is reached;
in step S2, the gradient $\nabla_{U_X^{(i)}}$ of the embedding vector $U_X^{(i)}$ of local user i in the embedding matrix $U_X$ is calculated by the following formula (1):

$$\nabla_{U_X^{(i)}} = \sum_{j:\, M_X^{(i,j)}\ \mathrm{exists}} -2\,\big(M_X^{(i,j)} - U_X^{(i)} I_j^{T}\big)\, I_j \tag{1}$$

in formula (1), L is the loss function of client X for the federated learning, $L = \big\|M_X - U_X I^T\big\|_F^2$ restricted to the observed entries of $M_X$, plus the regularization terms;
$M_X$ denotes the scoring matrix at client X;
$I^T$ is the matrix transpose of I;
$\|\cdot\|_F$ is the Frobenius norm of a matrix;
$I_j \in R^{1\times k}$, the embedding vector of item j common to all clients, is the j-th row of the embedding matrix $I = [I_1, I_2, \dots, I_j, \dots, I_d] \in R^{d\times k}$;
$I_j^T$ denotes the vector transpose of $I_j$;
$M_X^{(i,j)}$ denotes the score of user i owned by client X on item j, where the missing entries, for which user i has no actual score on item j, are to be predicted after modeling;
$j: M_X^{(i,j)}\ \mathrm{exists}$ denotes the items j actually scored by user i owned by client X;
$\sum_{j:\, M_X^{(i,j)}\ \mathrm{exists}}$ denotes summation over the index j of the items j actually scored by user i owned by client X.
2. The matrix factorization method based on secure aggregation and key exchange under the federated learning framework according to claim 1, wherein in step S2 the local user embedding matrix of each client X is updated by the following formula (2):

$$U_X^{(i)} \leftarrow U_X^{(i)} - \eta\,\big(\nabla_{U_X^{(i)}} + 2\lambda_U U_X^{(i)}\big) \tag{2}$$

in formula (2), $\lambda_U$ denotes the regularization parameter of $U_X$, and $\eta$ denotes the learning rate.
3. a matrix factorization method based on secure aggregation and key exchange under a federal learning framework, comprising the steps of:
s1, recording a dispatcher of a federal learning framework as a server, and each participating trainer as a client, wherein the server broadcasts an initialized embedded matrix I of an article to each client;
s2, each client X calculates an embedding matrix U related to the respective local user by using the embedding matrix I X Gradient of (2)
Figure QLYQS_19
And utilize->
Figure QLYQS_20
Updating an embedded matrix U of a local user X
S3, each client X uses the locally updated U X Calculating a gradient generated to the embedding matrix I
Figure QLYQS_21
S4, the client X links the server to update the gradient by adopting a key exchange method
Figure QLYQS_22
And is about->
Figure QLYQS_23
Summarizing to obtain->
Figure QLYQS_24
After that, use->
Figure QLYQS_25
Updating the embedded matrix I;
s5, repeating the steps S2-S4 until the termination condition of federal learning is reached;
in step S3, the embedding vector I of the associated article j in the embedding matrix I j Corresponding gradient
Figure QLYQS_26
Calculated by the following formula (3):
Figure QLYQS_27
in the formula (3),
Figure QLYQS_28
representation->
Figure QLYQS_29
Is the j-th row of (2);
Figure QLYQS_30
an embedded vector I representing the item j common to all the clients j Is a vector transpose of (2);
Figure QLYQS_31
representing the embedding matrix U X An embedded vector of a related local user i;
Figure QLYQS_32
a score representing the user i locally owned by the client X with respect to the item j;
Figure QLYQS_33
representing those users i owned by the client X who have a scoring behavior on item j;
Figure QLYQS_34
representing the summation of those users i owned by the client X who have scored the item j with respect to the token i.
4. A matrix factorization method based on secure aggregation and key exchange under a federated learning framework, comprising the following steps:
S1, record the dispatcher of the federated learning framework as the server and each participating training party as a client; the server broadcasts an initialized item embedding matrix I to each client;
S2, each client X uses the embedding matrix I to calculate the gradient $\nabla_{U_X}$ of its own local-user embedding matrix $U_X$, and uses $\nabla_{U_X}$ to update the local-user embedding matrix $U_X$;
S3, each client X uses the locally updated $U_X$ to calculate the gradient $\nabla_I^X$ it generates on the embedding matrix I;
S4, the clients X, in conjunction with the server, update the gradients $\nabla_I^X$ by a key exchange method, summarize the $\nabla_I^X$ to obtain $\nabla_I$, and then use $\nabla_I$ to update the embedding matrix I;
S5, steps S2-S4 are repeated until the termination condition of the federated learning is reached;
in step S4, the key exchange method adopted to update the gradient $\nabla_I^X$ specifically comprises the following steps:
S41, each client X locally generates a private key $s_X$ and a public key $p_X$; the server exchanges the public keys generated by the clients, and each client X obtains a corresponding set of exchanged public keys, denoted $C_X$;
S42, according to $C_X$ and the locally generated private key $s_X$ of each client X, a key agreement between the client X and every other client Y is generated, denoted key_agreement(X, Y);
S43, the client X generates a mask, denoted mask(X, Y), using the locally generated key_agreement(X, Y) as a seed, and updates the gradient $\nabla_I^X$ obtained in step S3.
5. The matrix factorization method based on secure aggregation and key exchange under the federated learning framework according to claim 4, wherein in step S41 $C_X$ is expressed by the following expression (4):

$$C_X = \{p_1, \dots, p_X, \dots, p_N\} \qquad \text{expression (4)}$$

in expression (4), $p_X = g^{s_X}\,\%\,p$ denotes the public key locally generated by the client X;
p denotes a prime number, agreed in advance by all clients;
g denotes a primitive root modulo p, agreed in advance by all clients;
% p denotes the modulo operation with respect to the prime p;
$\{p_1, \dots, p_X, \dots, p_N\}$ denotes the set of locally generated public keys of all N clients received by the server.
6. The matrix factorization method based on secure aggregation and key exchange under the federated learning framework according to claim 5, wherein in step S42 key_agreement(X, Y) is generated by:
the client X takes the public key $p_Y$ of the client Y out of its exchanged public key set $C_X$;
the client X generates key_agreement(X, Y) from the public key $p_Y$ and the locally generated private key $s_X$.
7. The matrix factorization method based on secure aggregation and key exchange under the federated learning framework according to claim 6, wherein the generation formula of key_agreement(X, Y) is expressed as the following formula (5):

$$\mathrm{key\_agreement}(X, Y) = (p_Y)^{s_X}\,\%\,p \tag{5}$$

in formula (5), $(p_Y)^{s_X}$ denotes $p_Y$ raised to the power $s_X$;
p denotes the prime number agreed in advance by all clients;
% p denotes the modulo operation with respect to the prime p.
8. The matrix factorization method based on secure aggregation and key exchange under the federated learning framework according to claim 4, wherein in step S43 the gradient $\nabla_I^X$ is updated by the following formula (6):

$$\tilde{\nabla}_I^X = \nabla_I^X + \sum_{Y \in \{1,2,\dots,N\}\setminus\{X\}} a(X, Y)\,\mathrm{mask}(X, Y) \tag{6}$$

in formula (6), a(X, Y) is 1 or -1: with the clients numbered {1, 2, …, X, …, N}, a(X, Y) equals 1 if the number of client X is greater than the number of client Y, and equals -1 otherwise;
$\sum_{Y \in \{1,2,\dots,N\}\setminus\{X\}}$ denotes summation over the index Y, over all clients Y other than X.
9. The matrix factorization method based on secure aggregation and key exchange under the federated learning framework according to claim 1, 3 or 4, wherein in step S4 the summarized gradient $\nabla_I$ is expressed by the following formula (7):

$$\nabla_I = \sum_{X=1}^{N} \tilde{\nabla}_I^X \tag{7}$$

in step S4, the method of updating the embedding matrix I is expressed by the following formula (8):

$$I \leftarrow I - \eta\,\big(\nabla_I + 2\lambda_I I\big) \tag{8}$$

in formula (8), $\lambda_I$ denotes the regularization parameter of the embedding matrix I.
10. The matrix factorization method based on secure aggregation and key exchange under the federated learning framework according to claim 1, 3 or 4, wherein the gradient $\nabla_I^X$ generated in step S3 is noised before proceeding to step S4; the noising method for the gradient $\nabla_I^X$ is expressed by the following formula (9):

$$\nabla_I^X \leftarrow \nabla_I^X + n_X \tag{9}$$

in formula (9), $n_X$ denotes Gaussian noise.
CN202210899003.8A 2022-07-28 2022-07-28 Matrix decomposition method based on security aggregation and key exchange under federal learning framework Active CN115225405B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202210899003.8A CN115225405B (en) 2022-07-28 2022-07-28 Matrix decomposition method based on security aggregation and key exchange under federal learning framework
CN202310620692.9A CN116545734A (en) 2022-07-28 2022-07-28 Matrix decomposition method based on security aggregation and key exchange
CN202310622218.XA CN116545735A (en) 2022-07-28 2022-07-28 Matrix decomposition method under federal learning framework

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210899003.8A CN115225405B (en) 2022-07-28 2022-07-28 Matrix decomposition method based on security aggregation and key exchange under federal learning framework

Related Child Applications (2)

Application Number Title Priority Date Filing Date
CN202310622218.XA Division CN116545735A (en) 2022-07-28 2022-07-28 Matrix decomposition method under federal learning framework
CN202310620692.9A Division CN116545734A (en) 2022-07-28 2022-07-28 Matrix decomposition method based on security aggregation and key exchange

Publications (2)

Publication Number Publication Date
CN115225405A (en) 2022-10-21
CN115225405B (en) 2023-04-21

Family

ID=83614120

Family Applications (3)

Application Number Title Priority Date Filing Date
CN202310622218.XA Pending CN116545735A (en) 2022-07-28 2022-07-28 Matrix decomposition method under federal learning framework
CN202210899003.8A Active CN115225405B (en) 2022-07-28 2022-07-28 Matrix decomposition method based on security aggregation and key exchange under federal learning framework
CN202310620692.9A Pending CN116545734A (en) 2022-07-28 2022-07-28 Matrix decomposition method based on security aggregation and key exchange

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202310622218.XA Pending CN116545735A (en) 2022-07-28 2022-07-28 Matrix decomposition method under federal learning framework

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202310620692.9A Pending CN116545734A (en) 2022-07-28 2022-07-28 Matrix decomposition method based on security aggregation and key exchange

Country Status (1)

Country Link
CN (3) CN116545735A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115249074B (en) * 2022-07-28 2023-04-14 上海光之树科技有限公司 Distributed federal learning method based on Spark cluster and Ring-AllReduce architecture
CN115865307B (en) * 2023-02-27 2023-05-09 蓝象智联(杭州)科技有限公司 Data point multiplication operation method for federal learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113420232A (en) * 2021-06-02 2021-09-21 杭州电子科技大学 Privacy protection-oriented graph neural network federal recommendation method
CN114510652A (en) * 2022-04-20 2022-05-17 宁波大学 Social collaborative filtering recommendation method based on federal learning
CN114564742A (en) * 2022-02-18 2022-05-31 北京交通大学 Lightweight federated recommendation method based on Hash learning
WO2022141839A1 (en) * 2020-12-31 2022-07-07 平安科技(深圳)有限公司 Method and apparatus for updating federated learning model, and electronic device and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10630655B2 (en) * 2017-05-18 2020-04-21 Robert Bosch Gmbh Post-quantum secure private stream aggregation

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022141839A1 (en) * 2020-12-31 2022-07-07 平安科技(深圳)有限公司 Method and apparatus for updating federated learning model, and electronic device and storage medium
CN113420232A (en) * 2021-06-02 2021-09-21 杭州电子科技大学 Privacy protection-oriented graph neural network federal recommendation method
CN114564742A (en) * 2022-02-18 2022-05-31 北京交通大学 Lightweight federated recommendation method based on Hash learning
CN114510652A (en) * 2022-04-20 2022-05-17 宁波大学 Social collaborative filtering recommendation method based on federal learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
董业; 侯炜; 陈小军; 曾帅. 基于秘密分享和梯度选择的高效安全联邦学习 (Efficient and secure federated learning based on secret sharing and gradient selection). 计算机研究与发展 (Journal of Computer Research and Development), 2020, (10). *
陈国润; 母美荣; 张蕊; 孙丹; 钱栋军. 基于联邦学习的通信诈骗识别模型的实现 (Implementation of a telecom-fraud recognition model based on federated learning). 电信科学 (Telecommunications Science), 2020, (S1). *

Also Published As

Publication number Publication date
CN116545735A (en) 2023-08-04
CN116545734A (en) 2023-08-04
CN115225405A (en) 2022-10-21

Similar Documents

Publication Publication Date Title
CN115225405B (en) Matrix decomposition method based on security aggregation and key exchange under federal learning framework
US7526084B2 (en) Secure classifying of data with Gaussian distributions
CN112149160B (en) Homomorphic pseudo-random number-based federated learning privacy protection method and system
Xing et al. Mutual privacy preserving $ k $-means clustering in social participatory sensing
US11449753B2 (en) Method for collaborative learning of an artificial neural network without disclosing training data
CN111177791B (en) Method and device for protecting business prediction model of data privacy joint training by two parties
Zhao et al. PVD-FL: A privacy-preserving and verifiable decentralized federated learning framework
CN112989368A (en) Method and device for processing private data by combining multiple parties
CN113761557A (en) Multi-party deep learning privacy protection method based on fully homomorphic encryption algorithm
CN113420232A (en) Privacy protection-oriented graph neural network federal recommendation method
CN111104968A (en) Safety SVM training method based on block chain
Minelli Fully homomorphic encryption for machine learning
Niu et al. Anticontrol of a fractional-order chaotic system and its application in color image encryption
CN115186831A (en) Deep learning method with efficient privacy protection
Zhao et al. SGBoost: An efficient and privacy-preserving vertical federated tree boosting framework
Zhang et al. SecureTrain: An approximation-free and computationally efficient framework for privacy-preserved neural network training
CN115865307B (en) Data point multiplication operation method for federal learning
CN116167088A (en) Method, system and terminal for privacy protection in two-party federal learning
CN114358323A (en) Third-party-based efficient Pearson coefficient calculation method in federated learning environment
CN112819058B (en) Distributed random forest evaluation system and method with privacy protection attribute
CN113962286A (en) Decentralized logistic regression classification prediction method based on piecewise function
Arita et al. Two applications of multilinear maps: group key exchange and witness encryption
CN116248252B (en) Data dot multiplication processing method for federal learning
Weng et al. Privacy-Preserving Neural Network Based on Multi-key NTRU Cryptosystem
Sun et al. A lottery SMC protocol for the selection function in software defined wireless sensor networks

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant