WO2021092980A1 - Longitudinal federated learning optimization method, apparatus and device, and storage medium - Google Patents


Info

Publication number
WO2021092980A1
Authority
WO
WIPO (PCT)
Prior art keywords
participant
value
encrypted
target
encrypted data
Prior art date
Application number
PCT/CN2019/119418
Other languages
French (fr)
Chinese (zh)
Inventor
范涛
杨恺
陈天健
杨强
Original Assignee
深圳前海微众银行股份有限公司
Priority date
Filing date
Publication date
Application filed by 深圳前海微众银行股份有限公司
Publication of WO2021092980A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis

Definitions

  • This application relates to the technical field of financial technology (Fintech), and in particular to a vertical federated learning optimization method, apparatus, device, and storage medium.
  • The existing vertical linear regression method in federated learning is a stochastic gradient descent method based on first-order information. Take vertical federated linear regression with two participants as an example: Party A is the host, holding only part of the data features, while Party B is the guest, holding data features completely different from A's and also holding the data labels.
  • Party B needs to request the inner product of the current model parameters and data from Party A to calculate the loss function value and gradient. In this process, Party A sends its encrypted calculation data to Party B, and Party B calculates the encrypted coefficients. From these coefficients, A and B can each calculate their own gradient component and send it to the third party C for decryption and processing; C sends the results back to A and B as the descent direction, both parties update the model parameters they hold, and by iterating this step A and B obtain a trained model.
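One iteration of this baseline first-order protocol can be sketched as follows. This is an illustrative toy, not the application's implementation: encryption is replaced by an identity stand-in so the data flow between A, B, and the update is visible, and all variable names are invented for the example.

```python
import numpy as np

# Toy sketch of one first-order iteration of two-party vertical federated
# linear regression. Real deployments would homomorphically encrypt u_A,
# d, and the gradient shares; here they travel in the clear.
rng = np.random.default_rng(0)
n = 8
x_A, x_B = rng.normal(size=(n, 3)), rng.normal(size=(n, 2))  # split features
y = rng.normal(size=n)                                       # labels held by B
w_A, w_B = np.zeros(3), np.zeros(2)
lr = 0.1

u_A = x_A @ w_A                      # A's partial inner products, sent to B
d = u_A + x_B @ w_B - y              # B forms the residual
loss = 0.5 * np.mean(d ** 2)         # loss value (would be encrypted)
g_A, g_B = x_A.T @ d / n, x_B.T @ d / n   # per-party gradient components
w_A, w_B = w_A - lr * g_A, w_B - lr * g_B # both parties update locally
```

After the update the loss on the same batch decreases, which is exactly the descent step that parties repeat round after round in the first-order scheme.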
  • The existing schemes iterate based on first-order information of the objective loss function, and their convergence is slow. This results in a large number of rounds of data interaction among A, B, and C, and in cross-enterprise cooperation this communication takes a lot of time.
  • The main purpose of this application is to propose a vertical federated learning optimization method, apparatus, device, and storage medium, which aims to solve the technical problem that current vertical federated learning takes a long time to train.
  • the longitudinal federated learning optimization method includes the following steps:
  • the secondary participant obtains the encrypted value set with linear regression value sent by the main participant, and calculates the secondary encrypted data according to the encrypted value set;
  • the loss function value and the secondary encrypted data are sent to the coordinator, where the coordinator is used to update the second derivative matrix in the coordinator according to the secondary encrypted data in response to the vertical federation model not converging, and to calculate the target secondary gradient value according to the updated second derivative matrix;
  • this application also provides a longitudinal federated learning optimization method including the following steps:
  • wherein the intermediate result value is calculated by the secondary participant based on the encrypted value set sent by the main participant, and the encrypted value set includes the main encrypted value and the new encrypted value;
  • the target secondary gradient value is sent to the secondary participant, and the secondary participant is used to update the local model parameters in the secondary participant based on the target secondary gradient value, and to continue performing the step of obtaining the encrypted value set sent by the main participant, until the vertical federation model converges.
  • the present application also provides a longitudinal federated learning optimization device, the longitudinal federated learning optimization device includes:
  • the obtaining module is used for the secondary participant to obtain the encrypted value set with linear regression value sent by the main participant, and calculate the secondary encrypted data according to the encrypted value set;
  • the sending module is configured to send the secondary encrypted data to the coordinator, where the coordinator is used to update the second derivative matrix in the coordinator according to the secondary encrypted data in response to the vertical federation model not converging, and to calculate the target secondary gradient value according to the updated second derivative matrix;
  • the first receiving module is configured to receive the target secondary gradient value sent by the coordinator based on the secondary encrypted data, update the local model parameters in the secondary participant based on the target secondary gradient value, and continue to execute the secondary participant The step of obtaining the encrypted value set with linear regression value sent by the main participant until the vertical federation model corresponding to the coordinator converges.
  • the longitudinal federated learning optimization device further includes:
  • the second receiving module is used to receive the primary encrypted data sent by the primary participant and the secondary encrypted data sent by the secondary participant, wherein the secondary encrypted data is calculated according to the intermediate result value in the secondary participant, and the The intermediate result value is calculated by the secondary participant based on the encrypted value set sent by the main participant, and the encrypted value set includes the main encrypted value and the new encrypted value;
  • the update module is configured to respond to the failure of the longitudinal logistic regression model to converge, update the second derivative matrix according to the main encrypted data and the auxiliary encrypted data, and calculate the target sub-gradient value according to the updated second derivative matrix;
  • the convergence module is configured to send the target secondary gradient value to the secondary participant, and the secondary participant is used to update the local model parameters in the secondary participant based on the target secondary gradient value and continue to execute all The step of obtaining the encrypted value set with linear regression value sent by the main participant by the secondary participant until the vertical federation model corresponding to the coordinator converges.
  • this application also provides a vertical federated learning optimization device. The vertical federated learning optimization device includes: a memory, a processor, and computer-readable instructions stored on the memory and executable on the processor; when the computer-readable instructions are executed by the processor, the steps of the vertical federated learning optimization method described above are implemented.
  • the present application also provides a storage medium having computer-readable instructions stored thereon; when the computer-readable instructions are executed by a processor, the steps of the above-mentioned vertical federated learning optimization method are implemented.
  • FIG. 1 is a schematic diagram of a device structure of a hardware operating environment involved in a solution of an embodiment of the present application
  • FIG. 2 is a schematic flowchart of the first embodiment of the vertical federated learning optimization method according to this application;
  • FIG. 3 is a schematic flowchart of another embodiment of the vertical federated learning optimization method of this application.
  • Figure 4 is a schematic diagram of the device modules of the vertical federated learning optimization device of the application.
  • Figure 5 is a schematic diagram of the calculation and interaction process of the vertical federated learning optimization method of this application.
  • FIG. 1 is a schematic diagram of the device structure of the hardware operating environment involved in the solution of the embodiment of the present application.
  • the longitudinal federated learning optimization device in the embodiment of the present application may be a PC or a server device, on which a Java virtual machine runs.
  • the vertical federated learning optimization device may include: a processor 1001, such as a CPU, a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002.
  • the communication bus 1002 is used to implement connection and communication between these components.
  • the user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard), and the optional user interface 1003 may also include a standard wired interface and a wireless interface.
  • the network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • the memory 1005 may be a high-speed RAM memory, or a non-volatile memory, such as a magnetic disk memory.
  • the memory 1005 may also be a storage device independent of the aforementioned processor 1001.
  • the structure of the device shown in FIG. 1 does not constitute a limitation on the device, and may include more or fewer components than those shown in the figure, or a combination of certain components, or different component arrangements.
  • the memory 1005 as a storage medium may include an operating system, a network communication module, a user interface module, and computer readable instructions.
  • the network interface 1004 is mainly used to connect to the back-end server and communicate with the back-end server; the user interface 1003 is mainly used to connect to the client (user side) and communicate with the client; and the processor 1001 can be used to call computer-readable instructions stored in the memory 1005, and perform operations in the following vertical federated learning optimization method.
  • FIG. 2 is a schematic flowchart of a first embodiment of a longitudinal federated learning optimization method according to this application. The method includes:
  • Step S10 the secondary participant obtains the encrypted value set with linear regression value sent by the main participant, and calculates the secondary encrypted data according to the encrypted value set;
  • Linear regression is a method based on a linear model to fit data features (independent variables) and data labels (dependent variables).
  • Vertical federated linear regression means that multiple participants want to combine data for linear regression modeling, but each holds a part of different data characteristics, and the data labels are often owned by only one party. Therefore, in this embodiment, the main participant has only part of the characteristics of the data, while the sub-participants have some data characteristics that are completely different from the main participant.
  • Vertical federated learning means that different parties have different feature data, which is equivalent to dividing each complete data into multiple parts vertically. Each party hopes to implement linear regression model training while protecting data privacy, so as to use the model The parameter predicts the value of the dependent variable on the new data.
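The vertical (column-wise) split described above can be illustrated with a small toy example; the array shapes and names are invented for illustration only.

```python
import numpy as np

# Each complete record is divided column-wise between the participants,
# and only one party also holds the label column.
data = np.arange(20.0).reshape(4, 5)     # 4 records, 5 features each
labels = np.array([0.0, 1.0, 0.0, 1.0])

x_A = data[:, :3]    # host Party A: first 3 feature columns, no labels
x_B = data[:, 3:]    # guest Party B: the remaining feature columns ...
y_B = labels         # ... plus the data labels
```

Stacking `x_A` and `x_B` back together column-wise recovers the complete records, which is what "dividing each complete data vertically" means.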
  • [[·]] denotes the homomorphic encryption operation.
  • In a vertical federation scenario, only one party holds the data labels. Take two parties as an example: Party A holds data x_A and maintains the corresponding model parameters w_A, and Party B holds x_B and y_B and maintains the corresponding model parameters w_B.
  • the loss function and gradient can be expressed as operations on the homomorphically encrypted data of both parties, namely:
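The formulas referenced here did not survive extraction. One reconstruction consistent with the surrounding text, for vertical federated linear regression over n samples with residual d_i and additively homomorphic encryption [[·]] (the exact form in the application may differ), is:

```latex
% Party A sends [[u_{A,i}]] and [[u_{A,i}^2]] for its share u_{A,i} = w_A x_{A,i};
% Party B holds u_{B,i} = w_B x_{B,i} and the label y_i.
[[d_i]] = [[u_{A,i}]] + u_{B,i} - y_i
% Encrypted loss, using additive homomorphic operations only:
[[\mathrm{loss}]] = \frac{1}{2n}\sum_{i=1}^{n}\Big([[u_{A,i}^2]]
    + 2\,(u_{B,i}-y_i)\,[[u_{A,i}]] + (u_{B,i}-y_i)^2\Big)
% Encrypted gradient shares for the two parties:
[[g_A]] = \frac{1}{n}\sum_i [[d_i]]\, x_{A,i}, \qquad
[[g_B]] = \frac{1}{n}\sum_i [[d_i]]\, x_{B,i}
```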
  • This solution uses second-order information to propose a fast-converging technical scheme based on the second-order derivative matrix of the loss function (i.e., the Hessian matrix). The design idea of this scheme is based on the quasi-Newton method, using the second-order information to estimate an inverse Hessian matrix H. Instead of the gradient g, H·g is used as the descent direction to speed up the convergence of the algorithm. Since the dimension of the inverse Hessian matrix H is much larger than that of the gradient, the core design point is how to reduce the data communication volume among the parties.
  • This scheme proposes maintaining the inverse Hessian matrix H at Party C. In addition to computing the gradient, every L steps A and B randomly select a small batch of data, calculate the difference between the average of the model over the most recent L steps and the average over the L steps before that, and then calculate a vector containing the second-order information of that batch of data and send it to Party C; its dimension is the same as that of the gradient.
  • Party C uses the information of the most recent M vectors v to update the inverse Hessian matrix. Therefore, in this embodiment, the main participant is regarded as Party A, the secondary participant as Party B, and the coordinator as Party C.
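For squared loss, one plausible realization of the second-order vector v described above is a Hessian-vector product on the mini-batch: the mini-batch Hessian is X_H^T X_H / b, so v = H_batch · δ has the same dimension as the gradient. This is a hedged sketch; the shapes, variable names, and the choice of v are illustrative, not taken from the application text.

```python
import numpy as np

rng = np.random.default_rng(1)
L_steps, d_feat, b = 4, 3, 16
w_history = rng.normal(size=(2 * L_steps, d_feat))  # last 2L model iterates

w_bar_prev = w_history[:L_steps].mean(axis=0)  # average over previous L steps
w_bar_curr = w_history[L_steps:].mean(axis=0)  # average over most recent L steps
delta = w_bar_curr - w_bar_prev                # difference of the two averages

x_H = rng.normal(size=(b, d_feat))             # random mini-batch S_H
v = x_H.T @ (x_H @ delta) / b                  # second-order info; dim == gradient
```

Because v is a single vector of gradient dimension, sending it to C every L steps keeps communication small even though it carries curvature information.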
  • Party A calculates the value set of the corresponding data IDs in S, encrypts all the values using homomorphic encryption technology to obtain the encrypted value set, and transmits it to Party B. Party A then updates its running model average and judges the relationship between the current iteration number k and L. If k is an integer multiple of L and k is greater than 2L, A calculates the difference between the current (t) and previous (t-1) model averages, randomly selects a small batch of data IDs as S_H, calculates the corresponding values on S_H, and transmits them, homomorphically encrypted, to Party B. If k is an integer multiple of L but not greater than 2L, A only updates its model average.
  • Step S20: Send the secondary encrypted data to the coordinator, where the coordinator is used to update the second derivative matrix in the coordinator according to the secondary encrypted data in response to the vertical federation model not converging, and to calculate the target secondary gradient value according to the updated second derivative matrix.
  • Party B transmits the encrypted loss value to Party C. A and B respectively transmit [[g_A]] and [[g_B]] to C. Then, the relationship between the current iteration number k and L is determined.
  • Party C (i.e., the coordinator) updates its state and judges the relationship between the iteration number k and 2L. If k is not greater than 2L, C calculates the product of a pre-selected step size and the gradient and transmits the parts to A and B respectively (that is, the target primary gradient value is obtained and sent to the main participant, and the product corresponding to the secondary participant is sent to the secondary participant). If k is greater than 2L, C merges the two gradients into one long vector g, calculates the product of the step size, H, and g, splits it into the parts corresponding to A and B, and transmits them to A and B respectively. When k is an integer multiple of L, C has also received the encrypted data [[v_A]], [[v_B]], which it decrypts and combines into v, stored in a queue of length M.
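The coordinator's branching on k can be sketched as follows. This is an illustrative reading of the text above, not the application's code; the function name and signature are invented.

```python
import numpy as np

def coordinator_direction(k, L, eta, g_A, g_B, H=None):
    """Return the (unencrypted) update vectors for parties A and B."""
    if k <= 2 * L:
        # Early phase: product of the pre-selected step size and the gradient.
        return eta * g_A, eta * g_B
    # Later phase: merge the gradients into one long vector, apply eta * H @ g,
    # then split the result back into the parts belonging to A and B.
    g = np.concatenate([g_A, g_B])
    step = eta * (H @ g)
    return step[: len(g_A)], step[len(g_A):]
```

With H equal to the identity, the quasi-Newton branch reduces to a plain gradient step, which is a quick way to sanity-check the split-and-merge bookkeeping.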
  • the calculation method is as follows: initialize with the value at the end of the memory queue, that is, set H ← ρ[m]·I, where I is the identity matrix.
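The product H·g that C needs can be formed without ever materializing H, using the standard L-BFGS two-loop recursion over the stored (δ, v) pairs, with the δ playing the role of the usual s vectors and v the role of y. This is a hedged sketch under that interpretation; the scalar initialization below uses the standard L-BFGS scaling, which may differ from the application's exact ρ[m].

```python
import numpy as np

def two_loop_direction(g, deltas, vs):
    """Return H @ g implicitly from stored (delta, v) curvature pairs."""
    q = g.copy()
    rho = [1.0 / (d @ v) for d, v in zip(deltas, vs)]
    alphas = []
    # First loop: newest pair to oldest.
    for d, v, r in zip(reversed(deltas), reversed(vs), reversed(rho)):
        a = r * (d @ q)
        alphas.append(a)
        q -= a * v
    # Initialize with a scalar multiple of the identity (H0 = gamma * I),
    # computed from the newest pair at the end of the memory queue.
    gamma = (deltas[-1] @ vs[-1]) / (vs[-1] @ vs[-1])
    q *= gamma
    # Second loop: oldest pair to newest.
    for d, v, r, a in zip(deltas, vs, rho, reversed(alphas)):
        beta = r * (v @ q)
        q += (a - beta) * d
    return q
```

With a single exact curvature pair (δ = e1, v = 2·e1, i.e. curvature 2 along e1), the recursion returns g scaled by 1/2 along that direction, matching the Newton step for that coordinate.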
  • Step S40: Receive the target secondary gradient value sent by the coordinator based on the secondary encrypted data, update the local model parameters in the secondary participant based on the target secondary gradient value, and continue performing the step of the secondary participant obtaining the encrypted value set with linear regression values sent by the main participant, until the vertical federation model corresponding to the coordinator converges.
  • After the secondary participant receives the target secondary gradient value sent by the coordinator, it updates its own local model parameters according to that value. Likewise, after the primary participant receives the target primary gradient value sent by the coordinator, it updates the primary participant's model parameters based on that product. That is, the two parties A and B use the received unencrypted vectors to update their local model parameters.
  • Party A performs local calculations and transmits the encrypted data to Party B. Party B performs local calculations based on the encrypted data transmitted by Party A to obtain the encrypted loss function value and encrypted values, and transmits [[d]] and [[h]] to Party A. Both parties also compute their respective encrypted gradient values and transmit them to Party C; that is, A and B send [[g_A]], [[g_B]], [[v_A]], [[v_B]] to C, and Party B also transmits [[loss]] to Party C.
  • Party C decrypts the received [[g_A]], [[g_B]], [[loss]], [[v_A]], [[v_B]] to obtain the decrypted g_A, g_B, loss, v_A, v_B, and judges whether the algorithm has converged according to loss. If not, C updates H according to the received values and computes the descent directions: when k is not greater than 2L, C calculates the product of a pre-selected step size and the gradient and transmits the parts to A and B respectively; when k is greater than 2L, C merges the two gradients into one long vector g, calculates the product of the step size, H, and g, splits it into the parts corresponding to A and B, and transmits them to A and B respectively.
  • Both parties A and B update their local model parameters according to the unencrypted vectors passed by Party C.
  • The loss function value and the secondary encrypted data are calculated and sent to the coordinator, so that the coordinator can determine whether the vertical federation model has converged according to the loss function value. If it has not converged, the second derivative matrix is updated according to the secondary encrypted data, the target secondary gradient value is calculated according to the updated second derivative matrix, and the local model parameters of the secondary participant are then updated with the target secondary gradient value. This avoids the slow convergence and large number of data-interaction rounds of the first-order algorithms used for vertical federated learning in the prior art, thereby reducing the communication volume of vertical federated learning and improving the convergence rate of vertical federated model training.
  • a second embodiment of the vertical federated learning optimization method of the present application is proposed.
  • This embodiment is a refinement of the step S10 of the first embodiment of the present application.
  • the secondary participant obtains the encrypted value set with linear regression values sent by the main participant, including: step a, detecting whether the vertical federation model satisfies the preset judgment condition;
  • When the main participant sends data to the secondary participants, it is also necessary to detect whether the vertical federation model satisfies the preset judgment condition, for example, to judge whether the new iteration number of the vertical federation model meets the preset number condition (such as whether the new iteration number is an integer multiple of the iteration step interval, and whether it is greater than twice the preset number), and to perform different operations according to the different judgment results.
  • step b: if it is satisfied, the secondary participant obtains the main encrypted value and the new encrypted value sent by the main participant, and uses the main encrypted value and the new encrypted value together as the encrypted value set with linear regression values sent by the main participant.
  • In the main participant, a small batch of data is first obtained, and each regression score is calculated according to the formula mentioned in the above embodiment; these scores are encrypted using homomorphic encryption technology to obtain the main encrypted value. A further small batch of data is then obtained, its regression scores are likewise calculated and homomorphically encrypted to obtain the new encrypted value, and the main encrypted value and the new encrypted value together serve as the encrypted value set with linear regression values. The main encrypted value and the new encrypted value are not the same, and both are sent to the secondary participant; that is, the secondary participant obtains the main encrypted value and the new encrypted value sent by the main participant.
  • If the condition is not satisfied, the main participant only sends the main encrypted value to the secondary participants; in that case the main encrypted value alone is the encrypted value set with linear regression values.
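The "compute scores, then encrypt" step can be illustrated with a deliberately NON-secure additive stand-in for a real additively homomorphic scheme such as Paillier: ciphertexts below just carry a random mask so the additive interface is visible. This is a toy for exposition only, with invented names; do not use it as actual cryptography.

```python
import numpy as np

class ToyCipher:
    """Insecure stand-in exposing the additively homomorphic interface."""
    def __init__(self, masked, mask):
        self.masked, self.mask = masked, mask

    def __add__(self, other):      # [[a]] + [[b]] = [[a + b]]
        return ToyCipher(self.masked + other.masked, self.mask + other.mask)

    def __mul__(self, scalar):     # [[a]] * c = [[a * c]] for plaintext c
        return ToyCipher(self.masked * scalar, self.mask * scalar)

def encrypt(x, rng):
    m = rng.normal()
    return ToyCipher(x + m, m)

def decrypt(c):
    return c.masked - c.mask

rng = np.random.default_rng(2)
w_A = np.array([0.5, -1.0])                    # main participant's parameters
x_A = np.array([[1.0, 2.0], [3.0, 0.0]])       # small batch of its features
scores = x_A @ w_A                             # regression scores on the batch
enc_scores = [encrypt(s, rng) for s in scores] # the "main encrypted value"
```

The key property exercised here is that sums and plaintext-scalar products of ciphertexts decrypt to the corresponding sums and products of the scores, which is what lets the secondary participant compute on data it cannot read.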
  • By determining that the vertical federation model satisfies the preset judgment condition, the secondary participant obtains the main encrypted value and the new encrypted value sent by the main participant and uses them as the encrypted value set with linear regression values, thereby improving the training speed of the vertical linear regression model.
  • the step of calculating the secondary encrypted data according to the encrypted value set includes: step c, determining whether the current iteration number corresponding to the secondary participant satisfies a preset number condition; step d, if it is satisfied, calculating the intermediate result value according to the encrypted value set, and calculating the secondary encrypted data according to the intermediate result value.
  • Party B updates its model average and calculates the difference between the current (t) and previous (t-1) averages. Party B then performs calculations on S_H to obtain the intermediate result value, transmits it to Party A, and at the same time calculates the secondary encrypted data in the secondary participant based on the intermediate result value.
  • The intermediate result value is calculated according to the encrypted value set, and the secondary encrypted data is calculated from the intermediate result value, thereby ensuring the accuracy of the obtained secondary encrypted data.
  • the step of calculating the intermediate result value according to the encrypted value set, and calculating the secondary encrypted data through the intermediate result value, includes: step e, obtaining the current average value of the local model parameters in the secondary participant based on the encrypted value set, and obtaining the historical average value of the preset step interval before the current average value;
  • After the secondary participant obtains the encrypted value set sent by the main participant, it obtains the current average value of the local model parameters in the secondary participant, and also obtains the historical average value over the preset step interval before the current average value.
  • Step f: Calculate the difference between the current average value and the historical average value, calculate the intermediate result value according to the difference, and calculate the secondary encrypted data according to the intermediate result value.
  • The intermediate result value is calculated based on the difference between the current average value and the historical average value in the secondary participant, and the secondary encrypted data is calculated from the intermediate result value, thereby ensuring the accuracy of the obtained secondary encrypted data.
  • a third embodiment of the vertical federated learning optimization method of this application is proposed. This embodiment is a refinement of the step S30 of the first embodiment of the present application.
  • the step of receiving the target secondary gradient value sent by the coordinator based on the secondary encrypted data includes: step g, receiving the target secondary gradient value sent by the coordinator based on the secondary encrypted data, wherein the target secondary gradient value is obtained from the second derivative matrix updated by the coordinator according to the target data, and the target data is obtained, in response to the vertical logistic regression model failing to converge while the preset judgment condition is satisfied, by decrypting and combining the primary encrypted data and the secondary encrypted data sent by the participants.
  • When the secondary participant receives the target secondary gradient value fed back by the coordinator, it can update its own local model parameters according to that value.
  • The target secondary gradient value is calculated by the coordinator, when it determines that the vertical logistic regression model has not converged and the preset judgment condition is satisfied, from the second derivative matrix updated according to the target data, where the target data is obtained by decrypting and combining the main encrypted data sent by the main participant and the secondary encrypted data sent by the secondary participant.
  • Judging whether the vertical logistic regression model satisfies the preset judgment condition means, for example, judging whether the new iteration number meets the preset number condition (such as whether it is an integer multiple of the iteration step interval, and whether it is greater than twice the preset number), and performing different operations according to the judgment results, thereby ensuring the accuracy of the obtained target secondary gradient value.
  • step of receiving the target secondary gradient value fed back by the coordinator includes:
  • Step h: Receive the target secondary gradient value fed back by the coordinator, where the target secondary gradient value is obtained by the coordinator splitting the first target product, and the first target product is the product of the second derivative matrix (updated in response to the vertical logistic regression model satisfying the preset judgment condition), the long vector combining the main gradient value sent by the main participant and the secondary gradient value sent by the secondary participant, and the preset step size.
  • When the secondary participant receives the target secondary gradient value fed back by the coordinator, it can update its own local model parameters according to that value. The target secondary gradient value is obtained by the coordinator splitting the first target product, where the first target product is computed, when the vertical logistic regression model has not converged and the preset judgment condition is satisfied, as the product of the updated second derivative matrix, the long vector combining the main gradient value and the secondary gradient value, and the preset step size. This guarantees the accuracy of the obtained target secondary gradient value.
  • step of receiving the target secondary gradient value fed back by the coordinator includes:
  • Step k: Receive the target secondary gradient value fed back by the coordinator, wherein the target secondary gradient value is a second target product, and the second target product is the product of the secondary gradient value sent by the secondary participant and the preset step size, calculated by the coordinator when the vertical logistic regression model has not converged and the preset judgment condition is not met.
  • When the secondary participant receives the target secondary gradient value fed back by the coordinator, it can update its own local model parameters according to that value. Here the target secondary gradient value is the second target product: when the coordinator's vertical logistic regression model has not converged and the preset judgment condition is not met, the coordinator calculates the product of the secondary gradient value sent by the secondary participant and the preset step size, and this product is the target secondary gradient value, thereby ensuring the accuracy of the obtained target secondary gradient value.
  • Fig. 3 is a schematic flowchart of another embodiment of the vertical federated learning optimization method of this application, including: Step S100, receiving the primary encrypted data sent by the primary participant and the secondary encrypted data sent by the secondary participant, where the secondary encrypted data is calculated according to the intermediate result value in the secondary participant, the intermediate result value is calculated by the secondary participant according to the encrypted value set sent by the main participant, and the encrypted value set includes the main encrypted value and the new encrypted value;
  • The coordinator determines, according to the loss function value sent by the secondary participant, that the longitudinal logistic regression model has not converged but meets the preset judgment condition, for example, by judging whether the new iteration number of the longitudinal logistic regression model meets the preset number condition (such as whether the new iteration number is an integer multiple of the iteration step interval and greater than twice the preset number). If the preset number condition is met, it is determined that the longitudinal logistic regression model satisfies the preset judgment condition. After receiving the primary encrypted data sent by the main participant and the secondary encrypted data sent by the secondary participant, the coordinator updates the second derivative matrix according to the primary encrypted data and the secondary encrypted data.
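The iteration-count check described in this step can be sketched as follows; the interval and warm-up threshold below are illustrative assumptions, not values fixed by this application:

```python
def second_order_update_due(k: int, interval: int = 5, warmup: int = 2) -> bool:
    """Hypothetical preset judgment condition: the new iteration number k is
    an integer multiple of the iteration step interval and greater than
    twice the preset (warm-up) number."""
    return k % interval == 0 and k > 2 * warmup
```

When the condition holds, the coordinator performs the second-order (quasi-Newton) update; otherwise it falls back to the plain step-size-times-gradient update described later.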
  • The secondary encrypted data is calculated by the secondary participant based on the intermediate result value obtained from the target value set sent by the main participant. That is, the primary participant sends the encrypted value set to the secondary participant; the secondary participant calculates the intermediate result value and the loss function value based on the encrypted value set, sends the loss function value to the coordinator, calculates the secondary encrypted data based on the intermediate result value, and sends the secondary encrypted data to the coordinator.
  • The encrypted value set may include the main encrypted value corresponding to the data and the new encrypted value corresponding to the new data. That is, it is checked whether the current iteration number corresponding to the main participant satisfies a preset condition (such as whether the current iteration number has passed the preset number). If it is not satisfied, the main encrypted value is used as the target value set; if it is satisfied, the main encrypted value and the new encrypted value are used as the encrypted value set.
  • Step S200: in response to the longitudinal logistic regression model failing to converge, update the second derivative matrix according to the main encrypted data and the auxiliary encrypted data, and calculate the target sub-gradient value according to the updated second derivative matrix;
  • When the coordinator detects that the longitudinal logistic regression model has not converged, it can update the second derivative matrix based on the main encrypted data sent by the main participant and the auxiliary encrypted data sent by the sub-participant. That is, the main encrypted data and the sub-encrypted data are decrypted, merged, and stored in a queue with a preset length to obtain the target queue, and the second derivative matrix H is updated according to the target queue.
  • The method of calculating H is to initialize it with the value at the end of the memory queue, that is, to calculate H ← p[m]·I, where I is the identity matrix and p[m] is the scaling value derived from the newest pair in the memory queue.
  • Judgment: that is, determine whether the longitudinal logistic regression model converges. If it converges, the coordinator sends an iteration stop signal to parties A and B and stops the training of the longitudinal logistic regression model; if it does not converge, the above steps are executed again until the longitudinal logistic regression model converges.
  • For example, the condition may require that the iteration number k is greater than 2L.
  • The two gradients are merged into a long vector g; the product of the step length, H, and g is calculated and split into the corresponding A and B parts (that is, the target main gradient value corresponding to party A and the target sub-gradient value corresponding to party B), which are transmitted to A and B respectively, namely:
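A minimal sketch of this merge, scale, and split step, assuming a plain NumPy representation of the already-decrypted per-party gradients and of the matrix H (the step size eta is an assumed value):

```python
import numpy as np

def coordinator_step(g_A, g_B, H, eta):
    """Merge the two per-party gradients into one long vector g, multiply by
    the step size and the second derivative matrix H, and split the result
    back into the part for party A and the part for party B."""
    g = np.concatenate([g_A, g_B])        # the long vector g
    d = eta * (H @ g)                     # product of step length, H and g
    return d[:g_A.size], d[g_A.size:]     # target main / target sub gradient values
```

In the actual protocol, H is maintained implicitly from the memory queues rather than formed as a dense matrix; this sketch only illustrates the merge and split bookkeeping.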
  • Step S300: send the target secondary gradient value to the secondary participant, where the secondary participant updates the local model parameters in the secondary participant based on the target secondary gradient value and continues to execute the step in which the secondary participant obtains the encrypted value set with linear regression values sent by the main participant, until the vertical federation model corresponding to the coordinator converges.
  • After the coordinator calculates the target sub-gradient value, it sends the target sub-gradient value to the secondary participant. The secondary participant updates its local model parameters according to the target sub-gradient value and continues to execute the step in which the secondary participant obtains the encrypted value set with linear regression values sent by the main participant, until the longitudinal logistic regression model corresponding to the coordinator converges, at which point an iteration stop signal is sent to the main participant and the secondary participant.
  • The main participant also receives the target main gradient value corresponding to the main participant, fed back by the coordinator, to update the local model parameters in the main participant.
  • The coordinator updates the second-order derivative matrix according to the main encrypted data and the auxiliary encrypted data, calculates the target sub-gradient value according to the updated second-order derivative matrix, and sends the target sub-gradient value to the sub-participant to update the local model parameters in the sub-participant. This avoids the phenomenon in the prior art where a first-order algorithm is adopted for longitudinal federated learning, which makes the convergence speed slow and requires a large number of rounds of data interaction, and thus reduces the communication volume of longitudinal federated learning.
  • The step of updating the second derivative matrix according to the primary encrypted data and the secondary encrypted data includes:
  • Step m: judging whether the longitudinal logistic regression model satisfies the preset judgment condition;
  • After the coordinator receives the main gradient value sent by the main participant and the secondary gradient value and loss value sent by the secondary participant, and determines that the longitudinal logistic regression model does not converge, it needs to determine whether the longitudinal logistic regression model meets the preset judgment condition, for example, by determining whether the new iteration number of the longitudinal logistic regression model meets the preset number condition (such as whether the new iteration number is an integer multiple of the iteration step interval and greater than twice the preset number), and then performs different operations according to the judgment result.
  • Step n: if the condition is satisfied, decrypt and merge the primary encrypted data and the secondary encrypted data to obtain target data;
  • After receiving the main encrypted data sent by the main participant and the secondary encrypted data sent by the sub-participant, the coordinator decrypts the encrypted data [[v_A]], [[v_B]] and merges them to obtain the target data.
  • Step p: store the target data in a queue with a preset length to obtain the target queue, and update the second derivative matrix through the target queue.
  • The coordinator stores the target data in a v queue with length M (i.e., the preset length). At the same time, it calculates the difference between the current (t) and the last (t-1) values and stores it in an s queue of length M. If the current memory has reached the maximum storage length M, the first element in each queue is deleted and the latest v and s are placed at the end of the queue. The m (m not greater than M) values of v and s in the current memory are used to calculate H (the second derivative matrix). The calculation method is as follows:
  • the target data is obtained by decrypting and combining the primary encrypted data and the secondary encrypted data, and then the second derivative matrix is updated according to the target data, thereby ensuring the effectiveness of the update of the second derivative matrix.
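The memory-queue update of H described above follows the limited-memory quasi-Newton (L-BFGS) pattern, in which H is never formed explicitly; instead the product H·g is computed directly from the stored pairs. The sketch below assumes the s queue holds parameter differences and the v queue holds gradient differences, which is one plausible reading of the scheme, and initializes with a scaled identity from the newest pair at the end of the queue:

```python
from collections import deque
import numpy as np

M = 10  # preset maximum memory length (assumed value)

s_queue = deque(maxlen=M)  # parameter differences s_t = w_t - w_(t-1)
v_queue = deque(maxlen=M)  # gradient differences stored as the v values

def store_pair(s, v):
    # deque(maxlen=M) drops the oldest entry automatically, matching
    # "delete the first one in the queue" in the text above
    s_queue.append(np.asarray(s, dtype=float))
    v_queue.append(np.asarray(v, dtype=float))

def implicit_H_times(g):
    """Compute H @ g via the L-BFGS two-loop recursion; H is initialized
    with the newest pair at the end of the memory queue (H0 = gamma * I)."""
    q = np.asarray(g, dtype=float).copy()
    rhos = [1.0 / float(v @ s) for s, v in zip(s_queue, v_queue)]
    alphas = []
    for s, v, rho in zip(reversed(s_queue), reversed(v_queue), reversed(rhos)):
        a = rho * float(s @ q)
        alphas.append(a)
        q -= a * v
    s_m, v_m = s_queue[-1], v_queue[-1]
    gamma = float(s_m @ v_m) / float(v_m @ v_m)  # scaling of the identity
    r = gamma * q
    for s, v, rho, a in zip(s_queue, v_queue, rhos, reversed(alphas)):
        b = rho * float(v @ r)
        r += (a - b) * s
    return r

store_pair([1.0, 0.0], [1.0, 0.0])
direction = implicit_H_times([3.0, 4.0])
```

With the single stored pair above (s equal to v), H reduces to the identity and the returned direction equals the input gradient, which is a quick sanity check on the recursion.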
  • the method includes:
  • Step x: if the condition is not satisfied, the coordinator obtains the first product between the secondary gradient value sent by the secondary participant and the preset step size, and sends the first product as the target secondary gradient value to the secondary participant.
  • The coordinator calculates the first product of the preset step size and the sub-gradient value, and the third product of the preset step size and the main gradient value corresponding to the main participant. The first product is sent to the secondary participant as the target secondary gradient value to update the local model parameters in the secondary participant, and the third product is sent to the main participant to update the local model parameters in the main participant. The model is then retrained according to the updated model parameters to obtain a new loss function value, which is sent to the coordinator through the secondary participant.
  • The first product between the sub-gradient value and the preset step size is calculated, and the first product is used as the target sub-gradient value, thereby guaranteeing the accuracy of the obtained target sub-gradient value.
  • The longitudinal federated learning optimization device includes: an acquisition module, used by the secondary participant to acquire the encrypted value set with linear regression values sent by the main participant and to calculate the secondary encrypted data according to the encrypted value set; a sending module, used to send the secondary encrypted data to the coordinator, where the coordinator updates the second-order derivative matrix in the coordinator according to the secondary encrypted data in response to the vertical federation model not converging, and calculates the target sub-gradient value according to the updated second-order derivative matrix; and a first receiving module, used to receive the target sub-gradient value sent by the coordinator based on the secondary encrypted data, update the local model parameters in the secondary participant based on the target sub-gradient value, and continue to perform the step of the secondary participant obtaining the encrypted value set with linear regression values sent by the primary participant, until the vertical federation model corresponding to the coordinator converges.
  • The acquisition module is further configured to: detect whether the vertical federation model meets a preset judgment condition; if so, the secondary participant acquires the primary encrypted value and the new encrypted value sent by the primary participant, and uses the primary encrypted value and the new encrypted value as the encrypted value set with linear regression values sent by the primary participant.
  • The acquisition module is further configured to determine whether the current iteration number corresponding to the secondary participant meets a preset number condition, and if so, calculate the intermediate result value according to the encrypted value set and calculate the secondary encrypted data through the intermediate result value.
  • The obtaining module is further configured to: obtain the current average value of the local model parameters in the secondary participant based on the encrypted value set, and obtain the historical average value from the preset step interval before the current average value; calculate the difference between the current average value and the historical average value, calculate an intermediate result value based on the difference, and calculate the secondary encrypted data through the intermediate result value.
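One plausible reading of this averaging step, sketched with NumPy; the step interval L and the plaintext parameter history are assumptions for illustration, since in the actual scheme these quantities are handled in encrypted form:

```python
import numpy as np

def interval_difference(param_history, L):
    """Current average of the local model parameters over the last L
    iterations, minus the historical average over the L iterations before
    that; the secondary participant derives its intermediate result value
    from this difference."""
    current_avg = np.mean(param_history[-L:], axis=0)
    historical_avg = np.mean(param_history[-2 * L:-L], axis=0)
    return current_avg - historical_avg
```

Averaging over an interval, rather than taking single iterates, smooths the stochastic noise in the parameter trajectory before the difference is used to update the second derivative matrix.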
  • The first receiving module is further configured to: receive a target secondary gradient value sent by the coordinator based on the secondary encrypted data, where the target secondary gradient value is calculated by the coordinator according to the updated target data, and the target data is obtained, in response to the longitudinal logistic regression model not converging while meeting the preset judgment condition, by decrypting and combining the primary encrypted data and the secondary encrypted data sent by the secondary participant.
  • The first receiving module is further configured to: receive a target sub-gradient value fed back by the coordinator, where the target sub-gradient value is obtained by the coordinator splitting the first target product, and the first target product is calculated based on the second derivative matrix updated in response to the longitudinal logistic regression model satisfying the preset judgment condition, the main gradient value sent by the main participant, and the sub-gradient value sent by the secondary participant.
  • The step of receiving the target sub-gradient value fed back by the coordinator includes: receiving the target sub-gradient value fed back by the coordinator, where the target sub-gradient value is a second target product, and the second target product is the product between the sub-gradient value sent by the secondary participant and the preset step length, calculated by the coordinator in response to the longitudinal logistic regression model not converging and not satisfying the preset judgment condition.
  • the longitudinal federated learning optimization device further includes: a second receiving module for receiving the main encrypted data sent by the main participant and the secondary encrypted data sent by the secondary participant, wherein the secondary encrypted data is based on the The intermediate result value in the secondary participant is calculated, the intermediate result value is calculated by the secondary participant according to the encrypted value set sent by the main participant, and the encrypted value set includes the main encrypted value and the new encrypted value;
  • The update module is configured to, in response to the longitudinal logistic regression model failing to converge, update the second derivative matrix according to the main encrypted data and the auxiliary encrypted data, and calculate the target sub-gradient value according to the updated second derivative matrix; the convergence module is used to send the target sub-gradient value to the secondary participant, where the secondary participant updates the local model parameters in the secondary participant based on the target sub-gradient value and continues to execute the step in which the secondary participant obtains the encrypted value set with linear regression values sent by the main participant, until the vertical federation model corresponding to the coordinator converges.
  • the update module is further configured to determine whether the longitudinal logistic regression model satisfies the preset determination condition; if so, decrypt and merge the primary encrypted data and the secondary encrypted data to Obtain target data; store the target data in a queue with a preset length to obtain the target queue, and update the second derivative matrix through the target queue.
  • The update module is further configured to: if the condition is not satisfied, the coordinator obtains the first product between the secondary gradient value sent by the secondary participant and the preset step size, and sends the first product to the secondary participant as the target secondary gradient value.
  • the present application also provides a storage medium, which may be a non-volatile readable storage medium.
  • the storage medium of the present application stores computer-readable instructions, and when the computer-readable instructions are executed by a processor, the steps of the vertical federated learning optimization method described above are realized.
  • For the method implemented when the computer-readable instructions are executed on the processor, please refer to the embodiments of the vertical federated learning optimization method of this application, which will not be repeated here.
  • The technical solution of this application, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, magnetic disks, or optical disks) and includes several instructions to cause a terminal device (which can be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to execute the method described in each embodiment of the present application.

Abstract

The present application relates to the technical field of fintech. Disclosed are a longitudinal federated learning optimization method, apparatus and device, and a storage medium. The method comprises: a subsidiary participant acquiring an encrypted value set having a linear regression value and sent by a master participant, and calculating subsidiary encrypted data according to the encrypted value set; sending the subsidiary encrypted data to a coordinator, wherein the coordinator is used to update, in response to the case where a longitudinal federated model does not converge, a second-order derivative matrix in the coordinator according to the subsidiary encrypted data, and calculate a target subsidiary gradient value according to the updated second-order derivative matrix; and receiving the target subsidiary gradient value sent by the coordinator on the basis of the subsidiary encrypted data, updating local model parameters in the subsidiary participant on the basis of the target subsidiary gradient value, and continuing to execute the step of the subsidiary participant acquiring the encrypted value set having a linear regression value and sent by the master participant until the longitudinal federated model corresponding to the coordinator converges.

Description

Longitudinal federated learning optimization method, device, equipment, and storage medium
Technical field
This application relates to the technical field of financial technology (Fintech), and in particular to longitudinal federated learning optimization methods, devices, equipment, and storage media.
Background art
With the development of computer technology, more and more technologies (big data, distributed computing, blockchain, artificial intelligence, etc.) are applied in the financial field, and the traditional financial industry is gradually transforming to Fintech. However, the financial industry's security and real-time requirements also place higher demands on technology. Consider, for example, the vertical linear regression method in federated learning: the existing vertical linear regression scheme is a stochastic gradient descent method based on first-order gradient information. Take a vertical federated linear regression with two participants as an example. Suppose party A and party B are set, where party A is the host party and holds only part of the features of the data, while party B is the guest party, holding a set of data features completely different from A's as well as the data labels. Party B needs to request from party A the inner product of the current model parameters and data in order to calculate the loss function value and gradient. In this process, party A sends its encrypted calculation data to party B, and party B calculates the encrypted coefficients. Through these coefficients, parties A and B can each calculate their respective gradient components and send them to a third party C for decryption and processing; C then sends the results back to A and B as the descent direction, the model parameters held by the two parties are updated, and this step is iterated so that A and B obtain a trained model. The existing schemes perform iterative optimization based on first-order gradient information of the objective loss function, so their convergence speed is slow. This requires a large number of rounds of data interaction among A, B, and C, and in cross-enterprise cooperation the communication takes a large amount of time.
Summary of the invention
The main purpose of this application is to propose a longitudinal federated learning optimization method, device, equipment, and storage medium, aiming to solve the technical problem that longitudinal federated learning is currently time-consuming.
In order to achieve the above objective, this application provides a longitudinal federated learning optimization method, which includes the following steps:
The secondary participant obtains the encrypted value set with linear regression values sent by the main participant, and calculates the secondary encrypted data according to the encrypted value set;
The loss function value and the secondary encrypted data are sent to the coordinator, where the coordinator is used to update the second derivative matrix in the coordinator according to the secondary encrypted data in response to the vertical federation model not converging, and to calculate the target sub-gradient value according to the updated second derivative matrix;
Receive the target sub-gradient value sent by the coordinator based on the secondary encrypted data, update the local model parameters in the secondary participant based on the target sub-gradient value, and continue to execute the step in which the secondary participant obtains the encrypted value set with linear regression values sent by the main participant, until the vertical federation model corresponding to the coordinator converges.
In addition, this application also provides a longitudinal federated learning optimization method, which includes the following steps:
Receive the primary encrypted data sent by the primary participant and the secondary encrypted data sent by the secondary participant, where the secondary encrypted data is calculated according to the intermediate result value in the secondary participant, the intermediate result value is calculated by the secondary participant according to the encrypted value set sent by the main participant, and the encrypted value set includes the main encrypted value and the new encrypted value;
In response to the longitudinal logistic regression model failing to converge, update the second derivative matrix according to the main encrypted data and the secondary encrypted data, and calculate the target sub-gradient value according to the updated second derivative matrix;
Send the target sub-gradient value to the secondary participant, where the secondary participant is used to update the local model parameters in the secondary participant based on the target sub-gradient value and to continue executing the step in which the secondary participant obtains the encrypted value set with linear regression values sent by the main participant, until the vertical federation model corresponding to the coordinator converges.
In addition, in order to achieve the above purpose, this application also provides a longitudinal federated learning optimization device, which includes:
An acquisition module, used by the secondary participant to obtain the encrypted value set with linear regression values sent by the main participant and to calculate the secondary encrypted data according to the encrypted value set;
A sending module, used to send the secondary encrypted data to the coordinator, where the coordinator is used to update the second derivative matrix in the coordinator according to the secondary encrypted data in response to the vertical federation model not converging, and to calculate the target sub-gradient value according to the updated second derivative matrix;
A first receiving module, used to receive the target sub-gradient value sent by the coordinator based on the secondary encrypted data, update the local model parameters in the secondary participant based on the target sub-gradient value, and continue to execute the step in which the secondary participant obtains the encrypted value set with linear regression values sent by the main participant, until the vertical federation model corresponding to the coordinator converges.
Optionally, the longitudinal federated learning optimization device further includes:
A second receiving module, used to receive the primary encrypted data sent by the primary participant and the secondary encrypted data sent by the secondary participant, where the secondary encrypted data is calculated according to the intermediate result value in the secondary participant, the intermediate result value is calculated by the secondary participant according to the encrypted value set sent by the main participant, and the encrypted value set includes the main encrypted value and the new encrypted value;
An update module, used to update the second derivative matrix according to the primary encrypted data and the secondary encrypted data in response to the longitudinal logistic regression model failing to converge, and to calculate the target sub-gradient value according to the updated second derivative matrix;
A convergence module, used to send the target sub-gradient value to the secondary participant, where the secondary participant updates the local model parameters in the secondary participant based on the target sub-gradient value and continues to execute the step in which the secondary participant obtains the encrypted value set with linear regression values sent by the main participant, until the vertical federation model corresponding to the coordinator converges.
In addition, in order to achieve the above purpose, this application also provides a longitudinal federated learning optimization device, which includes: a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the computer-readable instructions, when executed by the processor, implement the steps of the longitudinal federated learning optimization method described above.
In addition, in order to achieve the above purpose, this application also provides a storage medium on which computer-readable instructions are stored, where the computer-readable instructions, when executed by a processor, implement the steps of the longitudinal federated learning optimization method described above.
Description of the drawings
Fig. 1 is a schematic diagram of the device structure of the hardware operating environment involved in the solution of the embodiments of the present application;
Fig. 2 is a schematic flowchart of the first embodiment of the longitudinal federated learning optimization method of this application;
Fig. 3 is a schematic flowchart of another embodiment of the longitudinal federated learning optimization method of this application;
Fig. 4 is a schematic diagram of the device modules of the longitudinal federated learning optimization device of this application;
Fig. 5 is a schematic diagram of the calculation and interaction flow of the longitudinal federated learning optimization method of this application.
The realization, functional characteristics, and advantages of the purpose of this application will be further described in conjunction with the embodiments and with reference to the accompanying drawings.
Detailed description of the embodiments
It should be understood that the specific embodiments described here are only used to explain the application and are not used to limit the application.
As shown in FIG. 1, FIG. 1 is a schematic diagram of the device structure of the hardware operating environment involved in the solution of the embodiments of the present application. The longitudinal federated learning optimization device in the embodiments of the present application may be a PC or a server device, on which a Java virtual machine runs. As shown in FIG. 1, the longitudinal federated learning optimization device may include: a processor 1001, such as a CPU, a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 is used to implement connection and communication between these components. The user interface 1003 may include a display (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface). The memory 1005 may be a high-speed RAM memory, or a non-volatile memory such as a magnetic disk memory. Optionally, the memory 1005 may also be a storage device independent of the aforementioned processor 1001. Those skilled in the art can understand that the device structure shown in FIG. 1 does not constitute a limitation on the device, which may include more or fewer components than shown, a combination of certain components, or a different component arrangement. As shown in FIG. 1, the memory 1005, as a storage medium, may include an operating system, a network communication module, a user interface module, and computer-readable instructions. In the device shown in FIG. 1, the network interface 1004 is mainly used to connect to the back-end server and communicate with it; the user interface 1003 is mainly used to connect to the client (user side) and communicate with it; and the processor 1001 can be used to call the computer-readable instructions stored in the memory 1005 and perform the operations of the following longitudinal federated learning optimization method.
Based on the above hardware structure, an embodiment of the vertical federated learning optimization method of the present application is proposed. Referring to FIG. 2, FIG. 2 is a schematic flowchart of a first embodiment of the vertical federated learning optimization method of the present application. The method includes:
Step S10: the secondary participant obtains the encrypted value set with linear regression values sent by the main participant, and calculates secondary encrypted data according to the encrypted value set.
Linear regression is a method of fitting data features (independent variables) to data labels (dependent variables) with a linear model. Vertical federated linear regression means that multiple participants wish to combine their data for joint linear regression modeling, while each holds a different part of the data features and the data labels are usually owned by only one party. In this embodiment, therefore, the main participant holds only part of the features of the data, while the secondary participant holds a part of the data features completely different from the main participant's. Training a linear regression model means obtaining the model parameters $w$ that minimize the loss function $L(w)=\sum_i \|w^T x_i - y_i\|^2$ over the given data features and labels $(x_i, y_i)$. Vertical federated learning means that different parties each own different feature data, which is equivalent to splitting every complete data record vertically into several parts; the parties wish to train the linear regression model while protecting data privacy, and then use the model parameters to predict the dependent-variable value on new data. This solution adopts an encryption method satisfying additive homomorphism, namely $[[ax]] = a[[x]]$ and $[[x]] + [[y]] = [[x+y]]$, where $[[\cdot]]$ denotes the homomorphic encryption operation. In a vertical federation scenario only one party holds the data labels. Taking two parties as an example, party A holds the data $x_A$ and maintains the corresponding model parameters $w_A$, while party B holds $x_B, y_B$ and owns and maintains the corresponding model parameters $w_B$.

To implement vertical federated linear regression, the loss function value and the gradient need to be computed; they are, respectively:

$loss = l(w) = \|w^T x - y\|^2$, $\quad g = \nabla l(w) = 2\sum_i (w^T x_i - y_i)\, x_i$.

Writing $u_A = w_A^T x_A$ and $u_B = w_B^T x_B$, the loss function and the gradient can be expressed as operations on the homomorphically encrypted data of both parties, namely:

$[[loss]] = \sum_i [[(u_A + u_B - y)^2]]$, $\quad [[g]] = \sum_i [[d]]\, x$, with $[[d]] = 2([[u_A]] + [[u_B]] + [[-y]])$.
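The additive homomorphism assumed above, $[[x]]+[[y]]=[[x+y]]$ and $[[ax]]=a[[x]]$, can be illustrated with a textbook Paillier scheme. The following is a minimal sketch with deliberately tiny, insecure parameters; the primes, function names and integer-only messages are illustrative choices, not part of the patented method:

```python
import random
from math import gcd

# Textbook Paillier with toy primes -- insecure, for illustrating the
# additive homomorphism [[x]]+[[y]]=[[x+y]] and [[ax]]=a[[x]] only.
p, q = 293, 433
n = p * q
n2 = n * n
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p-1, q-1)
g = n + 1                                      # standard choice of generator
mu = pow(lam, -1, n)                           # since L(g^lam mod n^2) = lam

def encrypt(m):
    r = random.randrange(1, n)
    while gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    x = pow(c, lam, n2)
    return ((x - 1) // n) * mu % n

def he_add(c1, c2):        # [[x]] + [[y]] -> [[x + y]]
    return (c1 * c2) % n2

def he_scalar_mul(c, a):   # a * [[x]] -> [[a x]]
    return pow(c, a, n2)

cx, cy = encrypt(17), encrypt(25)
assert decrypt(he_add(cx, cy)) == 42          # [[17]] + [[25]] = [[42]]
assert decrypt(he_scalar_mul(cx, 3)) == 51    # 3 * [[17]] = [[51]]
```

A production deployment would use a hardened implementation with large keys and an encoding for signed and fractional values; the protocol below relies only on the two properties checked by the assertions.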
This solution uses second-order information to propose a technical scheme with fast convergence, based on the second-order derivative matrix of the loss function (i.e., the Hessian matrix) $\nabla^2 l(w) = 2\sum_i x_i x_i^T$.

The design idea of this scheme is based on the quasi-Newton method: second-order information is used to estimate an inverse Hessian matrix H, and the algorithm takes $Hg$ rather than the gradient $g$ as the descent direction, thereby accelerating convergence. Since the dimension of the inverse Hessian matrix H is much larger than that of the gradient, the core design point is how to reduce the data communication volume of all parties. This solution proposes maintaining the inverse Hessian matrix H at party C. Every L steps, in addition to computing the gradient, parties A and B randomly select an extra small batch of data, compute the difference $s^{(t)} = \bar{w}^{(t)} - \bar{w}^{(t-1)}$ between the average $\bar{w}^{(t)}$ of the models over the last L steps and the average $\bar{w}^{(t-1)}$ over the previous L steps, and then compute a vector $v = \nabla^2 l(\bar{w})\, s^{(t)}$ containing the second-order information of that batch, which is sent to party C; its dimension is the same as that of the gradient. Party C uses the information of the stored M vectors v to update the inverse Hessian matrix once. In this embodiment, therefore, the description takes the main participant as party A, the secondary participant as party B, and the coordinator as party C.

First, a small batch of data IDs S is randomly selected, and party A computes the value set $\{u_A = w_A^T x_A\}$ for the data IDs in S. Party A encrypts all the $u_A$ values with homomorphic encryption to obtain the encrypted data set $\{[[u_A]]\}$ and transmits it to party B. Party A then updates its running model average and checks the relationship between the current iteration number k and L. If k is an integer multiple of L and k is greater than 2L, party A fixes the current model average $\bar{w}_A^{(t)}$ and computes the difference between this time (t) and last time (t−1), i.e. $s_A^{(t)} = \bar{w}_A^{(t)} - \bar{w}_A^{(t-1)}$; in addition, a small batch of data IDs $S_H$ is randomly selected, party A computes $u_A^H = (s_A^{(t)})^T x_A$ on $S_H$, and transmits the homomorphically encrypted data $[[u_A^H]]$ to party B. If k is an integer multiple of L but not greater than 2L, party A only updates its running model average.
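The per-iteration bookkeeping of party A described above (accumulating a running model average, and forming the difference $s_A^{(t)}$ at every L-th iteration once k > 2L) can be sketched as follows; the variable names and the random stand-in for the actual update step are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
L = 5                        # averaging interval (illustrative)
d_A = 3                      # number of features held by party A
w_A = rng.normal(size=d_A)   # A's local model parameters

w_sum = np.zeros(d_A)        # running sum of w_A over the current L-step block
w_bar_prev = None            # average over the previous L-step block
s_history = []               # the s_A^(t) differences produced so far
for k in range(1, 16):
    w_A = w_A - 0.01 * rng.normal(size=d_A)   # stand-in for a real update step
    w_sum += w_A                              # accumulate every iteration
    if k % L == 0:
        w_bar = w_sum / L                     # average of the last L models
        if k > 2 * L:                         # only once two averages exist
            s_history.append(w_bar - w_bar_prev)  # s_A^(t) = wbar(t) - wbar(t-1)
        w_bar_prev = w_bar
        w_sum = np.zeros(d_A)

# k = 15 is the only multiple of L in this run that is also greater than 2L
assert len(s_history) == 1 and s_history[0].shape == (3,)
```

In the actual protocol A would additionally compute $u_A^H = (s_A^{(t)})^T x_A$ on the extra batch $S_H$ at this point, encrypt it, and transmit $[[u_A^H]]$ to party B.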
Party B computes the value set $\{u_B = w_B^T x_B\}$ for the data IDs in S. Using the properties of homomorphic encryption, party B computes the encrypted loss value, $[[loss]] = \sum [[(u_A + u_B - y)^2]]$, and at the same time computes the encrypted value $[[d]] = 2([[u_A]] + [[u_B]] + [[-y]])$ for each corresponding data record and transmits it to party A. Party B then updates its running model average and checks the relationship between the current iteration number k and L. If k is an integer multiple of L and k is greater than 2L, party B fixes the current model average $\bar{w}_B^{(t)}$ and computes the difference between this time (t) and last time (t−1), i.e. $s_B^{(t)} = \bar{w}_B^{(t)} - \bar{w}_B^{(t-1)}$; in addition, party B computes $u_B^H = (s_B^{(t)})^T x_B$ on $S_H$, from which it computes $[[h]] = 2([[u_A^H]] + [[u_B^H]])$ and transmits it to party A. If k is an integer multiple of L but not greater than 2L, party B only updates its running model average.
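Stripped of encryption, the algebra that party B performs on A's ciphertexts can be checked in plaintext. The sketch below uses invented random arrays and verifies that the per-sample value d recombines into the analytic gradient of $\|w^T x - y\|^2$:

```python
import numpy as np

rng = np.random.default_rng(1)
m, d_A, d_B = 8, 3, 2                 # batch size and per-party feature counts
x_A = rng.normal(size=(m, d_A))       # A's features for the batch S
x_B = rng.normal(size=(m, d_B))       # B's features for the batch S
y = rng.normal(size=m)                # labels, held only by B
w_A = rng.normal(size=d_A)
w_B = rng.normal(size=d_B)

u_A = x_A @ w_A                       # computed (and, in the protocol, encrypted) by A
u_B = x_B @ w_B                       # computed locally by B
resid = u_A + u_B - y
loss = np.sum(resid ** 2)             # [[loss]] = sum [[(u_A + u_B - y)^2]]
d = 2.0 * resid                       # [[d]] = 2([[u_A]] + [[u_B]] + [[-y]])
g_A = d @ x_A                         # [[g_A]] = sum [[d]] x_A
g_B = d @ x_B                         # [[g_B]] = sum [[d]] x_B

# sanity check: (g_A; g_B) matches the gradient 2 X^T (X w - y)
w = np.concatenate([w_A, w_B])
X = np.hstack([x_A, x_B])
assert np.allclose(np.concatenate([g_A, g_B]), 2 * X.T @ (X @ w - y))
```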
Step S20: the secondary encrypted data is sent to the coordinator, where the coordinator is configured to, in response to the vertical federated model not having converged, update the second-order derivative matrix in the coordinator according to the secondary encrypted data, and calculate the target secondary gradient value according to the updated second-order derivative matrix.
Parties A and B each use the properties of homomorphic encryption to multiply each $[[d]]$ value by the corresponding data $x_A$, $x_B$, and then sum the resulting vector sets to compute the encrypted gradient values $[[g_A]] = \sum [[d]]\, x_A$ and $[[g_B]] = \sum [[d]]\, x_B$. Party B transmits the encrypted loss value to party C, and A and B transmit $[[g_A]]$ and $[[g_B]]$ to C, respectively. Then the relationship between the current iteration number k and L is checked. If k is an integer multiple of L and k is greater than 2L, parties A and B respectively compute the primary encrypted data and the secondary encrypted data from the intermediate result value $[[h]]$, namely $[[v_A]] = \sum [[h]]\, x_A$ and $[[v_B]] = \sum [[h]]\, x_B$, and transmit them to party C. Party C (i.e., the coordinator) decrypts the received data to obtain $g_A$, $g_B$ and the loss. C judges from the loss whether the vertical linear regression model has converged; if it has, C sends an iteration stop signal to A and B and the algorithm ends. If it has not converged, C updates the iteration number and checks the relationship between the iteration number k and 2L. If k is not greater than 2L, C computes the products of the pre-selected step size and the gradients, $\delta_A = \eta\, g_A$ and $\delta_B = \eta\, g_B$, and transmits them to A and B respectively (that is, the target primary and secondary gradient values are obtained; the target primary gradient value is sent to the main participant, and the product corresponding to the secondary participant is sent to the secondary participant). If k is greater than 2L, C merges the two gradients into one long vector $g = (g_A; g_B)$, computes the product of the step size, H and g, and splits it into the corresponding A and B parts, which are transmitted to A and B respectively, namely:

$(\delta_A;\ \delta_B) = \eta\, H g$.
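The merge-and-split of the quasi-Newton step at party C can be sketched as follows (plaintext sketch; the step size `eta`, the dimensions and the identity initialization of H are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
d_A, d_B = 3, 2
eta = 0.1                                    # pre-selected step size
g_A = rng.normal(size=d_A)                   # decrypted gradient from A
g_B = rng.normal(size=d_B)                   # decrypted gradient from B
H = np.eye(d_A + d_B)                        # C's current inverse-Hessian estimate

g = np.concatenate([g_A, g_B])               # merge into one long vector
delta = eta * (H @ g)                        # step direction eta * H g
delta_A, delta_B = delta[:d_A], delta[d_A:]  # split back into the A and B parts

assert delta_A.shape == (d_A,) and delta_B.shape == (d_B,)
```

Only $\delta_A$ is returned to A and only $\delta_B$ to B, so neither party learns the other's gradient coordinates.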
If k is an integer multiple of L, C has also received the encrypted data $[[v_A]]$, $[[v_B]]$; decrypting and concatenating them gives $v^{(t)} = (v_A; v_B)$, which is stored in a v queue of length M. At the same time, C computes the difference between this time (t) and last time (t−1) of the decrypted model averages, i.e. $s^{(t)} = \bar{w}^{(t)} - \bar{w}^{(t-1)}$, and stores it in an s queue of length M. If the memory has already reached the maximum storage length M, the first element of each queue is deleted and the newly obtained v and s are placed at the end of the queue. The m pairs of v and s currently in memory (m not greater than M) are used to compute H. The computation is as follows: initialize with the values at the end of the queues, i.e. compute $p[m] = \dfrac{s[m]^T v[m]}{v[m]^T v[m]}$ and set $H \leftarrow p[m]\, I$, where I is the identity matrix. Then iterate from the head of the queue to the tail (j = 1, ..., m) to obtain the updated H:

$p[j] = 1/(v[j]^T s[j])$, $\quad H \leftarrow (I - p[j]\, s[j]\, v[j]^T)\, H\, (I - p[j]\, v[j]\, s[j]^T) + p[j]\, s[j]\, s[j]^T$.
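The queue-based reconstruction of H described above follows the recursion given in the text, $H \leftarrow (I - p[j]\,s[j]\,v[j]^T)\,H\,(I - p[j]\,v[j]\,s[j]^T) + p[j]\,s[j]\,s[j]^T$. A minimal plaintext sketch (the synthetic symmetric positive-definite test matrix and queue handling via `deque(maxlen=M)` are illustrative assumptions):

```python
from collections import deque
import numpy as np

def update_H(s_queue, v_queue, dim):
    """Rebuild the inverse-Hessian estimate H from the stored (s, v) pairs,
    using H <- (I - p s v^T) H (I - p v s^T) + p s s^T over the queue."""
    s_m, v_m = s_queue[-1], v_queue[-1]
    H = (s_m @ v_m) / (v_m @ v_m) * np.eye(dim)   # init: H <- p[m] I
    I = np.eye(dim)
    for s, v in zip(s_queue, v_queue):            # j = 1..m, head to tail
        p = 1.0 / (v @ s)
        H = (I - p * np.outer(s, v)) @ H @ (I - p * np.outer(v, s)) \
            + p * np.outer(s, s)
    return H

rng = np.random.default_rng(3)
dim, M = 4, 3
A = rng.normal(size=(dim, dim))
B = A @ A.T + dim * np.eye(dim)      # a fixed SPD "Hessian" for testing
s_queue, v_queue = deque(maxlen=M), deque(maxlen=M)
for _ in range(5):                   # maxlen=M drops the oldest pair itself
    s = rng.normal(size=dim)
    v = B @ s                        # v carries the second-order information
    s_queue.append(s)
    v_queue.append(v)

H = update_H(s_queue, v_queue, dim)
# the latest pair satisfies the secant condition H v = s after the update
assert np.allclose(H @ v_queue[-1], s_queue[-1])
```

The final assertion checks the secant condition: after the last recursion step with pair (s, v), the factor $(I - p\,v\,s^T)\,v$ vanishes, so $Hv = p\,s\,(s^T v) = s$ holds exactly.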
Step S40: the target secondary gradient value sent by the coordinator on the basis of the secondary encrypted data is received, the local model parameters in the secondary participant are updated on the basis of the target secondary gradient value, and the step in which the secondary participant obtains the encrypted value set with linear regression values sent by the main participant continues to be executed, until the vertical federated model corresponding to the coordinator converges.
After the secondary participant receives the target secondary gradient value sent by the coordinator, it updates its own local model parameters according to that value; likewise, after the main participant receives the target primary gradient value sent by the coordinator, it updates its own model parameters according to the corresponding product. That is, parties A and B each use the unencrypted vector they received to update their local model parameters, namely:

$w_A \leftarrow w_A - \delta_A$, $\quad w_B \leftarrow w_B - \delta_B$.

Then, according to the updated local model parameters and the formula $u_A = w_A^T x_A$, the new encrypted value set with linear regression scores in the main participant is computed again, and the step of obtaining the encrypted value set with linear regression values is executed again to obtain a new loss function value, until it is determined from the new loss function value that the vertical federated model has converged.
For example, as shown in FIG. 5, three parties A, B and C perform model training, where party A is the main participant, party B is the secondary participant, and party C is the coordinator. Party A performs local computation and transmits data to party B, that is, it transmits the encrypted $\{[[u_A]]\}$ to party B. Party B performs local computation on the encrypted data transmitted by party A to obtain the encrypted loss function value and encrypted values, and transmits $[[d]]$, $[[h]]$ to party A. At the same time, both A and B compute their respective gradient values from the encrypted values and transmit them to party C, that is, A and B send $[[g_A]]$, $[[g_B]]$, $[[v_A]]$, $[[v_B]]$ to party C, and party B also transmits $[[loss]]$ to party C. Party C decrypts the received $[[v_A]]$, $[[v_B]]$, $[[loss]]$ to obtain the decrypted $g_A$, $g_B$, loss, $\mu_A$, $\mu_B$, and judges from the loss whether the algorithm has converged. If it has not converged, C updates H according to the received gradient values, and then computes and transmits: when k is not greater than 2L, C computes the products of the pre-selected step size and the gradients, $\delta_A = \eta\, g_A$ and $\delta_B = \eta\, g_B$, and transmits them to A and B respectively; when k is greater than 2L, C merges the two gradients into one long vector g, computes the product of the step size, H and g, and splits it into the corresponding A and B parts, which are transmitted to A and B respectively, namely:

$(\delta_A;\ \delta_B) = \eta\, H g$.

Both A and B then update their local model parameters according to the unencrypted vectors passed by party C, namely $w_A \leftarrow w_A - \delta_A$, $w_B \leftarrow w_B - \delta_B$.
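The whole exchange of FIG. 5, restricted for brevity to the first-order branch (k not greater than 2L) and with all encryption stripped out, can be sketched as a single training loop; every array and constant below is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(6)
n, d_A, d_B = 200, 3, 2
x_A = rng.normal(size=(n, d_A))       # A's feature block
x_B = rng.normal(size=(n, d_B))       # B's feature block
w_true = rng.normal(size=d_A + d_B)
y = np.hstack([x_A, x_B]) @ w_true    # noiseless labels, held by B

w_A = np.zeros(d_A)
w_B = np.zeros(d_B)
eta = 0.001                           # pre-selected step size (illustrative)
for k in range(500):
    u_A = x_A @ w_A                   # A -> B ([[u_A]] in the protocol)
    u_B = x_B @ w_B                   # local to B
    d = 2.0 * (u_A + u_B - y)         # B -> A ([[d]])
    g_A, g_B = d @ x_A, d @ x_B       # -> C, decrypted there
    w_A -= eta * g_A                  # C returns eta*g_A to A
    w_B -= eta * g_B                  # C returns eta*g_B to B

loss = np.sum((x_A @ w_A + x_B @ w_B - y) ** 2)
assert loss < 1e-3                    # the joint model has converged
```

The quasi-Newton branch would replace the last two updates with the split of $\eta H g$ computed at party C.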
In this embodiment, after the secondary participant obtains the encrypted value set of the main participant, the loss function value and the secondary encrypted data are calculated and sent to the coordinator, so that the coordinator can determine from the loss function value whether the vertical federated model has converged. If it has not converged, the second-order derivative matrix is updated according to the secondary encrypted data, the target secondary gradient value is calculated according to the updated second-order derivative matrix, and the local model parameters of the secondary participant are then updated with the target secondary gradient value. This avoids the situation in the prior art where vertical federated learning uses a first-order algorithm, which converges slowly and requires a large number of rounds of data interaction; it reduces the communication volume of vertical federated learning and increases the convergence speed when training the vertical federated linear regression model.
Further, based on the first embodiment of the vertical federated learning optimization method of the present application, a second embodiment of the vertical federated learning optimization method of the present application is proposed. This embodiment is a refinement of step S10 of the first embodiment of the present application, the step in which the secondary participant obtains the encrypted value set with linear regression values sent by the main participant, and includes: Step a: detecting whether the vertical federated model satisfies a preset judgment condition.
In this embodiment, when the main participant sends data to the secondary participant, it is also necessary to detect whether the vertical federated model satisfies the preset judgment condition, for example by judging whether the new iteration number of the vertical federated model satisfies a preset number condition (such as determining whether the new iteration number is an integer multiple of the iteration step interval, and whether it is greater than twice the preset number). Different operations are then performed according to the different detection results.
Step b: if it is satisfied, the secondary participant obtains the main encrypted value and the new encrypted value sent by the main participant, and uses the main encrypted value and the new encrypted value as the encrypted value set with linear regression values sent by the main participant.
In the main participant, a small batch of data is first obtained, the individual linear regression scores are computed according to the formula $u_A = w_A^T x_A$ mentioned in the above embodiment, and these scores are encrypted with homomorphic encryption to obtain the main encrypted value. When it is judged that the vertical federated model satisfies the preset judgment condition, another small batch of data is obtained, the individual linear regression scores are again computed according to the same formula and encrypted with homomorphic encryption to obtain the new encrypted value, and the encrypted data and the new encrypted data together serve as the encrypted value set with linear regression values. It should be noted that the encrypted data and the new encrypted data are not the same, and both are sent to the secondary participant; that is, the secondary participant obtains the main encrypted value and the new encrypted value sent by the main participant. However, if the vertical federated model does not satisfy the preset judgment condition, the main participant sends only the main encrypted value to the secondary participant; in that case the main encrypted value alone constitutes the encrypted value set with linear regression values.
In this embodiment, when it is determined that the vertical federated model satisfies the preset judgment condition, the secondary participant obtains the main encrypted value and the new encrypted value sent by the main participant and uses them as the encrypted value set with linear regression values, thereby increasing the training speed of the vertical linear regression model.
Further, the step of calculating the secondary encrypted data according to the encrypted value set includes: Step c: determining whether the current iteration number corresponding to the secondary participant satisfies a preset number condition.
After the secondary participant obtains the encrypted value set sent by the main participant, it is also necessary to judge whether the current iteration number (that is, the number of updates) of the secondary participant's own model satisfies the preset number condition, and different operations are performed according to the different judgment results.
Step d: if it is satisfied, the intermediate result value is calculated according to the encrypted value set, and the secondary encrypted data is calculated from the intermediate result value.
When it is judged that the current iteration number satisfies the preset number condition, for example when the current iteration number k is an integer multiple of L and k is greater than 2L, party B updates its model average $\bar{w}_B^{(t)}$ and computes the difference between this time (t) and last time (t−1), i.e. $s_B^{(t)} = \bar{w}_B^{(t)} - \bar{w}_B^{(t-1)}$. In addition, party B computes $u_B^H = (s_B^{(t)})^T x_B$ on $S_H$, from which it computes the intermediate result value $[[h]] = 2([[u_A^H]] + [[u_B^H]])$ and transmits it to party A; at the same time the secondary encrypted data in the secondary participant is computed from the intermediate result value. That is, parties A and B respectively compute the primary encrypted data and the secondary encrypted data from the intermediate result value $[[h]]$, namely $[[v_A]] = \sum [[h]]\, x_A$ and $[[v_B]] = \sum [[h]]\, x_B$, and transmit them to party C. If the condition is not satisfied, for example when the current iteration number k is an integer multiple of L but not greater than 2L, party B only updates its model average.
In this embodiment, when it is determined that the current iteration number corresponding to the secondary participant satisfies the preset number condition, the intermediate result value is calculated according to the encrypted value set and the secondary encrypted data is calculated from the intermediate result value, thereby guaranteeing the accuracy of the obtained secondary encrypted data.
Specifically, the step of calculating the intermediate result value according to the encrypted value set and calculating the secondary encrypted data from the intermediate result value includes: Step e: obtaining the current average value of the local model parameters in the secondary participant on the basis of the encrypted value set, and obtaining the historical average value from the preset number of steps before the current average value.
After the secondary participant obtains the encrypted value set sent by the main participant, it also obtains the current average value $\bar{w}_B^{(t)}$ of the local model parameters in the secondary participant, and it also needs to obtain, in the secondary participant, the historical average value from the preset number of steps before the current average value.
Step f: the difference between the current average value and the historical average value is calculated, the intermediate result value is calculated according to the difference, and the secondary encrypted data is calculated from the intermediate result value.
After the current average value and the historical average value are obtained, the difference between the two must also be calculated, that is, the difference between this time (t) and last time (t−1), i.e. $s_B^{(t)} = \bar{w}_B^{(t)} - \bar{w}_B^{(t-1)}$. In addition, party B computes $u_B^H = (s_B^{(t)})^T x_B$ on $S_H$, from which it computes the intermediate result value $[[h]] = 2([[u_A^H]] + [[u_B^H]])$ and transmits it to party A; at the same time the secondary encrypted data in the secondary participant is computed from the intermediate result value. That is, parties A and B respectively compute the primary encrypted data and the secondary encrypted data from the intermediate result value $[[h]]$, namely $[[v_A]] = \sum [[h]]\, x_A$ and $[[v_B]] = \sum [[h]]\, x_B$.
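In plaintext form, the computation of the intermediate result value and the secondary encrypted data in step f can be checked as follows; the sketch verifies that the recombined $(v_A; v_B)$ equals the Hessian-vector product $2X^T X s$, matching $v = \nabla^2 l(\bar{w})\, s$ (all arrays invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
m_H, d_A, d_B = 6, 3, 2
x_A = rng.normal(size=(m_H, d_A))   # A's features on the extra batch S_H
x_B = rng.normal(size=(m_H, d_B))   # B's features on S_H
s_A = rng.normal(size=d_A)          # difference of A's model averages
s_B = rng.normal(size=d_B)          # difference of B's model averages

u_A_H = x_A @ s_A                   # computed (and encrypted) by A
u_B_H = x_B @ s_B                   # computed locally by B
h = 2.0 * (u_A_H + u_B_H)           # intermediate result value [[h]]
v_A = h @ x_A                       # primary encrypted data [[v_A]]
v_B = h @ x_B                       # secondary encrypted data [[v_B]]

# sanity check: (v_A; v_B) is the Hessian-vector product 2 X^T X s
X = np.hstack([x_A, x_B])
s = np.concatenate([s_A, s_B])
assert np.allclose(np.concatenate([v_A, v_B]), 2 * X.T @ (X @ s))
```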
In this embodiment, the difference between the current average value and the historical average value in the secondary participant is calculated, the intermediate result value is calculated according to the difference, and the secondary encrypted data is calculated from the intermediate result value, thereby guaranteeing the accuracy of the obtained secondary encrypted data.
Further, on the basis of any one of the first and second embodiments of the vertical federated learning optimization method of the present application, a third embodiment of the vertical federated learning optimization method of the present application is proposed. This embodiment is a refinement of step S30 of the first embodiment of the present application, the step of receiving the target secondary gradient value sent by the coordinator on the basis of the secondary encrypted data, and includes: Step g: receiving the target secondary gradient value sent by the coordinator on the basis of the secondary encrypted data, where the target secondary gradient value is obtained from the second-order derivative matrix updated by the coordinator according to target data, and the target data is obtained by decrypting and merging the primary encrypted data and the secondary encrypted data sent by the secondary participant in response to the vertical linear regression model not having converged while satisfying the preset judgment condition.
When the secondary participant receives the target secondary gradient value fed back by the coordinator, it can update its own local model parameters according to this target secondary gradient value. The target secondary gradient value is obtained by the coordinator, when it determines that the vertical linear regression model has not converged and the preset judgment condition is satisfied, by updating the second-order derivative matrix according to the target data and computing from the updated second-order derivative matrix; the target data is obtained by decrypting and merging the main encrypted data sent by the main participant and the secondary encrypted data sent by the secondary participant when the vertical linear regression model has not converged and the preset judgment condition is satisfied. Judging whether the vertical linear regression model satisfies the preset judgment condition may be, for example, judging whether the new iteration number of the vertical linear regression model satisfies the preset number condition (such as determining whether the new iteration number is an integer multiple of the iteration step interval, and whether it is greater than twice the preset number). Different operations are then performed according to the different judgment results.
In this embodiment, the target secondary gradient value is determined to be obtained from the target data and the updated second-order derivative matrix, and the target data is obtained by merging the primary encrypted data and the secondary encrypted data, thereby guaranteeing the accuracy of the obtained target secondary gradient value.
Further, the step of receiving the target secondary gradient value fed back by the coordinator includes:
Step h: receiving the target secondary gradient value fed back by the coordinator, where the target secondary gradient value is obtained by the coordinator splitting a first target product, and the first target product is the product of a preset step size, the second-order derivative matrix updated in response to the vertical logistic regression model satisfying the preset judgment condition, and a long vector formed by merging the primary gradient value sent by the primary participant and the secondary gradient value sent by the secondary participant.
When the secondary participant receives the target secondary gradient value fed back by the coordinator, it can update its own local model parameters according to this value. The target secondary gradient value is obtained by the coordinator splitting the first target product, and the first target product is computed, when the vertical logistic regression model has not converged and the preset judgment condition is satisfied, from the updated second-order derivative matrix, the long vector formed by merging the primary gradient value sent by the primary participant and the secondary gradient value sent by the secondary participant, and the preset step size.
In this embodiment, the target secondary gradient value is determined to be obtained by the coordinator splitting the first target product, and the first target product is the product of the long vector, the preset step size, and the updated second-order derivative matrix, thereby guaranteeing the accuracy of the obtained target secondary gradient value.
Further, the step of receiving the target secondary gradient value fed back by the coordinator includes:
Step k: receiving the target secondary gradient value fed back by the coordinator, where the target secondary gradient value is a second target product, and the second target product is the product, computed by the coordinator in response to the vertical logistic regression model not having converged and the preset judgment condition not being satisfied, of the secondary gradient value sent by the secondary participant and the preset step size.
When the secondary participant receives the target secondary gradient value fed back by the coordinator, it can update its own local model parameters according to this value. The target secondary gradient value is the second target product: when the vertical logistic regression model has not converged and the preset judgment condition is not satisfied, the coordinator multiplies the secondary gradient value sent by the secondary participant by the preset step size to obtain their product; this product is the second target product, i.e., the target secondary gradient value.
In this embodiment, when the target secondary gradient value is determined while the vertical logistic regression model has not converged and the preset judgment condition is not satisfied, the product of the preset step size and the secondary gradient value is computed, thereby guaranteeing the accuracy of the obtained target secondary gradient value.
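The iteration-count test mentioned in these embodiments can be made concrete. The following is a minimal sketch, under the assumption (drawn from the description, which compares the current iteration count k with the iteration step interval L and with 2L) that the condition is that k is an integer multiple of L and strictly greater than 2L; the function name is hypothetical and not from the patent:

```python
def satisfies_preset_condition(k: int, L: int) -> bool:
    """Hypothetical check: trigger the second-order (quasi-Newton) branch
    only when the iteration count k is an integer multiple of the iteration
    step interval L and strictly greater than 2*L."""
    return k % L == 0 and k > 2 * L

# The coordinator would branch on this flag: update the second-order
# derivative matrix when True, use the plain scaled-gradient step when False.
```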
Further, referring to Fig. 3, which is a schematic flowchart of another embodiment of the vertical federated learning optimization method of this application, the method includes: Step S100: receiving the primary encrypted data sent by the primary participant and the secondary encrypted data sent by the secondary participant, where the secondary encrypted data is computed from an intermediate result value in the secondary participant, the intermediate result value is computed by the secondary participant according to an encrypted value set sent by the primary participant, and the encrypted value set includes a primary encrypted value and a new encrypted value;
When the coordinator determines, according to the loss function value sent by the secondary participant, that the vertical logistic regression model has not converged and that the preset judgment condition is satisfied (for example, by judging whether the new iteration count of the model satisfies the preset count condition, such as whether the new iteration count is an integer multiple of the iteration step interval and greater than twice that interval), then after receiving the primary encrypted data sent by the primary participant and the secondary encrypted data sent by the secondary participant, it updates the second-order derivative matrix according to the primary encrypted data and the secondary encrypted data. The secondary encrypted data is computed by the secondary participant based on the intermediate result value derived from the value set sent by the primary participant: the primary participant sends the encrypted value set to the secondary participant; the secondary participant computes the intermediate result value and the loss function value from the encrypted value set, sends the loss function value to the coordinator, computes the secondary encrypted data from the intermediate result value, and sends the secondary encrypted data to the coordinator. The encrypted value set may include the primary encrypted value corresponding to the existing data and the new encrypted value corresponding to the new data; that is, depending on whether the current iteration count of the primary participant satisfies a preset condition (e.g., whether a preset number of iterations has passed): if not satisfied, the primary encrypted value alone may serve as the target value set; if satisfied, the primary encrypted value and the new encrypted value together form the encrypted value set. In this application, the data may be encrypted using homomorphic encryption.
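The passage above only states that the encryption may be homomorphic. As a hedged illustration of why an additively homomorphic scheme suits this protocol (participants exchange encrypted intermediate values that another party can aggregate without decrypting), here is a toy Paillier-style sketch with insecurely small primes; all names and parameters are illustrative, and this is not the patent's implementation:

```python
from math import gcd

# Toy Paillier keypair with insecurely small primes (illustration only).
p, q = 17, 19
n = p * q                                      # public modulus
n2 = n * n
g = n + 1                                      # standard generator choice
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p-1, q-1)

def L(x):                                      # Paillier's L function
    return (x - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)            # modular inverse (Python 3.8+)

def encrypt(m, r=7):
    # Fixed nonce r for reproducibility; a real scheme draws r at random.
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

# Additive homomorphism: multiplying ciphertexts adds the plaintexts,
# so encrypted intermediate values can be aggregated without decryption.
c = (encrypt(5) * encrypt(7)) % n2
assert decrypt(c) == 12
```

The same property is what lets one party combine encrypted gradient contributions before the holder of the private key decrypts the aggregate.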
Step S200: in response to the vertical logistic regression model not having converged, updating the second-order derivative matrix according to the primary encrypted data and the secondary encrypted data, and computing the target secondary gradient value according to the updated second-order derivative matrix;
When the coordinator detects that the vertical logistic regression model has not converged, it can update the second-order derivative matrix from the primary encrypted data sent by the primary participant and the secondary encrypted data sent by the secondary participant: the primary and secondary encrypted data are decrypted and merged, stored in a queue of preset length to obtain the target queue, and the second-order derivative matrix H is updated according to this target queue. H is computed by initializing with the values at the end of the memory queue, i.e., computing p[m] = 1/(v[m]^T s[m]) and H ← p[m]·I (Figure PCTCN2019119418-appb-000053), where I is the identity matrix, and then iterating from the head of the queue to the tail (j = 1, ..., m):

p[j] = 1/(v[j]^T s[j]), H ← (I - p[j]·s[j]·v[j]^T) · H · (I - p[j]·v[j]·s[j]^T) + p[j]·s[j]·s[j]^T.
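The recursion stated in the surrounding text can be sketched in code. This is a minimal NumPy rendering (initialize H = p[m]·I from the pair at the end of the queue, then sweep the queue from head to tail); it illustrates the formula and is not the patent's implementation:

```python
import numpy as np

def update_H(v_queue, s_queue):
    """Recompute the approximate inverse second-derivative matrix H from the
    stored (v, s) pairs, following the recursion in the text:
      p[j] = 1 / (v[j]^T s[j])
      H <- (I - p[j] s[j] v[j]^T) H (I - p[j] v[j] s[j]^T) + p[j] s[j] s[j]^T
    with H initialized to p[m] * I from the last pair in the queue."""
    d = v_queue[-1].shape[0]
    I = np.eye(d)
    p_m = 1.0 / float(v_queue[-1] @ s_queue[-1])
    H = p_m * I
    for v, s in zip(v_queue, s_queue):          # j = 1, ..., m
        v = v.reshape(-1, 1)
        s = s.reshape(-1, 1)
        p = 1.0 / float(v.T @ s)
        H = (I - p * s @ v.T) @ H @ (I - p * v @ s.T) + p * s @ s.T
    return H
```

With a single (v, s) pair this reduces to the classical BFGS inverse-Hessian update, so H satisfies the secant condition H·v = s, and H applied to a gradient yields the quasi-Newton direction that is later scaled by the step size.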
In this embodiment, if the vertical logistic regression model has not converged, the update shown in Figure PCTCN2019119418-appb-000054 is performed, and the relationship between the current iteration count k and the iteration step interval L must be judged. If k is not greater than 2L, the products of the preset step size and the gradients, shown in Figure PCTCN2019119418-appb-000055, are computed and sent to the corresponding party A and party B respectively. Party A then updates its local model parameters according to the product it obtains (i.e., the target primary gradient value) and performs the next round of model training; likewise, party B updates its local model parameters according to the product it obtains and performs the next round of model training, until a new loss function value is obtained and passed to party C (the coordinator) for judgment, i.e., to determine whether the vertical logistic regression model has converged. If it has converged, an iteration stop signal is sent to parties A and B, and training of the vertical logistic regression model stops; if it has not converged, the operation shown in Figure PCTCN2019119418-appb-000056 is performed again, until the vertical logistic regression model converges. When k is greater than 2L, the two gradients are merged into one long vector g, the product of the step size, H, and g is computed and split into the two parts corresponding to A and B (i.e., the target primary gradient value for party A and the target secondary gradient value for party B), which are transmitted to A and B respectively, namely:

Figure PCTCN2019119418-appb-000057
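The merge, multiply, and split step of the k > 2L branch can be sketched as follows (NumPy; function and variable names are illustrative, not from the patent):

```python
import numpy as np

def second_order_step(g_A, g_B, H, step):
    """Merge the two parties' gradients into one long vector g, form the
    product step * H * g, and split it back into the part returned to
    party A (target primary gradient value) and the part returned to
    party B (target secondary gradient value)."""
    g = np.concatenate([g_A, g_B])       # long vector g
    update = step * (H @ g)              # product of step size, H and g
    return update[:g_A.size], update[g_A.size:]
```

Each party only ever sees its own slice of the scaled quasi-Newton direction, matching the description that the coordinator transmits the two parts to A and B separately.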
Step S300: sending the target secondary gradient value to the secondary participant, where the secondary participant is configured to update its local model parameters based on the target secondary gradient value and to continue performing the step in which the secondary participant obtains the encrypted value set with linear regression values sent by the primary participant, until the vertical federated model corresponding to the coordinator converges.
After the coordinator computes the target secondary gradient value, it sends this value to the secondary participant. The secondary participant updates its local model parameters accordingly and continues to perform the step of obtaining the encrypted value set with linear regression values sent by the primary participant, until the vertical logistic regression model corresponding to the coordinator converges, at which point the coordinator sends an iteration stop signal to the primary and secondary participants. Likewise, the primary participant receives the target primary gradient value fed back by the coordinator to update its own local model parameters.
In this embodiment, the coordinator updates the second-order derivative matrix according to the primary encrypted data and the secondary encrypted data, computes the target secondary gradient value from the updated matrix, and sends it to the secondary participant to update its local model parameters. This avoids the slow convergence and the large number of rounds of data exchange caused by the first-order algorithms used for vertical federated learning in the prior art, and reduces the communication volume of vertical federated learning.
Further, the step of updating the second-order derivative matrix according to the primary encrypted data and the secondary encrypted data includes:
Step m: judging whether the vertical logistic regression model satisfies the preset judgment condition;
After the coordinator receives the primary gradient value sent by the primary participant and the secondary gradient value and loss value sent by the secondary participant, and determines that the vertical logistic regression model has not converged, it needs to judge whether the model satisfies the preset judgment condition, for example by judging whether the new iteration count of the model satisfies the preset count condition (e.g., whether the new iteration count is an integer multiple of the iteration step interval and greater than twice that interval), and different operations are performed according to the judgment result.
Step n: if satisfied, decrypting and merging the primary encrypted data and the secondary encrypted data to obtain target data;
When the judgment finds that the vertical logistic regression model satisfies the preset judgment condition, the coordinator, after receiving the primary encrypted data sent by the primary participant and the secondary encrypted data sent by the secondary participant, decrypts and merges them to obtain the target data; that is, the encrypted data [[v_A]], [[v_B]] are decrypted and combined into the target data shown in Figure PCTCN2019119418-appb-000058.
Step p: storing the target data in a queue of preset length to obtain the target queue, and updating the second-order derivative matrix through the target queue.
The coordinator stores the target data in a v queue of length M (i.e., the preset length). At the same time, it computes the difference between the value at the current iteration (t) and the value at the previous iteration (t-1) of the quantity shown in Figure PCTCN2019119418-appb-000059, i.e., the difference shown in Figure PCTCN2019119418-appb-000060, and stores this difference in an s queue of length M. If the memory has already reached the maximum storage length M, the first element of the queue is deleted and the newest v and s are placed at the end of the queue. The m (m not greater than M) pairs of v and s currently in memory are used to compute H (the second-order derivative matrix). The computation is as follows: initialize with the values at the end of the memory queue, i.e., compute p[m] = 1/(v[m]^T s[m]) and H ← p[m]·I (Figure PCTCN2019119418-appb-000061), where I is the identity matrix; then iterate from the head of the queue to the tail (j = 1, ..., m):

p[j] = 1/(v[j]^T s[j]), H ← (I - p[j]·s[j]·v[j]^T) · H · (I - p[j]·v[j]·s[j]^T) + p[j]·s[j]·s[j]^T.
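The fixed-length memory just described (drop the oldest pair once length M is reached, append the newest at the end) behaves like a bounded double-ended queue; a minimal sketch under that assumption, with illustrative names:

```python
from collections import deque

M = 5                                   # preset maximum memory length
v_queue = deque(maxlen=M)               # stores the merged target data v
s_queue = deque(maxlen=M)               # stores the differences s

def store_pair(v, s):
    """Append the newest (v, s) pair; deque(maxlen=M) automatically
    discards the element at the head of the queue when length M is
    reached, matching the drop-first behavior described in the text."""
    v_queue.append(v)
    s_queue.append(s)
```

After each append, at most M pairs remain in memory, and the H computation above runs over exactly those m ≤ M pairs.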
In this embodiment, the target data is obtained by decrypting and merging the primary encrypted data and the secondary encrypted data, and the second-order derivative matrix is then updated according to the target data, thereby guaranteeing the effectiveness of the update of the second-order derivative matrix.
Further, after the step of judging whether the vertical logistic regression model satisfies the preset judgment condition, the method includes:
Step x: if not satisfied, the coordinator obtains the first product of the secondary gradient value sent by the secondary participant and the preset step size, and sends the first product to the secondary participant as the target secondary gradient value.
When the judgment finds that the vertical logistic regression model does not satisfy the preset judgment condition, the coordinator computes the first product of the pre-selected preset step size and the secondary gradient value, and the third product of the preset step size and the primary gradient value corresponding to the primary participant. The first product is sent to the secondary participant as the target secondary gradient value to update the local model parameters in the secondary participant, and the third product is sent to the primary participant to update the local model parameters in the primary participant. Model training is then performed again according to the updated model parameters to obtain a new loss function value, which is sent to the coordinator through the secondary participant.
In this embodiment, when it is determined that the vertical logistic regression model does not satisfy the preset judgment condition, the first product of the secondary gradient value and the preset step size is computed and used as the target secondary gradient value, thereby guaranteeing the accuracy of the obtained target secondary gradient value.
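In this fallback branch, the coordinator's step reduces to a plain scaled-gradient update; a one-line sketch with illustrative names:

```python
def first_order_products(g_primary, g_secondary, step):
    """Third product for the primary participant and first product for the
    secondary participant: each is simply the step size times the gradient."""
    return step * g_primary, step * g_secondary
```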
An embodiment of the present application further provides a vertical federated learning optimization apparatus. Referring to Fig. 4, the vertical federated learning optimization apparatus includes: an obtaining module, used by the secondary participant to obtain the encrypted value set with linear regression values sent by the primary participant and to compute the secondary encrypted data according to the encrypted value set; a sending module, used to send the secondary encrypted data to the coordinator, where the coordinator is configured, in response to the vertical federated model not having converged, to update the second-order derivative matrix in the coordinator according to the secondary encrypted data and to compute the target secondary gradient value according to the updated second-order derivative matrix; and a first receiving module, used to receive the target secondary gradient value sent by the coordinator based on the secondary encrypted data, update the local model parameters in the secondary participant based on the target secondary gradient value, and continue to perform the step in which the secondary participant obtains the encrypted value set with linear regression values sent by the primary participant, until the vertical federated model corresponding to the coordinator converges.

Optionally, the obtaining module is further configured to: detect whether the vertical federated model satisfies the preset judgment condition; if so, the secondary participant obtains the primary encrypted value and the new encrypted value sent by the primary participant, and uses the primary encrypted value and the new encrypted value as the encrypted value set with linear regression values sent by the primary participant.

Optionally, the obtaining module is further configured to: determine whether the current iteration count corresponding to the secondary participant satisfies the preset count condition; if so, compute the intermediate result value according to the encrypted value set, and compute the secondary encrypted data from the intermediate result value.

Optionally, the obtaining module is further configured to: obtain, based on the encrypted value set, the current average value of the local model parameters in the secondary participant, and obtain the historical average value a preset number of steps before the current average value; compute the difference between the current average value and the historical average value, compute the intermediate result value according to the difference, and compute the secondary encrypted data from the intermediate result value.

Optionally, the first receiving module is further configured to: receive the target secondary gradient value sent by the coordinator based on the secondary encrypted data, where the target secondary gradient value is computed by the coordinator from the second-order derivative matrix updated according to the target data, and the target data is obtained, in response to the vertical logistic regression model not having converged while the preset judgment condition is satisfied, by decrypting and merging the primary encrypted data and the secondary encrypted data sent by the secondary participant.

Optionally, the first receiving module is further configured to: receive the target secondary gradient value fed back by the coordinator, where the target secondary gradient value is obtained by the coordinator splitting the first target product, and the first target product is the product of the preset step size, the second-order derivative matrix updated in response to the vertical logistic regression model satisfying the preset judgment condition, and the long vector formed by merging the primary gradient value sent by the primary participant and the secondary gradient value sent by the secondary participant.

Optionally, the step of receiving the target secondary gradient value fed back by the coordinator includes: receiving the target secondary gradient value fed back by the coordinator, where the target secondary gradient value is the second target product, and the second target product is the product, computed by the coordinator in response to the vertical logistic regression model not having converged and the preset judgment condition not being satisfied, of the secondary gradient value sent by the secondary participant and the preset step size.

Optionally, the vertical federated learning optimization apparatus further includes: a second receiving module, used to receive the primary encrypted data sent by the primary participant and the secondary encrypted data sent by the secondary participant, where the secondary encrypted data is computed from the intermediate result value in the secondary participant, the intermediate result value is computed by the secondary participant according to the encrypted value set sent by the primary participant, and the encrypted value set includes the primary encrypted value and the new encrypted value; an update module, used, in response to the vertical logistic regression model not having converged, to update the second-order derivative matrix according to the primary encrypted data and the secondary encrypted data and to compute the target secondary gradient value according to the updated second-order derivative matrix; and a convergence module, used to send the target secondary gradient value to the secondary participant, where the secondary participant is configured to update its local model parameters based on the target secondary gradient value and to continue performing the step in which the secondary participant obtains the encrypted value set with linear regression values sent by the primary participant, until the vertical federated model corresponding to the coordinator converges.

Optionally, the update module is further configured to: judge whether the vertical logistic regression model satisfies the preset judgment condition; if so, decrypt and merge the primary encrypted data and the secondary encrypted data to obtain the target data; store the target data in a queue of preset length to obtain the target queue, and update the second-order derivative matrix through the target queue.

Optionally, the update module is further configured to: if not satisfied, the coordinator obtains the first product of the secondary gradient value sent by the secondary participant and the preset step size, and sends the first product to the secondary participant as the target secondary gradient value.
For the methods executed by the above program modules, reference may be made to the respective embodiments of the vertical federated learning optimization method of this application, which will not be repeated here.
The present application also provides a storage medium, which may be a non-volatile readable storage medium. The storage medium of the present application stores computer-readable instructions, and when the computer-readable instructions are executed by a processor, the steps of the vertical federated learning optimization method described above are implemented. For the method implemented when the computer-readable instructions running on the processor are executed, reference may be made to the respective embodiments of the vertical federated learning optimization method of this application, which will not be repeated here.
It should be noted that, in this document, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or system. Without further limitation, an element defined by the phrase "including a..." does not exclude the existence of other identical elements in the process, method, article, or system that includes the element.
The serial numbers of the foregoing embodiments of the present application are for description only and do not represent the advantages or disadvantages of the embodiments.
Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus the necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions to cause a terminal device (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) to execute the methods described in the embodiments of this application.
The above are only preferred embodiments of this application and do not thereby limit the patent scope of this application. Any equivalent structural or process transformation made using the contents of the specification and drawings of this application, whether applied directly or indirectly in other related technical fields, is likewise included in the patent protection scope of this application.

Claims (20)

  1. A vertical federated learning optimization method, wherein the vertical federated learning optimization method includes the following steps:
    a secondary participant obtaining an encrypted value set with linear regression values sent by a primary participant, and computing secondary encrypted data according to the encrypted value set;
    sending the secondary encrypted data to a coordinator, where the coordinator is configured, in response to a vertical federated model not having converged, to update a second-order derivative matrix in the coordinator according to the secondary encrypted data, and to compute a target secondary gradient value according to the updated second-order derivative matrix;
    接收所述协调者基于所述副加密数据发送的目标副梯度值,基于所述目标副梯度值更新所述副参与者中的本地模型参数,并继续执行所述副参与者获取主参与者发送的具有线性回归值的加密数值集合的步骤,直至所述协调者对应的纵向联邦模型收敛。Receive the target secondary gradient value sent by the coordinator based on the secondary encrypted data, update the local model parameters in the secondary participant based on the target secondary gradient value, and continue to execute the secondary participant's acquisition of the primary participant's transmission Until the vertical federation model corresponding to the coordinator converges.
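The secondary-participant loop of claim 1 can be sketched as follows. This is a minimal illustrative sketch only, not the claimed protocol: the function name, the use of a plain matrix product in place of the homomorphic computation, and the mocked coordinator reply are all assumptions.

```python
import numpy as np

def secondary_participant_round(local_params, encrypted_set, preset_step=0.1):
    """One illustrative iteration of the secondary participant (claim 1):
    derive secondary encrypted data from the primary participant's encrypted
    value set, then apply the coordinator's target secondary gradient."""
    # Stand-in for computing secondary encrypted data from the encrypted set.
    secondary_encrypted_data = encrypted_set @ local_params
    # Stand-in for the coordinator's reply: a simple scaled value instead of
    # the second-order-derivative-based gradient described in the claims.
    target_secondary_gradient = preset_step * secondary_encrypted_data
    # Update the local model parameters with the target secondary gradient.
    return local_params - target_secondary_gradient
```

In the claimed method this round would repeat until the longitudinal federated model converges; the sketch shows a single update only.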
  2. The longitudinal federated learning optimization method according to claim 1, wherein the step of the secondary participant obtaining the encrypted value set with linear regression values sent by the primary participant comprises:
    upon detecting that the longitudinal federated model satisfies a preset judgment condition, the secondary participant obtaining a primary encrypted value and a new encrypted value sent by the primary participant, and taking the primary encrypted value and the new encrypted value as the encrypted value set with linear regression values sent by the primary participant.
  3. The longitudinal federated learning optimization method according to claim 1, wherein the step of calculating the secondary encrypted data according to the encrypted value set comprises:
    upon determining that a current iteration count corresponding to the secondary participant satisfies a preset count condition, calculating an intermediate result value according to the encrypted value set, and calculating the secondary encrypted data by means of the intermediate result value.
  4. The longitudinal federated learning optimization method according to claim 3, wherein the step of calculating an intermediate result value according to the encrypted value set and calculating the secondary encrypted data by means of the intermediate result value comprises:
    obtaining, based on the encrypted value set, a current average of the local model parameters in the secondary participant, and obtaining a historical average recorded a preset number of steps before the current average;
    calculating a difference between the current average and the historical average, calculating the intermediate result value according to the difference, and calculating the secondary encrypted data by means of the intermediate result value.
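The averaging step of claim 4 can be illustrated with a short sketch. The list-based history and the direct element subtraction are assumptions, since the claim does not fix a data structure or the exact combination rule.

```python
def intermediate_result_value(average_history, preset_step_interval):
    """Difference between the current average of the local model parameters
    and the historical average recorded preset_step_interval steps earlier
    (claim 4); this difference is then used to compute the secondary
    encrypted data."""
    current_average = average_history[-1]
    historical_average = average_history[-1 - preset_step_interval]
    return current_average - historical_average
```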
  5. The longitudinal federated learning optimization method according to claim 1, wherein the step of receiving the target secondary gradient value sent by the coordinator based on the secondary encrypted data comprises:
    receiving the target secondary gradient value sent by the coordinator based on the secondary encrypted data, wherein the target secondary gradient value is obtained by the coordinator according to a second-order derivative matrix updated with target data, and the target data is obtained by decrypting and merging primary encrypted data and the secondary encrypted data sent by the secondary participant, in response to a longitudinal logistic regression model not having converged and a preset judgment condition being satisfied.
  6. The longitudinal federated learning optimization method according to claim 1, wherein the step of receiving the target secondary gradient value fed back by the coordinator comprises:
    receiving the target secondary gradient value fed back by the coordinator, wherein the target secondary gradient value is obtained by the coordinator by splitting a first target product, and the first target product is a product of: the second-order derivative matrix updated in response to the longitudinal logistic regression model satisfying the preset judgment condition; a long vector formed by merging a primary gradient value sent by the primary participant and a secondary gradient value sent by the secondary participant; and a preset step size.
  7. The longitudinal federated learning optimization method according to claim 1, wherein the step of receiving the target secondary gradient value fed back by the coordinator comprises:
    receiving the target secondary gradient value fed back by the coordinator, wherein the target secondary gradient value is a second target product, and the second target product is a product, calculated by the coordinator in response to the longitudinal logistic regression model not having converged and the preset judgment condition not being satisfied, of the primary gradient value sent by the primary participant and a preset step size.
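The merge-multiply-split operation of claims 6 and 7 can be sketched as below. This is an illustrative stand-in: the shapes, the plain `numpy` matrix product, and the assumption that the long vector is a simple concatenation are not taken from the claim text.

```python
import numpy as np

def split_first_target_product(second_derivative_matrix, primary_gradient,
                               secondary_gradient, preset_step):
    """Illustrative coordinator step for claims 6 and 7: merge the primary
    and secondary gradient values into one long vector, form the first
    target product with the updated second-order derivative matrix and the
    preset step size, then split it back per participant."""
    long_vector = np.concatenate([primary_gradient, secondary_gradient])
    first_target_product = preset_step * (second_derivative_matrix @ long_vector)
    k = len(primary_gradient)
    # The second slice, returned to the secondary participant, is its
    # target secondary gradient value.
    return first_target_product[:k], first_target_product[k:]
```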
  8. A longitudinal federated learning optimization method, wherein the longitudinal federated learning optimization method comprises the following steps:
    receiving primary encrypted data sent by a primary participant and secondary encrypted data sent by a secondary participant, wherein the secondary encrypted data is calculated according to an intermediate result value in the secondary participant, the intermediate result value is calculated by the secondary participant according to an encrypted value set sent by the primary participant, and the encrypted value set comprises a primary encrypted value and a new encrypted value;
    in response to a longitudinal logistic regression model not having converged, updating a second-order derivative matrix according to the primary encrypted data and the secondary encrypted data, and calculating a target secondary gradient value according to the updated second-order derivative matrix;
    sending the target secondary gradient value to the secondary participant, wherein the secondary participant is configured to update local model parameters in the secondary participant based on the target secondary gradient value, and to continue performing the step of the secondary participant obtaining the encrypted value set with linear regression values sent by the primary participant, until a longitudinal federated model corresponding to the coordinator converges.
  9. The longitudinal federated learning optimization method according to claim 8, wherein the step of updating the second-order derivative matrix according to the primary encrypted data and the secondary encrypted data comprises:
    upon determining that the longitudinal logistic regression model satisfies a preset judgment condition, decrypting and merging the primary encrypted data and the secondary encrypted data to obtain target data;
    storing the target data into a queue of a preset length to obtain a target queue, and updating the second-order derivative matrix by means of the target queue.
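The preset-length queue of claim 9 can be sketched with a bounded `collections.deque`. The rank-one rebuild of the matrix below is an illustrative assumption standing in for the unspecified second-order derivative update in the claim; only the bounded-queue behavior itself is taken from the claim text.

```python
from collections import deque
import numpy as np

class TargetQueue:
    """Bounded queue of decrypted target data (claim 9); the oldest entry
    is evicted once the preset length is reached."""
    def __init__(self, preset_length, dim):
        self.queue = deque(maxlen=preset_length)
        self.dim = dim

    def store_and_update(self, target_data):
        """Store new target data, then rebuild an approximate second-order
        derivative matrix from whatever the queue currently holds."""
        self.queue.append(np.asarray(target_data, dtype=float))
        matrix = np.eye(self.dim)
        for vector in self.queue:
            matrix += np.outer(vector, vector)  # illustrative rank-one terms
        return matrix
```

Because `deque(maxlen=...)` drops the oldest item automatically, the matrix is always rebuilt from at most the preset number of most recent target-data vectors.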
  10. The longitudinal federated learning optimization method according to claim 8, wherein the longitudinal federated learning optimization method comprises:
    upon determining that the longitudinal logistic regression model does not satisfy the preset judgment condition, the coordinator obtaining a first product of a secondary gradient value sent by the secondary participant and a preset step size, and sending the first product to the secondary participant as the target secondary gradient value.
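The fallback branch of claim 10 reduces to a plain first-order step; a hedged sketch follows. The `second_order_step` callable is a hypothetical placeholder for the second-derivative path of claim 9 and is not part of the claim.

```python
def coordinator_target_secondary_gradient(secondary_gradient, preset_step,
                                          condition_satisfied,
                                          second_order_step=None):
    """Illustrative coordinator branch for claim 10: when the preset
    judgment condition is not satisfied, the target secondary gradient
    value is simply the first product of the secondary gradient value and
    the preset step size (an ordinary gradient-descent step)."""
    if condition_satisfied and second_order_step is not None:
        # Second-order-derivative path (claim 9), stubbed out here.
        return second_order_step(secondary_gradient)
    return preset_step * secondary_gradient
```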
  11. A longitudinal federated learning optimization apparatus, wherein the longitudinal federated learning optimization apparatus comprises:
    an obtaining module, configured for a secondary participant to obtain an encrypted value set with linear regression values sent by a primary participant, and to calculate secondary encrypted data according to the encrypted value set;
    a sending module, configured to send the secondary encrypted data to a coordinator, wherein the coordinator is configured to, in response to a longitudinal federated model not having converged, update a second-order derivative matrix in the coordinator according to the secondary encrypted data, and calculate a target secondary gradient value according to the updated second-order derivative matrix;
    a first receiving module, configured to receive the target secondary gradient value sent by the coordinator based on the secondary encrypted data, update local model parameters in the secondary participant based on the target secondary gradient value, and continue to perform the step of the secondary participant obtaining the encrypted value set with linear regression values sent by the primary participant, until the longitudinal federated model corresponding to the coordinator converges.
  12. A longitudinal federated learning optimization apparatus, wherein the longitudinal federated learning optimization apparatus further comprises:
    a second receiving module, configured to receive primary encrypted data sent by a primary participant and secondary encrypted data sent by a secondary participant, wherein the secondary encrypted data is calculated according to an intermediate result value in the secondary participant, the intermediate result value is calculated by the secondary participant according to an encrypted value set sent by the primary participant, and the encrypted value set comprises a primary encrypted value and a new encrypted value;
    an updating module, configured to, in response to a longitudinal logistic regression model not having converged, update a second-order derivative matrix according to the primary encrypted data and the secondary encrypted data, and calculate a target secondary gradient value according to the updated second-order derivative matrix;
    a convergence module, configured to send the target secondary gradient value to the secondary participant, wherein the secondary participant is configured to update local model parameters in the secondary participant based on the target secondary gradient value, and to continue performing the step of the secondary participant obtaining the encrypted value set with linear regression values sent by the primary participant, until a longitudinal federated model corresponding to the coordinator converges.
  13. A longitudinal federated learning optimization device, wherein the longitudinal federated learning optimization device comprises: a memory, a processor, and computer-readable instructions stored on the memory and executable on the processor, wherein the computer-readable instructions, when executed by the processor, implement the following steps:
    a secondary participant obtaining an encrypted value set with linear regression values sent by a primary participant, and calculating secondary encrypted data according to the encrypted value set;
    sending the secondary encrypted data to a coordinator, wherein the coordinator is configured to, in response to a longitudinal federated model not having converged, update a second-order derivative matrix in the coordinator according to the secondary encrypted data, and calculate a target secondary gradient value according to the updated second-order derivative matrix;
    receiving the target secondary gradient value sent by the coordinator based on the secondary encrypted data, updating local model parameters in the secondary participant based on the target secondary gradient value, and continuing to perform the step of the secondary participant obtaining the encrypted value set with linear regression values sent by the primary participant, until the longitudinal federated model corresponding to the coordinator converges.
  14. The longitudinal federated learning optimization device according to claim 13, wherein the computer-readable instructions, when executed by the processor, further implement the following step:
    upon detecting that the longitudinal federated model satisfies a preset judgment condition, the secondary participant obtaining a primary encrypted value and a new encrypted value sent by the primary participant, and taking the primary encrypted value and the new encrypted value as the encrypted value set with linear regression values sent by the primary participant.
  15. A longitudinal federated learning optimization device, wherein the longitudinal federated learning optimization device comprises: a memory, a processor, and computer-readable instructions stored on the memory and executable on the processor, wherein the computer-readable instructions, when executed by the processor, implement the following steps:
    receiving primary encrypted data sent by a primary participant and secondary encrypted data sent by a secondary participant, wherein the secondary encrypted data is calculated according to an intermediate result value in the secondary participant, the intermediate result value is calculated by the secondary participant according to an encrypted value set sent by the primary participant, and the encrypted value set comprises a primary encrypted value and a new encrypted value;
    in response to a longitudinal logistic regression model not having converged, updating a second-order derivative matrix according to the primary encrypted data and the secondary encrypted data, and calculating a target secondary gradient value according to the updated second-order derivative matrix;
    sending the target secondary gradient value to the secondary participant, wherein the secondary participant is configured to update local model parameters in the secondary participant based on the target secondary gradient value, and to continue performing the step of the secondary participant obtaining the encrypted value set with linear regression values sent by the primary participant, until a longitudinal federated model corresponding to the coordinator converges.
  16. The longitudinal federated learning optimization device according to claim 15, wherein the step of updating the second-order derivative matrix according to the primary encrypted data and the secondary encrypted data comprises:
    upon determining that the longitudinal logistic regression model satisfies a preset judgment condition, decrypting and merging the primary encrypted data and the secondary encrypted data to obtain target data;
    storing the target data into a queue of a preset length to obtain a target queue, and updating the second-order derivative matrix by means of the target queue.
  17. A storage medium, wherein computer-readable instructions are stored on the storage medium, and when the computer-readable instructions are executed by a processor, the following steps are implemented:
    a secondary participant obtaining an encrypted value set with linear regression values sent by a primary participant, and calculating secondary encrypted data according to the encrypted value set;
    sending the secondary encrypted data to a coordinator, wherein the coordinator is configured to, in response to a longitudinal federated model not having converged, update a second-order derivative matrix in the coordinator according to the secondary encrypted data, and calculate a target secondary gradient value according to the updated second-order derivative matrix;
    receiving the target secondary gradient value sent by the coordinator based on the secondary encrypted data, updating local model parameters in the secondary participant based on the target secondary gradient value, and continuing to perform the step of the secondary participant obtaining the encrypted value set with linear regression values sent by the primary participant, until the longitudinal federated model corresponding to the coordinator converges.
  18. The storage medium according to claim 17, wherein the step of the secondary participant obtaining the encrypted value set with linear regression values sent by the primary participant comprises:
    upon detecting that the longitudinal federated model satisfies a preset judgment condition, the secondary participant obtaining a primary encrypted value and a new encrypted value sent by the primary participant, and taking the primary encrypted value and the new encrypted value as the encrypted value set with linear regression values sent by the primary participant.
  19. A storage medium, wherein computer-readable instructions are stored on the storage medium, and when the computer-readable instructions are executed by a processor, the following steps are implemented:
    receiving primary encrypted data sent by a primary participant and secondary encrypted data sent by a secondary participant, wherein the secondary encrypted data is calculated according to an intermediate result value in the secondary participant, the intermediate result value is calculated by the secondary participant according to an encrypted value set sent by the primary participant, and the encrypted value set comprises a primary encrypted value and a new encrypted value;
    in response to a longitudinal logistic regression model not having converged, updating a second-order derivative matrix according to the primary encrypted data and the secondary encrypted data, and calculating a target secondary gradient value according to the updated second-order derivative matrix;
    sending the target secondary gradient value to the secondary participant, wherein the secondary participant is configured to update local model parameters in the secondary participant based on the target secondary gradient value, and to continue performing the step of the secondary participant obtaining the encrypted value set with linear regression values sent by the primary participant, until a longitudinal federated model corresponding to the coordinator converges.
  20. The storage medium according to claim 19, wherein the step of updating the second-order derivative matrix according to the primary encrypted data and the secondary encrypted data comprises:
    upon determining that the longitudinal logistic regression model satisfies a preset judgment condition, decrypting and merging the primary encrypted data and the secondary encrypted data to obtain target data;
    storing the target data into a queue of a preset length to obtain a target queue, and updating the second-order derivative matrix by means of the target queue.
PCT/CN2019/119418 2019-11-14 2019-11-19 Longitudinal federated learning optimization method, apparatus and device, and storage medium WO2021092980A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911124702.X 2019-11-14
CN201911124702.XA CN110851786B (en) 2019-11-14 2019-11-14 Inter-enterprise data interaction method, device, equipment and storage medium based on longitudinal federal learning

Publications (1)

Publication Number Publication Date
WO2021092980A1 true WO2021092980A1 (en) 2021-05-20

Family

ID=69601691

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/119418 WO2021092980A1 (en) 2019-11-14 2019-11-19 Longitudinal federated learning optimization method, apparatus and device, and storage medium

Country Status (2)

Country Link
CN (1) CN110851786B (en)
WO (1) WO2021092980A1 (en)


Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113449872B (en) * 2020-03-25 2023-08-08 百度在线网络技术(北京)有限公司 Parameter processing method, device and system based on federal learning
CN111160573B (en) * 2020-04-01 2020-06-30 支付宝(杭州)信息技术有限公司 Method and device for protecting business prediction model of data privacy joint training by two parties
CN112182649B (en) * 2020-09-22 2024-02-02 上海海洋大学 Data privacy protection system based on safe two-party calculation linear regression algorithm
WO2022094888A1 (en) * 2020-11-05 2022-05-12 浙江大学 Decision tree-oriented longitudinal federation learning method
CN112508199A (en) * 2020-11-30 2021-03-16 同盾控股有限公司 Feature selection method, device and related equipment for cross-feature federated learning
CN113934983A (en) * 2021-10-27 2022-01-14 平安科技(深圳)有限公司 Characteristic variable analysis method and device, computer equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109635422A (en) * 2018-12-07 2019-04-16 深圳前海微众银行股份有限公司 Joint modeling method, device, equipment and computer readable storage medium
CN110189192A (en) * 2019-05-10 2019-08-30 深圳前海微众银行股份有限公司 A kind of generation method and device of information recommendation model
CN110197084A (en) * 2019-06-12 2019-09-03 上海联息生物科技有限公司 Medical data combination learning system and method based on trust computing and secret protection
KR20190103090A (en) * 2019-08-15 2019-09-04 엘지전자 주식회사 Method and apparatus for learning a model to generate poi data using federated learning

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109034398B (en) * 2018-08-10 2023-09-12 深圳前海微众银行股份有限公司 Gradient lifting tree model construction method and device based on federal training and storage medium
CN109299728B (en) * 2018-08-10 2023-06-27 深圳前海微众银行股份有限公司 Sample joint prediction method, system and medium based on construction of gradient tree model
CN109165515A (en) * 2018-08-10 2019-01-08 深圳前海微众银行股份有限公司 Model parameter acquisition methods, system and readable storage medium storing program for executing based on federation's study
CN110263936B (en) * 2019-06-14 2023-04-07 深圳前海微众银行股份有限公司 Horizontal federal learning method, device, equipment and computer storage medium
CN112732297B (en) * 2020-12-31 2022-09-27 平安科技(深圳)有限公司 Method and device for updating federal learning model, electronic equipment and storage medium


Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113742673A (en) * 2021-09-07 2021-12-03 石硕 Cloud edge collaborative management and control integrated platform based on federal learning
CN114003939A (en) * 2021-11-16 2022-02-01 蓝象智联(杭州)科技有限公司 Multiple collinearity analysis method for longitudinal federal scene
CN114003939B (en) * 2021-11-16 2024-03-15 蓝象智联(杭州)科技有限公司 Multiple collinearity analysis method for longitudinal federal scene
CN114547643A (en) * 2022-01-20 2022-05-27 华东师范大学 Linear regression longitudinal federated learning method based on homomorphic encryption
CN114547643B (en) * 2022-01-20 2024-04-19 华东师范大学 Linear regression longitudinal federal learning method based on homomorphic encryption
CN114429223A (en) * 2022-01-26 2022-05-03 上海富数科技有限公司 Heterogeneous model establishing method and device
CN114429223B (en) * 2022-01-26 2023-11-07 上海富数科技有限公司 Heterogeneous model building method and device
CN114841373A (en) * 2022-05-24 2022-08-02 中国电信股份有限公司 Parameter processing method, device, system and product applied to mixed federal scene

Also Published As

Publication number Publication date
CN110851786B (en) 2023-06-06
CN110851786A (en) 2020-02-28

Similar Documents

Publication Publication Date Title
WO2021092980A1 (en) Longitudinal federated learning optimization method, apparatus and device, and storage medium
WO2021092977A1 (en) Vertical federated learning optimization method, appartus, device and storage medium
WO2021249086A1 (en) Multi-party joint decision tree construction method, device and readable storage medium
WO2020134704A1 (en) Model parameter training method based on federated learning, terminal, system and medium
CN112733967B (en) Model training method, device, equipment and storage medium for federal learning
WO2020029589A1 (en) Model parameter acquisition method and system based on federated learning, and readable storage medium
CN113033828B (en) Model training method, using method, system, credible node and equipment
Ding et al. Security information transmission algorithms for IoT based on cloud computing
US20230039182A1 (en) Method, apparatus, computer device, storage medium, and program product for processing data
WO2022247576A1 (en) Data processing method and apparatus, device, and computer-readable storage medium
WO2021159798A1 (en) Method for optimizing longitudinal federated learning system, device and readable storage medium
TWI749444B (en) Reliable user service system and method
JP2019517167A (en) System and method for establishing a link between identifiers without disclosing specific identification information
CN114696990B (en) Multi-party computing method, system and related equipment based on fully homomorphic encryption
CN111324812A (en) Federal recommendation method, device, equipment and medium based on transfer learning
CN111767411A (en) Knowledge graph representation learning optimization method and device and readable storage medium
CN111368196A (en) Model parameter updating method, device, equipment and readable storage medium
CN114492850A (en) Model training method, device, medium, and program product based on federal learning
CN114429223A (en) Heterogeneous model establishing method and device
CN116502732B (en) Federal learning method and system based on trusted execution environment
CN110874638B (en) Behavior analysis-oriented meta-knowledge federation method, device, electronic equipment and system
CN113449872A (en) Parameter processing method, device and system based on federal learning
CN112836767A (en) Federal modeling method, apparatus, device, storage medium, and program product
US11741257B2 (en) Systems and methods for obtaining anonymized information derived from data obtained from external data providers
US9536199B1 (en) Recommendations based on device usage

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19952302

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19952302

Country of ref document: EP

Kind code of ref document: A1