CN115796309A - Horizontal and vertical combination algorithm for federated learning - Google Patents
Horizontal and vertical combination algorithm for federated learning
- Publication number
- CN115796309A (Application CN202211146107.8A)
- Authority
- CN
- China
- Prior art keywords
- model
- longitudinal
- transverse
- gradient
- global
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a horizontal-vertical combination algorithm for federated learning, which comprises the following process. Data synchronization and model initialization: all participants agree in the database on the data dimension features required by the model, and initialize the vertical model parameters and the horizontal global parameters; a vertical-then-horizontal mode is adopted, in which the gradient generated by the vertical federation is transferred in proportion to the gradient of the horizontal federation, the transfer proportion being λ. The invention solves the problem of expanding data features both vertically and horizontally: it can expand the training set in the feature dimension, giving the same sample a higher-dimensional feature description, and it can expand the data set in sample size, giving a broader sample distribution and thus a more robust federated learning model.
Description
Technical Field
The invention relates to the field of federated learning, and in particular to a horizontal-vertical combination algorithm for federated learning.
Background
Federated learning is a new foundational artificial intelligence technology, first proposed by Google in 2016 and originally used to solve the problem of locally updating models for Android mobile phone users. Its design goal is to carry out efficient machine learning among multiple parties or computing nodes while guaranteeing information security during big-data exchange, protecting terminal data and personal privacy, and ensuring legal compliance. The machine learning algorithms usable in federated learning are not limited to neural networks; they also include important algorithms such as random forests. Federated learning is expected to become the basis of the next generation of collaborative artificial intelligence algorithms and networks.
Depending on how the data sets are partitioned, federated learning is divided into horizontal federated learning (HFL), vertical federated learning (VFL) and federated transfer learning (FTL).
Existing federated learning algorithms use either the horizontal or the vertical federated mode alone, and can therefore obtain only a horizontal or a vertical model. In real application scenarios, however, the data sources are uncertain and the forms of data acquisition are diverse: when different data sources are expected to expand the feature dimension and the sample size at the same time for model training, none of the present federated learning algorithms can realize a combined vertical-horizontal mode. Yet combined vertical-horizontal federated learning is very common in practice, for example in bank anti-fraud: on the one hand, more dimensional features of fraud groups are obtained by incorporating telecommunications data; on the other hand, more sample cases are added by bringing in more participating banks, so that a more accurate federated model is trained.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a horizontal-vertical combination algorithm for federated learning.
The invention provides the following technical scheme:
the invention provides a federated learning horizontal and vertical combination algorithm, which comprises the following procedures:
(1) Data synchronization and model initialization: all participants synchronously appoint in a database according to data latitude characteristics required by a model, and initialize longitudinal model parameters and transverse global parameters, wherein a longitudinal mode and a transverse mode are used, namely, a gradient generated by a longitudinal federation is proportionally transmitted to a gradient of a transverse federation, and the transmission proportion is lambda;
(2) Data alignment: sample data needing dimension characteristic expansion is aligned through privacy intersection, namely, each participant of each keyword ID needs corresponding dimension characteristics, and data preparation is carried out for longitudinal federal learning;
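The alignment in step (2) can be illustrated with a plain, non-private sketch. A real deployment would compute the ID intersection under a PSI protocol (e.g. blind-RSA or OPRF based) so that non-shared IDs are never revealed; all names below are hypothetical.

```python
# Illustrative sketch only: shows the alignment step in the clear, not the
# cryptographic private set intersection the patent relies on.
def align_by_key(party_a: dict, party_b: dict):
    """Keep only samples whose key ID appears at both participants,
    returned in the same key order so feature rows line up."""
    shared = sorted(party_a.keys() & party_b.keys())
    return ([party_a[k] for k in shared],
            [party_b[k] for k in shared],
            shared)

# Toy data: key ID -> feature vector held by each participant.
telecom = {"id1": [0.2, 0.5], "id2": [0.1, 0.9], "id4": [0.7, 0.3]}
bank    = {"id1": [1.0], "id2": [0.0], "id3": [1.0]}

xa, xb, ids = align_by_key(telecom, bank)
# ids == ["id1", "id2"]: only the intersection survives alignment.
```

After alignment, each shared ID carries the telecom features and the bank features side by side, which is exactly the row layout vertical federated training assumes.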
(3) For the aligned data of each batch, a vertical federated learning model is trained. In the vertical federation each participant trains its own unique weights w on its own features x; the global training gradient and the corresponding global loss are obtained through an encryption algorithm, the local vertical model updates its weights w once, and the losses of the current vertical learning step, Loss1, Loss2 and Loss3, are passed on;
(4) The gradients of the vertical models trained in the current period (those producing Loss1, Loss2 and Loss3) are carried into the next vertical training period, so that the gradients of two successive vertical rounds are linked. The link is a momentum-style transfer: a certain proportion of the previous period's training gradient is absorbed while a certain proportion of the current gradient is retained:
D(t+1) = λ·D(t) + (1 - λ)·DT
where D is the accumulated model training gradient, DT is the gradient of the current training round, and λ is the parameter coordinating the gradient proportions;
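The momentum-style transfer of step (4) can be sketched as follows; this is a minimal illustration with hypothetical array shapes, since the patent does not fix the gradient representation.

```python
import numpy as np

def momentum_transfer(d_prev: np.ndarray, d_current: np.ndarray,
                      lam: float) -> np.ndarray:
    """D(t+1) = λ·D(t) + (1 - λ)·DT : absorb a proportion λ of the previous
    period's accumulated gradient, retain (1 - λ) of the current one."""
    return lam * d_prev + (1.0 - lam) * d_current

d_prev = np.array([1.0, -2.0])   # D(t), carried over from the last period
d_curr = np.array([0.0, 4.0])    # DT, gradient of the current round
d_next = momentum_transfer(d_prev, d_curr, lam=0.5)
# d_next == [0.5, 1.0]
```

With λ near 1 the accumulated history dominates; with λ near 0 the update is driven almost entirely by the current round.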
(5) The horizontal model receives the gradient value from each vertical model's training. The horizontal federation is the global federation and must take the gradient and loss of every sub-vertical federation into account; these are used to update the horizontal federation's weights from the previous training period:
W(t+1) = W(t) + λ·D(t)
Through this gradient-driven weight update, the model loss of the horizontal federation iterates toward its minimum;
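Step (5)'s weight update can be sketched as below. The sign follows the formula as written in the text; the comment's reading that D stores a descent direction (so the update decreases the loss) is an assumption, since the patent does not state the sign convention of D.

```python
import numpy as np

def horizontal_update(w: np.ndarray, d_prev: np.ndarray,
                      lam: float) -> np.ndarray:
    """W(t+1) = W(t) + λ·D(t), as written in the text. If D holds a
    descent direction, this step moves the loss toward its minimum."""
    return w + lam * d_prev

w_next = horizontal_update(np.array([0.1, 0.2]),
                           np.array([1.0, -1.0]), lam=0.1)
# w_next == [0.2, 0.1]
```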
(6) The loss function Loss4 of the updated horizontal federation model is calculated, i.e. one update is performed before the global horizontal federation is actually computed; this is called the local update of horizontal federated learning;
(7) The losses of the sub-vertical federations are integrated with certain proportional weights to compute the loss of the global model, which is mainly used to correct each sub-vertical and each sub-horizontal federated model:
Loss = a·Loss1 + b·Loss2 + c·Loss3 + (1 - a - b - c)·Loss4
where a, b and c are the weighting coefficients, each taking a value between 0 and 1;
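The global loss of step (7) is a convex combination of the sub-federation losses. In this sketch the coefficients a, b, c are hypothetical hyperparameters; the added constraint a + b + c ≤ 1 is an assumption needed to keep the Loss4 weight non-negative.

```python
def global_loss(loss1, loss2, loss3, loss4, a, b, c):
    """Loss = a·Loss1 + b·Loss2 + c·Loss3 + (1 - a - b - c)·Loss4.
    All four weights are non-negative when 0 <= a, b, c and a + b + c <= 1."""
    assert 0.0 <= a <= 1.0 and 0.0 <= b <= 1.0 and 0.0 <= c <= 1.0
    assert a + b + c <= 1.0
    return a * loss1 + b * loss2 + c * loss3 + (1.0 - a - b - c) * loss4

total = global_loss(0.4, 0.2, 0.3, 0.1, a=0.25, b=0.25, c=0.25)
# total == 0.25 (each of the four losses weighted equally)
```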
(8) Finally, the global gradient D is computed again from the global loss function, and all horizontal and vertical federated models are updated simultaneously. Two weight updates are thus completed in one training period: the first is the local model update, the second is the global parameter update; through these two weight corrections, both local model optimization and global model optimization are ensured;
(9) The vertical and horizontal models finally output several local vertical models and one globally optimal model. The vertical models carry their own parameters and can be used independently, whereas the horizontal model must be used together with the vertical models. This achieves the goal of training several models to output in one pass, and of training and predicting with the vertical and horizontal models simultaneously.
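Putting steps (3) through (8) together, one training period can be sketched as a toy loop. This is an illustrative simplification under strong assumptions — plaintext gradients instead of encrypted ones, one shared linear model per vertical federation, squared-error loss, equal loss weights, a descent sign convention, and hypothetical variable names throughout; it shows the two-update-per-period structure, not the patented protocol itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_and_loss(w, x, y):
    """Least-squares gradient and loss for a linear model y ≈ x @ w."""
    err = x @ w - y
    return x.T @ err / len(y), float(np.mean(err ** 2))

# Three sub-vertical federations, each with its own aligned batch.
batches = [(rng.normal(size=(8, 2)), rng.normal(size=8)) for _ in range(3)]
w_vert = [np.zeros(2) for _ in range(3)]   # vertical model weights
w_horiz = np.zeros(2)                      # horizontal (global) weights
d_prev = np.zeros(2)                       # accumulated gradient D(t)
lam, lr = 0.5, 0.1

for t in range(20):
    # (3) local vertical updates, collecting Loss1..Loss3 and gradients.
    losses, grads = [], []
    for i, (x, y) in enumerate(batches):
        g, l = grad_and_loss(w_vert[i], x, y)
        w_vert[i] -= lr * g                # first (local) weight update
        grads.append(g)
        losses.append(l)
    # (4) momentum-style transfer: D(t+1) = λ·D(t) + (1 - λ)·DT.
    d_prev = lam * d_prev + (1 - lam) * np.mean(grads, axis=0)
    # (5) horizontal update from the accumulated gradient (descent sign).
    w_horiz -= lam * d_prev
    # (6)+(7) horizontal loss Loss4 and the weighted global loss.
    loss4 = np.mean([grad_and_loss(w_horiz, x, y)[1] for x, y in batches])
    period_loss = 0.25 * sum(losses) + 0.25 * loss4
    # (8) second (global-correction) weight update on every model.
    for i in range(3):
        w_vert[i] -= lr * d_prev
```

Each pass through the loop performs the two weight updates the patent emphasizes: a local update from each vertical federation's own data, then a global correction shared by all models.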
Compared with the prior art, the invention has the following beneficial effects:
1. The method solves the problem of expanding data features both vertically and horizontally: it can expand the training set in the feature dimension, giving the same sample a higher-dimensional feature description, and it can expand the data set in sample size, giving a broader sample distribution and generating a more robust federated learning model.
2. The pairwise-trained vertical models are ingeniously connected, through weight gradients, with the horizontal model that links all the data in series. Meanwhile, to guarantee the training effect of each vertical model, the loss of each vertical model is gathered globally, ensuring both the optimum of each local vertical federation and the contribution of the local federations to the global model; the weights of the local models are corrected by the global parameters, improving the training and prediction effect of the local models.
3. The weights are updated twice in one training period, unlike other models which update once per period: the first update uses local information, while the second is a correction update based on global information, making the model more stable and robust during training.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is the algorithm flow chart of the present invention;
FIG. 2 is a schematic of the vertical federated calculation flow (data alignment) of the present invention;
FIG. 3 is a schematic of the vertical federated calculation flow (model training) of the present invention;
FIG. 4 is a schematic of vertical federated learning of the present invention;
FIG. 5 is a schematic of the gradient optimization mode of the present invention;
FIG. 6 is a schematic of the local and global optima of the global model of the present invention;
FIG. 7 is a graph of the variation of loss with training period during model training of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation. Wherein like reference numerals refer to like parts throughout.
Example 1
As shown in FIGS. 1-7, the present invention provides a horizontal-vertical combination algorithm for federated learning, which comprises the following procedures:
(1) Data synchronization and model initialization: all participants agree in the database on the data dimension features required by the model, and initialize the vertical model parameters and the horizontal global parameters. A vertical-then-horizontal mode is adopted: the gradient generated by the vertical federation is transferred in proportion to the gradient of the horizontal federation, with transfer proportion λ;
(2) Data alignment: sample data requiring dimension-feature expansion is aligned through private set intersection, i.e. for each key ID every participant must hold the corresponding dimension features; this prepares the data for vertical federated learning;
(3) For the aligned data of each batch, a vertical federated learning model is trained. In the vertical federation each participant trains its own unique weights w on its own features x; the global training gradient and the corresponding global loss are obtained through an encryption algorithm, the local vertical model updates its weights w once, and the losses of the current vertical learning step, Loss1, Loss2 and Loss3, are passed on;
(4) The gradients of the vertical models trained in the current period (those producing Loss1, Loss2 and Loss3) are carried into the next vertical training period, so that the gradients of two successive vertical rounds are linked. The link is a momentum-style transfer: a certain proportion of the previous period's training gradient is absorbed while a certain proportion of the current gradient is retained:
D(t+1) = λ·D(t) + (1 - λ)·DT
where D is the accumulated model training gradient, DT is the gradient of the current training round, and λ is the parameter coordinating the gradient proportions;
(5) The horizontal model receives the gradient value from each vertical model's training. The horizontal federation is the global federation and must take the gradient and loss of every sub-vertical federation into account; these are used to update the horizontal federation's weights from the previous training period:
W(t+1) = W(t) + λ·D(t)
Through this gradient-driven weight update, the model loss of the horizontal federation iterates toward its minimum;
(6) The loss function Loss4 of the updated horizontal federation model is calculated, i.e. one update is performed before the global horizontal federation is actually computed; this is called the local update of horizontal federated learning;
(7) The losses of the sub-vertical federations are integrated with certain proportional weights to compute the loss of the global model, which is mainly used to correct each sub-vertical and each sub-horizontal federated model:
Loss = a·Loss1 + b·Loss2 + c·Loss3 + (1 - a - b - c)·Loss4
where a, b and c are the weighting coefficients, each taking a value between 0 and 1;
(8) Finally, the global gradient D is computed again from the global loss function, and all horizontal and vertical federated models are updated simultaneously. Two weight updates are thus completed in one training period: the first is the local model update, the second is the global parameter update; through these two weight corrections, both local model optimization and global model optimization are ensured;
(9) The vertical and horizontal models finally output several local vertical models and one globally optimal model. The vertical models carry their own parameters and can be used independently, whereas the horizontal model must be used together with the vertical models. This achieves the goal of training several models to output in one pass, and of training and predicting with the vertical and horizontal models simultaneously.
Further, an example is as follows:
A telecom operator needs to realize federated learning with n banks (n ≥ 2). This involves vertical federated learning between the operator and each bank, and horizontal federated learning among the banks, and specifically includes the following processes:
1. Model and data initialization: the model parameters and corresponding data of the operator and the n banks are initialized; each participant holds two models, a vertical model and a horizontal model.
2. Data alignment: each party's data is aligned by key ID using data intersection from privacy computing, and after alignment the operator's data and each bank's data are virtually packaged together. Virtual packaging means binding two pieces of data together and renaming them as one data block.
3. Model training: training starts from the vertical federated local models, which output the vertical local gradient DT and loss. DT is accumulated and updated according to the proportional parameter to calculate the horizontal loss, and the vertical losses are collected. A global loss function is constructed according to the proportional weights, and the global gradient D is calculated.
4. Global model correction: each vertical and horizontal model parameter is corrected with the global gradient, so that the local models are trained and adjusted toward the global optimum. Each local model is optimized through this global correction.
5. Modeling result: the operator and the n participating banks finally obtain the vertical federated models and the horizontal federated model, completing the combination of the vertical and horizontal models.
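The "virtual packaging" described in step 2 above can be sketched with a hypothetical data structure; the patent does not specify the packaging format, so the field names below are illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataBlock:
    """One named block binding aligned operator and bank data together,
    without physically merging the two sources."""
    name: str
    ids: tuple
    operator_part: dict   # key ID -> operator-side features
    bank_part: dict       # key ID -> bank-side features

def virtual_package(name, operator, bank):
    """Bind the two aligned data sets and rename them as one data block."""
    shared = tuple(sorted(operator.keys() & bank.keys()))
    return DataBlock(name, shared,
                     {k: operator[k] for k in shared},
                     {k: bank[k] for k in shared})

block = virtual_package("telecom+bank1",
                        {"id1": [0.2], "id2": [0.5]},
                        {"id2": [1.0], "id3": [0.0]})
# block.ids == ("id2",): only the shared key survives packaging.
```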
The key points of the invention are:
1. The way the vertical and horizontal models are combined in federated learning: each local vertical federated model is responsible for training the data of its vertical federation, updating the participant's local model, and outputting the model's loss and gradient; the horizontal federated learning model links the sub-vertical models in series according to fixed proportions to obtain its own gradient update, i.e. the horizontal model's gradient is updated from the gradients of the sub-vertical federated models;
2. Local optimization with global model correction: the vertical and horizontal federated models in FL-HVC are local models that complete local updates. However, because each sub-model's update loss is computed on local data, the sub-models train toward local optima. A global loss is therefore calculated as a proportional composite of the sub-model losses, a global gradient is obtained from it, and each sub-vertical and sub-horizontal federated model is corrected accordingly.
Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (1)
1. A horizontal-vertical combination algorithm for federated learning, characterized by comprising the following processes:
(1) Data synchronization and model initialization: all participants agree in the database on the data dimension features required by the model, and initialize the vertical model parameters and the horizontal global parameters. A vertical-then-horizontal mode is adopted: the gradient generated by the vertical federation is transferred in proportion to the gradient of the horizontal federation, with transfer proportion λ;
(2) Data alignment: sample data requiring dimension-feature expansion is aligned through private set intersection, i.e. for each key ID every participant must hold the corresponding dimension features; this prepares the data for vertical federated learning;
(3) For the aligned data of each batch, a vertical federated learning model is trained. In the vertical federation each participant trains its own unique weights w on its own features x; the global training gradient and the corresponding global loss are obtained through an encryption algorithm, the local vertical model updates its weights w once, and the losses of the current vertical learning step, Loss1, Loss2 and Loss3, are passed on;
(4) The gradients of the vertical models trained in the current period (those producing Loss1, Loss2 and Loss3) are carried into the next vertical training period, so that the gradients of two successive vertical rounds are linked. The link is a momentum-style transfer: a certain proportion of the previous period's training gradient is absorbed while a certain proportion of the current gradient is retained:
D(t+1) = λ·D(t) + (1 - λ)·DT
where D is the accumulated model training gradient, DT is the gradient of the current training round, and λ is the parameter coordinating the gradient proportions;
(5) The horizontal model receives the gradient value from each vertical model's training. The horizontal federation is the global federation and must take the gradient and loss of every sub-vertical federation into account; these are used to update the horizontal federation's weights from the previous training period:
W(t+1) = W(t) + λ·D(t)
Through this gradient-driven weight update, the model loss of the horizontal federation iterates toward its minimum;
(6) The loss function Loss4 of the updated horizontal federation model is calculated, i.e. one update is performed before the global horizontal federation is actually computed; this is called the local update of horizontal federated learning;
(7) The losses of the sub-vertical federations are integrated with certain proportional weights to compute the loss of the global model, which is mainly used to correct each sub-vertical and each sub-horizontal federated model:
Loss = a·Loss1 + b·Loss2 + c·Loss3 + (1 - a - b - c)·Loss4
where a, b and c are the weighting coefficients, each taking a value between 0 and 1;
(8) Finally, the global gradient D is computed again from the global loss function, and all horizontal and vertical federated models are updated simultaneously. Two weight updates are thus completed in one training period: the first is the local model update, the second is the global parameter update; through these two weight corrections, both local model optimization and global model optimization are ensured;
(9) The vertical and horizontal models finally output several local vertical models and one globally optimal model. The vertical models carry their own parameters and can be used independently, whereas the horizontal model must be used together with the vertical models. This achieves the goal of training several models to output in one pass, and of training and predicting with the vertical and horizontal models simultaneously.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211146107.8A CN115796309A (en) | 2022-09-20 | 2022-09-20 | Horizontal and vertical combination algorithm for federated learning |
PCT/CN2022/137220 WO2024060410A1 (en) | 2022-09-20 | 2022-12-29 | Horizontal and vertical federated learning combined algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211146107.8A CN115796309A (en) | 2022-09-20 | 2022-09-20 | Horizontal and vertical combination algorithm for federated learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115796309A true CN115796309A (en) | 2023-03-14 |
Family
ID=85432050
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211146107.8A Pending CN115796309A (en) | 2022-09-20 | 2022-09-20 | Horizontal and vertical combination algorithm for federated learning |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115796309A (en) |
WO (1) | WO2024060410A1 (en) |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110782042B (en) * | 2019-10-29 | 2022-02-11 | 深圳前海微众银行股份有限公司 | Method, device, equipment and medium for combining horizontal federation and vertical federation |
US11843516B2 (en) * | 2020-03-10 | 2023-12-12 | Asiainfo Technologies (China), Inc. | Federated learning in telecom communication system |
CN112199709A (en) * | 2020-10-28 | 2021-01-08 | 支付宝(杭州)信息技术有限公司 | Multi-party based privacy data joint training model method and device |
CN113642034A (en) * | 2021-06-25 | 2021-11-12 | 合肥工业大学 | Medical big data safety sharing method and system based on horizontal and vertical federal learning |
CN113689003B (en) * | 2021-08-10 | 2024-03-22 | 华东师范大学 | Mixed federal learning framework and method for safely removing third party |
2022
- 2022-09-20 CN CN202211146107.8A patent/CN115796309A/en active Pending
- 2022-12-29 WO PCT/CN2022/137220 patent/WO2024060410A1/en unknown
Also Published As
Publication number | Publication date |
---|---|
WO2024060410A1 (en) | 2024-03-28 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |