CN115796309A - Horizontal and vertical combination algorithm for federated learning - Google Patents
Horizontal and vertical combination algorithm for federated learning
- Publication number
- CN115796309A (Application CN202211146107.8A)
- Authority
- CN
- China
- Prior art keywords
- model
- longitudinal
- transverse
- gradient
- global
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a horizontal-vertical combination algorithm for federated learning, which comprises the following process. Data synchronization and model initialization: all participants agree in the database on the data dimension features required by the model, and initialize the vertical model parameters and the horizontal global parameters; a vertical-then-horizontal mode is adopted, in which the gradient generated by the vertical federation is transferred in proportion to the gradient of the horizontal federation, the transfer proportion being λ. The invention solves the problem of expanding data features both vertically and horizontally: it can expand the training set in the feature dimension, giving the same sample a higher-dimensional feature description, and it can expand the data set in sample size, giving a broader sample distribution and thus a more robust federated learning model.
Description
Technical Field
The invention relates to the field of federated learning, and in particular to a horizontal-vertical combination algorithm for federated learning.
Background
Federated learning is a new foundational artificial intelligence technology, first proposed by Google in 2016 and originally used to solve the problem of locally updating models for Android mobile phone users. Its design goal is to carry out efficient machine learning among multiple parties or computing nodes while guaranteeing information security during big-data exchange, protecting terminal data and personal privacy, and ensuring legal compliance. The machine learning algorithms usable in federated learning are not limited to neural networks; they also include important algorithms such as random forests. Federated learning is expected to become the basis of the next generation of collaborative artificial intelligence algorithms and networks.
Depending on how the data sets are partitioned, federated learning is divided into horizontal federated learning (HFL), vertical federated learning (VFL) and federated transfer learning (FTL).
Existing federated learning algorithms use either the horizontal or the vertical federated mode alone, and can therefore obtain only a horizontal or a vertical model. In real application scenarios, however, the data sources are uncertain and the forms of data acquisition are diverse: when different data sources are expected to expand the feature dimension and the sample size at the same time for model training, none of the present federated learning algorithms can realize a combined vertical-horizontal mode. Yet combined vertical-horizontal federated learning is very common in practice, for example in bank anti-fraud: on the one hand, more dimensional features of fraud groups are obtained by incorporating telecommunications data; on the other hand, more sample cases are added by bringing in more participating banks, so that a more accurate federated model is trained.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a horizontal-vertical combination algorithm for federated learning.
The invention provides the following technical scheme:
the invention provides a federated learning horizontal and vertical combination algorithm, which comprises the following procedures:
(1) Data synchronization and model initialization: all participants synchronously appoint in a database according to data latitude characteristics required by a model, and initialize longitudinal model parameters and transverse global parameters, wherein a longitudinal mode and a transverse mode are used, namely, a gradient generated by a longitudinal federation is proportionally transmitted to a gradient of a transverse federation, and the transmission proportion is lambda;
(2) Data alignment: sample data needing dimension characteristic expansion is aligned through privacy intersection, namely, each participant of each keyword ID needs corresponding dimension characteristics, and data preparation is carried out for longitudinal federal learning;
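The alignment in step (2) can be illustrated with a plain, non-private sketch. A real deployment would compute the ID intersection under a PSI protocol (e.g. blind-RSA or OPRF based) so that non-shared IDs are never revealed; all names below are hypothetical.

```python
# Illustrative sketch only: shows the alignment step in the clear, not the
# cryptographic private set intersection the patent relies on.
def align_by_key(party_a: dict, party_b: dict):
    """Keep only samples whose key ID appears at both participants,
    returned in the same key order so feature rows line up."""
    shared = sorted(party_a.keys() & party_b.keys())
    return ([party_a[k] for k in shared],
            [party_b[k] for k in shared],
            shared)

# Toy data: key ID -> feature vector held by each participant.
telecom = {"id1": [0.2, 0.5], "id2": [0.1, 0.9], "id4": [0.7, 0.3]}
bank    = {"id1": [1.0], "id2": [0.0], "id3": [1.0]}

xa, xb, ids = align_by_key(telecom, bank)
# ids == ["id1", "id2"]: only the intersection survives alignment.
```

After alignment, each shared ID carries the telecom features and the bank features side by side, which is exactly the row layout vertical federated training assumes.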
(3) For the aligned data of each batch, a vertical federated learning model is trained. In the vertical federation each participant trains its own unique weights w on its own features x; the global training gradient and the corresponding global loss are obtained through an encryption algorithm, the local vertical model updates its weights w once, and the losses of the current vertical learning step, Loss1, Loss2 and Loss3, are passed on;
(4) The gradients of the vertical models trained in the current period (those producing Loss1, Loss2 and Loss3) are carried into the next vertical training period, so that the gradients of two successive vertical rounds are linked. The link is a momentum-style transfer: a certain proportion of the previous period's training gradient is absorbed while a certain proportion of the current gradient is retained:
D(t+1) = λ·D(t) + (1 - λ)·DT
where D is the accumulated model training gradient, DT is the gradient of the current training round, and λ is the parameter coordinating the gradient proportions;
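The momentum-style transfer of step (4) can be sketched as follows; this is a minimal illustration with hypothetical array shapes, since the patent does not fix the gradient representation.

```python
import numpy as np

def momentum_transfer(d_prev: np.ndarray, d_current: np.ndarray,
                      lam: float) -> np.ndarray:
    """D(t+1) = λ·D(t) + (1 - λ)·DT : absorb a proportion λ of the previous
    period's accumulated gradient, retain (1 - λ) of the current one."""
    return lam * d_prev + (1.0 - lam) * d_current

d_prev = np.array([1.0, -2.0])   # D(t), carried over from the last period
d_curr = np.array([0.0, 4.0])    # DT, gradient of the current round
d_next = momentum_transfer(d_prev, d_curr, lam=0.5)
# d_next == [0.5, 1.0]
```

With λ near 1 the accumulated history dominates; with λ near 0 the update is driven almost entirely by the current round.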
(5) The horizontal model receives the gradient value from each vertical model's training. The horizontal federation is the global federation and must take the gradient and loss of every sub-vertical federation into account; these are used to update the horizontal federation's weights from the previous training period:
W(t+1) = W(t) + λ·D(t)
Through this gradient-driven weight update, the model loss of the horizontal federation iterates toward its minimum;
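Step (5)'s weight update can be sketched as below. The sign follows the formula as written in the text; the comment's reading that D stores a descent direction (so the update decreases the loss) is an assumption, since the patent does not state the sign convention of D.

```python
import numpy as np

def horizontal_update(w: np.ndarray, d_prev: np.ndarray,
                      lam: float) -> np.ndarray:
    """W(t+1) = W(t) + λ·D(t), as written in the text. If D holds a
    descent direction, this step moves the loss toward its minimum."""
    return w + lam * d_prev

w_next = horizontal_update(np.array([0.1, 0.2]),
                           np.array([1.0, -1.0]), lam=0.1)
# w_next == [0.2, 0.1]
```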
(6) The loss function Loss4 of the updated horizontal federation model is calculated, i.e. one update is performed before the global horizontal federation is actually computed; this is called the local update of horizontal federated learning;
(7) The losses of the sub-vertical federations are integrated with certain proportional weights to compute the loss of the global model, which is mainly used to correct each sub-vertical and each sub-horizontal federated model:
Loss = a·Loss1 + b·Loss2 + c·Loss3 + (1 - a - b - c)·Loss4
where a, b and c are the weighting coefficients, each taking a value between 0 and 1;
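The global loss of step (7) is a convex combination of the sub-federation losses. In this sketch the coefficients a, b, c are hypothetical hyperparameters; the added constraint a + b + c ≤ 1 is an assumption needed to keep the Loss4 weight non-negative.

```python
def global_loss(loss1, loss2, loss3, loss4, a, b, c):
    """Loss = a·Loss1 + b·Loss2 + c·Loss3 + (1 - a - b - c)·Loss4.
    All four weights are non-negative when 0 <= a, b, c and a + b + c <= 1."""
    assert 0.0 <= a <= 1.0 and 0.0 <= b <= 1.0 and 0.0 <= c <= 1.0
    assert a + b + c <= 1.0
    return a * loss1 + b * loss2 + c * loss3 + (1.0 - a - b - c) * loss4

total = global_loss(0.4, 0.2, 0.3, 0.1, a=0.25, b=0.25, c=0.25)
# total == 0.25 (each of the four losses weighted equally)
```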
(8) Finally, the global gradient D is computed again from the global loss function, and all horizontal and vertical federated models are updated simultaneously. Two weight updates are thus completed in one training period: the first is the local model update, the second is the global parameter update; through these two weight corrections, both local model optimization and global model optimization are ensured;
(9) The vertical and horizontal models finally output several local vertical models and one globally optimal model. The vertical models carry their own parameters and can be used independently, whereas the horizontal model must be used together with the vertical models. This achieves the goal of training several models to output in one pass, and of training and predicting with the vertical and horizontal models simultaneously.
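Putting steps (3) through (8) together, one training period can be sketched as a toy loop. This is an illustrative simplification under strong assumptions — plaintext gradients instead of encrypted ones, one shared linear model per vertical federation, squared-error loss, equal loss weights, a descent sign convention, and hypothetical variable names throughout; it shows the two-update-per-period structure, not the patented protocol itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_and_loss(w, x, y):
    """Least-squares gradient and loss for a linear model y ≈ x @ w."""
    err = x @ w - y
    return x.T @ err / len(y), float(np.mean(err ** 2))

# Three sub-vertical federations, each with its own aligned batch.
batches = [(rng.normal(size=(8, 2)), rng.normal(size=8)) for _ in range(3)]
w_vert = [np.zeros(2) for _ in range(3)]   # vertical model weights
w_horiz = np.zeros(2)                      # horizontal (global) weights
d_prev = np.zeros(2)                       # accumulated gradient D(t)
lam, lr = 0.5, 0.1

for t in range(20):
    # (3) local vertical updates, collecting Loss1..Loss3 and gradients.
    losses, grads = [], []
    for i, (x, y) in enumerate(batches):
        g, l = grad_and_loss(w_vert[i], x, y)
        w_vert[i] -= lr * g                # first (local) weight update
        grads.append(g)
        losses.append(l)
    # (4) momentum-style transfer: D(t+1) = λ·D(t) + (1 - λ)·DT.
    d_prev = lam * d_prev + (1 - lam) * np.mean(grads, axis=0)
    # (5) horizontal update from the accumulated gradient (descent sign).
    w_horiz -= lam * d_prev
    # (6)+(7) horizontal loss Loss4 and the weighted global loss.
    loss4 = np.mean([grad_and_loss(w_horiz, x, y)[1] for x, y in batches])
    period_loss = 0.25 * sum(losses) + 0.25 * loss4
    # (8) second (global-correction) weight update on every model.
    for i in range(3):
        w_vert[i] -= lr * d_prev
```

Each pass through the loop performs the two weight updates the patent emphasizes: a local update from each vertical federation's own data, then a global correction shared by all models.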
Compared with the prior art, the invention has the following beneficial effects:
1. The method solves the problem of expanding data features both vertically and horizontally: it can expand the training set in the feature dimension, giving the same sample a higher-dimensional feature description, and it can expand the data set in sample size, giving a broader sample distribution and generating a more robust federated learning model.
2. The pairwise-trained vertical models are ingeniously connected, through weight gradients, with the horizontal model that links all the data in series. Meanwhile, to guarantee the training effect of each vertical model, the loss of each vertical model is gathered globally, ensuring both the optimum of each local vertical federation and the contribution of the local federations to the global model; the weights of the local models are corrected by the global parameters, improving the training and prediction effect of the local models.
3. The weights are updated twice in one training period, unlike other models which update once per period: the first update uses local information, while the second is a correction update based on global information, making the model more stable and robust during training.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is the algorithm flow chart of the present invention;
FIG. 2 is a schematic of the vertical federated calculation flow (data alignment) of the present invention;
FIG. 3 is a schematic of the vertical federated calculation flow (model training) of the present invention;
FIG. 4 is a schematic of vertical federated learning of the present invention;
FIG. 5 is a schematic of the gradient optimization mode of the present invention;
FIG. 6 is a schematic of the local and global optima of the global model of the present invention;
FIG. 7 is a graph of the variation of loss with training period during model training of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation. Wherein like reference numerals refer to like parts throughout.
Example 1
As shown in FIGS. 1-7, the present invention provides a horizontal-vertical combination algorithm for federated learning, which comprises the following procedures:
(1) Data synchronization and model initialization: all participants agree in the database on the data dimension features required by the model, and initialize the vertical model parameters and the horizontal global parameters. A vertical-then-horizontal mode is adopted: the gradient generated by the vertical federation is transferred in proportion to the gradient of the horizontal federation, with transfer proportion λ;
(2) Data alignment: sample data requiring dimension-feature expansion is aligned through private set intersection, i.e. for each key ID every participant must hold the corresponding dimension features; this prepares the data for vertical federated learning;
(3) For the aligned data of each batch, a vertical federated learning model is trained. In the vertical federation each participant trains its own unique weights w on its own features x; the global training gradient and the corresponding global loss are obtained through an encryption algorithm, the local vertical model updates its weights w once, and the losses of the current vertical learning step, Loss1, Loss2 and Loss3, are passed on;
(4) The gradients of the vertical models trained in the current period (those producing Loss1, Loss2 and Loss3) are carried into the next vertical training period, so that the gradients of two successive vertical rounds are linked. The link is a momentum-style transfer: a certain proportion of the previous period's training gradient is absorbed while a certain proportion of the current gradient is retained:
D(t+1) = λ·D(t) + (1 - λ)·DT
where D is the accumulated model training gradient, DT is the gradient of the current training round, and λ is the parameter coordinating the gradient proportions;
(5) The horizontal model receives the gradient value from each vertical model's training. The horizontal federation is the global federation and must take the gradient and loss of every sub-vertical federation into account; these are used to update the horizontal federation's weights from the previous training period:
W(t+1) = W(t) + λ·D(t)
Through this gradient-driven weight update, the model loss of the horizontal federation iterates toward its minimum;
(6) The loss function Loss4 of the updated horizontal federation model is calculated, i.e. one update is performed before the global horizontal federation is actually computed; this is called the local update of horizontal federated learning;
(7) The losses of the sub-vertical federations are integrated with certain proportional weights to compute the loss of the global model, which is mainly used to correct each sub-vertical and each sub-horizontal federated model:
Loss = a·Loss1 + b·Loss2 + c·Loss3 + (1 - a - b - c)·Loss4
where a, b and c are the weighting coefficients, each taking a value between 0 and 1;
(8) Finally, the global gradient D is computed again from the global loss function, and all horizontal and vertical federated models are updated simultaneously. Two weight updates are thus completed in one training period: the first is the local model update, the second is the global parameter update; through these two weight corrections, both local model optimization and global model optimization are ensured;
(9) The vertical and horizontal models finally output several local vertical models and one globally optimal model. The vertical models carry their own parameters and can be used independently, whereas the horizontal model must be used together with the vertical models. This achieves the goal of training several models to output in one pass, and of training and predicting with the vertical and horizontal models simultaneously.
Further, an example is as follows:
A telecom operator needs to realize federated learning with n banks (n ≥ 2). This involves vertical federated learning between the operator and each bank, and horizontal federated learning among the banks, and specifically includes the following processes:
1. Model and data initialization: the model parameters and corresponding data of the operator and the n banks are initialized; each participant holds two models, a vertical model and a horizontal model.
2. Data alignment: each party's data is aligned by key ID using data intersection from privacy computing, and after alignment the operator's data and each bank's data are virtually packaged together. Virtual packaging means binding two pieces of data together and renaming them as one data block.
3. Model training: training starts from the vertical federated local models, which output the vertical local gradient DT and loss. DT is accumulated and updated according to the proportional parameter to calculate the horizontal loss, and the vertical losses are collected. A global loss function is constructed according to the proportional weights, and the global gradient D is calculated.
4. Global model correction: each vertical and horizontal model parameter is corrected with the global gradient, so that the local models are trained and adjusted toward the global optimum. Each local model is optimized through this global correction.
5. Modeling result: the operator and the n participating banks finally obtain the vertical federated models and the horizontal federated model, completing the combination of the vertical and horizontal models.
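The "virtual packaging" described in step 2 above can be sketched with a hypothetical data structure; the patent does not specify the packaging format, so the field names below are illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataBlock:
    """One named block binding aligned operator and bank data together,
    without physically merging the two sources."""
    name: str
    ids: tuple
    operator_part: dict   # key ID -> operator-side features
    bank_part: dict       # key ID -> bank-side features

def virtual_package(name, operator, bank):
    """Bind the two aligned data sets and rename them as one data block."""
    shared = tuple(sorted(operator.keys() & bank.keys()))
    return DataBlock(name, shared,
                     {k: operator[k] for k in shared},
                     {k: bank[k] for k in shared})

block = virtual_package("telecom+bank1",
                        {"id1": [0.2], "id2": [0.5]},
                        {"id2": [1.0], "id3": [0.0]})
# block.ids == ("id2",): only the shared key survives packaging.
```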
The key points of the invention are:
1. The way the vertical and horizontal models are combined in federated learning: each local vertical federated model is responsible for training the data of its vertical federation, updating the participant's local model, and outputting the model's loss and gradient; the horizontal federated learning model links the sub-vertical models in series according to fixed proportions to obtain its own gradient update, i.e. the horizontal model's gradient is updated from the gradients of the sub-vertical federated models;
2. Local optimization with global model correction: the vertical and horizontal federated models in FL-HVC are local models that complete local updates. However, because each sub-model's update loss is computed on local data, the sub-models train toward local optima. A global loss is therefore calculated as a proportional composite of the sub-model losses, a global gradient is obtained from it, and each sub-vertical and sub-horizontal federated model is corrected accordingly.
Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (1)
1. A horizontal-vertical combination algorithm for federated learning, characterized by comprising the following processes:
(1) Data synchronization and model initialization: all participants agree in the database on the data dimension features required by the model, and initialize the vertical model parameters and the horizontal global parameters. A vertical-then-horizontal mode is adopted: the gradient generated by the vertical federation is transferred in proportion to the gradient of the horizontal federation, with transfer proportion λ;
(2) Data alignment: sample data requiring dimension-feature expansion is aligned through private set intersection, i.e. for each key ID every participant must hold the corresponding dimension features; this prepares the data for vertical federated learning;
(3) For the aligned data of each batch, a vertical federated learning model is trained. In the vertical federation each participant trains its own unique weights w on its own features x; the global training gradient and the corresponding global loss are obtained through an encryption algorithm, the local vertical model updates its weights w once, and the losses of the current vertical learning step, Loss1, Loss2 and Loss3, are passed on;
(4) The gradients of the vertical models trained in the current period (those producing Loss1, Loss2 and Loss3) are carried into the next vertical training period, so that the gradients of two successive vertical rounds are linked. The link is a momentum-style transfer: a certain proportion of the previous period's training gradient is absorbed while a certain proportion of the current gradient is retained:
D(t+1) = λ·D(t) + (1 - λ)·DT
where D is the accumulated model training gradient, DT is the gradient of the current training round, and λ is the parameter coordinating the gradient proportions;
(5) The horizontal model receives the gradient value from each vertical model's training. The horizontal federation is the global federation and must take the gradient and loss of every sub-vertical federation into account; these are used to update the horizontal federation's weights from the previous training period:
W(t+1) = W(t) + λ·D(t)
Through this gradient-driven weight update, the model loss of the horizontal federation iterates toward its minimum;
(6) The loss function Loss4 of the updated horizontal federation model is calculated, i.e. one update is performed before the global horizontal federation is actually computed; this is called the local update of horizontal federated learning;
(7) The losses of the sub-vertical federations are integrated with certain proportional weights to compute the loss of the global model, which is mainly used to correct each sub-vertical and each sub-horizontal federated model:
Loss = a·Loss1 + b·Loss2 + c·Loss3 + (1 - a - b - c)·Loss4
where a, b and c are the weighting coefficients, each taking a value between 0 and 1;
(8) Finally, the global gradient D is computed again from the global loss function, and all horizontal and vertical federated models are updated simultaneously. Two weight updates are thus completed in one training period: the first is the local model update, the second is the global parameter update; through these two weight corrections, both local model optimization and global model optimization are ensured;
(9) The vertical and horizontal models finally output several local vertical models and one globally optimal model. The vertical models carry their own parameters and can be used independently, whereas the horizontal model must be used together with the vertical models. This achieves the goal of training several models to output in one pass, and of training and predicting with the vertical and horizontal models simultaneously.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211146107.8A CN115796309A (en) | 2022-09-20 | 2022-09-20 | Horizontal and vertical combination algorithm for federated learning |
PCT/CN2022/137220 WO2024060410A1 (en) | 2022-09-20 | 2022-12-29 | Horizontal and vertical federated learning combined algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211146107.8A CN115796309A (en) | 2022-09-20 | 2022-09-20 | Horizontal and vertical combination algorithm for federated learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115796309A true CN115796309A (en) | 2023-03-14 |
Family
ID=85432050
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211146107.8A Pending CN115796309A (en) | 2022-09-20 | 2022-09-20 | Horizontal and vertical combination algorithm for federated learning |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115796309A (en) |
WO (1) | WO2024060410A1 (en) |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110782042B (en) * | 2019-10-29 | 2022-02-11 | 深圳前海微众银行股份有限公司 | Method, device, equipment and medium for combining horizontal federation and vertical federation |
US11843516B2 (en) * | 2020-03-10 | 2023-12-12 | Asiainfo Technologies (China), Inc. | Federated learning in telecom communication system |
CN112199709A (en) * | 2020-10-28 | 2021-01-08 | 支付宝(杭州)信息技术有限公司 | Multi-party based privacy data joint training model method and device |
CN113642034A (en) * | 2021-06-25 | 2021-11-12 | 合肥工业大学 | Medical big data safety sharing method and system based on horizontal and vertical federal learning |
CN113689003B (en) * | 2021-08-10 | 2024-03-22 | 华东师范大学 | Mixed federal learning framework and method for safely removing third party |
2022
- 2022-09-20 CN CN202211146107.8A patent/CN115796309A/en active Pending
- 2022-12-29 WO PCT/CN2022/137220 patent/WO2024060410A1/en unknown
Also Published As
Publication number | Publication date |
---|---|
WO2024060410A1 (en) | 2024-03-28 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |