CN111798002A - Local model proportion controllable federated learning global model aggregation method - Google Patents

Local model proportion controllable federated learning global model aggregation method Download PDF

Info

Publication number
CN111798002A
CN111798002A CN202010482489.6A CN202010482489A CN111798002A CN 111798002 A CN111798002 A CN 111798002A CN 202010482489 A CN202010482489 A CN 202010482489A CN 111798002 A CN111798002 A CN 111798002A
Authority
CN
China
Prior art keywords
local
model
internet
things
federal learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010482489.6A
Other languages
Chinese (zh)
Inventor
王瑞
郑劭兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology Beijing USTB
Original Assignee
University of Science and Technology Beijing USTB
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology Beijing USTB filed Critical University of Science and Technology Beijing USTB
Priority to CN202010482489.6A priority Critical patent/CN111798002A/en
Publication of CN111798002A publication Critical patent/CN111798002A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/20Ensemble learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a federal learning global model polymerization method with controllable local model proportion, and belongs to the field of artificial intelligence of the Internet of things. The method comprises the following steps: the method comprises the following steps: a federal learning system is constructed in the same-domain Internet of things equipment, and the system is divided into two main parts: the global end of the federal learning system and the local end of the internet of things of the federal learning system. The technical scheme of the invention ensures the controllability of the proportion of the local end model of the Internet of things participating in the global model aggregation in the federal learning process, and has great practical significance for improving the model precision of the federal learning global model. The invention is applied to the field of artificial intelligence and data islands.

Description

Local model proportion controllable federated learning global model aggregation method
Technical Field
The invention belongs to the field of artificial intelligence of the Internet of things and relates to federal learning knowledge. In particular to the optimization of a federated learning global model aggregation method. Aiming at the defects of the original federal learning global model polymerization method, the federal learning global model polymerization method with controllable local model proportion is designed and invented.
Background
The rapid development of the internet of things causes the generation of a large amount of data of the internet of things, and the problem of obtaining information in massive data of the internet of things to assist people in production becomes urgent to be solved. Deep learning is used as a powerful data analysis tool, can mine the internet of things data generated and collected in various complex environments, and becomes a very promising method for mining the data information of the internet of things. However, the internet of things devices are distributed in various fields of social production and life, and data generated by the devices are distributed in a plurality of internet of things devices and separated, so that useful information with a certain scale is difficult to form, and the problem is called a data island problem. The problem of data island seriously restricts the utilization of the deep learning method to the data of the Internet of things.
To cope with the data islanding problem google proposed a federal learning method in 2016. Federated learning may be viewed as an encrypted distributed learning technique. The distributed learning algorithm enables the distributed learning algorithm to obtain local update models from a multi-party local end and then aggregate to form a global model, and the encryption technology ensures complete privacy of an intermediate process. The proposal and application of federal learning enable the problem of data island to be effectively solved. The method is a current research hotspot by deeply mining useful information in the data of the Internet of things by means of federal learning so as to guide social life and production.
Disclosure of Invention
The invention provides a federal learning global model polymerization method with controllable local model proportion, and federal learning is an effective method for solving the problem of 'data islands' in the field of artificial intelligence. The existing federal learning adopts a mode of adding and averaging local models to obtain a global model, and the mode has simple operation and low complexity. However, the new global model is used to completely replace the original global model, the historical state of the global model is lost, and once the performance of the new global model is poor, the performance of the whole federal learning system is poor.
In order to solve the problems, the invention provides a global model aggregation method with controllable local model proportion. The specific reasoning steps are as follows:
the method comprises the following steps: the encryption scheme in the federal learning is assumed to be enough to protect the data privacy of all internet of things clients in the same domain from being invaded, and meanwhile, the local model generated by the local end of the internet of things can be guaranteed not to be attacked from the outside in the process of carrying out model aggregation transmission. Only under the premise, the client of the internet of things can be added into the federal learning system with care.
The method is characterized in that a limited number n of local ends of the Internet of things exist in a federal learning system in the same working domain, and the local ends of the Internet of things are agreed to be added into the federal learning process, namely, the local ends of the Internet of things are agreed to contribute data to carry out model training. Then we will note the ith local end of the internet of things in this working domain as IOT-i, and the sample set generated at the local end IOT-i can be written in the form shown below
Figure BDA0002517526320000021
The number of samples at the IOT local end IOT-i is represented by Ni, and Ni can be represented as Ni=||ΩiL. The total amount of samples in the entire federated learning system may be recorded as follows
Figure BDA0002517526320000022
Step two: according to the operation mechanism setting of federal learning, each local end of the Internet of things needs to use own data to train a local model, so that a cost function for training the local model of any local end IOT-i of the Internet of things is defined as
Figure BDA0002517526320000023
Then the global cost function of the entire federated learning system for the same work domain can be written as
Figure BDA0002517526320000031
The final purpose of the federal learning is to enable all local models formed by training the local end of the internet of things participating in the federal learning in the same working domain to be aggregated to form an optimal global model, namely, a loss function of the global model reaches the minimum value. Namely, the following formula is satisfied
Figure BDA0002517526320000032
The global model weight of (2).
Step three: the updating formulas of the weights and the offsets of all local ends of the internet of things participating in the federal learning in the same working domain can be written as follows, wherein R represents the communication turns of the local ends of the internet of things and the global ends of the federal learning, and eta represents the learning rate of the local ends of the internet of things.
Figure BDA0002517526320000033
Figure BDA0002517526320000034
Figure BDA0002517526320000035
Figure BDA0002517526320000036
Figure BDA0002517526320000037
Figure BDA0002517526320000038
Step four: the main innovation point of the invention is that the aggregation mode of the global model for federal learning is improved, and the proportion control parameter rho of the local end model is added in the aggregation process, so that the local model generated by the training of the local end of the internet of things can be proportionally controlled when being integrated into the global model, thereby optimizing the global model of the whole federal learning system, and the specific derivation process is as follows:
Figure BDA0002517526320000039
Figure BDA00025175263200000310
the global model so derived is the final solution, i.e.
Figure BDA0002517526320000041
Drawings
FIG. 1 is a schematic diagram of a federated learning build mechanism;
FIG. 2 is a diagram of a neural network structure for training a local end initial model of the Internet of things;
FIG. 3 is a federated learning flow diagram;
table 1 Non-IID data a proportional table was constructed.
Detailed Description
1. Design of experiments
Typically 127.0.0.1 is the IP address that the system leaves native for loopback testing. The invention uses the 4000 port of the local machine 127.0.0.1 as the global end of the federal learning system in the same domain, and is mainly responsible for two tasks of initializing the original model and aggregating the global model. A process is one execution activity of a program on a computer. When a program is run, a process is started. All processes that are used to perform the various functions of the operating system are system processes, while all processes that are started are user processes. Similarly, the multi-process means that a computer executes a plurality of processes simultaneously, generally, a plurality of software are run simultaneously, and the invention simulates all local terminals of the internet of things in the same domain by adopting a multi-process running mode, namely, each process is used for simulating one local terminal of the internet of things, and is responsible for deep learning training by using local data and uploading a local terminal model obtained by training. The communication between each local end of the Internet of things and the global end of the Federal learning system is simulated by the communication between each process and the 4000 port of the local machine 127.0.0.1. The specific design is shown in figure 1:
the experiment is realized by adopting python language programming, and multiprocessing library is called to realize multiprocess. Because a plurality of subprocesses are required to be started to represent a plurality of internet of things clients, the invention adopts a process pool mode to create the subprocesses in batch, and the specific creation process is as follows:
Figure BDA0002517526320000051
since the client and the global end of the federal learning system in the same domain need to communicate to realize the aggregation of the global model, a communication mechanism between processes is involved. The multiprocessing module of python packages a bottom layer mechanism for interprocess communication and provides various modes of pies, queue and the like for exchanging data. Due to the limitation of the note performance, the experiment sets five local ends of the internet of things, namely five processes, in the federal learning system in the same domain. Because of the communication required between each internet of things client, the identity of each process within the federated learning system must be unique and certain. According to the method and the device, each process has the unique id value by calling the uuid library, and the fact that chaos cannot occur when communication is carried out among the processes is guaranteed.
2. Initial model
The experiment designs a convolution neural network to train an initial model of the global terminal of the federal learning system in the same domain. The advantage of the neural network has a self-learning function. For example, when image recognition is realized, as long as the image sample with the label is input into the artificial neural network in advance, the neural network can slowly learn to recognize similar images through the self-learning function, and the purpose of automatic recognition is achieved. The self-learning function has particularly important significance for prediction, and the future neural network can provide great help for human beings in economic prediction, market prediction, benefit prediction and the like. Meanwhile, the neural network has the capability of searching an optimal solution at a high speed, the optimal solution of a complex problem is searched, a large amount of calculation is usually needed, and the high-speed calculation capability of a computer is exerted by utilizing a feedback type artificial neural network designed aiming at a certain problem, so that the optimal solution can be quickly found.
The convolutional neural network is constructed by using a keras artificial neural network library, and the keras can be used as a high-level application program interface of tensoflow, microsoft-CNTK and theta and can be converted into components under systems such as tensoflow, microsoft-CNTK and the like according to background setting. The convolutional neural network consists of two convolutional layers, a maximum value pooling layer and two full-connection layers, wherein the two dropout layers are added in the network and used for randomly disconnecting partial connection, so that a model is prevented from getting into overfitting, and the connection between the convolutional layers and the full-connection layers is realized by using a flatten layer. The specific structure is shown in fig. 2:
3. data pre-processing
The mnist dataset was used for this experiment. Because most of data generated by the internet of things equipment are Non-IID data, the original mnist data set needs to be divided and preprocessed. The m nist data set is internally provided with 10 types of data with labels of 0-9, the data set is distributed to local ends of the Internet of things according to the label proportion shown in the table 1, the proportion is randomly generated, and two digits after the decimal point is rounded up. Wherein each local side is allocated with 1200 data as a training set and 200 samples as a verification set
TABLE 1 construction of proportional Table for Non-IID data
Figure BDA0002517526320000071
4. Experiment platform
The experimental platform of the experiment is built on a 64-bit windows10 personal notebook, and the equipment comprises an intel (R) core (TM) i7-8750H cpu @2.20GHz processor, an 8GB RAM and an Nvidia1060 independent display card. The device carries out CUDA configuration so as to ensure that the GPU can be used for operation. Meanwhile, a GPU acceleration library cuDNN for a deep neural network is installed and configured to reduce memory overhead and improve the performance of the whole system.
5. Experimental procedure
In order to make the technical problems, technical solutions and advantages to be solved by the present invention clearer, the following is an operation flow chart of the present invention.
The existing federal learning adopts a mode of adding and averaging local models to obtain a global model, and the mode has simple operation and low complexity. However, the new global model is used to completely replace the original global model, the historical state of the global model is lost, and once the performance of the new global model is poor, the performance of the whole federal learning system is poor. The method improves the defects, ensures the controllability of the proportion of the local end model of the Internet of things participating in the global model aggregation in the federal learning process, and has great practical significance for improving the model precision of the federal learning global model.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (4)

1. A federal learning global model aggregation method with controllable local model proportion is characterized by comprising the following steps: a federal learning system is constructed in the same-domain Internet of things equipment, and the system is divided into two main parts: the global end of the federal learning system and the local end of the internet of things of the federal learning system.
2. The method of claim 1, wherein the step of building a federal learning system includes:
firstly, a 4000 port of a local machine 127.0.0.1 is used as a global end of a federal learning system in the same domain to be responsible for two tasks of initializing an original model and aggregating a global model;
secondly, the multi-process means that a computer executes a plurality of processes at the same time, generally, a plurality of software are simultaneously operated, and the invention adopts a multi-process operation mode to simulate all local terminals of the internet of things in the same domain, namely, each process is used for simulating one local terminal of the internet of things, is responsible for deep learning training by using local data and uploads a local terminal model obtained by training;
and finally, communication between each process and a 4000 port of the local machine 127.0.0.1 is carried out, and communication between each local end of the internet of things and a global end of federal learning in the federal learning system is simulated.
3. The method according to claim 2, wherein the working steps of the federal learning global model aggregation method with controllable local model proportions comprise:
the method comprises the following steps: assuming that a limited number n of local ends of the Internet of things exist in a federal learning system in the same working domain, and agreeing to be added into the federal learning process, namely agreeing to contribute data thereof to carry out model training on the local ends; then we will note the ith local end of the internet of things in this working domain as IOT-i, and the sample set generated at the local end IOT-i can be written in the form shown below
Figure FDA0002517526310000011
The number of samples at the IOT local end IOT-i is represented by Ni, and Ni can be represented as Ni=||ΩiL; the total amount of samples in the entire federal learning system can be recorded as follows
Figure FDA0002517526310000012
Step two: defining a cost function of local model training of any local end IOT-i of the Internet of things as
Figure FDA0002517526310000013
Then the global cost function of the entire federated learning system for the same work domain can be written as
Figure FDA0002517526310000014
The final purpose of the federal learning is to enable all local models formed by training the local ends of the internet of things participating in the federal learning in the same working domain to be aggregated to form an optimal global model, namely, a loss function of the global model reaches the minimum value; namely, the following formula is satisfied
Figure FDA0002517526310000015
Global model weight of (2);
step three: updating formulas of weights and biases of all local ends of the internet of things participating in federal learning in the same working domain can be written as follows, wherein R represents communication turns of the local ends of the internet of things and the global ends of federal learning, and eta represents learning rate of the local ends of the internet of things;
Figure FDA0002517526310000016
Figure FDA0002517526310000017
Figure FDA0002517526310000018
Figure FDA0002517526310000021
Figure FDA0002517526310000022
Figure FDA0002517526310000023
step four: adding a local end model proportion control parameter rho in the aggregation process of the global model, so that the local model generated by the training of the local end of the Internet of things can be controlled in proportion when being integrated into the global model, thereby optimizing the global model of the whole federal learning system, wherein the specific derivation process is as follows:
Figure FDA0002517526310000024
Figure FDA0002517526310000025
the final result is
Figure FDA0002517526310000026
4. The method as claimed in claim 1, wherein in a global model aggregation process of a same-domain federal learning system, a proportion control parameter for an internet of things local end model is added, so that a proportion of the internet of things local end model participating in global model aggregation in the federal learning process can be controlled.
CN202010482489.6A 2020-05-31 2020-05-31 Local model proportion controllable federated learning global model aggregation method Pending CN111798002A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010482489.6A CN111798002A (en) 2020-05-31 2020-05-31 Local model proportion controllable federated learning global model aggregation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010482489.6A CN111798002A (en) 2020-05-31 2020-05-31 Local model proportion controllable federated learning global model aggregation method

Publications (1)

Publication Number Publication Date
CN111798002A true CN111798002A (en) 2020-10-20

Family

ID=72806631

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010482489.6A Pending CN111798002A (en) 2020-05-31 2020-05-31 Local model proportion controllable federated learning global model aggregation method

Country Status (1)

Country Link
CN (1) CN111798002A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112200263A (en) * 2020-10-22 2021-01-08 国网山东省电力公司电力科学研究院 Self-organizing federal clustering method applied to power distribution internet of things
CN112819177A (en) * 2021-01-26 2021-05-18 支付宝(杭州)信息技术有限公司 Personalized privacy protection learning method, device and equipment
CN112949837A (en) * 2021-04-13 2021-06-11 中国人民武装警察部队警官学院 Target recognition federal deep learning method based on trusted network
CN113095513A (en) * 2021-04-25 2021-07-09 中山大学 Double-layer fair federal learning method, device and storage medium
CN113179244A (en) * 2021-03-10 2021-07-27 上海大学 Federal deep network behavior feature modeling method for industrial internet boundary safety
CN113490184A (en) * 2021-05-10 2021-10-08 北京科技大学 Smart factory-oriented random access resource optimization method and device
WO2022130098A1 (en) * 2020-12-15 2022-06-23 International Business Machines Corporation Federated learning for multi-label classification model for oil pump management
CN115145966A (en) * 2022-09-05 2022-10-04 山东省计算中心(国家超级计算济南中心) Comparison federal learning method and system for heterogeneous data
CN117313902A (en) * 2023-11-30 2023-12-29 北京航空航天大学 Signal game-based vehicle formation asynchronous federal learning method

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112200263A (en) * 2020-10-22 2021-01-08 国网山东省电力公司电力科学研究院 Self-organizing federal clustering method applied to power distribution internet of things
CN112200263B (en) * 2020-10-22 2022-09-16 国网山东省电力公司电力科学研究院 Self-organizing federal clustering method applied to power distribution internet of things
WO2022130098A1 (en) * 2020-12-15 2022-06-23 International Business Machines Corporation Federated learning for multi-label classification model for oil pump management
CN112819177A (en) * 2021-01-26 2021-05-18 支付宝(杭州)信息技术有限公司 Personalized privacy protection learning method, device and equipment
CN113179244A (en) * 2021-03-10 2021-07-27 上海大学 Federal deep network behavior feature modeling method for industrial internet boundary safety
CN112949837A (en) * 2021-04-13 2021-06-11 中国人民武装警察部队警官学院 Target recognition federal deep learning method based on trusted network
CN112949837B (en) * 2021-04-13 2022-11-11 中国人民武装警察部队警官学院 Target recognition federal deep learning method based on trusted network
CN113095513A (en) * 2021-04-25 2021-07-09 中山大学 Double-layer fair federal learning method, device and storage medium
CN113490184A (en) * 2021-05-10 2021-10-08 北京科技大学 Smart factory-oriented random access resource optimization method and device
CN115145966A (en) * 2022-09-05 2022-10-04 山东省计算中心(国家超级计算济南中心) Comparison federal learning method and system for heterogeneous data
CN117313902A (en) * 2023-11-30 2023-12-29 北京航空航天大学 Signal game-based vehicle formation asynchronous federal learning method
CN117313902B (en) * 2023-11-30 2024-02-06 北京航空航天大学 Signal game-based vehicle formation asynchronous federal learning method

Similar Documents

Publication Publication Date Title
CN111798002A (en) Local model proportion controllable federated learning global model aggregation method
US11842172B2 (en) Graphical user interface to an artificial intelligence engine utilized to generate one or more trained artificial intelligence models
US11868896B2 (en) Interface for working with simulations on premises
EP3913545A2 (en) Method and apparatus for updating parameter of multi-task model, and electronic device
US11836650B2 (en) Artificial intelligence engine for mixing and enhancing features from one or more trained pre-existing machine-learning models
WO2022068627A1 (en) Data processing method and related device
Yu From information networking to intelligence networking: Motivations, scenarios, and challenges
CN110929114A (en) Tracking digital dialog states and generating responses using dynamic memory networks
US20180307779A1 (en) Interpreting human-robot instructions
WO2023202511A1 (en) Data processing method, neural network training method and related device
Khanam et al. Artificial intelligence surpassing human intelligence: factual or hoax
US20200342347A1 (en) Machine learning quantum algorithm validator
KR20230171040A (en) Task and process mining by robotic process automation across computing environments
WO2023022823A1 (en) Automated generation of predictive insights classifying user activity
Xu et al. Effectiveness of English online learning based on deep learning
WO2023174189A1 (en) Method and apparatus for classifying nodes of graph network model, and device and storage medium
US20200257980A1 (en) Training optimization for neural networks with batch norm layers
Gundakanal et al. Intelligent libraries: New horizons with artificial intelligence
CN114445692B (en) Image recognition model construction method and device, computer equipment and storage medium
CN112036546B (en) Sequence processing method and related equipment
Suma Priya et al. A Review on the Importance of Machine Learning and Artificial Intelligence in Real Life Problem Solving
US20240020553A1 (en) Interactive electronic device for performing functions of providing responses to questions from users and real-time conversation with the users using models learned by deep learning technique and operating method thereof
Shadadi et al. Hierarchical Parallel Processing for Data Clustering in GPU Using Deep Nearest Neighbor Searching.
El-Hadi Most In-Demand Artificial Intelligence Skills to Learn in 2022
Benhamou Machine Learning in Banking-Tips & Tricks: Session 1-General introduction

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20201020

WD01 Invention patent application deemed withdrawn after publication