CN111798002A

CN111798002A - Local model proportion controllable federated learning global model aggregation method

Info

Publication number: CN111798002A
Application number: CN202010482489.6A
Authority: CN
Inventors: 王瑞; 郑劭兵
Original assignee: University of Science and Technology Beijing USTB
Current assignee: University of Science and Technology Beijing USTB
Priority date: 2020-05-31
Filing date: 2020-05-31
Publication date: 2020-10-20

Abstract

The invention discloses a federal learning global model polymerization method with controllable local model proportion, and belongs to the field of artificial intelligence of the Internet of things. The method comprises the following steps: the method comprises the following steps: a federal learning system is constructed in the same-domain Internet of things equipment, and the system is divided into two main parts: the global end of the federal learning system and the local end of the internet of things of the federal learning system. The technical scheme of the invention ensures the controllability of the proportion of the local end model of the Internet of things participating in the global model aggregation in the federal learning process, and has great practical significance for improving the model precision of the federal learning global model. The invention is applied to the field of artificial intelligence and data islands.

Description

Local model proportion controllable federated learning global model aggregation method

Technical Field

The invention belongs to the field of artificial intelligence of the Internet of things and relates to federal learning knowledge. In particular to the optimization of a federated learning global model aggregation method. Aiming at the defects of the original federal learning global model polymerization method, the federal learning global model polymerization method with controllable local model proportion is designed and invented.

Background

The rapid development of the internet of things causes the generation of a large amount of data of the internet of things, and the problem of obtaining information in massive data of the internet of things to assist people in production becomes urgent to be solved. Deep learning is used as a powerful data analysis tool, can mine the internet of things data generated and collected in various complex environments, and becomes a very promising method for mining the data information of the internet of things. However, the internet of things devices are distributed in various fields of social production and life, and data generated by the devices are distributed in a plurality of internet of things devices and separated, so that useful information with a certain scale is difficult to form, and the problem is called a data island problem. The problem of data island seriously restricts the utilization of the deep learning method to the data of the Internet of things.

To cope with the data islanding problem google proposed a federal learning method in 2016. Federated learning may be viewed as an encrypted distributed learning technique. The distributed learning algorithm enables the distributed learning algorithm to obtain local update models from a multi-party local end and then aggregate to form a global model, and the encryption technology ensures complete privacy of an intermediate process. The proposal and application of federal learning enable the problem of data island to be effectively solved. The method is a current research hotspot by deeply mining useful information in the data of the Internet of things by means of federal learning so as to guide social life and production.

Disclosure of Invention

The invention provides a federal learning global model polymerization method with controllable local model proportion, and federal learning is an effective method for solving the problem of 'data islands' in the field of artificial intelligence. The existing federal learning adopts a mode of adding and averaging local models to obtain a global model, and the mode has simple operation and low complexity. However, the new global model is used to completely replace the original global model, the historical state of the global model is lost, and once the performance of the new global model is poor, the performance of the whole federal learning system is poor.

In order to solve the problems, the invention provides a global model aggregation method with controllable local model proportion. The specific reasoning steps are as follows:

the method comprises the following steps: the encryption scheme in the federal learning is assumed to be enough to protect the data privacy of all internet of things clients in the same domain from being invaded, and meanwhile, the local model generated by the local end of the internet of things can be guaranteed not to be attacked from the outside in the process of carrying out model aggregation transmission. Only under the premise, the client of the internet of things can be added into the federal learning system with care.

The method is characterized in that a limited number n of local ends of the Internet of things exist in a federal learning system in the same working domain, and the local ends of the Internet of things are agreed to be added into the federal learning process, namely, the local ends of the Internet of things are agreed to contribute data to carry out model training. Then we will note the ith local end of the internet of things in this working domain as IOT-i, and the sample set generated at the local end IOT-i can be written in the form shown below

The number of samples at the IOT local end IOT-i is represented by Ni, and Ni can be represented as N_i＝||Ω_iL. The total amount of samples in the entire federated learning system may be recorded as follows

Step two: according to the operation mechanism setting of federal learning, each local end of the Internet of things needs to use own data to train a local model, so that a cost function for training the local model of any local end IOT-i of the Internet of things is defined as

Then the global cost function of the entire federated learning system for the same work domain can be written as

The final purpose of the federal learning is to enable all local models formed by training the local end of the internet of things participating in the federal learning in the same working domain to be aggregated to form an optimal global model, namely, a loss function of the global model reaches the minimum value. Namely, the following formula is satisfied

The global model weight of (2).

Step three: the updating formulas of the weights and the offsets of all local ends of the internet of things participating in the federal learning in the same working domain can be written as follows, wherein R represents the communication turns of the local ends of the internet of things and the global ends of the federal learning, and eta represents the learning rate of the local ends of the internet of things.

Step four: the main innovation point of the invention is that the aggregation mode of the global model for federal learning is improved, and the proportion control parameter rho of the local end model is added in the aggregation process, so that the local model generated by the training of the local end of the internet of things can be proportionally controlled when being integrated into the global model, thereby optimizing the global model of the whole federal learning system, and the specific derivation process is as follows:

the global model so derived is the final solution, i.e.

Drawings

FIG. 1 is a schematic diagram of a federated learning build mechanism;

FIG. 2 is a diagram of a neural network structure for training a local end initial model of the Internet of things;

FIG. 3 is a federated learning flow diagram;

table 1 Non-IID data a proportional table was constructed.

Detailed Description

1. Design of experiments

Typically 127.0.0.1 is the IP address that the system leaves native for loopback testing. The invention uses the 4000 port of the local machine 127.0.0.1 as the global end of the federal learning system in the same domain, and is mainly responsible for two tasks of initializing the original model and aggregating the global model. A process is one execution activity of a program on a computer. When a program is run, a process is started. All processes that are used to perform the various functions of the operating system are system processes, while all processes that are started are user processes. Similarly, the multi-process means that a computer executes a plurality of processes simultaneously, generally, a plurality of software are run simultaneously, and the invention simulates all local terminals of the internet of things in the same domain by adopting a multi-process running mode, namely, each process is used for simulating one local terminal of the internet of things, and is responsible for deep learning training by using local data and uploading a local terminal model obtained by training. The communication between each local end of the Internet of things and the global end of the Federal learning system is simulated by the communication between each process and the 4000 port of the local machine 127.0.0.1. The specific design is shown in figure 1:

the experiment is realized by adopting python language programming, and multiprocessing library is called to realize multiprocess. Because a plurality of subprocesses are required to be started to represent a plurality of internet of things clients, the invention adopts a process pool mode to create the subprocesses in batch, and the specific creation process is as follows:

since the client and the global end of the federal learning system in the same domain need to communicate to realize the aggregation of the global model, a communication mechanism between processes is involved. The multiprocessing module of python packages a bottom layer mechanism for interprocess communication and provides various modes of pies, queue and the like for exchanging data. Due to the limitation of the note performance, the experiment sets five local ends of the internet of things, namely five processes, in the federal learning system in the same domain. Because of the communication required between each internet of things client, the identity of each process within the federated learning system must be unique and certain. According to the method and the device, each process has the unique id value by calling the uuid library, and the fact that chaos cannot occur when communication is carried out among the processes is guaranteed.

2. Initial model

The experiment designs a convolution neural network to train an initial model of the global terminal of the federal learning system in the same domain. The advantage of the neural network has a self-learning function. For example, when image recognition is realized, as long as the image sample with the label is input into the artificial neural network in advance, the neural network can slowly learn to recognize similar images through the self-learning function, and the purpose of automatic recognition is achieved. The self-learning function has particularly important significance for prediction, and the future neural network can provide great help for human beings in economic prediction, market prediction, benefit prediction and the like. Meanwhile, the neural network has the capability of searching an optimal solution at a high speed, the optimal solution of a complex problem is searched, a large amount of calculation is usually needed, and the high-speed calculation capability of a computer is exerted by utilizing a feedback type artificial neural network designed aiming at a certain problem, so that the optimal solution can be quickly found.

The convolutional neural network is constructed by using a keras artificial neural network library, and the keras can be used as a high-level application program interface of tensoflow, microsoft-CNTK and theta and can be converted into components under systems such as tensoflow, microsoft-CNTK and the like according to background setting. The convolutional neural network consists of two convolutional layers, a maximum value pooling layer and two full-connection layers, wherein the two dropout layers are added in the network and used for randomly disconnecting partial connection, so that a model is prevented from getting into overfitting, and the connection between the convolutional layers and the full-connection layers is realized by using a flatten layer. The specific structure is shown in fig. 2:

3. data pre-processing

The mnist dataset was used for this experiment. Because most of data generated by the internet of things equipment are Non-IID data, the original mnist data set needs to be divided and preprocessed. The m nist data set is internally provided with 10 types of data with labels of 0-9, the data set is distributed to local ends of the Internet of things according to the label proportion shown in the table 1, the proportion is randomly generated, and two digits after the decimal point is rounded up. Wherein each local side is allocated with 1200 data as a training set and 200 samples as a verification set

TABLE 1 construction of proportional Table for Non-IID data

4. Experiment platform

The experimental platform of the experiment is built on a 64-bit windows10 personal notebook, and the equipment comprises an intel (R) core (TM) i7-8750H cpu @2.20GHz processor, an 8GB RAM and an Nvidia1060 independent display card. The device carries out CUDA configuration so as to ensure that the GPU can be used for operation. Meanwhile, a GPU acceleration library cuDNN for a deep neural network is installed and configured to reduce memory overhead and improve the performance of the whole system.

5. Experimental procedure

In order to make the technical problems, technical solutions and advantages to be solved by the present invention clearer, the following is an operation flow chart of the present invention.

The existing federal learning adopts a mode of adding and averaging local models to obtain a global model, and the mode has simple operation and low complexity. However, the new global model is used to completely replace the original global model, the historical state of the global model is lost, and once the performance of the new global model is poor, the performance of the whole federal learning system is poor. The method improves the defects, ensures the controllability of the proportion of the local end model of the Internet of things participating in the global model aggregation in the federal learning process, and has great practical significance for improving the model precision of the federal learning global model.

It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims

1. A federal learning global model aggregation method with controllable local model proportion is characterized by comprising the following steps: a federal learning system is constructed in the same-domain Internet of things equipment, and the system is divided into two main parts: the global end of the federal learning system and the local end of the internet of things of the federal learning system.

2. The method of claim 1, wherein the step of building a federal learning system includes:

firstly, a 4000 port of a local machine 127.0.0.1 is used as a global end of a federal learning system in the same domain to be responsible for two tasks of initializing an original model and aggregating a global model;

secondly, the multi-process means that a computer executes a plurality of processes at the same time, generally, a plurality of software are simultaneously operated, and the invention adopts a multi-process operation mode to simulate all local terminals of the internet of things in the same domain, namely, each process is used for simulating one local terminal of the internet of things, is responsible for deep learning training by using local data and uploads a local terminal model obtained by training;

and finally, communication between each process and a 4000 port of the local machine 127.0.0.1 is carried out, and communication between each local end of the internet of things and a global end of federal learning in the federal learning system is simulated.

3. The method according to claim 2, wherein the working steps of the federal learning global model aggregation method with controllable local model proportions comprise:

the method comprises the following steps: assuming that a limited number n of local ends of the Internet of things exist in a federal learning system in the same working domain, and agreeing to be added into the federal learning process, namely agreeing to contribute data thereof to carry out model training on the local ends; then we will note the ith local end of the internet of things in this working domain as IOT-i, and the sample set generated at the local end IOT-i can be written in the form shown below

The number of samples at the IOT local end IOT-i is represented by Ni, and Ni can be represented as N_i＝||Ω_iL; the total amount of samples in the entire federal learning system can be recorded as follows

Step two: defining a cost function of local model training of any local end IOT-i of the Internet of things as

The final purpose of the federal learning is to enable all local models formed by training the local ends of the internet of things participating in the federal learning in the same working domain to be aggregated to form an optimal global model, namely, a loss function of the global model reaches the minimum value; namely, the following formula is satisfied

Global model weight of (2);

step three: updating formulas of weights and biases of all local ends of the internet of things participating in federal learning in the same working domain can be written as follows, wherein R represents communication turns of the local ends of the internet of things and the global ends of federal learning, and eta represents learning rate of the local ends of the internet of things;

step four: adding a local end model proportion control parameter rho in the aggregation process of the global model, so that the local model generated by the training of the local end of the Internet of things can be controlled in proportion when being integrated into the global model, thereby optimizing the global model of the whole federal learning system, wherein the specific derivation process is as follows:

the final result is

4. The method as claimed in claim 1, wherein in a global model aggregation process of a same-domain federal learning system, a proportion control parameter for an internet of things local end model is added, so that a proportion of the internet of things local end model participating in global model aggregation in the federal learning process can be controlled.