CN113177645B - Federated learning method and device, computing device and storage medium - Google Patents

Federated learning method and device, computing device and storage medium

Info

Publication number
CN113177645B
CN113177645B
Authority
CN
China
Prior art keywords: computing device, computing, local, computing unit, update information
Legal status
Active
Application number
CN202110726072.4A
Other languages
Chinese (zh)
Other versions
CN113177645A
Inventor
程勇
蒋杰
刘煜宏
陈鹏
陶阳宇
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202110726072.4A
Publication of CN113177645A
Application granted
Publication of CN113177645B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning

Abstract

The embodiments of this application disclose a federated learning method and apparatus, a computing device, and a storage medium, belonging to the field of computer technology. The method includes the following steps: a second computing device trains a federated learning model in parallel through a plurality of computing unit groups; in response to a target computing unit group obtaining local update information of the federated learning model, the second computing device sends the local update information to a first computing device; the first computing device receives local update information sent by at least one second computing device, determines global update information according to the received local update information, and sends the global update information to the at least one second computing device; and the second computing device updates the federated learning model based on the global update information. Because the local update information is sent on a per-computing-unit-group basis, there is no need to wait for all computing unit groups, so a second computing device can participate in the global training of the federated learning model in time, which improves federated learning efficiency.

Description

Federated learning method and device, computing device and storage medium
Technical Field
The embodiments of this application relate to the field of computer technology, and in particular to a federated learning method and apparatus, a computing device, and a storage medium.
Background
With the development of computer technology and advances in artificial intelligence, federated learning has gradually become a popular topic. Federated learning completes the training of machine learning and deep learning models through multi-party collaboration, solving the problem of data silos while protecting user privacy and data security.
Federated learning includes horizontal federated learning, in which a plurality of computing devices jointly train the same model using their respective sample data. However, because the computing devices train at different speeds, a device with a slow training speed can reduce the overall training efficiency, which makes efficient horizontal federated learning difficult to support.
Disclosure of Invention
The embodiments of this application provide a federated learning method and apparatus, a computing device, and a storage medium, which can improve the efficiency of federated learning. The technical solutions are as follows.
In one aspect, a federated learning method is provided, applied to a model training system including a first computing device and a plurality of second computing devices, and the method includes:
the second computing device trains a federated learning model in parallel through a plurality of computing unit groups of the second computing device, wherein each computing unit group includes at least one computing unit;
in response to a target computing unit group at the local end obtaining local update information of the federated learning model, the second computing device sends the local update information of the target computing unit group to the first computing device, wherein the local update information is used for updating the federated learning model, and the target computing unit group is any one of the plurality of computing unit groups;
the first computing device receives local update information sent by at least one second computing device, determines global update information according to the received local update information, and sends the global update information to the at least one second computing device; and
the second computing device receives the global update information and updates the local federated learning model according to the global update information.
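The four steps above amount to a group-level exchange between the second computing devices and the first computing device. The following is a minimal, hypothetical Python sketch of that exchange; the names (LocalUpdate, ParameterServer, wait_for) and the plain averaging used for fusion are illustrative assumptions, not part of the claimed method.

```python
# Hypothetical sketch of the group-wise exchange; all names are assumptions.
from dataclasses import dataclass
from typing import List

@dataclass
class LocalUpdate:
    device_id: int        # which second computing device produced it
    group_id: int         # which computing unit group produced it
    values: List[float]   # e.g. model parameters or gradients

class ParameterServer:    # plays the role of the first computing device
    def __init__(self, wait_for: int):
        self.wait_for = wait_for               # number of distinct second devices to wait for
        self.pending: List[LocalUpdate] = []

    def receive(self, update: LocalUpdate):
        self.pending.append(update)
        if len({u.device_id for u in self.pending}) >= self.wait_for:
            return self.aggregate()            # global update information
        return None                            # keep waiting

    def aggregate(self) -> List[float]:
        n = len(self.pending)
        dim = len(self.pending[0].values)
        global_update = [sum(u.values[i] for u in self.pending) / n for i in range(dim)]
        self.pending.clear()
        return global_update

server = ParameterServer(wait_for=2)
server.receive(LocalUpdate(device_id=1, group_id=0, values=[0.1, 0.2]))          # None: keep waiting
print(server.receive(LocalUpdate(device_id=2, group_id=1, values=[0.3, 0.4])))   # [0.2, 0.3]
```

In this sketch the server aggregates as soon as updates from wait_for distinct second devices have arrived, mirroring the "target number of second computing devices" described in the optional implementations below.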
Optionally, the receiving, by the first computing device, local update information sent by at least one second computing device, determining global update information according to the received local update information, and sending the global update information to the at least one second computing device includes:
the first computing device receives local update information sent by at least one second computing device, determines global update information according to the received local update information, and sends the global update information to each second computing device; or,
the first computing device receives local updating information sent by a plurality of second computing devices, determines global updating information according to the received local updating information, and sends the global updating information to the plurality of second computing devices.
In another aspect, a method for federated learning is provided, which is applied to a second computing device, and includes:
training a federated learning model in parallel through a plurality of computing unit groups, wherein each computing unit group comprises at least one computing unit;
in response to a target computing unit group obtaining local update information of the federated learning model, sending the local update information of the target computing unit group to a first computing device, wherein the local update information is used for updating the federated learning model, and the first computing device is configured to receive local update information sent by at least one second computing device, determine global update information according to the received local update information, and send the global update information to the at least one second computing device;
and receiving the global updating information, and updating the federal learning model according to the global updating information.
In another aspect, a federated learning apparatus is provided and applied to a model training system, where the model training system includes a first computing device and a plurality of second computing devices, the apparatus includes:
the model training module is used for training the federated learning model in parallel through a plurality of computing unit groups at the home terminal, and each computing unit group comprises at least one computing unit;
a local information sending module, configured to, in response to a target computing unit group at the local end obtaining local update information of the federated learning model, send the local update information of the target computing unit group to the first computing device, where the local update information is used for updating the federated learning model, and the target computing unit group is any one of the plurality of computing unit groups;
the global information sending module is used for receiving local update information sent by at least one second computing device, determining global update information according to the received local update information, and sending the global update information to the at least one second computing device;
and the model training module is also used for receiving the global updating information and updating the local federal learning model according to the global updating information.
Optionally, the local information sending module includes:
a first information sending unit, configured to send, to the first computing device, local update information of each computing unit in the target computing unit group in response to each computing unit in the target computing unit group obtaining the local update information of the federated learning model, in a case where the target computing unit group includes a plurality of computing units.
Optionally, the local information sending module includes:
a first fusion unit, configured to, in a case where the target computing unit group includes a plurality of computing units, fuse the local update information of each computing unit in the target computing unit group in response to each computing unit in the target computing unit group obtaining the local update information of the federated learning model, to obtain fused update information;
a second information sending unit, configured to send the fusion update information to the first computing device.
Optionally, the global information sending module includes:
the global information determining unit is used for receiving the local updating information sent by the second computing equipment with the target quantity and determining the global updating information according to the received local updating information;
a third information sending unit, configured to send the global update information to each second computing device.
Optionally, the global information sending module includes:
the round determining unit is used for receiving local updating information sent by the second computing devices with the target number and determining the difference of iteration rounds, wherein the difference of the iteration rounds is the difference between the maximum iteration round and the minimum iteration round in each second computing device;
the global information determining unit is used for determining the global updating information according to the received local updating information under the condition that the difference of the iteration rounds is not larger than a target threshold value;
a third information sending unit, configured to send the global update information to at least one second computing device.
Optionally, the global information determining unit is further configured to receive, when the difference between the iteration rounds is greater than the target threshold, local update information of each computing unit sent by each second computing device, and determine the global update information according to the received local update information;
the third information sending unit is further configured to send the global update information to each of the second computing devices.
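As a rough illustration of the gating performed by the round determining unit and the global information determining unit, the sketch below checks whether the spread between the fastest and the slowest second computing device stays within the target threshold before aggregation is allowed; the function name, the dictionary bookkeeping, and the example values are assumptions.

```python
# Sketch of the iteration-round gate: aggregate only when the spread between the
# fastest and slowest second computing device does not exceed the target threshold.
def should_aggregate(rounds_per_device, target_threshold):
    # rounds_per_device maps a second-computing-device id to its current iteration round
    spread = max(rounds_per_device.values()) - min(rounds_per_device.values())
    return spread <= target_threshold

# Three second devices at iteration rounds 12, 10 and 11 with a threshold of 2: aggregate now.
print(should_aggregate({"dev1": 12, "dev2": 10, "dev3": 11}, target_threshold=2))  # True
```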
Optionally, the model training module includes:
the sample data dividing unit is used for dividing the sample data into n sample data sets under the condition that the second computing device comprises n computing units, wherein each sample data set comprises at least one sample data, and n is an integer greater than 1;
a sample data allocation unit for allocating the n sample data sets to the n calculation units;
and the model training unit is used for training the federated learning model in parallel through each computing unit in the plurality of computing unit groups based on the sample data set distributed to the computing units.
Optionally, the model training unit is configured to:
determining at least one target computing unit among the n computing units;
splitting the sample data set distributed to the target computing unit to obtain a plurality of sample data groups, wherein each sample data group comprises at least one sample data;
and performing iterative training on the federated learning model respectively based on each sample data group in the plurality of sample data groups through the target computing unit.
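A minimal sketch of the splitting performed on the target computing unit's sample data set follows; group_size is an assumed parameter, since the text only requires that each sample data group contain at least one sample.

```python
# Hypothetical sketch: split the target computing unit's sample data set into
# several smaller sample data groups (contiguous split, assumed group size).
def split_into_groups(sample_set, group_size):
    return [sample_set[i:i + group_size] for i in range(0, len(sample_set), group_size)]

print(split_into_groups(list(range(10)), group_size=4))
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]  -> every group holds at least one sample
```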
Optionally, the local information sending module includes:
a first link sending unit, configured to, in response to each computing unit in a target computing unit group at the local end obtaining local update information of the federated learning model, send the local update information of the target computing unit group to the first computing device through a network link between the second computing device and the first computing device; or,
a second link sending unit, configured to, in response to each computing unit in a target computing unit group at the local end obtaining local update information of the federated learning model, send the local update information of each computing unit to the first computing device through a network link between that computing unit and the first computing device; or,
a third link sending unit, configured to, in response to each computing unit in a target computing unit group at the local end obtaining local update information of the federated learning model, send the local update information of the target computing unit group to the first computing device through a network link between the target computing unit group and the first computing device.
Optionally, the global information sending module is configured to:
receiving local update information sent by at least one second computing device, determining global update information according to the received local update information, and sending the global update information to each second computing device; or,
receiving local update information sent by a plurality of second computing devices, determining global update information according to the received local update information, and sending the global update information to the plurality of second computing devices.
Optionally, the global information sending module includes:
and the second fusion unit is used for receiving the local update information sent by the second computing devices and fusing the received local update information to obtain the global update information.
In another aspect, an apparatus for federated learning is provided, which is applied to a second computing device, the apparatus including:
the model training module is used for training the federated learning model in parallel through a plurality of computing unit groups, and each computing unit group comprises at least one computing unit;
the local information sending module is configured to, in response to a target computing unit group obtaining local update information of the federated learning model, send the local update information of the target computing unit group to a first computing device, where the local update information is used for updating the federated learning model, and the first computing device is configured to receive local update information sent by at least one second computing device, determine global update information according to the received local update information, and send the global update information to the at least one second computing device;
and the model training module is also used for receiving the global updating information and updating the federal learning model according to the global updating information.
In another aspect, a computing device is provided, including a processor and a memory, where at least one computer program is stored in the memory, and the at least one computer program is loaded and executed by the processor to perform the operations performed in the federated learning method of the above aspect.
In another aspect, a computer-readable storage medium is provided, in which at least one computer program is stored, and the at least one computer program is loaded and executed by a processor to perform the operations performed in the federated learning method of the above aspects.
In another aspect, a computer program product or a computer program is provided, including computer program code stored in a computer-readable storage medium; a processor of a computing device reads the computer program code from the computer-readable storage medium and executes it, so that the computing device implements the operations performed in the federated learning method of the above aspects.
According to the method, the apparatus, the computing device and the storage medium provided by the embodiments of this application, the second computing device includes a plurality of computing unit groups; when any one of the computing unit groups in the second computing device obtains local update information by training the federated learning model, that local update information can be sent to the first computing device, and the first computing device determines global update information according to the local update information. That is, the second computing device sends local update information to the first computing device on a per-computing-unit-group basis, without waiting for all computing unit groups to obtain their local update information, so the second computing device can send local update information in time and thus participate in the global training process of the federated learning model in time; in turn, the first computing device can also determine the global update information in time, which helps improve the efficiency of the overall federated learning.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of this application, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of this application, and other drawings can be obtained by those skilled in the art from these drawings without creative effort.
FIG. 1 is a schematic diagram of sample data provided by an embodiment of the present application;
FIG. 2 is an architecture diagram of a model training system provided in an embodiment of the present application;
FIG. 3 is a flowchart of a method for federated learning according to an embodiment of the present application;
FIG. 4 is a flowchart of a method for federated learning according to an embodiment of the present application;
fig. 5 is a schematic diagram of a parameter information transmission method according to an embodiment of the present application;
fig. 6 is a schematic diagram of a parameter information transmission method according to an embodiment of the present application;
FIG. 7 is a flowchart of a method for federated learning according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a federated learning apparatus according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of another federated learning apparatus provided in an embodiment of the present application;
FIG. 10 is a schematic structural diagram of another federated learning apparatus provided in an embodiment of the present application;
fig. 11 is a schematic structural diagram of a terminal according to an embodiment of the present application;
fig. 12 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present application more clear, the embodiments of the present application will be further described in detail with reference to the accompanying drawings.
It will be understood that the terms "first," "second," and the like used herein may describe various concepts, but these concepts are not limited by the terms unless otherwise specified. These terms are only used to distinguish one concept from another. For example, a first computing device may be referred to as a second computing device, and similarly, a second computing device may be referred to as a first computing device, without departing from the scope of the present application.
In the embodiments of this application, "at least one" means one or more; for example, at least one computing unit may be any integer number of computing units greater than or equal to one, such as one computing unit, two computing units, or three computing units. "A plurality" means two or more; for example, a plurality of computing units may be any integer number of computing units greater than or equal to two, such as two computing units or three computing units. "Each" means every one of at least one; for example, each computing unit means every one of a plurality of computing units, and if the plurality of computing units is three computing units, each computing unit means every one of those three computing units.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, machine learning/deep learning, autonomous driving, and intelligent transportation.
Machine Learning (ML) is a multi-disciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specializes in studying how computers simulate or implement human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve their performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied across all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and teaching learning.
With the research and progress of artificial intelligence technology, the artificial intelligence technology is developed and researched in a plurality of fields, such as common smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical services, smart customer service, internet of vehicles, automatic driving, smart traffic and the like.
Cloud Technology refers to a hosting Technology for unifying resources of hardware, software, network and the like in a wide area network or a local area network to realize calculation, storage, processing and sharing of data. Cloud computing refers to a mode of delivery and use of IT (Internet Technology) infrastructure, and refers to obtaining required resources through a network in an on-demand and easily extensible manner; the generalized cloud computing refers to a delivery and use mode of a service, and refers to obtaining a required service in an on-demand and easily-extensible manner through a network. Such services may be IT and software, internet related, or other services. Cloud Computing is a product of development and fusion of traditional computers and Network Technologies, such as Grid Computing (Grid Computing), Distributed Computing (Distributed Computing), Parallel Computing (Parallel Computing), Utility Computing (Utility Computing), Network Storage (Network Storage Technologies), Virtualization (Virtualization), Load balancing (Load Balance), and the like. With the development of diversification of internet, real-time data stream and connecting equipment and the promotion of demands of search service, social network, mobile commerce, open collaboration and the like, cloud computing is rapidly developed. Different from the prior parallel distributed computing, the generation of cloud computing can promote the revolutionary change of the whole internet mode and the enterprise management mode in concept.
The federal learning method provided by the embodiment of the application will be explained below based on an artificial intelligence technology and a cloud technology.
To facilitate understanding of the methods provided by the embodiments of the present application, the terms referred to in the present application are explained as follows.
Federated Learning: federated learning, also called joint learning, can make data "usable but invisible" while protecting user privacy and data security; that is, the training task of a machine learning model is completed through multi-party cooperation, and inference services of the machine learning model can also be provided.
In the era of artificial intelligence, obtaining machine learning models, especially deep learning models, requires a large amount of training data. In many business scenarios, however, the training data for a model is often scattered across different business teams, departments, and even different companies. To ensure user privacy and data security, data cannot be exchanged directly between different data sources, which forms so-called data islands and hinders data cooperation and the acquisition of the large-scale data that models require. In recent years, federated learning technology has developed rapidly and provides a new solution for cross-department, cross-organization, and cross-industry data cooperation. Federated learning trains a machine learning model by making full use of data from multiple data sources while protecting user privacy and data security, and uses these diverse and complementary data sources to improve the performance of the machine learning model, for example, improving the accuracy of an advertisement recommendation model.
Unlike traditional centralized machine learning, in the federated learning process one or more machine learning models are trained cooperatively by two or more participants. In terms of classification, based on the distribution characteristics of the data, federated learning can be divided into Horizontal Federated Learning, Vertical Federated Learning, and Federated Transfer Learning. Horizontal federated learning, also called sample-based federated learning, is suitable for cases where the sample sets share the same feature space but differ in sample space; vertical federated learning, also called feature-based federated learning, is suitable for cases where the sample sets share the same sample space but differ in feature space; federated transfer learning applies to cases where the sample sets differ in both sample space and feature space.
The federated learning method provided in the embodiments of this application is a horizontal federated learning method. The application scenario of horizontal federated learning is that the sample data in each computing device participating in federated learning has the same feature space but different sample spaces. For example, as shown in fig. 1, the feature spaces of the sample data in the second computing device 1, the second computing device 2, and the second computing device 3 are all feature 1 to feature L, but the sample space in the second computing device 1 is U1-U4, the sample space in the second computing device 2 is U5-U10, and the sample space in the second computing device 3 is U11-U15. The benefit of horizontal federated learning is the ability to increase the overall amount of sample data. Its core idea is that each second computing device trains a model at the local end using its own sample data, and the models trained by the second computing devices are then fused, for example through secure model fusion (i.e., secure aggregation) based on cryptography or secret sharing, so as to obtain a global model trained jointly on the sample data of the second computing devices.
The federal learning method provided in the embodiment of the present application is applied to a model training system, which is shown in fig. 2, and the model training system includes a first computing device 201 and a plurality of second computing devices 202, where the first computing device 201 is a server or a terminal, and the second computing devices 202 are a server, a terminal, or a client running in the terminal, and the like. The server is an independent physical server, or a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud service, a cloud database, cloud computing, a cloud function, cloud storage, Network service, cloud communication, middleware service, domain name service, security service, Content Delivery Network (CDN), big data and artificial intelligence platform, and the like. The terminal is a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc., but is not limited thereto. The first computing device 201 and the second computing device 202 can be directly or indirectly connected through wired or wireless communication, and the application is not limited herein.
The first computing device 201 and each second computing device 202 have the federal learning model stored therein, and the model structure of the federal learning model stored in each computing device is the same. Each second computing device 202 also has respective sample data stored therein, with sample data in different second computing devices 202 having the same feature space and different sample spaces.
Each second computing device 202 runs a plurality of computing units, where a computing unit corresponds to one or more CPU (Central Processing Unit) cores in the second computing device 202, or to one or more GPU (Graphics Processing Unit) cores in the second computing device 202. The plurality of computing units are used for performing model training tasks, for example computing unit 1 and computing unit 2 running in the first second computing device 202, computing unit 3 and computing unit 4 running in the second second computing device 202, and computing unit 5, computing unit 6 and computing unit 7 running in the last second computing device 202. Each second computing device 202 trains the federated learning model in parallel through the computing units at the local end based on the sample data at the local end, and sends the training result of a certain computing unit to the first computing device 201; the first computing device 201 fuses the training results of the multiple second computing devices 202, and the federated learning model on each second computing device 202 is updated according to the fusion result.
For example, each second computing device 202 sends the local update information, obtained by a certain computing unit at the local end by training the federated learning model, to the first computing device 201. As shown in FIG. 2, the first second computing device 202 sends the local update information of computing unit 1 and the local update information of computing unit 2 to the first computing device 201; the second second computing device 202 sends the local update information of computing unit 3 and the local update information of computing unit 4 to the first computing device 201; and the last second computing device 202 sends the local update information of computing unit 5, computing unit 6, and computing unit 7 to the first computing device 201.
In one possible implementation, the federated learning system is applied to a blockchain, and the first computing device 201 and the second computing devices 202 are different nodes in the blockchain, with each node storing its corresponding sample data. Each node can store the trained federated learning model in the blockchain, after which the model can be used by that node or by nodes corresponding to other devices in the blockchain.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism and an encryption algorithm. The blockchain is essentially a decentralized database, which is a string of data blocks associated by using cryptography, each data block contains information of a batch of network transactions, and the information is used for verifying the validity (anti-counterfeiting) of the information and generating the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, and an application services layer. Each block comprises a hash value of the block storage transaction record (hash value of the block) and a hash value of the previous block, and the blocks are connected through the hash values to form a block chain. The block may include information such as a time stamp at the time of block generation.
Taking a distributed system as an example of a blockchain system, the blockchain system is formed by a plurality of nodes (computing devices in any form in an access network, such as servers and user terminals) and a client, a Peer-To-Peer (P2P, Peer To Peer) network is formed between the nodes, and the P2P Protocol is an application layer Protocol operating on top of a Transmission Control Protocol (TCP). In a distributed system, any machine, such as a server or a terminal, can join to become a node, and the node comprises a hardware layer, a middle layer, an operating system layer and an application layer.
The method provided by the embodiment of the application can be used in any scene of horizontal federal learning, and can support the training of any type of machine learning model or deep learning model, and each computing device can adopt any distributed model training method and any type of model training framework at the home terminal.
For example, in the context of assisted interrogation in the medical field: the first computing device is a federal server for federal learning, the plurality of second computing devices are application servers belonging to different hospitals, and sample data corresponding to different patients are stored in each application server, for example, the sample data is medical history information or complaint information of the patients. Each application server adopts the method provided by the embodiment of the application, the disease prediction model is trained in parallel based on sample data of the local terminal through each computing unit of the local terminal, the training result of a certain computing unit is sent to the federal server, the training results of the plurality of application servers are fused by the federal server, and the disease prediction model on each application server is updated according to the fusion result. And then, the disease of the user is predicted based on the trained disease prediction model, so that a doctor determines the disease possibly suffered by the user according to the disease prediction result and other information of the user, and an auxiliary decision is provided in the clinical diagnosis process of the doctor.
For example, in an item recommendation scenario: the first computing device is a federal server for federal learning, the plurality of second computing devices are application servers corresponding to different shopping applications, and sample data corresponding to different user identifications are stored in each application server, wherein the sample data comprises historical shopping information, historical browsing information or historical clicking information and the like corresponding to the user identifications. Each application server adopts the method provided by the embodiment of the application, the classification model is trained in parallel based on sample data of the local terminal through each computing unit of the local terminal, the training result of a certain computing unit is sent to the federal server, the federal server fuses the training results of a plurality of application servers, and the classification model on each application server is updated according to the fusion result. The user is then classified based on the trained classification model for subsequent recommendation of items that meet the user's preferences based on the category to which the user belongs.
Training a model with the federated learning method requires multiple iterations; the embodiments of this application take any one iteration as an example to describe the training process of the model.
Fig. 3 is a flowchart of a federated learning method provided in the embodiment of the present application. The embodiment of the present application is applied to the model training system shown in fig. 2, where the model training system includes a first computing device and a plurality of second computing devices, and the embodiment of the present application only takes one interaction process between any one of the second computing devices and the first computing device as an example to describe a training process of a model, referring to fig. 3, and the method includes the following steps.
301. The second computing device trains the federated learning model in parallel through a plurality of computing unit groups at the local end.
Each second computing device has a plurality of computing unit groups running therein, each computing unit group including at least one computing unit, and the computing units are used for performing model training tasks. A computing unit corresponds to one or more CPU cores in the second computing device, or to one or more GPU cores in the second computing device. Each second computing device stores a federated learning model; the structures of these federated learning models are the same, and the federated learning model may be any type of model performing any task.
For any second computing device, the second computing device trains the federated learning model at the local end in parallel through the plurality of computing unit groups at the local end. Because each computing unit group includes at least one computing unit, the second computing device trains the federated learning model at the local end in parallel through each computing unit in each computing unit group at the local end.
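To make step 301 concrete, the sketch below drives one illustrative computing unit group per worker thread and hands each group its own sample shard; train_one_iteration and the group-to-shard mapping are assumptions, and a real deployment would pin groups to CPU or GPU cores rather than threads.

```python
# Hypothetical sketch: a second computing device training its computing unit
# groups in parallel and handling each group's result as soon as it is ready.
from concurrent.futures import ThreadPoolExecutor, as_completed

def train_one_iteration(group_id, samples):
    # stand-in for one local training iteration of the federated learning model
    return group_id, sum(samples) / len(samples)   # fake "local update information"

groups = {0: [1.0, 2.0], 1: [3.0, 4.0, 5.0], 2: [6.0]}   # group id -> its sample shard
with ThreadPoolExecutor(max_workers=len(groups)) as pool:
    futures = [pool.submit(train_one_iteration, gid, data) for gid, data in groups.items()]
    for future in as_completed(futures):               # no waiting for slower groups
        group_id, local_update = future.result()
        print(f"group {group_id} done, send local update {local_update} to the first computing device")
```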
302. In response to a target computing unit group at the local end obtaining local update information of the federated learning model, the second computing device sends the local update information of the target computing unit group to the first computing device.
The local update information is used for updating the federal learning model, where the local update information is obtained by the computing unit in the second computing device through training the federal learning model, for example, the local update information is a model parameter obtained by training the federal learning model, or the local update information is a gradient operator obtained by training the federal learning model, and the like, which is not limited in the embodiment of the present application.
The target computing unit group is any one of the plurality of computing unit groups. That is, when any one of the computing unit groups in the second computing device obtains local update information of the federated learning model, the second computing device sends the local update information of that computing unit group to the first computing device. In the embodiments of this application, a computing unit group includes at least one computing unit, and once any one computing unit group obtains local update information, that local update information is sent to the first computing device; in other words, local update information is sent to the first computing device on a per-computing-unit-group basis, without all computing unit groups in the second computing device having obtained their local update information.
303. The first computing device receives local updating information sent by at least one second computing device, determines global updating information according to the received local updating information, and sends the global updating information to the at least one second computing device.
The model training system includes a plurality of second computing devices, each of which executes steps 301 and 302. The first computing device receives the local update information sent by at least one second computing device, determines global update information according to the received at least one piece of local update information, and sends the global update information to the at least one second computing device.
The global update information is also used to update the federated learning model. The local update information is obtained by a computing unit group in the second computing device by training the local federated learning model, and the data on which that training is based is the sample data at the local end, so the local update information obtained by each second computing device reflects only its own sample data. If a second computing device updated its local federated learning model only according to its own local update information, the model would reference only the sample data at the local end, so the amount of information would be small and the model would not generalize well. Instead, the first computing device determines global update information according to the local update information sent by at least one second computing device and then sends the global update information to the at least one second computing device; when a second computing device subsequently updates its local federated learning model according to the global update information, the training process of the local federated learning model effectively references not only the sample data at the local end but also the sample data in other second computing devices, which increases the amount of sample information and helps improve the generalization of the model.
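Step 303 leaves the exact fusion rule open; a common choice, shown here purely as an assumption, is a sample-count-weighted average of the received local update information.

```python
# Hedged sketch of how the first computing device might fuse local update
# information into global update information; the sample-count weighting is an
# assumption, not a requirement stated in the text.
import numpy as np

def fuse(local_updates, sample_counts):
    # local_updates: one parameter (or gradient) vector per received local update
    # sample_counts: how many samples backed each local update
    weights = np.asarray(sample_counts, dtype=float)
    return np.average(np.stack(local_updates), axis=0, weights=weights)

global_update = fuse([np.array([0.1, 0.2]), np.array([0.3, 0.4])], sample_counts=[100, 300])
print(global_update)   # [0.25 0.35]
```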
304. And the second computing equipment receives the global updating information and updates the federal learning model of the local terminal according to the global updating information.
For any second computing device, the second computing device receives the global update information sent by the first computing device. The global update information reflects the local update information obtained by other second computing devices, which is equivalent to referencing the sample data in those other second computing devices, so the federated learning model at the local end is updated according to the global update information. In this way, each second computing device in the model training system can reference sample data in other second computing devices while training the federated learning model at the local end, and the federated learning models in the second computing devices eventually converge.
In the method provided by the embodiments of this application, the second computing device includes a plurality of computing unit groups; when any one of the computing unit groups in the second computing device obtains local update information by training the federated learning model, that local update information can be sent to the first computing device, and the first computing device determines global update information according to the local update information. That is, the second computing device sends local update information to the first computing device on a per-computing-unit-group basis, without waiting for all computing unit groups to obtain their local update information, so the second computing device can send local update information in time and thus participate in the global training process of the federated learning model in time; in turn, the first computing device can also determine the global update information in time, which helps improve the efficiency of the overall federated learning.
Training a model with the federated learning method requires multiple iterations; the embodiments of this application take any one iteration as an example to describe the training process of the model.
Fig. 4 is a flowchart of a federated learning method provided in the embodiment of the present application. The embodiment of the present application is applied to the model training system shown in fig. 2, where the model training system includes a first computing device and a plurality of second computing devices, and the embodiment of the present application only takes one interaction process between any one of the second computing devices and the first computing device as an example to describe a training process of a model, referring to fig. 4, and the method includes the following steps.
401. The second computing device trains the federated learning model in parallel through a plurality of computing unit groups at the local end.
Each second computing device has a plurality of computing unit groups running therein, each computing unit group including at least one computing unit. Each second computing device also stores a federal learning model and respective sample data, the model structure of the federal learning model in each second computing device is the same, and the sample data in each second computing device has the same feature space and different sample spaces. In the embodiment of the application, the plurality of second computing devices jointly train the federal learning model by adopting a horizontal federal learning mode based on respective sample data. The federated learning model stored in each second computing device may be referred to as a "local model", the local models trained by the plurality of second computing devices are fused by the first computing device, and the obtained federated learning model may be referred to as a "global model". Each local model has the same model structure, model parameters of different local models may be the same or different in each stage of model training, and after the joint training is completed, each second computing device obtains a global model with the same model structure and model parameters.
In one possible implementation, in a case where the second computing device includes n computing units, the second computing device divides the sample data into n sample data sets, each sample data set including at least one sample data, n being an integer greater than 1. The second computing device distributes the n sample data sets to the n computing units, and the federated learning model is trained in parallel through each computing unit in the plurality of computing unit sets based on the sample data sets distributed by the computing units.
In the embodiment of the application, a plurality of computing units in the second computing device adopt a parallel mode to perform distributed training, each computing unit can obtain a sample data set, each computing unit trains a local federal learning model based on the distributed sample data set in the process of one-time iterative model training, and the training processes among the computing units are not affected. The second computing device divides the sample data into n sample data sets, and the sample data sets distributed to each computing unit are different, so that the sample data sets used by each computing unit for training the federal learning model are different, the plurality of computing units train the federal learning model in a parallel mode, namely, different sample data sets are respectively used for training the federal learning model at the same time, the data volume used for training the model at the same time can be increased, and the efficiency of the second computing device for training the federal learning model is improved.
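A minimal sketch of the n-way partition described above follows; the round-robin split is an assumed strategy that simply keeps the n sample data sets close in size.

```python
# Hypothetical sketch: divide the device's sample data into n sample data sets,
# one per computing unit (round-robin split assumed).
def partition(samples, n):
    # computing unit k receives samples[k::n]
    return [samples[k::n] for k in range(n)]

shards = partition(list(range(12)), n=3)
print(shards)   # [[0, 3, 6, 9], [1, 4, 7, 10], [2, 5, 8, 11]]
```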
Optionally, the second computing device determines at least one target computing unit among the n computing units, and splits the sample data set allocated to the target computing unit to obtain a plurality of sample data groups, where each sample data group includes at least one sample. Through the target computing unit, the second computing device iteratively trains the federated learning model based on each sample data group of the plurality of sample data groups in turn.
In order to accelerate the training speed of some computing units, the second computing device determines a target computing unit among the n computing units and splits the sample data set allocated to the target computing unit into a plurality of sample data groups with smaller data sizes. In one iterative training pass, the target computing unit trains the federated learning model based on one of the sample data groups, and in the next iterative training pass it trains based on the next sample data group. That is, the target computing unit uses only one sample data group, rather than the complete sample data set, in each iterative training pass, and the sample data groups are rotated in turn across iterations, which reduces the amount of data and computation per iteration, speeds up each iteration, and allows local update information of the federated learning model to be obtained in time.
For other computing units except the target computing unit, the second computing device does not need to split the sample data set, directly trains the federated learning model based on the complete sample data set in the process of one-time iterative training through other computing units, and continues to train the federated learning model based on the complete sample data set in the process of the next iterative training through other computing units. That is, for other computing units, in the process of training each iteration, the complete sample data set is used for training.
Optionally, in the case that the second computing device includes only one computing unit, if the computing resource of the second computing device is limited, which results in a slow processing speed, the second computing device may split the sample data into a plurality of sample data groups with a small data size, and in each iterative training process, only one sample data group is used, and the plurality of sample data groups are sequentially rotated, so that the second computing device with limited computing resources may also send local update information obtained by training the federated learning model to the first computing device in time. If the computing resources of the second computing device are rich and the processing speed is high, the second computing device uses complete sample data in each iterative training process so as to improve the data volume used in each iterative training process.
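The rotation described for the target computing unit can be pictured as cycling through its sample data groups, one group per iteration; the sketch below is an assumption-level illustration.

```python
# Hypothetical sketch: a target computing unit rotating through its sample data
# groups, using one group per iteration instead of the full sample data set.
import itertools

sample_data_groups = [[0, 1], [2, 3], [4, 5]]        # produced by an earlier split
group_cycle = itertools.cycle(sample_data_groups)    # rotate through the groups in order

for iteration in range(5):
    current_group = next(group_cycle)
    # train the federated learning model on current_group only in this iteration
    print(iteration, current_group)
```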
402. In response to a target computing unit group at the local end obtaining local update information of the federated learning model, the second computing device sends the local update information of the target computing unit group to the first computing device.
The local update information is used for updating the federal learning model, and the target computing unit group is any one of the plurality of computing unit groups. Therefore, any one of the computing unit groups in the second computing device obtains the local update parameters of the federated learning model, and the second computing device sends the local update information of the computing unit group to the first computing device. The second computing device can send the local update information without waiting for all the computing unit groups to obtain the local update information, so that the second computing device can be ensured to send the local update information obtained by the computing units in the target computing unit group to the first computing device in time so as to participate in the training of the global model, and the problem that the second computing device cannot participate in the training of the global model in time due to the fact that the training speed of some computing unit groups is low is avoided.
Wherein the local update information is obtained by training a federal learning model. In a possible implementation manner, for each computing unit, in a one-time iterative training process, training is performed using a plurality of sample data, and then the computing unit obtains update information corresponding to each sample data in the one-time iterative training process, and performs weighted average on the update information corresponding to the plurality of sample data to obtain local update information in the current iterative training process. For example, the local update information is a model parameter or a gradient operator, etc.
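As an illustration of the per-iteration weighted average described above, the sketch below combines the update information of several samples into one piece of local update information; uniform weights and the toy values are assumptions.

```python
# Hypothetical sketch: combine per-sample update information from one iteration
# into a single local update via a weighted average (uniform weights assumed).
def weighted_average(updates, weights):
    total = sum(weights)
    dim = len(updates[0])
    return [sum(w * u[i] for u, w in zip(updates, weights)) / total for i in range(dim)]

per_sample_updates = [[0.2, -0.1], [0.4, 0.0], [0.0, 0.1]]
local_update = weighted_average(per_sample_updates, weights=[1.0, 1.0, 1.0])
print(local_update)   # approximately [0.2, 0.0]: the local update information for this iteration
```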
In a possible implementation manner, each computing unit group includes a plurality of computing units, the second computing device determines, in response to any one of the computing units at the local end obtaining local update information of the federal learning model, whether or not other computing units in the computing unit group in which the computing unit is located obtain the local update information, and if all the other computing units obtain the local update information, that is, each computing unit in the computing unit group obtains the local update information, the second computing device sends the local update information of the computing unit group to the first computing device. If at least one computing unit in the other computing units does not obtain the local update information, the second computing device does not send the local update information to the first computing device, and continues to wait for the other computing units which do not obtain the local update information until each computing unit in the computing unit group obtains the local update information, and then sends the local update information of the computing unit group to the first computing device.
In another possible implementation manner, each computing unit group includes one computing unit, and the second computing device sends the local update information of a computing unit to the first computing device in response to that computing unit obtaining the local update information of the federated learning model. In this case local update information is effectively sent to the first computing device with the computing unit as the unit; compared with the case that each computing unit group includes a plurality of computing units, the sending processes of the individual computing units do not affect each other, and the frequency with which the second computing device sends local update information to the first computing device can be further increased. For example, the second computing device includes n computing units, where n is an integer greater than 1; when the jth computing unit of the second computing device obtains local update information of the federated learning model, the jth computing unit sends the local update information to the first computing device, where j is a positive integer not greater than n.
In the embodiment of the application, the second computing device comprises a plurality of computing units, and the plurality of computing units are divided to obtain a plurality of computing unit groups. Since the number of computing units included in each computing unit group can be set flexibly, dividing the computing units into groups makes it possible to flexibly control how many computing units' local update information is sent to the first computing device at a time. The more computing units each computing unit group includes, the more information is transmitted to the first computing device at a time, but the lower the frequency at which local update information is sent; the fewer computing units each computing unit group includes, the higher the sending frequency, but the smaller the amount of information sent at a time. Therefore, by dividing the computing unit groups, both the frequency and the amount of the local update information sent to the first computing device can be flexibly controlled.
In another possible implementation, in a case where the target computing unit group includes a plurality of computing units, the second computing device sends the local update information of each computing unit in the target computing unit group to the first computing device in response to each computing unit in the target computing unit group getting the local update information of the federated learning model.
If the target computing unit group includes a plurality of computing units, each computing unit obtains its own local update information, so the target computing unit group contains a plurality of pieces of local update information; sending the local update information of the target computing unit group to the first computing device therefore means sending this plurality of pieces of local update information. For example, the second computing device packages the plurality of pieces of local update information and transmits them to the first computing device together.
In another possible implementation manner, in a case that the target computing unit group includes a plurality of computing units, the second computing device, in response to each computing unit in the target computing unit group obtaining local update information of the federal learning model, fuses the local update information of each computing unit in the target computing unit group to obtain fused update information; the fusion update information is sent to the first computing device.
If the target computing unit group includes a plurality of computing units, each computing unit obtains its own local update information, so the target computing unit group contains a plurality of pieces of local update information; in this implementation, sending the local update information of the target computing unit group to the first computing device means sending the fused update information obtained by fusing the plurality of pieces of local update information. For example, the second computing device fuses the plurality of pieces of local update information in a weighted average manner to obtain the fused update information.
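A minimal sketch of one such fusion, assuming each unit's update is a numeric vector and the weights are taken to be the number of samples each unit processed (the weighting scheme is an assumption, not prescribed by the embodiment).

```python
import numpy as np

def fuse_group_updates(unit_updates, unit_sample_counts):
    """Fuse the local updates of the computing units in one group into a single
    fused update, weighting each unit by the number of samples it processed."""
    weights = np.asarray(unit_sample_counts, dtype=float)
    return np.average(np.stack(unit_updates), axis=0, weights=weights)
```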
In another possible implementation manner, the second computing device sends the local update information of the target computing unit group to the first computing device through a network link between the second computing device and the first computing device in response to each computing unit in the target computing unit group of the local end obtaining the local update information of the federated learning model.
If a network link is established between the second computing device and the first computing device, each computing unit in the target computing unit group sends local update information to the first computing device through the network link.
In another possible implementation manner, the second computing device, in response to each computing unit in the target computing unit group of the local end obtaining local update information of the federated learning model, sends the local update information of each computing unit to the first computing device through a network link between each computing unit and the first computing device, respectively;
if a network link is established between each computing unit in the second computing device and the first computing device, each computing unit in the target computing unit group sends local update information to the first computing device through the respective network link.
In another possible implementation manner, the second computing device sends the local update information of the target computing unit group to the first computing device through the network link between the target computing unit group and the first computing device in response to each computing unit in the target computing unit group of the local end obtaining the local update information of the federated learning model.
If a network link is established between each computing unit group in the second computing device and the first computing device, each computing unit in the target computing unit group sends local update information to the first computing device through the network link of the target computing unit group.
In another possible implementation manner, the second computing device, in response to the target computing unit group of the local end obtaining local update information of the federated learning model, splits the local update information into a plurality of pieces of local update sub-information and sends each piece of local update sub-information to the first computing device in turn.
If the network resources of the second computing device are limited, so that its transmission speed is slow, or the federated learning model has many model parameters, the local update information can be split into a plurality of pieces of local update sub-information, and only one piece is sent at a time. The local update information is thus sent to the first computing device in batches, and part of the local update information can still reach the first computing device in time to participate in the training process of the global model.
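A small sketch of this batched sending, assuming the local update information is a flat parameter or gradient vector; send_piece is a hypothetical transmission callback and the index lets the first computing device reassemble the pieces.

```python
import numpy as np

def send_in_batches(local_update, num_pieces, send_piece):
    """Split one piece of local update information into several pieces of local
    update sub-information and send them one at a time."""
    for index, piece in enumerate(np.array_split(local_update, num_pieces)):
        send_piece(index, piece)   # the first computing device reassembles the pieces by index
```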
403. The first computing device receives local updating information sent by at least one second computing device, determines global updating information according to the received local updating information, and sends the global updating information to the at least one second computing device.
The model training system includes a plurality of second computing devices, each of which executes the steps 401 and 402, so as to send the local update information obtained by the local computing unit group to the first computing device, and therefore the first computing device receives the local update information sent by different second computing devices. The first computing device receives local updating information sent by at least one second computing device, determines global updating information according to the received at least one local updating information, and sends the global updating information to the at least one second computing device, so that the second computing device updates the federal learning model according to the global updating information.
In one possible implementation manner, the first computing device receives local update information sent by a plurality of second computing devices, and fuses the received plurality of pieces of local update information to obtain global update information. Optionally, the first computing device fuses the plurality of pieces of local update information in a federated averaging, secure aggregation or weighted average manner, so as to obtain the global update information.
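For illustration, a minimal sketch of this server-side fusion, assuming each device's local update is a numeric vector; with equal weights it reduces to plain federated averaging, while per-device weights (for example sample counts) give a weighted average.

```python
import numpy as np

def fuse_to_global_update(local_updates, device_weights=None):
    """Fuse the local updates received from several second computing devices
    into the global update information."""
    updates = np.stack(local_updates)
    if device_weights is None:
        device_weights = np.ones(len(updates))          # plain federated averaging
    return np.average(updates, axis=0,
                      weights=np.asarray(device_weights, dtype=float))
```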
In another possible implementation manner, in horizontal federated learning, the first computing device may coordinate the local update information of multiple second computing devices in a synchronous parallel (BSP) mode, an asynchronous parallel (ASP) mode, a delayed synchronous parallel (SSP) mode, or the like, so as to perform information fusion.
(1) Synchronous parallel mode: the first computing device receives the local updating information of each computing unit sent by each second computing device, and determines global updating information according to the received local updating information.
In the synchronous parallel mode, the first computing device updates the global update information after receiving the local update information of each computing unit sent by each second computing device, so the first computing device needs to wait for the local update information of each computing unit. The synchronous parallel mode can ensure the convergence of global update information, and the global update information is used for updating the federated learning model, so that the convergence of the federated learning model can be ensured.
(2) Asynchronous parallel mode: the first computing device receives the local updating information sent by the second computing devices of the target number, and determines global updating information according to the received local updating information.
In the asynchronous parallel mode, the first computing device determines the global update information after receiving local update information from a target number of second computing devices, for example a target number of 1, 5 or 10, so the first computing device does not need to wait for all the second computing devices to send local update information. The asynchronous parallel mode avoids waiting for second computing devices whose training speed is low, and can accelerate the update of the global update information, thereby shortening the training time of the federated learning model.
(3) Delayed synchronous parallel mode: the first computing device receives local update information sent by a target number of second computing devices and determines the difference of iteration rounds, where the difference of iteration rounds is the difference between the maximum iteration round and the minimum iteration round among the second computing devices. In the case that the difference of iteration rounds is not greater than a target threshold, the first computing device determines the global update information according to the received local update information and sends the global update information to at least one second computing device. Alternatively, in the case that the difference of iteration rounds is greater than the target threshold, the first computing device waits to receive the local update information of each computing unit sent by each second computing device, determines the global update information according to the received local update information, and sends the global update information to each second computing device.
The delayed synchronous parallel mode combines the synchronous parallel mode and the asynchronous parallel mode: whether synchronous or asynchronous behaviour is adopted is determined according to the difference of iteration rounds. If the difference between the maximum iterated round and the minimum iterated round is not greater than the target threshold, the asynchronous parallel mode is adopted, and the global update information is determined according to the received target number of pieces of local update information, without waiting for the computing units in the other second computing devices. If the difference between the maximum iterated round and the minimum iterated round is greater than the target threshold, the synchronous parallel mode is adopted: after receiving the target number of pieces of local update information, the first computing device continues to wait for the computing units of the other second computing devices, and only after receiving the local update information of each computing unit sent by each second computing device does it determine the global update information according to the received plurality of pieces of local update information.
The iteration rounds referred to in the embodiments of the application are the rounds each second computing device has iterated since the synchronous parallel mode was last adopted. Whenever the difference between the maximum iterated round and the minimum iterated round is greater than the target threshold, one update is performed in the synchronous parallel mode; that is, over a period of time the updates alternate between the asynchronous parallel mode and the synchronous parallel mode. Optionally, after each such synchronous update, the asynchronous parallel mode is resumed until the difference between the maximum iteration round and the minimum iteration round again exceeds the target threshold, at which point the synchronous parallel mode is used once more. The asynchronous parallel mode accelerates the training of the model, while the synchronous parallel mode improves the convergence of the model, so the delayed synchronous parallel mode balances training speed against convergence.
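A simplified sketch of the delayed synchronous decision at device granularity (the embodiment actually tracks individual computing units); device_rounds is assumed to map each second computing device to the rounds it has iterated since the last synchronous update, and all names are illustrative.

```python
def ready_to_aggregate(received_count, device_rounds, target_number,
                       target_threshold, total_devices):
    """Delayed synchronous parallel mode: aggregate asynchronously (after a
    target number of updates) while the devices' iteration rounds stay close,
    otherwise fall back to waiting for every device, as in the synchronous mode."""
    round_gap = max(device_rounds.values()) - min(device_rounds.values())
    if round_gap <= target_threshold:
        return received_count >= target_number   # asynchronous branch
    return received_count >= total_devices       # synchronous branch (simplified)
```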
In another possible implementation manner, the first computing device receives local update information sent by at least one second computing device, determines global update information according to the received local update information, and sends the global update information to each second computing device.
The local update information of a second computing device is obtained by training the federated learning model based on the sample data of the local end, and the sample data stored in each second computing device is different, so the obtained local update information is also different. The first computing device determines the global update information according to the local update information of the at least one second computing device; the global update information therefore draws on the local update information, and hence the sample data, of the at least one second computing device. The first computing device then sends the global update information to each second computing device, so that each local federated learning model is updated according to the global update information.
In another possible implementation manner, the first computing device receives local update information sent by a plurality of second computing devices, determines global update information according to the received local update information, and sends the global update information to the plurality of second computing devices. The number of the plurality of second computing devices may be determined by the model training system, dynamically determined by the first computing device, or determined according to a preset update rule, which is not limited in the embodiment of the present application.
The first computing device determines the global update information based on the local update information of a plurality of second computing devices; the global update information therefore draws on the local update information, and hence the sample data, of a plurality of second computing devices, and compared with any single piece of local update information it has better generalization. The first computing device sends the global update information to the plurality of second computing devices so that each of them updates its local federated learning model according to the global update information. In this way the update processes of the federated learning model in the plurality of second computing devices refer to one another: each device refers not only to the sample data of its local end but also to the sample data of the other second computing devices, which effectively increases the amount of sample data backing the federated learning model and is beneficial to improving its generalization.
404. And the second computing equipment receives the global updating information and updates the federal learning model of the local terminal according to the global updating information.
For any second computing device, the second computing device updates the local federated learning model according to the global update information. In this way, each second computing device in the model training system trains the federated learning model of its local end while indirectly referring to the sample data of the other second computing devices, and the federated learning models in the second computing devices eventually tend to converge.
In one possible implementation manner, in order to ensure the security of the data, an encryption algorithm is used when transmitting the update information; as shown in fig. 5, the above steps 402-404 are replaced by the following steps 501-503.
501. The second computing device, in response to the target computing unit group of the local end obtaining local update information of the federated learning model, encrypts the local update information of the target computing unit group to obtain encrypted local update information, and sends the encrypted local update information to the first computing device.
502. The first computing device receives the encrypted local update information sent by the at least one second computing device, determines the encrypted global update information according to the received encrypted local update information, and sends the encrypted global update information to the at least one second computing device.
503. And the second computing equipment receives the encrypted global updating information, decrypts the encrypted global updating information to obtain decrypted global updating information, and updates the local federal learning model according to the decrypted global updating information.
As shown in fig. 6, in order to ensure the security and privacy of the local update information, after acquiring the local update information, the second computing device 602 encrypts the local update information by using an encryption algorithm and sends the encrypted local update information to the first computing device 601. After receiving the encrypted local update information, the first computing device 601 fuses it directly in a secure aggregation manner such as homomorphic encryption, without decrypting it, to obtain encrypted global update information, and sends the encrypted global update information to the second computing device 602. The second computing device 602 decrypts the encrypted global update information using the decryption algorithm corresponding to the encryption algorithm, to obtain the decrypted global update information. The encryption algorithm and the corresponding decryption algorithm are stored only in the second computing device 602 and not in the first computing device 601, and the first computing device 601 does not need to decrypt either the encrypted local update information or the encrypted global update information, so the security and privacy of the local update information and the global update information can be ensured.
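As an illustration only, the following sketch realizes this flow with the additively homomorphic Paillier scheme via the python-paillier (phe) package; the embodiment does not prescribe a specific encryption scheme, and the sharing of one key pair among the second computing devices is an assumption made here for brevity.

```python
from functools import reduce
from phe import paillier  # python-paillier: additively homomorphic encryption

# The key pair is held by the second computing devices only; the first
# computing device never sees the private key.
public_key, private_key = paillier.generate_paillier_keypair()

def encrypt_update(local_update):
    """Second computing device: encrypt each element of its local update."""
    return [public_key.encrypt(float(x)) for x in local_update]

def fuse_encrypted(encrypted_updates):
    """First computing device: average the ciphertexts element-wise without
    ever decrypting them (secure aggregation on encrypted data)."""
    k = len(encrypted_updates)
    summed = [reduce(lambda a, b: a + b, column) for column in zip(*encrypted_updates)]
    return [ciphertext * (1.0 / k) for ciphertext in summed]

def decrypt_global(encrypted_global_update):
    """Second computing device: decrypt the fused global update information."""
    return [private_key.decrypt(c) for c in encrypted_global_update]
```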
The embodiment of the application provides a horizontal federated learning method based on parallel updating of a computing unit group, and takes the model training system shown in fig. 2 as an example, which includes a first computing device and k second computing devices, where k is an integer greater than 1. Each second computing device is provided with a plurality of computing units, and distributed training is performed on the federal learning model among the computing units in a parallel mode. Taking an example that one computing unit group includes one computing unit, each second computing device sends the local update information obtained by the computing unit to the first computing device with the computing unit as the minimum unit. For different computing units of different second computing devices, the first computing device employs a synchronous parallel mode, an asynchronous parallel mode, or a delayed synchronous parallel mode to determine global update information. Under the asynchronous parallel mode or the delayed synchronous parallel mode, the situation that part of second computing equipment cannot timely participate in the training of the global model due to the limitation of computing resources or network resources can be avoided by the horizontal federal learning method based on the parallel updating of the computing units, and the training efficiency of the federal learning model is improved.
It should be noted that the above description only takes one transmission of local update information from a second computing device to the first computing device as an example. In the process of training the federated learning model, each second computing device sends local update information to the first computing device, each second computing device performs multiple iterations and sends multiple pieces of local update information, and the first computing device likewise performs multiple iterations and sends multiple pieces of global update information to the second computing devices. In one possible implementation, training of the federated learning model is stopped in response to the iteration round of the first computing device reaching a first threshold; or training of the federated learning model is stopped in response to the loss value obtained by each second computing device in the current iteration round being not greater than a second threshold. The first threshold and the second threshold can take arbitrary values; for example, the first threshold is 100 or 150, and the second threshold is 0.01 or 0.03.
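A minimal sketch of this stopping check, assuming the first computing device tracks its own round count and the latest loss reported by each second computing device; the function name is illustrative.

```python
def should_stop(first_device_round, device_losses, first_threshold, second_threshold):
    """Stop global training once the first computing device has run enough
    iteration rounds, or once every second computing device's loss in the
    current round is no greater than the second threshold."""
    if first_device_round >= first_threshold:                        # e.g. 100 or 150
        return True
    return all(loss <= second_threshold for loss in device_losses)   # e.g. 0.01 or 0.03
```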
In the method provided by the embodiment of the application, the second computing device includes a plurality of computing unit groups; as soon as any one computing unit group in the second computing device obtains local update information by training the federated learning model, the local update information can be sent to the first computing device, and the first computing device determines the global update information according to the local update information. That is, the second computing device sends local update information to the first computing device on a per-computing-unit-group basis, without waiting for all the computing unit groups to obtain their local update information, so the second computing device can send local update information in time and thereby participate in the global training of the federated learning model in time; the first computing device can in turn determine the global update information in time, which is beneficial to improving the efficiency of the whole federated learning.
Moreover, the sample data is divided into n sample data sets, and the sample data set allocated to each computing unit is different, so the sample data sets used by the computing units to train the federated learning model are different. The federated learning model is trained among the multiple computing units in parallel, that is, it is trained with different sample data sets at the same time, which increases the data volume used for training within the same period and improves the efficiency with which the second computing device trains the federated learning model.
Furthermore, for the target computing unit, only one sample data group is used in one iterative training process instead of the complete sample data set, and the sample data groups are rotated in turn across iterations, which reduces the data volume and computation in a single iteration, speeds up each iteration, and allows local update information of the federated learning model to be obtained in time.
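For completeness, a small sketch of how the device's sample data might be partitioned into the n different sample data sets, one per computing unit (the helper name is hypothetical; the per-unit rotation over sample data groups was sketched earlier).

```python
import numpy as np

def assign_sample_sets(sample_data, n):
    """Divide the second computing device's sample data into n different sample
    data sets, keyed by computing-unit index, so the units train in parallel on
    different data."""
    return dict(enumerate(np.array_split(sample_data, n)))
```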
Fig. 7 is a flowchart of a method for federated learning according to an embodiment of the present application. The embodiment of the present application is applied to a second computing device in the model training system shown in fig. 2, and referring to fig. 7, the method includes the following steps.
701. The second computing device trains the federated learning model in parallel through a plurality of computing unit groups.
702. The second computing device, in response to the target computing unit group obtaining local update information of the federated learning model, sends the local update information of the target computing unit group to the first computing device.
The local updating information is used for updating the federal learning model, the first computing device is used for receiving the local updating information sent by the at least one second computing device, determining global updating information according to the received local updating information, and sending the global updating information to the at least one second computing device.
The steps 701-702 are similar to the steps 401-402, and are not described in detail herein.
703. And the second computing equipment receives the global updating information and updates the federal learning model according to the global updating information.
Step 703 is the same as step 404, and is not described in detail herein.
In the method provided by the embodiment of the application, the second computing device includes a plurality of computing unit groups, any one of the computing unit groups in the second computing device obtains local update information by training the federal learning model, the local update information can be sent to the first computing device, and the first computing device determines global update information according to the local update information. That is, the second computing device sends the local update information to the first computing device in a manner based on the computing unit groups, and the local update information can be sent to the first computing device without waiting for all the computing unit groups to obtain the local update information, so that the second computing device can send the local update information in time, and thus the second computing device can participate in the global training process of the federal learning model in time.
Fig. 8 is a schematic structural diagram of a federated learning apparatus according to an embodiment of the present application. Referring to fig. 8, the apparatus is applied to a model training system including a first computing device and a plurality of second computing devices, and includes:
the model training module 801 is used for training the federal learning model in parallel through a plurality of computing unit groups at the local end, wherein each computing unit group comprises at least one computing unit;
a local information sending module 802, configured to send, to the first computing device, local update information of the target computing unit group in response to the local update information of the local target computing unit group obtaining the federated learning model, where the local update information is used to update the federated learning model, and the target computing unit group is any one of the multiple computing unit groups;
a global information sending module 803, configured to receive local update information sent by at least one second computing device, determine global update information according to the received local update information, and send the global update information to the at least one second computing device;
the model training module 801 is further configured to receive global update information, and update the local federated learning model according to the global update information.
According to the federated learning apparatus provided by the embodiment of the application, as soon as any one computing unit group obtains local update information by training the federated learning model, the local update information can be sent to the first computing device, and the global update information is determined according to the local update information. That is, the local update information is sent to the first computing device on a per-computing-unit-group basis, without waiting for all the computing unit groups to obtain their local update information, so the local update information can be sent in time and participate in the global training of the federated learning model in time, the global update information can be determined in time, and the efficiency of the whole federated learning can be improved.
Optionally, referring to fig. 9, the local information sending module 802 includes:
a first information sending unit 812, configured to, in a case where the target computing unit group includes a plurality of computing units, send, to the first computing device, local update information of each computing unit in the target computing unit group in response to each computing unit in the target computing unit group obtaining the local update information of the federated learning model.
Optionally, referring to fig. 9, the local information sending module 802 includes:
a first fusion unit 822, configured to, in a case that the target computing unit group includes multiple computing units, in response to each computing unit in the target computing unit group obtaining local update information of the federal learning model, fuse the local update information of each computing unit in the target computing unit group to obtain fusion update information by the second computing device;
a second information sending unit 832, configured to send the fusion update information to the first computing device.
Optionally, referring to fig. 9, the global information sending module 803 includes:
a global information determination unit 813 configured to receive the local update information sent by the target number of second computing devices, and determine global update information according to the received local update information;
a third information sending unit 823 is configured to send global update information to each second computing device.
Optionally, referring to fig. 9, the global information sending module 803 includes:
a round determining unit 833, configured to receive local update information sent by the target number of second computing devices, and determine a difference between iteration rounds, where the difference between the iteration rounds is a difference between a maximum iteration round and a minimum iteration round in each second computing device;
a global information determination unit 813 configured to determine global update information according to the received local update information in a case where the difference between the iteration rounds is not greater than a target threshold;
a third information sending unit 823 is configured to send global update information to at least one second computing device.
Optionally, referring to fig. 9, the global information determining unit 813 is further configured to receive, in a case that a difference between iteration rounds is greater than a target threshold, local update information of each computing unit sent by each second computing device, and determine global update information according to the received local update information;
the third information sending unit 823 is further configured to send global update information to each second computing device.
Optionally, referring to fig. 9, the model training module 801 includes:
a sample data dividing unit 811 for dividing, in a case where the second computing device includes n computing units, sample data into n sample data sets, each sample data set including at least one sample data, n being an integer greater than 1;
a sample data distributing unit 821 for distributing n sample data sets to n computing units;
and the model training unit 831 is used for training the federal learning model in parallel through each computing unit in the plurality of computing unit groups based on the sample data set distributed to the computing units.
Optionally, referring to fig. 9, a model training unit 831, configured to:
determining at least one target computing unit among the n computing units;
splitting a sample data set distributed to a target computing unit to obtain a plurality of sample data groups, wherein each sample data group comprises at least one sample data;
and respectively carrying out iterative training on the federated learning model, through the target computing unit, based on each sample data group in the plurality of sample data groups.
Optionally, referring to fig. 9, the local information sending module 802 includes:
a first link sending unit 842, configured to send, in response to each computing unit in the local target computing unit group obtaining local update information of the federated learning model, the local update information of the target computing unit group to the first computing device through a network link between the second computing device and the first computing device; or,
a second link sending unit 852, configured to send, in response to each computing unit in the target computing unit group of the local end obtaining local update information of the federated learning model, the local update information of each computing unit to the first computing device through a network link between each computing unit and the first computing device; or,
a third link sending unit 862, configured to send, in response to each computing unit in the local target computing unit group obtaining the local update information of the federated learning model, the local update information of the target computing unit group to the first computing device through the network link between the target computing unit group and the first computing device.
Optionally, referring to fig. 9, the global information sending module 803 is configured to:
receiving local update information sent by at least one second computing device, determining global update information according to the received local update information, and sending the global update information to each second computing device; or,
receiving local update information sent by a plurality of second computing devices, determining global update information according to the received local update information, and sending the global update information to the plurality of second computing devices.
Optionally, referring to fig. 9, the global information sending module 803 includes:
the second fusion unit 843 is configured to receive local update information sent by multiple second computing devices, and fuse the received local update information to obtain global update information.
It should be noted that: in the federal learning apparatus provided in the foregoing embodiment, when performing federal learning, only the division of the above function modules is used as an example, and in practical applications, the function distribution may be completed by different function modules according to needs, that is, the internal structure of the computing device is divided into different function modules, so as to complete all or part of the functions described above. In addition, the federal learning device and the federal learning method embodiment provided by the above embodiments belong to the same concept, and the specific implementation process thereof is described in the method embodiment, which is not described herein again.
Fig. 10 is a schematic structural diagram of a federated learning apparatus according to an embodiment of the present application. Referring to fig. 10, the apparatus is applied to a second computing device, and includes:
a model training module 1001, configured to train a federated learning model in parallel through a plurality of computing unit groups, where each computing unit group includes at least one computing unit;
the local information sending module 1002 is configured to send, in response to the target computing unit group obtaining local update information of the federal learning model, the local update information of the target computing unit group to the first computing device, where the local update information is used to update the federal learning model, and the first computing device is configured to receive the local update information sent by the at least one second computing device, determine global update information according to the received local update information, and send the global update information to the at least one second computing device;
the model training module 1001 is further configured to receive global update information, and update the federal learning model according to the global update information.
According to the federated learning apparatus provided by the embodiment of the application, as soon as any one computing unit group obtains local update information by training the federated learning model, the local update information can be sent to the first computing device, and the global update information is determined according to the local update information. That is, the local update information is sent to the first computing device on a per-computing-unit-group basis, without waiting for all the computing unit groups to obtain their local update information, so the local update information can be sent in time and participate in the global training of the federated learning model in time.
It should be noted that: in the federal learning apparatus provided in the foregoing embodiment, when performing federal learning, only the division of the above function modules is used as an example, and in practical applications, the function distribution may be completed by different function modules according to needs, that is, the internal structure of the computing device is divided into different function modules, so as to complete all or part of the functions described above. In addition, the federal learning device and the federal learning method embodiment provided by the above embodiments belong to the same concept, and the specific implementation process thereof is described in the method embodiment, which is not described herein again.
The embodiment of the present application further provides a computing device, where the computing device includes a processor and a memory, where the memory stores at least one computer program, and the at least one computer program is loaded and executed by the processor to implement the operations performed in the federal learning method in the foregoing embodiments.
Optionally, the computing device is provided as a terminal. Fig. 11 shows a schematic structural diagram of a terminal 1100 according to an exemplary embodiment of the present application.
The terminal 1100 includes: a processor 1101 and a memory 1102.
Processor 1101 may include one or more processing cores, such as a 4-core processor, an 8-core processor, or the like. The processor 1101 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 1101 may also include a main processor and a coprocessor, where the main processor is a processor for processing data in an awake state, also called a CPU (Central Processing Unit), and the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 1101 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content that the display screen needs to display. In some embodiments, the processor 1101 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 1102 may include one or more computer-readable storage media, which may be non-transitory. Memory 1102 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in memory 1102 is used to store at least one computer program, which is loaded and executed by the processor 1101 to implement the federated learning method provided by the method embodiments of the present application.
In some embodiments, the terminal 1100 may further include: a peripheral interface 1103 and at least one peripheral. The processor 1101, memory 1102 and peripheral interface 1103 may be connected by a bus or signal lines. Various peripheral devices may be connected to the peripheral interface 1103 by buses, signal lines, or circuit boards. Optionally, the peripheral device comprises: at least one of radio frequency circuitry 1104, display screen 1105, camera assembly 1106, audio circuitry 1107, positioning assembly 1108, and power supply 1109.
The peripheral interface 1103 may be used to connect at least one peripheral associated with I/O (Input/Output) to the processor 1101 and the memory 1102. In some embodiments, the processor 1101, memory 1102, and peripheral interface 1103 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1101, the memory 1102 and the peripheral device interface 1103 may be implemented on separate chips or circuit boards, which is not limited by this embodiment.
The Radio Frequency circuit 1104 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 1104 communicates with communication networks and other communication devices via electromagnetic signals. The radio frequency circuit 1104 converts an electric signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electric signal. Optionally, the radio frequency circuit 1104 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 1104 may communicate with other devices via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, various generation mobile communication networks (2G, 3G, 4G, and 5G), Wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the rf circuit 1104 may further include NFC (Near Field Communication) related circuits, which are not limited in this application.
The display screen 1105 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 1105 is a touch display screen, the display screen 1105 also has the ability to capture touch signals on or over the surface of the display screen 1105. The touch signal may be input to the processor 1101 as a control signal for processing. At this point, the display screen 1105 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, display 1105 may be one, disposed on a front panel of terminal 1100; in other embodiments, the display screens 1105 can be at least two, respectively disposed on different surfaces of the terminal 1100 or in a folded design; in other embodiments, display 1105 can be a flexible display disposed on a curved surface or on a folded surface of terminal 1100. Even further, the display screen 1105 may be arranged in a non-rectangular irregular pattern, i.e., a shaped screen. The Display screen 1105 may be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), and the like.
Camera assembly 1106 is used to capture images or video. Optionally, camera assembly 1106 includes a front camera and a rear camera. The front camera is disposed on the front panel of the terminal 1100, and the rear camera is disposed on the rear surface of the terminal 1100. In some embodiments, the number of the rear cameras is at least two, and each rear camera is any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, camera assembly 1106 may also include a flash. The flash lamp can be a monochrome temperature flash lamp or a bicolor temperature flash lamp. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp, and can be used for light compensation at different color temperatures.
The audio circuitry 1107 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 1101 for processing or inputting the electric signals to the radio frequency circuit 1104 to achieve voice communication. For stereo capture or noise reduction purposes, multiple microphones may be provided, each at a different location of terminal 1100. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 1101 or the radio frequency circuit 1104 into sound waves. The loudspeaker can be a traditional film loudspeaker or a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, the speaker can be used for purposes such as converting an electric signal into a sound wave audible to a human being, or converting an electric signal into a sound wave inaudible to a human being to measure a distance. In some embodiments, the audio circuitry 1107 may also include a headphone jack.
Positioning component 1108 is used to locate the current geographic position of terminal 1100 for purposes of navigation or LBS (Location Based Service). The Positioning component 1108 may be a Positioning component based on the united states GPS (Global Positioning System), the chinese beidou System, the russian graves System, or the european union galileo System.
Power supply 1109 is configured to provide power to various components within terminal 1100. The power supply 1109 may be alternating current, direct current, disposable or rechargeable. When the power supply 1109 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also be used to support fast charge technology.
Those skilled in the art will appreciate that the configuration shown in fig. 11 does not constitute a limitation of terminal 1100, and may include more or fewer components than those shown, or may combine certain components, or may employ a different arrangement of components.
Optionally, the computing device is provided as a server. Fig. 12 is a schematic structural diagram of a server 1200 according to an embodiment of the present application, where the server 1200 may generate a relatively large difference due to different configurations or performances, and may include one or more processors (CPUs) 1201 and one or more memories 1202, where the memory 1202 stores at least one computer program, and the at least one computer program is loaded and executed by the processors 1201 to implement the methods provided by the foregoing method embodiments. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface, so as to perform input/output, and the server may also include other components for implementing the functions of the device, which are not described herein again.
The embodiment of the present application further provides a computer-readable storage medium, where at least one computer program is stored in the computer-readable storage medium, and the at least one computer program is loaded and executed by a processor to implement the operations performed in the federal learning method in the foregoing embodiments.
The embodiments of the present application also provide a computer program product or a computer program, where the computer program product or the computer program includes computer program code, the computer program code is stored in a computer-readable storage medium, a processor of a computing device reads the computer program code from the computer-readable storage medium, and the processor executes the computer program code, so that the computing device implements the operations performed in the federal learning method as in the above embodiments. In some embodiments, the computer program according to embodiments of the present application may be deployed to be executed on one computing device or on multiple computing devices located at one site, or may be executed on multiple computing devices distributed at multiple sites and interconnected by a communication network, and the multiple computing devices distributed at the multiple sites and interconnected by the communication network may constitute a block chain system.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only an alternative embodiment of the present application and should not be construed as limiting the present application, and any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (13)

1. A method for federated learning, applied to a model training system comprising a first computing device and a plurality of second computing devices, the method comprising:
in the case that the second computing device comprises n computing units, the second computing device divides the sample data into n sample data sets, each sample data set comprising at least one sample data, the n being an integer greater than 1;
the second computing device allocating the n sample data sets to the n computing units;
the second computing device trains a federated learning model in parallel through each computing unit in a plurality of computing unit groups of the home terminal based on the sample data set distributed by the computing unit, wherein each computing unit group comprises at least one computing unit;
the second computing device responds to local updating information of the federated learning model obtained by a target computing unit group of a local end, and sends the local updating information of the target computing unit group to the first computing device, wherein the local updating information is used for updating the federated learning model, and the target computing unit group is any one of the plurality of computing unit groups;
the first computing device receives local updating information sent by at least one second computing device, determines global updating information according to the received local updating information, and sends the global updating information to the at least one second computing device;
the second computing device receives the global updating information and updates the local federal learning model according to the global updating information;
the second computing device trains a federated learning model in parallel through each computing unit in a plurality of computing unit groups of the local end based on the sample data set distributed by the computing unit, and the federated learning model training method includes:
the second computing device determines at least one target computing unit among the n computing units;
the second computing device splits the sample data set distributed by the target computing unit to obtain a plurality of sample data groups, wherein each sample data group comprises at least one sample data;
and the second computing device respectively carries out iterative training on the federated learning model based on each sample data group in the plurality of sample data groups through the target computing unit.
2. The method according to claim 1, wherein the second computing device, in response to a local update information of the local federated learning model being obtained by a local target computing unit group of the local end, sends the local update information of the target computing unit group to the first computing device, and includes:
in a case where the target computing unit group includes a plurality of computing units, the second computing device sends, to the first computing device, local update information for each computing unit in the target computing unit group in response to each computing unit in the target computing unit group obtaining the local update information for the federated learning model.
3. The method according to claim 1, wherein the second computing device, in response to a local update information of the local federated learning model being obtained by a local target computing unit group of the local end, sends the local update information of the target computing unit group to the first computing device, and includes:
under the condition that the target computing unit group comprises a plurality of computing units, the second computing device responds to each computing unit in the target computing unit group to obtain local updating information of the federated learning model, and fuses the local updating information of each computing unit in the target computing unit group to obtain fused updating information;
sending the fusion update information to the first computing device.
4. The method according to claim 1, wherein the receiving, by the first computing device, of the local update information sent by at least one second computing device, the determining of the global update information according to the received local update information, and the sending of the global update information to the at least one second computing device include:
the first computing device receives the local update information sent by a target number of second computing devices, and determines the global update information according to the received local update information;
and the first computing device sends the global update information to each second computing device.
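For illustration only: a possible first-computing-device round for claim 4, under the assumption that local updates arrive on a blocking queue-like inbox, that each client object exposes a send method, and that the global update information is the element-wise average of the received local updates; the patent fixes none of these choices.

```python
# Illustrative only: a possible first-computing-device round. The blocking
# inbox.get(), the client.send() interface, and plain averaging are assumptions.
import numpy as np

def aggregate(updates):
    """Fuse local update information from several devices by element-wise averaging."""
    return {name: np.mean([u[name] for u in updates], axis=0) for name in updates[0]}

def server_round(inbox, clients, target_number):
    received = []
    while len(received) < target_number:   # wait only for a target number of devices
        received.append(inbox.get())       # one local update per second computing device
    global_update = aggregate(received)
    for client in clients:                 # send to every second computing device
        client.send(global_update)
    return global_update
```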
5. The method according to claim 1, wherein the receiving, by the first computing device, of the local update information sent by at least one second computing device, the determining of the global update information according to the received local update information, and the sending of the global update information to the at least one second computing device include:
the first computing device receives the local update information sent by a target number of second computing devices, and determines an iteration round difference, the iteration round difference being the difference between the maximum iteration round and the minimum iteration round among the second computing devices;
in a case where the iteration round difference is not greater than a target threshold, the first computing device determines the global update information according to the received local update information;
and the first computing device sends the global update information to the at least one second computing device.
6. The method according to claim 5, further comprising:
in a case where the iteration round difference is greater than the target threshold, the first computing device receives the local update information of each computing unit sent by each second computing device, and determines the global update information according to the received local update information;
and the first computing device sends the global update information to each second computing device.
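For illustration only: a sketch of the iteration-round check described in claims 5 and 6, reusing the aggregate helper from the sketch after claim 4. The rounds bookkeeping and the wait_for_all_units fallback are assumptions for this sketch.

```python
# Illustrative only: the round-gap check of claims 5 and 6. rounds maps each
# second computing device to its current iteration round; wait_for_all_units
# is an assumed callback that gathers per-unit updates from every device.
# aggregate(...) is the averaging helper sketched after claim 4.
def server_round_with_staleness(received_updates, rounds, target_threshold, wait_for_all_units):
    round_gap = max(rounds.values()) - min(rounds.values())
    if round_gap <= target_threshold:
        # devices are roughly in step: fuse the group-level updates directly
        return aggregate(received_updates)
    # devices drifted too far apart: collect per-unit updates from every device first
    return aggregate(wait_for_all_units())
```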
7. The method according to claim 1, wherein the sending, by the second computing device in response to a target computing unit group at the local end obtaining local update information of the federated learning model, the local update information of the target computing unit group to the first computing device includes:
the second computing device, in response to each computing unit in the target computing unit group at the local end obtaining local update information of the federated learning model, sends the local update information of the target computing unit group to the first computing device through a network link between the second computing device and the first computing device; or,
the second computing device, in response to each computing unit in the target computing unit group at the local end obtaining local update information of the federated learning model, sends the local update information of each computing unit to the first computing device through a network link between each computing unit and the first computing device; or,
the second computing device, in response to each computing unit in the target computing unit group at the local end obtaining local update information of the federated learning model, sends the local update information of the target computing unit group to the first computing device through a network link between the target computing unit group and the first computing device.
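For illustration only: the three transport options of claim 7 expressed as a choice of which network link carries the update. The LinkMode enum and the link objects (each with a send method) are assumptions for the sketch.

```python
# Illustrative only: the three link options of claim 7. LinkMode and the
# link objects (each with a send method) are assumptions for the sketch.
from enum import Enum

class LinkMode(Enum):
    DEVICE = "device"   # one link between the second computing device and the first
    UNIT = "unit"       # one link per computing unit
    GROUP = "group"     # one link per computing unit group

def send_local_updates(mode, device_link, unit_links, group_link, group_update, unit_updates):
    if mode is LinkMode.DEVICE:
        device_link.send(group_update)                      # device-level link
    elif mode is LinkMode.UNIT:
        for link, update in zip(unit_links, unit_updates):  # one update per unit link
            link.send(update)
    else:
        group_link.send(group_update)                       # group-level link
```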
8. The method according to claim 1, wherein the receiving, by the first computing device, of the local update information sent by at least one second computing device and the determining of the global update information according to the received local update information include:
the first computing device receives the local update information sent by a plurality of second computing devices, and fuses the received local update information to obtain the global update information.
9. A method for federated learning, applied to a second computing device, the method comprising:
in a case where n computing units are included, dividing sample data into n sample data sets, wherein each sample data set includes at least one sample data and n is an integer greater than 1;
allocating the n sample data sets to the n computing units;
training a federated learning model in parallel through each computing unit in a plurality of computing unit groups based on the sample data set allocated to the computing unit, wherein each computing unit group includes at least one computing unit;
in response to a target computing unit group obtaining local update information of the federated learning model, sending the local update information of the target computing unit group to a first computing device, wherein the local update information is used for updating the federated learning model, and the first computing device is configured to receive the local update information sent by at least one second computing device, determine global update information according to the received local update information, and send the global update information to the at least one second computing device;
and receiving the global update information, and updating the federated learning model according to the global update information;
wherein the training of the federated learning model in parallel through each computing unit in the plurality of computing unit groups based on the sample data set allocated to the computing unit includes:
determining at least one target computing unit among the n computing units;
splitting the sample data set allocated to the target computing unit to obtain a plurality of sample data groups, wherein each sample data group includes at least one sample data;
and performing, through the target computing unit, iterative training on the federated learning model based on each sample data group in the plurality of sample data groups respectively.
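For illustration only: an end-to-end sketch of the second-computing-device side of claim 9, reusing fuse_unit_updates from the sketch after claim 3. The round-robin partition, the thread-pool scheduling, and the unit/server interfaces are assumptions; the point being illustrated is only that each computing unit group sends its local update without waiting for the other groups.

```python
# Illustrative only. Unit/group/server objects and their methods (train,
# local_update, apply, send, receive) are assumed interfaces, not the patent's.
# fuse_unit_updates is the averaging helper sketched after claim 3.
from concurrent.futures import ThreadPoolExecutor

def partition(samples, n):
    """Divide the sample data into n sample data sets, one per computing unit."""
    return [samples[i::n] for i in range(n)]

def run_group(group, data_by_unit, server):
    for unit in group:
        unit.train(data_by_unit[unit])          # each unit trains on its own sample data set
    local_update = fuse_unit_updates([u.local_update() for u in group])
    server.send(local_update)                   # sent as soon as this group finishes

def run_second_device(samples, units, groups, server):
    data_by_unit = dict(zip(units, partition(samples, len(units))))
    with ThreadPoolExecutor(max_workers=len(groups)) as pool:
        for group in groups:                    # groups run in parallel, no mutual waiting
            pool.submit(run_group, group, data_by_unit, server)
    global_update = server.receive()            # global update information from the server
    for unit in units:
        unit.apply(global_update)               # update the local federated learning model
```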
10. An apparatus for federated learning, applied to a model training system, wherein the model training system comprises a first computing device and a plurality of second computing devices, the apparatus comprising:
a model training module, configured to divide sample data into n sample data sets in a case where n computing units are included, wherein each sample data set includes at least one sample data and n is an integer greater than 1;
the model training module is further configured to allocate the n sample data sets to the n computing units;
the model training module is further configured to train a federated learning model in parallel through each computing unit in a plurality of computing unit groups based on the sample data set allocated to the computing unit, wherein each computing unit group includes at least one computing unit;
a local information sending module, configured to send, to the first computing device, local update information of a target computing unit group in response to the target computing unit group at the local end obtaining the local update information of the federated learning model, wherein the local update information is used for updating the federated learning model, and the target computing unit group is any one of the plurality of computing unit groups;
a global information sending module, configured to receive the local update information sent by at least one second computing device, determine global update information according to the received local update information, and send the global update information to the at least one second computing device;
the model training module is further configured to receive the global update information and update the local federated learning model according to the global update information;
wherein the model training module is further configured to:
determine at least one target computing unit among the n computing units;
split the sample data set allocated to the target computing unit to obtain a plurality of sample data groups, wherein each sample data group includes at least one sample data;
and perform, through the target computing unit, iterative training on the federated learning model based on each sample data group in the plurality of sample data groups respectively.
11. An apparatus for federated learning, applied to a second computing device, the apparatus comprising:
a model training module, configured to divide sample data into n sample data sets in a case where n computing units are included, wherein each sample data set includes at least one sample data and n is an integer greater than 1;
the model training module is further configured to allocate the n sample data sets to the n computing units;
the model training module is further configured to train a federated learning model in parallel through each computing unit in a plurality of computing unit groups based on the sample data set allocated to the computing unit, wherein each computing unit group includes at least one computing unit;
a local information sending module, configured to send, to a first computing device, local update information of a target computing unit group in response to the target computing unit group obtaining the local update information of the federated learning model, wherein the local update information is used for updating the federated learning model, and the first computing device is configured to receive the local update information sent by at least one second computing device, determine global update information according to the received local update information, and send the global update information to the at least one second computing device;
the model training module is further configured to receive the global update information and update the federated learning model according to the global update information;
wherein the model training module is further configured to:
determine at least one target computing unit among the n computing units;
split the sample data set allocated to the target computing unit to obtain a plurality of sample data groups, wherein each sample data group includes at least one sample data;
and perform, through the target computing unit, iterative training on the federated learning model based on each sample data group in the plurality of sample data groups respectively.
12. A computing device, comprising a processor and a memory, wherein the memory stores at least one computer program, and the at least one computer program is loaded and executed by the processor to perform the operations performed in the method for federated learning according to any one of claims 1 to 8, or to perform the operations performed in the method for federated learning according to claim 9.
13. A computer-readable storage medium, wherein the storage medium stores at least one computer program, and the at least one computer program is loaded and executed by a processor to perform the operations performed in the method for federated learning according to any one of claims 1 to 8, or to perform the operations performed in the method for federated learning according to claim 9.
CN202110726072.4A 2021-06-29 2021-06-29 Federal learning method and device, computing equipment and storage medium Active CN113177645B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110726072.4A CN113177645B (en) 2021-06-29 2021-06-29 Federal learning method and device, computing equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113177645A CN113177645A (en) 2021-07-27
CN113177645B true CN113177645B (en) 2021-09-28

Family

ID=76927851

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110726072.4A Active CN113177645B (en) 2021-06-29 2021-06-29 Federal learning method and device, computing equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113177645B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115687233A (en) * 2021-07-29 2023-02-03 腾讯科技(深圳)有限公司 Communication method, device, equipment and computer readable storage medium
CN113612598B (en) * 2021-08-02 2024-02-23 北京邮电大学 Internet of vehicles data sharing system and method based on secret sharing and federal learning
CN113610303B (en) * 2021-08-09 2024-03-19 北京邮电大学 Load prediction method and system
CN113672684B (en) * 2021-08-20 2023-04-21 电子科技大学 Layered user training management system and method for non-independent co-distributed data
CN113837108B (en) * 2021-09-26 2023-05-23 重庆中科云从科技有限公司 Face recognition method, device and computer readable storage medium
CN115907056A (en) * 2021-09-29 2023-04-04 北京三星通信技术研究有限公司 Prediction model training method, information prediction method and corresponding devices
CN113950046B (en) * 2021-10-19 2022-05-03 北京工商大学 Credible encryption positioning method for heterogeneous topological network based on federal learning
CN115329032B (en) * 2022-10-14 2023-03-24 杭州海康威视数字技术股份有限公司 Learning data transmission method, device, equipment and storage medium based on federated dictionary

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11636438B1 (en) * 2019-10-18 2023-04-25 Meta Platforms Technologies, Llc Generating smart reminders by assistant systems
CN111310932A (en) * 2020-02-10 2020-06-19 深圳前海微众银行股份有限公司 Method, device and equipment for optimizing horizontal federated learning system and readable storage medium
CN112257876B (en) * 2020-11-15 2021-07-30 腾讯科技(深圳)有限公司 Federal learning method, apparatus, computer device and medium
CN112784995B (en) * 2020-12-31 2024-04-23 杭州趣链科技有限公司 Federal learning method, apparatus, device and storage medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110059829A (en) * 2019-04-30 2019-07-26 济南浪潮高新科技投资发展有限公司 A kind of asynchronous parameters server efficient parallel framework and method
WO2021059604A1 (en) * 2019-09-26 2021-04-01 富士フイルム株式会社 Machine learning system and method, integration server, information processing device, program, and inference model creation method
CN110929885A (en) * 2019-11-29 2020-03-27 杭州电子科技大学 Smart campus-oriented distributed machine learning model parameter aggregation method
CN111062044A (en) * 2019-12-09 2020-04-24 支付宝(杭州)信息技术有限公司 Model joint training method and device based on block chain
CN111008709A (en) * 2020-03-10 2020-04-14 支付宝(杭州)信息技术有限公司 Federal learning and data risk assessment method, device and system
CN111813526A (en) * 2020-07-10 2020-10-23 深圳致星科技有限公司 Heterogeneous processing system, processor and task processing method for federal learning
CN112631775A (en) * 2020-12-24 2021-04-09 北京百度网讯科技有限公司 Model training method and device, electronic equipment and computer readable storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Study of Data Imbalance and Asynchronous Aggregation Algorithm on Federated Learning System";Senapati Sang Diwangkara 等;《2020 International Conference on Information Technology Systems and Innovation》;20201124;276-281 *
"基于并行同态加密和 STC 的高效安全联邦学习";肖林声 等;《通信技术》;20210430;第54卷(第4期);922-928 *

Also Published As

Publication number Publication date
CN113177645A (en) 2021-07-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
REG Reference to a national code: ref country code: HK; ref legal event code: DE; ref document number: 40048685