CN111522669A - Method, device and equipment for optimizing horizontal federated learning system and readable storage medium


Info

Publication number
CN111522669A
Authority
CN
China
Prior art keywords
participating, calculation task, task parameters, equipment, resource information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010359198.8A
Other languages
Chinese (zh)
Inventor
程勇
刘洋
陈天健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WeBank Co Ltd filed Critical WeBank Co Ltd
Priority to CN202010359198.8A priority Critical patent/CN111522669A/en
Publication of CN111522669A publication Critical patent/CN111522669A/en
Priority to PCT/CN2021/090825 priority patent/WO2021219054A1/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5044Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5094Allocation of resources, e.g. of the central processing unit [CPU] where the allocation takes into account power or heat criteria
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method, an apparatus, a device and a readable storage medium for optimizing a horizontal federated learning system. The method comprises: acquiring device resource information of each participating device taking part in horizontal federated learning; respectively configuring, according to the device resource information, the calculation task parameters used by each participating device during federated learning model training, wherein the calculation task parameters comprise a predicted processing time step and/or a predicted processing batch size; and correspondingly sending the calculation task parameters to each participating device, so that each participating device executes the federated learning task according to its own calculation task parameters. The invention makes it possible to use the contribution of each data party's data in federated learning and improves the overall efficiency of the federated learning system.

Description

Method, device and equipment for optimizing horizontal federated learning system and readable storage medium
Technical Field
The invention relates to the field of artificial intelligence, and in particular to a method, an apparatus, a device and a readable storage medium for optimizing a horizontal federated learning system.
Background
With the gradual development of federated learning technology, federated learning is increasingly applied in various fields to solve the problem of data silos. As the technology matures, the efficiency requirements on the federated learning process keep rising in many areas, for example in real-time data processing. In practice, however, the technical resources of the data parties' devices differ: some devices are short of computing resources, cannot support efficient federated learning well, and may even lower the overall efficiency of federated learning. If such devices are excluded from federated learning so as not to affect its overall efficiency, the contribution of the data on those devices cannot be used. How to use the contribution of every data party's data in federated learning without affecting its overall efficiency has therefore become a problem to be solved urgently.
Disclosure of Invention
The main purpose of the invention is to provide a method, an apparatus, a device and a readable storage medium for optimizing a horizontal federated learning system, aiming to solve the problem of how to use the contribution of each data party's data in federated learning without affecting the overall efficiency of federated learning.
In order to achieve the above object, the present invention provides a method for optimizing a horizontal federated learning system, applied to a coordinating device participating in horizontal federated learning, the method comprising the following steps:
acquiring device resource information of each participating device participating in horizontal federated learning;
respectively configuring, according to the device resource information, the calculation task parameters used by each participating device during federated learning model training, wherein the calculation task parameters comprise a predicted processing time step and/or a predicted processing batch size;
and correspondingly sending the calculation task parameters to each participating device, so that each participating device executes the federated learning task according to its own calculation task parameters.
Optionally, the step of respectively configuring, according to the device resource information, the calculation task parameters used by each participating device during federated learning model training includes:
classifying the participating devices according to the device resource information, and determining the resource category to which each participating device belongs;
and respectively configuring, according to the resource category to which each participating device belongs, the calculation task parameters used by that participating device during federated learning model training.
Optionally, the step of respectively configuring, according to the resource category to which each participating device belongs, the calculation task parameters used by that participating device during federated learning model training includes:
respectively determining candidate task parameters corresponding to each participating device according to the resource category to which it belongs;
respectively determining the predicted processing duration corresponding to each participating device based on the candidate task parameters, and detecting whether the predicted processing durations satisfy a preset duration consistency condition;
and if the predicted processing durations satisfy the preset duration consistency condition, taking the candidate task parameters of each participating device as that device's calculation task parameters.
Optionally, when the model to be trained in the horizontal federated learning is a recurrent neural network model and the calculation task parameter includes a predicted processing time step, after the step of correspondingly sending the calculation task parameter to each of the participating devices, the method further includes:
configuring a time step selection strategy corresponding to each participating device according to the predicted processing time step corresponding to each participating device;
and correspondingly sending the time step selection strategy to each participating device, so that each participating device selects sequence selection data from its own sequence data according to its time step selection strategy and executes the federated learning task according to the sequence selection data, wherein the time step of the sequence selection data is less than or equal to that device's predicted processing time step.
Optionally, when the calculation task parameter includes a predicted processing batch size, after the step of correspondingly sending the calculation task parameter to each of the participating devices, the method further includes:
configuring the learning rate corresponding to each participating device according to the predicted processing batch size corresponding to each participating device;
and correspondingly sending the learning rate to each participating device, so that each participating device executes the federated learning task according to its own learning rate and the predicted processing batch size received from the coordinating device.
Optionally, the step of correspondingly sending the calculation task parameters to each of the participating devices so that each of the participating devices executes the federated learning task according to its own calculation task parameters includes:
and correspondingly sending the calculation task parameters to each participating device, and sending each participating device the estimated time of the current round of global model update, so that when performing local model training according to the calculation task parameters, each participating device adjusts the number of local training iterations according to the estimated time.
Optionally, the step of acquiring the device resource information of each participating device participating in horizontal federated learning includes:
receiving the device resource information sent by each participating device participating in horizontal federated learning, wherein the device resource information comprises at least one of power resource information, computing resource information and communication resource information.
In order to achieve the above object, the present invention provides an optimization apparatus for a horizontal federated learning system, deployed on a coordinating device participating in horizontal federated learning, the optimization apparatus comprising:
an acquisition module, configured to acquire device resource information of each participating device participating in horizontal federated learning;
a configuration module, configured to respectively configure, according to the device resource information, the calculation task parameters used by each participating device during federated learning model training, wherein the calculation task parameters comprise a predicted processing time step and/or a predicted processing batch size;
and a sending module, configured to correspondingly send the calculation task parameters to each participating device, so that each participating device executes the federated learning task according to its own calculation task parameters.
In order to achieve the above object, the present invention further provides a horizontal federated learning system optimization device, comprising: a memory, a processor, and a horizontal federated learning system optimization program stored on the memory and executable on the processor, wherein the horizontal federated learning system optimization program, when executed by the processor, implements the steps of the horizontal federated learning system optimization method described above.
In addition, in order to achieve the above object, the present invention further provides a computer-readable storage medium, on which a horizontal federated learning system optimization program is stored, wherein the horizontal federated learning system optimization program, when executed by a processor, implements the steps of the horizontal federated learning system optimization method described above.
In the invention, the coordinating device obtains the device resource information of each participating device; configures calculation task parameters for each participating device according to its device resource information, the calculation task parameters comprising a predicted processing time step and/or a predicted processing batch size; and sends each participating device its calculation task parameters, so that each participating device executes the federated learning task according to those parameters. By having the coordinating device configure calculation task parameters for each participating device, the invention coordinates the computing tasks each participating device has to process locally: the predicted processing time step and/or the predicted processing batch size determine how much computation each device performs. Because different calculation task parameters are configured according to each device's resource information, the differences in the devices' resource conditions are taken into account: more computing work is assigned to participating devices with rich device resources and less to devices with scarce resources. Resource-poor devices can therefore also finish their local model parameter updates quickly, and resource-rich devices do not have to spend time waiting, which improves the overall efficiency of horizontal federated learning. At the same time, every participating device can contribute to model training, so the contribution of each device's local data, including that of resource-poor devices, is used, which further improves the stability of the model. The efficiency and the model performance of the horizontal federated learning system are thus both optimized.
Drawings
FIG. 1 is a schematic diagram of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a first embodiment of a method for optimizing a horizontal federated learning system of the present invention;
FIG. 3 is a block diagram of a preferred embodiment of the horizontal federated learning system optimization apparatus of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in FIG. 1, FIG. 1 is a schematic structural diagram of the device in a hardware operating environment according to an embodiment of the present invention.
It should be noted that, in the embodiment of the present invention, the horizontal federated learning system optimization device may be a smart phone, a personal computer, a server, or the like, which is not limited herein.
As shown in FIG. 1, the horizontal federated learning system optimization device may include: a processor 1001, such as a CPU, a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the configuration shown in FIG. 1 does not constitute a limitation on the horizontal federated learning system optimization device, which may include more or fewer components than shown, a combination of some components, or a different arrangement of components.
As shown in FIG. 1, the memory 1005, which is a computer storage medium, may include an operating system, a network communication module, a user interface module, and a horizontal federated learning system optimization program. The operating system is a program that manages and controls the hardware and software resources of the device and supports the running of the horizontal federated learning system optimization program and other software or programs.
When the horizontal federated learning system optimization device is the coordinating device participating in horizontal federated learning, in the device shown in FIG. 1, the user interface 1003 is mainly used for data communication with a client; the network interface 1004 is mainly used for establishing a communication connection with the participating devices participating in horizontal federated learning; and the processor 1001 may be configured to invoke the horizontal federated learning system optimization program stored in the memory 1005 and perform the following operations:
acquiring device resource information of each participating device participating in horizontal federated learning;
respectively configuring, according to the device resource information, the calculation task parameters used by each participating device during federated learning model training, wherein the calculation task parameters comprise a predicted processing time step and/or a predicted processing batch size;
and correspondingly sending the calculation task parameters to each participating device, so that each participating device executes the federated learning task according to its own calculation task parameters.
Further, the step of respectively configuring, according to the device resource information, the calculation task parameters used by each participating device during federated learning model training includes:
classifying the participating devices according to the device resource information, and determining the resource category to which each participating device belongs;
and respectively configuring, according to the resource category to which each participating device belongs, the calculation task parameters used by that participating device during federated learning model training.
Further, the step of respectively configuring, according to the resource category to which each participating device belongs, the calculation task parameters used by that participating device during federated learning model training includes:
respectively determining candidate task parameters corresponding to each participating device according to the resource category to which it belongs;
respectively determining the predicted processing duration corresponding to each participating device based on the candidate task parameters, and detecting whether the predicted processing durations satisfy a preset duration consistency condition;
and if the predicted processing durations satisfy the preset duration consistency condition, taking the candidate task parameters of each participating device as that device's calculation task parameters.
Further, when the model to be trained in the horizontal federated learning is a recurrent neural network model and the calculation task parameter includes a predicted processing time step, after the step of correspondingly sending the calculation task parameter to each of the participating devices, the processor 1001 may be configured to invoke the horizontal federated learning system optimization program stored in the memory 1005 and further perform the following operations:
configuring a time step selection strategy corresponding to each participating device according to the predicted processing time step corresponding to each participating device;
and correspondingly sending the time step selection strategy to each participating device, so that each participating device selects sequence selection data from its own sequence data according to its time step selection strategy and executes the federated learning task according to the sequence selection data, wherein the time step of the sequence selection data is less than or equal to that device's predicted processing time step.
Further, when the calculation task parameter includes a predicted processing batch size, after the step of correspondingly sending the calculation task parameter to each of the participating devices, the processor 1001 may be configured to invoke the horizontal federated learning system optimization program stored in the memory 1005 and further perform the following operations:
configuring the learning rate corresponding to each participating device according to the predicted processing batch size corresponding to each participating device;
and correspondingly sending the learning rate to each participating device, so that each participating device executes the federated learning task according to its own learning rate and the predicted processing batch size received from the coordinating device.
Further, the step of correspondingly sending the calculation task parameters to each of the participating devices so that each of the participating devices executes the federated learning task according to its own calculation task parameters includes:
and correspondingly sending the calculation task parameters to each participating device, and sending each participating device the estimated time of the current round of global model update, so that when performing local model training according to the calculation task parameters, each participating device adjusts the number of local training iterations according to the estimated time.
Further, the step of acquiring the device resource information of each participating device participating in horizontal federated learning includes:
receiving the device resource information sent by each participating device participating in horizontal federated learning, wherein the device resource information comprises at least one of power resource information, computing resource information and communication resource information.
Based on the above structure, various embodiments of the optimization method for the horizontal federated learning system are provided.
Referring to FIG. 2, FIG. 2 is a flowchart illustrating a first embodiment of the optimization method for a horizontal federated learning system according to the present invention. It should be noted that, although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in an order different from that shown or described herein.
In this embodiment, the optimization method for the horizontal federated learning system is applied to the coordinating device participating in horizontal federated learning; the coordinating device and each participating device may be a smart phone, a personal computer, a server, or another device. In this embodiment, the optimization method for the horizontal federated learning system includes:
step S10, acquiring equipment resource information of each participating equipment participating in horizontal federal learning;
in this embodiment, the coordinating device and each participating device may establish a communication connection in advance through inquiry handshake authentication and identity authentication, and determine a model to be trained for the federal learning, such as a neural network model or other machine learning models. The model to be trained with the same or similar structure can be constructed locally by each participating device, or the model to be trained can be constructed by the coordinating device and then sent to each participating device. Each participating device has local training data for training the model to be trained.
In horizontal federated learning, the coordinating device and the participating devices cooperate to perform multiple rounds of global model update on the model to be trained, where a model update means updating the model parameters of the model to be trained, such as the connection weights between neurons in a neural network, until a model that meets the quality requirements is finally obtained. In one round of global model update, each participating device performs local training on its local model to be trained using its own local training data to obtain a local model parameter update, which may be gradient information used to update the model parameters or the locally updated model parameters themselves; each participating device sends its local model parameter update to the coordinating device; the coordinating device fuses the local model parameter updates, for example by weighted averaging, to obtain a global model parameter update and sends it to each participating device; and each participating device then uses the global model parameter update to update the model parameters of its local model to be trained, thereby completing one round of global model update. After each round of global model update, the model parameters of the models to be trained on the participating devices are synchronized.
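A minimal Python sketch of the weighted-average fusion described above, assuming the layer parameters are NumPy arrays and the local sample counts are used as fusion weights (both are assumptions for illustration, not requirements of the method):

```python
import numpy as np

def aggregate_local_updates(local_updates, sample_counts):
    """Fuse local model parameter updates into a global update by weighted average.

    local_updates: one entry per participating device, each a list of np.ndarray
                   (gradients or updated parameters, one array per model layer).
    sample_counts: number of local training samples per device, used as weights.
    """
    total = float(sum(sample_counts))
    num_layers = len(local_updates[0])
    global_update = []
    for layer in range(num_layers):
        fused = sum(update[layer] * (count / total)
                    for update, count in zip(local_updates, sample_counts))
        global_update.append(fused)
    return global_update
```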
During the federated learning process, the coordinating device may obtain the device resource information of each participating device. The device resource information may be resource information of the participating device that relates to computing efficiency and communication efficiency, for example computing resource information, power resource information and communication resource information: the computing resources may be expressed by the number of CPUs and GPUs the participating device has, the power resources by how long the participating device can continue to run, and the communication resources by the communication rate of the participating device. The coordinating device may send a device resource query request to each participating device, and each participating device uploads its current device resource information to the coordinating device after receiving the request.
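The device resource information can be pictured as a simple structured report; in the sketch below every field name and value is an assumption made for illustration:

```python
import os

def report_device_resources():
    """Illustrative resource report a participating device might upload on request."""
    return {
        "cpu_count": os.cpu_count(),   # computing resource: number of CPU cores
        "gpu_count": 0,                # computing resource: number of GPUs (assumed)
        "runtime_minutes": 180,        # power resource: how long the device can keep running
        "uplink_mbps": 20.0,           # communication resource: communication rate
    }
```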
Step S20, respectively configuring, according to the device resource information, the calculation task parameters used by each participating device during federated learning model training, wherein the calculation task parameters comprise a predicted processing time step and/or a predicted processing batch size;
the coordination device may obtain device resource information of each participating device before performing the first global model update, to configure a calculation task parameter in a subsequent global model update for each participating device; or before entering a certain global model update or each global model update, acquiring the device resource information of each participating device to configure the computing task parameters in the current global model update for each participating device. That is, the coordinating device may configure the calculation task parameters of each participating device that participate in the federal learning model training for each participating device in the process of the federal learning model training by acquiring the device resource information of each participating device.
The calculation task parameters include a predicted processing time step and/or a predicted processing batch size; that is, they include the predicted processing time step, or the predicted processing batch size, or both. The time step refers to the number of time steps, a concept from recurrent neural network models that applies to sequence data, and the predicted processing time step is the number of time steps the participating device is expected to process. The batch size (mini-batch size) is the size of the data batch used for each model update, and the predicted processing batch size is the data batch size the participating device is expected to use when performing a local model update.
After obtaining the device resource information of each participating device, the coordinating device configures the calculation task parameters for each participating device according to that information. Specifically, the richer the computing resources of a participating device, the higher its computing efficiency; the richer its power resources, the longer it can keep participating in federated learning; and the richer its communication resources (the larger the bandwidth and the shorter the latency), the faster it can transmit data. For the same computing task, a device with richer resources therefore takes less time. The principle for configuring the calculation task parameters is to assign more computing work to participating devices with rich device resources and less to devices with scarce resources, so that the processing time of all participating devices is as close to equal as possible. The amount of computing work can be quantified by the predicted processing time step or the predicted processing batch size: a larger predicted processing time step or a larger predicted processing batch size means more data for the participating device to process, and hence more computing work. It should be noted that, because a participating device uploads gradient information or model parameter information, the size of the uploaded data does not change with the batch size or the time step; if a participating device has rich communication resources, its data transmission is fast, the time spent uploading is relatively short, and more time can be spent on local model training, so the coordinating device can assign more computing work, i.e., a larger predicted processing time step or predicted processing batch size, to participating devices with rich communication resources.
There are various ways for the coordinating device to configure the calculation task parameters based on the device resource information of the participating devices. For example, a correspondence between device resource information and calculation task parameters may be preset: the range of the device resource information is divided into several segments by value, each segment representing a different degree of resource richness (for example, when the device resource information is the number of CPUs, the CPU count can be divided into several segments, with more CPUs indicating richer computing resources); a calculation task parameter is preset for each segment, i.e., the correspondence, with richer segments mapped to calculation task parameters that define more computing work; and the coordinating device determines which segment a participating device's resource information falls into and configures the corresponding calculation task parameter for that device.
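A sketch of the segment-based correspondence described above, with invented segment boundaries and parameter values; only the principle that richer segments map to more computing work comes from the text:

```python
def configure_task_params_by_cpu(cpu_count):
    """Map a device's CPU count to calculation task parameters via preset segments.

    Richer segments (more CPUs) are mapped to larger predicted processing
    time steps and batch sizes; the concrete numbers are placeholders.
    """
    # (upper bound of CPU-count segment, predicted time step, predicted batch size)
    segments = [
        (2, 8, 32),
        (4, 16, 64),
        (8, 24, 128),
        (float("inf"), 32, 256),
    ]
    for upper_bound, time_steps, batch_size in segments:
        if cpu_count <= upper_bound:
            return {"predicted_time_steps": time_steps,
                    "predicted_batch_size": batch_size}
```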
Step S30, correspondingly sending the calculation task parameters to each of the participating devices, so that each of the participating devices executes the federated learning task according to its own calculation task parameters.
The coordinating device sends each participating device its calculation task parameters. After receiving the calculation task parameters, a participating device uses them to execute the federated learning task. Specifically, when the coordinating device sends calculation task parameters to be used in every subsequent global model update, the participating device participates in each subsequent global model update based on those parameters to complete the federated learning task; when the coordinating device sends calculation task parameters to be used in one global model update, the participating device participates in that update based on those parameters.
Take as an example a participating device that participates in one global model update based on the calculation task parameters:
when the model to be trained is a recurrent neural network and the calculation task parameter is the predicted processing time step length, the participating equipment selects data of the predicted processing time step length from the sequence data used for model training locally (the selected data is called sequence selection data); specifically, for each piece of sequence data composed of a plurality of pieces of time step data, the participating device selects a part of the time step data from the sequence data as sequence selection data, and the time step of the part of the time step data may be less than or equal to the predicted processing time step, wherein when the time step of the sequence data is less than or equal to the predicted processing time step, the sequence data is not selected, and the sequence data is directly used as the sequence selection data; for example, for a piece of sequence data with 32 time step data, the expected processing time step is 15, and the participating device may select 15 time step data from the piece of sequence data as sequence selection data; the selection modes are various, for example, one mode is selected at intervals, or 15 modes are selected at random, and different selection modes can be set according to specific model application scenes; in addition, the selection modes adopted when the global model participates in updating can be different, so that most local data can be used for participating in model training; selecting data based on each sequence of which the selected time step is less than or equal to the predicted processing time step, performing local model training on the model to be trained for one or more times to obtain local model parameter updating, wherein the process of performing local model training is consistent with the general training process of the recurrent neural network, and detailed description is omitted; updating and sending the local model parameters to the coordination equipment, fusing the local model parameter updates of each participating equipment by the coordination equipment to obtain global model parameter updates, and sending the global model parameter updates to each participating equipment; and after receiving the global model parameter update, the participating equipment updates the model parameters of the local model to be trained by adopting the global model parameter update.
When the model to be trained is a neural network model or another machine learning model and the calculation task parameter is the predicted processing batch size, the participating device may divide its local training data into multiple batches, where the size of each batch, i.e., the number of training samples it contains, is less than or equal to the predicted processing batch size received from the coordinating device. For example, if the predicted processing batch size received by a participating device is 100 and the device has 1000 pieces of training data locally, it may divide the local training data into 10 batches. After batching the local training data according to the predicted processing batch size, the participating device uses the data of one batch to perform a local model update when it participates in a global model update.
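Batching by the predicted processing batch size amounts to a simple split of the local training data, as in this sketch, which reproduces the 10-batch example above:

```python
def make_batches(training_data, predicted_batch_size):
    """Split local training data into batches no larger than the predicted batch size."""
    return [training_data[i:i + predicted_batch_size]
            for i in range(0, len(training_data), predicted_batch_size)]

# 1000 local samples with a predicted processing batch size of 100 -> 10 batches
assert len(make_batches(list(range(1000)), 100)) == 10
```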
When the model to be trained is a recurrent neural network and the calculation task parameters are the predicted processing time step and the predicted processing batch size, the participating device combines the operations of the two cases above: it performs time-step selection on each piece of training data, batches the resulting sequence selection data, and uses one batch of sequence selection data each time it participates in a global model update.
It can be understood that, because the neural network nodes corresponding to different time steps in a recurrent neural network share their weights, the input data of a recurrent neural network can have variable length, i.e., the time steps of the input data can differ; this is what allows each participating device to perform local model training based on its own predicted processing time step.
In this embodiment, the coordinating device obtains the device resource information of each participating device; configures calculation task parameters for each participating device according to its device resource information, the calculation task parameters comprising a predicted processing time step and/or a predicted processing batch size; and sends each participating device its calculation task parameters, so that each participating device executes the federated learning task according to those parameters. By having the coordinating device configure calculation task parameters for each participating device, this embodiment coordinates the computing tasks each participating device has to process locally: the predicted processing time step and/or the predicted processing batch size determine how much computation each device performs. Because different calculation task parameters are configured according to each device's resource information, the differences in the devices' resource conditions are taken into account: more computing work is assigned to participating devices with rich device resources and less to devices with scarce resources. Resource-poor devices can therefore also finish their local model parameter updates quickly, and resource-rich devices do not have to spend time waiting, which improves the overall efficiency of horizontal federated learning. At the same time, every participating device can contribute to model training, so the contribution of each device's local data, including that of resource-poor devices, is used, which further improves the stability of the model. The efficiency and the model performance of the horizontal federated learning system are thus both optimized.
Further, based on the first embodiment, a second embodiment of the optimization method for the horizontal federated learning system of the present invention is provided. In this embodiment, the step S20 includes:
step S201, classifying each participating device according to the resource information of each device, and determining the resource category to which each participating device belongs;
in this embodiment, a feasible way for the coordinating device to configure the computing task parameters according to the device resource information of the participating devices is provided. Specifically, the coordinating device classifies each participating device according to the device resource information of each participating device, and determines the resource category of each participating device. The coordination device may arrange the device resource information according to the value size, may set the number of categories to be classified in advance, and then equally divide the interval formed by the minimum value and the maximum value in the sequence to obtain several divided segments of the preset number of categories, so that each divided segment is a category, and it is determined which divided segment the value corresponding to each participating device resource information belongs to. For example, when the device resource information includes computing resource information expressed in terms of the number of CPUs of the participating devices, the number of CPUs of the respective participating devices may be arranged. It can be understood that, compared with a mode of presetting each resource category, in this embodiment, the resource category is determined according to the resource information of each device, so that the resource category is divided more according to the actual resource situation of the participating device, and the resource situation that the participating device is not invariable in the federal learning process can be adapted.
When the device resource information includes data on multiple types of device resources, the data of the various resource types can be normalized so that they can be computed with and compared against each other. A common normalization method may be used, which is not described in detail here. For example, when the device resources include computing resources, power resources and communication resources, normalizing the computing resource data, the power resource data and the communication resource data makes the three comparable. For the device resource information of each participating device, the coordinating device obtains the normalized data of its various device resources, presets a weight for each resource type according to how much that type affects the device's local computing efficiency, and then takes a weighted average of the normalized data to obtain a value that evaluates the overall resource richness of the participating device; the coordinating device then performs the sorting, dividing and classifying operations above based on the value calculated for each participating device. Normalizing each participating device's resource information into an overall resource richness value quantifies complex device resource information, which makes it more convenient and accurate to configure calculation task parameters for each device; and by quantifying the complex resource information, the participating devices can be classified by resource, so that computing tasks can be configured for them more quickly.
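The normalization and weighted averaging could be sketched as follows; min-max normalization and the structure of the inputs are assumptions, the text only requires that heterogeneous resource data be normalized and combined with preset weights into one value per device:

```python
def resource_richness(devices, weights):
    """Combine heterogeneous resource metrics into one richness score per device.

    devices: device id -> {metric name: raw value}
    weights: metric name -> preset weight reflecting its influence on local efficiency
    """
    metrics = list(weights)
    low = {m: min(d[m] for d in devices.values()) for m in metrics}
    high = {m: max(d[m] for d in devices.values()) for m in metrics}
    scores = {}
    for device_id, raw in devices.items():
        normalized = {m: (raw[m] - low[m]) / (high[m] - low[m]) if high[m] > low[m] else 1.0
                      for m in metrics}
        scores[device_id] = sum(weights[m] * normalized[m] for m in metrics)
    return scores
```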
Step S202, respectively configuring, according to the resource category to which each participating device belongs, the calculation task parameters used by that participating device during federated learning model training.
After determining the resource category to which each participating device belongs, the coordinating device configures the calculation task parameters for each participating device according to that category. Specifically, a maximum predicted processing time step may be preset, and the resource categories numbered 1, 2, 3, ... from low to high resource richness; the predicted processing time step for each category is then calculated from the maximum: the maximum predicted processing time step is divided by the number of resource categories to obtain a minimum time step, and each category's number is multiplied by the minimum time step to obtain that category's predicted processing time step. For example, if the maximum predicted processing time step is 32 and there are 4 resource categories, then from the lowest to the highest resource richness, the predicted processing time steps of the 4 categories are 8, 16, 24 and 32. Similarly, a maximum predicted processing batch size may be preset, and the predicted processing batch size for each category calculated from it.
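The per-category calculation is simple arithmetic; the following sketch reproduces the worked example (maximum predicted processing time step 32, 4 categories):

```python
def time_steps_per_category(max_time_steps, num_categories):
    """Predicted processing time step for each resource category (index 0 = poorest)."""
    minimum = max_time_steps // num_categories
    return [minimum * k for k in range(1, num_categories + 1)]

# The worked example: maximum time step 32 and 4 categories -> [8, 16, 24, 32]
assert time_steps_per_category(32, 4) == [8, 16, 24, 32]
```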
Further, the step S202 includes:
step S2021, determining candidate task parameters corresponding to each participating device according to the resource type to which each participating device belongs;
Further, the coordinating device may determine the candidate task parameters of each participating device according to the resource category to which it belongs. Specifically, the calculation task parameter corresponding to each resource category may be calculated in a manner similar to the way the predicted processing time step of each category is calculated from the maximum predicted processing time step above; the calculation task parameter of each participating device is then determined from its resource category and taken as that device's candidate task parameter.
Step S2022, respectively determining the predicted processing duration corresponding to each participating device based on the candidate task parameters, and detecting whether the predicted processing durations satisfy a preset duration consistency condition;
the coordination device may determine the expected processing time corresponding to each participating device based on the candidate task parameter corresponding to each participating device, that is, determine the time that each participating device needs to spend in executing the federal learning task according to the respective candidate task parameter, specifically, the time that the participating device needs to perform local model training and upload model parameter updates when participating devices participate in the next global model update according to the candidate task parameter.
Specifically, the coordinating device may estimate, from each participating device's resource information, the time that device needs to perform local model training and upload its model parameter update according to the candidate task parameters. The unit time needed to process one time step or one unit of batch size with one unit of resource can be set in advance from tests or experience; the predicted processing duration of a participating device can then be calculated from this unit time, the resources the device actually has, and the predicted processing time step or predicted processing batch size in its candidate task parameters, the whole calculation being analogous to multiplying a unit cost by a quantity to obtain a total. For example, suppose it is set in advance, based on experience, that a participating device with 1 CPU needs a duration x for local model training and a duration y for uploading the model parameter update per time step of data processed; then a participating device that has 3 CPUs and needs to process 10 time steps has a predicted processing duration of (10x + 10y)/3.
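The duration estimate can be sketched as below; the per-time-step unit costs are placeholders that would in practice be set from tests or experience, and the simple inverse-proportional cost model merely reproduces the (10x + 10y)/3 example above:

```python
def predicted_processing_duration(time_steps, cpu_count,
                                  train_time_per_step=1.0, upload_time_per_step=0.2):
    """Estimate local training plus upload time for one participating device.

    train_time_per_step (x) and upload_time_per_step (y) are per-time-step unit
    costs for a single CPU; the defaults are placeholders. With cpu_count=3 and
    time_steps=10 this reproduces (10x + 10y) / 3.
    """
    return time_steps * (train_time_per_step + upload_time_per_step) / cpu_count
```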
After calculating the predicted processing duration of each participating device, the coordinating device can detect whether these durations satisfy the preset duration consistency condition. The preset duration consistency condition may be set in advance on the principle that the predicted processing durations of the participating devices should be as consistent as possible. For example, it may require that the difference between the maximum and minimum predicted processing durations be smaller than a set threshold: when the coordinating device detects that this difference is smaller than the threshold, the predicted processing durations of the participating devices satisfy the preset duration consistency condition, i.e., they are roughly the same; when the difference is not smaller than the threshold, the durations do not satisfy the condition, i.e., they differ considerably.
Step S2023, if the predicted processing durations satisfy the preset duration consistency condition, taking the candidate task parameters of each participating device as that device's calculation task parameters.
If the coordinating device detects that the predicted processing durations satisfy the preset duration consistency condition, it can take the candidate task parameters of each participating device as that device's final calculation task parameters.
If the coordinating device detects that the predicted processing durations do not satisfy the preset duration consistency condition, the durations of the participating devices differ considerably. If each participating device then executed the subsequent federated learning task based on its candidate task parameters, there could be large differences in processing time and some devices would have to wait for others: during a global model update, one group of participating devices would finish local model training and upload their model parameter updates quickly, while another group would take much longer, so the coordinating device and the faster devices would have to wait until the slower devices had uploaded their updates before the coordinating device could fuse them into the global model parameter update and complete the round. Therefore, when the coordinating device detects that the predicted processing durations do not satisfy the condition, it can adjust the candidate task parameters: for example, it can reduce the candidate task parameters of the participating device with the largest predicted processing duration, by reducing its predicted processing time step, its predicted processing batch size, or both. After adjusting the candidate task parameters, the coordinating device estimates the predicted processing duration of each participating device again based on the adjusted parameters and detects whether the durations satisfy the preset duration consistency condition, repeating this process until the condition is satisfied.
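The check-and-adjust loop could be sketched as follows; shrinking only the slowest device's predicted processing time step by a fixed step is an illustrative strategy, the text only requires repeating the adjustment until the preset duration consistency condition is met:

```python
def balance_candidate_params(candidates, estimate_duration, threshold, step=1):
    """Shrink the slowest device's candidate time step until durations are consistent.

    candidates: device id -> candidate predicted processing time step
    estimate_duration(device_id, time_steps): predicted processing duration
    threshold: maximum allowed gap between the largest and smallest duration
    """
    while True:
        durations = {d: estimate_duration(d, s) for d, s in candidates.items()}
        if max(durations.values()) - min(durations.values()) < threshold:
            return candidates  # preset duration consistency condition satisfied
        slowest = max(durations, key=durations.get)
        if candidates[slowest] <= step:
            return candidates  # cannot shrink further; accept current parameters
        candidates[slowest] -= step  # reduce the slowest device's computing task
```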
In this embodiment, each participating device is configured with calculation task parameters that make the devices' predicted processing durations satisfy the preset duration consistency condition, so that the predicted processing durations are as consistent as possible. As a result, no participating device has to wait long, or at all, and even a device with a poor resource situation can keep pace with devices that have rich resources, so every participating device can take part in horizontal federated learning. This improves the overall efficiency of horizontal federated learning while also making the contribution of every participating device's data usable: even a device with a poor resource situation can contribute the training data it holds.
Further, based on the first and second embodiments, a third embodiment of the optimization method of the horizontal federal learning system of the present invention is provided. In this embodiment, after the step S30, the method further includes:
step S40, configuring the time step selection strategy corresponding to each participating device according to the predicted processing time step corresponding to each participating device;
Further, when the model to be trained in the horizontal federal learning is a recurrent neural network model and the calculation task parameters include the predicted processing time step, the coordinating device may configure a time step selection strategy for each participating device according to the predicted processing time step corresponding to that device. The recurrent neural network model referred to in the embodiments of the present invention may be a plain RNN, a deep RNN, an LSTM (Long Short-Term Memory network), a GRU (Gated Recurrent Unit), an IndRNN (Independently Recurrent Neural Network), or the like.
The time step selection strategy is a strategy for selecting part of the time step data from the sequence data. The coordinating device can configure the time step selection strategies for the participating devices in various ways; the guiding principle is that the positions of the time step data selected by the different participating devices should be as complementary as possible within the sequence data. For example, if the predicted processing time step of participating devices A and B is 15 for each, the coordinating device may configure participating device A to select the odd-numbered time steps of the sequence data and participating device B to select the even-numbered time steps. By configuring different, and ideally complementary, time step selection strategies for the participating devices, the sequence data used for local model training is distributed differently over the time steps on each device, so the model to be trained can learn features from more varied sequence data. This improves the generalization ability of the model and gives it better predictive power on new samples of various forms. A sketch of such a complementary assignment follows.
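As an illustration only, one simple way to obtain complementary strategies is to assign each participating device a stride equal to the number of devices and a distinct offset; the function assign_time_step_strategies and the offset/stride representation are assumptions made for this sketch, not a scheme defined in the patent.

```python
def assign_time_step_strategies(device_ids):
    # Each strategy is a pair (offset, stride): the device selects time step
    # indices offset, offset + stride, offset + 2 * stride, ... from its
    # sequence data, so the selections of the devices are complementary.
    n = len(device_ids)
    return {dev: {"offset": i, "stride": n} for i, dev in enumerate(device_ids)}

# For two devices A and B this yields complementary selections: A receives the
# even-indexed steps 0, 2, 4, ... and B the odd-indexed steps 1, 3, 5, ...
# (the 0-based counterpart of the odd/even serial numbers in the example above).
strategies = assign_time_step_strategies(["A", "B"])
```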
Step S50, correspondingly sending the time step selection policy to each of the participating devices, so that each of the participating devices selects sequence selection data from the respective sequence data according to the respective time step selection policy, and executes a federal learning task according to the sequence selection data, wherein a time step of the sequence selection data is less than or equal to the respective predicted processing time step of each of the participating devices.
The coordinating device sends each participating device its time step selection strategy. It should be noted that the coordinating device may send the time step selection strategy together with the predicted processing time step, or send them separately. After receiving the predicted processing time step and the time step selection strategy, a participating device selects sequence selection data from its local sequence data according to the strategy, where the time step of the sequence selection data is less than or equal to the predicted processing time step. A participant-side sketch of this selection is shown below.
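A participant-side sketch under the same assumed offset/stride representation might look as follows; select_sequence_data is a hypothetical helper, not an API defined in the patent.

```python
def select_sequence_data(sequence, strategy, predicted_time_steps):
    """sequence: list of per-time-step feature vectors for one training sample."""
    # Pick the time steps designated by the received strategy ...
    picked = sequence[strategy["offset"]::strategy["stride"]]
    # ... and keep at most `predicted_time_steps` of them, so the time step of
    # the sequence selection data does not exceed the predicted processing time step.
    return picked[:predicted_time_steps]

# Example: a 40-step sequence with strategy {"offset": 1, "stride": 2} and a
# predicted processing time step of 15 yields steps 1, 3, 5, ..., capped at 15 entries.
```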
Further, in an embodiment, when the calculation task parameters include the predicted processing batch size, after the step S30, the method further includes:
step S60, configuring the learning rate corresponding to each participating device according to the predicted processing batch size corresponding to each participating device;
Further, when the calculation task parameters include the predicted processing batch size, the coordinating device may configure a learning rate for each participating device according to that device's predicted processing batch size. The learning rate is a hyper-parameter of the model training process. The coordinating device can configure the learning rates in various ways; the guiding principle is that the learning rate should be proportional to the predicted processing batch size. For example, the coordinating device may set a reference learning rate and a corresponding reference batch size: a participating device whose predicted processing batch size is smaller than the reference batch size is configured with a learning rate smaller than the reference learning rate, and a device whose predicted processing batch size is larger is configured with a learning rate larger than the reference learning rate. A sketch of such a proportional rule follows.
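A minimal sketch of such a proportional rule, assuming an illustrative reference learning rate of 0.01 at a reference batch size of 64 (both values are arbitrary assumptions, as is the function name configure_learning_rate):

```python
def configure_learning_rate(predicted_batch_size,
                            reference_lr=0.01,
                            reference_batch_size=64):
    # Linear scaling around a reference point: larger predicted batch sizes get
    # proportionally larger learning rates, smaller ones get smaller rates.
    return reference_lr * predicted_batch_size / reference_batch_size

# Devices with predicted batch sizes 32 and 128 would receive 0.005 and 0.02.
learning_rates = {dev: configure_learning_rate(bs)
                  for dev, bs in {"A": 32, "B": 128}.items()}
```

Any monotone mapping from predicted batch size to learning rate would fit the principle described above; linear scaling is simply one common choice.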
Step S70, correspondingly sending the learning rate to each of the participating devices, so that each of the participating devices executes a federal learning task according to the respective learning rate and the expected processing batch size received from the coordinating device.
After configuring the learning rate for each participating device, the coordinating device sends each participating device its learning rate. It should be noted that the coordinating device may send the learning rate together with the predicted processing batch size, or send them separately. After receiving the learning rate and the predicted processing batch size, the participating device executes the federal learning task accordingly: during local model training it uses data batches of the predicted processing batch size, and when updating the model parameters it applies the configured learning rate.
In this embodiment, the coordinating device configures a learning rate for each participating device based on that device's predicted processing batch size. This gives the coordinating device overall control of the model convergence speed on each participating device: by setting different learning rates, the convergence speeds of the participating devices tend toward consistency, so the model to be trained converges better during the federal learning process.
It should be noted that, when the calculation task parameters include both the predicted processing time step and the predicted processing batch size, the above schemes for configuring the time step selection strategy and the learning rate of each participating device may be combined. Specifically, the coordinating device configures the time step selection strategy of each participating device according to its predicted processing time step and configures its learning rate according to its predicted processing batch size; the coordinating device then sends each participating device its time step selection strategy, learning rate and calculation task parameters; the participating device selects sequence selection data from its local sequence data according to the time step selection strategy, performs local model training with batches of the predicted processing batch size drawn from the sequence selection data, and updates the model with the received learning rate during training.
Further, the step S20 includes:
step S203, correspondingly sending the calculation task parameters to each of the participating devices, and sending the predicted time length updated by the global model in the current round to each of the participating devices, so that each of the participating devices adjusts the number of times of local model training according to the predicted time length when performing local model training according to the calculation task parameters.
The coordinating device sends each participating device its calculation task parameters and, at the same time, the estimated duration of the current round of global model update. The estimated duration of the current round may be determined from the predicted processing durations of the participating devices, for example by taking the maximum of the predicted processing durations. After receiving the calculation task parameters and the estimated duration, a participating device performs local model training according to the calculation task parameters and can adjust the number of local training passes according to the estimated duration. Specifically, after one pass of local model training the participating device measures the time it took (duration 1); if the estimated duration minus duration 1 is greater than duration 1, it performs another pass, that is, it increases the number of local training passes by one. It then measures the time of that pass (duration 2); if the estimated duration minus duration 1 and duration 2 is greater than duration 2, it trains again, and so on. When the remaining time is less than the time spent on the most recent pass, the device stops training and uploads the local model parameter update obtained from the training to the coordinating device. In other words, the participating device adds one more pass of local training whenever it determines that the remaining time is sufficient. A sketch of this time-budgeted loop is shown below.
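The following is a minimal sketch of this time-budgeted loop; train_one_local_round and upload_update are hypothetical placeholders for the device's local training pass and parameter upload, and are not defined in the patent.

```python
import time

def local_training_with_budget(model, data, estimated_round_duration,
                               train_one_local_round, upload_update):
    """Train locally for as many passes as the estimated round duration allows."""
    remaining = estimated_round_duration
    while True:
        start = time.monotonic()
        train_one_local_round(model, data)      # one pass of local model training
        last_cost = time.monotonic() - start
        remaining -= last_cost
        # Stop once the remaining budget is no longer larger than the cost of
        # the most recent pass; otherwise train one more time.
        if remaining <= last_cost:
            break
    upload_update(model)                        # upload the local model parameter update
```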
In this embodiment, the coordinating device sends the estimated duration of the current round of global model update to each participating device, so that a participating device whose local model training turns out to be fast can increase its number of local training passes instead of spending that time waiting for other participating devices.
Further, the step S10 includes:
step S101, receiving device resource information sent by each participating device participating in horizontal federal learning, wherein the device resource information at least comprises one or more of electric quantity resource information, calculation resource information and communication resource information.
Further, each participating device may actively upload its own device resource information to the coordinating device, and the coordinating device receives the device resource information uploaded by each participating device. The device resource information may include at least one or more of power resource information, computing resource information and communication resource information. Specifically, computing resources may be represented by the number of CPUs and GPUs the participating device has, power resources by how long the device can continue to operate, and communication resources by the device's communication rate. A sketch of such a resource report follows.
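Purely as an illustration of what such a report might contain, the following sketch uses assumed field names (num_cpus, num_gpus, remaining_runtime_hours, communication_rate_mbps) that mirror the three kinds of resource information described above; they are not field names specified by the patent.

```python
from dataclasses import dataclass, asdict

@dataclass
class DeviceResourceInfo:
    num_cpus: int                   # computing resources: number of CPUs
    num_gpus: int                   # computing resources: number of GPUs
    remaining_runtime_hours: float  # power resources: how long the device can keep running
    communication_rate_mbps: float  # communication resources: link rate

# A participating device would build such a report and upload it to the
# coordinating device, e.g. serialized as a plain dictionary.
report = asdict(DeviceResourceInfo(num_cpus=8, num_gpus=1,
                                   remaining_runtime_hours=12.0,
                                   communication_rate_mbps=100.0))
```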
Further, in an embodiment, each participating device may be a remote sensing satellite holding different sequence image data, and the remote sensing satellites perform horizontal federal learning with their respective image data to train an RNN for a weather prediction task. The coordinating device may be one of the remote sensing satellites or a base station on the ground. The coordinating device obtains the device resource information of each remote sensing satellite and then configures calculation task parameters for each satellite according to its device resource information, where the calculation task parameters include a predicted processing time step and/or a predicted processing batch size; it then sends each remote sensing satellite its calculation task parameters, so that each satellite executes the federal learning task according to those parameters and the RNN training is completed. After the trained RNN is obtained, each remote sensing satellite can feed its recently captured sequence of remote sensing images into the RNN to predict the coming weather conditions. During RNN training, the coordinating device coordinates the calculation tasks of the remote sensing satellites according to their device resource information, so satellites with abundant computing resources do not have to spend time waiting during training. This improves the overall efficiency of the horizontal federal learning among the satellites and accelerates the deployment of the weather-prediction RNN. In addition, the contribution of every satellite's data to model training, including that of resource-constrained satellites, can be utilized, which further improves the stability of the model and makes the weather predictions produced by the RNN more reliable.
In addition, an embodiment of the present invention further provides a device for optimizing a horizontal federal learning system, where the device is deployed in a coordinating device participating in horizontal federal learning, and with reference to fig. 3, the device includes:
the acquiring module 10 is configured to acquire device resource information of each participating device participating in horizontal federal learning;
a configuration module 20, configured to respectively configure a calculation task parameter in the federated learning model training process corresponding to each of the participating devices according to the device resource information, where the calculation task parameter includes a predicted processing time step and/or a predicted processing batch size;
a sending module 30, configured to correspondingly send the calculation task parameters to each of the participating devices, so that each of the participating devices executes a federal learning task according to the respective calculation task parameters.
Further, the configuration module 20 includes:
the classification unit is used for classifying the participating devices according to the device resource information and determining the resource types of the participating devices;
and the configuration unit is used for respectively configuring the calculation task parameters in the federated learning model training process corresponding to each participating device according to the resource type of each participating device.
Further, the configuration unit includes:
the first determining subunit is configured to determine, according to the resource category to which each of the participating devices belongs, a candidate task parameter corresponding to each of the participating devices respectively;
the detection subunit is configured to determine predicted processing durations corresponding to the participating devices respectively based on the candidate task parameters, and detect whether preset duration consistency conditions are met among the predicted processing durations;
and the second determining subunit is configured to, if the preset duration consistency condition is met between the predicted processing durations, correspondingly use the candidate task parameters of each of the participating devices as the calculation task parameters of each of the participating devices.
Further, when the model to be trained in the horizontal federal learning is a recurrent neural network model and the calculation task parameter includes a predicted processing time step length, the configuration module 20 is further configured to configure a time step selection strategy corresponding to each of the participating devices according to the predicted processing time step length corresponding to each of the participating devices;
the sending module 30 is further configured to correspondingly send the time step selection policy to each of the participating devices, so that each of the participating devices selects sequence selection data from respective sequence data according to the respective time step selection policy, and executes a federal learning task according to the sequence selection data, where a time step of the sequence selection data is less than or equal to the respective predicted processing time step of each of the participating devices.
Further, when the calculation task parameter includes a predicted processing batch size, the configuration module 20 is further configured to configure a learning rate corresponding to each of the participating devices according to the predicted processing batch size corresponding to each of the participating devices;
the sending module 30 is further configured to correspondingly send the learning rate to each of the participating devices, so that each of the participating devices executes a federal learning task according to the respective learning rate and the expected batch size received from the coordinating device.
Further, the sending module 30 is further configured to correspondingly send the calculation task parameters to each of the participating devices, and send the estimated duration of the current round of global model update to each of the participating devices, so that each of the participating devices adjusts the number of local model training passes according to the estimated duration when performing local model training according to the calculation task parameters.
Further, the obtaining module 10 is further configured to receive device resource information sent by each participating device participating in the horizontal federal learning, where the device resource information at least includes one or more of electric quantity resource information, calculation resource information, and communication resource information.
The specific implementation of the optimization device of the horizontal federal learning system is substantially the same as that of the embodiments of the optimization method of the horizontal federal learning system described above, and is not repeated here.
In addition, an embodiment of the present invention further provides a computer-readable storage medium, where a horizontal federal learning system optimization program is stored on the storage medium, and when executed by a processor, the horizontal federal learning system optimization program implements the steps of the horizontal federal learning system optimization method as described above.
For the embodiments of the horizontal federal learning system optimization device and the computer-readable storage medium of the present invention, reference may be made to the embodiments of the horizontal federal learning system optimization method of the present invention, which are not described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A method for optimizing a transverse federated learning system is applied to a coordination device participating in the transverse federated learning, and comprises the following steps:
acquiring equipment resource information of each piece of participating equipment participating in horizontal federal learning;
respectively configuring calculation task parameters in the federated learning model training process corresponding to each participating device according to the device resource information, wherein the calculation task parameters comprise a predicted processing time step and/or a predicted processing batch size;
and correspondingly sending the calculation task parameters to each participating device so that each participating device can execute a federal learning task according to the respective calculation task parameters.
2. The method for optimizing a transverse federated learning system as claimed in claim 1, wherein the step of respectively configuring, according to the device resource information, the calculation task parameters in the federated learning model training process corresponding to each of the participating devices comprises:
classifying the participating devices according to the device resource information, and determining the resource types to which the participating devices belong respectively;
and respectively configuring calculation task parameters in the federated learning model training process corresponding to each participating device according to the resource category to which each participating device belongs.
3. The method for optimizing a transverse federated learning system as claimed in claim 2, wherein the step of configuring the calculation task parameters in the federated learning model training process corresponding to each of the participating devices according to the resource class to which each of the participating devices belongs, respectively, includes:
respectively determining candidate task parameters corresponding to the participating devices according to the resource categories to which the participating devices belong;
respectively determining the expected processing time lengths corresponding to the participating devices based on the candidate task parameters, and detecting whether preset time length consistency conditions are met among the expected processing time lengths;
and if the predicted processing time length meets the preset time length consistency condition, correspondingly taking the candidate task parameters of the participating devices as the calculation task parameters of the participating devices.
4. The method for optimizing a horizontal federal learning system as claimed in claim 1, wherein when the model to be trained in horizontal federal learning is a recurrent neural network model and the calculation task parameters include a predicted processing time step, the step of correspondingly sending the calculation task parameters to each of the participating devices further comprises:
configuring a time step selection strategy corresponding to each participating device according to the predicted processing time step corresponding to each participating device;
and correspondingly sending the time step selection strategy to each participating device so that each participating device can select sequence selection data from respective sequence data according to the respective time step selection strategy, and executing a federal learning task according to the sequence selection data, wherein the time step of the sequence selection data is less than or equal to the respective predicted processing time step of each participating device.
5. The method for optimizing a transverse federated learning system as set forth in claim 1, wherein, when the calculation task parameters include a predicted processing batch size, after the step of sending the calculation task parameters to each of the participating devices, the method further comprises:
configuring the learning rate corresponding to each participating device according to the predicted processing batch size corresponding to each participating device;
and correspondingly sending the learning rate to each piece of participating equipment so that each piece of participating equipment can execute a federal learning task according to the respective learning rate and the predicted processing batch size received from the coordination device.
6. The method for optimizing a transverse federated learning system as defined in claim 1, wherein the step of sending the calculation task parameters to each of the participating devices in a corresponding manner so that each of the participating devices performs a federated learning task based on the respective calculation task parameters includes:
and correspondingly sending the calculation task parameters to each participating device, and sending the estimated duration of the current round of global model update to each participating device, so that each participating device adjusts the number of local model training passes according to the estimated duration when performing local model training according to the calculation task parameters.
7. The method for optimizing a transverse federated learning system of any one of claims 1 to 6, wherein the step of obtaining device resource information for each participating device participating in the transverse federated learning includes:
receiving device resource information sent by each participating device participating in horizontal federal learning, wherein the device resource information at least comprises one or more of electric quantity resource information, calculation resource information and communication resource information.
8. A lateral federated learning system optimization apparatus, wherein the apparatus is deployed in a coordinating device participating in lateral federated learning, the apparatus comprises:
the acquisition module is used for acquiring equipment resource information of each piece of participating equipment participating in horizontal federal learning;
the configuration module is used for respectively configuring calculation task parameters in the federated learning model training process corresponding to the participating devices according to the device resource information, and the calculation task parameters comprise predicted processing time step length and/or predicted processing batch size;
and the sending module is used for correspondingly sending the calculation task parameters to each participating device so that each participating device can execute a federal learning task according to the respective calculation task parameters.
9. A lateral federated learning system optimization apparatus, comprising: a memory, a processor, and a lateral federated learning system optimization program stored on the memory and executable on the processor that, when executed by the processor, performs the steps of the lateral federated learning system optimization method of any of claims 1-7.
10. A computer readable storage medium having stored thereon a lateral federal learning system optimization program which, when executed by a processor, performs the steps of a method for lateral federal learning system optimization as claimed in any of claims 1 to 7.
CN202010359198.8A 2020-04-29 2020-04-29 Method, device and equipment for optimizing horizontal federated learning system and readable storage medium Pending CN111522669A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010359198.8A CN111522669A (en) 2020-04-29 2020-04-29 Method, device and equipment for optimizing horizontal federated learning system and readable storage medium
PCT/CN2021/090825 WO2021219054A1 (en) 2020-04-29 2021-04-29 Transverse federated learning system optimization method, apparatus and device, and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010359198.8A CN111522669A (en) 2020-04-29 2020-04-29 Method, device and equipment for optimizing horizontal federated learning system and readable storage medium

Publications (1)

Publication Number Publication Date
CN111522669A true CN111522669A (en) 2020-08-11

Family

ID=71905586

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010359198.8A Pending CN111522669A (en) 2020-04-29 2020-04-29 Method, device and equipment for optimizing horizontal federated learning system and readable storage medium

Country Status (2)

Country Link
CN (1) CN111522669A (en)
WO (1) WO2021219054A1 (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112182102A (en) * 2020-09-23 2021-01-05 西安纸贵互联网科技有限公司 Method and device for processing data in federal learning, electronic equipment and storage medium
CN112650583A (en) * 2020-12-23 2021-04-13 新智数字科技有限公司 Resource allocation method, device, readable medium and electronic equipment
CN113010305A (en) * 2021-02-08 2021-06-22 北京邮电大学 Federal learning system deployed in edge computing network and learning method thereof
CN113139341A (en) * 2021-04-23 2021-07-20 广东安恒电力科技有限公司 Electric quantity demand prediction method and system based on federal integrated learning
CN113222169A (en) * 2021-03-18 2021-08-06 中国地质大学(北京) Federal machine combined service method and system combining big data analysis feedback
CN113360514A (en) * 2021-07-02 2021-09-07 支付宝(杭州)信息技术有限公司 Method, device and system for jointly updating model
CN113391897A (en) * 2021-06-15 2021-09-14 电子科技大学 Heterogeneous scene-oriented federal learning training acceleration method
WO2021219054A1 (en) * 2020-04-29 2021-11-04 深圳前海微众银行股份有限公司 Transverse federated learning system optimization method, apparatus and device, and readable storage medium
WO2022042528A1 (en) * 2020-08-24 2022-03-03 华为技术有限公司 Intelligent radio access network
CN114328432A (en) * 2021-12-02 2022-04-12 京信数据科技有限公司 Big data federal learning processing method and system
EP3985539A1 (en) * 2020-10-15 2022-04-20 Beijing Realai Technology Co., Ltd System and method for converting machine learning algorithm, and electronic device
CN114626615A (en) * 2022-03-21 2022-06-14 江苏仪化信息技术有限公司 Production process monitoring and management method and system
WO2023104169A1 (en) * 2021-12-10 2023-06-15 华为技术有限公司 Artificial intelligence (ai) model training method and apparatus in wireless network
US11755954B2 (en) 2021-03-11 2023-09-12 International Business Machines Corporation Scheduled federated learning for enhanced search
WO2024113092A1 (en) * 2022-11-28 2024-06-06 华为技术有限公司 Model training method, server, and client device

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114065864B (en) * 2021-11-19 2023-08-11 北京百度网讯科技有限公司 Federal learning method, federal learning device, electronic apparatus, and storage medium
CN114465722B (en) * 2022-01-29 2024-04-02 深圳前海微众银行股份有限公司 Information processing method, apparatus, device, storage medium, and program product
CN114564731B (en) * 2022-02-28 2024-06-04 大连理工大学 Intelligent wind power plant wind condition prediction method based on transverse federal learning
CN114675965B (en) * 2022-03-10 2023-05-02 北京百度网讯科技有限公司 Federal learning method, apparatus, device and medium
CN114841368B (en) * 2022-04-22 2024-05-28 华南理工大学 Client selection optimization method and device for unstable federal learning scene
CN115037669B (en) * 2022-04-27 2023-05-02 东北大学 Cross-domain data transmission method based on federal learning
CN114548429B (en) * 2022-04-27 2022-08-12 蓝象智联(杭州)科技有限公司 Safe and efficient transverse federated neural network model training method
CN115277689B (en) * 2022-04-29 2023-09-22 国网天津市电力公司 Cloud edge network communication optimization method and system based on distributed federal learning
CN116384502B (en) * 2022-09-09 2024-02-20 京信数据科技有限公司 Method, device, equipment and medium for calculating contribution of participant value in federal learning
CN116033028A (en) * 2022-12-29 2023-04-28 江苏奥都智能科技有限公司 Hierarchical federal edge learning scheduling method and system applied to Internet of things
CN116681126B (en) * 2023-06-06 2024-03-12 重庆邮电大学空间通信研究院 Asynchronous weighted federation learning method capable of adapting to waiting time
CN117742928B (en) * 2024-02-20 2024-04-26 蓝象智联(杭州)科技有限公司 Algorithm component execution scheduling method for federal learning

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105528230A (en) * 2015-12-23 2016-04-27 北京奇虎科技有限公司 Method and device for setting configuration parameters
CN108259555B (en) * 2017-11-30 2019-11-12 新华三大数据技术有限公司 The configuration method and device of parameter
CN110633805B (en) * 2019-09-26 2024-04-26 深圳前海微众银行股份有限公司 Longitudinal federal learning system optimization method, device, equipment and readable storage medium
CN110766169A (en) * 2019-10-31 2020-02-07 深圳前海微众银行股份有限公司 Transfer training optimization method and device for reinforcement learning, terminal and storage medium
CN111522669A (en) * 2020-04-29 2020-08-11 深圳前海微众银行股份有限公司 Method, device and equipment for optimizing horizontal federated learning system and readable storage medium

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021219054A1 (en) * 2020-04-29 2021-11-04 深圳前海微众银行股份有限公司 Transverse federated learning system optimization method, apparatus and device, and readable storage medium
WO2022042528A1 (en) * 2020-08-24 2022-03-03 华为技术有限公司 Intelligent radio access network
CN112182102A (en) * 2020-09-23 2021-01-05 西安纸贵互联网科技有限公司 Method and device for processing data in federal learning, electronic equipment and storage medium
EP3985539A1 (en) * 2020-10-15 2022-04-20 Beijing Realai Technology Co., Ltd System and method for converting machine learning algorithm, and electronic device
CN112650583A (en) * 2020-12-23 2021-04-13 新智数字科技有限公司 Resource allocation method, device, readable medium and electronic equipment
CN112650583B (en) * 2020-12-23 2024-07-02 新奥新智科技有限公司 Resource allocation method and device, readable medium and electronic equipment
CN113010305A (en) * 2021-02-08 2021-06-22 北京邮电大学 Federal learning system deployed in edge computing network and learning method thereof
CN113010305B (en) * 2021-02-08 2022-09-23 北京邮电大学 Federal learning system deployed in edge computing network and learning method thereof
US11755954B2 (en) 2021-03-11 2023-09-12 International Business Machines Corporation Scheduled federated learning for enhanced search
CN113222169A (en) * 2021-03-18 2021-08-06 中国地质大学(北京) Federal machine combined service method and system combining big data analysis feedback
CN113222169B (en) * 2021-03-18 2023-06-23 中国地质大学(北京) Federal machine combination service method and system combining big data analysis feedback
CN113139341A (en) * 2021-04-23 2021-07-20 广东安恒电力科技有限公司 Electric quantity demand prediction method and system based on federal integrated learning
CN113391897A (en) * 2021-06-15 2021-09-14 电子科技大学 Heterogeneous scene-oriented federal learning training acceleration method
CN113360514A (en) * 2021-07-02 2021-09-07 支付宝(杭州)信息技术有限公司 Method, device and system for jointly updating model
CN113360514B (en) * 2021-07-02 2022-05-17 支付宝(杭州)信息技术有限公司 Method, device and system for jointly updating model
CN114328432A (en) * 2021-12-02 2022-04-12 京信数据科技有限公司 Big data federal learning processing method and system
WO2023104169A1 (en) * 2021-12-10 2023-06-15 华为技术有限公司 Artificial intelligence (ai) model training method and apparatus in wireless network
CN114626615B (en) * 2022-03-21 2023-02-03 江苏仪化信息技术有限公司 Production process monitoring and management method and system
CN114626615A (en) * 2022-03-21 2022-06-14 江苏仪化信息技术有限公司 Production process monitoring and management method and system
WO2024113092A1 (en) * 2022-11-28 2024-06-06 华为技术有限公司 Model training method, server, and client device

Also Published As

Publication number Publication date
WO2021219054A1 (en) 2021-11-04

Similar Documents

Publication Publication Date Title
CN111522669A (en) Method, device and equipment for optimizing horizontal federated learning system and readable storage medium
CN112181666B (en) Equipment assessment and federal learning importance aggregation method based on edge intelligence
CN110610242B (en) Method and device for setting weights of participants in federal learning
Liu et al. An edge network orchestrator for mobile augmented reality
CN112232293B (en) Image processing model training method, image processing method and related equipment
CN111625361B (en) Joint learning framework based on cooperation of cloud server and IoT (Internet of things) equipment
CN111768008A (en) Federal learning method, device, equipment and storage medium
Dai et al. Asynchronous deep reinforcement learning for data-driven task offloading in MEC-empowered vehicular networks
CN109155012A (en) Assess the accuracy of machine learning model
CN112913274A (en) Process for optimization of ad hoc networks
CN111222647A (en) Federal learning system optimization method, device, equipment and storage medium
Zhang et al. New computing tasks offloading method for MEC based on prospect theory framework
CN113869521A (en) Method, device, computing equipment and storage medium for constructing prediction model
GB2599348A (en) Method and system for autoscaling containers in a cloud-native core network
CN111967598A (en) Neural network compression method, device, equipment and computer readable storage medium
KR20220042928A (en) A method of implementing an self-organizing network for a plurality of access network devices and an electronic device performing the same
CN114356548A (en) Dynamic expansion and placement method and device for edge computing service
CN112084959A (en) Crowd image processing method and device
CN116781788B (en) Service decision method and service decision device
US20240095529A1 (en) Neural Network Optimization Method and Apparatus
CN111275188B (en) Method and device for optimizing horizontal federated learning system and readable storage medium
CN112700003A (en) Network structure search method, device, equipment, storage medium and program product
CN113055423B (en) Policy pushing method, policy execution method, device, equipment and medium
CN116843016A (en) Federal learning method, system and medium based on reinforcement learning under mobile edge computing network
Atan et al. Ai-empowered fast task execution decision for delay-sensitive iot applications in edge computing networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination