CN114492853A - Federated learning method, apparatus, electronic device and storage medium - Google Patents

Federated learning method, apparatus, electronic device and storage medium

Info

Publication number
CN114492853A
Application number
CN202210110135.8A
Authority
CN (China)
Prior art keywords
model data
model
data
serialized
computing node
Legal status
Pending
Other languages
Chinese (zh)
Inventor
杨博
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202210110135.8A
Publication of CN114492853A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning
    • G06N20/20 - Ensemble learning


Abstract

The disclosure provides a federated learning method and apparatus, an electronic device, and a storage medium, relating to the field of artificial intelligence and, in particular, to distributed data processing and deep learning. The scheme is as follows: in response to receiving first serialized model data from a first computing node, deserialize the first serialized model data to obtain first deserialized model data, where the first serialized model data satisfies a predetermined data format; process the first deserialized model data using a first local model of the federated learning task to obtain second deserialized model data; serialize the second deserialized model data to obtain second serialized model data, where the second serialized model data satisfies the predetermined data format; and send the second serialized model data to a second computing node, so that the second computing node processes third deserialized model data corresponding to the second serialized model data using a second local model of the federated learning task.

Description

Federated learning method, apparatus, electronic device and storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence technology and, more particularly, to the fields of distributed data processing and deep learning. In particular, the present disclosure relates to a federated learning method, a federated learning apparatus, an electronic device, and a storage medium.
Background
Federated Learning (FL) is a distributed machine learning technique. It enables multiple devices to collaboratively train a model using their respective local data, without disclosing any device's local data.
Disclosure of Invention
The disclosure provides a federated learning method, a federated learning apparatus, an electronic device, and a storage medium.
According to an aspect of the present disclosure, there is provided a federated learning method, including: in response to receiving first serialized model data from a first computing node, performing deserialization processing on the first serialized model data to obtain first deserialized model data, wherein the first serialized model data satisfies a predetermined data format; processing the first deserialized model data by using a first local model of a federated learning task to obtain second deserialized model data; serializing the second deserialized model data to obtain second serialized model data, wherein the second serialized model data satisfies the predetermined data format; and sending the second serialized model data to a second computing node, so that the second computing node processes third deserialized model data corresponding to the second serialized model data using a second local model of the federated learning task.
According to another aspect of the present disclosure, there is provided a federated learning apparatus, including: a first obtaining module, configured to, in response to receiving first serialized model data from a first computing node, perform deserialization processing on the first serialized model data to obtain first deserialized model data, where the first serialized model data satisfies a predetermined data format; a second obtaining module, configured to process the first deserialized model data by using a first local model of a federated learning task to obtain second deserialized model data; a third obtaining module, configured to perform serialization processing on the second deserialized model data to obtain second serialized model data, where the second serialized model data satisfies the predetermined data format; and a sending module, configured to send the second serialized model data to a second computing node, so that the second computing node processes third deserialized model data corresponding to the second serialized model data by using a second local model of the federated learning task.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method of the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of the present disclosure.
According to another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the method of the present disclosure.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 schematically illustrates an exemplary system architecture to which the federated learning methods and apparatus may be applied, in accordance with an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow chart of a federated learning method in accordance with an embodiment of the present disclosure;
FIG. 3A schematically illustrates an example schematic of a predetermined data format in accordance with an embodiment of the disclosure;
FIG. 3B illustrates an example diagram of a federated learning process in accordance with an embodiment of the present disclosure;
FIG. 4 schematically illustrates a block diagram of a federated learning apparatus according to an embodiment of the present disclosure; and
FIG. 5 schematically illustrates a block diagram of an electronic device adapted to implement the federated learning method in accordance with an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In federated learning based on different deep learning frameworks, each framework has its own data format, so the data structures of different frameworks are incompatible and federated learning across multiple deep learning frameworks is difficult to realize. For example, the output of a computing node that deploys TensorFlow cannot be directly used as input by a computing node that deploys PaddlePaddle (i.e., Paddle).
Therefore, embodiments of the present disclosure provide a federated learning scheme: deserialize first serialized model data that come from a first computing node and are constructed according to a predetermined data format to obtain first deserialized model data; process the first deserialized model data with a first local model to obtain second deserialized model data; serialize the second deserialized model data to obtain second serialized model data satisfying the predetermined data format; and send the second serialized model data to a second computing node, so that the second computing node processes third deserialized model data corresponding to the second serialized model data with a second local model.
Because the serialized model data of all computing nodes follow the same predetermined data format, the data structures of different computing nodes are compatible, which makes federated learning based on multiple deep learning frameworks possible. In addition, because the serialized model data follow a predetermined data format, they are decoupled from any specific deep learning framework, so different computing nodes can carry out federated learning with whichever deep learning framework they are familiar with, reducing the cost of federated learning.
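As a minimal sketch of this idea (an illustration under stated assumptions, not the interface defined by the patent), a framework-neutral record of dtype, shape, and raw bytes is enough to carry a tensor between nodes; both TensorFlow and Paddle tensors can be converted to and from NumPy arrays, so neither node needs the other's framework:

```python
import numpy as np

def to_common(arr: np.ndarray) -> dict:
    """Serialize an array into a hypothetical framework-neutral record."""
    return {"dtype": arr.dtype.str, "shape": arr.shape, "data": arr.tobytes()}

def from_common(rec: dict) -> np.ndarray:
    """Rebuild the array from the neutral record."""
    return np.frombuffer(rec["data"], dtype=rec["dtype"]).reshape(rec["shape"])

# A node running TensorFlow could emit to_common(tensor.numpy());
# a node running Paddle could consume paddle.to_tensor(from_common(rec)).
```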
Fig. 1 schematically illustrates an exemplary system architecture to which the federated learning methods and apparatus may be applied, according to an embodiment of the present disclosure.
It should be noted that fig. 1 is only an example of a system architecture to which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, and does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios. For example, in another embodiment, an exemplary system architecture to which the federal learning method and apparatus may be applied may include a terminal device, but the terminal device may implement the federal learning method and apparatus provided in the embodiments of the present disclosure without interacting with a server.
As shown in FIG. 1, a system architecture 100 according to this embodiment may include a computing node set 101 and a network 102. The computing node set 101 may include N computing nodes, i.e., computing nodes 101_1, 101_2, 101_3, ..., 101_N, where N may be an integer greater than 1. Network 102 serves as a medium for providing communication links between the computing nodes and may include various connection types, such as wired and/or wireless communication links.
The computing nodes may be various types of servers that provide various services. For example, a computing node may be a cloud server (also called a cloud computing server or cloud host), a host product in the cloud computing service system that overcomes the defects of difficult management and weak service scalability found in conventional physical hosts and VPS (Virtual Private Server) services. A computing node may also be a server of a distributed system or a server combined with a blockchain.
The computing node set 101 may cooperate to accomplish a federated learning task. The federated learning task has a corresponding global model, and the global model may include a plurality of local models. Different local models may be deployed at different computing nodes. For example, the global model may include N local models, i.e., local model 1, local model 2, local model 3, ..., local model N, and local model i may be deployed at computing node 101_i. Different computing nodes may deploy the same or different deep learning frameworks.
For an intermediate computing node in the computing node set 101, the intermediate computing node may receive an output result from its previous computing node and process that output result using the local model associated with the intermediate computing node, obtaining the output result corresponding to the intermediate computing node. The output result of the intermediate computing node is then sent to the next computing node, so that the next computing node processes it using its corresponding local model, and so on until the output result of the last computing node is obtained.
For example, computing node 101_2 may be referred to as a third computing node, which may be the intermediate computing node described above. Computing node 101_1 is referred to as a first computing node, which may be the previous computing node described above. Computing node 101_3 is referred to as the second computing node, which may be the next computing node described above.
The third computing node 101_2 may perform deserialization processing on the first serialized model data in response to receiving the first serialized model data from the first computing node 101_1, resulting in first deserialized model data. The first serialized model data satisfies a predetermined data format.
The third computing node 101_2 may process the first deserialization model data using the first local model of the federated learning task to obtain the second deserialization model data.
The third computing node 101_2 may perform serialization processing on the second deserialization model data to obtain second serialized model data. The second serialized model data satisfies a predetermined data format.
The third computing node 101_2 sends the second serialized model data to the second computing node 101_3 so that the second computing node 101_3 processes third deserialized model data corresponding to the second serialized model data using a second local model of the federated learning task.
It should be noted that the federated learning method provided by embodiments of the present disclosure may generally be executed by the third computing node. Correspondingly, the federated learning apparatus provided by embodiments of the disclosure may also be arranged in the third computing node. The third computing node may be any one of the computing nodes in FIG. 1 other than the first and the last computing nodes.
Alternatively, the federated learning approach provided by embodiments of the present disclosure may also be performed by a computing node or cluster of computing nodes that is different from, and capable of communicating with, a third computing node. Accordingly, the federated learning device provided by the embodiments of the present disclosure may also be disposed in a computing node or a cluster of computing nodes that is different from and capable of communicating with a third computing node.
FIG. 2 schematically illustrates a flow chart of a federated learning method in accordance with an embodiment of the present disclosure.
As shown in FIG. 2, the method 200 includes operations S210-S240.
In operation S210, in response to receiving the first serialized model data from the first computing node, the first serialized model data is deserialized to obtain first deserialized model data. The first serialized model data satisfies a predetermined data format.
In operation S220, the first deserialized model data is processed using the first local model of the federated learning task to obtain second deserialized model data.
In operation S230, the second deserialized model data is serialized to obtain second serialized model data. The second serialized model data satisfies a predetermined data format.
In operation S240, the second serialized model data is sent to the second computing node so that the second computing node processes third deserialized model data corresponding to the second serialized model data using a second local model of the federated learning task.
According to embodiments of the present disclosure, a federated learning task may refer to a learning task that requires multiple computing nodes to cooperate to complete. The model corresponding to the federated learning task may be referred to as a global model. The global model may include a plurality of local models. The compute node may deploy a local model. Different computing nodes may deploy the same or different deep learning frameworks. The learning task may include at least one of an image processing task, an audio processing task, and a text processing task. For example, the image processing task may include at least one of image recognition, image classification, object detection, image segmentation, and the like. The audio processing tasks may include speech recognition, and the like. The text processing task may include at least one of translation and text generation, etc.
According to an embodiment of the present disclosure, the method provided by the embodiment of the present disclosure may be performed by a third computing node. The first compute node, the second compute node, and the third compute node may each deploy a local model of a federated learning task. The local model of the federal learning task deployed at the third compute node may be referred to as a first local model. The local model of the federated learning task deployed to the second compute node is referred to as a second local model. The output of the first computing node may be provided as an input to a third computing node, and the output of the third computing node may be provided as an input to the second computing node.
According to an embodiment of the present disclosure, the first serialized model data may refer to data obtained by the first computing node serializing its output result. The output result of the first computing node may be referred to as fourth deserialized model data. The first deserialized model data may refer to data obtained by the third computing node deserializing the first serialized model data. The second deserialized model data may refer to data obtained by the third computing node processing the first deserialized model data with the first local model corresponding to the third computing node. The second serialized model data may refer to data obtained by the third computing node serializing the second deserialized model data.
According to an embodiment of the present disclosure, the predetermined data format may include at least one of: information relating to a predetermined data type identification, information relating to a predetermined dimension element number, and information relating to a predetermined set of mapping position intervals. The predetermined data type identification may characterize the data type. The predetermined dimension element number may characterize the number of elements included in each dimension of the predetermined model matrix. The predetermined set of mapping position intervals may include at least one predetermined mapping position interval, each of which characterizes the predetermined position interval corresponding to one element of the predetermined model matrix. A predetermined position interval may be formed by a first predetermined starting position and a first predetermined ending position. The predetermined model matrix may refer to the model matrix obtained by deserializing predetermined serialized model data, i.e., serialized model data that requires deserialization processing; the predetermined serialized model data may include the first serialized model data and the second serialized model data. The predetermined serialized model data may be a linear list.
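To make this layout concrete, the following minimal Python sketch models the predetermined data format; the field names (dtype_id, dim_counts, intervals) are illustrative assumptions, not names used by the patent:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class SerializedModelData:
    """Linear-list layout: a header (data type identification plus the
    per-dimension element counts) followed by one predetermined mapping
    position interval per element of the model matrix."""
    dtype_id: int                     # predetermined data type identification
    dim_counts: List[int]             # element count of each dimension
    intervals: List[Tuple[int, int]]  # (start, end) position interval per element
```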
According to an embodiment of the disclosure, the third computing node may, in response to receiving a data processing request from the first computing node, parse the data processing request to obtain the first serialized model data. The third computing node may process the first serialized model data according to a deserialization policy to obtain the first deserialized model data. The deserialization policy may be determined according to the policy used to determine the predetermined data format. For example, the first deserialized model data may be a first model matrix.
According to an embodiment of the disclosure, the third computing node may input the first deserialized model data into the first local model of the federated learning task corresponding to the third computing node to obtain the second deserialized model data. It may then process the second deserialized model data according to a serialization policy to obtain the second serialized model data. The serialization policy may be determined according to the policy used to determine the predetermined data format.
According to an embodiment of the disclosure, after obtaining the second serialized model data, the third computing node may send it to the second computing node, so that the second computing node may process the third deserialized model data corresponding to the second serialized model data using the second local model of the federated learning task corresponding to the second computing node. If the second computing node is determined not to be the last computing node for processing the federated learning task, the second computing node may process the third deserialized model data using its local model to obtain fifth deserialized model data, serialize the fifth deserialized model data to obtain third serialized model data, and send the third serialized model data to the next computing node after the second computing node. The above operations are repeated until the output result of the last computing node for processing the federated learning task is obtained. All serialized model data in the above operation process may be model data satisfying the predetermined data format.
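End to end, an intermediate node's work reduces to three steps; the sketch below is a schematic composition with hypothetical helpers (deserialize, serialize, send_to_next), not an API named by the patent:

```python
def handle(first_serialized, local_model, send_to_next):
    """Deserialize, apply the local model, re-serialize, and forward."""
    first_deserialized = deserialize(first_serialized)     # predetermined format in
    second_deserialized = local_model(first_deserialized)  # local computation
    second_serialized = serialize(second_deserialized)     # predetermined format out
    send_to_next(second_serialized)                        # hand off to the next node
```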
According to an embodiment of the present disclosure, operations S210 to S240 may be applied to training the global model of the federated learning task, and may also be applied to performing inference with the trained global model. This may be configured according to actual service requirements and is not limited here.
According to embodiments of the disclosure, first serialized model data that come from the first computing node and are constructed according to the predetermined data format are deserialized to obtain first deserialized model data; the first deserialized model data are processed by the first local model to obtain second deserialized model data; the second deserialized model data are serialized to obtain second serialized model data satisfying the predetermined data format; and the second serialized model data are sent to the second computing node, so that the second computing node processes third deserialized model data corresponding to the second serialized model data with the second local model. Because the serialized model data of all computing nodes follow the same predetermined data format, the data structures of different computing nodes are compatible, which realizes federated learning based on multiple deep learning frameworks. In addition, because the serialized model data follow a predetermined data format, they are decoupled from any specific deep learning framework, so different computing nodes can carry out federated learning with the deep learning frameworks they are familiar with, reducing the cost of federated learning.
According to an embodiment of the present disclosure, the deep learning frameworks corresponding to the first local model and the second local model are different.
According to an embodiment of the present disclosure, the deep learning framework may be any type of deep learning framework, and the present disclosure is not particularly limited thereto.
According to embodiments of the present disclosure, the deep learning frameworks deployed by the second and third computing nodes may be different. Thus, the deep learning frameworks corresponding to the first local model (deployed at the third computing node) and the second local model (deployed at the second computing node) are different.
According to an embodiment of the present disclosure, the first serialized model data includes a first data type identification, a first number of dimension elements, and at least one first mapping position interval.
According to an embodiment of the present disclosure, operation S210 may include the following operations.
First model data are obtained according to the at least one first mapping position interval. The first model data are then converted, based on a predetermined binary arithmetic standard, into a first model matrix corresponding to the first data type identification according to the first dimension element number, so that each dimension of the first model matrix includes a number of elements consistent with the element count recorded for that dimension in the first dimension element number. The first model matrix is determined as the first deserialized model data.
According to an embodiment of the present disclosure, the first data type identification may characterize the data type of the first deserialized model data. The data type may include at least one of floating-point numbers and integers. Floating-point numbers may include at least one of half-precision, single-precision, and double-precision floating-point numbers. Integers may include at least one of 8-bit signed, 8-bit unsigned, 16-bit signed, 32-bit signed, and 64-bit signed integers. In practice, the elements of the model matrix may use single-precision floating-point numbers; half-precision floating-point numbers may also be used to compress the size of the model matrix.
According to an embodiment of the present disclosure, the data type identification may be characterized by a binary code group of a first predetermined number of bits. The value of the first predetermined number of bits may be configured according to actual service requirements and is not limited here. For example, the first predetermined number of bits may be 4. When the first bit of the identification is "0", 8 (i.e., 2^3 = 8) identifications can be obtained to characterize the 8 data types described above respectively. Alternatively, when the first bit is "1", 8 identifications can likewise be obtained to characterize those 8 data types. The data type identification may be configured according to actual service requirements and is not limited here.
According to an embodiment of the present disclosure, the first data type identification may be characterized by a binary code group of a first predetermined number of bits, for example, a 4-bit binary code group. The first dimension element number may represent the number of elements in each dimension of the model matrix obtained by deserializing the first serialized model data, and may include the element counts of at least one dimension. The model matrix obtained by deserializing the first serialized model data may be referred to as the first model matrix.
According to an embodiment of the present disclosure, the element count of each dimension in the first dimension element number may be characterized by a binary code group of a second predetermined number of bits. The value of the second predetermined number of bits may be configured according to actual service requirements and is not limited here. For example, the second predetermined number of bits may be 32, so the upper limit of the element count of a dimension is 4294967295 (i.e., 2^32 - 1). This upper limit is about twice the upper limit of actual service requirements. Further, a predetermined code group may record how many 32-bit binary code groups follow it, and the code groups after the predetermined code group carry the per-dimension element counts. For example, the following binary code group sequence may specify the data organization of the model matrix: T, b1, b2, ..., bh-1, bh, where T records the number of code groups h and bi records the element count of the i-th dimension. The length of the binary code group sequence is 32(h + 1) bits. h may be an integer greater than or equal to 1.
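The code-group header can be sketched as follows, assuming, as the text suggests, that T records the count h of 32-bit code groups that follow and each b_i carries one dimension's element count; the little-endian unsigned encoding is an additional assumption:

```python
import struct

def pack_shape(dims):
    """Encode T, b1, ..., bh as (h + 1) unsigned 32-bit code groups."""
    return struct.pack(f"<{len(dims) + 1}I", len(dims), *dims)

def unpack_shape(buf):
    """Decode the code group sequence back into per-dimension element counts."""
    (h,) = struct.unpack_from("<I", buf, 0)
    return list(struct.unpack_from(f"<{h}I", buf, 4))

assert unpack_shape(pack_shape([2, 3])) == [2, 3]  # 32 * (h + 1) = 96 bits total
```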
According to an embodiment of the present disclosure, a first mapping position interval may characterize the position interval corresponding to an element in the first model matrix, formed by a starting position and an ending position. The first mapping position interval may be determined according to a predetermined mapping relationship, and the predetermined mapping relationship may match the memory organization form of the computing node.
According to an embodiment of the present disclosure, the predetermined mapping relationship may be determined according to the following formula (1).
(Formula (1) appears as an image in the original publication and cannot be reproduced here.) In formula (1), one image-rendered symbol characterizes an element of the first model matrix, m characterizes the number of bits of the binary code group, and two further image-rendered symbols characterize the first predetermined starting position and the first predetermined ending position of that element's mapping position interval.
According to an embodiment of the present disclosure, the predetermined binary arithmetic standard may include a standard for binary interchange format encoding. For example, the predetermined binary arithmetic standard may be the IEEE standard for binary floating-point arithmetic (i.e., IEEE 754).
According to an embodiment of the present disclosure, after the at least one first mapping position interval is determined, the first model data may be derived from the positions addressed by each of the at least one first mapping position interval. Based on the predetermined binary arithmetic standard, the first model data are converted into a first model matrix that corresponds to the first data type identification and whose element count in each dimension is consistent with the element count recorded for that dimension in the first dimension element number. The resulting first model matrix may then be determined as the first deserialized model data.
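Assembling the above into a runnable sketch involves two loudly flagged assumptions: the table mapping 4-bit data type identifications to dtypes is illustrative (the patent fixes neither the codes nor their order), and each first mapping position interval is taken to address an inclusive byte range of a payload buffer:

```python
import numpy as np

# Hypothetical table: 4-bit data type identifications -> NumPy dtypes.
DTYPES = {0: np.float16, 1: np.float32, 2: np.float64, 3: np.int8,
          4: np.uint8, 5: np.int16, 6: np.int32, 7: np.int64}

def deserialize(dtype_id, dim_counts, intervals, payload):
    """Rebuild the model matrix from serialized model data."""
    dtype = np.dtype(DTYPES[dtype_id])
    # Gather each element's bytes from its (start, end) mapping position interval.
    raw = b"".join(payload[start:end + 1] for start, end in intervals)
    flat = np.frombuffer(raw, dtype=dtype)  # IEEE 754 decoding for float types
    return flat.reshape(dim_counts)         # one element count per dimension
```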
According to an embodiment of the present disclosure, operation S210 may include the following operations.
In response to receiving the first serialized model data from the first computing node through a call to a predetermined data interface, deserialization processing is performed on the first serialized model data to obtain the first deserialized model data.
According to an embodiment of the present disclosure, the predetermined data interface may refer to an interface that makes data formats compatible across different deep learning frameworks. The predetermined data interface may be a data interface matching the predetermined data format. The predetermined data format may take the form of a tensor (i.e., Tensor). The predetermined data interface may be a data interface independent of any deep learning framework.
According to an embodiment of the present disclosure, transmission of data is achieved by receiving serialized model data using a predetermined data interface.
According to an embodiment of the disclosure, the second deserialized model data comprises a second model matrix.
According to an embodiment of the present disclosure, performing serialization processing on the second deserialized model data to obtain second serialized model data may include the following operations.
The second model matrix is converted into at least one second mapping position interval according to a predetermined mapping relationship, where the predetermined mapping relationship matches the memory organization form of the computing node. The number of elements of each dimension of the second model matrix is determined to obtain the second dimension element number. The second serialized model data are then obtained according to the second data type identification, the second dimension element number, and the at least one second mapping position interval.
According to an embodiment of the disclosure, the computing nodes whose memory organization form matches the predetermined mapping relationship may refer to the computing nodes participating in the federated learning task, for example, the first computing node, the second computing node, and the third computing node.
According to an embodiment of the present disclosure, the second data type identifies a data type that may characterize the second deserialization model data. For the description of the data types, reference may be made to the corresponding parts above, which are not described herein again.
According to an embodiment of the present disclosure, the second data type identification may be characterized by a binary code group of a first predetermined number of bits. For example, the second data type may be characterized by a 4-bit binary code group. The second dimension element number may represent the number of elements of each dimension of the model matrix obtained by performing deserialization processing on the second serialized model data. The second dimension element number may comprise a number of elements of at least one dimension.
According to an embodiment of the present disclosure, the number of elements of each dimension of the second dimension number of elements may be characterized by a binary code set of a second predetermined number of bits. The value of the second predetermined number of bits may be configured according to actual service requirements, and is not limited herein. For the description of the binary code group of the second predetermined number of bits, the above corresponding parts may be referred to, and are not described herein again.
According to an embodiment of the present disclosure, a second mapping position interval may characterize the position interval corresponding to an element in the second model matrix, formed by a starting position and an ending position. The second mapping position interval may be determined according to the predetermined mapping relationship, which may match the memory organization form of the computing node.
According to an embodiment of the present disclosure, the second model matrix may be converted into at least one second mapping position interval according to the predetermined mapping relationship, and the second serialized model data may be obtained according to the second data type identification, the second dimension element number, and the at least one second mapping position interval, where the second dimension element number is determined from the element count of each dimension of the second model matrix.
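The reverse direction, sketched to match: the mapping position intervals are generated under the assumption of a contiguous row-major layout in which element k of an m-byte type occupies bytes k*m through (k + 1)*m - 1 of the payload. With the deserialize sketch above, deserialize(*serialize(matrix, 1)) round-trips a float32 matrix:

```python
import numpy as np

def serialize(matrix, dtype_id):
    """Produce (dtype_id, dim_counts, intervals, payload) from a model matrix."""
    payload = matrix.tobytes()  # the linear list: one row-major memory copy
    m = matrix.dtype.itemsize
    intervals = [(k * m, (k + 1) * m - 1) for k in range(matrix.size)]
    return dtype_id, list(matrix.shape), intervals, payload
```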
According to an embodiment of the present disclosure, the serialized model data is a linear list, and the model matrix is also organized as a linear list in the memory of the computing node, so the serialization process requires no additional resource consumption. Resource consumption in the whole process comes only from memory copies, which can bring exponential acceleration.
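This "memory copy only" observation can be checked directly with NumPy, since a C-contiguous matrix already is a linear list in memory:

```python
import numpy as np

matrix = np.arange(6, dtype=np.float32).reshape(2, 3)
buf = matrix.tobytes()  # serialization: a single memcpy into the linear list
restored = np.frombuffer(buf, dtype=np.float32).reshape(2, 3)  # no extra copy
assert (restored == matrix).all()
```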
According to the embodiment of the disclosure, the third deserialization model data is obtained by deserializing the second serialization model data by the second computing node.
According to an embodiment of the disclosure, the third deserialization model data may be obtained by the second compute node processing the second serialization model data according to a deserialization policy. For example, the second serialized model data can include a third data type identification, a third number of dimension elements, and at least one third mapping position interval.
According to an embodiment of the disclosure, the third deserialized model data may be a third model matrix. The third model matrix may be obtained by converting third model data into a model matrix corresponding to the third data type identification according to the third dimension element number, based on the predetermined binary arithmetic standard. The third model data may be derived from the at least one third mapping position interval. The number of elements in each dimension of the third model matrix is consistent with the element count recorded for that dimension in the third dimension element number.
The federated learning method according to embodiments of the present disclosure is further described below with reference to FIG. 3A and FIG. 3B in conjunction with specific embodiments.
Fig. 3A schematically illustrates an example schematic diagram of a predetermined data format according to an embodiment of the present disclosure.
As shown in FIG. 3A, in 300A, the predetermined data format may include a predetermined data header 301 and a predetermined data body (i.e., DATA) 302. The predetermined data header 301 may include a predetermined data type identification (i.e., INDEX) 3010 and a predetermined dimension element number 3011. The predetermined data body 302 may include L predetermined mapping position intervals, i.e., predetermined mapping position intervals 302_1, ..., 302_l, ..., 302_L, where L may be an integer greater than or equal to 1 and l ∈ {1, 2, ..., L}.
For example, the predetermined data type identification may be "0000". The predetermined dimension element number may be "0X0010 0X0011". The predetermined mapping position interval 302_1 may be "0XFFF". The predetermined mapping position interval 302_l may be "0XFFAA". The predetermined mapping position interval 302_L may be "0X0011 0X0011". Within a predetermined mapping position interval, the value to the left of the blank represents the first predetermined starting position and the value to the right of the blank represents the first predetermined ending position.
According to an embodiment of the present disclosure, the first serialization model data and the second serialization model data in the embodiment of the present disclosure each satisfy the above-described predetermined data format.
Fig. 3B illustrates an example schematic diagram of a federal learning process in accordance with an embodiment of the present disclosure.
As shown in FIG. 3B, in 300B, the third computing node 304 may include a deserialization layer 3040, a processing layer 3041, and a serialization layer 3042. The processing layer 3041 may be a training layer.
The first computing node 303 sends the first serialized model data satisfying the predetermined data format to the third computing node 304.
The deserialization layer 3040 of the third computing node 304 deserializes the first serialized model data to obtain the first deserialized model data.
The processing layer 3041 of the third computing node 304 processes the first deserialized model data using the first local model of the federated learning task to obtain the second deserialized model data.
The serialization layer 3042 of the third computing node 304 serializes the second deserialized model data to obtain second serialized model data satisfying the predetermined data format.
The third computing node 304 sends the second serialized model data to the second computing node 305, so that the second computing node 305 deserializes the second serialized model data to obtain third deserialized model data and processes the third deserialized model data using the second local model of the federated learning task.
If the second computing node 305 is the last computing node to process the federated learning task, the second computing node 305 may include a deserialization layer and a processing layer. In this case, the processing layer included in the second computing node 305 may be a prediction layer.
The above is merely an exemplary embodiment, and the disclosure is not limited thereto; other federated learning methods known in the art may also be included, as long as federated learning based on multiple deep learning frameworks can be implemented.
Fig. 4 schematically illustrates a block diagram of a federal learning device in accordance with an embodiment of the present disclosure.
As shown in FIG. 4, the federated learning apparatus 400 may include a first obtaining module 410, a second obtaining module 420, a third obtaining module 430, and a sending module 440.
The first obtaining module 410 is configured to perform deserialization processing on the first serialized model data in response to receiving the first serialized model data from the first computing node, so as to obtain first deserialized model data. The first serialized model data satisfies a predetermined data format.
The second obtaining module 420 is configured to process the first deserialized model data using the first local model of the federated learning task to obtain second deserialized model data.
The third obtaining module 430 is configured to perform serialization processing on the second deserialized model data to obtain second serialized model data. The second serialized model data satisfies the predetermined data format.
The sending module 440 is configured to send the second serialized model data to the second computing node, so that the second computing node processes third deserialized model data corresponding to the second serialized model data by using the second local model of the federated learning task.
According to an embodiment of the present disclosure, the first serialized model data includes a first data type identification, a first number of dimension elements, and at least one first mapping position interval.
According to an embodiment of the present disclosure, the first obtaining module 410 may include a first obtaining unit, a first converting unit, and a first determining unit.
The first obtaining unit is configured to obtain first model data according to the at least one first mapping position interval.
The first conversion unit is configured to convert the first model data into a first model matrix corresponding to the first data type identification according to the first dimension element number, based on a predetermined binary arithmetic standard. Each dimension of the first model matrix includes a number of elements consistent with the element count recorded for that dimension in the first dimension element number.
A first determining unit, configured to determine the first model matrix as first deserialized model data.
According to an embodiment of the disclosure, the second deserialized model data comprises a second model matrix.
According to an embodiment of the present disclosure, the third obtaining module 430 may include a second converting unit, a second determining unit, and a second obtaining unit.
The second conversion unit is configured to convert the second model matrix into at least one second mapping position interval according to the predetermined mapping relationship. The predetermined mapping relationship matches the memory organization form of the computing node.
The second determining unit is configured to determine the number of elements of each dimension of the second model matrix to obtain the second dimension element number.
The second obtaining unit is configured to obtain the second serialized model data according to the second data type identification, the second dimension element number, and the at least one second mapping position interval.
According to an embodiment of the present disclosure, the first obtaining module 410 may include a third obtaining unit.
The third obtaining unit is configured to, in response to receiving the first serialized model data from the first computing node through a call to the predetermined data interface, perform deserialization processing on the first serialized model data to obtain the first deserialized model data.
According to an embodiment of the present disclosure, the deep learning frameworks corresponding to the first local model and the second local model are different.
According to the embodiment of the disclosure, the third deserialization model data is obtained by deserializing the second serialized model data by the second computing node.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
According to an embodiment of the present disclosure, an electronic device includes: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as above.
According to an embodiment of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method as above.
According to an embodiment of the disclosure, a computer program product comprising a computer program which, when executed by a processor, implements the method as above.
FIG. 5 schematically illustrates a block diagram of an electronic device adapted to implement the federated learning method according to an embodiment of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown here, their connections and relationships, and their functions are meant as examples only and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in FIG. 5, the electronic device 500 includes a computing unit 501, which can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 502 or a computer program loaded from a storage unit 508 into a random access memory (RAM) 503. The RAM 503 can also store various programs and data required for the operation of the electronic device 500. The computing unit 501, the ROM 502, and the RAM 503 are connected to one another by a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
A number of components in the electronic device 500 are connected to the I/O interface 505, including: an input unit 506 such as a keyboard, a mouse, or the like; an output unit 507 such as various types of displays, speakers, and the like; a storage unit 508, such as a magnetic disk, optical disk, or the like; and a communication unit 509 such as a network card, modem, wireless communication transceiver, etc. The communication unit 509 allows the electronic device 500 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The computing unit 501 may be any of various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 501 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 501 executes the methods and processes described above, such as the federated learning method. For example, in some embodiments, the federated learning method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into the RAM 503 and executed by the computing unit 501, one or more steps of the federated learning method described above may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured to perform the federated learning method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), system on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (15)

1. A method for federated learning, comprising:
in response to receiving first serialized model data from a first computing node, performing deserialization processing on the first serialized model data to obtain first deserialized model data, wherein the first serialized model data satisfies a predetermined data format;
processing the first deserialized model data by using a first local model of a federated learning task to obtain second deserialized model data;
performing serialization processing on the second deserialized model data to obtain second serialized model data, wherein the second serialized model data satisfies the predetermined data format; and
sending the second serialized model data to a second computing node, so that the second computing node processes third deserialized model data corresponding to the second serialized model data by using a second local model of the federated learning task.
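Illustrative note (not part of the claims): the following is a minimal, self-contained Python sketch of the claim 1 flow, assuming numpy arrays as the model matrices and a simple header-plus-raw-bytes layout as the predetermined data format. The names pack, unpack, and relay_step are hypothetical stand-ins for the serialization, deserialization, and forwarding steps; the patent's actual wire format is not specified here.

```python
import numpy as np

def pack(matrix: np.ndarray) -> bytes:
    # Serialize: 8-byte dtype tag, then ndim and per-dimension element
    # counts as int64, then the raw element bytes.
    header = np.array([matrix.ndim, *matrix.shape], dtype=np.int64).tobytes()
    return matrix.dtype.str.encode().ljust(8) + header + matrix.tobytes()

def unpack(blob: bytes) -> np.ndarray:
    # Deserialize: recover dtype, shape, and elements from the same layout.
    dtype = np.dtype(blob[:8].strip().decode())
    ndim = int(np.frombuffer(blob[8:16], dtype=np.int64)[0])
    shape = tuple(np.frombuffer(blob[16:16 + 8 * ndim], dtype=np.int64))
    return np.frombuffer(blob[16 + 8 * ndim:], dtype=dtype).reshape(shape)

def relay_step(first_serialized: bytes, local_model) -> bytes:
    first_matrix = unpack(first_serialized)    # first deserialized model data
    second_matrix = local_model(first_matrix)  # second deserialized model data
    return pack(second_matrix)                 # second serialized model data

# Usage: a toy "local model" that just scales the incoming matrix.
blob = pack(np.ones((2, 3), dtype=np.float32))
forwarded = relay_step(blob, lambda m: m * 0.5)
```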
2. The method of claim 1, wherein the first serialized model data includes a first data type identifier, a first number of dimension elements, and at least one first mapping position interval;
wherein the performing deserialization processing on the first serialized model data to obtain the first deserialized model data comprises:
obtaining first model data according to the at least one first mapping position interval;
converting the first model data into a first model matrix corresponding to the first data type identifier according to the first number of dimension elements, based on a predetermined binary arithmetic standard, wherein the number of elements in each dimension of the first model matrix is consistent with the corresponding element count in the first number of dimension elements; and
determining the first model matrix as the first deserialized model data.
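Illustrative note (not part of the claims): a hedged sketch of the claim 2 deserialization, assuming the mapping position intervals are (offset, length) pairs into a byte buffer, and using numpy's IEEE 754 / two's-complement encodings as the predetermined binary arithmetic standard. The type-identifier table is hypothetical.

```python
import numpy as np

# Hypothetical mapping from data type identifiers to element encodings.
DTYPES = {0: np.float32, 1: np.float64, 2: np.int64}

def deserialize(buffer: bytes, type_id: int, dim_elements: tuple,
                intervals: list) -> np.ndarray:
    # Gather the first model data from each (offset, length) mapping
    # position interval.
    raw = b"".join(buffer[off:off + length] for off, length in intervals)
    # Convert into a first model matrix whose per-dimension element counts
    # match the first number of dimension elements.
    return np.frombuffer(raw, dtype=DTYPES[type_id]).reshape(dim_elements)

# Usage: two intervals jointly holding a 2x2 float32 matrix.
buf = np.arange(4, dtype=np.float32).tobytes()
m = deserialize(buf, type_id=0, dim_elements=(2, 2), intervals=[(0, 8), (8, 8)])
```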
3. The method of claim 1 or 2, wherein the second deserialized model data comprises a second model matrix;
wherein the performing serialization processing on the second deserialized model data to obtain the second serialized model data comprises:
converting the second model matrix into at least one second mapping position interval according to a predetermined mapping relation, wherein the predetermined mapping relation matches the memory organization of the computing node;
determining the number of elements in each dimension of the second model matrix to obtain a second number of dimension elements; and
obtaining the second serialized model data according to a second data type identifier, the second number of dimension elements, and the at least one second mapping position interval.
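Illustrative note (not part of the claims): a sketch of the claim 3 serialization under the assumption that the predetermined mapping relation is simply row-major (C-contiguous) memory order, so that the whole matrix maps to a single (offset, length) interval. The type-identifier table is again hypothetical.

```python
import numpy as np

# Hypothetical inverse of the type table used for deserialization.
TYPE_IDS = {np.dtype(np.float32): 0, np.dtype(np.float64): 1}

def serialize(matrix: np.ndarray):
    # Match the compute node's memory organization: force a row-major
    # (C-contiguous) copy so the matrix maps to one contiguous interval.
    contiguous = np.ascontiguousarray(matrix)
    payload = contiguous.tobytes()
    intervals = [(0, len(payload))]        # second mapping position interval(s)
    dim_elements = contiguous.shape        # second number of dimension elements
    type_id = TYPE_IDS[contiguous.dtype]   # second data type identifier
    return type_id, dim_elements, intervals, payload

type_id, dims, intervals, payload = serialize(np.eye(3, dtype=np.float32))
```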
4. The method of claim 1, wherein the performing, in response to receiving the first serialized model data from the first computing node, deserialization processing on the first serialized model data to obtain the first deserialized model data comprises:
in response to receiving the first serialized model data from the first computing node through a call to a predetermined data interface, performing deserialization processing on the first serialized model data to obtain the first deserialized model data.
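Illustrative note (not part of the claims): claim 4 leaves the data interface abstract; the sketch below assumes it is a plain TCP socket carrying length-prefixed messages, though any RPC layer would serve equally well. The framing and helper names are assumptions, not the patent's interface.

```python
import socket
import struct

def recv_exact(sock: socket.socket, n: int) -> bytes:
    # Read exactly n bytes, looping because recv may return short reads.
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        buf += chunk
    return buf

def receive_serialized(sock: socket.socket) -> bytes:
    # A 4-byte big-endian length prefix, then the serialized model data,
    # which is then handed to the deserialization step of claim 1.
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return recv_exact(sock, length)
```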
5. The method of any one of claims 1-4, wherein the first local model and the second local model correspond to different deep learning frameworks.
6. The method according to any one of claims 1 to 5, wherein the third deserialized model data is obtained by the second computing node performing deserialization processing on the second serialized model data.
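Illustrative note (not part of the claims): claims 5 and 6 describe nodes on different deep learning frameworks exchanging model data through the framework-neutral serialized format. The sketch below assumes PyTorch on the first node and numpy as the neutral representation; the second node could wrap the resulting matrix in its own framework's tensor type (for example via that framework's from-numpy conversion).

```python
import numpy as np
import torch  # assumes PyTorch is the first node's framework

def node1_emit(weights: torch.Tensor) -> bytes:
    # Leave the first framework: detach to host memory, then serialize into
    # the neutral format (int64 ndim + shape header, float32 payload).
    arr = weights.detach().cpu().numpy().astype(np.float32)
    header = np.array([arr.ndim, *arr.shape], dtype=np.int64).tobytes()
    return header + arr.tobytes()

def node2_ingest(blob: bytes) -> np.ndarray:
    # The second computing node deserializes (claim 6) into a plain matrix,
    # independent of whatever framework produced the bytes.
    ndim = int(np.frombuffer(blob[:8], dtype=np.int64)[0])
    shape = tuple(np.frombuffer(blob[8:8 + 8 * ndim], dtype=np.int64))
    return np.frombuffer(blob[8 + 8 * ndim:], dtype=np.float32).reshape(shape)

matrix = node2_ingest(node1_emit(torch.randn(4, 4)))
```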
7. A federated learning apparatus, comprising:
a first obtaining module configured to, in response to receiving first serialized model data from a first computing node, perform deserialization processing on the first serialized model data to obtain first deserialized model data, wherein the first serialized model data satisfies a predetermined data format;
a second obtaining module configured to process the first deserialized model data by using a first local model of a federated learning task to obtain second deserialized model data;
a third obtaining module configured to perform serialization processing on the second deserialized model data to obtain second serialized model data, wherein the second serialized model data satisfies the predetermined data format; and
a sending module configured to send the second serialized model data to a second computing node, so that the second computing node processes third deserialized model data corresponding to the second serialized model data by using a second local model of the federated learning task.
8. The apparatus of claim 7, wherein the first serialized model data includes a first data type identifier, a first number of dimension elements, and at least one first mapping position interval;
wherein the first obtaining module comprises:
a first obtaining unit configured to obtain first model data according to the at least one first mapping position interval;
a first conversion unit configured to convert the first model data into a first model matrix corresponding to the first data type identifier according to the first number of dimension elements, based on a predetermined binary arithmetic standard, wherein the number of elements in each dimension of the first model matrix is consistent with the corresponding element count in the first number of dimension elements; and
a first determining unit configured to determine the first model matrix as the first deserialized model data.
9. The apparatus of claim 7 or 8, wherein the second deserialized model data comprises a second model matrix;
wherein the third obtaining module comprises:
a second conversion unit configured to convert the second model matrix into at least one second mapping position interval according to a predetermined mapping relation, wherein the predetermined mapping relation matches the memory organization of the computing node;
a second determining unit configured to determine the number of elements in each dimension of the second model matrix to obtain a second number of dimension elements; and
a second obtaining unit configured to obtain the second serialized model data according to a second data type identifier, the second number of dimension elements, and the at least one second mapping position interval.
10. The apparatus of claim 7, wherein the first obtaining module comprises:
a third obtaining unit configured to, in response to receiving the first serialized model data from the first computing node through a call to a predetermined data interface, perform deserialization processing on the first serialized model data to obtain the first deserialized model data.
11. The apparatus of any one of claims 7-10, wherein the first local model and the second local model correspond to different deep learning frameworks.
12. The apparatus according to any one of claims 7 to 11, wherein the third deserialized model data is obtained by the second computing node performing deserialization processing on the second serialized model data.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
14. A non-transitory computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-6.
15. A computer program product comprising a computer program which, when executed by a processor, implements a method according to any one of claims 1 to 6.
CN202210110135.8A 2022-01-28 2022-01-28 Federal learning method, device, electronic equipment and storage medium Pending CN114492853A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210110135.8A CN114492853A (en) 2022-01-28 2022-01-28 Federal learning method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210110135.8A CN114492853A (en) 2022-01-28 2022-01-28 Federal learning method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114492853A (en) 2022-05-13

Family

ID=81478448

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210110135.8A Pending CN114492853A (en) 2022-01-28 2022-01-28 Federal learning method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114492853A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination