CN114764601A - Gradient data fusion method and device and storage medium - Google Patents

Gradient data fusion method and device and storage medium

Info

Publication number
CN114764601A
Authority
CN
China
Prior art keywords
data
node
gradient
fusion
gradient fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210482492.7A
Other languages
Chinese (zh)
Other versions
CN114764601B (en)
Inventor
Tian Tian (田天)
The other inventors have requested that their names not be disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Real AI Technology Co Ltd
Original Assignee
Beijing Real AI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Real AI Technology Co Ltd
Priority to CN202210482492.7A
Publication of CN114764601A
Application granted
Publication of CN114764601B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/25 - Fusion techniques
    • G06F 18/251 - Fusion techniques of input or preprocessed data
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60 - Protecting data
    • G06F 21/602 - Providing cryptographic facilities or services
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60 - Protecting data
    • G06F 21/62 - Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F 21/6218 - Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F 21/6245 - Protecting personal data, e.g. for financial or medical purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Bioethics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The embodiments of the present application relate to the field of data processing, and provide a gradient data fusion method, apparatus, and storage medium. The method includes the following steps: a first node acquires first data, where the first data is a first segment, or data obtained by preprocessing the first segment or the gradient data in the first node, and the first segment is one segment, among multiple segments corresponding to the gradient data in the first node, that has not participated in gradient fusion; the first node sends the first data to a target node among at least one second node, where the first data is used for gradient fusion with the gradient data in the target node; the first node determines a first gradient fusion result according to first gradient fusion data received from a third node, where the first gradient fusion data is obtained by performing gradient fusion based on the first data and the gradient data in the other nodes. According to this scheme, when the gradient data of the nodes are fused, the privacy of each node's gradient data is protected and the efficiency of generating the gradient fusion result is improved.

Description

Gradient data fusion method and device and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a gradient data fusion method, apparatus, and storage medium.
Background
Data privacy protection means that the labels or features held by the party that owns the data (the authorized party) must not be transmitted in plaintext to parties that do not own the data (unauthorized parties). In scenarios such as horizontal federated learning, multiple participants must complete model training through data interaction while also protecting their own data privacy. Currently, the mainstream approach is for each participant to compute the model gradient locally, encrypt the gradient data with a public key, and send the ciphertext gradient to an aggregation node (aggregator); the aggregation node performs gradient fusion (such as accumulation) on the ciphertext gradients sent by the participants to obtain a gradient fusion result, and sends the encrypted gradient fusion result to each participant; each participant then decrypts the gradient fusion result with the shared private key and updates its local model based on the decrypted plaintext gradient data, thereby accelerating training convergence and improving the training accuracy of the local model.
However, in this horizontal federated learning approach, data encryption and decryption must be performed many times during the interaction between the aggregation node and each participant, so it takes too long for each participant to obtain the gradient fusion result; that is, the processing efficiency of gradient fusion is low, which ultimately slows each participant's training of its local model severely. For example, model training may be slowed by a factor of 100 or more.
Therefore, the current gradient data fusion scheme cannot improve the efficiency with which each participant obtains the gradient fusion result while also protecting each participant's data privacy.
Disclosure of Invention
The embodiments of the present application provide a gradient data fusion method, a data processing apparatus, a computing device, and a storage medium, so as to protect the privacy of each node's gradient data during gradient fusion while improving the efficiency of generating a gradient fusion result.
In a first aspect, a gradient data fusion method provided in an embodiment of the present application is introduced from the perspective of a first node in a ring communication system. The method is applied to the ring communication system, which includes the first node, a third node, and at least one second node, where the at least one second node includes a target node. In a specific implementation, the first node acquires first data, where the first data is a first segment, or data obtained by preprocessing the first segment or the gradient data in the first node, and the first segment is one segment, among the multiple segments corresponding to the gradient data in the first node, that has not participated in gradient fusion. The first node then sends the first data to the target node, where the first data is used for gradient fusion with the gradient data in the target node, and the first node receives first gradient fusion data from the third node, where the first gradient fusion data is obtained by performing gradient fusion based on the first data, the gradient data in the at least one second node, and the gradient data in the third node. Finally, the first node determines a first gradient fusion result of the first node according to the first gradient fusion data.
In a possible implementation, when the first data obtained by the first node is the first segment, or data obtained by performing the preset processing on the first segment, the first node may first partition the gradient data in the first node to obtain the multiple segments.
In a possible implementation, the first node sends the first gradient fusion result to the target node, and the first gradient fusion result is used to update a segment in the target node; that is, the target node may identify the segment of its own stored gradient data that participated in generating the first gradient fusion result, and update that segment to the first gradient fusion result.
In a possible implementation, when the first data is the first segment, or data obtained by performing the preset processing on the first segment, after receiving a second gradient fusion result sent by the third node, the first node may identify, in the gradient data stored in the first node, the second segment that participated in generating the second gradient fusion result, and update the second segment to the second gradient fusion result.
In a second aspect, an embodiment of the present application provides a data processing apparatus having the function of implementing the gradient data fusion method corresponding to the first aspect. The functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions, and the modules may be software and/or hardware.
The data processing device is applied to a first node in a ring-shaped communication system, the ring-shaped communication system further comprises a third node and at least one second node, and the at least one second node comprises a target node; the data processing apparatus includes:
the processing module is configured to acquire first data, where the first data is a first segment, the first segment being one segment, among multiple segments corresponding to the gradient data in the first node, that has not participated in gradient fusion, or the first data is data obtained by preprocessing the first segment or the gradient data;
the receiving and sending module is used for sending first data to the target node, and the first data is used for carrying out gradient fusion with gradient data in the target node; receiving first gradient fusion data from a third node, wherein the first gradient fusion data is obtained by performing gradient fusion on the first data, gradient data in at least one second node and gradient data in the third node;
the processing module is further configured to determine a first gradient fusion result of the first node according to the first gradient fusion data.
In a possible implementation manner, when the first data obtained by the first node is the first segment or is data obtained by performing the preset processing on the first segment, the processing module is further configured to segment the gradient data in the first node to obtain a plurality of segments.
In a possible implementation, the transceiver module sends the first gradient fusion result to the target node to update the slice in the target node.
In a possible implementation, when the first data is the first segment, or data obtained by performing the preset processing on the first segment, after the transceiver module receives a second gradient fusion result sent by the third node, the processing module identifies, in the gradient data stored in the first node, the second segment that participated in generating the second gradient fusion result, and updates the second segment to the second gradient fusion result.
In a third aspect, an embodiment of the present application further provides a computing device, where the device may include a processor and a memory:
the memory is used for storing a computer program;
the processor is configured to perform the method provided in any one of the embodiments of the first aspect and the first aspect according to the computer program.
In a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium, where the computer-readable storage medium is configured to store a computer program, where the computer program is configured to execute the method provided in any one of the foregoing first aspect and embodiments of the first aspect.
In the above implementation of the embodiments of the present application, a first node in a ring communication system acquires first data, where the first data is a first segment, or data obtained by preprocessing the first segment or the gradient data in the first node, and the first segment is one segment, among the multiple segments corresponding to the gradient data in the first node, that has not participated in gradient fusion. The first node then sends the first data to a target node among at least one second node in the ring communication system, where the first data is used for gradient fusion with the gradient data in the target node, and the first node receives first gradient fusion data from a third node in the ring communication system, where the first gradient fusion data is obtained by performing gradient fusion on the first data, the gradient data in the at least one second node, and the gradient data in the third node. Finally, the first node determines a first gradient fusion result of the first node according to the first gradient fusion data.
Therefore, because the data transmitted between the second node and the third node in the ring-shaped communication system is the gradient fusion data, rather than the original gradient data in the nodes, in the process of implementing the gradient data fusion, not only can the privacy of the gradient data of the second node and the third node be protected, but also the nodes do not need to encrypt and decrypt the interactive data for many times, so that the efficiency of obtaining the gradient fusion result by the nodes can be effectively improved. In addition, when the first data sent by the first node to the target node is preprocessed data, the gradient data of the first node cannot be leaked to the second node, so that the privacy of the gradient data of the first node can be further protected.
In addition, in the annular communication system, a centralized aggregation node does not need to be introduced to participate in gradient data fusion, so that the difficulty brought by searching for a third-party aggregation node in an actual application scene can be reduced, and the leakage of gradient data of all other nodes caused by collusion between any node and the third-party aggregation node can be avoided.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments described in the present application, and that other drawings can be obtained by those skilled in the art from these drawings without creative effort.
Fig. 1a is a schematic structural diagram of an exemplary ring communication system according to an embodiment of the present application;
fig. 1b is a schematic structural diagram of another exemplary ring communication system provided in an embodiment of the present application;
fig. 1c is a schematic structural diagram of a target system including a plurality of ring communication systems according to an embodiment of the present application;
FIG. 1d is a schematic flow chart of a gradient fusion method provided in an embodiment of the present application;
FIG. 2 is a schematic flow chart of a gradient data fusion method in conjunction with a particular scenario;
fig. 3 is a schematic diagram of each node in the ring communication system 100 dividing its gradient data into multiple segments;
FIG. 4 is a schematic diagram of the gradient fusion result A0 to be generated by node 101;
FIG. 5 is a schematic diagram of passing the gradient fusion result A0 to be generated by node 101 along to the remaining nodes;
FIG. 6 is a schematic diagram of node 101 generating the gradient fusion results A0, B0, C0, D0, and E0;
FIG. 7 is a schematic diagram of node 101 sequentially transmitting generated gradient fusion data to other nodes;
fig. 8 is a schematic diagram illustrating secret sharing processing performed on one segment of the node 101 to the node 105;
FIG. 9 is a diagram illustrating a gradient fusion process performed from node 101 to node 105;
fig. 10 is a schematic diagram of gradient fusion data generated after 4 times of gradient fusion are performed on all nodes in the ring communication system 100;
fig. 11 is a diagram illustrating gradient fusion results generated by all nodes in the ring communication system 100;
fig. 12 is a schematic diagram of a gradient fusion result finally obtained by all nodes in the ring communication system 100;
fig. 13 is another schematic flow chart of a gradient data fusion method provided in an embodiment of the present application;
FIG. 14 is a block diagram of a data processing apparatus according to an embodiment of the present application;
fig. 15 is a schematic hardware configuration diagram of a data processing apparatus according to an embodiment of the present application.
Detailed Description
The terms "first," "second," and the like in the description and claims of the embodiments of the present application and in the drawings described above are used for distinguishing between similar elements (e.g., the first segment and the second segment in the embodiments of the present application refer to different segments) and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged where appropriate, so that the embodiments described herein can be implemented in orders other than those illustrated or described. Furthermore, the terms "comprise" and "have," and any variations thereof, are intended to cover non-exclusive inclusion: a process, method, system, article, or apparatus that comprises a list of steps or modules is not necessarily limited to those explicitly listed, but may include other steps or modules not explicitly listed or inherent to such a process, method, article, or apparatus. The division into modules presented in the embodiments of the present application is merely a logical division and may be implemented differently in practice: multiple modules may be combined or integrated into another system, some features may be omitted or not implemented, and the couplings, direct couplings, or communicative connections shown or discussed between modules may be realized through interfaces, where indirect coupling or communication between modules may be electrical or of another similar form; the embodiments of the present application are not limited in this respect. Moreover, modules or sub-modules described as separate components may or may not be physically separate, may or may not be physical modules, and may be distributed across multiple circuit modules; some or all of the modules may be selected according to actual needs to achieve the purpose of the embodiments of the present application.
In an actual application scenario, the same model, such as a deep learning model, may be deployed locally to which multiple data belong, and each data may be trained locally to obtain gradient data generated during each round of training of parameters in the local model, where the gradient data may be used to update the parameters in the local model to complete training of the local model.
In a horizontal federal learning scenario, gradient data can be interacted among places (called participants below) where each data belongs through a client or other devices, gradient fusion is performed on the gradient data in a plurality of participants, and accordingly parameter updating is performed on local models of the participants based on results after the gradient fusion. Therefore, the convergence of the local model can be accelerated, and the training efficiency and accuracy of the local model can be improved.
However, in the process, the complete gradient data of different participants cannot be directly transmitted to other participants, so that the leakage of the gradient data is avoided, and the data privacy of each participant is protected. If each participant performs gradient fusion on the gradient data by means of the aggregation node deployed independently, in the process of interacting the gradient data between each participant and the aggregation node, a process of encrypting and decrypting the gradient data needs to be performed, which may result in that each participant takes too long to obtain a gradient fusion result, thereby seriously reducing the training speed of each participant on the local model.
Based on this, the present embodiment provides a gradient data fusion method, a related apparatus and a storage medium, which may be applied to an annular communication system, and aims to protect the privacy and security of gradient data of a participant and improve the efficiency of the participant in obtaining a gradient fusion result. Specifically, data communication is performed between a plurality of participants based on a ring communication system, and each participant may transmit the preprocessed gradient data or gradient fusion data generated by the participant to a downstream participant, and the downstream participant continues gradient fusion based on the gradient data stored by the downstream participant and transfers the gradient fusion data, thereby generating a final gradient fusion result based on the gradient data in the plurality of participants. In this way, the gradient data received by the downstream participant is preprocessed or is gradient fusion data, but not the gradient data of the upstream participant, so that the privacy and the security of the gradient data of the participants can be effectively protected, and meanwhile, a plurality of participants do not need to frequently perform data encryption and decryption operations, so that the efficiency of the participants in obtaining the gradient fusion result can be improved.
For example, the solution of the embodiment of the present application may be implemented based on a cloud technology, and particularly relates to the technical fields of cloud computing, cloud storage, and the like in the cloud technology, which are respectively described below.
Among them, cloud computing (cloud computing) is a computing mode that distributes computing tasks over a resource pool formed by a large number of computers, so that various application systems can acquire computing power, storage space, and information services as needed. The network that provides the resources is called the "cloud". Resources in the "cloud" appear to the user as if they are infinitely expandable and can be acquired at any time, used on demand, expanded at any time, and paid for use. Generally, an Infrastructure as a Service (IaaS) platform is established by an Infrastructure as a Service (Infrastructure as a Service) provider of cloud computing, and multiple types of virtual resources are deployed in a resource pool and are selectively used by external clients (such as participants in the embodiment of the present application).
According to the logic function division, a Platform as a Service (PaaS) layer can be deployed on the IaaS layer, a Software as a Service (SaaS) layer is deployed on the PaaS layer, and the SaaS can be directly deployed on the IaaS layer. PaaS is a platform on which software runs, such as a database, a web container, etc. SaaS is a variety of business software, such as web portal, sms group sender, etc. Generally speaking, SaaS and PaaS are upper layers relative to IaaS.
A distributed cloud storage system (hereinafter referred to as a storage system) is a storage system that, through application software or application interfaces, uses functions such as cluster applications, grid technology, and distributed storage file systems to integrate a large number of storage devices of different types in a network (storage devices are also referred to as storage nodes) so that they work cooperatively, providing data storage and service access functions externally. At present, the storage method of such a storage system is to create logical volumes; when a logical volume is created, physical storage space is allocated to it, which may consist of the disks of one or several storage devices. A client stores data on a logical volume, that is, the data is stored on a file system; the file system divides the data into multiple parts, each part being an object that contains not only the data but also additional information such as a data identifier (ID); the file system writes each object into the physical storage space of the logical volume and records the storage location information of each object, so that when the client requests access to the data, the file system can allow the client to access the data according to the storage location information of each object.
The process by which the storage system allocates physical storage space for a logical volume is specifically as follows: the physical storage space is divided in advance into stripes according to an estimate of the capacity of the objects to be stored in the logical volume (the estimate often leaves a large margin relative to the capacity of the actual objects to be stored) and the Redundant Array of Independent Disks (RAID) configuration, and a logical volume can be understood as a stripe; physical storage space is thereby allocated to the logical volume.
Illustratively, the gradient data fusion method provided by the present embodiment can be applied to the ring communication system 100 shown in fig. 1 a. As shown in fig. 1a, the ring communication system 100 includes a plurality of nodes, and fig. 1a illustrates an example in which 5 nodes are included, and in actual application, the number of nodes included in the ring communication system 100 may be any value greater than 1. Typically, each node in the ring communication system 100 may be implemented by a client (client) or other device owned by a participant.
The gradient data in each of the nodes 101 to 105 may be divided in advance into 5 segments (also called chunks) according to the number of nodes (that is, 5). When gradient data is exchanged between the nodes, the node 101 may send the first of its 5 segments to the node 102. The node 102 may perform gradient fusion on the received segment and the first segment in the node 102 to obtain gradient fusion data 1, and send the gradient fusion data 1 on to the node 103. The node 103 may perform gradient fusion on the received gradient fusion data 1 and the first segment in the node 103 to obtain gradient fusion data 2, and send the gradient fusion data 2 on to the node 104. The node 104 may perform gradient fusion on the received gradient fusion data 2 and the first segment in the node 104 to obtain gradient fusion data 3, and send the gradient fusion data 3 on to the node 105. The node 105 may perform gradient fusion on the received gradient fusion data 3 and the first segment in the node 105 to obtain gradient fusion data 4, which is the result of performing gradient fusion on the first segments of all nodes. The remaining segments in each node may be fused by an analogous process.
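The slice-wise ring fusion described above can be sketched in a few lines. This is a minimal, hypothetical illustration: all names are invented, and plain element-wise addition stands in for the "gradient fusion" operation, which the patent leaves abstract (accumulation is given only as an example).

```python
# Illustrative sketch of slice-wise ring fusion for nodes 101..105.
# Assumptions: 5 nodes, gradients pre-split into 5 segments, fusion = addition.
import random

NUM_NODES = 5          # nodes 101..105
SLICE_LEN = 3
random.seed(0)

# slices[node][k] is segment k of the gradient held by that node.
slices = [
    [[random.random() for _ in range(SLICE_LEN)] for _ in range(NUM_NODES)]
    for _ in range(NUM_NODES)
]

def fuse(a, b):
    """Gradient fusion modeled as element-wise accumulation."""
    return [x + y for x, y in zip(a, b)]

def ring_fuse_slice(k):
    """Pass segment k around the ring: node 101 sends its segment, and each
    downstream node fuses in its own segment k before forwarding."""
    fused = list(slices[0][k])            # node 101's segment k
    for node in range(1, NUM_NODES):      # nodes 102..105, producing
        fused = fuse(fused, slices[node][k])  # gradient fusion data 1..4
    return fused                          # held by the last node in the ring

fused_0 = ring_fuse_slice(0)
expected = [sum(slices[n][0][i] for n in range(NUM_NODES))
            for i in range(SLICE_LEN)]
assert all(abs(a - b) < 1e-9 for a, b in zip(fused_0, expected))
```

Note that at each hop only fused data leaves a node, which is the privacy property the next paragraph relies on.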
In the process of performing gradient fusion on the first segment of the gradient data in each node (other segments are similar), the data transmitted between the nodes 102 to 105 are all data obtained after gradient fusion (i.e., gradient fusion data 1 to gradient fusion data 4), not the original gradient data in the nodes. This not only protects the data privacy of the multiple participants, but also means no node needs to encrypt and decrypt the exchanged data, so the efficiency with which the participants obtain the gradient fusion result can be effectively improved.
Further, before sending data to the node 102, the node 101 may perform preset processing on the first segment in the node 101 to generate first data, so that the node 101 sends the first data to the node 102 for gradient fusion, and the nodes 102, 103, 104, and 105 in turn pass the gradient fusion data along until it reaches the node 101. Based on the preset processing performed on the first segment beforehand, the node 101 can generate the final gradient fusion result from the gradient fusion data sent by the node 105. In this way, the gradient fusion result is generated quickly while the data privacy of all participants is protected.
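One natural reading of this "preset processing" is additive masking; that choice is an assumption for illustration, since the patent does not fix a concrete preprocessing scheme. In this hypothetical sketch, node 101 adds a random mask before sending its segment, so the raw segment is never exposed, and removes the mask once the fused data comes back around the ring.

```python
# Illustrative masking sketch (names and the additive mask are assumptions).
import random

NUM_NODES = 5
SLICE_LEN = 3
random.seed(1)

# Segment k of the gradient held by each of nodes 101..105.
grads = [[random.random() for _ in range(SLICE_LEN)] for _ in range(NUM_NODES)]

# Node 101's preset processing: blind its segment with a random mask.
mask = [random.random() for _ in range(SLICE_LEN)]
first_data = [g + m for g, m in zip(grads[0], mask)]

# Nodes 102..105 each fuse in their own segment and forward the result.
fused = first_data
for node in range(1, NUM_NODES):
    fused = [f + g for f, g in zip(fused, grads[node])]

# Node 105 returns the fused data to node 101, which removes the mask
# to recover the true gradient fusion result.
result = [f - m for f, m in zip(fused, mask)]
expected = [sum(g[i] for g in grads) for i in range(SLICE_LEN)]
assert all(abs(a - b) < 1e-9 for a, b in zip(result, expected))
```

Only node 101 knows the mask, so only node 101 can recover the final result from the returned data, matching the round-trip described above.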
In addition, in the ring communication system 100, no centralized aggregation node needs to be introduced to participate in gradient data fusion, which reduces the difficulty of finding a third-party aggregation node in practical application scenarios and avoids the leakage of all other nodes' gradient data through collusion between any node and a third-party aggregation node.
In the above embodiment, the nodes 101 to 105 fuse gradient data in units of segments. In other implementations, the gradient data each of the nodes 101 to 105 contributes to each round of gradient fusion may be all of the gradient data in that node. In a specific implementation, the node 101 may perform preset processing on all of its gradient data to generate first data and transmit the first data to the node 102. The node 102 may perform gradient fusion on the received first data and its own gradient data to generate gradient fusion data I, and send it to the node 103. The node 103 may perform gradient fusion on the received gradient fusion data I and its own gradient data to generate gradient fusion data II, and send it to the node 104. The node 104 may perform gradient fusion on the received gradient fusion data II and its own gradient data to generate gradient fusion data III, and send it to the node 105. Finally, the node 105 may perform gradient fusion on the received gradient fusion data III and its own gradient data, and send the result to the node 101; based on the earlier preset processing, the node 101 then generates the final result of fusing the gradient data in all the nodes. In this way, the privacy of all participants' gradient data is protected, and the efficiency with which the participants obtain the gradient fusion result is effectively improved.
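The whole-gradient variant can be sketched in the same style. Again all names are invented and additive masking is assumed for the preset processing; the sketch also checks the privacy property claimed here, namely that no transmitted message equals any node's raw gradient.

```python
# Illustrative whole-gradient ring fusion (assumptions: additive mask,
# fusion = addition, 5 nodes).
import random

NUM_NODES = 5
DIM = 4
random.seed(2)

# Full gradient held by each of nodes 101..105.
grads = [[random.random() for _ in range(DIM)] for _ in range(NUM_NODES)]
mask = [random.random() for _ in range(DIM)]

# First data: node 101's gradient, blinded by the mask.
msg = [g + m for g, m in zip(grads[0], mask)]
transcript = [list(msg)]                 # everything sent over the ring
for node in range(1, NUM_NODES):         # nodes 102..105 each fuse and forward
    msg = [a + b for a, b in zip(msg, grads[node])]
    transcript.append(list(msg))

# Node 101 removes the mask to obtain the final fusion result.
final = [a - b for a, b in zip(msg, mask)]
expected = [sum(g[i] for g in grads) for i in range(DIM)]
assert all(abs(a - b) < 1e-9 for a, b in zip(final, expected))

# No transmitted message equals any node's raw gradient data.
assert all(t != g for t in transcript for g in grads)
```

The transcript check makes the privacy argument concrete: every message is either masked or already fused with at least one other node's data.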
It should be understood that the architecture of the ring communication system 100 shown in fig. 1a is only one example provided in the embodiments of the present application, and in practical applications, the architecture of the ring communication system 100 is not limited to this example. For example, other possible ring communication systems 100 may include a greater or lesser number of nodes; for instance, the ring communication system 100 shown in fig. 1b includes 3 nodes. In summary, the embodiments of the present application can be applied to any applicable ring communication system and are not limited to the above examples. In an actual application scenario, the ring communication system 100 shown in fig. 1a and fig. 1b may be deployed as an independent system. Alternatively, as shown in fig. 1c, the ring communication system 100 shown in fig. 1a and fig. 1b may also be integrated into a target system 200 as a subsystem; that is, a ring communication system may be constructed between some nodes in the target system 200 and the corresponding gradient data fusion process performed among them.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, various non-limiting embodiments of the present application are described below with reference to the accompanying drawings. It is apparent that the described embodiments are some, but not all, embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
Referring to fig. 1d, a signaling interaction diagram of a gradient fusion method in this embodiment is shown. The method may be applied to a ring communication system, such as the ring communication systems shown in fig. 1a and 1b, where the ring communication system includes a first node, a third node, and m second nodes (m is a positive integer), and the m second nodes include a target node. As shown in fig. 1d, the method may specifically include:
S101: The first node obtains first data, where the first data is a first fragment, or data obtained by preprocessing the first fragment or the gradient data in the first node, and the first fragment is one of a plurality of fragments into which the gradient data participating in gradient fusion in the first node is divided.
In this embodiment, when gradient fusion needs to be performed on gradient data stored in a plurality of nodes in the ring communication system, a first node in the ring communication system may acquire data participating in the gradient fusion (for convenience of distinction, this embodiment is referred to as first data).
In some possible embodiments, the first node may slice the gradient data stored in itself; for example, the gradient data may be sliced into a plurality of slices according to the number of nodes in the ring communication system, so that the first node may take a first slice of the plurality of slices as the first data. Alternatively, after slicing the gradient data to obtain the first slice, the first node may preprocess the first slice and use the preprocessed first slice as the first data.
The preprocessing performed on the first fragment may be, for example, secret sharing performed on the first fragment, or the like. The secret sharing (secret share) refers to splitting a secret into multiple pieces of data, each piece of data may be referred to as a share, and the split secret may be recovered based on all shares, while the secret may not be recovered based on a single share. For example, if the first fragment is data 10, data 2 and data 8 can be generated by secret sharing of the first fragment, and data 10 can be obtained by calculating the sum of data 2 and data 8. Accordingly, after the first node performs secret sharing on the first segment, two pieces of data may be generated, where one piece of data may be used as the first data, and the other piece of data may be used as the second data and stored in the first node. In actual application, the preprocessing for the first fragment may be encryption, multiplication, or data position changing, or may be other processing, and this embodiment is not limited to this.
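As an illustrative sketch only (the Python function names and the integer value range are assumptions, not part of the method described here), the additive secret sharing used in this preprocessing can be expressed as follows:

```python
import random

def split_secret(fragment):
    """Split each gradient value into two additive shares; summing the
    shares element-wise recovers the original fragment."""
    share1 = [random.randint(-10**6, 10**6) for _ in fragment]  # random mask
    share2 = [v - r for v, r in zip(fragment, share1)]
    return share1, share2

def recover_secret(*shares):
    """Recover the fragment by element-wise summation of all shares."""
    return [sum(vals) for vals in zip(*shares)]

# The first fragment is split; one share becomes the first data sent out,
# the other is stored in the first node as the second data.
first_fragment = [6, 16, 5, 8, 4, 10, 21]
first_data, second_data = split_secret(first_fragment)
assert recover_secret(first_data, second_data) == first_fragment
```

A single share is a randomly masked vector, so a node that receives only the first data cannot reconstruct the first fragment from it.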
In another possible implementation, the first node may pre-process all gradient data stored in the first node, and use the pre-processed data as the first data, and so on. The preprocessing performed on all gradient data may be the above processes such as secret sharing, encryption, multiplication, or data position changing, or may be other processes, which are not described herein again.
In this embodiment, after obtaining the first data, the first node may send the first data to other nodes in the ring communication system, so as to perform gradient fusion based on the first data. Therefore, the first node may further perform the following steps:
s102: the first node transmits the first data to a target node of the at least one second node.
The target node is a downstream node adjacent to the first node in the ring communication system, that is, in the ring communication system, data transmitted by the first node is usually received by the target node.
S103: and the target node performs gradient fusion on the received first data and the gradient data stored by the target node to generate gradient fusion data 1.
In this embodiment, the data amount of the gradient data participating in the gradient fusion in the target node is kept consistent with the data amount of the first data. Specifically, when the first data is one fragment (or one preprocessed fragment) in the first node, the target node performs gradient fusion with the received first data by using gradient data of one fragment stored by the target node, so as to generate gradient fusion data 1; and when the first data is all gradient data of the first node, the target node performs gradient fusion with the received first data by using all gradient data stored by the target node, so as to generate gradient fusion data 1. In general, the generated gradient fusion data 1 corresponds to the first data in data amount.
In a scenario of horizontal federated learning and the like, multiple nodes typically each deploy the same model, so that the data volumes of the gradient data generated when the multiple nodes train their respective models are the same. When gradient fusion is performed, each node generally fuses gradient data of the same parameter of the model. For example, if the first data sent by the first node is specifically gradient data of a first network layer parameter in the model deployed in the first node, the target node performs gradient fusion on the first data and the gradient data of the same first network layer parameter in its own deployed model.
S104: the target node sends the gradient fusion data 1 to a downstream second node adjacent to the target node in the m second nodes, so that the downstream second node performs gradient fusion with the gradient data stored by the downstream second node by using the received gradient fusion data to generate corresponding gradient fusion data.
In this implementation, the m second nodes may perform gradient fusion on the gradient fusion data received by themselves and a part or all of the gradient data stored by themselves in sequence with reference to a gradient fusion process executed by the target node, and transmit the generated gradient fusion result to the next second node, and a data amount of the gradient data participating in the gradient fusion in each second node is consistent with a data amount of the received data. In this way, the mth second node can generate gradient fusion data m after performing gradient fusion between the gradient fusion data (m-1) sent by the (m-1) th node and the gradient data stored in the second node.
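The relay described in S102 to S105 can be sketched as below, assuming (consistent with the concrete example given later in this application) that gradient fusion is element-wise summation; the function name and sample values are illustrative only:

```python
def ring_fuse_chain(first_data, downstream_gradients):
    """Fuse first_data through the chain of downstream nodes: each node sums
    the received data with its own gradient data of the same size and
    forwards the result to the next node."""
    fused = list(first_data)
    for grads in downstream_gradients:
        assert len(grads) == len(fused)  # data amounts must stay consistent
        fused = [x + y for x, y in zip(fused, grads)]
    return fused

first_data = [1.0, 2.0, 3.0]
downstream = [[0.5, 0.5, 0.5],   # target node
              [1.0, 1.0, 1.0],   # next second node
              [2.0, 0.0, 1.0]]   # third node
assert ring_fuse_chain(first_data, downstream) == [4.5, 3.5, 5.5]
```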
S105: and the third node receives the gradient fusion data m sent by the mth second node.
S106: the third node performs gradient fusion with the gradient data stored by the third node by using the gradient fusion data m to generate first gradient fusion data.
S107: the third node sends first gradient fusion data to the first node.
S108: the first node determines a first gradient fusion result of the first node according to the first gradient fusion data.
As some examples, when the first data sent by the first node to the target node in step S102 is the first fragment, the first gradient fusion result is the first gradient fusion data received by the first node. At this time, the first gradient fusion result is obtained by performing gradient fusion according to one segment (including the first data, the corresponding segment in the m second nodes, and the corresponding segment in the third node) in all the nodes.
When the first data sent by the first node to the target node in step S102 is data obtained by preprocessing the first segment, after receiving the first gradient fusion data, the first node may generate the first gradient fusion result according to the first gradient fusion data and the preprocessing performed on the first segment. Taking the preprocessing being secret sharing as an example, after receiving the first gradient fusion data, the first node may perform gradient fusion on the first gradient fusion data and the second data generated when the first segment was preprocessed, to generate the first gradient fusion result. At this time, the first gradient fusion result is obtained by gradient fusion of one fragment from each of the nodes (including the first data, the corresponding fragments in the m second nodes, and the corresponding fragment in the third node).
When the first data sent by the first node to the target node in step S102 is data obtained by preprocessing all the gradient data in the first node, after receiving the first gradient fusion data, the first node may generate the first gradient fusion result according to the first gradient fusion data and the preprocessing performed on all the gradient data. At this time, the first gradient fusion result is obtained by gradient fusion of all the gradient data in all the nodes.
Because the data transmitted between the second nodes and the third node in the ring communication system is gradient fusion data rather than the original gradient data in the nodes, the privacy of the gradient data of the second nodes and the third node is protected while gradient data fusion is carried out. Moreover, the nodes do not need to encrypt and decrypt the exchanged data multiple times, so the efficiency with which the nodes obtain the gradient fusion result can be effectively improved. In addition, when the first data sent by the first node to the target node is preprocessed data, the gradient data of the first node is not leaked to the second node, so the privacy of the gradient data of the first node can be further protected.
In addition, in the annular communication system, a centralized aggregation node does not need to be introduced to participate in gradient data fusion, so that the difficulty brought by searching for a third-party aggregation node in an actual application scene can be reduced, and the leakage of gradient data of all other nodes caused by collusion between any node and the third-party aggregation node can be avoided.
Further, after obtaining the first gradient fusion result, the first node may also transmit the first gradient fusion result to other nodes, so that all nodes can obtain the first gradient fusion result. Based on this, the present embodiment may further include the following steps:
s109: and the first node sends the first gradient fusion result to the target node.
After receiving the first gradient fusion result, the target node may store the first gradient fusion result separately, or may replace its corresponding segment with the first gradient fusion result, that is, use the storage space of that segment to store the first gradient fusion result, so as to reduce the storage resources required by the node in the gradient fusion process.
S110: The first gradient fusion result is sequentially transmitted among the target node, the remaining second nodes, and the third node.
It should be noted that the embodiment shown in fig. 1d mainly describes the process of performing gradient fusion according to the first data of the first node and the corresponding gradient data in the other nodes. When the first data is specifically the data of one segment in the first node (or data obtained by preprocessing one segment), then for each of the other segments stored in the first node, the processes shown in steps S101 to S108 may be referred to in order to generate the gradient fusion result corresponding to each segment, so as to obtain a complete gradient fusion result corresponding to all the gradient data in the first node. The first node may then transmit the complete gradient fusion result to the other nodes, so that all the nodes obtain the required complete gradient fusion result.
Further, when the first data is specifically one fragment or data obtained by preprocessing one fragment, in the ring communication system, the gradient fusion process may be executed in parallel on a plurality of fragments in the first node with reference to the processes from step S101 to step S108, so as to further accelerate the overall gradient data fusion process.
As an example, the first node may simultaneously send multiple fragments to the target node to trigger a parallel gradient fusion process for the multiple fragments.
In another example, when the first node sends the first fragment to the target node, the target node may send the second fragment of its own gradient data to the next second node, and that second node continues to send the third fragment to its downstream second node, and so on. In this way, each node can execute the gradient fusion process on the received gradient fusion data and the corresponding fragment of its stored gradient data while also sending gradient fusion data to its downstream node, so that the multiple nodes perform the gradient data fusion process in parallel. For example, while sending the first data to the target node, the first node may also receive, from the third node, second gradient fusion data obtained by gradient fusion of partial gradient data (e.g., one fragment) in the third node and partial gradient data (e.g., one fragment) in one or more second nodes (not including the target node). The first node may then perform gradient fusion on the second gradient fusion data and a second fragment of the multiple fragments corresponding to its stored gradient data to obtain third gradient fusion data, and send the third gradient fusion data to the target node, so that the target node performs the corresponding gradient fusion process based on the third gradient fusion data.
Correspondingly, in the process of executing the gradient fusion in parallel by the plurality of nodes in the ring-shaped communication system, different nodes can respectively obtain the gradient fusion result generated by one fragment based on the gradient data stored by the different nodes, and the different nodes can obtain the gradient fusion results corresponding to the different fragments, so that the plurality of nodes can send the gradient fusion results corresponding to the respectively generated fragment to respective downstream nodes in parallel. For example, the first node may receive a second gradient fusion result sent by the third node while sending the first gradient fusion result to the target node, where the second gradient fusion result is obtained by performing gradient fusion on a third segment of the multiple segments corresponding to the gradient data of the first node, partial gradient data (specifically, one segment of each second node) in all the second nodes, and partial gradient data (specifically, one segment of the third node) in the third node. Therefore, the efficiency of acquiring a complete gradient fusion result by each node can be improved.
For convenience of description and understanding, the technical solutions of the embodiments of the present application will be described below with reference to a specific ring communication system. Referring to fig. 2, a schematic flow chart of a gradient data fusion method provided in the embodiment of the present application, where the method is applied to the ring communication system 100 shown in fig. 1a, and specifically may include:
S201: the nodes 101 to 105 divide the gradient data in each node into 5 pieces according to the number of nodes participating in the fusion of the gradient data in the ring communication system 100.
In a specific implementation, the node 101 may divide its gradient data into 5 slices, namely the a0, b0, c0, d0, and e0 slices shown in fig. 3. Similarly, the node 102 divides its gradient data into the 5 slices a1, b1, c1, d1, e1; the node 103 divides its gradient data into the 5 slices a2, b2, c2, d2, e2; the node 104 divides its gradient data into the 5 slices a3, b3, c3, d3, e3; and the node 105 divides its gradient data into the 5 slices a4, b4, c4, d4, e4.
S202: The node 101 performs secret sharing on the a0 slice to generate a01 and a02.
The secret sharing (secret share) is to split a secret into multiple pieces of data, each piece of data may be referred to as a share, and the split secret can be recovered based on all shares, while the secret cannot be recovered based on a single share. In this embodiment, the node 101 may split the a0 slice as the secret to generate data a01 and data a02, where the data amounts of a01 and a02 are each equal to the data amount of the a0 slice. For example, assume the data in the a0 slice is {6,16,5,8,4,10,21}; the node 101 may split the data in the a0 slice into the data set a01, which is {2,5,1,3,1,9,15}, and the data set a02, which is {4,11,4,5,3,1,6}. By correspondingly accumulating the data in a01 and a02, the data in the a0 slice can be recovered; for example, the data "21" in the a0 slice can be recovered by calculating the sum of the data "15" in a01 and the data "6" in a02.
In this embodiment, since the node 101 can recover the a0 slice from a01 and a02, the node 101 may replace the a0 slice with a01, as shown in fig. 4, and store the a02 data separately in the node 101.
S203: The node 101 sends a01 to the node 102.
It is noted that, since the gradient data sent by the node 101 to the node 102 is a01, generated by secret sharing of the a0 slice, rather than the data of the a0 slice itself, the data privacy of the node 101 is protected, and no leakage of gradient data occurs in the data communication between the node 101 and the node 102.
S204: The node 102 performs gradient fusion on the received a01 and its a1 slice to generate gradient fusion data A1, and sends the generated gradient fusion data A1 to the node 103.
In this embodiment, the gradient fusion of two pieces of slice data may be, for example, a summation operation on the gradient data in the two slices. For example, assume a01 is {3,5,4,2,7,2} and a1 is {1,5,6,9,11,21}; then the gradient fusion data A1 obtained by fusing a01 and a1 may be {4,10,10,11,18,23}. In this manner, the node 102 completes one gradient fusion process on the gradient data.
As shown in fig. 4, after fusing a01 and a1, the node 102 may store the generated gradient fusion data A1 in the storage location of the a1 slice, that is, cover a1 with A1.
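The summation-based fusion of this step, using the numbers from the example above, can be checked with a short illustrative sketch (the function name is an assumption):

```python
def gradient_fuse(slice_x, slice_y):
    """Element-wise summation fusion of two equally sized gradient slices."""
    assert len(slice_x) == len(slice_y)
    return [x + y for x, y in zip(slice_x, slice_y)]

a01 = [3, 5, 4, 2, 7, 2]
a1 = [1, 5, 6, 9, 11, 21]
A1 = gradient_fuse(a01, a1)
assert A1 == [4, 10, 10, 11, 18, 23]  # the gradient fusion data sent onward
```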
It should be noted that, since the data sent by the node 102 to the node 103 is the data after gradient fusion, rather than the gradient data in the node 102, the data privacy of the node 102 can be effectively protected.
S205: The node 103 performs gradient fusion on the received gradient fusion data A1 and its a2 slice to generate gradient fusion data A2, and sends the generated gradient fusion data A2 to the node 104.
The gradient fusion data A2 is obtained by gradient fusion of a01 in the node 101, the a1 slice in the node 102, and the a2 slice in the node 103, as shown in fig. 4.
S206: The node 104 performs gradient fusion on the received gradient fusion data A2 and its a3 slice to generate gradient fusion data A3, and sends the generated gradient fusion data A3 to the node 105.
The gradient fusion data A3 is obtained by gradient fusion of a01 in the node 101, the a1 slice in the node 102, the a2 slice in the node 103, and the a3 slice in the node 104, as shown in fig. 4.
S207: The node 105 performs gradient fusion on the received gradient fusion data A3 and its a4 slice to generate gradient fusion data A4, and sends the generated gradient fusion data A4 to the node 101.
The gradient fusion data A4 is obtained by gradient fusion of a01 in the node 101, the a1 slice in the node 102, the a2 slice in the node 103, the a3 slice in the node 104, and the a4 slice in the node 105, as shown in fig. 4.
S208: The node 101 performs gradient fusion on the received gradient fusion data A4 and a02, generated by secret sharing of the a0 slice, to generate a gradient fusion result A0.
At this time, the gradient fusion result A0 is obtained by gradient fusion of a01 and a02 in the node 101, the a1 slice in the node 102, the a2 slice in the node 103, the a3 slice in the node 104, and the a4 slice in the node 105, as shown in fig. 4. Since fusing a01 and a02 yields the a0 slice in the node 101, the gradient fusion result A0 finally generated by the node 101 is the gradient fusion result generated by gradient fusion of the a0, a1, a2, a3, and a4 slices in the nodes 101 to 105.
Further, after obtaining the gradient fusion result A0, the node 101 may transmit the gradient fusion result A0 to the other nodes, so that the other nodes also obtain the result generated by gradient fusion of the a0, a1, a2, a3, and a4 slices, as shown in fig. 5. Specifically, the node 101 may send the gradient fusion result A0 to the node 102; the node 102 may cover the a1 slice with the received gradient fusion result A0 (or store it separately, etc.) and send the gradient fusion result A0 to the node 103; the node 103 may then send the gradient fusion result A0 to the node 104, and the node 104 may send the gradient fusion result A0 on to the node 105.
In the above process of performing gradient fusion on the a0, a1, a2, a3, and a4 slices, this embodiment takes the node 101 performing secret sharing on the a0 slice as an example. In practical application, the node 101 may also perform other preset processing on a0, for example, encrypting a0, multiplying it, or changing data positions, which is not limited in this embodiment. Accordingly, the node 101 generates the final gradient fusion result A0 according to the gradient fusion data A4 sent by the node 105 and the preset processing previously performed on a0.
In addition, the embodiment shown in fig. 2 mainly describes, by way of example, the gradient fusion process performed on the gradient data of one slice in the nodes 101 to 105. In practical application, for each slice in the nodes 101 to 105, the gradient fusion results A0, B0, C0, D0, E0 corresponding to the slices can be generated in the above manner, as shown in fig. 6. The gradient fusion result B0 is obtained by gradient fusion of the b0 (including b01 and b02), b1, b2, b3, and b4 slices in the nodes 101 to 105; the gradient fusion result C0 is obtained by gradient fusion of the c0 (including c01 and c02), c1, c2, c3, and c4 slices in the nodes 101 to 105; the gradient fusion result D0 is obtained by gradient fusion of the d0 (including d01 and d02), d1, d2, d3, and d4 slices in the nodes 101 to 105; and the gradient fusion result E0 is obtained by gradient fusion of the e0 (including e01 and e02), e1, e2, e3, and e4 slices in the nodes 101 to 105.
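The single-slice flow of steps S201 to S208 can be simulated end to end; the sketch below assumes summation fusion and additive secret sharing as described above, and all names and sizes are illustrative:

```python
import random

random.seed(7)
NODES, SIZE = 5, 4

# One slice of gradient data per node (a0 .. a4), filled with sample values.
slices = [[random.randint(0, 9) for _ in range(SIZE)] for _ in range(NODES)]

# S202: node 101 secret-shares its slice a0 into a01 (sent) and a02 (kept).
a01 = [random.randint(-100, 100) for _ in range(SIZE)]
a02 = [v - r for v, r in zip(slices[0], a01)]

# S203-S207: each subsequent node sums the received data with its own slice.
fused = a01
for k in range(1, NODES):
    fused = [x + y for x, y in zip(fused, slices[k])]

# S208: node 101 fuses the returned data with its retained share a02.
A0 = [x + y for x, y in zip(fused, a02)]

# The result equals the plain sum of all nodes' slices, with no node ever
# having seen another node's raw slice.
assert A0 == [sum(col) for col in zip(*slices)]
```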
For convenience of understanding, the embodiments of the present application provide the following implementation examples:
In the first implementation example, after the above steps S201 to S208 are performed, the nodes may, based on the processes described in steps S201 to S208, perform gradient fusion on the b0 (including b01 and b02), b1, b2, b3, and b4 slices in the nodes 101 to 105 to generate a gradient fusion result B0, and then, based on the same processes, perform gradient fusion on the c0 (including c01 and c02), c1, c2, c3, and c4 slices in the nodes 101 to 105 to generate a gradient fusion result C0. By analogy, the gradient fusion results A0, B0, C0, D0, E0 are generated in sequence, as shown in fig. 6. For the gradient fusion processes of B0, C0, D0, E0, reference can be made to the implementation process of generating the gradient fusion result A0 described above, which is not repeated here.
Thus, by performing gradient fusion on the slices in the nodes 101 to 105 one by one, not only can the gradient fusion results A0, B0, C0, D0, E0 required by each node be generated, but also the data transmitted among the nodes 102 to 105 is gradient fusion data rather than the original gradient data in the nodes, which effectively protects the privacy of the gradient data of the nodes 102 to 105 in the process of gradient data fusion. In addition, the nodes do not need to encrypt and decrypt the exchanged data multiple times, so the efficiency with which the nodes obtain the gradient fusion result can be effectively improved.
In a second implementation example, the node 101 may perform the gradient fusion process for multiple slices in each node simultaneously, based on processes similar to steps S201 to S208. For example, the node 101 may secret-share all of its slices (i.e., the entire gradient data) in advance, generating a01 and a02 based on the a0 slice, b01 and b02 based on the b0 slice, c01 and c02 based on the c0 slice, d01 and d02 based on the d0 slice, and e01 and e02 based on the e0 slice, where a02, b02, c02, d02, e02 are stored separately in the node 101, and a01, b01, c01, d01, e01 are sent to the node 102. Then, the node 102 may perform corresponding gradient fusion on the received data and each slice in the node 102 to generate corresponding gradient fusion data A1, B1, C1, D1, E1, cover the original a1, b1, c1, d1, e1 slices respectively, and send the generated gradient fusion data A1, B1, C1, D1, E1 to the node 103 for further gradient fusion, and so on. Finally, the node 105 sends the generated gradient fusion data A4, B4, C4, D4, E4 to the node 101, so that the node 101 performs gradient fusion of A4 with the separately stored a02 to generate the final gradient fusion data A0, generates the gradient fusion data B0, C0, D0, E0 in a similar manner, and obtains the gradient fusion results shown in fig. 6.
As such, the overall efficiency of generating gradient fusion data by multiple nodes may be improved relative to the first implementation example. In practical application, when the node 101 shares all the segments in a secret manner, the nodes 101 to 105 may not need to perform segment processing on the gradient data, that is, each node may be regarded as including only one segment, and the gradient data in the segment is all the gradient data in the node.
The node 101 may then sequentially transmit the generated gradient fusion data A0, B0, C0, D0, E0 to the other nodes, as shown in fig. 7, so that all the other nodes can obtain the gradient fusion result generated based on the gradient data in all the nodes.
In a third implementation example, each node may be responsible for performing gradient data fusion on different slices based on the similar processes of step S201 to step S208.
In specific implementation, after the nodes 101 to 105 segment their respective gradient data according to the number of nodes, secret sharing may be performed on the segments in different sequence positions.
For example, as shown in fig. 8: first, the node 101 performs secret sharing on the a0 slice to generate a01 and a02, replaces the a0 slice with a01, and stores a02 separately; the node 102 performs secret sharing on the b1 slice to generate b11 and b12, replaces the b1 slice with b11, and stores b12 separately; the node 103 performs secret sharing on the c2 slice to generate c21 and c22, replaces the c2 slice with c21, and stores c22 separately; the node 104 performs secret sharing on the d3 slice to generate d31 and d32, replaces the d3 slice with d31, and stores d32 separately; and the node 105 performs secret sharing on the e4 slice to generate e41 and e42, replaces the e4 slice with e41, and stores e42 separately.
Then, the node 101 sends a01 to the node 102; at the same time, the node 102 sends b11 to the node 103, the node 103 sends c21 to the node 104, the node 104 sends d31 to the node 105, and the node 105 sends e41 to the node 101, as shown in fig. 8.
Thus, the node 101 performs gradient fusion on the e41 sent by the node 105 and its e0 slice to generate E0; meanwhile, the node 102 performs gradient fusion on the a01 sent by the node 101 and its a1 slice to generate A1; the node 103 performs gradient fusion on the b11 sent by the node 102 and its b2 slice to generate B2; the node 104 performs gradient fusion on the c21 sent by the node 103 and its c3 slice to generate C3; and the node 105 performs gradient fusion on the d31 sent by the node 104 and its d4 slice to generate D4, as shown in fig. 9. In this way, the nodes 101 to 105 complete one round of gradient fusion.
The node 101 then sends E0 to the node 102; at the same time, the node 102 sends A1 to the node 103, the node 103 sends B2 to the node 104, the node 104 sends C3 to the node 105, and the node 105 sends D4 to the node 101, as shown in fig. 9, so that each node continues gradient fusion of the received gradient fusion data with its own slices.
By analogy, current gradient fusion data is continuously transmitted between the nodes, and finally, after 4 times of gradient fusion, gradient fusion data as shown in fig. 10 can be obtained by each node.
In the fifth gradient fusion process, the node 101 sends B0 to the node 102; at the same time, the node 102 sends C1 to the node 103, the node 103 sends D2 to the node 104, the node 104 sends E3 to the node 105, and the node 105 sends A4 to the node 101, as shown in fig. 10.
Thus, the node 101 may perform gradient fusion on the received A4 and the separately stored a02 to generate a gradient fusion result A0, that is, the final gradient fusion result obtained by fusing a0, a1, a2, a3, and a4. Similarly, the node 102 may perform gradient fusion on the received B0 and the separately stored b12 to generate a gradient fusion result B1; the node 103 may perform gradient fusion on the received C1 and the separately stored c22 to generate a gradient fusion result C2; the node 104 may perform gradient fusion on the received D2 and the separately stored d32 to generate a gradient fusion result D3; and the node 105 may perform gradient fusion on the received E3 and the separately stored e42 to generate a gradient fusion result E4, as shown in fig. 11.
Since each node (nodes 101 to 105) participates in data transmission, data reception and gradient data fusion at every step, data processing efficiency is further improved compared with the second implementation example, and at the same time the resource utilization of each node is improved.
Finally, nodes 101 to 105 may pass their respective gradient fusion results A0, B1, C2, D3 and E4 around the ring in turn, as shown in fig. 12, so that every node obtains the gradient fusion results generated from the gradient data of all nodes.
It should be noted that, for ease of understanding, the above implementation examples describe the gradient fusion process for a ring communication system 100 that includes 5 nodes. In practical applications, the ring communication system 100 may include any number of nodes, and the fusion of gradient data between the nodes can be implemented in the manner described in the above implementation examples.
Taking the case where the ring communication system 100 includes N nodes and uses the third implementation example for gradient data fusion: each node divides its own gradient data into N fragments, which may be numbered 0 to (N-1) in each node. In addition, node i may perform secret sharing on its own fragment i to generate two shares: one share replaces (covers) fragment i, and the other share is stored separately in node i.
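The secret sharing step can be sketched with additive shares, which is one common realization consistent with the description (the mask-based split is an assumption, not mandated by the text):

```python
import numpy as np

rng = np.random.default_rng(42)
fragment = rng.normal(size=4)  # the fragment a node wants to protect

# Additive secret sharing: the random mask makes each share individually
# look random, while the sum of the two shares reconstructs the fragment.
mask = rng.normal(size=4)
share1 = fragment - mask  # sent out in place of the raw fragment
share2 = mask             # stored separately inside the node

assert np.allclose(share1 + share2, fragment)
```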
Then, the N nodes may generate the gradient fusion results for the different fragments in N steps. In a specific implementation, in step i (0 <= i < N-1) of the first N-1 steps, node j sends its fragment ((j-i) mod N) to node ((j+1) mod N) and receives fragment ((j-1-i) mod N) from node ((j-1) mod N); node j then performs gradient fusion on the received fragment ((j-1-i) mod N) and its own stored fragment ((j-1-i) mod N), and replaces its fragment ((j-1-i) mod N) with the generated gradient fusion result. In the Nth step, node j sends fragment ((j+1) mod N) to node ((j+1) mod N) and receives fragment j from node ((j-1) mod N); node j then performs gradient fusion on the received fragment j and its separately stored share of fragment j (generated during the secret sharing), and replaces its fragment j with the generated gradient fusion result. In this way, fragment j in node j becomes the gradient fusion result of the fragments numbered j across all nodes in their initial states.
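The index bookkeeping of these N steps can be checked in isolation; a small sketch of the (j-i) mod N rules from the paragraph above:

```python
N = 5

def step_indices(j, i, n=N):
    """Fragment indices node j handles in step i (0 <= i < n), following
    the (j - i) mod n send rule and (j - 1 - i) mod n receive rule; step
    n-1 corresponds to the text's "Nth step"."""
    send_to = (j + 1) % n
    send_frag = (j - i) % n if i < n - 1 else (j + 1) % n
    recv_frag = (j - 1 - i) % n if i < n - 1 else j
    return send_to, send_frag, recv_frag

# Across the N steps, each node receives every fragment index exactly once.
for j in range(N):
    received = [step_indices(j, i)[2] for i in range(N)]
    assert sorted(received) == list(range(N))
```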
Finally, each node can share the gradient fusion result of the fragment it holds with the other nodes by executing another N-1 steps. In a specific implementation, in step p (0 <= p < N-1) of these N-1 steps, node j sends fragment ((j-p) mod N) to node ((j+1) mod N) and receives fragment ((j-1-p) mod N) from node ((j-1) mod N); node j then replaces its own fragment ((j-1-p) mod N) with the received one, thereby obtaining the gradient fusion result for fragment ((j-1-p) mod N).
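This sharing (all-gather) phase can likewise be verified symbolically; a sketch that tracks only which fragment results each node knows, under the rotation rule just described:

```python
N = 5
# After the fusion phase, node j holds the final result for fragment j.
known = {j: {j} for j in range(N)}  # fragment results known to each node

for p in range(N - 1):  # the N-1 sharing steps described above
    sent = {j: (j - p) % N for j in range(N)}  # node j forwards ((j - p) mod N)
    for j in range(N):
        known[j].add(sent[(j - 1) % N])        # i.e. fragment ((j - 1 - p) mod N)

# Every node ends up knowing the fusion result of every fragment.
assert all(known[j] == set(range(N)) for j in range(N))
```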
It should be noted that, in the embodiment shown in fig. 2, the secret sharing is performed on one or more segments in the node, and then the gradient data transmission is performed. In other possible embodiments, the node may also transmit gradient data directly to other nodes. Specifically, refer to another flow chart of the gradient data fusion method shown in fig. 13. Taking the application of the method to the ring communication system 100 shown in fig. 1a as an example, the method shown in fig. 13 may specifically include:
s1301: each node (for example, nodes 101 to 105 in fig. 13) divides its own gradient data into a plurality of segments according to the number of nodes participating in the gradient data fusion in the ring communication system 100.
Taking the division of each node's gradient data into 5 fragments as an example: node 101 may divide its gradient data into the 5 fragments a0, b0, c0, d0, e0 shown in fig. 3. Similarly, node 102 divides its gradient data into the 5 fragments a1, b1, c1, d1, e1; node 103 into a2, b2, c2, d2, e2; node 104 into a3, b3, c3, d3, e3; and node 105 into a4, b4, c4, d4, e4.
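The splitting in S1301 can be sketched with NumPy; np.array_split is one convenient choice (an assumption, not mandated by the text) because it also tolerates gradient lengths that are not divisible by the node count:

```python
import numpy as np

N = 5                                # number of nodes in the ring
grad = np.arange(23, dtype=float)    # a flattened gradient vector
fragments = np.array_split(grad, N)  # near-equal fragments, one per node

assert len(fragments) == N
assert np.allclose(np.concatenate(fragments), grad)
```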
S1302: node 101 will a0The fragments are sent to node 102.
S1303: a to be received by the node 1020Sharded and self-stored a1The fragments are subjected to gradient fusion to generate gradient fusion data A1And fusing the generated gradient with data A1To the node 103.
In this embodiment, the gradient fusion of two fragments may be, for example, a summation of the gradient data in the two fragments.
After fusing a0 and a1, node 102 may store the generated gradient fusion data A1 at the storage location of the a1 fragment, i.e., overwrite a1 with A1.
It should be noted that, since the data sent by node 102 to node 103 has already undergone gradient fusion and is not the raw gradient data of node 102, the data privacy of node 102 can be effectively protected.
S1304: node 103 will receive A1And a2The fragments are subjected to gradient fusion to generate gradient fusion data A2And fusing the generated gradient with data A2To node 104.
Wherein the gradient is fused with the data A2Is according to a in node 1010A in the node 1021Shards and a in node 1032And carrying out gradient fusion on the fragments to obtain the compound.
S1305: node 104 will receive a2And a3The fragments are subjected to gradient fusion to generate gradient fusion data A3And fusing the generated gradient with data A3To the node 105.
S1306: the node 105 fuses the data A according to the received gradient3And a4The fragments are subjected to gradient fusion to generate gradient fusion data A4
Wherein the gradient is fused with the data A4Is according to a in node 1010A in the node 1021Sharded, a in node 1032Sharded, a in node 1043Sharding and a in node 1054And carrying out gradient fusion on the fragments to obtain the fusion protein. In this way, the fusion of gradient data for the first segment in all nodes can be realized. In the process of realizing gradient data fusion, only the node 102 knows the a in the node 1010Slicing at the jointBetween the nodes 102 and 105, data transmitted by the nodes 102 and 105 is data obtained after gradient fusion processing, so that data privacy between the nodes 102 and 105 can be effectively protected.
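Steps S1302 to S1306 form a chain of partial sums for one fragment. A minimal sketch, assuming gradient fusion is summation as stated above:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 5
a = rng.normal(size=(N, 3))  # the "a" fragment held by each of the 5 nodes

# S1302-S1306: node 101 sends its raw fragment a0; every following node
# adds its own fragment and forwards, so only partial sums A1, A2, A3
# travel on the ring before node 105 produces A4.
fused = a[0].copy()            # what node 101 sends
for j in range(1, N):
    fused = fused + a[j]       # node j computes A_j = A_{j-1} + a_j

assert np.allclose(fused, a.sum(axis=0))  # A4 = a0 + a1 + a2 + a3 + a4
```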
Further, after obtaining the gradient fusion data A4, node 105 may transmit A4 to the other nodes, so that each of them obtains the result generated by gradient fusion of the a0, a1, a2, a3 and a4 fragments. In a specific implementation, node 105 may send A4 to node 101; node 101 may overwrite its a0 fragment with the received A4 (or store it separately, etc.) and pass A4 on to node 102; node 102 may then send A4 to node 103, and node 103 may pass A4 on to node 104.
In addition, the embodiment shown in fig. 13 mainly describes, by way of example, the gradient fusion process for one fragment across nodes 101 to 105. In practical applications, the gradient fusion result corresponding to each fragment of nodes 101 to 105 can be obtained in the same manner.
Similar to the foregoing embodiment, in this embodiment, the gradient fusion result for the next fragment may be generated after the gradient fusion result for one fragment has been generated across the nodes. Alternatively, the gradient fusion results for all fragments may be generated in parallel between the nodes; the embodiment of the present application does not limit the process of generating the gradient fusion results corresponding to all fragments.
For example, in one implementation example, node 101 may send all of its fragments to node 102; node 102 performs gradient data fusion of the received fragments with all of its own stored fragments, and then sends the generated gradient fusion data to the next node, triggering that node to perform its gradient fusion process on the received data, until the last node completes the fusion of all gradient data and thereby obtains the gradient fusion results of all gradient data in all nodes.
As can be seen, since the data transmitted among nodes 102 to 105 is gradient fusion data obtained after gradient fusion processing, rather than the raw gradient data of any node, the gradient data of each node can be prevented from becoming known to other nodes during the data interaction, so that the privacy and security of the gradient data of nodes 102 to 105 can be effectively protected.
In yet another implementation example, different nodes may each be responsible for generating the gradient fusion result of the fragment at a different ordinal position. Specifically, node 101 may send its first fragment to node 102, and, following the gradient data fusion process described in the embodiment shown in fig. 13, nodes 102 to 105 in turn perform the corresponding gradient fusion operations on their own stored first fragments. In parallel, node 102 may send its second fragment to node 103, and nodes 103, 104, 105 and 101 in turn perform the corresponding gradient fusion operations on their own stored second fragments. Likewise, node 103 sends its third fragment to node 104, node 104 sends its fourth fragment to node 105, and node 105 sends its fifth fragment to node 101, each starting an analogous fusion chain.
Therefore, although the gradient data of one fragment in each node becomes known to its successor, the gradient data of the other fragments in each node is not known to any other node, so that the privacy and security of each node's gradient data can be protected.
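The staggered-chain variant above can be sketched as N parallel chains, where fragment k's chain starts at node k (again assuming fusion is summation; variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 5
frags = rng.normal(size=(N, N))  # frags[i][k]: fragment k held by node i

# Fragment k's fusion chain starts at node k, so each node reveals only
# one of its raw fragments (to its successor), and the N chains can run
# in parallel around the ring.
final = {}
for k in range(N):
    acc = frags[k, k]                 # the start node sends its raw fragment
    for step in range(1, N):
        j = (k + step) % N
        acc = acc + frags[j, k]       # the next node fuses and forwards
    final[k] = acc                    # the chain's last node holds the result

for k in range(N):
    assert np.isclose(final[k], frags[:, k].sum())
```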
It should be noted that the gradient data fusion methods of the embodiments shown in fig. 2 and fig. 13 are described as applied to the ring communication system 100 shown in fig. 1b. In an actual application scenario, multiple gradient fusion results may be generated by multiple different ring communication systems, and a final gradient fusion result is generated by combining them; since this final result is generated from the gradient data of nodes in multiple ring communication systems, the reliability and stability of the finally generated gradient fusion result can be improved.
For example, in the target system 200 shown in fig. 1c, each of the multiple ring communication systems 100 may generate a corresponding gradient fusion result using the gradient data fusion method of the embodiments shown in fig. 2 and fig. 13, so that node Q in the target system 200 may collect the gradient fusion results generated by each ring communication system 100 and merge them to generate the final gradient fusion result. Alternatively, node Q may also collect the gradient data of node Q1 and node Q2 (and node Q itself), which are independent of any ring communication system 100 in the target system 200, and generate the final gradient fusion result by performing a further gradient fusion operation on the gradient fusion results generated by the multiple ring communication systems 100 together with the gradient data of node Q1 and node Q2 (and node Q).
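The hierarchical merge performed by node Q can be sketched as one further fusion over the per-ring results and the standalone gradients (treating the further fusion as summation is an assumption; the names follow fig. 1c):

```python
import numpy as np

rng = np.random.default_rng(4)
# Gradient fusion results already produced by three ring communication
# systems, plus the gradients of the standalone nodes Q1, Q2 and Q.
ring_results = [rng.normal(size=6) for _ in range(3)]
standalone = [rng.normal(size=6) for _ in range(3)]  # Q1, Q2, Q

# Node Q merges everything with one further gradient fusion step.
final = np.sum(ring_results + standalone, axis=0)

# Equivalent to fusing the inputs one after another.
seq = np.zeros(6)
for g in ring_results + standalone:
    seq += g
assert np.allclose(final, seq)
```

Because the rings run in parallel, node Q only pays for this single extra fusion step, which is the efficiency argument made below.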
In practical application, the plurality of ring communication systems 100 in the target system 200 may execute the process of generating the gradient fusion result in parallel, so that the time consumed by the node Q to obtain the gradient fusion result generated by each of the plurality of ring communication systems 100 may be reduced, and the efficiency of obtaining the final gradient fusion result by the target system 200 may be further improved.
In addition, an embodiment of the present application further provides a data processing apparatus. Fig. 14 is a schematic diagram of a data processing apparatus that is applicable to any node in a ring communication system. The data processing apparatus 1400 in the embodiment of the present application can implement the steps of the gradient data fusion method executed in the embodiments corresponding to fig. 2 and fig. 13. The functions of the data processing apparatus 1400 may be implemented by hardware, or by hardware executing corresponding software; the hardware or software includes one or more modules corresponding to the above functions, which may be software and/or hardware. The data processing apparatus 1400 may include a transceiver module 1401 and a processing module 1402; for the implementation of their functions, reference may be made to the operations executed in the embodiments corresponding to fig. 2 and fig. 13, which are not described again here. For example, the processing module 1402 may be configured to obtain first data, divide the gradient data into fragments, and the like, and may be further configured to determine a first gradient fusion result of the first node, and the like.
In a possible implementation, the processing module 1402 is further configured to obtain second data, where the second data is generated by secret sharing of gradient data in the first segment or the first node;
the processing module 1402 is specifically configured to perform gradient fusion on the first gradient fusion data and the second data to generate the first gradient fusion result.
In a possible implementation manner, the first data is the first slice, and the first gradient fusion result is gradient fusion data sent by the third node.
In a possible implementation manner, the transceiver module 1401 is further configured to receive, when the first data is the first segment or data obtained by performing preset processing on the first segment, second gradient fusion data from the third node, where the second gradient fusion data is obtained by performing gradient fusion on partial gradient data in the third node and partial gradient data in one or more second nodes;
the processing module 1402 is further configured to perform gradient fusion on the second gradient fusion data and a second segment of the multiple segments to obtain third gradient fusion data;
The transceiver module 1401 is further configured to send the third gradient fusion data to the target node.
In a possible implementation manner, the transceiver module 1401 is further configured to send, to the target node, a first gradient fusion result of the first node after the processing module determines the first gradient fusion result.
In a possible implementation manner, the transceiver module 1401 is further configured to receive a second gradient fusion result from the third node when the first data is the first segment or data obtained by performing preset processing on the first segment, where the second gradient fusion result is obtained by performing gradient fusion on a third segment in the multiple segments, partial gradient data in all second nodes, and partial gradient data in the third nodes.
It should be noted that, for the contents of information interaction, execution processes, and the like between the modules and units of the apparatus, since the method embodiments in the embodiments of the present application are based on the same concept, the technical effects brought by the contents are the same as those of the method embodiments in the embodiments of the present application, and specific contents may refer to the descriptions in the foregoing method embodiments in the embodiments of the present application, and are not repeated herein.
The data processing apparatus in the embodiment of the present application is described above from the perspective of the modular functional entity, and the data processing apparatus in the embodiment of the present application is described below from the perspective of hardware processing.
It should be noted that in the embodiments (including the embodiment shown in fig. 14) of the present application, all the entity devices corresponding to the transceiver module may be transceivers (e.g., may be implemented by a transmitter and a receiver), and all the entity devices corresponding to the processing module may be processors. The data processing apparatuses 1400 shown in fig. 14 may each have a structure as shown in fig. 15. In the data processing apparatus 1500 shown in fig. 15, the processor 1501, the transmitter 1502 and the receiver 1503 implement the same or similar functions of the processing module and the transceiver module provided in the foregoing apparatus embodiment corresponding to the data processing apparatus, and the memory 1504 in fig. 15 stores a computer program that needs to be called when the processor executes the gradient data fusion method described above.
In addition, an embodiment of the present application further provides a computer-readable storage medium, where the computer-readable storage medium is used to store a computer program, and the computer program is used to execute the data processing method described in the foregoing method embodiment.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working processes of the system, the apparatus, and the module described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the embodiments of the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and in actual implementation, there may be other divisions, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or modules, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one position, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may be stored in a computer-readable storage medium.
In the above embodiments, all or part of the implementation may be realized by software, hardware, firmware, or any combination thereof. When implemented in software, it may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. The procedures or functions described in accordance with the embodiments of the present application are generated in whole or in part when the computer program is loaded and executed on a computer. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another computer readable storage medium, for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer-readable storage medium can be any available medium that a computer can store or a data storage device, such as a server, a data center, etc., that includes one or more available media. The usable medium may be a magnetic medium (e.g., floppy Disk, hard Disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
The technical solutions provided by the embodiments of the present application are introduced in detail, and the principles and implementations of the embodiments of the present application are explained by applying specific examples in the embodiments of the present application, and the descriptions of the embodiments are only used to help understanding the method and core ideas of the embodiments of the present application; meanwhile, for a person skilled in the art, according to the idea of the embodiment of the present application, there may be a change in the specific implementation and application scope, and in summary, the content of the present specification should not be construed as a limitation to the embodiment of the present application.

Claims (10)

1. A gradient data fusion method is applied to a ring-shaped communication system, wherein the ring-shaped communication system comprises a first node, a third node and at least one second node, and the at least one second node comprises a target node;
the method comprises the following steps:
the first node acquires first data, wherein the first data is a first fragment, the first fragment is one of a plurality of fragments corresponding to gradient data in the first node and not participating in gradient fusion, or the first data is data obtained by preprocessing the first fragment or the gradient data;
The first node sends the first data to the target node, and the first data is used for gradient fusion with gradient data in the target node;
the first node receives first gradient fusion data from the third node, wherein the first gradient fusion data is obtained by performing gradient fusion on the first data, the gradient data in the at least one second node and the gradient data in the third node;
and the first node determines a first gradient fusion result of the first node according to the first gradient fusion data.
2. The method of claim 1, further comprising:
obtaining second data, wherein the second data is generated by secret sharing of gradient data in the first segment or the first node;
the first node determines a first gradient fusion result of the first node according to the first gradient fusion data, and the method comprises the following steps:
and the first node performs gradient fusion on the first gradient fusion data and the second data to generate a first gradient fusion result.
3. The method of claim 1, wherein the first data is the first fragment, and wherein the first gradient fusion result is gradient fusion data sent by the third node.
4. The method according to claim 1, wherein when the first data is the first fragment or data obtained by preprocessing the first fragment, the method further comprises:
the first node receives second gradient fusion data from the third node, and the second gradient fusion data is obtained by performing gradient fusion on partial gradient data in the third node and partial gradient data in one or more second nodes;
the first node performs gradient fusion on the second gradient fusion data and a second fragment of the plurality of fragments to obtain third gradient fusion data;
the first node sends the third gradient fusion data to the target node.
5. The method of claim 1, wherein after determining the first gradient fusion result for the first node, the method further comprises:
and the first node sends the first gradient fusion result to the target node.
6. The method of claim 1, wherein when the first data is the first fragment or data obtained by preprocessing the first fragment, the method further comprises:
And the first node receives a second gradient fusion result from the third node, and the second gradient fusion result is obtained by performing gradient fusion according to a third fragment of the fragments, partial gradient data in all the second nodes and partial gradient data in the third node.
7. The method according to any one of claims 1 to 6, wherein when the first data is the first fragment or data obtained by preprocessing the first fragment, the method further comprises:
and the first node divides the gradient data in the first node to obtain the plurality of fragments.
8. A data processing apparatus, wherein the data processing apparatus is applied to a first node in a ring communication system, the ring communication system further includes a third node and at least one second node, and the at least one second node includes a target node; the data processing apparatus includes:
a processing module, configured to obtain first data, where the first data is a first fragment, and the first fragment is one of multiple fragments corresponding to gradient data in the first node that does not participate in gradient fusion, or the first data is data obtained by preprocessing the first fragment or the gradient data;
A transceiver module, configured to send the first data to the target node, where the first data is used for performing gradient fusion with gradient data in the target node; receiving first gradient fusion data from the third node, wherein the first gradient fusion data is obtained by performing gradient fusion on the first data, the gradient data in the at least one second node and the gradient data in the third node;
the processing module is further configured to determine a first gradient fusion result of the first node according to the first gradient fusion data.
9. A data processing apparatus, characterized in that the apparatus comprises:
at least one processor, memory, and transceiver;
wherein the memory is adapted to store a computer program and the processor is adapted to invoke the computer program stored in the memory to perform the method performed by the first node according to any of claims 1-7.
10. A computer-readable storage medium, characterized in that it comprises instructions which, when run on a computer, cause the computer to perform the method performed by the first node according to any of claims 1-7.
CN202210482492.7A 2022-05-05 2022-05-05 Gradient data fusion method, device and storage medium Active CN114764601B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210482492.7A CN114764601B (en) 2022-05-05 2022-05-05 Gradient data fusion method, device and storage medium


Publications (2)

Publication Number Publication Date
CN114764601A true CN114764601A (en) 2022-07-19
CN114764601B CN114764601B (en) 2024-01-30

Family

ID=82365417


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110472731A (en) * 2019-08-16 2019-11-19 北京金山数字娱乐科技有限公司 Gradient synchronous method and device during a kind of distribution is trained
CN110995737A (en) * 2019-12-13 2020-04-10 支付宝(杭州)信息技术有限公司 Gradient fusion method and device for federal learning and electronic equipment
CN111860829A (en) * 2020-06-19 2020-10-30 光之树(北京)科技有限公司 Method and device for training federal learning model
CN112037800A (en) * 2020-09-22 2020-12-04 平安科技(深圳)有限公司 Voiceprint nuclear model training method and device, medium and electronic equipment
CN112733967A (en) * 2021-03-30 2021-04-30 腾讯科技(深圳)有限公司 Model training method, device, equipment and storage medium for federal learning
WO2021189906A1 (en) * 2020-10-20 2021-09-30 平安科技(深圳)有限公司 Target detection method and apparatus based on federated learning, and device and storage medium
WO2022033024A1 (en) * 2020-08-12 2022-02-17 中国银联股份有限公司 Distributed training method and apparatus of deep learning model


Non-Patent Citations (4)

Title
HUAFEI ZHU et al.: "Privacy-Preserving Weighted Federated Learning Within the Secret Sharing Framework", IEEE Access, vol. 8, pp. 198275-198284, XP011819096, DOI: 10.1109/ACCESS.2020.3034602
ZHANG Sijie: "Decentralized Federated Learning Based on Homomorphic Encryption and Ring-Allreduce", Computer Knowledge and Technology, vol. 17, no. 34, pp. 25-27
LI Yinqiao et al.: "Analysis of Multi-GPU Training of Neural Language Models Based on Data Parallelism", Journal of Chinese Information Processing, pp. 42-48
WANG Hanpin et al.: "A Local Multi-Node Federated Learning Algorithm Based on Secret Sharing", Journal of Guangzhou University (Natural Science Edition), vol. 21, no. 3, pp. 1-13



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant