CN117155928A - Communication task processing method, system, equipment, cluster and readable storage medium - Google Patents

Communication task processing method, system, equipment, cluster and readable storage medium

Info

Publication number
CN117155928A
Authority
CN
China
Prior art keywords
receiving
communication
node
sending
communication task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311423755.8A
Other languages
Chinese (zh)
Other versions
CN117155928B (en)
Inventor
高开
郭振华
王丽
曹芳
唐轶男
赵雅倩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd
Priority to CN202311423755.8A
Publication of CN117155928A
Application granted
Publication of CN117155928B
Legal status: Active
Anticipated expiration


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0894 Policy-based network configuration management
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes

Abstract

The invention discloses a communication task processing method, system, equipment, cluster and readable storage medium, which relate to the field of distributed clusters and aim to solve the problem that existing communication strategies waste intra-node bandwidth. When one or more communication tasks exist, each communication task is allocated to a sending device in a one-to-one correspondence, the task data corresponding to each communication task is divided into a plurality of partition data, and the sending device corresponding to the communication task is controlled to send the plurality of partition data to a receiving node in sequence. Each receiving node is controlled to synchronize the partition data it receives across the devices in its own node and to send the partition data to other receiving nodes that have not received it. The invention can make full use of intra-node bandwidth and improve the resource utilization of the distributed cluster.

Description

Communication task processing method, system, equipment, cluster and readable storage medium
Technical Field
The present invention relates to the field of distributed clusters, and in particular, to a method, a system, a device, a cluster, and a readable storage medium for processing a communication task.
Background
At present, more and more large models are appearing in the field of deep learning, and such models generally use model parallelism for distributed training and inference on a distributed cluster. The performance of a model parallel strategy hinges on the communication required between the parallel devices: different computing parts of the model are deployed on different device groups, and when communication tasks exist among multiple groups of devices, the existing communication strategy is for a sending device in the distributed cluster to send the task data of the communication task to each receiving device one by one. In this case the total communication delay is proportional to the total number of devices, and if some receiving devices are located in the same node, this strategy wastes intra-node bandwidth.
Therefore, how to provide a solution to the above technical problem is a problem that a person skilled in the art needs to solve at present.
Disclosure of Invention
The invention aims to provide a communication task processing method, a system, equipment, a cluster and a readable storage medium, which can fully utilize the bandwidth in a node and improve the resource utilization rate of a distributed cluster.
In order to solve the technical problems, the invention provides a communication task processing method, which comprises the following steps:
determining at least one sending node and at least one receiving node in a distributed cluster according to a deployment strategy of a preset model on each device of the distributed cluster, wherein the sending node comprises at least one device, and the receiving node comprises at least one device;
when one or more communication tasks exist, distributing each communication task to each sending device in a one-to-one correspondence manner, dividing task data corresponding to each communication task to obtain a plurality of partition data, and controlling the sending device corresponding to the communication task to sequentially send the plurality of partition data to one receiving node, wherein the sending device is the device in the sending node;
and for each receiving node, controlling the receiving node to synchronize the data of the partition data received by the receiving node in each device in the self node, and sending the partition data to other receiving nodes which do not receive the partition data.
In an exemplary embodiment, the communication task processing method further includes:
determining a first receiving device and at least one second receiving device among all the devices of each of the receiving nodes;
the process of controlling the receiving node to synchronize the data of the partition data received by the receiving node in each device in the self node comprises the following steps:
and controlling the first receiving equipment of the receiving node to receive the partition data, and respectively sending the partition data to the second receiving equipment so as to enable the partition data to be subjected to data synchronization in the equipment in the self node.
In an exemplary embodiment, the communication task processing method further includes:
determining a first receiving device and at least one second receiving device among all the devices of each of the receiving nodes;
the process of controlling the receiving node to synchronize the data of the partition data received by the receiving node in each device in the self node comprises the following steps:
the first receiving equipment controlling the receiving node receives the partition data, divides the partition data into a plurality of sub-data, and sends each sub-data to each second receiving equipment in a one-to-one correspondence manner;
and for each second receiving device, controlling the second receiving device to send the sub-data acquired by the second receiving device to other second receiving devices so as to enable the partition data to be subjected to data synchronization in each device in the node of the second receiving device.
In an exemplary embodiment, the process of allocating each communication task to each transmitting device in a one-to-one correspondence includes:
dividing a plurality of communication tasks into a plurality of task groups when a plurality of communication tasks exist;
and allocating the task groups to the transmission nodes in a one-to-one correspondence manner, so that in each transmission node, the communication tasks in the task groups are allocated to the transmission devices in the transmission node in a one-to-one correspondence manner.
In an exemplary embodiment, the process of dividing the plurality of communication tasks into a plurality of task groups includes:
determining a number of nodes of the sending nodes in the distributed cluster;
and dividing a plurality of communication tasks into task groups of the number of nodes.
In an exemplary embodiment, the process of dividing the plurality of communication tasks into task groups of the number of nodes includes:
determining the traffic corresponding to each communication task;
and dividing a plurality of communication tasks into task groups with the number of nodes according to the traffic corresponding to each communication task.
In an exemplary embodiment, the difference in total traffic of any two of the task groups is less than a preset value.
In an exemplary embodiment, the device is a graphics processor device or a tensor processing unit device.
In an exemplary embodiment, after one or more communication tasks exist and each communication task is allocated to each sending device in a one-to-one correspondence manner, before controlling the sending device corresponding to the communication task to send a plurality of partition data to one receiving node in sequence, the communication task processing method further includes:
when a plurality of communication tasks exist, determining the scheduling sequence of each communication task;
the process of controlling the sending device corresponding to the communication task to send the plurality of partition data to one receiving node in turn comprises the following steps:
and according to the scheduling sequence, controlling the sending equipment corresponding to each communication task to sequentially send the plurality of partition data to one receiving node.
In an exemplary embodiment, the determining the scheduling order of each of the communication tasks includes:
determining the scheduling sequence of each communication task according to preset scheduling conditions; the preset scheduling conditions include serial scheduling of the communication tasks allocated to the same transmitting node, serial scheduling and/or parallel scheduling of the communication tasks allocated to different transmitting nodes.
In an exemplary embodiment, the determining the scheduling order of each of the communication tasks includes:
determining all scheduling sequences of the communication tasks;
calculating the total completion time of each communication task under each scheduling sequence;
determining the scheduling sequence corresponding to the minimum value of the total completion time as an optimal scheduling sequence;
according to the scheduling sequence, the process of controlling the sending device corresponding to each communication task to send the plurality of partition data to one receiving node in turn comprises the following steps:
and controlling the sending equipment corresponding to each communication task to sequentially send the plurality of partition data to one receiving node according to the optimal scheduling sequence.
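The enumeration described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the text does not define how the total completion time is computed, so the sketch assumes a single sending node whose tasks are sent one after another and takes the total completion time to be the sum of the tasks' finish times. The names `total_completion_time`, `best_order` and the sample durations are hypothetical.

```python
from itertools import permutations


def total_completion_time(order, duration):
    """Sum of finish times when the tasks in `order` are sent one after another.

    Assumed cost model: serial sending on one node (not specified in the text).
    """
    elapsed, total = 0.0, 0.0
    for task in order:
        elapsed += duration[task]   # this task finishes at time `elapsed`
        total += elapsed
    return total


def best_order(duration):
    """Enumerate every scheduling order and keep the one with the minimal total."""
    return min(permutations(duration),
               key=lambda order: total_completion_time(order, duration))


# Hypothetical transfer durations for three tasks on one sending node.
duration = {"task1": 4.0, "task2": 1.0, "task3": 2.0}
optimal = best_order(duration)
print(optimal, total_completion_time(optimal, duration))
```

Exhaustive enumeration grows factorially with the number of tasks, so a sketch like this is only practical for small task counts.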
In an exemplary embodiment, the determining the scheduling order of each of the communication tasks includes:
determining traffic for each of said communication tasks in each of said sending nodes;
and determining the scheduling sequence of each communication task in each sending node according to the traffic.
In an exemplary embodiment, determining the scheduling order of the communication tasks in each of the transmitting nodes according to the traffic amount includes:
determining the scheduling sequence of each communication task in each sending node in descending order of traffic or in ascending order of traffic.
In an exemplary embodiment, after determining at least one transmitting node and at least one receiving node in the distributed cluster according to the deployment policy of the preset model on each device of the distributed cluster, the communication task processing method further includes:
allocating independent numbers to each sending node according to the deployment strategy;
the process of determining the scheduling sequence of each communication task in each sending node according to the traffic volume comprises the following steps:
and determining the scheduling sequence of each communication task in each sending node according to the numbers and the traffic volumes.
In an exemplary embodiment, the determining the scheduling order of the communication tasks in each of the transmitting nodes according to the numbers and the traffic sizes includes:
determining the scheduling sequence of each communication task in the sending node with the odd number according to the sequence of the traffic from the large to the small;
and/or,
and determining the scheduling sequence of each communication task in the even numbered sending nodes according to the sequence from the small traffic to the large traffic.
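The parity rule above can be sketched in a few lines. The function name `order_tasks_for_node`, the task names and the traffic volumes are illustrative assumptions; only the rule itself (odd-numbered sending nodes order by traffic from large to small, even-numbered ones from small to large) comes from the text.

```python
def order_tasks_for_node(node_number: int, traffic: dict) -> list:
    """Order one sending node's tasks by traffic; direction depends on node parity."""
    descending = node_number % 2 == 1  # odd-numbered node: large to small
    return sorted(traffic, key=traffic.get, reverse=descending)


# Hypothetical assignment: node number -> {task name: traffic volume}.
assignment = {
    1: {"task1": 8, "task2": 3},
    2: {"task3": 5, "task4": 6},
}
schedule = {n: order_tasks_for_node(n, tasks) for n, tasks in assignment.items()}
print(schedule)
```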
In order to solve the technical problem, the present invention further provides a communication task processing system, including:
a first determining module, configured to determine at least one transmitting node and at least one receiving node in a distributed cluster according to a deployment policy of a preset model on each device of the distributed cluster, where the transmitting node includes at least one device, and the receiving node includes at least one device;
the dividing module is used for distributing each communication task to each sending device in a one-to-one correspondence manner when one or more communication tasks exist, dividing task data corresponding to each communication task to obtain a plurality of partition data, and controlling the sending device corresponding to the communication task to sequentially send the plurality of partition data to one receiving node, wherein the sending device is the device in the sending node;
and the synchronization module is used for controlling the receiving nodes to synchronize the data of the partition data received by the receiving nodes in the equipment in the receiving nodes and sending the partition data to other receiving nodes which do not receive the partition data.
In order to solve the technical problem, the present invention further provides an electronic device, including:
a memory for storing a computer program;
a processor for implementing the steps of the communication task processing method as claimed in any one of the preceding claims when executing the computer program.
In order to solve the above technical problem, the present invention further provides a distributed cluster, including:
an electronic device as described above;
at least one sending node, wherein the sending node comprises at least one sending device, and the sending device is used for sending a plurality of partition data obtained by dividing task data of a communication task;
the at least one receiving node comprises a first receiving device and at least one second receiving device, wherein the first receiving device is used for transmitting the partition data to the second receiving device in the own node and transmitting the partition data to the first receiving devices of other receiving nodes which do not receive the partition data each time the partition data is received.
In an exemplary embodiment, the first receiving device is configured to, each time the partition data is received, send the partition data to the second receiving device in its own node, and then send the partition data to the first receiving device of the other receiving nodes that do not receive the partition data.
To solve the above technical problem, the present invention further provides a computer readable storage medium on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the communication task processing method described in any one of the above.
The invention provides a communication task processing method, which is characterized in that a sending node and a receiving node in a distributed cluster are determined according to a deployment strategy of a preset model, task data of a communication task sent by sending equipment in the sending node are firstly divided, each divided partition data is sequentially sent to the receiving node, each receiving equipment in the receiving node synchronizes the partition data, bandwidths in the receiving node are fully utilized, then the partition data are sent to other receiving nodes for receiving, and communication efficiency is improved by reducing communication traffic among the nodes, so that resource utilization rate of the distributed cluster is improved. The invention also provides a distributed cluster, a communication system, electronic equipment and a computer readable storage medium thereof, and the distributed cluster has the same beneficial effects as the communication task processing method.
Drawings
For a clearer description of embodiments of the present invention, the drawings that are required to be used in the embodiments will be briefly described, it being apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to the drawings without inventive effort for those skilled in the art.
FIG. 1 is a flow chart of steps of a communication task processing method provided by the invention;
FIG. 2 is a schematic diagram of a communication strategy, taking one sending device as an example;
FIG. 3 is a schematic diagram of another communication strategy, taking one sending device as an example;
FIG. 4 is a schematic diagram of another communication strategy, taking one sending device as an example;
FIG. 5 is a schematic diagram of a communication strategy provided by the present invention, taking one sending device as an example;
FIG. 6 is a schematic diagram of a communication strategy for multiple communication tasks provided by the present invention;
FIG. 7 is a schematic diagram of another communication strategy for multiple communication tasks provided by the present invention;
FIG. 8 is a schematic diagram of a communication task processing system according to the present invention;
FIG. 9 is a schematic structural diagram of an electronic device according to the present invention;
FIG. 10 is a schematic diagram of a distributed cluster according to the present invention;
FIG. 11 is a schematic structural diagram of a computer readable storage medium according to the present invention.
Detailed Description
The core of the invention is to provide a communication task processing method, a system, equipment, a cluster and a readable storage medium, which can fully utilize the bandwidth in the node and improve the resource utilization rate of the distributed cluster.
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, fig. 1 is a flowchart illustrating a communication task processing method according to the present invention, where the communication task processing method includes:
S101: determining at least one sending node and at least one receiving node in the distributed cluster according to deployment strategies of a preset model on each device of the distributed cluster, wherein the sending node comprises at least one device, and the receiving node comprises at least one device;
In this embodiment, the model parallel strategy divides the preset model into different modules, and each module is deployed on devices of different nodes of the distributed cluster. Each module may correspond to a plurality of nodes, and each node includes a plurality of devices. Communication between nodes must pass through a central processing unit and therefore has low bandwidth, whereas communication within a node goes over PCIe (Peripheral Component Interconnect Express, a high-speed serial computer expansion bus standard) and has high bandwidth.
First, according to the deployment strategy of the preset model on each device of the distributed cluster, the sending devices and the receiving devices among the devices of the distributed cluster can be determined. In this embodiment, DSi denotes the i-th data-slice communication task between two modules: the set of sending devices in one module needs to send DSi to the set of receiving devices in another module. Assume that sending a fixed-size object from one device in one module to a device in another module takes time t, and that communication between devices within a node of a module takes time u. For every communication strategy, the time to send a fixed-size object from one device to a module containing A×B devices is denoted T(A; B), where A is the number of nodes in the module and B is the number of devices in each node.
For ease of understanding, refer to FIG. 2, a schematic diagram of a communication strategy taking one sending device as an example. The sending device sends DSi to each receiving device separately. In this case the total communication delay T(A; B) = ABt is proportional to the total number of devices, and if some receiving devices are located in the same receiving node, this strategy wastes intra-node bandwidth.
Referring to FIG. 3, a schematic diagram of another communication strategy taking one sending device as an example, the sending device sends DSi in slices and local aggregation is performed within the receiving nodes: each receiving node receives only one complete copy of DSi, and the slice data is aggregated over the high-speed intra-node bandwidth. To this end, DSi is divided into B parts, and each part is sent to one of the receiving nodes. In this case the communication delay is T(A; B) = At + Au; this strategy improves communication efficiency by reducing inter-node traffic. The vertical-bar portions in FIG. 3 illustrate the slice data received by the receiving devices in receiving node 1 and receiving node 2.
Referring to FIG. 4, a schematic diagram of another communication strategy taking one sending device as an example, the data is sent in smaller slices and global aggregation is then performed in the receiving nodes. DSi is cut into finer-grained blocks: DSi is divided into A×B slices and only one slice is sent to each device in the receiving nodes, after which the devices in the receiving nodes perform global aggregation so that every slice is collected on all devices. The delay in the receiving node is approximately t, and the intra-node delay can be hidden and ignored. The overall delay T(A; B) = 2t does not grow with the number of nodes or devices. The vertical-bar portions in FIG. 4 illustrate the slice data received by the receiving devices in receiving node 1 and receiving node 2.
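The three delay expressions given for FIG. 2, FIG. 3 and FIG. 4 can be compared numerically. The following is a minimal sketch that simply evaluates the formulas as stated in the text; the function names and the sample values of A, B, t and u are illustrative assumptions.

```python
def delay_one_by_one(A, B, t):
    """FIG. 2 strategy: a full copy is sent to each of the A*B receiving devices."""
    return A * B * t


def delay_local_aggregation(A, t, u):
    """FIG. 3 strategy: one copy per receiving node plus intra-node aggregation."""
    return A * t + A * u


def delay_global_aggregation(t):
    """FIG. 4 strategy: one slice per device, then aggregation; stated as 2t."""
    return 2 * t


# Hypothetical sample values: 4 nodes, 8 devices per node, t = 1.0, u = 0.1.
A, B, t, u = 4, 8, 1.0, 0.1
baseline_delays = {
    "one_by_one": delay_one_by_one(A, B, t),
    "local_aggregation": delay_local_aggregation(A, t, u),
    "global_aggregation": delay_global_aggregation(t),
}
print(baseline_delays)
```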
S102: when one or more communication tasks exist, distributing each communication task to each sending device in a one-to-one correspondence manner, dividing task data corresponding to each communication task to obtain a plurality of partition data, and controlling the sending device corresponding to the communication task to sequentially send the plurality of partition data to one receiving node, wherein the sending device is a device in the sending node;
S103: for each receiving node, the receiving node is controlled to synchronize the received partition data in each device in the own node, and the partition data is sent to other receiving nodes which do not receive the partition data.
On this basis, this embodiment transmits data using a two-stage communication strategy. Referring to FIG. 5, FIG. 5 shows one sending node and at least two receiving nodes. The sending node includes a sending device; the receiving nodes include, but are not limited to, receiving node 1 and receiving node 2, each of which includes a plurality of receiving devices. Once any receiving device has begun to receive data, it can act as a further sender and transmit data to the remaining receiving devices, and transmission within a receiving node can proceed at the same time as transmission between receiving nodes. In this embodiment, the task data DSi to be output on the sending device is divided into K partitions; in FIG. 5, partition data is shown as black areas. First, the sending device sends one partition to the first receiving device of receiving node 1. After the first receiving device of receiving node 1 has received the complete partition, it sends the partition to a second receiving device in the same node and also to the first receiving device of receiving node 2. While the first receiving device of receiving node 1 is sending the partition to the first receiving device of receiving node 2, the second receiving device in receiving node 1 passes the data on to the other devices in the same node, and the sending device in the sending node sends the second partition to the first receiving device of receiving node 1, and so on. The K partitions are thus communicated between nodes in a pipelined manner, while the communication between the second receiving devices and the other receiving devices inside different receiving nodes proceeds in parallel.
In this case, the time to transmit each partition is t/K and the total delay is T(A; B) = (t/K) × (K + A) = t + At/K; when K is chosen to be relatively large, the total delay is approximately t.
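To see how the number of partitions K affects the pipelined delay, the expression T(A; B) = (t/K) × (K + A) = t + At/K can be evaluated for growing K. This is a minimal sketch that only evaluates the formula from the text; the function name `pipelined_delay` and the sample values are assumptions.

```python
def pipelined_delay(t, A, K):
    """Evaluate T(A; B) = (t / K) * (K + A) = t + A * t / K from the text."""
    if K < 1 or A < 1:
        raise ValueError("K and A must be positive integers")
    return (t / K) * (K + A)


# Hypothetical values: t = 1.0 time unit, A = 4 receiving nodes.
t, A = 1.0, 4
pipelined = {K: pipelined_delay(t, A, K) for K in (1, 4, 16, 64, 256)}
for K, delay in pipelined.items():
    print(f"K={K:4d}  delay={delay:.4f}")
```

As K increases, the A·t/K term shrinks and the delay moves toward t, which matches the statement that the total delay is approximately t for a relatively large K.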
In this embodiment, according to the deployment policy of the preset model, the sending node and the receiving node in the distributed cluster are determined, the task data of the communication task sent by the sending device in the sending node is firstly divided, each divided partition data is sequentially sent to the receiving node, each receiving device in the receiving node performs synchronization of the partition data, the bandwidth in the receiving node is fully utilized, then the partition data is sent to other receiving nodes for receiving, and the communication efficiency is improved by reducing the communication traffic among the nodes, so that the resource utilization rate of the distributed cluster is improved.
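The data flow of the two-stage strategy can be illustrated with a small functional simulation. It is a sketch under simplifying assumptions: it models only which device ends up holding which data, runs the transfers sequentially rather than in parallel, and treats device index 0 of each node as the first receiving device. The names `split_into_partitions` and `relay_partitions` are hypothetical.

```python
def split_into_partitions(data: bytes, K: int) -> list:
    """Divide task data into K contiguous partitions of near-equal size."""
    size, extra = divmod(len(data), K)
    parts, start = [], 0
    for i in range(K):
        end = start + size + (1 if i < extra else 0)
        parts.append(data[start:end])
        start = end
    return parts


def relay_partitions(data: bytes, num_nodes: int, devices_per_node: int, K: int):
    """Return buffers[node][device] after every partition has been relayed."""
    buffers = [[bytearray() for _ in range(devices_per_node)]
               for _ in range(num_nodes)]
    for part in split_into_partitions(data, K):
        for node in range(num_nodes):
            # The first receiving device (index 0) receives the partition,
            # either from the sending device or from the previous node.
            buffers[node][0] += part
            # It synchronizes the partition to the second receiving devices.
            for dev in range(1, devices_per_node):
                buffers[node][dev] += part
            # Moving to the next loop iteration models forwarding the
            # partition to a receiving node that has not received it yet.
    return buffers


payload = bytes(range(100))
result = relay_partitions(payload, num_nodes=3, devices_per_node=4, K=7)
all_synced = all(bytes(buf) == payload for node in result for buf in node)
print(all_synced)
```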
Based on the above embodiments:
in an exemplary embodiment, the communication task processing method further includes:
determining a first receiving device and at least one second receiving device among all devices of each receiving node;
the process of controlling the receiving node to synchronize the data of the partition data received by the receiving node in each device in the self node comprises the following steps:
the first receiving device of the control receiving node receives the partition data and sends the partition data to each second receiving device respectively so that the partition data are subjected to data synchronization in each device in the own node.
In this embodiment, a first receiving device and second receiving devices are determined in each receiving node: the first receiving device is the device through which the receiving node communicates with a sending node or with other receiving nodes, and a second receiving device is any device in the receiving node other than the first receiving device. Taking one receiving node as an example, after the first receiving device in the receiving node receives the partition data, it sends the partition data to any one of the second receiving devices; after that second receiving device receives the partition data, it may send the partition data to each of the other second receiving devices.
In an exemplary embodiment, the communication task processing method further includes:
determining a first receiving device and at least one second receiving device among all devices of each receiving node;
the process of controlling the receiving node to synchronize the data of the partition data received by the receiving node in each device in the self node comprises the following steps:
the first receiving equipment controlling the receiving node receives the partition data, divides the partition data into a plurality of sub-data, and sends each sub-data to each second receiving equipment in a one-to-one correspondence manner;
and for each second receiving device, controlling the second receiving device to send the sub-data acquired by the second receiving device to other second receiving devices so as to synchronize the data of the partition data in each device in the node of the second receiving device.
In this embodiment, a first receiving device and second receiving devices are determined in each receiving node: the first receiving device is the device through which the receiving node communicates with a sending node or with other receiving nodes, and a second receiving device is any device in the receiving node other than the first receiving device. Taking one receiving node as an example, after the first receiving device in the receiving node receives the partition data, it sends the partition data to any one of the second receiving devices; after that second receiving device receives the partition data, it may divide the received partition data into a plurality of sub-data and send the sub-data to the other second receiving devices respectively, and the other second receiving devices communicate with one another, so that data aggregation is achieved within the receiving node and each device obtains the complete partition data.
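The divide-then-exchange synchronization can be sketched as follows. The sketch follows the two steps as listed in the method (the first receiving device divides the partition data into sub-data, one piece per second receiving device, and the second receiving devices then exchange their pieces); the function name `scatter_then_exchange` and the in-memory representation of devices are assumptions made for illustration.

```python
def scatter_then_exchange(partition: bytes, num_second: int):
    """Return the partition as reassembled on each second receiving device."""
    # Step 1: the first receiving device divides the partition into
    # num_second pieces of sub-data and sends piece i to second device i.
    size, extra = divmod(len(partition), num_second)
    pieces, start = [], 0
    for i in range(num_second):
        end = start + size + (1 if i < extra else 0)
        pieces.append(partition[start:end])
        start = end
    held = [{i: pieces[i]} for i in range(num_second)]

    # Step 2: every second device sends its own piece to the other second
    # devices, so each device ends up holding all pieces.
    for src in range(num_second):
        for dst in range(num_second):
            if dst != src:
                held[dst][src] = pieces[src]

    # Each device reassembles the pieces in index order.
    return [b"".join(h[i] for i in range(num_second)) for h in held]


partition = b"partition-data-0123456789"
reassembled = scatter_then_exchange(partition, num_second=3)
print(all(copy == partition for copy in reassembled))
```

Compared with sending the whole partition to every second receiving device, the first receiving device here sends each byte only once, and the remaining copying is spread across the second receiving devices.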
The communication between modules comprises a plurality of unit communication tasks. These tasks may have overlapping sending and receiving devices, so the communication performance of different tasks affects one another. With reference to FIG. 6 and FIG. 7, a load-balancing scheme for optimizing the communication tasks as a whole is described below; FIG. 6 and FIG. 7 take four tasks, namely task 1, task 2, task 3 and task 4, as an example.
In an exemplary embodiment, the process of allocating each communication task to each transmitting device in a one-to-one correspondence includes:
dividing a plurality of communication tasks into a plurality of task groups when a plurality of communication tasks exist;
the plurality of task groups are allocated to the plurality of sending nodes in a one-to-one correspondence manner, so that in each sending node, each communication task in the task groups is allocated to each sending device in the sending node in a one-to-one correspondence manner.
In an exemplary embodiment, the process of dividing the plurality of communication tasks into a plurality of task groups includes:
determining the node number of the sending nodes in the distributed cluster;
dividing the plurality of communication tasks into task groups whose number equals the number of nodes.
In an exemplary embodiment, the process of dividing the plurality of communication tasks into task groups of the number of nodes includes:
determining the traffic corresponding to each communication task;
and dividing the plurality of communication tasks into task groups with the number of nodes according to the traffic corresponding to each communication task.
It can be understood that, assuming the distributed cluster includes P sending nodes, all communication tasks are divided into P task groups. The number of communication tasks in each task group may be the same or different; the traffic of each communication task is taken into account so that the total traffic of the task groups is comparable, that is, the difference between the total traffic of any two task groups is smaller than a preset value. The P task groups are then allocated to the P sending nodes in a one-to-one correspondence, and within each sending node a sending device is determined for each allocated communication task.
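One way to obtain task groups with comparable total traffic is a greedy heuristic that always places the next-largest task into the currently lightest group. The sketch below is an illustration only: the text does not name a specific grouping algorithm, the function name `partition_tasks` and the task names are hypothetical, and the heuristic does not by itself guarantee that the difference stays below a given preset value, so a caller would still need to check that condition.

```python
import heapq


def partition_tasks(traffic: dict[str, float], num_nodes: int) -> list[list[str]]:
    """Greedily split tasks into `num_nodes` groups with similar total traffic."""
    # Heap entries are (total traffic so far, group index); the lightest group is on top.
    heap = [(0.0, g) for g in range(num_nodes)]
    groups: list[list[str]] = [[] for _ in range(num_nodes)]
    # Visit tasks from largest to smallest traffic.
    for task in sorted(traffic, key=traffic.get, reverse=True):
        total, g = heapq.heappop(heap)
        groups[g].append(task)
        heapq.heappush(heap, (total + traffic[task], g))
    return groups


def max_gap(groups: list[list[str]], traffic: dict[str, float]) -> float:
    """Largest difference in total traffic between any two groups."""
    totals = [sum(traffic[t] for t in group) for group in groups]
    return max(totals) - min(totals)
```

For example, with four tasks carrying traffic 8, 7, 6 and 5 and two sending nodes, the sketch yields two groups of total traffic 13 each.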
In an exemplary embodiment, the device is a graphics processor device or a tensor processing unit device.
In an exemplary embodiment, when there are one or more communication tasks, after each communication task is allocated to each transmitting device in a one-to-one correspondence manner, before controlling the transmitting device to which the communication task corresponds to sequentially transmit the plurality of partition data to one receiving node, the communication task processing method further includes:
when a plurality of communication tasks exist, determining the scheduling sequence of each communication task;
the process of controlling the transmitting device corresponding to the communication task to sequentially transmit the plurality of partition data to one receiving node comprises the following steps:
and controlling the sending equipment corresponding to each communication task to sequentially send the plurality of partition data to one receiving node according to the scheduling sequence.
In an exemplary embodiment, the process of determining the scheduling order of the respective communication tasks includes:
determining the scheduling sequence of each communication task according to preset scheduling conditions; the preset scheduling conditions include serial scheduling of each communication task allocated to each transmitting device of the same transmitting node, serial scheduling and/or parallel scheduling of each communication task allocated to each transmitting device of different transmitting nodes.
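The preset scheduling conditions can be made concrete with a short sketch that derives start times from per-node task queues. The function name `start_times`, the input layout, and the assumption that tasks on different sending nodes do not slow one another down are all illustrative and not taken from the text.

```python
def start_times(node_queues: dict[int, list[tuple[str, float]]]) -> dict[str, float]:
    """Map each task to its start time.

    `node_queues` maps a sending-node number to its ordered list of
    (task name, communication time) pairs.
    """
    starts: dict[str, float] = {}
    for queue in node_queues.values():
        clock = 0.0  # every node starts at time 0, so nodes run in parallel
        for task, duration in queue:
            starts[task] = clock
            clock += duration  # the next task on this node waits for this one
    return starts
```

With node 1 holding two tasks of 4 and 2 time units and node 2 holding one task of 3 time units, the first task on each node starts at time 0 and the second task on node 1 starts at time 4.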
Assuming that each sending node holds the data of the corresponding tasks, the communication task is optimized as a whole so that the total communication time is shortest; that is, suitable sending devices and a suitable scheduling order are searched for in the search space so that the communication time of the whole task is shortest. Suppose there are K communication tasks. First, a sending node n is selected from the group of sending devices for the i-th task. After every communication task has been assigned a sending node, the scheduling order of the communication tasks is determined: communication tasks in the same sending node are scheduled one after another, while communication tasks in different sending nodes can be scheduled in parallel. Let the scheduling start time S_i of the i-th communication task be T_i. The overall problem can be modeled as selecting a suitable scheduling order S and sending devices n such that the communication time of the whole task is shortest.
The above optimization problem is solved with a strategy that orders communication tasks by traffic in forward and reverse order. It can be understood that, for fixed traffic and communication bandwidth, the scheduling time of a single communication task is T(a; B) = t/K × (K + a) = t + a·t/K. When a suitable K is selected, the scheduling time of a single communication task approaches t. With the bandwidth unchanged, the scheduling time t is proportional to the traffic: the larger the traffic, the longer the scheduling time.
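The formula can be evaluated numerically to see how the scheduling time approaches t as K grows. In this sketch the function name `scheduling_time` is hypothetical, and the reading of the symbols (t as the time to send the task data once, K as the number of partitions, a as a fixed number of additional partition-sized steps) is an assumption, since the symbols are defined elsewhere in the description.

```python
def scheduling_time(t: float, K: int, a: float) -> float:
    """Evaluate T = t/K * (K + a), which equals t + a*t/K."""
    if K <= 0:
        raise ValueError("K must be a positive integer")
    return t / K * (K + a)


# The overhead term a*t/K shrinks as K grows, so T approaches t.
for K in (1, 10, 100, 1000):
    print(K, scheduling_time(10.0, K, 2))
```

For t = 10 and a = 2, K = 5 gives a scheduling time of 14, while K = 1000 gives a value within 0.1 of t.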
In an exemplary embodiment, the process of determining the scheduling order of the respective communication tasks includes:
determining all scheduling sequences of all communication tasks;
calculating the total completion time of each communication task under each scheduling sequence;
determining a scheduling sequence corresponding to the minimum value of the total completion time as an optimal scheduling sequence;
according to the scheduling sequence, the process of controlling the sending equipment corresponding to each communication task to send the plurality of partition data to one receiving node in sequence comprises the following steps:
and controlling the sending equipment corresponding to each communication task to sequentially send the plurality of partition data to one receiving node according to the optimal scheduling sequence.
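The exhaustive variant (enumerate every scheduling sequence, evaluate each, keep the minimum) can be sketched as follows. The names `best_order` and `sum_of_finish_times` are hypothetical, and the cost function is a toy stand-in: it sums the finish times of tasks run one after another on a single node, whereas the text does not specify how the total completion time is computed.

```python
from itertools import permutations
from typing import Callable, Sequence


def best_order(tasks: Sequence[str],
               cost: Callable[[Sequence[str]], float]) -> tuple[str, ...]:
    """Try every scheduling sequence and return the one with the lowest cost."""
    return min(permutations(tasks), key=cost)


def sum_of_finish_times(durations: dict[str, float]) -> Callable[[Sequence[str]], float]:
    """Build a toy cost function: the sum of each task's finish time in a serial run."""
    def cost(order: Sequence[str]) -> float:
        clock = total = 0.0
        for task in order:
            clock += durations[task]  # this task finishes at `clock`
            total += clock
        return total
    return cost
```

The number of sequences grows factorially with the number of tasks, so this approach is only practical for a handful of tasks.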
In an exemplary embodiment, the process of determining the scheduling order of the respective communication tasks includes:
determining traffic for each communication task in each sending node;
and determining the scheduling sequence of each communication task in each sending node according to the traffic.
In an exemplary embodiment, the process of determining the scheduling order of the respective communication tasks in each transmitting node according to the size of the traffic includes:
the scheduling sequence of each communication task in each sending node is determined according to the sequence from the big to the small of the traffic or the sequence from the small to the big.
In an exemplary embodiment, after determining at least one transmitting node and at least one receiving node in the distributed cluster according to the deployment policy of the preset model on each device of the distributed cluster, the communication task processing method further includes:
allocating independent numbers for each sending node according to the deployment strategy;
the process of determining the scheduling sequence of each communication task in each sending node according to the traffic volume comprises the following steps:
and determining the scheduling sequence of each communication task in each sending node according to the number and the size of the traffic.
In an exemplary embodiment, determining the scheduling order of the communication tasks in each transmitting node according to the number and the size of the traffic comprises:
determining the scheduling sequence of each communication task in the odd-numbered sending nodes according to the sequence of the traffic from large to small;
and/or,
the scheduling order of each communication task in the even numbered transmitting nodes is determined in the order of the traffic from small to large.
This embodiment provides a strategy that divides the communication tasks evenly by traffic and schedules tasks in forward order on odd-numbered nodes and in reverse order on even-numbered nodes, thereby balancing the load of multiple communication tasks and greatly reducing the complexity of the search space. First, all communication tasks are sorted by traffic from large to small and divided, according to traffic, into as many parts as there are sending nodes, with each part carrying an equal share of the traffic. The resulting communication task groups are deployed on different sending nodes, and the scheduling order of the communication tasks within each sending node is determined by traffic: odd-numbered sending nodes schedule communication tasks in forward order of traffic (from large to small), and even-numbered sending nodes schedule them in reverse order of traffic (from small to large). It will be appreciated that bandwidth utilization is low during the start and end phases of a single scheduled task. When one sending node is scheduling a task with large traffic, it is more efficient for the other sending nodes to schedule tasks with small traffic frequently.
In summary, the communication optimization strategy provided by this embodiment can effectively improve communication efficiency in large-model training, maximize bandwidth utilization, and improve model training performance. The optimization method provided by this patent can be deployed on large-scale clusters while making efficient use of the communication bandwidth between clusters, thereby achieving optimal communication efficiency and improving server performance.
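The odd/even ordering rule can be sketched in a few lines. The function name `order_within_nodes` is hypothetical, and numbering the sending nodes from 1 in list order is an assumption; the text only states that each sending node receives an independent number.

```python
def order_within_nodes(groups: list[list[str]],
                       traffic: dict[str, float]) -> dict[int, list[str]]:
    """Order each node's task group: odd nodes large-to-small, even nodes small-to-large."""
    schedule: dict[int, list[str]] = {}
    for number, group in enumerate(groups, start=1):
        descending = number % 2 == 1  # odd-numbered node: largest traffic first
        schedule[number] = sorted(group, key=traffic.get, reverse=descending)
    return schedule
```

With the groups {task 1, task 4} on node 1 and {task 2, task 3} on node 2 and traffic 8, 5, 7 and 6 respectively, node 1 schedules task 1 then task 4, and node 2 schedules task 3 then task 2, so the largest task on node 1 overlaps with the smallest task on node 2.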
In a second aspect, referring to fig. 8, fig. 8 is a schematic structural diagram of a communication task processing system according to the present invention, where the communication task processing system includes:
a first determining module 11, configured to determine at least one transmitting node and at least one receiving node in the distributed cluster according to a deployment policy of a preset model on each device of the distributed cluster, where the transmitting node includes at least one device, and the receiving node includes at least one device;
the dividing module 12 is configured to, when one or more communication tasks exist, allocate each communication task to each sending device in a one-to-one correspondence manner, divide task data corresponding to the communication task for each communication task to obtain a plurality of partition data, and control the sending device corresponding to the communication task to sequentially send the plurality of partition data to one receiving node, where the sending device is a device in the sending node;
a synchronization module 13, configured to, for each receiving node, control the receiving node to synchronize the partition data it receives across the devices in its own node and to send the partition data to other receiving nodes that have not received the partition data.
In an exemplary embodiment, the communication task processing system further includes:
a second determining module, configured to determine a first receiving device and at least one second receiving device among all devices of each receiving node;
the process of controlling the receiving node to synchronize the data of the partition data received by the receiving node in each device in the self node comprises the following steps:
controlling the first receiving device of the receiving node to receive the partition data and send the partition data to each second receiving device, so that the partition data is synchronized across the devices in its own node.
In an exemplary embodiment, the communication task processing system further includes:
a third determining module, configured to determine a first receiving device and at least one second receiving device among all devices of each receiving node;
the process of controlling the receiving node to synchronize the data of the partition data received by the receiving node in each device in the self node comprises the following steps:
controlling the first receiving device of the receiving node to receive the partition data, divide the partition data into a plurality of sub-data, and send the sub-data to the second receiving devices in a one-to-one correspondence;
and, for each second receiving device, controlling the second receiving device to send the sub-data it has acquired to the other second receiving devices, so that the partition data is synchronized across the devices in its own node.
In an exemplary embodiment, the process of allocating each communication task to each transmitting device in a one-to-one correspondence includes:
dividing a plurality of communication tasks into a plurality of task groups when a plurality of communication tasks exist;
the plurality of task groups are allocated to the plurality of sending nodes in a one-to-one correspondence manner, so that in each sending node, each communication task in the task groups is allocated to each sending device in the sending node in a one-to-one correspondence manner.
In an exemplary embodiment, the process of dividing the plurality of communication tasks into a plurality of task groups includes:
determining the node number of the sending nodes in the distributed cluster;
the plurality of communication tasks are divided into task groups of a number of nodes.
In an exemplary embodiment, the process of dividing the plurality of communication tasks into task groups of the number of nodes includes:
Determining the traffic corresponding to each communication task;
and dividing the plurality of communication tasks into task groups with the number of nodes according to the traffic corresponding to each communication task.
In an exemplary embodiment, the difference in total traffic of any two task groups is less than a preset value.
In an exemplary embodiment, the device is a graphics processor device or a tensor processing unit device.
In an exemplary embodiment, the communication task processing system further includes:
a fourth determining module, configured to determine a scheduling order of each communication task when there are a plurality of communication tasks;
the process of controlling the transmitting device corresponding to the communication task to sequentially transmit the plurality of partition data to one receiving node comprises the following steps:
and controlling the sending equipment corresponding to each communication task to sequentially send the plurality of partition data to one receiving node according to the scheduling sequence.
In an exemplary embodiment, the process of determining the scheduling order of the respective communication tasks includes:
determining the scheduling sequence of each communication task according to preset scheduling conditions; the preset scheduling conditions include serial scheduling of each communication task allocated to each transmitting device of the same transmitting node, serial scheduling and/or parallel scheduling of each communication task allocated to each transmitting device of different transmitting nodes.
In an exemplary embodiment, the process of determining the scheduling order of the respective communication tasks includes:
determining all scheduling sequences of all communication tasks;
calculating the total completion time of each communication task under each scheduling sequence;
determining a scheduling sequence corresponding to the minimum value of the total completion time as an optimal scheduling sequence;
according to the scheduling sequence, the process of controlling the sending equipment corresponding to each communication task to send the plurality of partition data to one receiving node in sequence comprises the following steps:
and controlling the sending equipment corresponding to each communication task to sequentially send the plurality of partition data to one receiving node according to the optimal scheduling sequence.
In an exemplary embodiment, the process of determining the scheduling order of the respective communication tasks includes:
determining traffic for each communication task in each sending node;
and determining the scheduling sequence of each communication task in each sending node according to the traffic.
In an exemplary embodiment, the process of determining the scheduling order of the respective communication tasks in each transmitting node according to the size of the traffic includes:
the scheduling sequence of each communication task in each sending node is determined according to the sequence from the big to the small of the traffic or the sequence from the small to the big.
In an exemplary embodiment, the communication task processing system further includes:
the distribution module is used for distributing independent numbers to each sending node according to the deployment strategy;
the process of determining the scheduling sequence of each communication task in each sending node according to the traffic volume comprises the following steps:
and determining the scheduling sequence of each communication task in each sending node according to the number and the size of the traffic.
In an exemplary embodiment, determining the scheduling order of the communication tasks in each transmitting node according to the number and the size of the traffic comprises:
determining the scheduling sequence of each communication task in the odd-numbered sending nodes according to the sequence of the traffic from large to small;
and/or,
the scheduling order of each communication task in the even numbered transmitting nodes is determined in the order of the traffic from small to large.
In a third aspect, referring to fig. 9, fig. 9 is a schematic structural diagram of an electronic device according to the present invention, where the electronic device includes:
a memory 21 for storing a computer program;
a processor 22 for implementing the steps of the communication task processing method as described in any one of the embodiments above when executing a computer program.
Of course, the electronic device may further include:
An input interface 23, which is connected to the processor 22 via a communication bus, for obtaining externally imported computer programs, parameters and instructions, which are stored in the memory 21 under the control of the processor 22. The input interface 23 may be connected to an input device for receiving parameters or instructions manually entered by a user. The input device can be a touch layer covered on a display screen, or can be a key, a track ball or a touch pad arranged on a terminal shell.
And a display unit 24, connected to the processor 22 through a communication bus, for displaying data transmitted by the processor 22. The display unit 24 may be a liquid crystal display or an electronic ink display, etc.
A network port 25, connected to the processor 22 through a communication bus, for communication connection with external terminal devices. The communication connection may use a wired or a wireless communication technology, such as mobile high-definition link technology, universal serial bus, high-definition multimedia interface, wireless fidelity technology, Bluetooth communication technology, Bluetooth low energy communication technology, an IEEE 802.11s-based communication technology, and the like.
For an introduction of an electronic device provided by the present invention, refer to the above embodiment, and the disclosure is not repeated here.
The distributed cluster provided by the invention has the same beneficial effects as the communication task processing method.
In a fifth aspect, the present invention also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the communication task processing method as described in any one of the above embodiments.
The computer readable storage medium may include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code. Fig. 11 is a schematic structural diagram of a computer readable storage medium according to an embodiment of the present invention, where the computer readable storage medium may be a non-volatile or non-transitory memory chip, and specifically includes a decoding driver, a memory matrix, a read/write circuit, an address line, a data line, a chip select line, and a read/write control line.
For an introduction to a computer readable storage medium provided by the present invention, refer to the above embodiments, and the disclosure is not repeated here.
The computer readable storage medium provided by the invention has the same beneficial effects as the communication task processing method.
It should also be noted that in this specification, relational terms such as first and second are used solely to distinguish one entity or action from another and do not necessarily require or imply any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (20)

1. A communication task processing method, comprising:
determining at least one sending node and at least one receiving node in a distributed cluster according to a deployment strategy of a preset model on each device of the distributed cluster, wherein the sending node comprises at least one device, and the receiving node comprises at least one device;
when one or more communication tasks exist, distributing each communication task to each sending device in a one-to-one correspondence manner, dividing task data corresponding to each communication task to obtain a plurality of partition data, and controlling the sending device corresponding to the communication task to sequentially send the plurality of partition data to one receiving node, wherein the sending device is the device in the sending node;
and for each receiving node, controlling the receiving node to synchronize the data of the partition data received by the receiving node in each device in the self node, and sending the partition data to other receiving nodes which do not receive the partition data.
2. The communication task processing method according to claim 1, characterized in that the communication task processing method further comprises:
Determining a first receiving device and at least one second receiving device among all the devices of each of the receiving nodes;
the process of controlling the receiving node to synchronize the data of the partition data received by the receiving node in each device in the self node comprises the following steps:
and controlling the first receiving equipment of the receiving node to receive the partition data, and respectively sending the partition data to the second receiving equipment so as to enable the partition data to be subjected to data synchronization in the equipment in the self node.
3. The communication task processing method according to claim 1, characterized in that the communication task processing method further comprises:
determining a first receiving device and at least one second receiving device among all the devices of each of the receiving nodes;
the process of controlling the receiving node to synchronize the data of the partition data received by the receiving node in each device in the self node comprises the following steps:
the first receiving equipment controlling the receiving node receives the partition data, divides the partition data into a plurality of sub-data, and sends each sub-data to each second receiving equipment in a one-to-one correspondence manner;
And for each second receiving device, controlling the second receiving device to send the sub-data acquired by the second receiving device to other second receiving devices so as to enable the partition data to be subjected to data synchronization in each device in the node of the second receiving device.
4. The communication task processing method according to claim 1, wherein the process of assigning each of the communication tasks to each of the transmitting devices in one-to-one correspondence comprises:
dividing a plurality of communication tasks into a plurality of task groups when a plurality of communication tasks exist;
and allocating the task groups to the transmission nodes in a one-to-one correspondence manner, so that in each transmission node, the communication tasks in the task groups are allocated to the transmission devices in the transmission node in a one-to-one correspondence manner.
5. The communication task processing method according to claim 4, wherein the process of dividing the plurality of communication tasks into a plurality of task groups includes:
determining a number of nodes of the sending nodes in the distributed cluster;
and dividing a plurality of communication tasks into task groups of the number of nodes.
6. The communication task processing method according to claim 5, wherein the process of dividing the plurality of communication tasks into task groups of the number of nodes includes:
Determining the traffic corresponding to each communication task;
and dividing a plurality of communication tasks into task groups with the number of nodes according to the traffic corresponding to each communication task.
7. The communication task processing method according to claim 6, wherein a difference in total traffic of any two of the task groups is smaller than a preset value.
8. The communication task processing method according to claim 1, wherein the device is a graphics processor device or a tensor processing unit device.
9. The communication task processing method according to any one of claims 1 to 8, wherein when one or more communication tasks exist, after allocating each of the communication tasks to each of the transmitting devices in a one-to-one correspondence, before controlling the transmitting device to which the communication task corresponds to sequentially transmit the plurality of partition data to one of the receiving nodes, the communication task processing method further comprises:
when a plurality of communication tasks exist, determining the scheduling sequence of each communication task;
the process of controlling the sending device corresponding to the communication task to send the plurality of partition data to one receiving node in turn comprises the following steps:
And according to the scheduling sequence, controlling the sending equipment corresponding to each communication task to sequentially send the plurality of partition data to one receiving node.
10. The communication task processing method according to claim 9, wherein the process of determining the scheduling order of each of the communication tasks includes:
determining the scheduling sequence of each communication task according to preset scheduling conditions; the preset scheduling conditions include serial scheduling of the communication tasks allocated to the same transmitting node, serial scheduling and/or parallel scheduling of the communication tasks allocated to different transmitting nodes.
11. The communication task processing method according to claim 9, wherein the process of determining the scheduling order of each of the communication tasks includes:
determining all scheduling sequences of the communication tasks;
calculating the total completion time of each communication task under each scheduling sequence;
determining the scheduling sequence corresponding to the minimum value of the total completion time as an optimal scheduling sequence;
according to the scheduling sequence, the process of controlling the sending device corresponding to each communication task to send the plurality of partition data to one receiving node in turn comprises the following steps:
And controlling the sending equipment corresponding to each communication task to sequentially send the plurality of partition data to one receiving node according to the optimal scheduling sequence.
12. The communication task processing method according to claim 9, wherein the process of determining the scheduling order of each of the communication tasks includes:
determining traffic for each of said communication tasks in each of said sending nodes;
and determining the scheduling sequence of each communication task in each sending node according to the traffic.
13. The communication task processing method according to claim 12, wherein the process of determining the scheduling order of the respective communication tasks in each of the transmitting nodes according to the size of the traffic includes:
and determining the scheduling sequence of each communication task in each sending node according to the sequence from the large to the small of the traffic or the sequence from the small to the large.
14. The communication task processing method according to claim 13, wherein after determining at least one transmitting node and at least one receiving node in the distributed cluster according to a deployment policy of a preset model on each device of the distributed cluster, the communication task processing method further comprises:
Allocating independent numbers to each sending node according to the deployment strategy;
the process of determining the scheduling sequence of each communication task in each sending node according to the traffic volume comprises the following steps:
and determining the scheduling sequence of each communication task in each sending node according to the number and the size of the traffic.
15. The communication task processing method according to claim 14, wherein the process of determining the scheduling order of the respective communication tasks in each of the transmitting nodes based on the number and the size of the traffic comprises:
determining the scheduling sequence of each communication task in the sending node with the odd number according to the sequence of the traffic from the large to the small;
and/or,
and determining the scheduling sequence of each communication task in the even numbered sending nodes according to the sequence from the small traffic to the large traffic.
16. A communication task processing system, comprising:
a first determining module, configured to determine at least one sending node and at least one receiving node in a distributed cluster according to a deployment policy of a preset model on each device of the distributed cluster, wherein the sending node comprises at least one device and the receiving node comprises at least one device;
a dividing module, configured to, when one or more communication tasks exist, allocate the communication tasks to sending devices in one-to-one correspondence, divide the task data corresponding to each communication task to obtain a plurality of partition data, and control the sending device corresponding to the communication task to send the plurality of partition data sequentially to one receiving node, wherein the sending device is a device in the sending node;
a synchronization module, configured to control the receiving node to synchronize the partition data it receives among the devices in that receiving node, and to send the partition data to other receiving nodes that have not received the partition data.
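The behavior of the dividing module can be sketched in Python as follows. This is only an illustration under stated assumptions: the claim does not say how task data is divided, so equal-sized contiguous chunks are assumed, and the function names, device names, and the choice of a single target receiving node for all tasks are hypothetical.

```python
def split_into_partitions(task_data: bytes, num_partitions: int) -> list[bytes]:
    """Split task data into contiguous, nearly equal partitions (assumed scheme)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    base, extra = divmod(len(task_data), num_partitions)
    partitions, start = [], 0
    for i in range(num_partitions):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first chunks
        partitions.append(task_data[start:start + size])
        start += size
    return partitions


def assign_tasks(task_names: list[str], sending_devices: list[str]) -> dict[str, str]:
    """Assign each communication task to its own sending device (one-to-one)."""
    if len(task_names) > len(sending_devices):
        raise ValueError("more tasks than sending devices")
    return dict(zip(task_names, sending_devices))


def send_plan(tasks: dict[str, bytes], sending_devices: list[str],
              receiving_node: str, num_partitions: int) -> list[tuple]:
    """List the sends in order: (sending device, receiving node, index, partition)."""
    assignment = assign_tasks(list(tasks), sending_devices)
    plan = []
    for name, data in tasks.items():
        for index, part in enumerate(split_into_partitions(data, num_partitions)):
            plan.append((assignment[name], receiving_node, index, part))
    return plan


plan = send_plan({"t0": b"abcdefghij"}, ["s0d0", "s0d1"], "R0", 3)
```

Each entry of `plan` is one sequential send of a partition from the task's sending device to the chosen receiving node; the synchronization module would then spread each partition onward from that node.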
17. An electronic device, comprising:
a memory for storing a computer program;
a processor, configured to implement the steps of the communication task processing method according to any one of claims 1-15 when executing the computer program.
18. A distributed cluster, comprising:
the electronic device of claim 17;
at least one sending node comprising at least one sending device, wherein the sending device is configured to send a plurality of partition data obtained by dividing task data of a communication task;
at least one receiving node comprising a first receiving device and at least one second receiving device, wherein the first receiving device is configured to, each time one piece of partition data is received, send the partition data to the second receiving device in its own node and send the partition data to the first receiving devices of other receiving nodes that have not received the partition data.
19. The distributed cluster of claim 18, wherein the first receiving device is configured to, each time one piece of partition data is received, first send the partition data to the second receiving device in its own node, and then send the partition data to the first receiving devices of other receiving nodes that have not received the partition data.
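The forwarding order described in claims 18 and 19 can be sketched as a small Python simulation. The data layout (`nodes` as a dictionary with `"first"` and `"second"` entries), the function name `forward_partition`, and the bookkeeping set of holders are all assumptions for illustration. For simplicity the sketch marks a node as holding a partition as soon as a transfer to it is planned, which a real system would only do on confirmed receipt.

```python
def forward_partition(partition_id: int, own_node: str,
                      nodes: dict[str, dict], received: dict[int, set]) -> list[tuple]:
    """Return the ordered (source, destination) transfers for one received partition.

    nodes:    node name -> {"first": device, "second": [devices]}
    received: partition id -> names of receiving nodes that already hold it
    """
    first = nodes[own_node]["first"]
    # Step 1: synchronize inside the own node (first device to second devices).
    transfers = [(first, device) for device in nodes[own_node]["second"]]
    holders = received.setdefault(partition_id, set())
    holders.add(own_node)
    # Step 2: forward to the first device of every node that lacks the partition.
    for name, node in nodes.items():
        if name not in holders:
            transfers.append((first, node["first"]))
            holders.add(name)
    return transfers


nodes = {
    "R0": {"first": "r0d0", "second": ["r0d1", "r0d2"]},
    "R1": {"first": "r1d0", "second": ["r1d1"]},
}
received: dict[int, set] = {}
from_r0 = forward_partition(0, "R0", nodes, received)
from_r1 = forward_partition(0, "R1", nodes, received)
```

Here `from_r0` lists the intra-node sends before the inter-node send, matching the order in claim 19, and `from_r1` contains only the intra-node send because `R0` already holds the partition.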
20. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the communication task processing method according to any one of claims 1-15.
CN202311423755.8A 2023-10-31 2023-10-31 Communication task processing method, system, equipment, cluster and readable storage medium Active CN117155928B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311423755.8A CN117155928B (en) 2023-10-31 2023-10-31 Communication task processing method, system, equipment, cluster and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311423755.8A CN117155928B (en) 2023-10-31 2023-10-31 Communication task processing method, system, equipment, cluster and readable storage medium

Publications (2)

Publication Number Publication Date
CN117155928A true CN117155928A (en) 2023-12-01
CN117155928B CN117155928B (en) 2024-02-09

Family

ID=88912463

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311423755.8A Active CN117155928B (en) 2023-10-31 2023-10-31 Communication task processing method, system, equipment, cluster and readable storage medium

Country Status (1)

Country Link
CN (1) CN117155928B (en)

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040260832A1 (en) * 2003-06-23 2004-12-23 Newisys, Inc., A Delaware Corporation Bandwidth, framing and error detection in communications between multi-processor clusters of multi-cluster computer systems
US20050034048A1 (en) * 2003-08-05 2005-02-10 Newisys, Inc. Reliable communication between multi-processor clusters of multi-cluster computer systems
CN109792406A (en) * 2018-07-27 2019-05-21 袁振南 Message delivery method, device and storage medium in server cluster
US20210004163A1 (en) * 2019-07-05 2021-01-07 Vmware, Inc. Performing resynchronization jobs in a distributed storage system based on a parallelism policy
CN111813858A (en) * 2020-07-10 2020-10-23 电子科技大学 Distributed neural network hybrid synchronous training method based on self-organizing grouping of computing nodes
CN112118315A (en) * 2020-09-18 2020-12-22 北京有竹居网络技术有限公司 Data processing system, method, device, electronic equipment and storage medium
US20220318060A1 (en) * 2021-03-31 2022-10-06 International Business Machines Corporation Full-dimensional scheduling and scaling for microservice applications
CN113377540A (en) * 2021-06-15 2021-09-10 上海商汤科技开发有限公司 Cluster resource scheduling method and device, electronic equipment and storage medium
WO2022262167A1 (en) * 2021-06-15 2022-12-22 上海商汤科技开发有限公司 Cluster resource scheduling method and apparatus, electronic device and storage medium
CN114625533A (en) * 2022-02-28 2022-06-14 中国农业银行股份有限公司 Distributed task scheduling method and device, electronic equipment and storage medium
CN115567997A (en) * 2022-04-11 2023-01-03 荣耀终端有限公司 Method and device for selecting central node of equipment discovery group
CN115248728A (en) * 2022-09-21 2022-10-28 之江实验室 Distributed training task scheduling method, system and device for intelligent computing
CN116956756A (en) * 2023-09-21 2023-10-27 浪潮电子信息产业股份有限公司 Model deployment method, task processing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN117155928B (en) 2024-02-09

Similar Documents

Publication Publication Date Title
CN106503791A (en) System and method for the deployment of efficient neural networks
US20180018197A1 (en) Virtual Machine Resource Allocation Method and Apparatus
CN111143257A (en) DDR arbitration controller, video cache device and video processing system
CN106502806A (en) Bus protocol command processing device and related method
CN104243405A (en) Request processing method, device and system
CN102158906B (en) Service quality sensory system and task scheduling method thereof
CN103281227A (en) Method of driving bus arrangement
CN103353851A (en) Method and equipment for managing tasks
CN112148468B (en) Resource scheduling method and device, electronic equipment and storage medium
CN101989942A (en) Arbitration control method, communication method, arbitrator and communication system
CN115150286B (en) Transmission node changing method, device, computer equipment and storage medium
CN102264109A (en) Method of distributing bandwidth for service and for terminal service execution, and equipment thereof
CN109548161A (en) Method, apparatus and terminal device for wireless resource scheduling
CN113946431A (en) Resource scheduling method, system, medium and computing device
CN117155928B (en) Communication task processing method, system, equipment, cluster and readable storage medium
CN117155791B (en) Model deployment method, system, equipment and medium based on cluster topology structure
CN103853676A (en) PCIe (Peripheral Component Interface express) bus based channel allocating, releasing, data transmitting method and system
CN116663639B (en) Gradient data synchronization method, system, device and medium
CN113608887A (en) Real-time interaction method for digital twin virtual and real network information
CN110750363B (en) Computer storage management method and device, electronic equipment and storage medium
CN106357764B (en) Data synchronization method and server for mobile terminal
CN102594642A (en) Method for real-time controller area network (CAN) communication
CN110012240A (en) Signal source switching method, system, storage medium and control host
CN114124973A (en) Multi-cloud-scene-oriented mirror image synchronization method and device
CN104081735B (en) For the system in transmission over networks simultaneous streaming

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant