WO2022007587A1 - Switch and data processing system - Google Patents
- Publication number
- WO2022007587A1 (application PCT/CN2021/099527)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- port
- aggregation
- switch
- processing unit
- Prior art date
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/24—Multipath
- H04L45/245—Link aggregation, e.g. trunking
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/35—Switches specially adapted for specific applications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/34—Flow control; Congestion control ensuring sequence integrity, e.g. using sequence numbers
Definitions
- the present application relates to the field of communications, and in particular, to a switch and a data processing system.
- AI Artificial intelligence
- HPC high performance computing
- MPI message passing interface
- the amount of data that the data nodes need to transmit to the aggregation node also increases, and the aggregation node needs to process more data, so that the data transmission bandwidth of the entire system cannot meet the requirements, which increases the delay of data aggregation processing. Therefore, how to reduce the processing delay of data aggregation has become an urgent technical problem to be solved.
- the present application provides a switch, device and system for data processing, so as to provide a low-latency data processing method and improve the efficiency of data processing.
- the present application provides a switch. The switch is connected to at least two data nodes, and the at least two data nodes respectively perform a first operation of a distributed computing task. The switch is used to receive the result data of the first operation sent by the at least two data nodes, perform the second operation of the distributed computing task according to the received result data of the first operation to obtain the result data of the second operation, and distribute the result data of the second operation to the at least two data nodes.
- the switch and the data nodes can jointly complete the distributed computing task, that is, an operation of the distributed computing task can be performed while the switch transmits data. This avoids the inefficiency of having a separate node perform the second operation and thereby improves the efficiency of data processing.
- because the switch completes the second operation of the distributed computing task during data transmission, there is no need to deploy a separate node, which reduces the cost of the system.
- the switch includes a processing unit and at least two ports. Each port is connected to a data node and is used to receive the result data of the first operation sent by the connected data node and to forward the result data of the first operation to the processing unit.
- in this way, the processing unit of the switch performs the second operation of the distributed computing task, thereby reducing the delay of data processing.
- each port before forwarding the result data of the first operation to the processing unit, each port is further configured to perform the third operation of the distributed computing task on the result data of the first operation. That is to say, each port can also perform the third operation of the distributed computing task before forwarding the result data of the first operation to the processing unit, thereby accelerating the speed of data processing.
- the distributed computing tasks include distributed artificial intelligence computing tasks or distributed high-performance computing tasks or distributed graphics computing tasks or distributed cloud computing tasks.
- the second operation or the third operation of the distributed computing task includes an operation of aggregating data of the same type.
- the switch is an access switch or an aggregation switch.
- the processing unit is further configured to send an operation command to the at least two ports, where the operation command is used to instruct the at least two ports to respectively perform the third operation of the distributed computing task.
- through the operation command, the processing unit can instruct the ports connected to the data nodes that perform the distributed computing task to each perform the third operation of the distributed computing task, so that the switch performs operations of the distributed computing task during data transmission, which improves the efficiency of data processing.
- the processing unit sends the operation command through a first loop to the at least two ports connected to the data nodes performing the distributed computing task, where the first loop includes the at least two ports, and the order of the first loop indicates the order in which the at least two ports receive or execute operation commands.
- the transmission of the operation command and the result data of the operation command is realized through the first loop, so that the influence of the data aggregation processing on other types of data processing processes can be avoided.
- the bandwidth of the first loop can be configured according to service requirements, thereby ensuring the performance of data processing.
- the result data of the third operation and the operation command are forwarded to the adjacent subsequent port, until the last port in the order of the first loop sends all the result data of the third operation to the processing unit.
- the operation command can be executed by each port in the first loop in turn, with the result data of the first operation sent to the adjacent subsequent port, until the last port in the first loop completes the processing of the operation command. Each port thus completes the operation of the distributed computing task according to the operation command, which accelerates data processing.
- the processing unit is further configured to, before sending the operation command: receive the packet headers respectively sent by the at least two ports connected to the data nodes executing the distributed computing task, each packet header including a data type and a message sequence number; establish an operation table entry according to the packet headers, where the operation table entry records the data to be processed and the processed data of each type of data; and send the operation command according to the operation table entry.
- the processing unit issues operation commands to the at least two ports according to the processing status of each type of data, and each port then completes the processing of the distributed computing task according to the operation commands.
- the packet header further includes a port identifier
- the operation table entry is further used to record the port identifier corresponding to each data to be processed in each type of data.
- the switch includes at least one of the first loops.
- the switch is further configured to establish a second loop, and distribute result data of the second operation through the second loop.
- the switch may further include a second loop for distributing the result data of the second operation, so as to avoid the influence of distributing the result data of the second operation on other types of operations.
- each port can sequentially acquire the result data of the second operation.
- the result data of the second operation refers to the aggregation result of all data of the same type. The switch can send this aggregation result through the second loop to the ports connected to the data nodes that execute the distributed computing task, and each port then sends the aggregation result to its connected data node, so that operations of the distributed computing task are performed during data transmission and data processing is accelerated.
- the first loop and the second loop may be the same.
- the first loop and the second loop may also be different.
- the present application provides a method for data processing.
- the method is executed by a switch. The switch is connected to at least two data nodes, and each data node is used to perform a first operation of a distributed computing task. The data processing includes: the switch receiving the result data of the first operation sent by each of the at least two data nodes; performing the second operation of the distributed computing task according to the received result data of the first operation to obtain the result data of the second operation; and distributing the result data of the second operation.
- it can be seen from the above that the switch performs operations of the distributed computing task during data transmission, which improves the efficiency of data processing.
- the switch includes a processing unit and at least two ports. Each port is connected to a data node and is used to receive the result data of the first operation sent by the connected data node and to forward the result data of the first operation to the processing unit.
- in this way, the processing unit of the switch performs the second operation of the distributed computing task, thereby reducing the delay of data processing.
- each port before forwarding the result data of the first operation to the processing unit, each port is further configured to perform the third operation of the distributed computing task on the result data of the first operation. That is to say, each port can also perform the third operation of the distributed computing task before forwarding the result data of the first operation to the processing unit, thereby accelerating the speed of data processing.
- the distributed computing tasks include distributed artificial intelligence computing tasks or distributed high-performance computing tasks or distributed graphics computing tasks or distributed cloud computing tasks.
- the second operation or the third operation of the distributed computing task includes an operation of aggregating data of the same type.
- the switch is an access switch or an aggregation switch.
- the processing unit is further configured to send an operation command to the at least two ports, where the operation command is used to instruct the at least two ports to respectively perform the third operation of the distributed computing task.
- through the operation command, the processing unit can instruct the ports connected to the data nodes that perform the distributed computing task to each perform the third operation of the distributed computing task, so that the switch performs operations of the distributed computing task during data transmission, which improves the efficiency of data processing.
- the processing unit sends the operation command through a first loop to the at least two ports connected to the data nodes performing the distributed computing task, where the first loop includes the at least two ports, and the order of the first loop indicates the order in which the at least two ports receive or execute operation commands.
- the transmission of the operation command and the result data of the operation command is realized through the first loop, so that the influence of the data aggregation processing on other types of data processing processes can be avoided.
- the bandwidth of the first loop can be configured according to service requirements, thereby ensuring the performance of data processing.
- the result data of the third operation and the operation command are forwarded to the adjacent subsequent port, until the last port in the order of the first loop sends all the result data of the third operation to the processing unit.
- the operation command can be executed by each port in the first loop in turn, with the result data of the first operation sent to the adjacent subsequent port, until the last port in the first loop completes the processing of the operation command. Each port thus completes the operation of the distributed computing task according to the operation command, which accelerates data processing.
- the processing unit is further configured to, before sending the operation command: receive the packet headers respectively sent by the at least two ports connected to the data nodes executing the distributed computing task, each packet header including a data type and a message sequence number; establish an operation table entry according to the packet headers, where the operation table entry records the data to be processed and the processed data of each type of data; and send the operation command according to the operation table entry.
- the processing unit issues operation commands to the at least two ports according to the processing status of each type of data, and each port then completes the processing of the distributed computing task according to the operation commands.
- the packet header further includes a port identifier
- the operation table entry is further used to record the port identifier corresponding to each data to be processed in each type of data.
- the switch includes at least one of the first loops.
- the switch is further configured to establish a second loop, and distribute result data of the second operation through the second loop.
- the switch may further include a second loop for distributing the result data of the second operation, so as to avoid the influence of distributing the result data of the second operation on other types of operations.
- each port can sequentially acquire the result data of the second operation.
- the result data of the second operation refers to the aggregation result of all data of the same type. The switch can send this aggregation result through the second loop to the ports connected to the data nodes that execute the distributed computing task, and each port then sends the aggregation result to its connected data node, so that operations of the distributed computing task are performed during data transmission and data processing is accelerated.
- the first loop and the second loop may be the same.
- the first loop and the second loop may also be different.
- the present application provides an apparatus for data processing, the apparatus comprising various modules for executing the data processing method in the second aspect or any possible implementation manner of the second aspect.
- the present application provides a system for data processing, the system includes a switching network and at least two data nodes connected to the switching network, wherein the at least two data nodes are used to perform the first step of the distributed computing task respectively.
- each switch includes a first processor and at least two ports. Each port is used to connect to a data node that performs the distributed computing task, and the first processor and each port are respectively used to perform the operation steps of the method described in any possible implementation manner of the second aspect.
- the present application provides a computer-readable storage medium, where a command is stored in the computer-readable storage medium, which, when executed on a computer, causes the computer to execute the methods described in the above aspects.
- the present application provides a computer program product comprising commands that, when run on a computer, cause the computer to perform the methods described in the above aspects.
- the present application may further combine to provide more implementation manners.
- FIG. 1 is a schematic diagram of an aggregation process provided by an embodiment of the present application.
- FIG. 2 is a schematic structural diagram of a data processing system 100 provided by the present application.
- FIG. 3 is a schematic structural diagram of a switch according to an embodiment of the present application.
- FIG. 4 is a schematic flowchart of a data processing method provided by an embodiment of the present application.
- FIG. 5 is a schematic structural diagram of a switch 500 according to an embodiment of the present application.
- FIG. 6 is a schematic structural diagram of another switch 600 according to an embodiment of the present application.
- the computing tasks are jointly performed by a data node and a switch connected to the data node, so as to improve the efficiency of data processing.
- the computing tasks performed jointly by the data nodes and switches can be called distributed computing tasks.
- the data node may be a node in the form of a computing device (for example, a server), or a node in a virtualized form such as a virtual machine or a container.
- the virtual machine or container may be deployed on at least one computing device (for example, a server), and each computing device is connected to a switch.
- distributed computing tasks include distributed artificial intelligence (AI) computing tasks, distributed high-performance computing (HPC) tasks, distributed graphics computing tasks, distributed cloud computing tasks, or other computing tasks that can be processed by distributed computing.
- a distributed cloud computing task is one in which, in artificial intelligence, high-performance computing, graphics computing, or other scenarios, the computing tasks of the scenario are jointly performed by data nodes and switches deployed in the form of virtual machines or containers.
- switches can also be implemented in a virtualized form.
- the operation of the distributed computing task performed by the data node may also be referred to as the first operation, and the operation of the distributed computing task performed by the switch may be referred to as the second operation.
- aggregation refers to the operation of accumulating similar data.
- a data node can generate the same kind of data.
- the operation of the data node generating the same kind of data can also be called the first operation of a distributed computing task.
- Similar data includes similar parameters in algorithms used in artificial intelligence scenarios, or similar data generated by computing-intensive tasks.
- Specific similar data can be set according to application scenarios and business requirements; switches can perform data aggregation operations on similar data generated by data nodes. Then, an aggregation result of the same type of data is obtained.
- the operation of performing data aggregation on the same type of data generated by the data node by the switch may also be referred to as a second operation.
- for example, data node A, data node B and data node C each generate three types of parameters: data node A generates A0, A1 and A2, data node B generates B0, B1 and B2, and data node C generates C0, C1 and C2.
- the aggregation results obtained by the switch after the aggregation operation include A0+B0+C0, A1+B1+C1, and A2+B2+C2.
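The A/B/C example above can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary layout, and the use of plain addition as the accumulation operator are assumptions, not details taken from the application.

```python
# Hypothetical sketch: names and the choice of addition as the aggregation
# operator are assumptions for illustration.
from typing import Dict, List


def aggregate_same_type(node_data: Dict[str, List[float]]) -> List[float]:
    """Sum the i-th parameter of every data node (A_i + B_i + C_i)."""
    columns = zip(*node_data.values())  # group values of the same type together
    return [sum(column) for column in columns]


node_data = {
    "A": [1.0, 2.0, 3.0],        # A0, A1, A2
    "B": [10.0, 20.0, 30.0],     # B0, B1, B2
    "C": [100.0, 200.0, 300.0],  # C0, C1, C2
}
result = aggregate_same_type(node_data)  # [A0+B0+C0, A1+B1+C1, A2+B2+C2]
```

Each output element corresponds to one type of data, so the result has one entry per parameter type regardless of how many data nodes contribute.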
- FIG. 2 is a schematic structural diagram of a data processing system 100 provided by an embodiment of the present application. As shown in the figure, the system 100 includes a switching network 10 and a data node 20 , and the data node 20 is connected to the switching network 10 .
- the data node 20 is used to generate the data to be processed by the distributed computing task, for example, data of the same type to be aggregated for artificial intelligence parameter training and/or for data-intensive computing tasks in high-performance computing scenarios, and to send the data of the same type to be aggregated, in the form of packets, to the switch connected to it.
- for example, as shown in the figure, the data node 20 includes six data nodes, where the data nodes 201 to 203 are connected to the switch 102, the data nodes 204 to 206 are connected to the switch 103, and the switch 102 and the switch 103 are connected through the switch 101. This realizes the communication connection between the data nodes: each data node can send the data of the same type to be aggregated that it generates to the switching network 10, and a switch in the switching network 10 performs the data aggregation operation.
- the switching network 10 is used for implementing data transmission and performing operations of distributed computing tasks in the system 100 (eg, performing data aggregation operations in a distributed computing task of aggregating data).
- the switching network 10 includes at least one switch.
- the present application takes the switching network 10 including three switches as an example; the switch 101 may also be referred to as an aggregation switch, and the switches 102 and 103 may also be referred to as access switches.
- the access switch is used to connect the data nodes 20 and to perform the operations of the distributed computing task for the data nodes 20 connected to it; the aggregation switch is used to implement data transmission and the operations of the distributed computing task for the data nodes connected to different access switches.
- only one switch may be set in the switching network 10 to implement data transmission and aggregation processing of the data nodes 201 to 206 .
- the structure of the switching network and the number of switches can be set according to service requirements, and the present application does not limit the number and networking mode of the switches in the switching network 10 .
- the following embodiments of the present application take the switching network 10 shown in FIG. 2 as an example for description.
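The division of work between access switches and the aggregation switch can be pictured with the following sketch. It assumes, for illustration only, that each access switch first produces a partial sum for its own data nodes and that the aggregation switch then adds the partial sums; the text above does not prescribe this exact split, and all names are hypothetical.

```python
# Hypothetical two-level sketch of the topology described above
# (nodes 201-203 on switch 102, nodes 204-206 on switch 103, both under 101).
from typing import Dict, List

Vector = List[float]


def add_vectors(vectors: List[Vector]) -> Vector:
    """Element-wise sum of equally long vectors of same-type data."""
    return [sum(values) for values in zip(*vectors)]


TOPOLOGY: Dict[str, List[str]] = {
    "switch102": ["node201", "node202", "node203"],
    "switch103": ["node204", "node205", "node206"],
}


def hierarchical_aggregate(node_data: Dict[str, Vector],
                           topology: Dict[str, List[str]]) -> Vector:
    # Assumed step 1: each access switch aggregates its attached data nodes.
    partials = [add_vectors([node_data[name] for name in nodes])
                for nodes in topology.values()]
    # Assumed step 2: the aggregation switch combines the partial results.
    return add_vectors(partials)


node_data = {f"node20{i}": [float(i), float(10 * i)] for i in range(1, 7)}
total = hierarchical_aggregate(node_data, TOPOLOGY)
```

Because addition is associative, the two-level result equals the sum over all six data nodes computed in one place.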
- FIG. 3 is a schematic structural diagram of a switch according to an embodiment of the application.
- the switch includes a processing unit 110, a plurality of ports (for example, ports 1201 to 1222), and a cross-connect network (crossbar) 130.
- the processing unit 110 is configured to perform a data aggregation operation.
- the processing unit 110 is further configured to instruct the port to perform data aggregation processing.
- the processing unit 110 further includes an aggregation result cache 111 , a calculation unit 112 , a command generation module 113 and a packet header management module 114 .
- the aggregation result cache 111 is used to store the aggregation results of the same type of data.
- the memory in the switch or the cache of the processor in the switch can be used to realize the function of the aggregation result cache 111 .
- the calculation unit 112 is configured to perform an aggregation operation on the aggregation results sent by each port, and to manage operation table entries (for aggregation operations, these may specifically be aggregation table entries), including generating, updating, and deleting operation table entries, where an operation table entry records the data to be processed and the processed data of each type of data.
- the command generation module 113 is configured to determine the data of the same type to be aggregated according to the operation table entry, generate an operation command (for a data aggregation operation, this may specifically be an aggregation command), and send the aggregation command to the first port of the aggregation loop; the port checks, according to the data stored in its input buffer, whether it holds data of the same type to be aggregated, and performs the aggregation operation.
- the packet header management module 114 is used to parse the packet header sent by each port and to send the message sequence number in the packet header to the command generation module 113, and the command generation module 113 generates the aggregation command according to the message sequence number and the aggregation table entry.
- the packet header management module 114 is also used for connecting to the cross-connect network 130, which may also be referred to as a cross-connect matrix and is used to transmit packet headers between the processing unit 110 and each port.
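One possible shape for the operation table is sketched below. The class names, the dictionary-based layout, and the rule of issuing the lowest pending sequence number first are all assumptions made for illustration; the application only states that the table records pending and processed data per data type (and, optionally, the port identifier of each pending item).

```python
# Hypothetical sketch of an operation (aggregation) table built from headers.
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class OperationEntry:
    # sequence number -> identifiers of the ports holding that pending data
    pending: Dict[int, Set[int]] = field(default_factory=dict)
    # sequence numbers whose aggregation has finished
    processed: Set[int] = field(default_factory=set)


class OperationTable:
    def __init__(self) -> None:
        self.entries: Dict[str, OperationEntry] = {}

    def record_header(self, data_type: str, seq: int, port_id: int) -> None:
        """Register a header (data type, sequence number, port identifier)."""
        entry = self.entries.setdefault(data_type, OperationEntry())
        entry.pending.setdefault(seq, set()).add(port_id)

    def next_command(self) -> Optional[Tuple[str, int]]:
        """Pick pending data to aggregate (assumed policy: lowest sequence)."""
        for data_type, entry in self.entries.items():
            if entry.pending:
                return data_type, min(entry.pending)
        return None

    def mark_processed(self, data_type: str, seq: int) -> None:
        """Move an item from the pending set to the processed set."""
        entry = self.entries[data_type]
        entry.pending.pop(seq)
        entry.processed.add(seq)


table = OperationTable()
table.record_header("grad", 0, 1201)
table.record_header("grad", 0, 1202)
table.record_header("grad", 1, 1201)
first_command = table.next_command()
table.mark_processed(*first_command)
second_command = table.next_command()
```

In this sketch the command generation step is simply `next_command`, and the table update after a finished aggregation is `mark_processed`.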
- each port in the switch can be connected to a data node, and each port includes a computing unit and a memory, for example, port 1201 includes a computing unit 12011 and a memory 12012 .
- the computing unit is configured to parse the message sent by the data node to obtain message header and payload data.
- the computing unit is further configured to perform the aggregation operation according to the aggregation command sent by the processing unit 110 .
- the memory of each port can be further divided into an input buffer and an output buffer (not shown in the figure) according to different types of stored data.
- the input buffer is used to store the message sent by the data node connected to the port.
- the message includes a message header and payload data, and the payload data includes the same type of data to be aggregated.
- the output buffer is used to receive and store the aggregation results of all data of the same type sent by the processing unit 110 after the system 100 completes the aggregation operation on all data of the same type to be processed, so that the aggregation results of all data of the same type can be sent to the data node connected to the port.
- the number of ports in the switch varies according to different products produced by manufacturers, and the present application does not limit the number of ports included in the switch.
- FIG. 3 also shows two data transmission loops: an aggregation loop and a distribution loop.
- the aggregation loop can also be referred to as the first loop
- the distribution loop can be referred to as the second loop.
- the aggregation loop is a set of ports connected to data nodes participating in the distributed computing task. It includes at least two ports sorted according to preset rules, and the order of the ports in the aggregation loop indicates the order in which the at least two ports receive or execute aggregation commands.
- the aggregation command generated by the processing unit 110 can be transmitted sequentially from the first port of the aggregation loop to the last port. Each port performs the corresponding aggregation operation according to the aggregation command and whether data of the same type to be aggregated is stored in the port, and sends the aggregation result and the aggregation command to the next adjacent port in the loop, and so on, until the last port in the aggregation loop completes the processing of the aggregation command and sends the aggregation result of the aggregation command to the processing unit, which completes the processing of one aggregation command in the aggregation loop.
- as shown in the figure, the aggregation loop includes: processing unit 110-port 1201-port 1202-...-port 1211-port 1212-...-port 1221-port 1222-processing unit 110, forming a closed loop. The aggregation command can be executed by each port of the aggregation loop, and the final result data is transmitted to the processing unit 110 by the last port in the aggregation loop.
- the port 1201 connected to the processing unit 110 may also be called the first port of the aggregation loop, and is used to receive the aggregation command sent by the processing unit 110 .
- the aggregation command is processed in turn by port 1201, port 1202, port 1211, port 1212, and the subsequent ports of the aggregation loop.
- port 1202, port 1211, port 1212, and port 1221 can also be referred to as link ports of the aggregation loop. Each of them can receive, from its adjacent previous port in the aggregation loop, the aggregation command and the aggregation result that the previous port produced for that command, and performs the aggregation operation based on the received aggregation command and that aggregation result.
- for example, port 1202 can receive the aggregation command sent by port 1201 and the aggregation result produced by port 1201 for the aggregation command. When port 1201 does not hold data of the same type to be aggregated, port 1201 directly sends the aggregation command to port 1202; in this case the aggregation result of port 1201 for the aggregation command is empty. When port 1201 holds data of the same type to be aggregated, port 1201 sends to port 1202 both the aggregation command and its own aggregation result, and so on.
- in the aggregation loop, each port receives the aggregation command in turn, performs the aggregation operation according to the aggregation command, and sends the aggregation command together with its aggregation result to the next adjacent port in the aggregation loop.
- the final aggregation result of the aggregation command is transmitted to the processing unit 110 by the last port in the aggregation loop (also referred to as a tail port).
- the port 1222 is the last port in the order of the aggregation loop, and the port 1222 transmits the final aggregation result to the processing unit 110, and then the processing unit 110 determines whether the aggregation processing of all similar data is completed.
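The pass of one aggregation command around the aggregation loop can be simulated roughly as follows. This is a sketch under stated assumptions: the `Port` class, its `input_buffer` dictionary keyed by data type, and the use of addition are hypothetical; an empty partial result is modelled as `None`.

```python
# Hypothetical simulation of one aggregation command traversing the loop.
from typing import Dict, List, Optional


class Port:
    def __init__(self, port_id: int) -> None:
        self.port_id = port_id
        # data type -> value waiting to be aggregated (assumed layout)
        self.input_buffer: Dict[str, float] = {}

    def execute(self, data_type: str,
                partial: Optional[float]) -> Optional[float]:
        """Apply the command locally and return the result to pass on."""
        if data_type not in self.input_buffer:
            return partial  # nothing to aggregate: forward unchanged
        value = self.input_buffer.pop(data_type)
        return value if partial is None else partial + value


def run_aggregation_loop(ports: List[Port],
                         data_type: str) -> Optional[float]:
    """Pass the command from the first port to the tail port in order."""
    partial: Optional[float] = None
    for port in ports:
        partial = port.execute(data_type, partial)
    return partial  # the tail port hands this to the processing unit


ports = [Port(1201), Port(1202), Port(1211)]
ports[0].input_buffer["w0"] = 1.0
ports[2].input_buffer["w0"] = 2.5  # port 1202 holds no "w0" data
final_result = run_aggregation_loop(ports, "w0")
```

Port 1202 illustrates the empty case described above: it forwards the command and the incoming partial result without changing them.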
- the switch may include at least one aggregation loop, and each aggregation loop includes at least two ports sorted according to preset rules.
- a distribution loop is a collection of ports connected to data nodes participating in distributed computing tasks, including at least two ports sorted according to preset rules, and the sorting of ports is used to indicate ports connected to data nodes participating in distributed computing tasks
- the order of receiving the aggregated results of all the same kind of data so that the data nodes connected to the above ports can obtain the aggregated results of all the same kind of data, and then complete other operations of the distributed computing task.
- the distribution loop includes processing unit 110-port 1201-port 1202-...-port 1211, then the processing unit 110 can send the aggregation results of all the same data to the distribution loop through the above-mentioned distribution loop each port.
- the processing unit 110 can also send the aggregation result of all data of the same type to the first port (for example, port 1201) of the distribution loop; the first port then sends that aggregation result to the next adjacent port, and so on, so that each port in the distribution loop obtains the aggregation result.
- after each port receives the aggregation result of all data of the same type, it can store that result in the memory of the port, specifically in the output cache of the memory.
- the last port in the distribution loop may also send a notification message to the processing unit, where the notification message indicates the status of the ports in the distribution loop obtaining the aggregation result of all data of the same type.
- the switch includes at least one distribution loop, and each distribution loop includes at least two ports sorted according to preset rules.
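As a rough sketch of the distribution process described above, each port in the distribution loop stores the final aggregation result in its output cache and the tail port reports back to the processing unit. The class `Port`, the function `distribute`, and the layout of the notification message are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    port_id: int
    output_cache: list = field(default_factory=list)  # hypothetical output cache


def distribute(ports: list[Port], final_result: float) -> dict:
    """Hand the final aggregation result from port to port in loop order."""
    for port in ports:
        port.output_cache.append(final_result)
    # The tail port notifies the processing unit (message format is assumed).
    return {"type": "distribution_done", "tail_port": ports[-1].port_id}


loop = [Port(1201), Port(1202), Port(1211)]
notification = distribute(loop, 42.0)
```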
- the aggregation loop and the distribution loop can be the same loop, that is, the loop is used to transmit aggregation commands and the aggregation results of the ports during data aggregation, and is also used to send the aggregation result of all the first type of data to the ports to which the data nodes participating in distributed computing are connected.
- the aggregation loop and the distribution loop can also be different loops.
- the connections between adjacent ports in the aggregation loop and the distribution loop, as well as the connections between the ports and the processing unit, may be physical connections (for example, conductive traces) in a printed circuit board (PCB).
- the number of ports included in the aggregation loop and the distribution loop can be configured according to service requirements, and data is transmitted port by port along the ports included in the aggregation loop or the distribution loop.
- the data processing method provided by the application is described in detail below with reference to FIG. 4 .
- the method is described by taking a distributed computing task as the processing process of data aggregation as an example.
- the data to be aggregated is referred to as
- the first port, the second port and the third port are ports connected to the data nodes performing the distributed computing task, and together they constitute an aggregation loop.
- The first port is the first port in the order of the aggregation loop, and the third port is the last port of the aggregation loop.
- the method includes:
- the processing unit receives the first packet header sent by the first port.
- the processing unit receives the second packet header sent by the second port.
- the processing unit receives the third packet header sent by the third port.
- each piece of data of the same type to be aggregated has an associated packet sequence number, and each piece of data can be sent from the data node to the switch in one packet.
- Each packet includes a packet header and payload data. Specifically, after the port parses the packet to obtain the packet header, it can send the packet header to the processing unit through the cross-connect network.
- Each packet header includes the packet sequence number, which indicates the sequence number of the packet sent by the data node connected to the port; each packet carries at least one piece of data to be aggregated.
- the packet header further includes a data type, where the data type is used to indicate the type of data to be processed.
- the generation rule of the packet header may be determined by the data node and then notified to the switch, or may be determined by the switch and then notified to the data node, which is not limited in this application.
- a fixed identification bit can be set in the specified field of the message sequence number.
- for example, the first field is the packet sequence number,
- and the second field is used to indicate the data type.
- when the second field is 1, it means that the packet associated with the packet sequence number carries first-type data whose data type is 1, and the aggregation operation can be performed on the data with data type 1 during data processing.
- the packet header further includes a third field for indicating the offset bit.
- The offset bit is used to indicate the total number of pieces of data of the same type to be aggregated on the same port. For example, when the third field of a received packet header is 3, it means that the total number of pieces of data to be aggregated on the port is 3.
- the packet header may further include a fourth field for indicating the port identifier.
- the port identifier is used to indicate the identifier of the port that sends the packet header to the processing unit, and the port identifier may be represented by numbers and/or letters.
- the processing unit may separately record the identifier of the port that sends the packet header.
- Table 1 is an example of a packet header
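The four header fields described above can be modelled with a small data structure. This is a sketch only and is not a reproduction of Table 1: the class name `PacketHeader`, the attribute names, and the helper `carries_type` are hypothetical, and the application does not specify field widths or encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PacketHeader:
    sequence_number: int  # field 1: packet sequence number
    data_type: int        # field 2: type of the data to be processed
    offset: int           # field 3: total pieces of same-type data on the port
    port_id: str          # field 4: identifier of the sending port


def carries_type(header: PacketHeader, data_type: int) -> bool:
    """True if the packet behind this header carries data of the given type."""
    return header.data_type == data_type


header = PacketHeader(sequence_number=2, data_type=1, offset=3, port_id="1")
```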
- S304 (optionally): The processing unit checks the reliability of the packet sequence numbers in the respective packet headers respectively.
- after the processing unit receives the packet headers sent by the ports, it can perform reliability verification on the packet sequence numbers included in the respective packet headers, and the reliability verification method can be any one of the following methods:
- the processing unit may check the reliability of the packet serial number according to a preset rule.
- the data node and the processing unit may agree in advance on a generation rule for the packet sequence number, which may also be called a preset rule, and each packet sequence number is a globally unique identifier.
- each packet carries a first type of data, that is, the packet sequence number can uniquely identify a first type of data.
- the processing unit can check the validity of each packet sequence number according to the preset rule. Specifically, the processing unit may pre-store a preset packet sequence number table, which records the set of all packet sequence numbers generated according to the preset rule, and the processing unit may look up the packet sequence number to be checked in this table. If the packet sequence number is found in the preset packet sequence number table, the result of the reliability check of the packet sequence number is considered to be passed; otherwise, the result of the reliability check is considered to be failed.
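The table-lookup check described above amounts to a set membership test. A minimal sketch follows, assuming the preset table can be held as an in-memory set; the name `check_sequence_number` and the example table contents are hypothetical.

```python
def check_sequence_number(seq: int, preset_table: frozenset) -> bool:
    """Reliability check: pass only if the sequence number is in the preset table."""
    return seq in preset_table


# Hypothetical preset table: sequence numbers 1..4 were generated by the rule.
preset_table = frozenset(range(1, 5))
passed = check_sequence_number(3, preset_table)
failed = check_sequence_number(9, preset_table)
```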
- the processing unit may calculate the validity of the packet sequence number according to a preset rule.
- the packet sequence number may be a random number or an identifier generated according to a preset rule, and is used to globally and uniquely identify a packet. For example, when the packet sequence number is a random number generated by a hash algorithm and then encrypted using an encryption algorithm, the processing unit can decrypt the packet sequence number, and then determine according to the hash algorithm whether the decrypted packet sequence number is within the pre-agreed range of packet sequence numbers.
- if the decrypted packet sequence number is within the pre-agreed range of packet sequence numbers, the reliability check result of the packet sequence number is considered to be passed; if it is not within the pre-agreed range, the reliability check result of the packet sequence number is considered to be failed.
- the packet sequence number may also be generated by using a custom algorithm or a general algorithm other than the hash algorithm, which is not limited in this application.
- the validity of the packet header can be determined before the aggregation operation is performed, thereby avoiding data errors caused by aggregating data that is not of the same type, and improving the accuracy of distributed computing tasks.
- the processing unit generates an aggregation entry according to the sequence numbers of each packet.
- the processing unit can generate an aggregation entry according to the packet sequence numbers in the received packet headers. The aggregation entry records the data to be processed and the processed data in each type of data. That is, in data aggregation processing, the aggregation entry is used to indicate the aggregation status of the first type of data, and includes the packet sequence numbers and the aggregation state of each packet sequence number. The aggregation state indicates the aggregation status of the first type of data associated with each packet sequence number, and is any one of "not aggregated", "aggregated", and "not aggregated, and the header has not been received".
- the aggregation entry may also include a port identifier associated with the packet sequence number.
- the aggregation entry may further include the data type and offset bit associated with the packet sequence number.
- Table 2 is a summary result of a processing unit receiving packet headers provided by an embodiment of the present application.
- the processing unit can learn the following from the packet headers received from each port: the port with port identifier 1 has a total of 3 pieces of data with data type 1 to be aggregated, and the packet headers with packet sequence numbers 1 and 2 have been received; the port with port identifier 2 has a total of 4 pieces of data with data type 1 to be aggregated, and the packet headers with packet sequence numbers 1 and 3 have been received; the port with port identifier 3 has a total of 2 pieces of data with data type 1 to be aggregated, and the packet headers with packet sequence numbers 1 and 2 have been received.
- Table 2 A summary result of the header received by a processing unit
- the processing unit can determine, according to the offset bits of the packet headers in Table 2, the packet sequence numbers of all the first-type data to be aggregated and the aggregation state of the first-type data corresponding to each packet sequence number, and then generate the aggregation entry indicating the aggregation status of the first type of data according to these results. For example, according to Table 2, the data node connected to the port with port identifier 1 generated a total of 3 packet sequence numbers with data type 1; the processing unit has received the packet headers with packet sequence numbers 1 and 2 sent by this port, and has not yet obtained the packet header with packet sequence number 3.
- the data node connected to the port with port identifier 2 generated a total of 4 packet sequence numbers with data type 1; the processing unit has received the packet headers with packet sequence numbers 1 and 3 sent by this port, and has not yet obtained the packet headers with packet sequence numbers 2 and 4.
- the data node connected to the port with port identifier 3 generated a total of 2 packet sequence numbers with data type 1, and the processing unit has obtained the packet headers with packet sequence numbers 1 and 2.
- the processing unit can first determine, according to the above situation, the packet sequence numbers of all the first type of data to be aggregated and the port identifier associated with each packet sequence number, and then identify the aggregation state of each packet sequence number. For example, for port identifier 1, the aggregation state of packet sequence number 1 is "not aggregated", and the aggregation state of packet sequence number 3 is "not aggregated, and the header has not been received".
- the aggregation state may also be identified in any form, such as numbers, letters, or a combination of numbers and letters.
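The derivation of the aggregation entry from the received headers can be sketched as follows, using the values from the example above (offsets 3, 4 and 2 for ports 1, 2 and 3). The sketch assumes that sequence numbers on a port run from 1 up to the offset value, which matches the example but is not stated as a general rule; the function name and the dictionary layout are hypothetical.

```python
NOT_AGGREGATED = "not aggregated"
AGGREGATED = "aggregated"
HEADER_MISSING = "not aggregated, and the header has not been received"


def build_aggregation_entry(received: dict) -> dict:
    """received maps port_id -> (offset, set of sequence numbers already seen).
    Returns a mapping (port_id, sequence_number) -> aggregation state."""
    entry = {}
    for port_id, (offset, seen) in received.items():
        # Assumption: a port's sequence numbers are 1..offset.
        for seq in range(1, offset + 1):
            entry[(port_id, seq)] = NOT_AGGREGATED if seq in seen else HEADER_MISSING
    return entry


received = {1: (3, {1, 2}), 2: (4, {1, 3}), 3: (2, {1, 2})}
entry = build_aggregation_entry(received)
```

Keying the entry by the pair (port identifier, sequence number) is one possible choice; it keeps the per-port numbering of the example unambiguous.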
- the processing unit can learn the aggregation state of the first type of data to be aggregated by generating the aggregation entry as shown in Table 3. Further, the processing unit can generate an aggregation command according to the aggregation entry, and the aggregation command is used to instruct the ports to perform data aggregation operations.
- the processing unit determines the packet sequence number of the first type of data to be aggregated according to the aggregation entry, and generates an aggregation command.
- the processing unit can generate an aggregation command based on the packet sequence numbers and port identifiers associated with the unaggregated data. The aggregation command includes at least one packet sequence number of the first type of data to be aggregated.
- the processing unit may generate an aggregation command according to a filtering rule, and the filtering rule is used to filter the packet sequence numbers of the first type of data to be aggregated included in the aggregation command, which specifically includes any one of the following methods:
- Manner 1: In a polling manner, the packet sequence number of at least one piece of first-type data to be aggregated is determined according to the size of the packet sequence number.
- one or more packet sequence numbers of the first type of data to be aggregated may be selected from all the packet sequence numbers of the first type of data to be aggregated in a polling manner and according to the size of the packet sequence numbers.
- Manner 2: Determine at least one packet sequence number of the first type of data to be aggregated according to priority.
- the first type of data to be aggregated may also have a priority identifier, which is carried in the packet, and the priority identifies the priority of the first type of data associated with it.
- for example, when the first type of data generated by a data node is important data in the aggregated data, the priority of that first-type data can be marked as high.
- the message sent by the data node also carries information indicating the priority.
- the processing unit may select one or more pieces of data to be aggregated from the packet sequence numbers of all the first type of data to be aggregated according to the priority of the first type of data to be aggregated.
- Manner 3: Select at least one packet sequence number of the first type of data to be aggregated according to the status of the received packet headers.
- the processing unit can also first determine the received packet sequence numbers, and then use Manner 1 or Manner 2 to select at least one packet sequence number of first-type data from the received packet sequence numbers.
- the processing unit can generate only one aggregation command according to the aggregation entry, where the aggregation command includes the packet sequence numbers of all the first-type data to be aggregated that were selected in any of the above manners; it can also generate multiple aggregation commands, each of which includes one packet sequence number of the first type of data to be aggregated; or it can generate multiple aggregation commands, each of which includes the packet sequence numbers of part of the first type of data to be aggregated.
- the following description takes as an example the case in which the processing unit generates only one aggregation command, and the command includes the packet sequence numbers of all the first type of data to be aggregated that were selected in any of the foregoing manners.
- the aggregation command further includes a port identifier associated with the packet sequence number of the first type of data to be aggregated.
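One way to picture command generation is to combine Manner 3 with an ordering by sequence number: only entries whose headers have been received and that are still unaggregated are selected, then ordered by sequence number. This is a simplified sketch; it does not model the polling of Manner 1 or the priorities of Manner 2, and the function name, the `limit` parameter, and the entry layout are assumptions.

```python
def generate_aggregation_command(entry: dict, limit=None) -> list:
    """entry maps (port_id, sequence_number) -> state string.
    Returns the (port_id, sequence_number) pairs placed in the command."""
    ready = [key for key, state in entry.items() if state == "not aggregated"]
    # Order by sequence number first, then by port identifier.
    ready.sort(key=lambda key: (key[1], key[0]))
    return ready if limit is None else ready[:limit]


entry = {
    (1, 1): "not aggregated",
    (1, 2): "aggregated",
    (1, 3): "not aggregated, and the header has not been received",
    (2, 1): "not aggregated",
}
command = generate_aggregation_command(entry)
first_only = generate_aggregation_command(entry, limit=1)
```

Passing a `limit` corresponds to splitting the work across several aggregation commands instead of a single one.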
- the processing unit sends an aggregation command to the first port.
- the processing unit may use the aggregation loop to send the aggregation command: if the first port is the first port of the aggregation loop, the processing unit sends the aggregation command directly to the first port. After the first port completes the processing of the aggregation command, it sends the aggregation command and its aggregation result to the next adjacent port in the aggregation ring, and that port then completes its own aggregation processing according to the aggregation result of the first port and the aggregation command.
- by analogy, each port forwards the result data of its aggregation operation and the aggregation command to the next adjacent port,
- until the last port in the order of the aggregation loop sends all the result data of the aggregation command to the processing unit. For the specific process, refer to step S308 to step S310.
- when the first port includes the packet sequence number of the first type of data to be aggregated, the first port performs an aggregation operation, and sends the aggregation command and the aggregation result of the first port to the second port.
- after the first port receives a packet from the data node connected to it, it parses the packet to obtain the packet header and the payload data, sends the packet header to the processing unit through the cross-connect network, and stores the packet header and the payload data in the memory of the first port.
- when the first port receives the aggregation command, it processes the command according to the packet sequence numbers of the first type of data to be aggregated in the aggregation command. Specifically, the first port first determines whether its memory includes a packet sequence number of the first type of data to be aggregated; then it determines the payload data associated with that packet sequence number; finally, it performs the aggregation operation on the payload data.
- the first port may also first determine, according to the port identifiers associated with the packet sequence numbers in the aggregation command, whether the aggregation command includes the identifier of the first port; then determine whether the memory of the first port includes the packet sequence number of the first type of data to be aggregated; then determine the payload data associated with the packet sequence number; and finally perform the aggregation operation on the payload data.
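The per-port handling described above (match the port identifier, look up the sequence number in memory, fetch the payload, aggregate) can be sketched as follows. The aggregation operation is assumed to be a sum, and the class `Port` with its methods `receive_packet` and `execute` is a hypothetical model, not the switch's actual interface.

```python
class Port:
    def __init__(self, port_id: int):
        self.port_id = port_id
        self.memory = {}  # sequence number -> payload data

    def receive_packet(self, seq: int, payload: float) -> dict:
        """Store the payload and return the header sent to the processing unit."""
        self.memory[seq] = payload
        return {"port_id": self.port_id, "sequence_number": seq}

    def execute(self, command: list, upstream=None):
        """command is a list of (port_id, sequence_number) pairs."""
        wanted = [seq for pid, seq in command if pid == self.port_id]
        payloads = [self.memory[seq] for seq in wanted if seq in self.memory]
        if not payloads:
            return upstream  # nothing to add: forward the upstream result as is
        local = sum(payloads)  # assumption: aggregation is a sum
        return local if upstream is None else upstream + local


p1, p2, p3 = Port(1), Port(2), Port(3)
p1.receive_packet(1, 10.0)
p1.receive_packet(2, 20.0)
p2.receive_packet(1, 5.0)
command = [(1, 1), (1, 2), (2, 1), (2, 3)]
r1 = p1.execute(command)
r2 = p2.execute(command, r1)
r3 = p3.execute(command, r2)
```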
- the operation performed by each port in the aggregation ring according to the aggregation command may also be referred to as the third operation of the distributed computing task,
- and the result data that each port obtains by performing the third operation may also be called the result data of the third operation.
- after the first port executes the aggregation command, it sends the aggregation command and the aggregation result of the first port to the port adjacent to the first port in the aggregation loop (for example, the second port), and the second port continues the aggregation operation according to the aggregation command and the aggregation result of the first port.
- for example, the aggregation command generated with reference to the aggregation entry shown in Table 3 includes the packet sequence numbers 1 and 2 of the port with port identifier 1, and the packet sequence numbers 1 and 3 of the port with port identifier 2.
- the result obtained by the first port can also be called the aggregation result obtained by the first port according to the aggregation command, the aggregation result of the first port, the result data of the first port executing the aggregation command, or the result data of the third operation performed by the first port.
- when the first port does not include the packet sequence number of the first type of data to be aggregated by the aggregation command, the first port can directly send the aggregation command to the second port.
- in this case, the aggregation result of the first port is zero or empty.
- when the second port includes the packet sequence number of the first type of data to be aggregated, the second port performs an aggregation operation, and sends the aggregation command and the aggregation result of the second port to the third port.
- the second port may also search its memory for matching payload data according to the packet sequence numbers of the first type of data to be aggregated in the aggregation command. Specifically, the second port first determines whether its memory includes a packet sequence number of the first type of data to be aggregated; then it determines the first-type data to be aggregated, that is, the payload data associated with the packet sequence number; and then it performs the aggregation operation on the payload data. When the second port performs the aggregation operation, it first needs to determine whether the first port has sent an aggregation result generated according to the aggregation command.
- if so, the second port performs the aggregation operation on the aggregation result of the first port together with the first type of data to be aggregated stored in its memory to obtain the aggregation result of the second port; that is, the second port performs the aggregation operation on the basis of the aggregation result of the first port.
- for example, take the aggregation command generated with reference to the aggregation entry shown in Table 3, which includes the packet sequence numbers 1 and 2 of the port with port identifier 1, the packet sequence numbers 1 and 3 of the port with port identifier 2, and the packet sequence numbers 1 and 2 of the port with port identifier 3. When the first port has aggregated the first type of data associated with packet sequence numbers 1 and 2 according to the aggregation command and obtained the aggregation result of the first port, it sends this aggregation result and the aggregation command to the second port.
- the second port then aggregates its first type of data according to this aggregation result and the aggregation command to obtain the aggregation result of the second port.
- the aggregation result of the second port is therefore the aggregated result of the first type of data associated with packet sequence numbers 1 and 2 of port 1 and the first type of data associated with packet sequence numbers 1 and 3 of port 2.
- the aggregation result obtained by the second port according to the aggregation command may also be referred to as the aggregation result of the second port, the aggregation result obtained by the second port executing the aggregation command, or the result data obtained by the second port executing the third operation.
- when the second port does not include the packet sequence number of the first type of data to be aggregated by the aggregation command, the second port can directly send the aggregation command and the aggregation result of the first port to the third port.
- in this case, the aggregation result of the second port can be considered to be zero or empty.
- when the second port has also not received an aggregation result from the first port, the second port can directly send the aggregation command to the third port; in this case, neither the first port nor the second port includes the packet sequence number of the first type of data to be aggregated by the aggregation command, and the aggregation result of the first port and the aggregation result of the second port are both zero or empty.
- when the third port includes the packet sequence number of the first type of data to be aggregated, the third port performs an aggregation operation, and sends the aggregation result of the third port to the processing unit.
- the third port is the last port in the order of the aggregation ring, for example, port 1222 shown in FIG. 3. Similar to step S309 above, the third port determines whether its memory holds a packet sequence number of the first type of data to be aggregated indicated in the aggregation command, and whether an aggregation result from the previous adjacent port in the aggregation loop exists; it then performs the aggregation operation and sends the result data of the aggregation operation to the processing unit.
- the aggregation result that the third port obtains by performing the aggregation operation on the first type of data to be aggregated stored in the third port and the aggregation result of the second port can be called the aggregation result of the third port,
- the aggregation result obtained by the third port executing the aggregation command, or the result data of the third port executing the aggregation command.
- each port in the aggregation ring does not store the aggregation result of the first type of data that the port aggregates according to the aggregation command.
- instead, the port sends the aggregation result of the aggregation command to the next port adjacent to it in the aggregation ring.
- when the processing unit determines that the aggregation operation of all the first type of data has not been completed, the processing unit generates a new aggregation command according to the aggregation entry, repeats the operations of steps S306 to S310, and determines the aggregation result of all the first type of data according to the aggregation results of at least two aggregation commands.
- when the processing unit obtains the aggregation result of the last port in the aggregation loop in step S310, it can update the aggregation state of the packet sequence numbers in Table 3, and determine according to the updated result whether the aggregation of all the first type of data has been completed.
- the processing unit may perform the aggregation operation again on the aggregation results of the multiple aggregation commands, thereby obtaining the aggregation results of all the first-type data.
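The repeat-until-complete behaviour of the processing unit can be sketched as a loop that issues aggregation commands, marks the covered sequence numbers as aggregated, and finally combines the per-command results. The sketch assumes a sum as the aggregation operation and assumes that every header has already been received; entries in any other state would simply never be selected. The names `aggregate_all`, `run_command`, and `batch` are hypothetical.

```python
def aggregate_all(entry: dict, run_command, batch: int = 2) -> float:
    """entry maps (port_id, sequence_number) -> state string.
    run_command(keys) stands in for one pass around the aggregation loop and
    returns the tail port's result for those keys."""
    partial_results = []
    while True:
        pending = sorted(k for k, s in entry.items() if s == "not aggregated")
        if not pending:
            break  # aggregation of all first-type data is complete
        command = pending[:batch]
        partial_results.append(run_command(command))
        for key in command:
            entry[key] = "aggregated"  # update the aggregation state
    # Aggregate the results of the individual commands once more.
    return sum(partial_results)


payloads = {(1, 1): 10.0, (1, 2): 20.0, (2, 1): 5.0}
entry = {key: "not aggregated" for key in payloads}
total = aggregate_all(entry, lambda keys: sum(payloads[k] for k in keys))
```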
- the above steps S301 to S311 can also be referred to as a data aggregation process.
- through the data distribution process, the processing unit can send the aggregation result of all the first type of data through the distribution loop to the ports to which the data nodes participating in the distributed computing task are connected,
- and the aggregation result of all the first type of data is then sent through these ports to the data nodes participating in the distributed computing task, so that these data nodes can continue to complete the other operations of the distributed computing task.
- for the specific process, refer to the description of steps S312 to S313.
- the processing unit sends the aggregation result of all the first type data to the first port through the distribution loop.
- the first port sends the aggregation result of all the first type of data to the second port.
- the second port sends the aggregation result of all the first type of data to the third port.
- the distribution loop is a path in the switch for sending the aggregated results of all the first type of data, and is a data transmission loop formed by at least two ports sorted according to preset rules.
- the switch shown in FIG. 3 includes two distribution loops.
- Distribution loop 1 is processing unit 110-port 1201-port 1202...-port 1211
- distribution loop 2 is processing unit 110-port 1222-port 1221- ... - port 1212.
- when the processing unit determines that the aggregation of all the data of the first type has been completed, the aggregation result of all the data of the first type can be distributed through the distribution loop to the ports connected to the data nodes participating in the distributed computing, and then transmitted to those data nodes.
- the processing unit clears the aggregation command table entry and the aggregation result cache.
- the processing unit may clear the aggregated result cache and delete the aggregated command entry, thereby freeing the storage space of the processing unit.
- with the aggregation method provided by the present application, the switch can directly perform the data aggregation operation during data transmission, avoiding problems such as occupied network resources, low transmission rate, and prolonged processing time that are caused in the traditional technology when the aggregation operation is performed by a dedicated aggregation node, and thereby improving the efficiency of aggregation processing.
- the processing unit and each port in the switch can perform aggregation operations on the data to be aggregated in a distributed manner, avoiding the performance bottleneck problem caused by the aggregation operation performed by a single subject, and further reducing the latency of aggregation processing.
- the transmission bandwidth of distributed computing can be greatly improved.
- the aggregation results of the first type of data are stored only in the aggregation result cache of the processing unit. A port does not need to cache the aggregation results of part of the data of the same type during data processing; it needs to store the aggregation result of all data of the same type only after the aggregation operation on all data of the same type has been completed, which greatly reduces the capacity required for cached data in the port.
- in addition to using the aggregation loop and the distribution loop to transmit data, the cross-connect network 130 can also be used to directly implement data transmission between the processing unit and each port.
- the cross-connect network 130 is not only used for realizing the transmission of packet headers between the processing unit and each port, but also for transmitting the aggregation command generated by the processing unit and the processing result of each port executing the aggregation command.
- the above implementation manner can also realize the process of data aggregation realized by the switch during data transmission, thereby avoiding the problems of long time and low efficiency caused by the aggregation operation performed by a single aggregation node in the traditional technology.
- the distributed computing task can also be performed only by the processing unit of the switch. That is, after each port receives from a data node the packet carrying the first type of data to be aggregated and the packet header, the port sends the packet to the processing unit, which parses the packet, obtains the packet sequence number in the packet header, and performs the aggregation operation according to the packet sequence number.
- each port can also complete the packet parsing process, and send the packet header and the static payload data to the processing unit respectively, and then the processing unit performs the aggregation operation.
- the above process can also achieve the purpose of performing aggregation operations by the switch during the data transmission process, thereby improving the efficiency of data processing.
- the switch 500 provided by the present application can complete the distributed computing task together with the data nodes, so that the switch can complete the operation of the distributed computing task during the data transmission process, which improves the efficiency and speed of data processing.
- aggregation processing and distribution processing are respectively performed through the aggregation loop and the distribution loop, so as to avoid occupying the transmission bandwidth of other types of data, which can greatly improve the transmission bandwidth of distributed computing.
- FIG. 5 is a schematic structural diagram of a switch 500 provided by the present application. As shown in the figure, the switch 500 is used to connect at least two data nodes, and the at least two data nodes are used to respectively perform a first operation of a distributed computing task. The switch 500 includes a first processing unit 501, wherein
- a first processing unit 501 configured to receive the result data of the first operation sent by the at least two data nodes; perform the second operation of the distributed computing task according to the received result data of the first operation, obtaining result data of the second operation; and distribute the result data of the second operation.
- the switch 500 further includes at least two ports, each port is connected to a data node, and each port includes a receiving unit 502 and a sending unit 503, wherein
- the receiving unit 502 is configured to receive the result data of the first operation sent by the connected data node; and
- the sending unit 503 is configured to forward the result data of the first operation to the first processing unit 501.
- each port further includes a second processing unit 504, configured to perform the third operation of the distributed computing task on the result data of the first operation before the sending unit 503 forwards the result data of the first operation to the first processing unit 501.
- the distributed computing tasks include distributed artificial intelligence computing tasks or distributed high-performance computing tasks or distributed graphics computing tasks.
- the second operation or the third operation of the distributed computing task includes an operation of aggregating data of the same type.
- the switch is an access switch or an aggregation switch.
- the first processing unit 501 is further configured to send an operation command to the at least two ports, where the operation command is used to instruct the second processing units 504 of the at least two ports to respectively perform the third operation.
- the at least two ports are sorted according to a preset rule to form a first loop, and the order of the first loop indicates an order in which the at least two ports receive or execute the operation command.
- after a port earlier in the order of the first loop performs the third operation according to the operation command, the result data of the third operation and the operation command are forwarded to the adjacent subsequent port, until the last port in the order of the first loop sends all the result data of the third operation to the first processing unit 501.
- the first processing unit 501 is further configured to: before sending the operation command, receive packet headers respectively sent by the at least two ports, where each packet header includes a data type and a packet sequence number; establish an operation table entry according to the packet headers, where the operation table entry records the data to be processed and the processed data in each type of data; and send the operation command according to the operation table entry.
- the packet header further includes a port identifier, and the operation table entry is further used to record the port identifier corresponding to each piece of data to be processed in each type of data.
- the switch includes at least one of the first loops.
- the switch is further configured to establish a second loop
- the first processing unit 501 is further configured to distribute result data of the second operation through the second loop.
- the switch includes at least one of the second loops.
- the first processing unit 501 and the second processing unit 504 in this embodiment of the present application may each be implemented by an application-specific integrated circuit (ASIC) or a programmable logic device (PLD).
- the above-mentioned PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
- the first processing unit 501 , the second processing unit 504 and their respective modules can also be software modules.
- the switch 500 may correspond to the execution of the methods described in the embodiments of the present application, and the above and other operations and/or functions of the units in the switch 500 respectively implement the corresponding processes of the methods in FIG. 4. For brevity, details are not repeated here.
- the switch 500 provided by the present application can complete the distributed computing task together with the data nodes, so that the switch can complete the operation of the distributed computing task during the data transmission process, which improves the efficiency and speed of data processing.
- aggregation processing and distribution processing are respectively performed through the aggregation loop and the distribution loop, so as to avoid occupying the transmission bandwidth of other types of data, which can greatly improve the transmission bandwidth of distributed computing.
- FIG. 6 is a schematic structural diagram of another switch 600 provided by the present application.
- the switch 600 includes a first processor 601 and at least two ports 602, wherein each port 602 is used for connecting, through the network 603, with a data node participating in the distributed computing task, wherein
- a first processor 601 configured to respectively receive the result data of the first operation sent by the at least two data nodes, and perform the second operation of the distributed computing task according to the received result data of the first operation , obtain the result data of the second operation, and distribute the result data of the second operation.
- the first processor 601 may be a CPU, or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
- a general purpose processor may be a microprocessor or any conventional processor or the like.
- the network 603 may be a bus, and the bus may include a power bus, a control bus, a status signal bus, and the like in addition to a data bus.
- the first processor 601 may be configured to implement the functions of the computing unit 112 , the command generating unit 13 , and the message header management unit 114 in the processing unit 110 shown in FIG. 2 , which will not be repeated here for brevity.
- the first processor 601 further includes a memory (not shown in the figure), and the memory is used to provide commands and data to the first processor 601, so that the first processor can perform the operation steps of the method shown in FIG. 4.
- the memory may include read-only memory and random access memory, and the memory may also include non-volatile random access memory.
- a memory may also be included outside the first processor 601 to provide commands and data to the first processor 601, so that the first processor may execute the operation steps of the method shown in FIG. 4 .
- each port 602 includes a second processor 6021 and a memory 6022, wherein the second processor 6021 may be a CPU, or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
- a general purpose processor may be a microprocessor or any conventional processor or the like.
- Each port 602 may be used to implement the operation steps of the method performed by the first port or the second port or the third port in the method shown in FIG. 4 , which will not be repeated here for brevity.
- the switch 600 may correspond to the switch 500 in the embodiment of the present application and to the corresponding execution body of the method shown in FIG. 4, and the above and other operations and/or functions of the modules in the switch 600 respectively implement the corresponding processes of the methods in FIG. 4. For brevity, details are not repeated here.
- the switch 600 provided by the present application can complete the distributed computing task together with the data node, so that the switch can complete the operation of the distributed computing task during the data transmission process, which improves the efficiency and speed of data processing.
- aggregation processing and distribution processing are respectively performed through the aggregation loop and the distribution loop, so as to avoid occupying the transmission bandwidth of other types of data, which can greatly improve the transmission bandwidth of distributed computing.
- the present application also provides a data processing system. The system includes a switching network and at least two data nodes that are connected to the switching network and respectively perform a first operation of a distributed computing task. The switching network includes at least one switch, and each switch includes the first processor and the at least two ports shown in FIG. 6, which implement the functions of the corresponding execution bodies in the method shown in FIG. 4. For brevity, details are not repeated here.
- the system can realize distributed computing tasks. In the process of data transmission, switches perform operations of distributed computing tasks, thereby improving the efficiency of data processing and reducing the delay of data processing.
- the above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
- the above-described embodiments may be implemented in whole or in part in the form of a computer program product.
- the computer program product includes one or more computer commands. When the computer program commands are loaded or executed on a computer, all or part of the processes or functions described in the embodiments of the present application are generated.
- the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
- the computer commands may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer commands may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave).
- the computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that contains one or more sets of available media.
- the usable media may be magnetic media (eg, floppy disks, hard disks, magnetic tapes), optical media (eg, DVDs), or semiconductor media.
- the semiconductor medium may be a solid state drive (SSD).
Abstract
A switch connected to at least two data nodes. The switch receives the result data of a first operation of a distributed computing task sent by each of the at least two data nodes, performs a second operation of the distributed computing task according to the received result data of the first operation to obtain result data of the second operation, and distributes the result data of the second operation. The data nodes and the switch thus jointly perform the operations of a distributed computing task, which improves data processing efficiency and reduces processing delay.
Description
The present application relates to the field of communications, and in particular, to a switch and a data processing system.
In scenarios such as artificial intelligence (AI) parameter training, which simulates human thought processes and intelligent behaviors (such as training and reasoning), and high performance computing (HPC), which uses aggregated computing power to handle computation-intensive tasks, it is often necessary to aggregate data of the same type from multiple data nodes (a data node is usually a server). Examples include the aggregation computer program all_reduce() used in AI parameter training and the message passing interface (MPI) aggregation computer program MPI_all_reduce() used in high performance computing. Such aggregation processing is performed by an independent aggregation node, which may be a standalone server. However, as the number of data nodes grows and the amount of data to be aggregated that they generate increases, the amount of data that the data nodes must transmit to the aggregation node also increases, and the aggregation node must process more data. As a result, the data transmission bandwidth of the system cannot meet requirements, and the delay of data aggregation processing increases. How to reduce the processing delay of data aggregation has therefore become an urgent technical problem.
SUMMARY OF THE INVENTION
The present application provides a switch, an apparatus, and a system for data processing, so as to provide a low-latency data processing method and improve the efficiency of data processing.
In a first aspect, the present application provides a switch. The switch is connected to at least two data nodes, and the at least two data nodes respectively perform a first operation of a distributed computing task. The switch is configured to receive the result data of the first operation sent by the at least two data nodes, perform a second operation of the distributed computing task according to the received result data of the first operation to obtain result data of the second operation, and distribute the result data of the second operation to the at least two data nodes. As described above, the switch and the data nodes can jointly complete the distributed computing task; that is, operations of the distributed computing task are performed while the switch transmits data, which avoids the low efficiency caused by having a separate node perform the second operation and thereby improves the efficiency of data processing. In addition, because the switch completes the second operation of the distributed computing task during data transmission, no separate node needs to be deployed, which reduces the cost of the system.
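The receive, aggregate, and distribute sequence of the first aspect can be sketched in a few lines. This is an illustrative sketch only, not the claimed implementation: the function name switch_all_reduce, the node identifiers, and the choice of element-wise summation as the second operation are all assumptions; the application only states that the second operation may aggregate data of the same type.

```python
from typing import Dict, List


def switch_all_reduce(first_op_results: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Aggregate the per-node results and hand every node a copy of the total.

    Keys are hypothetical node identifiers; values are the result data of the
    first operation that each data node sent to the switch.
    """
    if len(first_op_results) < 2:
        raise ValueError("the switch is connected to at least two data nodes")
    vectors = list(first_op_results.values())
    if any(len(v) != len(vectors[0]) for v in vectors):
        raise ValueError("all nodes must send result data of the same length")
    # Second operation (assumed here to be an element-wise sum).
    aggregated = [sum(column) for column in zip(*vectors)]
    # Distribution: every data node receives its own copy of the result.
    return {node: list(aggregated) for node in first_op_results}


results = switch_all_reduce({"node_a": [1.0, 2.0], "node_b": [3.0, 4.0]})
```

In this sketch no separate aggregation server appears: the aggregation happens at the point where the data already passes through, which is the property the first aspect relies on.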
As a possible implementation, the switch includes a processing unit and at least two ports. Each port is connected to one data node and is configured to receive the result data of the first operation sent by the connected data node and forward the result data of the first operation to the processing unit. The processing unit of the switch thus performs the second operation of the distributed computing task, which reduces the delay of data processing.
As another possible implementation, before forwarding the result data of the first operation to the processing unit, each port is further configured to perform a third operation of the distributed computing task on the result data of the first operation. That is, each port may also perform the third operation of the distributed computing task before forwarding the result data of the first operation to the processing unit, which speeds up data processing.
As another possible implementation, the distributed computing task includes a distributed artificial intelligence computing task, a distributed high-performance computing task, a distributed graphics computing task, or a distributed cloud computing task.
As another possible implementation, the second operation or the third operation of the distributed computing task includes an operation of aggregating data of the same type.
As another possible implementation, the switch is an access switch or an aggregation switch.
As another possible implementation, the processing unit is further configured to send an operation command to the at least two ports, where the operation command instructs the at least two ports to respectively perform the third operation of the distributed computing task. Through the operation command, the processing unit can instruct the ports connected to the data nodes that perform the distributed computing task to respectively perform the third operation, so that the switch performs operations of the distributed computing task during data transmission, which improves the efficiency of data processing.
As another possible implementation, the processing unit sends the operation command, through a first loop, to the at least two ports connected to the data nodes that perform the distributed computing task. The first loop includes the at least two ports sorted according to a preset rule, and the order of the first loop indicates the order in which the at least two ports receive or execute the operation command. Transmitting the operation command and its result data through the first loop avoids the impact of data aggregation processing on other types of data processing. Moreover, the bandwidth of the first loop can be configured according to service requirements, which ensures the performance of data processing.
As another possible implementation, after a port earlier in the order of the first loop performs the third operation according to the operation command, it forwards the result data of the third operation together with the operation command to the adjacent subsequent port, until the last port in the order of the first loop sends all the result data of the third operation to the processing unit. Over the data communication path of the first loop, the operation command is executed by each port in the first loop in turn, and each port sends the result data of its execution of the first operation to the adjacent subsequent port, until the last port in the first loop has completed processing of the operation command. Each port thus completes operations of the distributed computing task according to the operation command, which speeds up data processing.
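The hop-by-hop behaviour of the first loop can be illustrated with a small simulation. It is a sketch under stated assumptions rather than the application's implementation: the class names Port and OperationCommand, the function run_first_loop, ascending port identifier as the preset sorting rule, and element-wise addition as the third operation are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class OperationCommand:
    data_type: str        # which type of data to aggregate
    sequence_number: int  # which packet of that type


@dataclass
class Port:
    port_id: int
    # Data received from the connected node, keyed by (data type, sequence number).
    buffered: Dict[Tuple[str, int], List[float]]

    def execute(self, command: OperationCommand,
                partial: Optional[List[float]]) -> List[float]:
        """Third operation (assumed): add local data to the incoming partial result."""
        local = self.buffered[(command.data_type, command.sequence_number)]
        if partial is None:  # first port in the loop has nothing to merge yet
            return list(local)
        return [a + b for a, b in zip(partial, local)]


def run_first_loop(ports: List[Port], command: OperationCommand) -> List[float]:
    """Pass the command and the growing partial result from port to port."""
    ordered = sorted(ports, key=lambda p: p.port_id)  # assumed preset rule
    partial: Optional[List[float]] = None
    for port in ordered:
        partial = port.execute(command, partial)
    # The last port in the loop hands the full result to the processing unit.
    return partial if partial is not None else []


key = ("gradient", 0)
ports = [Port(3, {key: [1.0, 2.0]}), Port(1, {key: [10.0, 20.0]}), Port(2, {key: [100.0, 200.0]})]
total = run_first_loop(ports, OperationCommand("gradient", 0))
```

Because each port only merges its own data into the value it received from its predecessor, the processing unit sees a single fully aggregated result instead of one packet per port.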
As another possible implementation, the processing unit is further configured to: before sending the operation command, receive packet headers respectively sent by the at least two ports connected to the data nodes that perform the distributed computing task, where each packet header includes a data type and a packet sequence number; establish an operation table entry according to the packet headers, where the operation table entry records the data to be processed and the processed data in each type of data; and send the operation command according to the operation table entry. The processing unit thus issues the operation command to the at least two ports according to the processing status of each type of data, and each port then completes the processing of the distributed computing task according to the operation command.
As another possible implementation, the packet header further includes a port identifier, and the operation table entry further records the port identifier corresponding to each piece of data to be processed in each type of data.
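One way to picture the operation table entry is as a per-type record of which sequence numbers are still pending, which ports hold them, and which have already been processed. The sketch below is hypothetical: the names PacketHeader, OperationEntry, and OperationTable are invented for illustration, and the policy of issuing a command only once every expected port has reported a header is an assumption, since the application does not specify when a command is sent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class PacketHeader:
    data_type: str
    sequence_number: int
    port_id: int


@dataclass
class OperationEntry:
    # Sequence number -> identifiers of the ports holding unprocessed data.
    pending: Dict[int, Set[int]] = field(default_factory=dict)
    # Sequence numbers whose data has already been processed.
    processed: Set[int] = field(default_factory=set)


class OperationTable:
    def __init__(self, expected_ports: int) -> None:
        self.expected_ports = expected_ports
        self.entries: Dict[str, OperationEntry] = {}

    def record(self, header: PacketHeader) -> None:
        """Register a header reported by a port as data to be processed."""
        entry = self.entries.setdefault(header.data_type, OperationEntry())
        entry.pending.setdefault(header.sequence_number, set()).add(header.port_id)

    def ready_commands(self) -> List[Tuple[str, int]]:
        """Return (data type, sequence number) pairs reported by every port.

        Assumed policy: a command is issued once all expected ports have
        reported the same type and sequence number.
        """
        ready = [
            (data_type, seq)
            for data_type, entry in self.entries.items()
            for seq in sorted(entry.pending)
            if len(entry.pending[seq]) == self.expected_ports
        ]
        for data_type, seq in ready:  # move issued items from pending to processed
            del self.entries[data_type].pending[seq]
            self.entries[data_type].processed.add(seq)
        return ready


table = OperationTable(expected_ports=2)
table.record(PacketHeader("gradient", 0, port_id=1))
first = table.ready_commands()   # only one of two ports has reported
table.record(PacketHeader("gradient", 0, port_id=2))
second = table.ready_commands()  # both ports have reported
```

Keeping the port identifiers in the pending set is what lets the processing unit address the command to exactly the ports that hold data for that entry.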
As another possible implementation, the switch includes at least one first loop.
As another possible implementation, the switch is further configured to establish a second loop and distribute the result data of the second operation through the second loop. The switch may further include a second loop, which is used to distribute the result data of the second operation, so that distributing the result data of the second operation does not affect other types of operations. In the second loop, each port obtains the result data of the second operation in turn. For example, in a distributed computing task that aggregates data, the result data of the second operation is the aggregation result of all data of the same type. The switch can send this aggregation result through the second loop to the ports connected to the data nodes that perform the distributed computing task, and each port then sends the aggregation result to its data node. Operations of the distributed computing task are thus performed during data transmission, which improves the efficiency of data processing.
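Distribution over the second loop can be sketched in the same spirit. The function name distribute_via_second_loop, the port-to-node mapping, and the representation of the loop as an ordered list of port identifiers are assumptions made for illustration; the application only states that the ports obtain the result data of the second operation in turn.

```python
from typing import Dict, List


def distribute_via_second_loop(loop_order: List[int],
                               port_to_node: Dict[int, str],
                               result: List[float]) -> Dict[str, List[float]]:
    """Walk the second loop and deliver a copy of the result at every port.

    loop_order lists hypothetical port identifiers in the order in which the
    ports obtain the result data of the second operation.
    """
    delivered: Dict[str, List[float]] = {}
    for port_id in loop_order:
        node = port_to_node[port_id]    # data node attached to this port
        delivered[node] = list(result)  # the port sends the result to its node
    return delivered


delivered = distribute_via_second_loop(
    loop_order=[2, 1],
    port_to_node={1: "node_a", 2: "node_b"},
    result=[4.0, 6.0],
)
```

The loop order here is deliberately different from ascending port order to show that the second loop need not follow the same sequence as the first loop.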
As another possible implementation, the first loop and the second loop may be the same.
As another possible implementation, the first loop and the second loop may also be different.
In a second aspect, the present application provides a data processing method. The method is performed by a switch connected to at least two data nodes, each of which is configured to perform a first operation of a distributed computing task. The data processing procedure includes: the switch respectively receives the result data of the first operation sent by the at least two data nodes; performs a second operation of the distributed computing task according to the received result data of the first operation to obtain result data of the second operation; and distributes the result data of the second operation. As described above, the switch performs operations of the distributed computing task during data transmission, which improves the efficiency of data processing.
In a possible implementation, the switch includes a processing unit and at least two ports. Each port is connected to one data node and is configured to receive the result data of the first operation sent by the connected data node and forward the result data of the first operation to the processing unit. The processing unit of the switch thus performs the second operation of the distributed computing task, which reduces the delay of data processing.
As another possible implementation, before forwarding the result data of the first operation to the processing unit, each port is further configured to perform a third operation of the distributed computing task on the result data of the first operation. That is, each port may also perform the third operation of the distributed computing task before forwarding the result data of the first operation to the processing unit, which speeds up data processing.
As another possible implementation, the distributed computing task includes a distributed artificial intelligence computing task, a distributed high-performance computing task, a distributed graphics computing task, or a distributed cloud computing task.
As another possible implementation, the second operation or the third operation of the distributed computing task includes an operation of aggregating data of the same type.
As another possible implementation, the switch is an access switch or an aggregation switch.
As another possible implementation, the processing unit is further configured to send an operation command to the at least two ports, where the operation command instructs the at least two ports to respectively perform the third operation of the distributed computing task. Through the operation command, the processing unit can instruct the ports connected to the data nodes that perform the distributed computing task to respectively perform the third operation, so that the switch performs operations of the distributed computing task during data transmission, which improves the efficiency of data processing.
As another possible implementation, the processing unit sends the operation command, through a first loop, to the at least two ports connected to the data nodes that perform the distributed computing task. The first loop includes the at least two ports sorted according to a preset rule, and the order of the first loop indicates the order in which the at least two ports receive or execute the operation command. Transmitting the operation command and its result data through the first loop avoids the impact of data aggregation processing on other types of data processing. Moreover, the bandwidth of the first loop can be configured according to service requirements, which ensures the performance of data processing.
As another possible implementation, after a port earlier in the order of the first loop performs the third operation according to the operation command, it forwards the result data of the third operation together with the operation command to the adjacent subsequent port, until the last port in the order of the first loop sends all the result data of the third operation to the processing unit. Over the data communication path of the first loop, the operation command is executed by each port in the first loop in turn, and each port sends the result data of its execution of the first operation to the adjacent subsequent port, until the last port in the first loop has completed processing of the operation command. Each port thus completes operations of the distributed computing task according to the operation command, which speeds up data processing.
As another possible implementation, the processing unit is further configured to: before sending the operation command, receive packet headers respectively sent by the at least two ports connected to the data nodes that perform the distributed computing task, where each packet header includes a data type and a packet sequence number; establish an operation table entry according to the packet headers, where the operation table entry records the data to be processed and the processed data in each type of data; and send the operation command according to the operation table entry. The processing unit thus issues the operation command to the at least two ports according to the processing status of each type of data, and each port then completes the processing of the distributed computing task according to the operation command.
As another possible implementation, the packet header further includes a port identifier, and the operation table entry further records the port identifier corresponding to each piece of data to be processed in each type of data.
As another possible implementation, the switch includes at least one first loop.
As another possible implementation, the switch is further configured to establish a second loop and distribute the result data of the second operation through the second loop. The switch may further include a second loop, which is used to distribute the result data of the second operation, so that distributing the result data of the second operation does not affect other types of operations. In the second loop, each port obtains the result data of the second operation in turn. For example, in a distributed computing task that aggregates data, the result data of the second operation is the aggregation result of all data of the same type. The switch can send this aggregation result through the second loop to the ports connected to the data nodes that perform the distributed computing task, and each port then sends the aggregation result to its data node. Operations of the distributed computing task are thus performed during data transmission, which improves the efficiency of data processing.
作为另一种可能的实现方式,第一环路和第二环路可以相同。As another possible implementation, the first loop and the second loop may be the same.
作为另一种可能的实现方式,第一环路和第二环路也可以不同。As another possible implementation manner, the first loop and the second loop may also be different.
In a third aspect, this application provides a data processing apparatus that includes modules for performing the data processing method in the second aspect or in any possible implementation of the second aspect.
In a fourth aspect, this application provides a data processing system. The system includes a switching network and at least two data nodes connected to the switching network. The at least two data nodes are each configured to perform a first operation of a distributed computing task. The switching network includes at least one switch, and the at least one switch is configured to receive the result data of the first operation sent by each of the at least two data nodes, perform a second operation of the distributed computing task according to the received result data of the first operation to obtain result data of the second operation, and distribute the result data of the second operation.
In a possible implementation, each switch includes a first processor and at least two ports, each port is connected to a data node that executes the distributed computing task, and the first processor and each port are configured to perform the operation steps of the method described in any possible implementation of the second aspect.
In a fifth aspect, this application provides a computer-readable storage medium that stores instructions which, when run on a computer, cause the computer to perform the methods described in the foregoing aspects.
In a sixth aspect, this application provides a computer program product containing instructions which, when run on a computer, cause the computer to perform the methods described in the foregoing aspects.
The implementations provided in the foregoing aspects may be further combined in this application to provide more implementations.
FIG. 1 is a schematic diagram of aggregation processing according to an embodiment of this application;
FIG. 2 is a schematic architectural diagram of a data processing system 100 according to this application;
FIG. 3 is a schematic structural diagram of a switch according to an embodiment of this application;
FIG. 4 is a schematic flowchart of a data processing method according to an embodiment of this application;
FIG. 5 is a schematic structural diagram of a switch 500 according to an embodiment of this application;
FIG. 6 is a schematic structural diagram of another switch 600 according to an embodiment of this application.
To solve the problem of high data processing latency in conventional technology, this application proposes a data processing method in which a computing task is performed jointly by data nodes and the switches connected to them, which improves data processing efficiency. For ease of description, a computing task performed jointly by data nodes and switches is also called a distributed computing task. A data node may be a node in the form of a computing device (for example, a server), or a node in a virtualized form such as a virtual machine or a container. In the latter case, the virtual machine or container may be deployed on at least one computing device (for example, a server), and each computing device is connected to a switch. Distributed computing tasks include distributed artificial intelligence (AI) computing tasks, distributed high-performance computing (HPC) tasks, distributed graphics computing tasks, distributed cloud computing tasks, and other computing tasks that can be processed in a distributed manner. A distributed cloud computing task is a task in which, in an artificial intelligence, high-performance computing, graphics computing, or other scenario, data nodes in a virtualized form such as virtual machines or containers perform the computing task of that scenario jointly with switches. Optionally, in a distributed cloud computing task, not only the data nodes but also the switches may be implemented in a virtualized form.
For ease of description, the operation of the distributed computing task performed by a data node is also called the first operation, and the operation of the distributed computing task performed by a switch is also called the second operation.
For example, in application scenarios such as artificial intelligence, high-performance computing, and graphics computing, some computing tasks can be processed jointly by data nodes and the switches connected to them. Take a distributed computing task of data aggregation as an example, where aggregation means accumulating data of the same category. In an artificial intelligence or high-performance computing scenario, the data nodes can generate data of the same category, and this generation is the first operation of the distributed computing task. Data of the same category includes parameters of the same kind in an algorithm used in an artificial intelligence scenario, or data of the same kind generated by a compute-intensive task; what counts as the same category can be set according to the application scenario and service requirements. The switch can perform a data aggregation operation on the same-category data generated by the data nodes to obtain the aggregation result, and this aggregation is the second operation.
FIG. 1 is a schematic diagram of an aggregation operation according to this application. As shown in the figure, data node A, data node B, and data node C each generate three categories of parameters: data node A generates A0, A1, and A2; data node B generates B0, B1, and B2; and data node C generates C0, C1, and C2. Assuming that data with the same numeric suffix belongs to the same category, the aggregation results obtained after the switch performs the aggregation operation are A0+B0+C0, A1+B1+C1, and A2+B2+C2.
Next, taking data aggregation as the distributed computing task, the technical solutions of this application are described in detail with reference to the accompanying drawings.
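The aggregation of FIG. 1 can be reproduced as a small Python sketch. The numeric values and the function name `aggregate_by_category` are illustrative assumptions; only the rule that data with the same suffix is summed across nodes comes from the text.

```python
from collections import defaultdict


def aggregate_by_category(node_outputs):
    """Sum the values that share a category index across all data nodes."""
    totals = defaultdict(float)
    for values in node_outputs.values():
        for category, value in values.items():
            totals[category] += value
    return dict(totals)


# Illustrative values standing in for A0..A2, B0..B2 and C0..C2.
node_outputs = {
    "A": {0: 1.0, 1: 2.0, 2: 3.0},
    "B": {0: 10.0, 1: 20.0, 2: 30.0},
    "C": {0: 100.0, 1: 200.0, 2: 300.0},
}
result = aggregate_by_category(node_outputs)
# result[0] corresponds to A0+B0+C0, result[1] to A1+B1+C1, result[2] to A2+B2+C2.
```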
FIG. 2 is a schematic architectural diagram of a data processing system 100 according to an embodiment of this application. As shown in the figure, the system 100 includes a switching network 10 and data nodes 20, and the data nodes 20 are connected to the switching network 10.
The data nodes 20 are configured to generate the data to be processed by the distributed computing task, for example, the same-category data to be aggregated in parameter training for artificial intelligence and/or in data-intensive computing tasks in high-performance computing scenarios, and to send the same-category data to be aggregated, in the form of packets, to the switch connected to them. For example, as shown in FIG. 2, the data nodes 20 include six data nodes: data nodes 201 to 203 are connected to switch 102, data nodes 204 to 206 are connected to switch 103, and switch 102 and switch 103 are connected through switch 101, which establishes communication between the data nodes. Each data node can send the same-category data it generates to the switching network 10, and the switches in the switching network 10 perform the data aggregation operation.
The switching network 10 is configured to transmit data in the system 100 and to perform operations of the distributed computing task (for example, the data aggregation operation in a distributed computing task that aggregates data). The switching network 10 includes at least one switch. As shown in the figure, this application takes a switching network 10 with three switches as an example: switch 101 may also be called an aggregation switch, and switch 102 and switch 103 may also be called access switches. An access switch connects data nodes 20 and performs the operations of the distributed computing task for the data nodes 20 connected to it. The aggregation switch handles data transmission and the operations of the distributed computing task for data nodes that are connected to different access switches.
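One way to read the topology of FIG. 2 is as a two-level aggregation: each access switch first combines the data of its own data nodes, and the aggregation switch then combines the partial results. The Python sketch below illustrates that reading only; the two-level split, the dictionary layout and the numeric values are assumptions, and the text does not prescribe how the work is divided between switches.

```python
def aggregate(values):
    """Second operation assumed here: a plain sum of same-category values."""
    return sum(values)


# Access switches and the data nodes attached to them, as in FIG. 2.
topology = {"switch102": [201, 202, 203], "switch103": [204, 205, 206]}
# Illustrative same-category values produced by each data node.
node_data = {201: 1, 202: 2, 203: 3, 204: 4, 205: 5, 206: 6}

# Each access switch aggregates the data of its own data nodes.
partial = {
    switch: aggregate(node_data[node] for node in nodes)
    for switch, nodes in topology.items()
}
# The aggregation switch (switch 101) combines the partial results.
total = aggregate(partial.values())
```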
Optionally, in the system 100 shown in FIG. 2, the switching network 10 may contain only one switch, which handles data transmission and aggregation processing for data nodes 201 to 206.
Note that, because a switch has a limited number of ports, a single switch may be unable to meet the networking requirements as the number of nodes in the system increases. In that case, the number of data nodes that can be connected to the system can be expanded by adding switches. In a specific implementation, the structure of the switching network and the number of switches can therefore be set according to service requirements, and this application does not limit the number of switches in the switching network 10 or the networking mode. For ease of description, the following embodiments use the switching network 10 shown in FIG. 2 as an example.
Each switch in FIG. 2 can perform data aggregation processing while transmitting the data of the data nodes. FIG. 3 is a schematic structural diagram of a switch according to an embodiment of this application. As shown in the figure, the switch includes a processing unit 110, a plurality of ports (for example, port 1201 to port 1222), and a crossbar 130.
The processing unit 110 is configured to perform the data aggregation operation. Optionally, the processing unit 110 is further configured to instruct the ports to perform data aggregation processing. The processing unit 110 includes an aggregation result cache 111, a computing unit 112, a command generation unit 113, and a packet header management unit 114. The aggregation result cache 111 stores the aggregation results of same-category data; in a specific implementation, its function can be provided by a memory in the switch or by the cache of a processor in the switch. The computing unit 112 performs the aggregation operation on the aggregation results sent by the ports and manages operation table entries (for an aggregation operation, these may specifically be aggregation table entries), including generating, updating, and deleting them. An operation table entry records the pending data and the processed data in each category of data. The command generation unit 113 determines the same-category data to be aggregated according to the operation table entries, generates an operation command (for a data aggregation operation, specifically an aggregation command), and sends the aggregation command to the first port of the aggregation loop. That port checks, according to the data stored in its input buffer, whether it holds same-category data to be aggregated, and performs the aggregation operation. The packet header management unit 114 parses the packet headers sent by the ports and sends the packet sequence numbers in the headers to the command generation unit 113, which generates aggregation commands according to the packet sequence numbers and the aggregation table entries. The packet header management unit 114 is also connected to the crossbar 130. The crossbar 130, also called a crossbar switch matrix, carries the packet headers between the processing unit 110 and the ports.
The ports of the switch are used to connect data nodes, and each port can be connected to one data node. Each port includes a computing unit and a memory; for example, port 1201 includes a computing unit 12011 and a memory 12012. The computing unit parses the packets sent by the data node to obtain the packet header and the payload data. Optionally, the computing unit also performs the aggregation operation according to the aggregation command sent by the processing unit 110. The memory of each port is divided, according to the type of data stored, into an input buffer and an output buffer (not shown in the figure). The input buffer stores the packets sent by the data node connected to the port. Each packet includes a packet header and payload data, and the payload data includes the same-category data to be aggregated. The output buffer receives and stores the aggregation result of all data of the same category sent by the processing unit 110 after the system 100 has finished aggregating all pending data of that category, so that the result can be sent to the data node connected to the port.
Note that the number of ports in a switch varies between vendors' products, and this application does not limit the number of ports in the switch.
FIG. 3 also shows two data transmission loops: an aggregation loop and a distribution loop. For ease of description, the aggregation loop is also called the first loop, and the distribution loop is also called the second loop.
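The division of a port's memory into an input buffer and an output buffer can be sketched as follows. This is a simplified model under stated assumptions: the class `Port`, its method names, and the representation of a packet as a dictionary with `header` and `payload` keys are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    """Simplified model of a switch port with separate input and output buffers."""
    port_id: str
    input_buffer: list = field(default_factory=list)    # packets from the data node
    output_buffer: list = field(default_factory=list)   # final results for the data node

    def receive(self, packet):
        """Keep the whole packet locally and return only its header.

        The returned header is what the port would forward to the processing unit.
        """
        self.input_buffer.append(packet)
        return packet["header"]

    def store_result(self, result):
        """Store a final aggregation result until it is sent to the data node."""
        self.output_buffer.append(result)


port = Port("1201")
header = port.receive({"header": {"seq": 1, "category": 1}, "payload": [3.5]})
port.store_result(9.0)
```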
The aggregation loop is a set of ports connected to the data nodes that participate in the distributed computing task. It includes at least two ports ordered according to a preset rule, and the order of the ports in the aggregation loop indicates the order in which these ports receive or execute aggregation commands. An aggregation command generated by the processing unit 110 is passed in turn from the first port of the aggregation loop to the last port. Each port performs the corresponding aggregation operation according to the aggregation command and whether it stores same-category data to be aggregated, and sends the aggregation result together with the aggregation command to the next adjacent port in the loop, and so on, until the last port in the aggregation loop finishes processing the aggregation command and sends its aggregation result to the processing unit. This completes the processing of one aggregation command in the aggregation loop. As shown in FIG. 3, the aggregation loop includes: processing unit 110 - port 1201 - port 1202 - ... - port 1211 - port 1212 - ... - port 1221 - port 1222 - processing unit 110, which forms a closed loop. The aggregation command is executed by each port of the aggregation loop in turn, and the last port in the loop transmits the final result data to the processing unit 110. Port 1201, which is connected to the processing unit 110, is also called the first port of the aggregation loop and receives the aggregation command sent by the processing unit 110. The aggregation command is processed in turn by port 1201, port 1202, port 1211, port 1212, ..., port 1221, and port 1222, and port 1222 transmits the final result to the processing unit 110. Port 1202, port 1211, port 1212, and port 1221 are also called link ports of the aggregation loop. Each of them receives the aggregation command sent by the preceding adjacent port in the loop, together with the aggregation result that the preceding port produced for that command, and performs the aggregation operation based on both. For example, port 1202 receives the aggregation command sent by port 1201 and the aggregation result of port 1201 for that command. When port 1201 holds no same-category data to be aggregated, it sends the aggregation command directly to port 1202; the aggregation result of port 1201 for the command can then be regarded as empty. When port 1201 does hold same-category data to be aggregated, it sends to port 1202 both the aggregation command and the aggregation result it obtained by executing the command. In this way, each port in the aggregation loop receives the aggregation command in turn, performs the aggregation operation according to it, and sends the aggregation command and its own aggregation result to the next adjacent port in the loop. Finally, the last port in the order of the aggregation loop (also called the tail port) transmits the final aggregation result of the command to the processing unit 110. For example, port 1222 is the last port in the aggregation loop; it transmits the final aggregation result to the processing unit 110, and the processing unit 110 then determines whether the aggregation of all same-category data has been completed.
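The passage of one aggregation command around the aggregation loop can be simulated with a short Python sketch. It is a simplified model under stated assumptions: each port's input buffer is reduced to a dictionary from data category to a list of values, aggregation is a plain sum, and the function name `run_aggregation_loop` is hypothetical.

```python
def run_aggregation_loop(ports, category):
    """Pass one aggregation command through the ports in loop order.

    `ports` is an ordered list; each element stands in for one port's input
    buffer and maps a data category to the values buffered for it.
    Returns the result that the tail port would hand to the processing unit,
    or None if no port held data of the requested category.
    """
    partial = None  # the "empty" aggregation result that travels with the command
    for input_buffer in ports:
        # A port without matching data forwards the command and result unchanged.
        for value in input_buffer.pop(category, []):
            partial = value if partial is None else partial + value
    return partial


# Three ports in loop order; the second one holds no category-1 data.
ports = [{1: [5]}, {}, {1: [7], 2: [1]}]
final = run_aggregation_loop(ports, 1)
missing = run_aggregation_loop(ports, 3)
```

Data that has been aggregated is removed from the simulated input buffers, while data of other categories stays in place for later commands.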
Optionally, the switch may include at least one aggregation loop, and each aggregation loop includes at least two ports ordered according to a preset rule.
The distribution loop is a set of ports connected to the data nodes that participate in the distributed computing task. It includes at least two ports ordered according to a preset rule, and the order of the ports indicates the order in which these ports receive the aggregation result of all data of the same category. The data nodes connected to these ports can thereby obtain the aggregation result and complete the other operations of the distributed computing task. For example, as shown in FIG. 3, the distribution loop includes processing unit 110 - port 1201 - port 1202 - ... - port 1211, and the processing unit 110 can send the aggregation result of all same-category data to each port in the distribution loop through this loop.
Specifically, as with the aggregation loop, the processing unit 110 sends the aggregation result of all same-category data to the first port of the distribution loop (for example, port 1201). The first port then sends the result to the next adjacent port (for example, port 1202), and so on, so that every port in the distribution loop obtains the aggregation result from its preceding adjacent port. After receiving the aggregation result, each port can store it in the memory of the port, specifically in the output buffer of the memory.
Optionally, after receiving the aggregation result of all same-category data, the last port in the distribution loop may also send a notification message to the processing unit. The notification message indicates the status of the ports in the distribution loop in obtaining the aggregation result of all same-category data.
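Distribution along the second loop can be sketched in the same style. The function `distribute_result` and the content of the notification it returns are hypothetical; the text states only that the last port may notify the processing unit, not what the message contains.

```python
def distribute_result(result, output_buffers):
    """Hand a final aggregation result from port to port along the distribution loop.

    `output_buffers` maps port identifiers to output buffers; the insertion
    order of the dictionary is taken as the loop order.
    """
    reached = []
    for port_id, buffer in output_buffers.items():
        buffer.append(result)      # the port stores the result in its output buffer
        reached.append(port_id)    # and passes it on to the next port in the loop
    # Hypothetical notification sent by the last port to the processing unit.
    return {"last_port": reached[-1], "ports_reached": len(reached)}


buffers = {"1201": [], "1202": [], "1211": []}
notification = distribute_result(111.0, buffers)
```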
Optionally, the switch includes at least one distribution loop, and each distribution loop includes at least two ports ordered according to a preset rule.
Optionally, the aggregation loop and the distribution loop may be the same loop. In that case, the loop is used both to carry the aggregation commands and the ports' aggregation results during data aggregation, and to send the aggregation result of all first-category data to the ports connected to the data nodes that participate in the distributed computing task.
Optionally, the aggregation loop and the distribution loop may be different loops.
In a possible embodiment, adjacent ports in the aggregation loop and the distribution loop, as well as the ports and the processing unit, may be connected through physical wiring (for example, conductive traces) on a printed circuit board (PCB).
Note that the number of ports in the aggregation loop and in the distribution loop can be configured according to service requirements, and data is transmitted port by port along the ports included in the aggregation loop or the distribution loop.
The data processing method provided in this application is described in detail below with reference to FIG. 4. As shown in the figure, the method is described using data aggregation as the distributed computing task. For ease of description, the data to be aggregated is called first-category data. The first port, the second port, and the third port are ports connected to the data nodes that execute the distributed computing task; together they form an aggregation loop in which the first port is the first port of the loop and the third port is the last port of the loop. The method includes the following steps:
S301. The processing unit receives a first packet header sent by the first port.
S302. The processing unit receives a second packet header sent by the second port.
S303. The processing unit receives a third packet header sent by the third port.
In a distributed computing task, the same-category data to be aggregated has associated packet sequence numbers, and each piece of data can be sent from the data node to the switch in one packet. Each packet includes a packet header and payload data. Specifically, after a port parses a packet and obtains the packet header, it can send the packet header to the processing unit through the crossbar. Each packet header includes a packet sequence number, which indicates the sequence number of the packet sent by the data node connected to the port, and each packet carries at least one piece of data to be aggregated. Optionally, the packet header also includes a data category, which indicates the category of the data to be processed. In a specific implementation, the rule for generating the packet header may be determined by the data node and then notified to the switch, or determined by the switch and then notified to the data node; this application does not limit this.
For example, a fixed flag can be set in a designated field of the packet header. In Table 1, field 1 is the sequence number and field 2 indicates the data category. When field 2 of a packet header received from a port is 1, the packet associated with that sequence number contains first-category data whose data category is 1, and the aggregation operation can be performed on the data whose category is 1.
Optionally, the packet header further includes a third field that indicates an offset bit. The offset bit indicates the total number of same-category data items to be aggregated on the same port. For example, when the third field of a received packet header is 3, the total number of data items to be aggregated on that port is 3.
Optionally, the packet header may further include field 4, which indicates a port identifier. The port identifier identifies the port that sends the packet header to the processing unit and may be expressed in digits and/or letters. Optionally, the processing unit may instead record the identifier of the sending port itself when it receives each packet header.
Table 1: Example of a packet header
Field 1 | Field 2 | Field 3 | Field 4 |
Sequence number | Data category | Offset bit | Port identifier |
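The four header fields of Table 1 can be modelled as a small Python record with a byte-level encoding. The field widths in `HEADER_FORMAT`, the numeric port identifier and the class name `PacketHeader` are assumptions chosen for illustration; the application does not specify the size or encoding of the fields.

```python
import struct
from dataclasses import dataclass

# Assumed widths: 4-byte sequence number, 1-byte data category,
# 1-byte offset, 2-byte port identifier, big-endian without padding.
HEADER_FORMAT = ">IBBH"


@dataclass(frozen=True)
class PacketHeader:
    sequence_number: int   # field 1
    data_category: int     # field 2
    offset: int            # field 3: number of same-category items on the port
    port_id: int           # field 4

    def pack(self):
        """Encode the header into bytes using the assumed layout."""
        return struct.pack(HEADER_FORMAT, self.sequence_number,
                           self.data_category, self.offset, self.port_id)

    @classmethod
    def unpack(cls, raw):
        """Decode a header from bytes produced by pack()."""
        return cls(*struct.unpack(HEADER_FORMAT, raw))


header = PacketHeader(sequence_number=42, data_category=1, offset=3, port_id=1201)
raw = header.pack()
decoded = PacketHeader.unpack(raw)
```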
S304(可选地)、处理单元分别校验各个报文头中报文序号的可靠性。S304 (optionally): The processing unit checks the reliability of the packet sequence numbers in the respective packet headers respectively.
处理单元接收到与端口发送的报文头后,可以针对各个报文头中所包括的报文序号进行可靠性校验,其中,可靠性校验的方式可以采用以下方式中任意一种:After the processing unit receives the packet headers sent by the port, it can perform reliability verification on the packet serial numbers included in the respective packet headers, and the reliability verification method can be any one of the following methods:
方式一、处理单元可以根据预设规则校验报文序号的可靠性。Manner 1: The processing unit may check the reliability of the packet serial number according to a preset rule.
数据节点和处理单元可以预先约定报文序号的生成规则,也可以称为预设规则,每个报文序号为全局唯一标识。可选地,每个报文携带一个第一类数据,也就是说,报文序号可以唯一标识一个第一类数据。处理单元可以根据该预设规则校验每个报文序号的合法性。具体地,处理单元可以预先存储包括预设报文序号表,该预设报文序号表用于记录按照预设规则 生成的所有报文序列号的集合,处理单元可以在预设报文序号表中查询是否存在各个报文序号,当预设报文序号表中存在待查询的报文序号时,则认为该报文序号可靠性校验的结果为通过;否则,则认为该报文序号可靠性校验的结果为不通过。The data node and the processing unit may pre-agreed a generation rule for the message sequence number, which may also be called a preset rule, and each message sequence number is a globally unique identifier. Optionally, each packet carries a first type of data, that is, the packet sequence number can uniquely identify a first type of data. The processing unit can check the validity of each packet serial number according to the preset rule. Specifically, the processing unit may pre-store a preset message sequence number table, where the preset message sequence number table is used to record the set of all message sequence numbers generated according to the preset rules, and the processing unit may store the preset message sequence number table in the preset message sequence number table. If there is a message sequence number to be queried in the preset message sequence number table, the result of the reliability check of the message sequence number is considered to be passed; otherwise, the message sequence number is considered to be reliable. The result of the sex check is failed.
Manner 2: The processing unit computes the validity of the packet sequence number according to a preset rule.
The packet sequence number may be a random number or an identifier generated according to a preset rule, and globally uniquely identifies a packet. For example, when the packet sequence number is a random number generated with a hash algorithm and then encrypted with an encryption algorithm, the processing unit may decrypt it, determine the decrypted packet sequence number according to the hash algorithm, and determine whether the decrypted packet sequence number falls within a pre-agreed range of packet sequence numbers. If it does, the reliability check of the packet sequence number passes; if it does not, the reliability check fails. Optionally, the packet sequence number may also be generated with a custom or general-purpose algorithm other than a hash algorithm, which is not limited in this application.
Verifying the reliability of the packet sequence numbers makes it possible to determine the validity of the packet headers before the aggregation operation is performed. This avoids data errors caused by aggregating data that does not belong to the same type, and improves the accuracy of the distributed computing task.
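The two checking manners can be illustrated with a minimal Python sketch. The table contents, the agreed range, and all function names are hypothetical and are not taken from this application; Manner 2 is shown only from the point where the sequence number has already been decrypted, because the decryption and hash steps are not specified here.

```python
# Assumed values for illustration only; the application does not define them.
PRESET_SEQUENCE_NUMBERS = frozenset({1, 2, 3, 4})  # hypothetical preset table (Manner 1)
AGREED_RANGE = range(1, 5)                         # hypothetical pre-agreed range (Manner 2)


def check_by_table(seq_no: int) -> bool:
    """Manner 1: pass only if the sequence number is recorded in the preset table."""
    return seq_no in PRESET_SEQUENCE_NUMBERS


def check_by_range(decoded_seq_no: int) -> bool:
    """Manner 2, final step only: pass if the decoded number lies in the agreed range."""
    return decoded_seq_no in AGREED_RANGE


def filter_headers(headers):
    """Keep only the headers whose sequence number passes the table check."""
    return [h for h in headers if check_by_table(h["seq_no"])]
```

Headers that fail the check are simply dropped in this sketch; how a switch would report or handle them is left open.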
S305: The processing unit generates an aggregation entry according to the packet sequence numbers.
The processing unit may generate an aggregation entry according to the packet sequence numbers in the received packet headers. The aggregation entry records, for each type of data, the data still to be processed and the data already processed. That is, in data aggregation processing, the aggregation entry indicates the aggregation status of the first-type data and includes the packet sequence numbers and the aggregation state of each packet sequence number. The aggregation state indicates the state of the first-type data associated with that packet sequence number and is any one of "not aggregated", "aggregated", and "not aggregated, and packet header not received". Optionally, the aggregation entry may also include the port identifier associated with each packet sequence number. Optionally, the aggregation entry may further include the data type and offset bit associated with each packet sequence number.
For example, Table 2 shows a summary of the packet headers received by a processing unit according to an embodiment of this application. From the packet headers received from each port, the processing unit learns the following: the port with port identifier 1 has 3 pieces of data of data type 1 to be aggregated, and the packets with sequence numbers 1 and 2 have been received; the port with port identifier 2 has 4 pieces of data of data type 1 to be aggregated, and the packets with sequence numbers 1 and 3 have been received; the port with port identifier 3 has 2 pieces of data of data type 1 to be aggregated, and the packets with sequence numbers 1 and 2 have been received.
Table 2: Summary of the packet headers received by a processing unit
The processing unit may determine, from the offset bits of the packet headers in Table 2, the packet sequence numbers of all first-type data to be aggregated and the aggregation state of the first-type data corresponding to each packet sequence number, and then generate an aggregation entry indicating the aggregation status of the first-type data. For example, Table 2 shows the following: the data node connected to the port with port identifier 1 generated 3 packet sequence numbers of data type 1, and the processing unit has received the packet headers with sequence numbers 1 and 2 from that port but not the packet header with sequence number 3; the data node connected to the port with port identifier 2 generated 4 packet sequence numbers of data type 1, and the processing unit has received the packet headers with sequence numbers 1 and 3 from that port but not the packet headers with sequence numbers 2 and 4; the data node connected to the port with port identifier 3 generated 2 packet sequence numbers of data type 1, and the processing unit has received the packet headers with sequence numbers 1 and 2. In this case, as shown in Table 3, the processing unit may first determine the packet sequence numbers of all first-type data to be aggregated and the port identifier associated with each packet sequence number, and then mark the aggregation state of each packet sequence number. For example, the aggregation state for port identifier 1 and packet sequence number 1 is "not aggregated", and the aggregation state for port identifier 1 and packet sequence number 3 is "not aggregated, and packet header not received".
Table 3: An example of an aggregation entry
Optionally, instead of the textual form shown in Table 3, the aggregation state may be represented in any other form, such as numbers, letters, or a combination of numbers and letters.
By generating the aggregation entry shown in Table 3, the processing unit learns the aggregation state of the first-type data to be aggregated. The processing unit may then generate an aggregation command according to the aggregation entry; the aggregation command instructs the ports to perform the data aggregation operation.
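The derivation of the aggregation entry from the header summary can be sketched as follows. This is an illustration under stated assumptions, not the format used by the switch: the dictionary layout, the names `AggState`, `HEADER_SUMMARY`, and `build_aggregation_entries` are hypothetical, and sequence numbers are assumed to run from 1 up to the total count announced for each port, as in the example above.

```python
from enum import Enum


class AggState(Enum):
    NOT_AGGREGATED = "not aggregated"
    AGGREGATED = "aggregated"
    NO_HEADER = "not aggregated, and packet header not received"


# Values from the example in the text: port id -> (total pieces, received sequence numbers).
HEADER_SUMMARY = {
    1: (3, {1, 2}),
    2: (4, {1, 3}),
    3: (2, {1, 2}),
}


def build_aggregation_entries(summary):
    """Return {(port_id, seq_no): AggState} for every piece of data to be aggregated."""
    entries = {}
    for port_id, (total, received) in sorted(summary.items()):
        # Assumption: sequence numbers are 1..total for each port.
        for seq_no in range(1, total + 1):
            if seq_no in received:
                entries[(port_id, seq_no)] = AggState.NOT_AGGREGATED
            else:
                entries[(port_id, seq_no)] = AggState.NO_HEADER
    return entries


ENTRIES = build_aggregation_entries(HEADER_SUMMARY)
```

With the example values, the entry for port 1, sequence number 3 and the entries for port 2, sequence numbers 2 and 4 are marked as having no header yet, matching the description of Table 3.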
S306: The processing unit determines, according to the aggregation entry, the packet sequence numbers of the first-type data to be aggregated, and generates an aggregation command.
After determining the packet sequence numbers of all first-type data to be aggregated and the aggregation state of each packet sequence number, the processing unit may generate an aggregation command based on the packet sequence numbers and port identifiers associated with the data that has not yet been aggregated. The aggregation command includes the packet sequence number of at least one piece of first-type data to be aggregated.
Specifically, the processing unit may generate the aggregation command according to a filtering rule, which selects the packet sequence numbers of the first-type data to be included in the aggregation command. The filtering rule may be any one of the following manners:
Manner 1: Determine the packet sequence number of at least one piece of first-type data to be aggregated by polling, according to the magnitude of the packet sequence numbers.
Specifically, one or more packet sequence numbers may be selected, by polling and according to the magnitude of the packet sequence numbers, from the packet sequence numbers of all first-type data to be aggregated.
Manner 2: Determine the packet sequence number of at least one piece of first-type data to be aggregated according to priority.
The first-type data to be aggregated may also have a priority identifier, which is carried in the packet and indicates the priority of the associated first-type data. For example, if the first-type data generated by a data node in FIG. 1 is important data among the data to be aggregated, the priority of that first-type data may be marked as high, and the packet sent by that data node accordingly carries information indicating the priority. The processing unit may select one or more pieces of data to be aggregated from the packet sequence numbers of all first-type data to be aggregated according to the priority of the first-type data.
Manner 3: Select the packet sequence number of at least one piece of first-type data to be aggregated according to which packet headers have been received.
In the two manners above, the packet sequence numbers are selected from those of all first-type data to be aggregated. Alternatively, the processing unit may first determine which packet sequence numbers have been received, and then apply Manner 1 or Manner 2 to the received packet sequence numbers to select the packet sequence number of at least one piece of first-type data.
Further, the processing unit may generate a single aggregation command according to the aggregation entry, which includes the packet sequence numbers of all first-type data to be aggregated that were selected in any one of the manners above. It may also generate multiple aggregation commands, each including the packet sequence number of one piece of first-type data to be aggregated, or multiple aggregation commands, each including the packet sequence numbers of part of the first-type data to be aggregated. For ease of description, the following assumes that the processing unit generates a single aggregation command that includes the packet sequence numbers of all first-type data to be aggregated selected in any one of the manners above.
Optionally, the aggregation command further includes the port identifiers associated with the packet sequence numbers of the first-type data to be aggregated.
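The three filtering manners can be sketched in Python as below. This is only one possible reading: the entry fields (`port_id`, `seq_no`, `priority`, `header_received`), the `limit` parameter, the convention that a larger number means higher priority, and the command layout are all assumptions made for illustration, and "polling by magnitude" is simplified to taking entries in ascending sequence-number order.

```python
def select_by_sequence(pending, limit):
    """Manner 1 (simplified): ascending sequence number, ties broken by port id."""
    return sorted(pending, key=lambda e: (e["seq_no"], e["port_id"]))[:limit]


def select_by_priority(pending, limit):
    """Manner 2: highest priority first (larger value = higher priority, assumed)."""
    return sorted(pending, key=lambda e: -e["priority"])[:limit]


def select_received_only(pending, limit, inner=select_by_sequence):
    """Manner 3: keep entries whose header has arrived, then apply Manner 1 or 2."""
    return inner([e for e in pending if e["header_received"]], limit)


def make_command(selected):
    """Build a hypothetical aggregation command listing (port id, sequence number) pairs."""
    return {"targets": [(e["port_id"], e["seq_no"]) for e in selected]}


pending = [
    {"port_id": 1, "seq_no": 1, "priority": 0, "header_received": True},
    {"port_id": 1, "seq_no": 3, "priority": 5, "header_received": False},
    {"port_id": 2, "seq_no": 1, "priority": 1, "header_received": True},
    {"port_id": 2, "seq_no": 3, "priority": 2, "header_received": True},
]
command = make_command(select_received_only(pending, limit=10))
```

Passing `inner=select_by_priority` to `select_received_only` gives the priority-based variant of Manner 3.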
S307: The processing unit sends the aggregation command to the first port.
The processing unit may send the aggregation command over the aggregation loop. If the first port is the first port of the aggregation loop, the processing unit sends the aggregation command to the first port. That is, the processing unit sends the aggregation command directly to the first port of the aggregation loop. After that port finishes processing the aggregation command, it sends the command to the next adjacent port in the aggregation loop, which completes its own aggregation processing according to the aggregation result of the first port and the aggregation command. In general, after a port performs the aggregation operation according to the aggregation command, it forwards the result data of the aggregation operation together with the aggregation command to the next adjacent port, until the last port in the aggregation loop sends the complete result data of the aggregation command to the processing unit. For the detailed process, see steps S308 to S310.
S308: When the first port includes a packet sequence number of the first-type data to be aggregated, the first port performs the aggregation operation and sends the aggregation command and the aggregation result of the first port to the second port.
After the first port receives a packet from the data node connected to it, it parses the packet to obtain the packet header and the payload data, sends the packet header to the processing unit through the cross-connect network, and stores the packet header and the payload data in the memory of the first port. When the first port receives the aggregation command, it processes the command according to the packet sequence numbers of the first-type data to be aggregated in the command. Specifically, the first port may first determine whether its memory includes a packet sequence number of the first-type data to be aggregated; then determine, according to the packet sequence number, the payload data associated with it; and then perform the aggregation operation on the payload data.
Optionally, the first port may first determine, from the port identifiers associated with the packet sequence numbers in the aggregation command, whether the identifier of the first port is included; then determine whether the memory of the first port includes a packet sequence number of the first-type data to be aggregated; then determine, according to the packet sequence number, the payload data associated with it; and finally perform the aggregation operation on the payload data. For ease of description, the operation that each port in the aggregation loop performs according to the aggregation command may also be called the third operation of the distributed computing task. Accordingly, the result data that each port obtains by performing the third operation may also be called the result data of the third operation.
After the first port finishes executing the aggregation command, it sends the aggregation command and the aggregation result of the first port to the port adjacent to the first port in the aggregation loop (for example, the second port), and the second port continues the aggregation operation according to the aggregation command and the aggregation result of the first port.
For example, suppose the aggregation command generated with reference to the aggregation entry shown in Table 3 includes packet sequence numbers 1 and 2 for the port with port identifier 1, packet sequence numbers 1 and 3 for the port with port identifier 2, and packet sequence numbers 1 and 2 for the port with port identifier 3. When the first port receives the aggregation command, it aggregates the payload data associated with packet sequence numbers 1 and 2 to obtain an aggregation result. This result may also be called the aggregation result obtained by the first port according to the aggregation command, the aggregation result of the first port, the result data of the first port executing the aggregation command, or the result data of the first port performing the third operation.
Optionally, when the first port does not include any packet sequence number of the first-type data to be aggregated in the aggregation command, the first port may send the aggregation command directly to the second port. In this case, the aggregation result of the first port may be regarded as zero or empty.
S309: When the second port includes a packet sequence number of the first-type data to be aggregated, the second port performs the aggregation operation and sends the aggregation command and the aggregation result of the second port to the third port.
Similar to step S308, after receiving the aggregation command, the second port may search its memory for payload data matching the packet sequence numbers of the first-type data to be aggregated in the aggregation command. Specifically, the second port may first determine whether its memory includes a packet sequence number of the first-type data to be aggregated; then determine, according to the packet sequence number, the first-type data to be aggregated, that is, the payload data associated with that packet sequence number; and then perform the aggregation operation on the payload data. When performing the aggregation operation, the second port first determines whether the first port has sent an aggregation result generated according to the aggregation command. If so, the second port performs the aggregation operation on the aggregation result of the first port and the first-type data to be aggregated stored in the memory of the second port, and obtains the aggregation result of the second port. That is, the second port performs its aggregation operation on the basis of the aggregation result of the first port.
For example, again suppose the aggregation command generated with reference to the aggregation entry shown in Table 3 includes packet sequence numbers 1 and 2 for the port with port identifier 1, packet sequence numbers 1 and 3 for the port with port identifier 2, and packet sequence numbers 1 and 2 for the port with port identifier 3. After the first port aggregates the first-type data associated with packet sequence numbers 1 and 2 according to the aggregation command and obtains its aggregation result, it sends that result and the aggregation command to the second port. The second port then aggregates the received result with the first-type data associated with its packet sequence numbers 1 and 3 to obtain the aggregation result of the second port. At this point, the aggregation result of the second port covers the first-type data associated with packet sequence numbers 1 and 2 of port 1 and the first-type data associated with packet sequence numbers 1 and 3 of port 2. For ease of description, the aggregation result obtained by the second port according to the aggregation command may also be called the aggregation result of the second port, the aggregation result obtained by the second port executing the aggregation command, or the result data obtained by the second port performing the third operation.
Optionally, when the second port does not include any packet sequence number of the first-type data to be aggregated in the aggregation command, the second port may send the aggregation command and the aggregation result of the first port directly to the third port. In this case, the aggregation result of the second port may be regarded as zero or empty.
It should be noted that if the first port also does not include any packet sequence number of the first-type data to be aggregated in the aggregation command, the second port may send only the aggregation command directly to the third port. In this case, neither the first port nor the second port includes a packet sequence number of the first-type data to be aggregated, and the aggregation results of the first port and the second port are both zero or empty.
S310: When the third port includes a packet sequence number of the first-type data to be aggregated, the third port performs the aggregation operation and sends the aggregation result of the third port to the processing unit.
The third port is the last port in the aggregation loop, that is, the port that comes last in the order of the aggregation loop, for example, port 1222 shown in FIG. 3. Similar to step S309, the third port performs the aggregation operation according to whether its memory contains the packet sequence numbers of the first-type data to be aggregated indicated in the aggregation command, and according to the aggregation result of the preceding adjacent port in the aggregation loop, and sends the result data of the aggregation operation to the processing unit.
It should be noted that the third port performs the aggregation operation on the first-type data to be aggregated stored in the third port and the aggregation result of the second port. The aggregation result obtained may be called the aggregation result of the third port, the aggregation result obtained by the third port executing the aggregation command, or the result data of the third port executing the aggregation command.
Optionally, a port in the aggregation loop does not store the partial aggregation result that it obtains for part of the first-type data according to the aggregation command. Once it has finished processing the aggregation command, it sends the aggregation result to the next adjacent port in the aggregation loop.
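Steps S308 to S310 can be simulated with a short Python sketch. The aggregation operation is assumed here to be addition, which this application does not specify, and the payload values, the data layout, and the function name `run_aggregation_loop` are hypothetical. A result of `None` stands for the "zero or empty" result of ports that hold none of the requested data.

```python
def run_aggregation_loop(command_targets, port_memories, loop_order):
    """Pass the command and a running partial result from port to port.

    command_targets: set of (port_id, seq_no) pairs named in the aggregation command.
    port_memories:   port_id -> {seq_no: payload} stored in each port's memory.
    loop_order:      port ids in loop order; the last port reports to the processing unit.
    """
    partial = None  # nothing aggregated yet
    for port_id in loop_order:
        memory = port_memories.get(port_id, {})
        for target_port, seq_no in sorted(command_targets):
            if target_port == port_id and seq_no in memory:
                payload = memory[seq_no]
                # Assumed aggregation operation: sum of payloads.
                partial = payload if partial is None else partial + payload
        # Here the port would forward (command, partial) to the next port in the loop.
    return partial


# Command from the example: port 1 -> 1, 2; port 2 -> 1, 3; port 3 -> 1, 2.
targets = {(1, 1), (1, 2), (2, 1), (2, 3), (3, 1), (3, 2)}
memories = {1: {1: 10, 2: 20}, 2: {1: 1, 3: 3}, 3: {1: 100, 2: 200}}  # made-up payloads
result = run_aggregation_loop(targets, memories, loop_order=[1, 2, 3])
```

In this sketch each port keeps no partial result of its own; only the running value travels along the loop, which mirrors the optional behavior described above.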
S311: When the processing unit determines that the aggregation of all first-type data has not been completed, the processing unit generates a new aggregation command according to the aggregation entry, repeats steps S306 to S310, and determines the aggregation result of all first-type data from the aggregation results of the at least two aggregation commands.
Further, when the processing unit obtains the aggregation result of the last port in the aggregation loop in step S310, it may update the aggregation states of the packet sequence numbers in Table 3 and determine from the updated states whether the aggregation of all first-type data has been completed. If it has not, the processing unit may generate a new aggregation command as described in steps S306 to S310; each port in the aggregation loop performs the aggregation operation according to the new command, and the third port sends the aggregation result of the new command to the processing unit. The processing unit then obtains the aggregation result of all first-type data from the aggregation results of the multiple aggregation commands. That is, when aggregation results from multiple aggregation commands exist, the processing unit may perform the aggregation operation again on those results to obtain the aggregation result of all first-type data.
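The repeat-until-done behavior of step S311 can be sketched as a simple loop in the processing unit. Everything here is an illustrative assumption: the state strings, the `batch_size` parameter that limits how many sequence numbers go into one command, the `run_command` callback standing in for a full pass around the aggregation loop, and the use of addition to combine the results of several commands.

```python
def aggregate_in_rounds(pending, run_command, batch_size):
    """Issue aggregation commands until every pending target has been aggregated.

    pending:     list of (port_id, seq_no) targets whose headers have been received.
    run_command: callable taking a list of targets and returning that command's result.
    batch_size:  maximum number of targets per aggregation command (must be >= 1).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    states = {target: "not aggregated" for target in pending}
    total = None
    while any(state == "not aggregated" for state in states.values()):
        batch = [t for t, s in states.items() if s == "not aggregated"][:batch_size]
        result = run_command(batch)
        for target in batch:
            states[target] = "aggregated"  # update the aggregation entry
        if result is not None:
            # Assumed: results of separate commands are combined by addition.
            total = result if total is None else total + result
    return total, states


payloads = {(1, 1): 10, (1, 2): 20, (2, 1): 1, (2, 3): 3, (3, 1): 100, (3, 2): 200}
calls = []


def fake_loop(batch):
    """Stand-in for one pass around the aggregation loop."""
    calls.append(list(batch))
    return sum(payloads[t] for t in batch)


total, states = aggregate_in_rounds(list(payloads), fake_loop, batch_size=4)
```

With six targets and a batch size of 4, two commands are issued and their results are combined into one final value.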
Steps S301 to S311 may also be called the data aggregation process. After the aggregation of all first-type data is completed, the processing unit may, in a data distribution process, send the aggregation result of all the data over the distribution loop to the ports connected to the data nodes participating in the distributed computing task. Those ports then send the aggregation result of all first-type data to the data nodes participating in the distributed computing task, so that the data nodes can continue with the other operations of the distributed computing task. For the detailed process, see the description of steps S312 to S314.
S312: When the aggregation of all first-type data has been completed, the processing unit sends the aggregation result of all first-type data to the first port over the distribution loop.
S313: The first port sends the aggregation result of all first-type data to the second port.
S314: The second port sends the aggregation result of all first-type data to the third port.
The distribution loop is the path in the switch used to send the aggregation result of all first-type data. It is a data transmission loop formed by at least two ports ordered according to a preset rule.
For example, the switch shown in FIG. 3 includes two distribution loops: distribution loop 1 is processing unit 110 - port 1201 - port 1202 - ... - port 1211, and distribution loop 2 is processing unit 110 - port 1222 - port 1221 - ... - port 1212. When the processing unit determines that the aggregation of all first-type data has been completed, it may distribute the aggregation result of all first-type data over the distribution loops to the ports connected to the data nodes participating in the distributed computing, and from there to those data nodes.
S315 (optional): The processing unit clears the aggregation command entry and the aggregation result cache.
After the processing unit has finished distributing the aggregation result of all first-type data, it may clear the aggregation result cache and delete the aggregation command entry, thereby releasing the storage space of the processing unit.
As the description of the data aggregation process and the aggregated-data distribution process shows, the aggregation method provided in this application allows the switch itself to perform the data aggregation operation during data transmission. This avoids the problems that arise in conventional technology when a dedicated aggregation node performs the aggregation operation, such as occupied network resources, a low transmission rate, and long processing delay, and so improves the efficiency of aggregation processing. In addition, the processing unit and the ports of the switch can each perform aggregation operations on the data to be aggregated in a distributed manner, which avoids the performance bottleneck of a single entity performing the aggregation and further reduces the aggregation delay. Moreover, because no separate device is needed to perform the aggregation operation, the number of nodes in the system is reduced and the system cost is lowered. Furthermore, performing aggregation and distribution over the aggregation loop and the distribution loop, respectively, avoids occupying the bandwidth used to transmit other types of data and can greatly increase the transmission bandwidth available to distributed computing. Finally, during data processing the aggregation result of the first-type data is held only in the aggregation result cache of the processing unit. A port does not need to cache the aggregation result of part of the data of a type during data processing, and needs to store the aggregation result of all data of that type only after the aggregation of all such data is completed, which greatly reduces the cache capacity required in the ports.
In a possible embodiment, in addition to transmitting data over the aggregation loop and the distribution loop, the data processing method shown in FIG. 4 may use the crossover network 130 directly for data transmission between the processing unit and the ports. In this case, the crossover network 130 carries not only the packet headers exchanged between the processing unit and the ports, but also the aggregation commands generated by the processing unit and the results produced when the ports execute those commands. This implementation also allows the switch to aggregate data during data transmission, and thus avoids the long processing time and low efficiency that result in the conventional technology when a single aggregation node performs the aggregation operation.
In another possible embodiment, besides the method shown in FIG. 4, the data processing method provided in this application may have the distributed computing task performed by the processing unit of the switch alone. That is, after a port obtains a packet sent by a data node, which carries first-type data to be aggregated together with a packet header, the port sends the packet to the processing unit. The processing unit parses the packet, obtains the packet sequence number from the packet header, and performs the aggregation operation according to that sequence number. Optionally, each port may instead parse the packet itself and send the packet header and the payload data separately to the processing unit, which then performs the aggregation operation. This process likewise lets the switch perform the aggregation operation during data transmission and improves data processing efficiency.
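The processing-unit-only variant described above can be illustrated with a minimal Python sketch. It is not the application's implementation: the `Packet` fields, the function name `aggregate_by_sequence`, and the choice of element-wise summation as the aggregation operation are all assumptions made for illustration. The sketch groups payloads by packet sequence number and treats a sequence number as fully aggregated once every port has contributed.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Set, Tuple


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet layout: header fields plus payload data."""
    port_id: int              # port that received the packet (assumed field)
    seq: int                  # packet sequence number from the packet header
    payload: Tuple[int, ...]  # first-type data to be aggregated


def aggregate_by_sequence(
    packets: Iterable[Packet], num_ports: int
) -> Dict[int, Tuple[int, ...]]:
    """Sum payloads element-wise per sequence number (summation is an assumption).

    A sequence number appears in the result only after all ports contributed.
    """
    partial: Dict[int, Tuple[int, ...]] = {}    # running result per sequence number
    seen: Dict[int, Set[int]] = defaultdict(set)  # ports already counted per sequence number
    complete: Dict[int, Tuple[int, ...]] = {}
    for pkt in packets:
        # Skip packets for finished sequence numbers and duplicates from a port.
        if pkt.seq in complete or pkt.port_id in seen[pkt.seq]:
            continue
        seen[pkt.seq].add(pkt.port_id)
        if pkt.seq in partial:
            partial[pkt.seq] = tuple(
                a + b for a, b in zip(partial[pkt.seq], pkt.payload)
            )
        else:
            partial[pkt.seq] = pkt.payload
        if len(seen[pkt.seq]) == num_ports:
            complete[pkt.seq] = partial.pop(pkt.seq)
    return complete
```

In this sketch the running results live only in the processing unit's `partial` dictionary, which mirrors the idea that ports do not have to cache partial aggregation results.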
As described above, the switch 500 provided in this application can complete a distributed computing task jointly with the data nodes: the switch performs operations of the distributed computing task while the data is in transit, which improves the efficiency and speed of data processing. In addition, aggregation processing and distribution processing are performed over the aggregation loop and the distribution loop respectively, so they do not occupy the bandwidth used to transmit other types of data, which can greatly improve the transmission bandwidth available to distributed computing.
It should be noted that, for brevity, the foregoing method embodiments are each described as a series of action combinations. Persons skilled in the art should understand, however, that this application is not limited by the described order of actions. They should also understand that the embodiments described in this specification are all preferred embodiments, and that the actions involved are not necessarily required by this application.
Other reasonable combinations of steps that persons skilled in the art can conceive based on the foregoing description also fall within the protection scope of this application.
The data processing method provided in this application is described in detail above with reference to FIG. 1 to FIG. 4. The switches provided in this application for data processing are described below with reference to FIG. 5 and FIG. 6.
FIG. 5 is a schematic structural diagram of a switch 500 provided in this application. As shown in the figure, the switch 500 is configured to connect to at least two data nodes, each of which performs a first operation of a distributed computing task. The switch 500 includes a first processing unit 501, where:
The first processing unit 501 is configured to receive the result data of the first operation sent by the at least two data nodes; perform a second operation of the distributed computing task on the received result data of the first operation to obtain result data of the second operation; and distribute the result data of the second operation.
Optionally, the switch 500 further includes at least two ports, each connected to one data node. Each port includes a receiving unit 502 and a sending unit 503, where:
The receiving unit 502 is configured to receive the result data of the first operation sent by the connected data node.
The sending unit 503 is configured to forward the result data of the first operation to the first processing unit 501.
Optionally, each port further includes a second processing unit 504, configured to perform a third operation of the distributed computing task on the result data of the first operation before the sending unit 503 forwards that result data to the first processing unit 501.
Optionally, the distributed computing task includes a distributed artificial intelligence computing task, a distributed high-performance computing task, or a distributed graphics computing task.
Optionally, the second operation or the third operation of the distributed computing task includes an operation of aggregating data of the same type.
Optionally, the switch is an access switch or an aggregation switch.
Optionally, the first processing unit 501 is further configured to send an operation command to the at least two ports, where the operation command instructs the second processing unit 504 of each of the at least two ports to perform the third operation.
Optionally, the at least two ports are ordered according to a preset rule to form a first loop, and the order of the first loop indicates the order in which the at least two ports receive or execute the operation command.
Optionally, after an earlier port in the first loop performs the third operation according to the operation command, it forwards the result data of the third operation together with the operation command to the adjacent later port, and so on until the last port in the first loop sends all result data of the third operation to the first processing unit 501.
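The hop-by-hop behavior of the first loop can be sketched as follows. This is a hedged illustration rather than the application's implementation: the function names `run_first_loop` and `elementwise_sum`, the use of element-wise summation as the third operation, and the assumption that each port folds its local data into the result received from the previous port are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[int]


def elementwise_sum(a: Vector, b: Vector) -> Vector:
    """Assumed third operation: element-wise sum of two equally long vectors."""
    return [x + y for x, y in zip(a, b)]


def run_first_loop(
    port_order: Sequence[int],
    local_data: Dict[int, Vector],
    third_op: Callable[[Vector, Vector], Vector] = elementwise_sum,
) -> Tuple[Vector, List[Tuple[int, Vector]]]:
    """Walk the ports in loop order, folding each port's data into the result.

    Returns the result that the last port would send to the processing unit,
    plus a trace of (port_id, running result) for every hop.
    """
    if len(port_order) < 2:
        raise ValueError("the first loop is assumed to contain at least two ports")
    running: Vector = list(local_data[port_order[0]])
    trace: List[Tuple[int, Vector]] = [(port_order[0], list(running))]
    for port_id in port_order[1:]:
        # The port receives the command and the running result from its
        # predecessor, applies the third operation, then forwards both.
        running = third_op(running, local_data[port_id])
        trace.append((port_id, list(running)))
    return running, trace
```

Because the loop order is passed in as `port_order`, the same sketch covers any preset ordering rule; only the sequence of port identifiers changes.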
Optionally, the first processing unit 501 is further configured to, before sending the operation command: receive the packet headers sent by each of the at least two ports, where each packet header includes a data type and a packet sequence number; establish an operation table entry according to the packet headers, where the operation table entry records, for each type of data, the data to be processed and the data already processed; and send the operation command according to the operation table entry.
Optionally, the packet header further includes a port identifier, and the operation table entry is further used to record the port identifier corresponding to each piece of data to be processed in each type of data.
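One way to picture the operation table entry is the sketch below. The class names `OperationEntry` and `OperationTable`, the method names, and the rule that a command is issued once every port has reported a header for the same data type and sequence number are assumptions for illustration; the application does not specify the table layout at this level of detail.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

Command = Tuple[str, int, Tuple[int, ...]]  # (data type, sequence number, port ids)


@dataclass
class OperationEntry:
    """Per-data-type record of pending and processed data (assumed layout)."""
    pending: Dict[int, Set[int]] = field(default_factory=dict)  # seq -> port ids
    processed: Set[int] = field(default_factory=set)            # finished seqs


class OperationTable:
    def __init__(self, num_ports: int) -> None:
        self.num_ports = num_ports
        self.entries: Dict[str, OperationEntry] = {}

    def record_header(self, data_type: str, seq: int, port_id: int) -> Optional[Command]:
        """Record one packet header; return a command once all ports reported."""
        entry = self.entries.setdefault(data_type, OperationEntry())
        if seq in entry.processed:
            return None  # this data has already been processed
        ports = entry.pending.setdefault(seq, set())
        ports.add(port_id)
        if len(ports) < self.num_ports:
            return None  # still waiting for headers from other ports
        return (data_type, seq, tuple(sorted(ports)))

    def mark_processed(self, data_type: str, seq: int) -> None:
        """Move a sequence number from pending to processed."""
        entry = self.entries[data_type]
        entry.pending.pop(seq, None)
        entry.processed.add(seq)
```

Recording the port identifiers alongside each pending sequence number is what would let the processing unit address the operation command to exactly the ports that hold the data.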
Optionally, the switch includes at least one first loop.
Optionally, the switch is further configured to establish a second loop, and the first processing unit 501 is further configured to distribute the result data of the second operation through the second loop.
Optionally, the switch includes at least one second loop.
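Distribution over the second loop can be sketched in the same style. The function name `distribute_over_second_loop`, the `"processing_unit"` label, and the assumption that each port keeps a copy of the result and then forwards it to the next port in the loop are hypothetical; the text above states only that the result data of the second operation is distributed through the second loop.

```python
from typing import Dict, List, Sequence, Tuple, Union

Hop = Tuple[Union[str, int], int]  # (sender, receiving port id)


def distribute_over_second_loop(
    port_order: Sequence[int], result: List[int]
) -> Tuple[Dict[int, List[int]], List[Hop]]:
    """Pass the second-operation result from port to port in loop order.

    Returns the copy held by each port and the list of hops taken.
    """
    delivered: Dict[int, List[int]] = {}
    hops: List[Hop] = []
    sender: Union[str, int] = "processing_unit"  # the first hop starts here
    for port_id in port_order:
        hops.append((sender, port_id))
        delivered[port_id] = list(result)  # each port keeps its own copy
        sender = port_id                   # and forwards to the next port
    return delivered, hops
```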
It should be understood that the first processing unit 501 and the second processing unit 504 in this embodiment may each be implemented by an application-specific integrated circuit (ASIC) or a programmable logic device (PLD). The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. When the data processing method shown in FIG. 4 is implemented in software, the first processing unit 501, the second processing unit 504, and their modules may also be software modules.
The switch 500 according to this embodiment may correspondingly perform the method described in the embodiments of this application, and the foregoing and other operations and/or functions of the units in the switch 500 respectively implement the corresponding procedures of the method in FIG. 4. For brevity, details are not repeated here.
As described above, the switch 500 provided in this application can complete a distributed computing task jointly with the data nodes: the switch performs operations of the distributed computing task while the data is in transit, which improves the efficiency and speed of data processing. In addition, aggregation processing and distribution processing are performed over the aggregation loop and the distribution loop respectively, so they do not occupy the bandwidth used to transmit other types of data, which can greatly improve the transmission bandwidth available to distributed computing.
FIG. 6 is a schematic structural diagram of another switch 600 provided in this application. As shown in the figure, the switch 600 includes a first processor 601 and at least two ports 602, where each port 602 is connected through a network 603 to a data node that participates in the distributed computing task, and where:
The first processor 601 is configured to receive the result data of the first operation sent by each of the at least two data nodes, perform the second operation of the distributed computing task on the received result data of the first operation to obtain result data of the second operation, and distribute the result data of the second operation.
It should be understood that, in this embodiment, the first processor 601 may be a CPU, or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.
The network 603 may be a bus. In addition to a data bus, the bus may include a power bus, a control bus, a status signal bus, and the like.
Optionally, the first processor 601 may be used to implement the functions of the computing unit 112, the command generation unit 113, and the packet header management unit 114 in the processing unit 110 shown in FIG. 2. For brevity, details are not repeated here.
Optionally, the first processor 601 further includes a memory (not shown in the figure), which provides instructions and data to the first processor 601 so that the first processor 601 can perform the operation steps of the method shown in FIG. 4. The memory may include a read-only memory and a random access memory, and may also include a non-volatile random access memory.
Optionally, a memory may instead be located outside the first processor 601 to provide instructions and data to the first processor 601, so that the first processor 601 can perform the operation steps of the method shown in FIG. 4.
Optionally, each port 602 includes a second processor 6021 and a memory 6022. The second processor 6021 may be a CPU, or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.
Each port 602 may be used to implement the operation steps performed by the first port, the second port, or the third port in the method shown in FIG. 4. For brevity, details are not repeated here.
It should be understood that the switch 600 according to this embodiment may correspond to the switch 500 in the embodiments of this application, and may correspond to the entity that performs the method shown in FIG. 4. The foregoing and other operations and/or functions of the modules in the switch 600 respectively implement the corresponding procedures of the method in FIG. 4. For brevity, details are not repeated here.
As described above, the switch 600 provided in this application can complete a distributed computing task jointly with the data nodes: the switch performs operations of the distributed computing task while the data is in transit, which improves the efficiency and speed of data processing. In addition, aggregation processing and distribution processing are performed over the aggregation loop and the distribution loop respectively, so they do not occupy the bandwidth used to transmit other types of data, which can greatly improve the transmission bandwidth available to distributed computing.
This application further provides a data processing system. The system includes a switching network and at least two data nodes connected to the switching network, each of which performs the first operation of a distributed computing task. The switching network includes at least one switch. Each switch includes the first processor and the at least two ports shown in FIG. 6 and is configured to implement the functions of the corresponding entity in the method shown in FIG. 4. For brevity, details are not repeated here. The system can carry out distributed computing tasks: during data transmission, the switch performs operations of the distributed computing task, which improves data processing efficiency and reduces data processing delay.
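The division of work in such a system can be summarized with a deliberately small sketch. It assumes, purely for illustration, that the first operation is a local sum at each data node and that the second operation is a sum in the switch; the function name `run_distributed_task` is hypothetical and no real network is modeled.

```python
from typing import List


def run_distributed_task(node_shards: List[List[int]]) -> List[int]:
    """Model one round of the task: nodes compute, the switch aggregates and distributes."""
    # First operation: each data node reduces its own shard (assumed: local sum).
    first_results = [sum(shard) for shard in node_shards]
    # Second operation: the switch aggregates the per-node results (assumed: sum).
    aggregated = sum(first_results)
    # Distribution: every data node receives the same aggregated result.
    return [aggregated for _ in node_shards]
```

The point of the sketch is only the placement of the second operation: it runs in the switch on the path between the data nodes, so no separate aggregation node appears in the model.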
The foregoing embodiments may be implemented wholly or partly in software, hardware, firmware, or any combination thereof. When software is used, the embodiments may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded or executed on a computer, the procedures or functions described in the embodiments of this application are produced wholly or partly. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device, such as a server or a data center, that contains one or more sets of usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium. The semiconductor medium may be a solid state drive (SSD).
The foregoing descriptions are merely specific implementations of this application. Any variation or replacement that persons skilled in the art can conceive based on the specific implementations provided in this application shall fall within the protection scope of this application.
Claims (25)
- A switch, wherein the switch is connected to at least two data nodes; and the switch is configured to receive result data of a first operation of a distributed computing task sent by each of the at least two data nodes, perform a second operation of the distributed computing task on the received result data of the first operation to obtain result data of the second operation, and distribute the result data of the second operation.
- The switch according to claim 1, wherein the switch comprises a processing unit and at least two ports, each port being connected to one data node; and each port is configured to receive the result data of the first operation sent by the connected data node, and forward the result data of the first operation to the processing unit.
- The switch according to claim 2, wherein each port is further configured to perform a third operation of the distributed computing task on the result data of the first operation before forwarding the result data of the first operation to the processing unit.
- The switch according to any one of claims 1 to 3, wherein the distributed computing task comprises a distributed artificial intelligence computing task, a distributed high-performance computing task, a distributed graphics computing task, or a distributed cloud computing task.
- The switch according to claim 4, wherein the second operation or the third operation of the distributed computing task comprises an operation of aggregating data of the same type.
- The switch according to any one of claims 1 to 5, wherein the switch is an access switch or an aggregation switch.
- The switch according to any one of claims 3 to 6, wherein the processing unit is further configured to send an operation command to the at least two ports, the operation command instructing each of the at least two ports to perform the third operation.
- The switch according to claim 7, wherein the at least two ports are ordered according to a preset rule to form a first loop, and the order of the first loop indicates the order in which the at least two ports receive or execute the operation command.
- The switch according to claim 8, wherein, after an earlier port in the first loop performs the third operation according to the operation command, the earlier port forwards the result data of the third operation and the operation command to the adjacent later port, until the last port in the first loop sends all result data of the third operation to the processing unit.
- The switch according to claim 7, wherein the processing unit is further configured to: receive packet headers sent by each of the at least two ports, each packet header comprising a packet sequence number; establish an operation table entry according to the packet headers, the operation table entry recording, for each type of data, the data to be processed and the data already processed; and send the operation command according to the operation table entry.
- The switch according to claim 10, wherein the packet header further comprises a port identifier, and the operation table entry is further used to record the port identifier corresponding to each piece of data to be processed in each type of data.
- The switch according to any one of claims 1 to 11, wherein the switch is further configured to establish a second loop and distribute the result data of the second operation through the second loop.
- A data processing system, wherein the system comprises a switching network and at least two data nodes connected to the switching network; the at least two data nodes are each configured to perform a first operation of a distributed computing task; and the switching network comprises at least one switch, the at least one switch being configured to receive result data of the first operation sent by each of the at least two data nodes, perform a second operation of the distributed computing task on the received result data of the first operation to obtain result data of the second operation, and distribute the result data of the second operation.
- The system according to claim 13, wherein each of the at least one switch comprises a processing unit and at least two ports, each port being connected to one data node; and each port is configured to receive the result data of the first operation sent by the connected data node, and forward the result data of the first operation to the processing unit.
- The system according to claim 14, wherein each port is further configured to perform a third operation of the distributed computing task on the result data of the first operation before forwarding the result data of the first operation to the processing unit.
- The system according to any one of claims 13 to 15, wherein the distributed computing task comprises a distributed artificial intelligence computing task, a distributed high-performance computing task, or a distributed graphics computing task.
- The system according to claim 16, wherein the second operation or the third operation of the distributed computing task comprises an operation of aggregating data of the same type.
- The system according to any one of claims 13 to 16, wherein the switch is an access switch or an aggregation switch.
- The system according to any one of claims 15 to 18, wherein the processing unit is further configured to send an operation command to the at least two ports, the operation command instructing each of the at least two ports to perform the third operation.
- The system according to claim 19, wherein the at least two ports are ordered according to a preset rule to form a first loop, and the order of the first loop indicates the order in which the at least two ports receive or execute the operation command.
- The system according to claim 20, wherein, after an earlier port in the first loop performs the third operation according to the operation command, the earlier port forwards the result data of the third operation and the operation command to the adjacent later port, until the last port in the first loop sends all result data of the third operation to the processing unit.
- The system according to claim 19, wherein the processing unit is further configured to: receive packet headers sent by each of the at least two ports, each packet header comprising a packet sequence number; establish an operation table entry according to the packet headers, the operation table entry recording, for each type of data, the data to be processed and the data already processed; and send the operation command according to the operation table entry.
- The system according to claim 22, wherein the packet header further comprises a port identifier, and the operation table entry is further used to record the port identifier corresponding to each piece of data to be processed in each type of data.
- The system according to any one of claims 13 to 23, wherein the switch is further configured to establish a second loop and distribute the result data of the second operation through the second loop.
- A non-volatile computer-readable storage medium, wherein the non-volatile computer-readable storage medium comprises instructions that, when run on a computer, cause the computer to perform the operation steps performed by the switch according to any one of claims 1 to 12.
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010649718 | 2020-07-08 | ||
CN202010649718.9 | 2020-07-08 | ||
CN202011551606.6A CN113992604A (en) | 2020-07-08 | 2020-12-24 | Switch and data processing system |
CN202011551606.6 | 2020-12-24 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022007587A1 (en) | 2022-01-13 |
Family
ID=79553525
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/099527 WO2022007587A1 (en) | 2020-07-08 | 2021-06-10 | Switch and data processing system |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2022007587A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105357124A (en) * | 2015-11-22 | 2016-02-24 | 华中科技大学 | MapReduce bandwidth optimization method |
US20160323150A1 (en) * | 2014-09-24 | 2016-11-03 | Intel Corporation | System, method and apparatus for improving the performance of collective operations in high performance computing |
CN106936777A (en) * | 2015-12-29 | 2017-07-07 | 中移(苏州)软件技术有限公司 | Cloud computing distributed network implementation method based on OpenFlow, system |
CN110233798A (en) * | 2018-03-05 | 2019-09-13 | 华为技术有限公司 | Data processing method, apparatus and system |
- 2021-06-10: WO PCT/CN2021/099527, patent WO2022007587A1 (en), status: active, Application Filing
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10659254B2 (en) | | Access node integrated circuit for data centers which includes a networking unit, a plurality of host units, processing clusters, a data network fabric, and a control network fabric |
US10521283B2 (en) | | In-node aggregation and disaggregation of MPI alltoall and alltoallv collectives |
US10554554B2 (en) | | Hybrid network processing load distribution in computing systems |
US20150215236A1 (en) | | Method and apparatus for locality sensitive hash-based load balancing |
CN112929299B (en) | | SDN cloud network implementation method, device and equipment based on FPGA accelerator card |
US20120151292A1 (en) | | Supporting Distributed Key-Based Processes |
US8989193B2 (en) | | Facilitating insertion of device MAC addresses into a forwarding database |
US20210218808A1 (en) | | Small Message Aggregation |
CN101019385A (en) | | Port aggregation across stack of devices |
US11252027B2 (en) | | Network element supporting flexible data reduction operations |
US20160072906A1 (en) | | Hybrid tag matching |
CN112291293A (en) | | Task processing method, related equipment and computer storage medium |
US10715424B2 (en) | | Network traffic management with queues affinitized to one or more cores |
US10805241B2 (en) | | Database functions-defined network switch and database system |
US20240195749A1 (en) | | Path selection for packet transmission |
US12038866B2 (en) | | Broadcast adapters in a network-on-chip |
US20180203895A1 (en) | | Best-efforts database functions |
KR20240004315A (en) | | Network-attached MPI processing architecture within SMARTNICs |
JP2017509055A (en) | | Method and apparatus for processing data packets based on parallel protocol stack instances |
US10616116B1 (en) | | Network traffic load balancing using rotating hash |
WO2022007587A1 (en) | | Switch and data processing system |
CN111143427B (en) | | Distributed information retrieval method, system and device based on online computing |
US10560527B2 (en) | | Network service chains using hardware logic devices in an information handling system |
CN117640513A (en) | | Data processing method, device and system |
CN113992604A (en) | | Switch and data processing system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | EP: PCT application non-entry in European phase | Ref document number: 21837159; Country of ref document: EP; Kind code of ref document: A1 |