WO2019169986A1 - Data processing method, apparatus and system - Google Patents

Data processing method, apparatus and system

Info

Publication number
WO2019169986A1
Authority
WIPO (PCT)
Prior art keywords
switching device, computing nodes, computing, data, target
Application number
PCT/CN2019/074052
Other languages
English (en)
French (fr)
Inventor
黄伊
夏寅贲
刘孟竹
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司
Priority to EP19765026.0A (EP3754915B1)
Priority to EP22163754.9A (EP4092992A1)
Publication of WO2019169986A1
Priority to US17/012,941 (US11522789B2)
Priority to US17/978,378 (US11855880B2)

Classifications

    • H04L 45/64: Routing or path finding of packets in data switching networks using an overlay routing layer
    • H04L 45/02, 45/025: Topology update or discovery; updating only a limited number of routers, e.g. fish-eye update
    • H04L 45/12, 45/122: Shortest path evaluation by minimising distances, e.g. by selecting a route with a minimum number of hops
    • H04L 45/127: Shortest path evaluation based on intermediate node capabilities
    • H04L 45/20: Hop count for routing purposes, e.g. TTL
    • H04L 45/302, 45/306: Route determination based on requested QoS or on the nature of the carried application
    • H04L 45/42: Centralised routing
    • H04L 45/74: Address processing for routing
    • H04L 47/10, 47/12: Flow control; congestion control; avoiding congestion, recovering from congestion
    • H04L 47/41: Flow control; congestion control by acting on aggregated flows or links
    • H04L 67/10: Protocols in which an application is distributed across nodes in the network
    • H04L 67/60, 67/63: Scheduling or organising the servicing of application requests; routing a service request depending on the request content or context
    • G06N 3/02, 3/08: Neural networks; learning methods

Definitions

  • the present application relates to the field of distributed computing, and in particular, to a data processing method, apparatus, and system.
  • Distributed machine learning generally uses data parallelism for model training.
  • an algorithm model is stored in each computing node (also referred to as a worker); each computing node can separately acquire part of the sample data, train on the acquired sample data, and obtain model parameters.
  • Each computing node needs to send the computed model parameters to a parameter server (ps), which aggregates and updates the model parameters reported by the computing nodes and then sends the updated model parameters back to each computing node.
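  • As a minimal illustration of this parameter-server pattern (not taken from the patent; the class and function names below are assumptions), each computing node reports its locally computed gradients, and the parameter server averages them, applies the update, and returns the parameters:

```python
from typing import Dict, List

Params = Dict[str, float]

class ParameterServer:
    """Aggregates and updates model parameters reported by the computing nodes."""

    def __init__(self, params: Params):
        self.params = dict(params)

    def merge_and_update(self, reported_grads: List[Params], lr: float = 0.1) -> Params:
        # Merge step: average the gradients reported by all computing nodes,
        # then apply them to the shared model parameters.
        for name in self.params:
            avg = sum(g[name] for g in reported_grads) / len(reported_grads)
            self.params[name] -= lr * avg
        return dict(self.params)

# One training round: every computing node reports its gradients, the parameter
# server merges them, and the updated parameters go back to every node.
ps = ParameterServer({"w": 0.0})
updated = ps.merge_and_update([{"w": 0.2}, {"w": 0.4}, {"w": 0.6}])  # avg grad 0.4 -> w = -0.04
```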
  • a high performance computing (HPC) data center network is generally used to implement distributed machine learning.
  • one server may be selected as the parameter server, another server may be selected as the primary node, and multiple other servers may be selected as the computing nodes.
  • the master node is configured to deliver the network address of the parameter server to the plurality of computing nodes, and drive the plurality of computing nodes to perform distributed machine learning tasks.
  • the parameter server and each computing node can exchange data through the switching device to implement the reporting of the model parameters and the delivery of the updated model parameters.
  • the present invention provides a data processing method, device, and system, which can solve the problem in the related art that, when a data center network implements distributed computing, the amount of data transmitted in the network is large, network congestion may occur, and computing efficiency is affected.
  • the technical solutions are as follows:
  • a data processing method is provided, which is applied to a controller of a data center network, and the method may include:
  • after receiving a processing request for a specified computing task sent by a designated node, where the processing request includes identifiers of multiple computing nodes used for executing the specified computing task, the controller may determine a target switching device from the switching devices used for connecting the multiple computing nodes, and send routing information corresponding to the specified computing task to the target switching device and the designated node respectively, where the routing information is used to indicate a data forwarding path between the multiple computing nodes and the target switching device.
  • the routing information is used by the target switching device to merge the data reported by the multiple computing nodes and then send the merged data to each computing node. That is, the target switching device may combine the data reported by the multiple computing nodes and, according to the routing information, send the merged data to each computing node.
  • the designated node may send the routing information to each computing node other than the designated node among the multiple computing nodes, and each computing node may report data to the target switching device according to the routing information.
  • in the method provided by the present application, since the controller can select a target switching device to combine the data reported by the multiple computing nodes, each computing node does not need to send data to a parameter server through a switching device, and the parameter server does not need to feed back the merged result to each computing node through a switching device, which effectively reduces the amount of data transmitted in the data center network, reduces the probability of network congestion and the delay of data transmission, and improves the computing efficiency of the computing task.
  • optionally, the data forwarding path between the multiple computing nodes and the target switching device may include at least one intermediate switching device, and the method may further include: sending the routing information to the intermediate switching device.
  • in this case, at least two of the computing nodes report data to the target switching device through the intermediate switching device, and the intermediate switching device merges the data reported by the at least two computing nodes before sending it onward, which can further reduce the amount of data transmitted in the network and in turn further reduce the probability of network congestion.
  • optionally, the process of the controller separately sending the routing information corresponding to the specified computing task to the target switching device and the designated node may include:
  • sending, to the target switching device, routing information that includes the identifiers of the directly connected devices of the target switching device, where a directly connected device of the target switching device is a computing node or an intermediate switching device;
  • sending, to the designated node, routing information that includes the identifier of the switching device directly connected to each computing node, where the designated node is configured to send the identifier of the switching device directly connected to each computing node to the corresponding computing node;
  • the process for the controller to send routing information to an intermediate switching device may include:
  • sending, to the intermediate switching device, routing information that includes the identifiers of the directly connected devices of the intermediate switching device, where a directly connected device of the intermediate switching device is a computing node, the target switching device, or another intermediate switching device.
  • the identifier of each device can be the IP address of the device.
  • the routing information sent by the controller to each device may include only the identifier of the directly connected device of the device, so that the data amount of the routing information may be further reduced on the basis of ensuring normal forwarding of data, thereby effectively improving the transmission efficiency of the routing information.
  • the process of determining, by the controller, the target switching device from the switching device used to connect the multiple computing nodes may include:
  • the sum of the number of route hops between each switching device and each computing node in the switching device for connecting the plurality of computing nodes is separately calculated; the switching device with the smallest sum of routing hops is determined as the target switching device.
  • selecting the switching device with the smallest sum of route hops as the target switching device can ensure that the total path length between the selected target switching device and the computing nodes is short, so the amount of data transmitted in the network is small, which in turn can reduce the probability of network congestion.
  • the process of determining, by the controller, the switching device with the smallest sum of route hops as the target switching device may include:
  • determining the performance parameters of each switching device with the smallest sum of route hops, where the performance parameters include at least one of available bandwidth, throughput, computing load, and the number of times the device has been selected as a target switching device; and determining, among the switching devices with the smallest sum of route hops, the switching device whose performance parameters meet a preset condition as the target switching device.
  • Selecting the target switching device according to the performance parameters of the switching device ensures that the selected target switching device performs better and can ensure higher computing efficiency.
  • the process of determining, by the controller, the switching device with the smallest sum of route hops as the target switching device may include:
  • determining, for each switching device with the smallest sum of route hops, the degree of balance of the route hop counts between that switching device and the computing nodes; and determining, among the switching devices with the smallest sum of route hops, the switching device with the most balanced route hop counts as the target switching device.
  • selecting the target switching device in this way ensures that the path lengths between the selected target switching device and the computing nodes are relatively balanced, so the time required for each computing node to report data is relatively close, and the target switching device can receive the data reported by all the computing nodes within a short time and perform the merging processing, which further improves the execution efficiency of the computing task.
  • optionally, the controller may further detect whether the multiple computing nodes are directly connected to the same switching device. When the multiple computing nodes are directly connected to the same switching device, the controller can directly determine the switching device directly connected to the multiple computing nodes as the target switching device without calculating the sum of route hops between each switching device and each computing node, which can improve the efficiency of determining the target switching device; when the multiple computing nodes are directly connected to different switching devices, the controller calculates the sum of route hops between each switching device and each computing node.
  • the method may further include:
  • optionally, before determining the target switching device, the controller may further determine at least one candidate switching device from the switching devices used for connecting the multiple computing nodes, where each candidate switching device can be connected to at least two of the multiple computing nodes through a downlink path; thereafter, the controller may determine the target switching device from the at least one candidate switching device.
  • the processing request sent by the specified node may further include: a combination processing type corresponding to the specified computing task; correspondingly, the method may further include:
  • the merging processing type corresponding to the specified computing task is sent to the target switching device, and the target switching device is configured to perform merging processing on the data reported by the multiple computing nodes according to the merging processing type.
  • the received data may be combined according to the merge processing type corresponding to the specified computing task, thereby ensuring the precision of data processing.
  • in another aspect, a data processing method is provided, which is applied to a switching device of a data center network. The method may include: receiving routing information corresponding to a specified computing task sent by the controller, where the routing information is used to indicate a data forwarding path between multiple computing nodes and a target switching device, and the multiple computing nodes are configured to execute the specified computing task; further, the switching device may perform merging processing on the data reported by the multiple computing nodes and send the merged data according to the routing information.
  • the routing information is sent after the controller receives the processing request for the specified computing task sent by the designated node, and determines the target switching device from the switching device used to connect the multiple computing nodes.
  • in the method provided by the present application, since the switching device can combine the data reported by the multiple computing nodes before sending it, the computing nodes do not need to send data to a parameter server through a switching device, and the parameter server does not need to feed back the merged result to each computing node through a switching device, which effectively reduces the amount of data transmitted in the data center network, reduces the probability of network congestion and the delay of data transmission, and improves the computing efficiency of the computing task.
  • optionally, before merging the data reported by the multiple computing nodes, the switching device may further receive the merge processing type corresponding to the specified computing task sent by the controller; correspondingly, the process in which the switching device merges the data reported by the multiple computing nodes may include: merging the data reported by the multiple computing nodes according to the merge processing type.
  • the switching device may be a target switching device.
  • the process for the target switching device to send the merged data according to the routing information may include:
  • the merged data is sent to each computing node.
  • the switching device may be an intermediate switching device that is used to connect the target switching device and the at least two computing nodes.
  • the intermediate switching device may perform a process of combining the data reported by the multiple computing nodes. The method includes: combining the data reported by the at least two computing nodes;
  • the process for the intermediate switching device to send the merged data according to the routing information may include: sending the merged data to the target switching device according to the routing information.
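  • The two cases above can be summarised by the following switch-side forwarding sketch (the function signature and parameter names are assumptions, not interfaces defined by the patent):

```python
from typing import Callable, List

def forward_merged(is_target: bool, merged_data: bytes, downstream: List[str],
                   upstream: str, send: Callable[[str, bytes], None]) -> None:
    # Target switching device: deliver the merged data to every device listed in
    # its routing information (computing nodes or intermediate switching devices).
    # Intermediate switching device: send the merged data on toward the target
    # switching device indicated by its routing information.
    if is_target:
        for device in downstream:
            send(device, merged_data)
    else:
        send(upstream, merged_data)
```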
  • a data processing apparatus for a controller of a data center network, the apparatus comprising at least one module for implementing the data processing method provided by the first aspect above.
  • a data processing apparatus which is applied to a switching device of a data center network, and the apparatus may include at least one module for implementing the data processing method provided by the second aspect.
  • a controller is provided, which may include a processor, a memory, and a communication interface; the memory stores a computer program to be run by the processor, and the processor, the memory, and the communication interface may be used to implement the data processing method provided by the first aspect above.
  • a switching device comprising a switch chip, a processor and a memory, the switch chip, the processor and the memory being usable for implementing the data processing method provided by the second aspect.
  • a data processing system may include: a controller, a plurality of computing nodes, and at least one switching device; the controller may include the data processing device shown in the third aspect, or may be the fifth The controller shown in the aspect; each switching device may comprise the data processing device shown in the fourth aspect, or may be the switching device shown in the seventh aspect.
  • in an eighth aspect, a computer readable storage medium is provided, where the computer readable storage medium stores instructions that, when the computer readable storage medium is run on a computer, cause the computer to perform the data processing method provided by the first aspect or the second aspect above.
  • a computer program product comprising instructions for causing a computer to perform the data processing method of the first aspect or the second aspect described above when the computer program product is run on a computer is provided.
  • the present application provides a data processing method, apparatus, and system. The processing request for a specified computing task sent by a designated node to the controller includes the identifiers of the multiple computing nodes used for executing the specified computing task, so the controller may determine a target switching device from the switching devices used for connecting the multiple computing nodes, and send, to the target switching device and the designated node respectively, routing information indicating the data forwarding path between the multiple computing nodes and the target switching device. Each computing node can then report data to the target switching device according to the routing information, and the target switching device can merge the data reported by the multiple computing nodes according to the routing information and send the merged data to each computing node.
  • therefore, each computing node does not need to send data to a parameter server through a switching device, and the parameter server does not need to feed back the merged result to each computing node through a switching device, which effectively reduces the amount of data transmitted in the data center network, reduces the probability of network congestion and the delay of data transmission, and improves the execution efficiency of computing tasks.
  • FIG. 1A is a structural diagram of a data center network involved in a data processing method according to an embodiment of the present invention
  • FIG. 1B is a structural diagram of a switching device according to an embodiment of the present invention.
  • FIG. 1C is a structural diagram of another switching device according to an embodiment of the present invention.
  • 1D is a structural diagram of a controller in a data center network according to an embodiment of the present invention.
  • FIG. 2 is a flowchart of a data processing method according to an embodiment of the present invention.
  • FIG. 3 is a structural diagram of another data center network according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a topology structure between multiple computing nodes according to an embodiment of the present invention.
  • FIG. 5 is a flowchart of a method for determining a target switching device according to an embodiment of the present invention
  • FIG. 6 is a schematic diagram of a topology structure between another multiple computing nodes according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a sending module according to an embodiment of the present disclosure.
  • FIG. 9 is a schematic structural diagram of a determining module according to an embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of another data processing apparatus according to an embodiment of the present invention.
  • FIG. 11 is a schematic structural diagram of still another data processing apparatus according to an embodiment of the present invention.
  • the data center network may include a controller 01, multiple computing nodes 02, and at least one switching device 03 for connecting the multiple computing nodes 02.
  • the controller 01 and each computing node 02 can be deployed in a server, and the switching device 03 can be a switch with data forwarding and data processing functions.
  • the controller 01 establishes a communication connection with each switching device 03, and a communication connection can be established between the two computing nodes 02 through the switching device 03.
  • the plurality of computing nodes 02 may be used to implement distributed computing tasks such as distributed machine learning.
  • for example, the plurality of computing nodes 02 may implement artificial intelligence (AI) model training based on deep neural networks (DNN).
  • an algorithm model of the distributed computing task may be deployed in each of the plurality of computing nodes, and one of the plurality of computing nodes may be selected as the designated node, or a separately configured computing node may be selected as the designated node.
  • a driver for driving the distributed computing task is deployed in the designated node, and the plurality of computing nodes can execute distributed computing tasks in parallel under the driving of the designated node.
  • the computing performance of computing nodes built around tensor processing units (TPU) and graphics processing units (GPU) has been greatly improved, which greatly shortens the computation time of each computing node. Therefore, the requirements on the communication time between each computing node and the parameter server are also higher, and the communication time needs to be limited to the millisecond range.
  • in this embodiment of the present invention, the function of the parameter server can be offloaded to the switching device, that is, the switching device can merge the data reported by each computing node and then feed the merged data back to each computing node, which can effectively shorten the communication time and improve the execution efficiency of the distributed computing task.
  • FIG. 1B is a structural diagram of a switching device according to an embodiment of the present invention.
  • each switching device 02 in the data center network may include a switching function component 021, a network computing component 022, and a network management component 023.
  • the switching function component 021 is configured to implement the data forwarding function of the traditional switching device.
  • the network computing component 022 is configured to perform the combining processing on the data reported by the multiple computing nodes 02.
  • the network management component 023 is configured to sense the network topology, store the routing information corresponding to different distributed computing tasks, and instruct the switching function component 021 to forward, according to the routing information, the data merged by the network computing component 022.
  • FIG. 1C is a structural diagram of another switching device according to an embodiment of the present invention.
  • the hardware part of the switching device 02 may mainly include a switch chip 02a, a central processing unit (CPU) 02b, and Memory 02c.
  • the software portion may include at least one container 02d, a parameter server 02e deployed within each container 02d, and a network management component 023.
  • the parameter server 02e deployed in the container 02d may refer to an application capable of implementing the parameter server function.
  • the switching chip 02a may be a switching function component of the switching device 02, and is used to implement forwarding of Layer 2 packets or Layer 3 packets.
  • the CPU 02b and the parameter server 02e may serve as the network computing components of the switching device 02. The CPU may be a high-performance CPU based on an x86 instruction set (or another type of instruction set), used to meet the processing requirements of conventional virtualization software such as containers and to support the data computation functions of the parameter server; the parameter server 02e runs on the CPU 02b and provides the data merging (also referred to as aggregation) processing functions required for distributed computing.
  • the switch chip 02a and the CPU 02b can be connected through a high-speed interconnect interface b1, which can be a network interface card (NIC) interface and can meet the transmission requirements of distributed-computing data.
  • the network adapter is also generally called a network card.
  • the bandwidth rate of the high speed interconnect interface b1 may be a multiple of the bandwidth rate of the external interface a1 of the switching device.
  • the unidirectional bandwidth rate of the high speed interconnect interface b1 may be greater than 40 Gbps (gigabits per second).
  • the high-speed interconnect interface b1 can effectively reduce the probability of network congestion caused when multiple computing nodes or switching devices simultaneously report data (also referred to as multiple hits) to one switching device.
  • FIG. 1D is a structural diagram of a controller in a data center network according to an embodiment of the present invention.
  • the controller 01 may be a software-defined network (SDN)-based controller.
  • the SDN architecture may include an application layer, a control layer, and a forwarding layer.
  • the application layer of the controller 01 includes a distributed computing acceleration SDN application (referred to as an acceleration application) 011
  • the control layer includes a distributed computing acceleration SDN controller (referred to as an acceleration controller) 012
  • the forwarding layer includes a distributed computing acceleration SDN data channel. (referred to as data channel) 013.
  • the acceleration application 011 is mainly used to interact with a specified node through a network service interface (for example, a Restful interface). For example, the acceleration application may receive a processing request sent by the designated node, and may feed back to the designated node the routing information determined by the controller, where the routing information may include an identifier of a switching device for implementing the parameter server function.
  • the acceleration application 011 can also interact with the acceleration controller 012, and can provide the acceleration controller 012 with the identifier of the calculation node corresponding to the specified calculation task, and the information such as the combination processing type, and can receive the feedback from the acceleration controller 012. Routing information.
  • the acceleration controller 012 can be a functional body in the controller 01 for implementing distributed computing acceleration.
  • the acceleration controller 012 stores the physical topology of the data center network, and can determine, according to the physical topology, routing information used to accelerate a specified distributed computing task.
  • the acceleration controller 012 can also uniformly obtain performance parameters of each switching device in the data center network, and the performance parameters may include available bandwidth, throughput, and computing load.
  • the data channel 013 can be a logical data forwarding channel, which constitutes a data forwarding path between the controller and the designated node, and a data forwarding path between the controller and each switching device.
  • FIG. 2 is a flowchart of a data processing method according to an embodiment of the present invention. The method may be applied to the data center network shown in FIG. 1A. Referring to FIG. 2, the method may include:
  • Step 101 Each switching device reports topology information to the controller.
  • the topology information reported by each switching device may include an identifier of the switching device, and an identifier of a device (for example, a computing node or another switching device) to which the switching device is connected, where the identifier of the device may be a network address of the device, for example, Internet Protocol (IP) address.
  • IP Internet Protocol
  • each switching device in the data center network has a topology-aware function.
  • Each switching device can obtain the identifiers of the devices connected to it and report them to the controller after the topology of the data center network becomes stable. FIG. 2 shows only the target switching device and the intermediate switching device selected from the multiple switching devices included in the data center network; in fact, every switching device in the data center network can report topology information to the controller. After the controller obtains the topology information reported by each switching device, it can determine the overall topology of the data center network.
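  • A sketch of how the reported topology information could be assembled into a global view (illustrative only; the report format and function names are assumptions, and IP addresses would typically serve as the identifiers):

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# One report per switching device: (its own identifier, identifiers of its directly connected devices).
TopologyReport = Tuple[str, List[str]]

def build_global_topology(reports: List[TopologyReport]) -> Dict[str, Set[str]]:
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for switch_id, neighbors in reports:
        for n in neighbors:
            adjacency[switch_id].add(n)
            adjacency[n].add(switch_id)  # record every link in both directions
    return adjacency

# Example based on the description of FIG. 3: SW2 is connected to V1, V2, SW1 and SW6.
topology = build_global_topology([("SW2", ["V1", "V2", "SW1", "SW6"])])
```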
  • FIG. 3 is a structural diagram of another data center network according to an embodiment of the present invention.
  • the data center network includes a controller 01, a computing node V1 to a computing node V8, and a switching device. SW1 to SW6.
  • the devices connected to the switching device SW2 include the computing node V1, the computing node V2, the switching device SW1, and the switching device SW6; the switching device SW2 can obtain this topology information through its network management component and report the topology information to the controller 01 through its switching function component.
  • the topology information reported by the switching device SW2 to the controller 01 may include an IP address of the switching device SW2, an IP address of the computing node V1, an IP address of the computing node V2, an IP address of the switching device SW1, and an IP address of the switching device SW6.
  • the controller 01 can determine that the topology of the data center network is a two-layer leaf-spine topology.
  • the switching devices SW2 to SW5 are leaf switching devices (that is, first-layer switching devices), the switching device SW1 and the switching device SW6 are spine switching devices (that is, second-layer switching devices), and each leaf switching device is connected to two computing nodes.
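  • One way the controller could recover this two-layer structure from the assembled adjacency is sketched below (an assumption; the patent does not spell out the classification algorithm): a switching device directly connected to at least one computing node is treated as a leaf, and a switching device connected only to other switching devices as a spine.

```python
from typing import Dict, List, Set, Tuple

def classify_leaf_spine(adjacency: Dict[str, Set[str]],
                        compute_nodes: Set[str]) -> Tuple[List[str], List[str]]:
    switches = [d for d in adjacency if d not in compute_nodes]
    leaves = [s for s in switches if adjacency[s] & compute_nodes]        # has a directly connected computing node
    spines = [s for s in switches if not (adjacency[s] & compute_nodes)]  # connected only to other switches
    return leaves, spines
```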
  • Step 102 The designated node sends a processing request for the specified computing task to the controller.
  • the processing request may include an identification of a plurality of computing nodes for performing the specified computing task, the designated node being a computing node selected in advance from a computing node included in the data center network for driving execution of the specified computing task.
  • a distributed driver for driving the plurality of computing nodes to perform the specified computing task may be deployed in the designated node.
  • the designated node and the plurality of computing nodes for performing the specified computing task are pre-selected by the developer, and the designated node may be one of the plurality of computing nodes, or
  • the specified node may also be a computing node that is separately set, which is not limited in this embodiment of the present invention.
  • the specified computing task is a distributed AI training task
  • the computing node for executing the distributed AI training task includes a computing node V1, a computing node V2, and a computing node V7
  • the designated node is the computing node V1.
  • the computing node V1 may send a processing request to the controller 01 through an interface provided by the controller 01 (for example, a Restful interface), where the processing request may include an identifier of the distributed AI training task and a computing node list in which the IP address of the computing node V1, the IP address of the computing node V2, and the IP address of the computing node V7 are recorded.
  • Step 103 The controller determines a topology structure between the plurality of computing nodes according to the received topology information.
  • the controller 01 may determine a topology between the plurality of computing nodes for performing the specified computing task according to a predetermined topology of the data center network.
  • the topology between the plurality of compute nodes can include the plurality of compute nodes, and a switching device for connecting the plurality of compute nodes.
  • each spine switching device can be connected to all leaf switching devices in the data center network, so All spine switching devices in the data center network are included in the topology between the multiple compute nodes.
  • the controller 01 may determine the topology between the three computing nodes according to the previously determined topology of the data center network and the IP address of each of the computing node V1, the computing node V2, and the computing node V7.
  • the topology between the three compute nodes can be as shown in Figure 4.
  • the topology includes the three compute nodes and all switching devices used to connect the three compute nodes.
  • the set of switching devices used to connect the three compute nodes is ⁇ SW2, SW5, SW1, SW6 ⁇ .
  • Step 104 The controller determines, according to the topology, at least one candidate switching device from the switching devices used to connect the plurality of computing nodes.
  • Each of the alternate switching devices may be a switching device that can connect to at least two computing nodes of the multiple computing nodes by using a downlink path, and the downlink path may include other switching devices, or may not include other switching devices. .
  • the set of switching devices for connecting the computing node V1, the computing node V2, and the computing node V7 is ⁇ SW2, SW5, SW1, SW6 ⁇ .
  • the spine switching device SW1 can communicate with the computing node V1, the computing node V2, and the computing node V7 through the downlink path, respectively, and the spine switching device SW6 can also communicate with the computing node V1, the computing node V2, and the computing node V7 through the downlink path;
  • the leaf switching device SW2 can communicate with the computing node V1 and the computing node V2 through downlink paths, and the leaf switching device SW5 can only communicate with the computing node V7 through a downlink path, so the controller 01 can determine the switching device SW1, the switching device SW2, and the switching device SW6 as candidate switching devices.
  • in this embodiment of the present invention, each switching device has the capability of performing data merging processing, and a candidate switching device can be connected to at least two computing nodes through downlink paths, so it can merge the data reported by at least two of the computing nodes connected to it before sending the data onward. The target switching device and the intermediate switching devices are determined from the candidate switching devices, so that the amount of data transmitted in the network when performing distributed computing is small.
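  • A hedged sketch of this candidate switching device selection (helper names are assumptions): 'downstream' maps each switching device to the computing nodes it can reach through downlink paths, possibly via other switching devices, and a device qualifies when it reaches at least two of the task's computing nodes.

```python
from typing import Dict, List, Set

def select_candidates(downstream: Dict[str, Set[str]], task_nodes: Set[str]) -> List[str]:
    # A switching device is a candidate if its downlink paths reach at least
    # two of the computing nodes used for the specified computing task.
    return [sw for sw, nodes in downstream.items() if len(nodes & task_nodes) >= 2]

# Example matching FIG. 4: SW1 and SW6 (spine) reach all three nodes, SW2 reaches V1 and V2, SW5 only V7.
downstream = {
    "SW1": {"V1", "V2", "V7"}, "SW6": {"V1", "V2", "V7"},
    "SW2": {"V1", "V2"},       "SW5": {"V7"},
}
print(select_candidates(downstream, {"V1", "V2", "V7"}))  # ['SW1', 'SW6', 'SW2']
```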
  • Step 105 The controller determines the target switching device from the at least one candidate switching device.
  • FIG. 5 is a flowchart of a method for determining a target switching device according to an embodiment of the present invention.
  • the process for determining a target switching device may include:
  • Step 1051 Detect whether multiple computing nodes are directly connected to the same candidate switching device.
  • when the multiple computing nodes are directly connected to the same candidate switching device, the controller may perform step 1052; when the multiple computing nodes are directly connected to different candidate switching devices, the controller may perform step 1053.
  • for example, assume that the computing nodes used for performing a certain specified computing task include the computing node V1, the computing node V2, and the computing node V7. Since the computing node V1 and the computing node V2 are directly connected to the candidate switching device SW2, while the computing node V7 is connected to the candidate switching device SW1 through the switching device SW5, the candidate switching devices directly connected to the three computing nodes are not the same candidate switching device, and the controller 01 can therefore perform step 1053.
  • Step 1052 Determine an alternate switching device that directly connects the multiple computing nodes as a target switching device.
  • the controller may directly determine the candidate switching device directly connected to the plurality of computing nodes as the target switching device.
  • for example, if the multiple computing nodes used for performing a specified computing task are all directly connected to the candidate switching device SW3, the controller 01 can directly determine the candidate switching device SW3 as the target switching device.
  • in the process of executing the specified computing task, each computing node may report the computed data to the target switching device, and the target switching device may merge the data reported by the computing nodes and then send the merged data to each computing node separately. Since the target switching device can implement the function of the parameter server, the computing nodes do not need to report data to a parameter server through a switching device, and the parameter server does not need to deliver the merged data through a switching device, which effectively reduces the amount of data transmitted in the data center network, reduces the probability of network congestion and the delay of data transmission, and can effectively improve the execution efficiency of the specified computing task.
  • Step 1053 Calculate a sum of route hops between each candidate switching device and each computing node.
  • the controller may calculate, based on the topology between the multiple computing nodes, the sum of route hops between each candidate switching device and the computing nodes, and determine the candidate switching device with the smallest sum of route hops as the target switching device.
  • the controller when the sum of the number of route hops between the first candidate switching device and each computing node is counted, the controller may be in the topology between the first candidate switching device and each computing node. The path between each adjacent two devices is recorded as one hop.
  • for example, the path between the computing node V1 and the candidate switching device SW2 can be recorded as one hop, the path between the computing node V2 and the candidate switching device SW2 can be recorded as one hop, the path between the candidate switching device SW2 and the candidate switching device SW1 can be recorded as one hop, the path between the computing node V7 and the switching device SW5 can be recorded as one hop, and the path between the switching device SW5 and the candidate switching device SW1 can be recorded as one hop, so the controller 01 can determine that the sum of route hops between the candidate switching device SW1 and the three computing nodes is 5.
  • similarly, the controller 01 can determine that the sum of route hops between the candidate switching device SW6 and the three computing nodes is also 5, and that the sum of route hops between the candidate switching device SW2 and the three computing nodes is also 5.
  • Step 1054 When the candidate switching device with the smallest sum of routing hops includes one, the candidate switching device with the smallest sum of the routing hops is determined as the target switching device.
  • each switching device has the capability of performing data combining processing.
  • after a switching device receives the data reported by multiple other devices (for example, computing nodes or switching devices), it can merge the received data and then send the merged data to the next-hop switching device. Therefore, each time the computing nodes report data, the one-hop path between any two adjacent devices needs to carry only one copy of the data.
  • therefore, the sum of route hops can intuitively reflect the amount of data transmitted between a candidate switching device and the computing nodes during data transmission. Selecting the candidate switching device with the smallest sum of route hops as the target switching device keeps the amount of data transmitted between the target switching device and the computing nodes small, which can effectively reduce the data transmission delay and the probability of network congestion, thereby effectively improving the execution efficiency of the computing task.
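  • The counting rule can be sketched as follows (our reading of the worked example above: a link shared by several computing nodes' paths is counted once, because the merged data crosses each one-hop link only once; all names are illustrative assumptions):

```python
from collections import deque
from typing import Dict, FrozenSet, Set

def path_edges(adjacency: Dict[str, Set[str]], src: str, dst: str) -> Set[FrozenSet[str]]:
    # Shortest path from src to dst found by BFS, returned as a set of undirected links.
    parent: Dict[str, str] = {src: ""}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in adjacency[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    edges: Set[FrozenSet[str]] = set()
    node = dst
    while parent[node]:
        edges.add(frozenset((node, parent[node])))
        node = parent[node]
    return edges

def route_hop_sum(adjacency: Dict[str, Set[str]], candidate: str, task_nodes: Set[str]) -> int:
    # A link shared by several computing nodes' paths is counted once.
    links: Set[FrozenSet[str]] = set()
    for node in task_nodes:
        links |= path_edges(adjacency, node, candidate)
    return len(links)

# Topology of FIG. 4: V1 and V2 under SW2, V7 under SW5, SW1 and SW6 as spine switches.
adj = {
    "V1": {"SW2"}, "V2": {"SW2"}, "V7": {"SW5"},
    "SW2": {"V1", "V2", "SW1", "SW6"}, "SW5": {"V7", "SW1", "SW6"},
    "SW1": {"SW2", "SW5"}, "SW6": {"SW2", "SW5"},
}
print(route_hop_sum(adj, "SW1", {"V1", "V2", "V7"}))  # 5, matching the example above
```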
  • Step 1055 When there are multiple candidate switching devices with the smallest sum of route hops, determine the performance parameters of each of these candidate switching devices.
  • the controller may determine the target switching device according to performance parameters of the candidate switching device with the smallest sum of hops.
  • the performance parameter of the candidate switching device with the smallest sum of each route hop may include at least one of available bandwidth, computing load, throughput, and the number of times selected as the target switching device.
  • the calculation load may refer to a load when the switching device performs data combining processing.
  • in this embodiment of the present invention, the controller may obtain the performance parameters of each switching device in the data center network in real time or periodically. For example, the controller 01 may periodically acquire the performance parameters of each switching device by using the acceleration controller 012.
  • Step 1056 Among the candidate switching devices with the smallest sum of route hops, determine the candidate switching device whose performance parameters meet a preset condition as the target switching device.
  • the preset condition differs according to the types of parameters included in the performance parameters acquired by the controller. When the performance parameters include the available bandwidth, the preset condition may be that the available bandwidth of the switching device is the highest; when the performance parameters include the throughput, the preset condition may be that the throughput of the switching device is the lowest; when the performance parameters include the computing load, the preset condition may be that the computing load of the switching device is the lowest; when the performance parameters include the number of times the device has been selected as a target switching device, the preset condition may be that this number of times is the smallest.
  • when the performance parameters include multiple types of parameters, the controller may, when determining the target switching device, compare the parameters one by one according to a preset parameter priority, starting from the parameter with the higher priority.
  • for example, the controller may first compare the available bandwidth of the candidate switching devices and select the candidate switching device with the highest available bandwidth as the target switching device; if there are multiple candidate switching devices with the highest available bandwidth, the controller can continue to compare the computing load of these candidate switching devices; if there are still multiple candidate switching devices with the lowest computing load among them, the controller may continue to compare the throughput of each candidate switching device, and so on, until the candidate switching device that meets the preset condition is determined as the target switching device. In addition, if there are multiple candidate switching devices whose performance parameters meet the preset condition, the controller may determine any one of these candidate switching devices as the target switching device.
  • for example, since the candidate switching devices SW1, SW2, and SW6 all have the smallest sum of route hops, the controller can compare the performance parameters of the three candidate switching devices. Assuming that the performance parameter acquired by the controller is the computing load and the computing load of the candidate switching device SW1 is the lowest, the controller may determine the candidate switching device SW1 as the target switching device.
  • selecting the target switching device based on both the sum of route hops and the performance parameters of the switching devices ensures that the amount of data transmitted between the selected target switching device and the computing nodes is small and that the target switching device performs well, which can ensure high computing efficiency.
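  • A sketch of one possible tie-break among the candidates with the smallest hop sum, assuming the priority order described above (available bandwidth, then computing load, then throughput, then times selected); the field names are assumptions, not the patent's data model.

```python
from typing import Dict, List

def pick_by_performance(tied_candidates: List[str],
                        perf: Dict[str, Dict[str, float]]) -> str:
    # Lexicographic comparison implements the priority cascade: highest available
    # bandwidth first, then lowest computing load, then lowest throughput, then
    # fewest times already selected. Negating the bandwidth turns "highest" into
    # a minimisation criterion so a single min() covers all four parameters.
    return min(tied_candidates, key=lambda sw: (-perf[sw]["available_bandwidth"],
                                                perf[sw]["computing_load"],
                                                perf[sw]["throughput"],
                                                perf[sw]["times_selected"]))

# Example matching the text: SW1, SW2 and SW6 tie on hop sum, and SW1 has the lowest computing load.
perf = {
    "SW1": {"available_bandwidth": 40, "computing_load": 0.2, "throughput": 10, "times_selected": 1},
    "SW2": {"available_bandwidth": 40, "computing_load": 0.5, "throughput": 12, "times_selected": 0},
    "SW6": {"available_bandwidth": 40, "computing_load": 0.6, "throughput": 9,  "times_selected": 2},
}
print(pick_by_performance(["SW1", "SW2", "SW6"], perf))  # SW1
```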
  • optionally, in addition to determining the target switching device based on the performance parameters, the controller may also determine the degree of balance of the route hop counts between each candidate switching device with the smallest sum of route hops and the computing nodes, and determine the candidate switching device with the most balanced route hop counts as the target switching device.
  • alternatively, when multiple candidate switching devices whose performance parameters meet the preset condition are determined in step 1056, the controller may further determine the target switching device from among them based on the degree of balance of the route hop counts.
  • because the target switching device needs to obtain the data reported by all the computing nodes used for executing the specified computing task before it can perform the merging processing, selecting the switching device with the most balanced route hop counts as the target switching device ensures that the time required for each computing node to report data is relatively close, so that the target switching device can receive the data reported by all the computing nodes within a short time and perform the merging processing. This reduces the waiting time of the target switching device and further improves the execution efficiency of the computing task.
  • the degree of equalization of the number of hops of the route may be determined by parameters such as the variance, the mean square error, or the average difference of the number of hops of the route. And the level of the equalization is negatively correlated with the parameter value of any of the above parameters, that is, the smaller the parameter value, the higher the degree of equilibrium. For example, for each candidate switching device in the candidate switching device with the smallest sum of multiple routing hops, the controller may separately count the number of routing hops between the candidate switching device and each computing node, and calculate The variance of the number of route hops between the candidate switching device and each computing node, after which the candidate switching device with the smallest variance can be determined as the target switching device.
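  • A minimal sketch of the variance-based variant just described (names are assumptions): among the candidates with the smallest hop sum, pick the one whose per-node route hop counts have the smallest variance, i.e. the most balanced path lengths.

```python
from statistics import pvariance
from typing import Dict, List, Set

def pick_most_balanced(tied_candidates: List[str],
                       hops: Dict[str, Dict[str, int]],  # hops[switch][node] = route hop count
                       task_nodes: Set[str]) -> str:
    # The smaller the (population) variance of the hop counts, the more balanced
    # the paths between the candidate switching device and the computing nodes.
    return min(tied_candidates,
               key=lambda sw: pvariance([hops[sw][v] for v in task_nodes]))
```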
  • any one of the multiple candidate switching devices may also be determined as the target switching device.
  • the controller may further determine, from the plurality of candidate switching devices, candidate switching devices that are connectable to the plurality of computing nodes through the downlink path, and then determine the target switching device from the candidate switching devices.
  • for example, if the alternate switching devices are SW1, SW2, and SW6, since the alternate switching devices SW1 and SW6 can each establish a connection with the three computing nodes through downlink paths, the controller can take the alternate switching devices SW1 and SW6 as candidate switching devices and determine the target switching device from these two candidate switching devices.
  • Step 106 The controller determines, in the candidate switching device other than the target switching device, an alternate switching device that connects the target switching device with the at least two computing nodes as the intermediate switching device.
  • when the controller determines multiple candidate switching devices in the foregoing step 104, after determining the target switching device, the controller may further determine, among the remaining candidate switching devices, a candidate switching device that can connect the target switching device with at least two of the multiple computing nodes as an intermediate switching device.
  • the intermediate switching device may combine the data reported by the at least two computing nodes connected to the target switching device during the execution of the specified computing task, thereby further reducing the amount of data transmission in the network. .
  • for example, after the candidate switching device SW1 is determined as the target switching device, among the remaining two candidate switching devices, the candidate switching device SW2 can connect the target switching device SW1 with the two computing nodes (V1 and V2), so the controller 01 can determine the candidate switching device SW2 as an intermediate switching device.
  • for another example, assume that the computing nodes used for performing a specified computing task include the computing nodes V1, V2, V3, and V7, and that the candidate switching devices include SW21, SW23, SW1, and SW6. If the finally determined target switching device is SW1, then among the remaining three candidate switching devices, since the candidate switching devices SW21 and SW23 can each connect the target switching device SW1 with at least two of the computing nodes, the controller 01 can determine the candidate switching devices SW21 and SW23 as intermediate switching devices.
  • Step 107 The controller sends routing information to the target switching device, the intermediate switching device, and the designated node, respectively.
  • the routing information can be used to indicate a data forwarding path between the plurality of computing nodes and the target switching device.
  • the routing information may include an identifier of the multiple computing nodes and an identifier of the target switching device. If the data forwarding path between the plurality of computing nodes and the target switching device further includes an intermediate switching device, the routing information may further include an identifier of the intermediate switching device.
  • in this embodiment of the present invention, the routing information corresponding to the specified computing task that the controller sends to each device may include only the identifiers of that device's directly connected devices on the data forwarding path. The devices on the data forwarding path include the multiple computing nodes, the intermediate switching devices, and the target switching device; other switching devices on the data forwarding path that are not selected as intermediate switching devices are not counted in the routing information.
  • the routing information sent by the controller to the target switching device may include only the identifier of the directly connected device of the target switching device, and the directly connected device of the target switching device may be a computing node or an intermediate switching device.
  • the routing information sent by the controller to the designated node may include only the identifier of the switching device directly connected to each computing node, that is, the routing information sent by the controller to the designated node may include a parameter server list, where the parameter server list is recorded for An identifier of a switching device that implements a parameter server function; the designated node is configured to send an identifier of an intermediate switching device or a target switching device directly connected to each computing node to a corresponding computing node.
  • the routing information sent by the controller to each intermediate switching device may include the identifier of the directly connected device of the intermediate switching device, and the directly connected device of each intermediate switching device is a computing node, the target switching device, or other intermediate switching device.
  • the computing nodes for performing distributed AI training tasks are computing nodes V1, V2, and V7
  • the target switching device is SW1
  • the intermediate switching device is SW2.
  • the directly connected devices of the target switching device SW1 are the intermediate switching device SW2 and the computing node V7
  • the directly connected devices of the intermediate switching device SW2 are the target switching device SW1 and the computing nodes V1 and V2
  • the directly connected device of the computing nodes V1 and V2 is the intermediate switching device SW2
  • the directly connected device of the computing node V7 is the target switching device SW1.
  • correspondingly, the routing information sent by the controller 01 to the target switching device SW1 may include only the IP address of the intermediate switching device SW2 and the IP address of the computing node V7; the routing information sent by the controller 01 to the intermediate switching device SW2 may include the IP address of the target switching device SW1 and the IP addresses of the computing nodes V1 and V2; and the routing information sent by the controller 01 to the designated node V1 may include the IP address of the intermediate switching device SW2 and the IP address of the target switching device SW1.
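  • Put together, the per-device routing information for this example could look as follows (illustrative only; "IP(...)" is a placeholder for the device's IP address, and the layout is not the patent's message format):

```python
routing_info = {
    "SW1": ["IP(SW2)", "IP(V7)"],            # target switching device: its directly connected devices on the path
    "SW2": ["IP(SW1)", "IP(V1)", "IP(V2)"],  # intermediate switching device: its directly connected devices on the path
    "V1":  ["IP(SW2)", "IP(SW1)"],           # designated node: the switching device directly connected to each computing node
}
```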
  • the process by which the controller 01 sends routing information to each device may be as shown by the dotted line numbered 2 in FIG. 3.
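  • as an illustrative sketch only (not part of the patent text), the per-device routing information in the example above can be obtained by listing, for each device, just its directly connected neighbours on the selected forwarding path; the device names follow FIG. 4 and the function name below is hypothetical.

```python
# Hypothetical sketch: derive per-device routing information from the
# selected forwarding path (FIG. 4 example: V1, V2 -> SW2 -> SW1 <- V7).
forwarding_tree = [
    ("V1", "SW2"), ("V2", "SW2"),   # compute nodes -> intermediate switch
    ("SW2", "SW1"),                  # intermediate switch -> target switch
    ("V7", "SW1"),                   # compute node -> target switch
]

def direct_neighbors(device, edges):
    """Return the devices directly connected to `device` on the path."""
    neighbors = []
    for a, b in edges:
        if a == device:
            neighbors.append(b)
        elif b == device:
            neighbors.append(a)
    return neighbors

# The routing information sent to each device lists only its direct neighbors.
for device in ("SW1", "SW2", "V1", "V2", "V7"):
    print(device, "->", direct_neighbors(device, forwarding_tree))
# SW1 -> ['SW2', 'V7'], SW2 -> ['V1', 'V2', 'SW1'], V1 -> ['SW2'], ...
```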
  • it should be noted that, after receiving the processing request for the specified computing task, the controller may further generate a task identifier (taskID) for the specified computing task; for example, the taskID generated by the controller for the distributed AI training task may be 1. Alternatively, the controller may directly determine the identifier of the specified computing task carried in the processing request as the task identifier.
  • correspondingly, when the controller sends the routing information corresponding to the specified computing task to each device, the task identifier may be carried in the routing information, so that each device can store, based on the task identifier, the routing information corresponding to different computing tasks.
  • for example, the routing information stored by the target switching device SW1 may be as shown in Table 1. As can be seen from Table 1, the routing information corresponding to the computing task whose taskID is 1 includes two IP addresses, IP1 and IP2, where IP1 may be the IP address of the intermediate switching device SW2 and IP2 is the IP address of the computing node V7; the routing information corresponding to the computing task whose taskID is 2 may include three IP addresses, IP3 to IP5.
Table 1

  taskID | Routing information | Merge processing type
  1      | IP1, IP2            | Calculate the weighted average
  2      | IP3, IP4, IP5       | Summation
  • it should also be noted that the processing request sent by the designated node to the controller may further include the merge processing type corresponding to the specified computing task. Therefore, the controller may further send the merge processing type corresponding to the specified computing task to the target switching device and each intermediate switching device, so that the target switching device and each intermediate switching device can merge the data reported by the multiple computing nodes according to that merge processing type.
  • since different computing tasks may correspond to different merge processing types, merging the received data according to the merge processing type corresponding to the specified computing task ensures the precision of data processing.
  • the merge processing type may include any one of calculating an average value, calculating a weighted average value, summing, calculating a maximum value, and calculating a minimum value.
  • in addition, the controller may send the merge processing type corresponding to the specified computing task together with the routing information corresponding to the specified computing task to the target switching device and each intermediate switching device, or the controller may send the merge processing type separately; this is not limited in this embodiment of the present invention.
  • for example, assuming that the merge processing type corresponding to the distributed AI training task whose taskID is 1 is calculating a weighted average, the controller may declare this merge processing type when sending the routing information corresponding to the distributed AI training task to the target switching device SW1 and the intermediate switching device SW2, so that each switching device can store the merge processing type corresponding to the distributed AI training task.
  • for example, referring to Table 1, the target switching device SW1 may store that the merge processing type corresponding to the computing task whose taskID is 1 is calculating a weighted average, and that the merge processing type corresponding to the computing task whose taskID is 2 is summation.
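  • purely as a hedged sketch of the merge processing types listed above (calculating an average, calculating a weighted average, summation, maximum, minimum), the following hypothetical routine shows how a switching device's parameter server might dispatch on the stored merge type; it assumes each report is a 1-D gradient vector accompanied by a weight, which the patent only requires for the weighted-average case.

```python
import numpy as np

# Hypothetical merge dispatch for one task. `items` is a list of
# (data, weight) pairs reported by the connected devices; names are
# illustrative only and each `data` is assumed to be a 1-D vector.
def merge(items, merge_type):
    data = np.stack([d for d, _ in items])
    weights = np.array([w for _, w in items], dtype=float)
    if merge_type == "average":
        return data.mean(axis=0)
    if merge_type == "weighted_average":
        return (data * weights[:, None]).sum(axis=0) / weights.sum()
    if merge_type == "sum":
        return data.sum(axis=0)
    if merge_type == "max":
        return data.max(axis=0)
    if merge_type == "min":
        return data.min(axis=0)
    raise ValueError(f"unknown merge type: {merge_type}")
```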
  • Step 108 The designated node sends routing information to each computing node.
  • the designated node may forward the routing information to each computing node, so that each computing node may report the data according to the received routing information after completing the data calculation.
  • as described above, the routing information sent by the controller to the designated node may include only the identifier of each computing node's directly connected device, where that directly connected device is the intermediate switching device or the target switching device that implements the parameter server function for the computing node. Therefore, the routing information sent by the designated node to each computing node may likewise include only the identifier of the switching device that is directly connected to that computing node and implements the parameter server function.
  • for example, the routing information sent by the designated node V1 to the computing node V2 may include only the IP address of the intermediate switching device SW2 directly connected to the computing node V2, and the routing information sent by the designated node V1 to the computing node V7 may include only the IP address of the target switching device SW1 directly connected to the computing node V7.
  • the process in which the designated node V1 sends routing information to each computing node may be as shown by the dotted line numbered 3 in FIG. 4.
  • Step 109 Each computing node performs data calculation according to an algorithm model corresponding to the specified computing task.
  • in this embodiment of the present invention, an algorithm model corresponding to the specified computing task is pre-stored in each computing node for performing the specified computing task, and after receiving the driving instruction sent by the designated node, each computing node can perform data calculation on the acquired input data according to the algorithm model.
  • for example, assume that the distributed AI training task is a training task for a DNN-based image recognition application.
  • the training task may include multiple iterations of the same set of calculations. During each iteration, multiple sample pictures may be input to each computing node, and each computing node may perform data calculation on the input sample pictures according to a pre-stored neural network model to obtain a gradient of the neural network model used by the image recognition application (that is, error correction data).
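  • as a hedged illustration of step 109 (the model and names below are hypothetical stand-ins, not the patent's algorithm), one iteration at a compute node could look like the following: the node runs its pre-stored model on its local batch of samples and produces a gradient plus a weight, here taken to be the batch size.

```python
import numpy as np

# Hypothetical single iteration at one compute node: a linear model stands in
# for the pre-stored neural network; in practice this would be a DNN.
def local_iteration(weights, samples, labels):
    preds = samples @ weights                      # forward pass
    error = preds - labels                         # prediction error
    gradient = samples.T @ error / len(samples)    # gradient of an MSE loss
    return gradient, len(samples)                  # weight = local batch size

rng = np.random.default_rng(0)
w = np.zeros(4)
x, y = rng.normal(size=(8, 4)), rng.normal(size=8)
grad, batch_weight = local_iteration(w, x, y)
# `grad` and `batch_weight` are what the node reports to its switch (step 110).
```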
  • Step 110 Each computing node reports data to a corresponding switching device.
  • further, after completing the data calculation, each computing node can report the calculated data to the corresponding switching device according to the received routing information.
  • for example, referring to the dotted line numbered 4 in FIG. 4, the computing node V1 and the computing node V2 may send the calculated gradients to the intermediate switching device SW2 according to the received routing information; referring to the dotted line numbered 5 in FIG. 4, the computing node V7 may directly send the calculated gradient to the target switching device SW1 according to the received routing information.
  • as can also be seen from FIG. 4, when the computing node V7 reports its gradient to the target switching device SW1, the data needs to be transparently transmitted through the switching device SW5, that is, the switching device SW5 only forwards the data without processing it.
  • Step 111 The intermediate switching device performs a merge process on the data reported by the at least two computing nodes to which the intermediate switching device is connected.
  • in this embodiment of the present invention, referring to FIG. 1C, a parameter server for performing merge processing on data may be configured in each switching device. After receiving the routing information sent by the controller, each intermediate switching device can configure and start a local parameter server instance, and after receiving the data reported by the at least two computing nodes connected to it, merge the received data based on that parameter server instance.
  • for example, after receiving the gradients reported by the computing node V1 and the computing node V2, the intermediate switching device SW2 may merge the gradients reported by the two computing nodes.
  • further, the controller may also send the merge processing type corresponding to the specified computing task to each intermediate switching device; accordingly, after receiving the data reported by the at least two computing nodes for performing the specified computing task, each intermediate switching device may merge the data reported by those computing nodes according to the merge processing type corresponding to the specified computing task.
  • for example, assume that when the controller 01 sends the routing information to the intermediate switching device SW2, it also declares that the merge processing type corresponding to the distributed AI training task is calculating a weighted average.
  • then, after receiving the gradients reported by the computing node V1 and the computing node V2, the intermediate switching device SW2 may calculate the weighted average of the gradients reported by the two computing nodes.
  • when reporting its gradient, each computing node may also report the weight corresponding to the gradient, so the intermediate switching device SW2 may calculate the weighted average of the gradients according to the weights reported by the computing nodes.
  • Step 112 The intermediate switching device sends the merged data to the target switching device.
  • after an intermediate switching device merges the data reported by the at least two computing nodes connected to it, it can send the merged data to the target switching device according to the received routing information.
  • because the intermediate switching device merges the data reported by the at least two computing nodes before sending it on, compared with a switching device that separately forwards the data reported by the two computing nodes, the intermediate switching device in the method provided by this embodiment only needs to report one stream of data to the target switching device, thereby effectively reducing the amount of data transmission in the data center network and reducing the probability of network congestion.
  • for example, referring to the dotted line numbered 6 in FIG. 4, the intermediate switching device SW2 may send the weighted average it calculated to the target switching device SW1 according to the IP address of the target switching device SW1 in the received routing information.
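  • the following sketch (illustrative only; all class, field, and callback names are hypothetical) shows how an intermediate switching device's parameter server instance might buffer the reports for one task until every source listed in its routing information has arrived, merge them with a weighted average, and forward a single stream upstream, as steps 111 and 112 describe.

```python
import numpy as np

# Hypothetical parameter server instance on an intermediate switching device
# (e.g. SW2). `expected_sources` and `upstream` come from the routing
# information sent by the controller.
class SwitchAggregator:
    def __init__(self, task_id, expected_sources, upstream):
        self.task_id = task_id
        self.expected = set(expected_sources)   # e.g. {"V1", "V2"} for SW2
        self.upstream = upstream                # e.g. "SW1" (target switch)
        self.buffer = {}                        # source -> (gradient, weight)

    def on_report(self, source, gradient, weight, send):
        """Buffer one report; merge and forward once every child has reported."""
        self.buffer[source] = (np.asarray(gradient, dtype=float), weight)
        if set(self.buffer) == self.expected:
            grads = np.stack([g for g, _ in self.buffer.values()])
            weights = np.array([w for _, w in self.buffer.values()], dtype=float)
            merged = (grads * weights[:, None]).sum(axis=0) / weights.sum()
            # One stream of data goes upstream instead of one per child.
            send(self.upstream, self.task_id, merged, weights.sum())
            self.buffer.clear()                 # ready for the next iteration
```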
  • Step 113 The target switching device performs a merge process on the received data.
  • in this embodiment of the present invention, referring to FIG. 1C, a parameter server for performing merge processing on data may be configured in the target switching device. After receiving the routing information sent by the controller, the target switching device may configure and start a local parameter server instance, and after receiving the data reported by the computing node and/or the intermediate switching device, merge the received data based on that parameter server instance.
  • for example, after receiving the gradient reported by the computing node V7 and the weighted average reported by the intermediate switching device SW2, the target switching device SW1 may merge the gradient and the weighted average.
  • further, the controller may also send the merge processing type corresponding to the specified computing task to the target switching device; accordingly, after the target switching device receives the data reported by the computing node and/or the intermediate switching device for the specified computing task, it may merge the received data according to the merge processing type corresponding to the specified computing task.
  • for example, assuming that the target switching device stores the correspondence shown in Table 1, after receiving the gradient reported by the computing node V7 whose IP address is IP2 and the weighted average reported by the intermediate switching device SW2 whose IP address is IP1, the target switching device SW1 may calculate a weighted average of that gradient and that weighted average. When reporting its gradient, the computing node V7 may also report the weight corresponding to the gradient, so the target switching device SW1 may calculate, according to the weight reported by the computing node V7, the weighted average of the gradient reported by the computing node V7 and the weighted average reported by the intermediate switching device SW2.
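  • a small numeric check with made-up values illustrates why this two-level merging is consistent: when SW2 forwards the weighted average of V1 and V2 together with their combined weight, the weighted average that SW1 computes over SW2's result and V7's gradient equals the weighted average taken directly over all three nodes.

```python
import numpy as np

g1, g2, g7 = np.array([1.0, 2.0]), np.array([3.0, 0.0]), np.array([0.0, 6.0])
w1, w2, w7 = 2.0, 1.0, 3.0                      # hypothetical weights

# Step 111: intermediate switch SW2 merges V1 and V2.
g12 = (w1 * g1 + w2 * g2) / (w1 + w2)
# Step 113: target switch SW1 merges SW2's result (weight w1 + w2) with V7.
merged = ((w1 + w2) * g12 + w7 * g7) / (w1 + w2 + w7)

flat = (w1 * g1 + w2 * g2 + w7 * g7) / (w1 + w2 + w7)
assert np.allclose(merged, flat)                # hierarchical == flat merge
```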
  • Step 114 The target switching device sends the merged data to each computing node.
  • the target switching device may separately send the merged data to each computing node according to the routing information, so that the respective computing nodes continue to execute the specified computing task according to the merged processed data.
  • the data forwarding path used when the target switching device sends the merged data to a first computing node may be the same as or different from the data forwarding path used when the first computing node reports its data; this is not limited in this embodiment of the present invention.
  • the first computing node can be any of the plurality of computing nodes.
  • optionally, since a distributed computing task generally includes multiple iterations, each computing node may send an acquisition request to the target switching device before the next iteration begins, and the target switching device may send the merged data to each computing node after receiving the acquisition request.
  • for example, referring to the dotted line numbered 7 in FIG. 4, the target switching device SW1 may send the calculated weighted average to the computing node V1, the computing node V2, and the computing node V7, respectively.
  • the target switching device SW1 can forward the weighted average to the computing node V1 and the computing node V2 respectively through the intermediate switching device SW2, and can forward the weighted average to the computing node V7 through the switching device SW5.
  • the computing node V1, the computing node V2, and the computing node V7 are configured to continue model training of the image recognition application based on the weighted average.
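  • to tie steps 109 to 114 together, here is a hedged sketch of a compute node's training loop under this scheme; `push`, `pull`, and `compute_gradient` are hypothetical stand-ins for the node's transport layer and its pre-stored algorithm model, and the `pull` call corresponds to the acquisition request sent before the next iteration.

```python
# Hypothetical training loop at one compute node (e.g. V7). `switch_ip` is the
# directly connected switch taken from the node's routing information
# (SW1 for V7); all callback names below are illustrative only.
def train(task_id, switch_ip, weights, batches, compute_gradient, push, pull,
          lr=0.01):
    for samples, labels in batches:
        grad, weight = compute_gradient(weights, samples, labels)  # step 109
        push(switch_ip, task_id, grad, weight)                     # step 110
        merged_grad = pull(switch_ip, task_id)                     # steps 111-114
        weights = weights - lr * merged_grad    # continue model training locally
    return weights
```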
  • the sequence of the steps of the data processing method provided by the embodiment of the present invention may be appropriately adjusted, and the steps may also be correspondingly increased or decreased according to the situation.
  • for example, step 104 may be omitted as appropriate; in that case, in step 105, the controller may directly determine the target switching device from the switching devices used to connect the plurality of computing nodes, and correspondingly, in step 106, the controller may determine, as an intermediate switching device, any switching device on the data forwarding path between the target switching device and the computing nodes that connects the target switching device and at least two computing nodes.
  • step 106 and step 111 may also be deleted according to the situation, that is, the controller may only determine one target switching device, and the target switching device performs a combining process on the data reported by each computing node.
  • alternatively, steps 1051 and 1052 may be omitted as appropriate, that is, after determining the topology between the computing nodes, the controller may directly determine the target switching device according to the sum of route hop counts; or steps 1053 and 1054 may be omitted as appropriate, that is, the controller may directly determine the target switching device based on the performance parameters of each switching device (or the degree of balance of the route hop counts). Any variation readily conceivable by a person skilled in the art within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention, and details are not described herein again.
  • in summary, the embodiment of the present invention provides a data processing method in which the processing request for a specified computing task sent by a designated node to the controller includes the identifiers of the multiple computing nodes for executing the specified computing task. After receiving the processing request, the controller may determine a target switching device from the switching devices used to connect the multiple computing nodes, and separately send, to the target switching device and the designated node, routing information indicating the data forwarding path between the multiple computing nodes and the target switching device, so that each computing node can report data to the target switching device according to the routing information, and the target switching device can merge the data reported by the multiple computing nodes according to the routing information before sending the result to each computing node.
  • because the target switching device can merge the data reported by the multiple computing nodes, the computing nodes no longer need to send data to a parameter server through switching devices, and the parameter server no longer needs to feed the merged result back to each computing node through switching devices. This effectively reduces the amount of data transmission in the data center network, reduces the probability of network congestion and the delay of data transmission, and improves the execution efficiency of computing tasks.
  • optionally, the method provided by the embodiment of the present invention may also be applied to an HPC data center network. Such a network may use the Message Passing Interface (MPI) as the programming interface for distributed information interaction, and may adopt the Scalable Hierarchical Aggregation and Reduction Protocol (SHARP) technology to offload some of the MPI collective operations (such as Reduction and Aggregation operations) to the switching devices, that is, the switching devices perform those operations. In other words, the data center network can support the SHARP technology and can also support the data processing method provided by this application.
  • specifically, each switching device in the data center network can implement the SHARP technology under the control of a management server, and can implement the data processing method provided by the embodiment of the present invention under the control of the controller.
  • the SHARP technology is limited to MPI collective operations, so the computing nodes need to use a specific MPI function library, and its application flexibility is low; moreover, since no root aggregation node is set for an MPI collective operation, the computational complexity for the management server to select the switching devices that perform the offloaded operations is high, so it is difficult to support a larger-scale data center network using only the SHARP technology.
  • in contrast, after the HPC data center network adopts the data processing method provided by the embodiment of the present invention, the switching devices can implement the function of the parameter server, so each switching device is no longer limited by MPI collective operations, which effectively improves the flexibility of data processing.
  • in addition, since the controller only needs to select the target switching device and the intermediate switching devices for implementing the parameter server function, the selection process has low computational complexity and can support a large-scale data center network.
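  • for contrast with the SHARP approach described above, the kind of MPI collective that SHARP offloads looks roughly like the following sketch, which assumes the mpi4py binding and is not part of the patent; every rank must link against a specific MPI library and call the collective, which is the restriction on flexibility noted above.

```python
# Hypothetical mpi4py example of a collective (an allreduce, i.e. a Reduction
# whose result is distributed back to every rank) of the kind SHARP offloads
# to switches. Run with e.g. `mpirun -n 4 python this_file.py`.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
local_grad = np.full(4, comm.Get_rank(), dtype="d")   # stand-in gradient
global_sum = np.empty_like(local_grad)
comm.Allreduce(local_grad, global_sum, op=MPI.SUM)     # collective operation
mean_grad = global_sum / comm.Get_size()
```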
  • FIG. 7 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention.
  • the apparatus may be applied to the controller 01 of the data center network shown in FIG. 1A.
  • the apparatus may include:
  • the receiving module 201 is configured to receive a processing request sent by the designated node for the specified computing task, where the processing request includes an identifier of the multiple computing nodes used to execute the specified computing task.
  • the determining module 202 is configured to determine a target switching device from the switching device used to connect the plurality of computing nodes.
  • the sending module 203 is configured to send routing information corresponding to the specified computing task to the target switching device and the designated node, where the routing information is used to indicate a data forwarding path between the multiple computing nodes and the target switching device.
  • for a specific implementation of the sending module 203, reference may be made to the detailed description in step 107 in the foregoing embodiment shown in FIG. 2, and details are not described herein again.
  • the routing information is used to send the merged data to each computing node according to the routing information after the target switching device performs the combining process on the data reported by the multiple computing nodes. That is, the target switching device may combine the data reported by the multiple computing nodes according to the routing information, and then send the data to each computing node.
  • the designated node may send the routing information to each computing node after receiving the routing information, and each computing node is configured to report data to the target switching device according to the routing information.
  • in addition, the functions of the receiving module 201 and the sending module 203 may be similar to those of the acceleration application 011 and the data channel 013 in the architecture shown in FIG. 1D; the function of the determining module 202 may be similar to that of the acceleration controller 012 in the architecture shown in FIG. 1D.
  • the data forwarding path between the multiple computing nodes and the target switching device may include at least one switching device
  • the determining module 202 is further configured to: determine, from the at least one switching device, at least one intermediate switching device, where each intermediate switching device is connected to at least two computing nodes.
  • the sending module 203 is further configured to send the routing information to each intermediate switching device, where each intermediate switching device is configured to combine the data reported by the at least two computing nodes connected by the intermediate switching device according to the routing information. Then sent to the target switching device.
  • the routing information may include: an identifier of each computing node, an identifier of the target switching device, and an identifier of the intermediate switching device.
  • FIG. 8 is a schematic structural diagram of a sending module according to an embodiment of the present invention.
  • the sending module 203 can include:
  • the first sending sub-module 2031 is configured to send, to the target switching device, an identifier of the directly connected device of the target switching device, where the directly connected device of the target switching device is a computing node or an intermediate switching device.
  • the second sending sub-module 2032 is configured to send, to the designated node, an identifier of the directly connected device of each computing node, where the directly connected device of each computing node is a target switching device or an intermediate switching device, and the designated node is used to The identifier of the directly connected device of the computing node is sent to the corresponding computing node.
  • the third sending sub-module 2033 is configured to send, to the intermediate switching device, the identifier of the directly connected device of the intermediate switching device, where the directly connected device of the intermediate switching device is a computing node, the target switching device, or other intermediate switching device.
  • FIG. 9 is a schematic structural diagram of a determining module according to an embodiment of the present invention.
  • the determining module 202 may include:
  • the calculation sub-module 2021 is configured to calculate a sum of route hops between each switching device and each computing node in the switching device used to connect the plurality of computing nodes.
  • for a specific implementation of the calculation sub-module 2021, reference may be made to the detailed description in step 1053 in the embodiment shown in FIG. 5, and details are not described herein again.
  • the first determining submodule 2022 is configured to determine, as the target switching device, the switching device that minimizes the sum of the routing hops.
  • for a specific implementation of the first determining sub-module 2022, reference may be made to the detailed description in steps 1054 to 1056 in the foregoing embodiment shown in FIG. 5, and details are not described herein again (a hedged sketch of this selection appears after the module list below).
  • the determining module 202 may further include:
  • the detecting sub-module 2023 is configured to implement the method shown in step 1051 in the embodiment shown in FIG. 5 above.
  • the second determining sub-module 2024 is configured to implement the method shown in step 1052 in the embodiment shown in FIG. 5 above.
  • correspondingly, the calculation sub-module 2021 can be used to implement the method shown in step 1053 in the embodiment shown in FIG. 5 above.
  • the determining module 202 is further configured to implement the method shown in step 103 to step 105 in the foregoing embodiment shown in FIG. 2 .
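  • as a hedged sketch of what the calculation sub-module 2021 and the first determining sub-module 2022 compute (the topology, candidate list, and load figures below are illustrative, not the patent's data), the controller can count, as in the FIG. 4 example, each link on the union of paths between a candidate switching device and the computing nodes once, keep the candidates with the smallest total, and break a tie with a performance parameter such as computing load.

```python
from collections import deque

# Hypothetical adjacency list for the FIG. 4 topology (each link is one hop).
topology = {
    "V1": ["SW2"], "V2": ["SW2"], "V7": ["SW5"],
    "SW2": ["V1", "V2", "SW1", "SW6"], "SW5": ["V7", "SW1", "SW6"],
    "SW1": ["SW2", "SW5"], "SW6": ["SW2", "SW5"],
}

def shortest_path(src, dst):
    """Breadth-first search returning one shortest path as a list of devices."""
    prev, queue, seen = {}, deque([src]), {src}
    while queue:
        node = queue.popleft()
        if node == dst:
            path = [node]
            while node != src:
                node = prev[node]
                path.append(node)
            return path[::-1]
        for nxt in topology[node]:
            if nxt not in seen:
                seen.add(nxt)
                prev[nxt] = node
                queue.append(nxt)
    raise ValueError("unreachable")

def hop_sum(switch, nodes):
    """Distinct one-hop links needed to connect `switch` to all compute nodes."""
    edges = set()
    for v in nodes:
        path = shortest_path(switch, v)
        edges |= {frozenset(pair) for pair in zip(path, path[1:])}
    return len(edges)

nodes = ["V1", "V2", "V7"]
candidates = ["SW1", "SW2", "SW6"]            # candidate switching devices
load = {"SW1": 0.2, "SW2": 0.5, "SW6": 0.4}   # hypothetical computing load
# Smallest hop-count sum first; a tie is broken by the lowest computing load.
target = min(candidates, key=lambda sw: (hop_sum(sw, nodes), load[sw]))
# Here all three candidates need 5 hops, so SW1 is chosen by its lower load.
```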
  • the processing request may further include: a merge processing type corresponding to the specified computing task;
  • the sending module 203 is further configured to send, to the target switching device, a merge processing type corresponding to the specified computing task, where the target switching device is configured to merge the data reported by the multiple computing nodes according to the merge processing type. deal with.
  • in summary, the embodiment of the present invention provides a data processing apparatus. The processing request received by the apparatus for a specified computing task includes the identifiers of the multiple computing nodes for executing the specified computing task. The apparatus may determine a target switching device from the switching devices used to connect the multiple computing nodes, and separately send, to the target switching device and the designated node, routing information indicating the data forwarding path between the multiple computing nodes and the target switching device, so that each computing node can report data to the target switching device according to the routing information, and the target switching device can merge the data reported by the multiple computing nodes according to the routing information before sending it to each computing node.
  • because the target switching device can merge the data reported by the multiple computing nodes, the computing nodes no longer need to send data to a parameter server through switching devices, and the parameter server no longer needs to feed the merged result back to each computing node through switching devices. This effectively reduces the amount of data transmission in the data center network, reduces the probability of network congestion and the delay of data transmission, and improves the execution efficiency of computing tasks.
  • FIG. 10 is a schematic structural diagram of another data processing apparatus according to an embodiment of the present disclosure.
  • the apparatus may be applied to the switching device 03 of the data center network shown in FIG. 1A.
  • the apparatus may include:
  • the receiving module 301 is configured to receive routing information corresponding to the specified computing task sent by the controller, where the routing information is used to indicate a data forwarding path between the plurality of computing nodes and the target switching device, where the multiple computing nodes are configured to perform the specified Calculation task.
  • the processing module 302 is configured to perform a merge process on the data reported by the multiple computing nodes.
  • for a specific implementation of the processing module 302, refer to the detailed description in step 111 or step 113 in the foregoing embodiment shown in FIG. 2, and details are not described herein again.
  • the sending module 303 is configured to send the merged data according to the routing information.
  • for a specific implementation of the sending module 303, refer to the detailed description in step 112 or step 114 in the foregoing embodiment shown in FIG. 2, and details are not described herein again.
  • the routing information is sent after the controller receives the processing request for the specified computing task sent by the designated node, and determines the target switching device from the switching device used to connect the multiple computing nodes.
  • the functions of the receiving module 301 and the transmitting module 303 can be similar to those of the switching function component 021 in the architecture shown in FIG. 1B; the functions of the processing module 302 can be similar to those of the network computing component 022 in the architecture shown in FIG. 1B.
  • the receiving module 301 is further configured to: before combining the data reported by the multiple computing nodes, receive a merge processing type corresponding to the specified computing task sent by the controller.
  • for a specific implementation, refer to the detailed description in step 107 in the foregoing embodiment shown in FIG. 2, and details are not described herein again.
  • the processing module 302 is configured to: perform merging processing on the data reported by the multiple computing nodes according to the merging processing type.
  • the sending module 303 can be used to implement the method shown in step 114 in the foregoing embodiment shown in FIG. 2 .
  • the processing module 302 can be used to implement the method shown in step 111 in the foregoing embodiment shown in FIG. 2.
  • the sending module 303 can be used to implement the method shown in step 112 in the embodiment shown in FIG. 2 above.
  • in addition, in this embodiment of the present invention, the data processing apparatus may further include a topology sensing module, where the topology sensing module is configured to acquire the identifiers of the other devices connected to the switching device and report them to the controller after the topology of the data center network becomes stable.
  • the role of the topology aware module can be similar to that of the network management component 023 in the architecture shown in FIG. 1B.
  • in summary, the embodiment of the present invention provides a data processing apparatus that can, according to the routing information corresponding to a specified computing task sent by the controller, merge the data reported by the multiple computing nodes for executing the specified computing task before sending it to each computing node.
  • therefore, each computing node no longer needs to send data to a parameter server through switching devices, and the parameter server no longer needs to feed the merged result back to each computing node through switching devices, thereby effectively reducing the amount of data transmission in the data center network, reducing the probability of network congestion and the delay of data transmission, and improving the execution efficiency of computing tasks.
  • FIG. 11 is a schematic structural diagram of a data processing apparatus 600 provided by an embodiment of the present application.
  • the data processing apparatus may be configured in the controller 01 shown in FIG. 1A .
  • the data processing apparatus 600 may include a processor 610, a communication interface 620, and a memory 630, where the communication interface 620 and the memory 630 are respectively connected to the processor 610 via a bus 640.
  • the processor 610 can be a central processing unit (CPU), and the processor 610 includes one or more processing cores.
  • the processor 610 executes various functional applications and data processing by running a computer program.
  • for a specific implementation of the processor 610, reference may be made to the detailed description in steps 103 to 106 in the embodiment shown in FIG. 2 and the detailed description in the embodiment shown in FIG. 5, and details are not described herein again.
  • the communication interface 620 can be used for the data processing device 600 to communicate with an external device, such as a display, a third-party device (for example, a storage device, a mobile terminal, a switching device, etc.).
  • the memory 630 may include, but is not limited to, a random access memory (RAM), a read only memory (ROM), an erasable programmable read only memory (EPROM), a flash memory, or an optical memory.
  • the memory 630 is responsible for information storage, for example, for storing computer programs for execution by the processor 610.
  • the data processing device 600 may further include an input/output (I/O) interface (not shown in FIG. 11).
  • the I/O interface is coupled to the processor 610, the communication interface 620, and the memory 630.
  • the I/O interface can be, for example, a universal serial bus (USB).
  • the embodiment of the present invention further provides a switching device.
  • the switching device 02 can include a switching chip 02a, a CPU 02b, and a memory 02c.
  • a computer program can be stored in the memory 02c, and the CPU 02b can implement the method shown in step 111 or step 113 in the embodiment shown in FIG. 2 by executing the computer program.
  • the specific implementation process is not described herein again.
  • the switch chip 02a can be used to implement the method shown in step 101, step 112 and step 114 in the embodiment shown in FIG. 2, and the specific implementation process is not described herein again.
  • the embodiment of the present invention further provides a data processing system.
  • the system may include: a controller 01, a plurality of computing nodes 02, and at least one switching device 03.
  • the controller 01 may include a data processing apparatus as shown in FIG. 7 or FIG. 11, and the data processing apparatus may include the sending module illustrated in FIG. 8 and the determining module illustrated in FIG. 9; alternatively, the controller 01 may be the controller shown in FIG. 1D.
  • Each switching device may include a data processing device as shown in FIG. 10, or each switching device may be a switching device as shown in FIG. 1B or FIG. 1C.
  • the embodiment of the present invention provides a computer readable storage medium, where the computer readable storage medium stores instructions that, when run on a computer, cause the computer to execute the data processing method provided by the foregoing method embodiments.
  • the embodiment of the invention further provides a computer program product comprising instructions, when the computer program product is run on a computer, causing the computer to execute the data processing method provided by the method embodiment.

Abstract

The present application provides a data processing method, apparatus, and system, relating to the field of distributed computing. After receiving a processing request sent by a designated node that carries the identifiers of multiple computing nodes for executing a specified computing task, a controller can determine a target switching device from the switching devices used to connect the multiple computing nodes, and separately send, to the target switching device and the designated node, routing information indicating the data forwarding path between the multiple computing nodes and the target switching device. The target switching device is configured to merge the data reported by the multiple computing nodes according to the routing information and then send the result to each computing node; the designated node is configured to send the routing information to each computing node, and each computing node can report data to the target switching device according to the routing information. The method provided by the present application can reduce the probability of network congestion and improve the execution efficiency of computing tasks.

Description

数据处理方法、装置及系统
本申请要求了2018年3月5日提交的,申请号为CN 201810178287.5发明名称为“数据处理方法、装置及系统”的中国申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及分布式计算领域,特别涉及一种数据处理方法、装置及系统。
背景技术
分布式机器学习一般采用数据并行的方式进行模型训练。在采用该数据并行的方式进行模型训练时,各个计算节点(也称为worker)中均存储有算法模型,且每个节点可以分别获取到部分样本数据,并能对获取到的样本数据进行训练得到模型参数。各个计算节点需要将计算得到的模型参数发送至参数服务器(parameter server,ps),该参数服务器用于对各个计算节点上报的模型参数进行汇聚更新,并将更新后的模型参数再发送至各个计算节点。
相关技术中,通常采用高性能计算(High Performance Computing,HPC)数据中心网络来实现分布式机器学习。具体的,可以选取一台服务器作为参数服务器,选取另一台服务器作为主节点,并可以选取多台其他服务器作为计算节点。其中,该主节点用于向该多个计算节点下发该参数服务器的网络地址,以及驱动该多个计算节点执行分布式机器学习任务。在该HPC数据中心网络中,该参数服务器与各个计算节点之间可以通过交换设备交互数据,以实现模型参数的上报,以及更新后的模型参数的下发。
但是,在该分布式机器学习的过程,数据中心网络中的数据传输量较大,可能会出现网络拥塞,导致计算节点与参数服务器之间的数据传输时延较大,影响分布式机器学习的效率。
发明内容
本申请提供了一种数据处理方法、装置及系统,可以解决相关技术中的数据中心网络在实现分布式计算时,网络中数据传输量较大,可能会出现网络拥塞,影响计算效率的问题。技术方案如下:
第一方面,提供了一种数据处理方法,应用于数据中心网络的控制器,该方法可以包括:
接收指定节点发送的针对指定计算任务的处理请求,该处理请求中包括用于执行该指定计算任务的多个计算节点的标识,之后控制器可以从用于连接该多个计算节点的交换设备中确定目标交换设备,并分别向目标交换设备以及指定节点发送指定计算任务对应的路由信息,该路由信息用于指示该多个计算节点与目标交换设备之间的数据转发路径。
其中,该路由信息用于在目标交换设备对多个计算节点上报的数据进行合并处理后根据该路由信息将该合并处理后的数据发送至每个计算节点。也即是,目标交换设备可以根据该路由信息对该多个计算节点上报的数据进行合并处理后发送至每个计算节点。此外,指定节点接收到路由信息后,可以将该路由信息发送至该多个计算节点中除该指定节点之外的每个计算节点,每个计算节点可以根据该路由信息向该目标交换设备上报数据。
本申请提供的方法,由于控制器可以选取目标交换设备对多个计算节点上报的数据进行合并处理,因此各计算节点无需再通过交换设备向参数服务器发送数据,参数服务器也无需再通过交换设备将合并处理后的结果反馈至各计算节点,有效减小了数据中心网络中的数据 传输量,降低了网络拥塞的概率以及数据传输的时延,提高了计算任务的执行算效率。
可选的,该多个计算节点与该目标交换设备之间的数据转发路径上可以包括至少一个交换设备,该方法还可以包括:
将该数据转发路径上包括的至少一个交换设备中,与该多个计算节点中的至少两个计算节点连接的交换设备确定为中间交换设备;并向中间交换设备发送路由信息,该路由信息用于该中间交换设备将与其连接的至少两个计算节点上报的数据进行合并处理后根据该路由信息将合并处理后的数据发送至该目标交换设备。
各计算节点在向目标交换设备上报数据的过程中,通过中间交换设备对至少两个计算节点上报的数据进行合并处理后再发出,相比于中间交换设备直接转发数据,可以进一步减小网络中的数据传输量,进而可以进一步降低网络拥塞的概率。
可选的,控制器分别向目标交换设备以及该指定节点发送指定计算任务对应的路由信息的过程可以包括:
向目标交换设备发送包括该目标交换设备的直连设备的标识的路由信息,该目标交换设备的直连设备为计算节点或者中间交换设备;
向指定节点发送包括每个计算节点的直连设备的标识的路由信息,每个计算节点的直连设备为目标交换设备或者中间交换设备,该指定节点用于将每个计算节点的直连设备的标识发送至对应的计算节点;
相应的,控制器向中间交换设备发送路由信息的过程可以包括:
向中间交换设备发送包括该中间交换设备的直连设备的标识的路由信息,中间交换设备的直连设备为计算节点、该目标交换设备或其他中间交换设备。
其中每个设备的标识可以为设备的IP地址。
控制器向每个设备发送的路由信息可以仅包括该设备的直连设备的标识,从而可以在保证数据正常转发的基础上,进一步降低路由信息的数据量,有效提高路由信息的传输效率。
可选的,控制器从用于连接该多个计算节点的交换设备中确定目标交换设备的过程可以包括:
分别计算用于连接该多个计算节点的交换设备中,每个交换设备与各个计算节点之间的路由跳数之和;将路由跳数之和最少的交换设备确定为目标交换设备。
在本发明实施例中,选取路由跳数之和最少的交换设备作为目标交换设备,可以保证选取出的目标交换设备与各个计算节点之间的总路径较短,可以有效降低网络中的数据传输量较少,进而可以降低网络拥塞的概率。
作为一种可选的实现方式,控制器将路由跳数之和最少的交换设备确定为目标交换设备的过程可以包括:
当路由跳数之和最少的交换设备包括多个时,分别确定每个路由跳数之和最少的交换设备的性能参数,该性能参数包括可用带宽、吞吐量、计算负载以及被选为目标交换设备的次数中的至少一种;将多个路由跳数之和最少的交换设备中,性能参数满足预设条件的交换设备确定为目标交换设备。
根据交换设备的性能参数选取目标交换设备,可以保证选取出的目标交换设备的性能较好,能够保证较高的计算效率。
作为另一种可选的实现方式,控制器将路由跳数之和最少的交换设备确定为目标交换设备的过程可以包括:
当路由跳数之和最少的交换设备包括多个时,分别确定每个路由跳数之和最少的交换设备与各个所述计算节点之间的路由跳数的均衡程度;将多个路由跳数之和最少的交换设备中,路由跳数的均衡程度最高的交换设备确定为目标交换设备。
根据路由跳数的均衡程度选取目标交换设备,可以保证选取出的目标交换设备与各个计算节点之间的路径长度较为均衡,进而可以保证各个计算节点上报数据时所需的时长较为接近,使得目标交换设备可以在较短的时间内接收到所有计算节点上报的数据,并进行合并处理,进一步提高了计算任务的执行效率。
可选的,控制器在计算每个交换设备与各个计算节点之间的路由跳数之和之前,还可以先检测该多个计算节点是否均直接连接至同一个交换设备;当该多个计算节点均直接连接至同一个交换设备时,控制器可以直接将该多个计算节点直接连接的交换设备确定为目标交换设备,而无需再计算交换设备与各个计算节点之间的路由跳数之和,可以提高目标交换设备的确定效率;当该多个计算节点直接连接至不同的交换设备时,控制器再计算每个交换设备与各个计算节点之间的路由跳数之和。
可选的,该方法还可以包括:
接收该数据中心网络中每个交换设备上报的拓扑信息;根据接收到的拓扑信息,确定该多个计算节点之间的拓扑结构;相应的,控制器在确定与该各个计算节点均具有连接关系的交换设备时,可以基于该拓扑结构确定。
可选的,控制器从与多个计算节点均具有连接关系的交换设备中确定目标交换设备时,还可以先从用于连接该多个计算节点的交换设备中确定至少一个备选交换设备,每个备选交换设备能够通过下行路径与至少两个计算节点连接;之后,控制器可以从该至少一个备选交换设备中确定该目标交换设备。
可选的,控制器在确定目标交换设备时,还可以先从用于连接该多个计算节点的交换设备中确定至少一个备选交换设备,其中每个备选交换设备可以通过下行路径与该多个计算节点中的至少两个计算节点连接;之后,控制器可以再从该至少一个备选交换设备中确定目标交换设备。
可选的,指定节点发送的处理请求还可以包括:该指定计算任务对应的合并处理类型;相应的,该方法还可以包括:
向目标交换设备发送该指定计算任务对应的合并处理类型,该目标交换设备用于按照该合并处理类型对该多个计算节点上报的数据进行合并处理。
由于不同的计算任务对应的合并处理类型可能不同,按照指定计算任务对应的合并处理类型对接收到的数据进行合并处理,可以保证数据处理的精度。
第二方面,提高了另一种数据处理方法,应用于数据中心网络的交换设备,该方法可以包括:接收控制器发送的指定计算任务对应的路由信息,该路由信息用于指示多个计算节点与目标交换设备之间的数据转发路径,该多个计算节点用于执行该指定计算任务;进一步的,交换设备可以对该多个计算节点上报的数据进行合并处理,并可以根据该路由信息,发送合并处理后的数据。其中,该路由信息为控制器接收到指定节点发送的针对该指定计算任务的处理请求后,从用于连接该多个计算节点的交换设备中确定目标交换设备后发送的。
本申请提供的方法,由于交换设备可以对多个计算节点上报的数据进行合并处理后再发出,因此各计算节点无需再通过交换设备向参数服务器发送数据,参数服务器也无需再通过 交换设备将合并处理后的结果反馈至各计算节点,有效减小了数据中心网络中的数据传输量,降低了网络拥塞的概率以及数据传输的时延,提高了计算任务的执行算效率。
可选的,交换设备在对该多个计算节点上报的数据进行合并处理之前,还可以接收该控制器发送的该指定计算任务对应的合并处理类型;相应的,交换设备对该多个计算节点上报的数据进行合并处理的过程可以包括:按照该合并处理类型,对该多个计算节点上报的数据进行合并处理。
可选的,该交换设备可以为目标交换设备;此时,该目标交换设备根据该路由信息,发送合并处理后的数据的过程可以包括:
根据该路由信息,向每个计算节点发送合并处理后的数据。
可选的,该交换设备可以为用于连接该目标交换设备和至少两个该计算节点的中间交换设备;此时,该中间交换设备对该多个计算节点上报的数据进行合并处理的过程可以包括:对至少两个该计算节点上报的数据进行合并处理;
相应的,中间交换设备根据该路由信息,发送合并处理后的数据的过程可以包括:根据该路由信息,向该目标交换设备发送合并处理后的数据。
第三方面,提供了一种数据处理装置,应用于数据中心网络的控制器,该装置可以包括至少一个模块,该至少一个模块用于实现上述第一方面所提供的数据处理方法。
第四方面,提供了一种数据处理装置,应用于数据中心网络的交换设备,该装置可以包括至少一个模块,该至少一个模块用于实现上述第二方面所提供的数据处理方法。
第五方面,提供了一种控制器,该控制器可以包括处理器、存储器以及通信接口;该存储器中存储有供该处理器运行的计算机程序,该处理器、存储器以及该通信接口可以用于实现上述第一方面所提供的数据处理方法。
第六方面,提供了一种交换设备,该交换设备包括交换芯片、处理器以及存储器,该交换芯片、处理器以及存储器可以用于实现上述第二方面所提供的数据处理方法。
第七方面,提供了一种数据处理系统,该系统可以包括:控制器、多个计算节点以及至少一个交换设备;该控制器可以包括第三方面所示的数据处理装置,或者可以为第五方面所示的控制器;每个交换设备可以包括第四方面所示的数据处理装置,或者可以为第七方面所示的交换设备。
第八方面,提供了一种计算机可读存储介质,该计算机可读存储介质中存储有指令,当该计算机可读存储介质在计算机上运行时,使得计算机执行上述第一方面或第二方面所提供的数据处理方法。
第九方面,提供了一种包含指令的计算机程序产品,当该计算机程序产品在计算机上运行时,可以使得计算机执行上述第一方面或第二方面所提供的数据处理方法。
综上所述,本申请提供了一种数据处理方法、装置及系统,指定节点向控制器发送的针对指定计算任务的处理请求中包括用于执行该指定计算任务的多个计算节点的标识,控制器接收到该处理请求后,可以从用于连接该多个计算节点的交换设备中确定目标交换设备,并分别向该目标交换设备以及该指定节点发送用于指示该多个计算节点与该目标交换设备之间的数据转发路径的路由信息,以使得每个计算节点可以根据该路由信息向该目标交换设备上报数据,目标交换设备可以根据该路由信息对该多个计算节点上报的数据进行合并处理后再发送至每个计算节点。因此通过本申请提供的方法,各计算节点无需再通过交换设备向参数服务器发送数据,参数服务器也无需再通过交换设备将合并处理后的结果反馈至各计算节点,有效减小了数据中心网络中的数据传输量,降低了网络拥塞的概率以及数据传输的时延,提高了计算任务的执行算效率。
附图说明
图1A是本发明实施例提供的数据处理方法所涉及的数据中心网络的架构图;
图1B是本发明实施例提供的一种交换设备的架构图;
图1C是本发明实施例提供的另一种交换设备的架构图;
图1D是本发明实施例提供的一种数据中心网络中的控制器的架构图;
图2是本发明实施例提供的一种数据处理方法的流程图;
图3是本发明实施例提供的另一种数据中心网络的架构图;
图4是本发明实施例提供的一种多个计算节点之间的拓扑结构示意图;
图5是本发明实施例提供的一种确定目标交换设备的方法流程图;
图6是本发明实施例提供的另一种多个计算节点之间的拓扑结构示意图;
图7是本发明实施例提供的一种数据处理装置的结构示意图;
图8是本发明实施例提供的一种发送模块的结构示意图;
图9是本发明实施例提供的一种确定模块的结构示意图;
图10是本发明实施例提供的另一种数据处理装置的结构示意图;
图11是本发明实施例提供的又一种数据处理装置的结构示意图。
具体实施方式
图1A是本发明实施例提供的数据处理方法所涉及的数据中心网络的架构图,如图1A所示,该数据中心网络可以包括控制器01、多个计算节点02以及用于连接该多个计算节点02的至少一个交换设备03。其中,该控制器01以及每个计算节点02均可以部署在服务器中,该交换设备03可以为具备数据转发以及数据处理功能的交换机。参考图1A可以看出,该控制器01与每个交换设备03均建立有通信连接,任意两个计算节点02之间可以通过交换设备03建立通信连接。
在本发明实施例中,该多个计算节点02可以用于实现分布式机器学习等分布式计算任务,例如该多个计算节点02可以实现基于深度神经网络(Deep Neural Networks,DNN)人工智能(Artificial Intelligence,AI)模型训练。在通过该多个计算节点02中的若干计算节点实现某个分布式计算任务时,可以在该若干计算节点中的每个计算节点中均部署该分布式计算任务的算法模型,并且可以在该若干计算节点中选取一个指定节点,或者也可以在 其他计算节点中选取一个指定节点。该指定节点中部署有用于驱动该分布式计算任务的驱动程序,该若干计算节点可以在该指定节点的驱动下,并行执行分布式计算任务。
随着计算硬件的快速发展,以张量处理单元(Tensor Processing Unit,TPU)和图形处理单元(Graphics Processing Unit,GPU)为核心的计算节点的计算性能得到了大幅度的提升,这使得各个计算节点执行分布式计算任务时的计算时间大幅缩短,因此对各计算节点与参数服务器之间的通信时间也提出了较高要求,一般需要将该通信时间限制在毫秒级时间周期内。
为了缩短各个计算节点与参数服务器之间的通信时间,本发明实施例提供的数据中心网络中,可以将参数服务器的功能卸载(offload)至交换设备中,即可以由该交换设备对各个计算节点上报的数据进行合并处理后,再反馈至各个计算节点,从而可以有效缩短数据的通信时间,提高分布式计算任务的执行效率。
图1B是本发明实施例提供的一种交换设备的架构图,如图1B所示,数据中心网络中的每个交换设备02可以包括交换功能组件021、网络计算组件022以及网络管理组件023。其中,交换功能组件021用于实现传统交换设备的数据转发功能;该网络计算组件022用于对多个计算节点02上报的数据进行合并处理;网络管理组件023用于感知网络拓扑、存储不同分布式计算任务所对应的路由信息,以及根据该路由信息指导交换功能组件021转发网络计算组件022合并处理后的数据。
图1C是本发明实施例提供的另一种交换设备的架构图,如图1C所示,该交换设备02的硬件部分主要可以包括交换芯片02a、中央处理器(Central Processing Unit,CPU)02b以及存储器02c。软件部分可以包括至少一个容器02d、部署在每个容器02d内的参数服务器02e以及网络管理组件023。该部署在容器02d内的参数服务器02e可以是指能够实现参数服务器功能的应用程序。其中,交换芯片02a可以为交换设备02的交换功能组件,用于实现二层报文或者三层报文的转发。CPU 02b以及该参数服务器02e可以为交换设备02的网络计算组件;该CPU可以为基于x86指令集(或者其他类型的指令集)的CPU,其具有较高的计算性能,用于提供常规虚拟化容器等软件的处理需求以及用于支撑参数服务器的数据计算功能;该参数服务器02e运行在CPU 02b上,具备分布式计算所需的数据合并处理(也可以称为汇聚处理)功能。
此外,参考图1C,该交换芯片02a与CPU 02b之间可以通过高速互连接口b1连接,该高速互连接口b1可以为网络适配器(Network Interface Card,NIC)接口,能够满足分布式计算数据传输的高带宽和低时延要求。其中,网络适配器一般也称为网卡。该高速互连接口b1的带宽速率可以为交换设备的对外接口a1的带宽速率的多倍,例如该高速互连接口b1的单向带宽速率可以大于40Gbps(吉比特每秒)。该高速互连接口b1可以有效降低多个计算节点或者交换设备同时向一个交换设备上报数据(也称为多打一)时出现导致网络拥塞的概率。
图1D是本发明实施例提供的一种数据中心网络中的控制器的架构图,如图1D所示,该控制器01可以为基于软件定义网络(Software-defined Network,SDN)架构的控制器,该SDN架构可以包括应用层、控制层和转发层。其中,控制器01的应用层包括分布式计算加速SDN应用(简称加速应用)011,控制层包括分布式计算加速SDN控制器(简称加速控制器)012,转发层包括分布式计算加速SDN数据通道(简称数据通道)013。
其中,该加速应用011主要用于通过网络服务接口(例如Restful接口)与指定节点交 互。例如,该加速应用可以接收指定节点发送的处理请求,并可以将控制器确定的路由信息(该路由信息中可以包括用于实现参数服务器功能的交换设备的标识)反馈给该指定节点。此外,该加速应用011还可以与加速控制器012交互,可以向该加速控制器012提供指定计算任务对应的计算节点的标识,以及合并处理类型等信息,并可以接收该加速控制器012反馈的路由信息。
该加速控制器012可以为控制器01中用于实现分布式计算加速的功能主体,该加速控制器012中保存了数据中心网络的物理拓扑,并可以根据该物理拓扑,确定用于加速指定分布式计算任务的路由信息。此外,加速控制器012还可以统一获取数据中心网络中各个交换设备的性能参数,该性能参数可以包括可用带宽、吞吐量和计算负载等。
该数据通道013可以为逻辑上的数据转发通道,构成了控制器与指定节点之间的数据转发路径,以及控制器与各交换设备之间的数据转发路径。
图2是本发明实施例提供的一种数据处理方法的流程图,该方法可以应用于图1A所示的数据中心网络中,参考图2,该方法可以包括:
步骤101、每个交换设备向控制器上报拓扑信息。
其中,每个交换设备上报的拓扑信息可以包括该交换设备的标识,以及该交换设备所连接的设备(例如计算节点或者其他交换设备)的标识,该设备的标识可以为设备的网络地址,例如互联网协议(Internet Protocol,IP)地址。在本发明实施例中,数据中心网络中的每个交换设备均具备拓扑感知功能,每个交换设备可以在数据中心网络的拓扑稳定后,获取其所连接的设备的标识并上报至控制器。图2中仅示出了从该数据中心网络所包括的多个交换设备中选取出的目标交换设备和中间交换设备,实际上该数据中心网络中的每个交换设备均可以向控制器上报拓扑信息。控制器获取到各个交换设备的上报的拓扑信息后,即可确定出数据中心网络的整体拓扑结构。
示例的,图3是本发明实施例提供的另一种数据中心网络的架构图,如图3所示,假设该数据中心网络中包括控制器01,计算节点V1至计算节点V8,以及交换设备SW1至SW6。其中,交换设备SW2所连接的设备包括:计算节点V1、计算节点V2、交换设备SW1以及交换设备SW6,则该交换设备SW2可以通过其网络管理组件获取到拓扑信息,并通过其交换功能组件向控制器01上报拓扑信息。该交换设备SW2向控制器01上报的拓扑信息中可以包括交换设备SW2的IP地址、计算节点V1的IP地址、计算节点V2的IP地址、交换设备SW1的IP地址以及交换设备SW6的IP地址。
控制器01根据交换设备SW1至交换设备SW6上报的拓扑信息,可以确定该数据中心网络的拓扑结构为二层的叶脊(leaf-spine)拓扑结构。其中,交换设备SW2至交换设备SW5为叶(leaf)交换设备(即第一层交换设备),交换设备SW1和交换设备SW6为脊(spine)交换设备(即第二层交换设备),并且每个leaf交换设备连接有两个计算节点。
步骤102、指定节点向控制器发送针对指定计算任务的处理请求。
该处理请求可以包括用于执行该指定计算任务的多个计算节点的标识,该指定节点为预先从数据中心网络所包括的计算节点中选定的用于驱动执行指定计算任务的计算节点。该指定节点中可以部署有用于驱动该多个计算节点执行该指定计算任务的分布式驱动程序。一个具体的实施例中,该指定节点以及用于执行该指定计算任务的多个计算节点均为开发人员预先选定的,并且该指定节点可以为该多个计算节点中的一个节点,或者,该指定节点也可以 为单独设置的一个计算节点,本发明实施例对此不做限定。
示例的,假设该指定计算任务为分布式AI训练任务,用于执行该分布式AI训练任务的计算节点包括计算节点V1、计算节点V2和计算节点V7,且指定节点为计算节点V1。参考图3中编号为1的虚线,该计算节点V1可以通过控制器01提供的接口(例如Restful接口)向控制器01发送处理请求,该处理请求中可以包括该分布式AI训练任务的标识,以及计算节点列表,该计算节点列表中记录有计算节点V1的IP地址、计算节点V2的IP地址及计算节点V7的IP地址。
步骤103、控制器根据接收到的拓扑信息,确定多个计算节点之间的拓扑结构。
控制器01接收到指定节点发送的处理请求后,可以根据预先确定的数据中心网络的拓扑结构,确定用于执行该指定计算任务的多个计算节点之间的拓扑结构。该多个计算节点之间的拓扑结构可以包括该多个计算节点,以及用于连接该多个计算节点的交换设备。
一般的,对于leaf-spine拓扑结构的数据中心网络,考虑到spine交换设备所具备的负载均衡的特性,即每个spine交换设备均可以与数据中心网络中的所有leaf交换设备连接,因此可以将数据中心网络中的所有spine交换设备均纳入至该多个计算节点之间的拓扑结构中。
示例的,控制器01可以根据预先确定的数据中心网络的拓扑结构,以及该计算节点V1、计算节点V2和计算节点V7中每个计算节点的IP地址,确定该三个计算节点之间的拓扑结构。该三个计算节点之间的拓扑结构可以如图4所示。从图4可以看出,该拓扑结构包括该三个计算节点,以及用于连接该三个计算节点的所有交换设备。用于连接该三个计算节点的交换设备的集合为{SW2,SW5,SW1,SW6}。
步骤104、控制器基于该拓扑结构,从用于连接该多个计算节点的交换设备中确定至少一个备选交换设备。
其中,每个备选交换设备可以为能够通过下行路径与该多个计算节点中的至少两个计算节点连接的交换设备,且该下行路径上可以包括其他交换设备,也可以不包括其他交换设备。
示例的,如图4所示,假设用于连接计算节点V1、计算节点V2和计算节点V7的交换设备的集合为{SW2,SW5,SW1,SW6}。由于其中spine交换设备SW1可以通过下行路径分别与计算节点V1、计算节点V2和计算节点V7连通,spine交换设备SW6也可以通过下行路径分别与计算节点V1、计算节点V2和计算节点V7连通;leaf交换设备SW2通过下行路径能够与计算节点V1和计算节点V2连通,而leaf交换设备SW5通过下行路径只能与计算节点V7连通,因此控制器01可以将交换设备SW1、交换设备SW2以及交换设备SW6确定为备选交换设备。
由于本发明实施例提供的方法中,每个交换设备均具备对数据进行合并处理的能力,而备选交换设备可以通过下行路径与至少两个计算节点连接,因此能够对其所连接的至少两个计算节点上报的数据进行合并处理后再发出,从该备选交换设备中确定目标交换设备以及中间交换设备,可以保证进行分布式计算时,网络中的数据传输量较小。
步骤105、控制器从该至少一个备选交换设备中确定目标交换设备。
图5是本发明实施例提供的一种确定目标交换设备的方法流程图,参考图5,该确定目标交换设备的过程可以包括:
步骤1051、检测多个计算节点是否均直接连接至同一个备选交换设备。
当该多个计算节点均直接连接至同一个备选交换设备时,控制器可以执行步骤1052;当 该多个计算节点直接连接至不同的备选交换设备时,控制器可以执行步骤1053。
示例的,参考图3,假设用于执行某个指定计算任务的计算节点包括计算节点V3和计算节点V4,则由于该两个计算节点均直接连接至备选交换设备SW3,因此控制器01可以执行步骤1052。
或者,如图4所示,假设用于执行某个指定计算任务的计算节点包括计算节点V1、计算节点V2和计算节点V7,由于计算节点V1和计算节点V2均直接连接至备选交换设备SW2,而计算节点V7通过交换设备SW5连接至备选交换设备SW1,该三个计算节点直接连接的备选交换设备不为同一个备选交换设备,因此控制器01可以执行步骤1053。
步骤1052、将该多个计算节点直接连接的备选交换设备确定为目标交换设备。
当该多个计算节点均直接连接至同一个备选交换设备时,控制器可以直接将该多个计算节点直接连接的备选交换设备确定为目标交换设备。参考图3,如上述计算节点包括计算节点V3和计算节点V4示例中,控制器01可以直接将备选交换设备SW3确定为目标交换设备。
在本发明实施例中,该多个计算节点在执行指定计算任务时,可以将计算得到的数据上报至目标交换设备,目标交换设备可以对该各个计算节点上报的数据进行合并处理,并将合并处理后的数据再分别发送至每个计算节点。由于该目标交换设备即可实现参数服务器的功能,因此该各个计算节点无需再通过交换设备向参数服务器上报数据,参数服务器也无需再通过交换设备下发合并处理后的数据,因此有效减少了数据中心网络中的数据传输量,降低了网络拥塞的概率和数据传输的时延,进而可以有效提高指定计算任务的执行算效率。
步骤1053、计算每个备选交换设备与各个计算节点之间的路由跳数之和。
当该多个计算节点直接连接至不同的备选交换设备时,控制器可以基于该多个计算节点之间的拓扑结构,计算每个备选交换设备与各个计算节点之间的路由跳数之和,并将路由跳数之和最少的备选交换设备确定目标交换设备。
在本发明实施例中,在统计第一备选交换设备与各个计算节点之间的路由跳数之和时,控制器可以将该第一备选交换设备与各个计算节点之间的拓扑结构中,每相邻两个设备之间的路径记为一跳。
示例的,对于图4所示的拓扑结构,由于备选交换设备SW1与该三个计算节点之间的拓扑结构中,计算节点V1与备选交换设备SW2之间的路径可以记为一跳,计算节点V2与备选交换设备SW2之间的路径可以记为一跳,备选交换设备SW2与备选交换设备SW1之间的路径可以记为一跳,计算节点V7与交换设备SW5之间的路径记为一跳,交换设备SW5与备选交换设备SW1之间的路径可以记为一跳,因此控制器01可以确定该备选交换设备SW1与该三个计算节点之间的路由跳数之和为5。同样的,控制器01可以确定备选交换设备SW6与该三个计算节点之间的路由跳数之和也为5,备选交换设备SW2与该三个计算节点之间的路由跳数之和也为5。
步骤1054、当路由跳数之和最少的备选交换设备包括一个时,将该路由跳数之和最少的备选交换设备确定为目标交换设备。
若控制器能确定出一个路由跳数之和最少的备选交换设备,则可以直接将该路由跳数之和最少的备选交换设备确定为目标交换设备。由于在本发明实施例提供的方法中,每个交换设备均具备对数据进行合并处理的能力,当某个交换设备接收到多个其他设备(例如计算节点或交换设备)上报的数据后,可以将接收到的数据进行合并处理后再发送至下一跳交换设备,因此在各个计算节点每次上报数据的过程中,每相邻两个设备之间的一跳路径可以仅用 于传输一份数据。
根据上述分析可知,通过路由跳数之和能够直观反映出数据传输过程中,每个备选交换设备与各计算节点之间的数据传输量,选择路由跳数之和最少的备选交换设备作为目标交换设备,使得目标交换设备与各个计算节点之间数据传输量均较少,可以有效降低数据传输时延以及网络拥塞的概率,进而可以有效提高计算任务的执行效率。
步骤1055、当路由跳数之和最少的备选交换设备包括多个时,分别确定每个路由跳数之和最少的备选交换设备的性能参数。
若路由跳数之和最少的备选交换设备包括多个,控制器则可以根据各个路由跳数之和最少的备选交换设备的性能参数,确定该目标交换设备。其中,每个路由跳数之和最少的备选交换设备的性能参数可以包括可用带宽、计算负载、吞吐量以及被选为目标交换设备的次数中的至少一种。其中,计算负载可以是指交换设备对数据进行合并处理时的负载。在本发明实施例中,控制器可以实时或者周期性的获取数据中心网络中每个交换设备的性能参数,例如,该控制器01可以通过其加速控制器012周期性的获取每个交换设备的性能参数。
步骤1056、将多个路由跳数之和最少的备选交换设备中,性能参数满足预设条件的备选交换设备确定为目标交换设备。
具体的实施例中,根据控制器所获取到的性能参数所包括的参数类型的不同,该预设条件也有所不同。例如:当该性能参数包括可用带宽时,该预设条件可以为:交换设备的可用带宽最高;当该性能参数包括吞吐量时,该预设条件可以为:交换设备的吞吐量最低;当该性能参数包括计算负载时,该预设条件可以为:交换设备的计算负载最低;当该性能参数包括被选为目标交换设备的次数时,该预设条件可以为:被选为目标交换设备的次数最少。此外,若该性能参数包括多种类型的参数,控制器可以根据预设的参数优先级,以优先级较高的参数为基准依次进行判断。
例如,假设该预设的参数优先级为:可用带宽、计算负载、吞吐量和被选为目标交换设备的次数,则控制器在确定目标交换设备时,可以先对比各个备选交换设备的可用带宽,并选择可用带宽最高的备选交换设备作为目标交换设备;若可用带宽最高的备选交换设备包括多个,则控制器可以继续对比该多个可用带宽最高的备选交换设备的计算负载,若该多个可用带宽最高的备选交换设备中,计算负载最低的备选交换设备包括多个,则控制器可以继续对比各个备选交换设备的吞吐量,直至确定出满足该预设条件的目标交换设备。此外,若控制器通过上述判断过程,确定出的性能参数满足该预设条件的备选交换设备包括多个,则控制器可以从该性能参数满足预设条件的多个备选交换设备中任意确定一个备选交换设备作为该目标交换设备。
示例的,由于在图4所示的拓扑结构中,备选交换设备SW1、备选交换设备SW2和备选交换设备SW6对应的路由跳数之和均为5,则控制器可以对比该三个备选交换设备的性能参数。假设控制器获取到的性能参数为计算负载,且备选交换设备SW1的计算负载最低,则控制器可以将该备选交换设备SW1确定为目标交换设备。
在本发明实施例中,通过路由跳数之和,以及交换设备的性能参数选取目标交换设备,可以保证选取出的目标交换设备与各个计算节点之间的数据传输量较少,且目标交换设备的性能较好,能够保证较高的计算效率。
在本发明实施例中,控制器除了可以基于性能参数确定目标交换设备,还可以分别确定每个路由跳数之和最少的备选交换设备与各个计算节点之间的路由跳数的均衡程度,并将路 由跳数的均衡程度最高的备选交换设备确定为目标交换设备。当然,控制器也可以在上述步骤1056中,确定出多个性能参数均满足该预设条件的备选交换设备后,再基于该路由跳数的均衡程度确定目标交换设备。由于目标交换设备需要获取到用于执行指定计算任务的所有计算节点上报的数据后,才能进行合并处理,因此选取路由跳数的均衡程度最高的交换设备作为目标交换设备,可以保证该各个计算节点上报数据时所需的时长较为接近,使得目标交换设备可以在较短的时间内接收到所有计算节点上报的数据,并进行合并处理,降低了目标交换设备的等待时长,进一步提高了计算任务的执行效率。
其中,路由跳数的均衡程度可以由路由跳数的方差、均方差或平均差等参数确定。且该均衡程度的高低与上述任一参数的参数值大小负相关,即参数值越小表明均衡程度越高。示例的,对于多个路由跳数之和最少的备选交换设备中的每个备选交换设备,控制器可以分别统计该备选交换设备与每个计算节点之间的路由跳数,并计算该备选交换设备与各个计算节点之间的路由跳数的方差,之后可以将方差最小的备选交换设备确定为目标交换设备。
需要说明的是,在本发明实施例中,当控制器在上述步骤104中确定出多个备选交换设备时,还可以将该多个备选交换设备中的任意一个确定为目标交换设备。或者,控制器还可以进一步从该多个备选交换设备中,确定出能够通过下行路径与该多个计算节点均连接的候选交换设备,进而再从该候选交换设备中确定目标交换设备。例如,对于图4所示的拓扑结构,备选交换设备为SW1、SW2和SW6,由于其中备选交换设备SW1和SW6能够通过下行路径与该三个计算节点建立连接,因此控制器可以将该备选交换设备SW1和SW6作为候选交换设备,并从该两个候选交换设备中确定目标交换设备。
步骤106、控制器将除目标交换设备之外的备选交换设备中,用于连接该目标交换设备与至少两个计算节点的备选交换设备确定为中间交换设备。
在本发明实施例中,若控制器在上述步骤104中确定出了多个备选交换设备,则在确定出目标交换设备之后,还可以将剩余的备选交换设备中,用于连接该目标交换设备与该多个计算节点中的至少两个计算节点的备选交换设备确定为中间交换设备。由于在执行指定计算任务的过程中,该中间交换设备可以将其所连接的至少两个计算节点上报的数据进行合并处理后发送至目标交换设备,由此可以进一步减小网络中的数据传输量。
示例的,参考图4,假设备选交换设备为SW1、SW2和SW6,则当控制器01确定目标交换设备为SW1之后,由于剩余的两个备选交换设备中,备选交换设备SW2能够连接该目标交换设备SW1和两个计算节点(V1和V2),因此可以将该备选交换设备SW2确定为中间交换设备。
或者,参考图6,假设用于执行指定计算任务的计算节点包括计算节点V1、V2、V3和V7,根据图6所示的拓扑结构可知,能够通过下行路径与至少两个计算节点连接的备选交换设备包括:SW21、SW23、SW1和SW6。若最终确定的目标交换设备为SW1,剩余的三个备选交换设备中,由于备选交换设备SW21和SW23,能够连接该目标交换设备SW1和两个计算节点(V1和V2),因此控制器01可以将该备选交换设备SW21和SW23均确定为中间交换设备。
步骤107、控制器分别向该目标交换设备、中间交换设备以及该指定节点发送路由信息。
该路由信息可以用于指示该多个计算节点与该目标交换设备之间的数据转发路径。例如,该路由信息中可以包括该多个计算节点的标识以及该目标交换设备的标识。若该多个计算节点与目标交换设备之间的数据转发路径上还包括中间交换设备,则该路由信息中还可以包括该中间交换设备的标识。
在本发明实施例中,为了降低路由信息的数据量,提高路由信息的发送效率,控制器向 每个设备发送的该指定计算任务对应的路由信息中可以仅包括该设备在该数据转发路径中的直连设备的标识。其中,直连设备可以包括该多个计算节点、中间交换设备和目标交换设备,而对于该数据转发路径上未被选取为中间交换设备的其他交换设备,则不在该路由信息的统计范围内。
例如,该控制器向该目标交换设备发送的路由信息可以仅包括该目标交换设备的直连设备的标识,该目标交换设备的直连设备可以为计算节点或者中间交换设备。控制器向该指定节点发送的路由信息可以仅包括每个计算节点直接连接的交换设备的标识,即该控制器向指定节点发送的路由信息可以包括参数服务器列表,该参数服务器列表中记录有用于实现参数服务器功能的交换设备的标识;该指定节点用于将每个计算节点直接连接的中间交换设备或者目标交换设备的标识发送至对应的计算节点。控制器向每个中间交换设备发送的路由信息可以包括该中间交换设备的直连设备的标识,每个中间交换设备的直连设备为计算节点、该目标交换设备或者其他中间交换设备。
示例的,结合图4,假设用于执行分布式AI训练任务的计算节点为计算节点V1、V2和V7,目标交换设备为SW1,中间交换设备为SW2。其中,目标交换设备SW1的直连设备为中间交换设备SW2和计算节点V7;中间交换设备SW2的直连设备为目标交换设备SW1,以及计算节点V1和V2;计算节点V1和V2的直连设备均为中间交换设备SW2,计算节点V7的直连设备为目标交换设备SW1。则控制器01向该目标交换设备SW1发送的路由信息可以仅包括中间交换设备SW2的IP地址,以及计算节点V7的IP地址;控制器01向中间交换设备SW2发送的路由信息可以包括目标交换设备SW1的IP地址,计算节点V1的IP地址,以及计算节点V2的IP地址。控制器01向指定节点V1发送的路由信息可以包括中间交换设备SW2的IP地址,以及目标交换设备SW1的IP地址。该控制器01向各个设备发送路由信息的过程可以如图3中编号为2的虚线所示。
需要说明的是,在上述步骤102中,控制器接收到指定节点发送的针对指定计算任务的处理请求后,还可以为该指定计算任务生成一个任务标识(taskID),例如,控制器为分布式AI训练任务生成的taskID可以为1;或者,该控制器也可以直接将该处理请求中携带的该指定计算任务的标识确定为该任务标识。
相应的,控制器向各个设备发送该指定计算任务对应的路由信息时,还可以在该路由信息中携带该任务标识,以便各个设备可以基于该任务标识,存储不同计算任务所对应的路由信息。
示例的,目标交换设备SW1存储的路由信息可以如表1所示,从表1可以看出,taskID为1的计算任务所对应的路由信息中包括IP1和IP2共两个IP地址,其中IP1可以为中间交换设备SW2的IP地址,IP2为计算节点V7的IP地址;taskID为2的计算任务所对应的路由信息中则可以包括IP3至IP5共三个IP地址。
表1
taskID 路由信息 合并处理类型
1 IP1、IP2 计算加权平均值
2 IP3、IP4、IP5 求和
还需要说明的是,在上述步骤102中,指定节点向控制器发送的处理请求中还可以包括 该指定计算任务对应的合并处理类型。因此,控制器还可以向该目标交换设备以及每个中间交换设备发送该指定计算任务对应的合并处理类型,以便该目标交换设备和每个中间交换设备可以按照该合并处理类型,对多个计算节点上报的数据进行合并处理。由于不同的计算任务对应的合并处理类型可能不同,按照指定计算任务对应的合并处理类型对接收到的数据进行合并处理,可以保证数据处理的精度。
示例性的,该合并处理类型可以包括:计算平均值、计算加权平均值、求和、计算最大值和计算最小值中的任一种。
此外,控制器可以在向该目标交换设备以及每个中间交换设备发送该指定计算任务对应的路由信息的同时,发送该指定计算任务对应的合并处理类型;或者,控制器也可以单独发送该指定计算任务对应的合并处理类型,本发明实施例对此不做限定。
示例的,假设该taskID为1的分布式AI训练任务对应的合并处理类型为计算加权平均值,则该控制器向目标交换设备SW1,以及中间交换设备SW2发送分布式AI训练任务对应的路由信息时,可以在该路由信息中声明该分布式AI训练任务对应的合并处理类型为计算加权平均值,以便各个交换设备可以存储该分布式AI训练任务对应的合并处理类型。例如,参考表1,目标交换设备SW1可以存储taskID为1的计算任务对应的合并处理类型为计算加权平均值;taskID为2的计算任务对应的合并处理类型为求和。
步骤108、指定节点向每个计算节点发送路由信息。
指定节点接收到控制器发送的该指定计算任务对应的路由信息后,可以将该路由信息转发至各个计算节点,以便各个计算节点在完成数据计算后,可以根据接收到的路由信息上报数据。
进一步的,由于控制器向该指定节点发送的路由信息中可以仅包括每个计算节点的直连设备的标识,其中每个计算节点之间连接的设备为用于实现参数服务器功能的中间交换设备或目标交换设备。因此该指定节点向每个计算节点发送的路由信息中,也可以仅包括该计算节点所直接连接的用于实现参数服务器功能的交换设备的标识。
示例的,该指定节点V1向计算节点V2发送的路由信息中可以仅包括该计算节点V2所直接连接的中间交换设备SW2的IP地址,该指定节点V1向计算节点V7发送的路由信息中可以仅包括该计算节点V7所直接连接的目标交换设备SW1的IP地址。该指定节点V1向各个计算节点发送路由信息的过程可以如图4中编号为3的虚线所示。
步骤109、每个计算节点根据该指定计算任务对应的算法模型进行数据计算。
在本发明实施例中,用于执行该指定计算任务的每个计算节点中预先存储有该指定计算任务对应的算法模型,每个计算节点在接收到该指定节点下发的驱动指令后,即可根据该算法模型对获取到的输入数据进行数据计算。
示例的,假设该分布式AI训练任务为基于DNN的图像识别应用的训练任务。该训练任务可以包括多个相同计算集合的迭代,每次迭代过程中,可以分别向每个计算节点输入多张样本图片,每个计算节点可以根据预先存储的神经网络模型对输入的多张样本图片进行数据计算,得到图像识别应用所使用的神经网络模型的梯度(即误差修正数据)。
步骤110、每个计算节点向对应的交换设备上报数据。
进一步的,每个计算节点完成数据计算后,即可根据接收到的路由信息,向对应的交换设备上报计算得到的数据。
示例的,参考图4中编号为4的虚线,计算节点V1和计算节点V2可以根据接收到的路 由信息,将计算得到的梯度发送至中间交换设备SW2;参考图4中编号为5的虚线,计算节点V7可以根据接收到的路由信息,将计算得到的梯度直接发送至目标交换设备SW1。参考图4还可以看出,该计算节点V7向目标交换设备SW1上报梯度时,需要通过交换设备SW5进行数据的透传,即该交换设备SW5仅转发数据,而不会对数据进行处理。
步骤111、中间交换设备对其所连接的至少两个计算节点上报的数据进行合并处理。
在本发明实施例中,参考图1C,每个交换设备中可以配置有用于对数据进行合并处理的参数服务器。每个中间交换设备在接收到控制器发送的路由信息后,可以配置和启动本地参数服务器实例,并可以在接收到其所连接的至少两个计算节点上报的数据后,基于该参数服务器实例,对接收到的数据进行合并处理。
示例的,中间交换设备SW2接收到计算节点V1和计算节点V2上报的梯度后,可以对该两个计算节点上报的梯度进行合并处理。
进一步的,由于控制器还可以向每个中间交换设备发送该指定计算任务对应的合并处理类型;因此相应的,每个中间交换设备在接收到用于执行该指定计算任务的至少两个计算节点上报的数据后,还可以按照该指定计算任务对应的合并处理类型,对该至少两个计算节点上报的数据进行合并处理。
示例的,假设控制器01向中间交换设备SW2发送路由信息时,还声明了分布式AI训练任务对应的合并处理类型为计算加权平均值。则中间交换设备SW2接收到计算节点V1和计算节点V2上报的梯度后,可以计算该两个计算节点上报的梯度的加权平均值。其中,每个计算节点上报梯度时,还可以上报该梯度所对应的权重,因此中间交换设备SW2可以根据各个计算节点上报的权重,计算梯度的加权平均值。
步骤112、中间交换设备向目标交换设备发送合并处理后的数据。
每个中间交换设备对其所连接的至少两个计算节点上报的数据进行合并处理后,即可根据接收到的路由信息,向目标交换设备发送合并处理后的数据。由于该中间交换设备能够对至少两个计算节点上报的数据进行合并处理后再发出,相比于交换设备分别转发两个计算节点上报的数据,本发明实施例提供的方法中,中间交换设备仅需向目标交换设备上报一路数据,从而能够有效减少数据中心网络中的数据传输量,降低网络拥塞的概率。
示例的,参考图4中编号为6的虚线,中间交换设备SW2可以根据接收到的路由信息中,目标交换设备SW1的IP地址,将其计算得到的加权平均值发送至该目标交换设备SW1。
步骤113、目标交换设备对接收到的数据进行合并处理。
在本发明实施例中,参考图1C,目标交换设备中可以配置有用于对数据进行合并处理的参数服务器。目标交换设备在接收到控制器发送的路由信息后,可以配置和启动本地参数服务器实例,并可以在接收到计算节点和/或中间交换设备上报的数据后,基于该参数服务器实例,对接收到的数据进行合并处理。
示例的,目标交换设备SW2接收到计算节点V7上报的梯度,以及中间交换设备SW2上报的加权平均值之后,可以对该梯度以及加权平均值进行合并处理。
进一步的,由于控制器还可以向目标交换设备发送该指定计算任务对应的合并处理类型;因此相应的,目标交换设备接收到该指定计算任务对应的计算节点和/或中间交换设备上报的数据后,还可以按照该指定计算任务对应的合并处理类型,对接收到的数据进行合并处理。
示例的,假设目标交换设备存储有表1所示的对应关系,则目标交换设备SW1接收到IP地址为IP2的计算节点V7上报的梯度,以及IP地址为IP1的中间交换设备SW2上报的加权 平均值之后,可以计算该梯度和该加权平均值的加权平均值。其中,计算节点V7上报梯度时,还可以上报该梯度所对应的权重,因此目标交换设备SW1可以根据该计算节点V7上报的权重,计算计算节点V7上报的梯度与中间交换设备SW2上报的加权平均值的加权平均值。
步骤114、目标交换设备向每个计算节点发送合并处理后的数据。
最后,目标交换设备即可根据路由信息,将合并处理后的数据分别发送至每个计算节点,以便该各个计算节点根据合并处理后的数据继续执行该指定计算任务。并且,该目标交换设备向第一计算节点发送合并处理后的数据时的数据转发路径,与该第一计算节点上报数据时的数据转发路径可以相同,也可以不同,本发明实施例对此不做限定。该第一计算节点可以为该多个计算节点中的任一节点。
可选的,在本发明实施例中,由于分布式计算任务一般包括多个迭代的计算过程,因此每个计算节点可以在下一个迭代开始前,向该目标交换设备发送获取请求,目标交换设备可以在接收到该获取请求后,将合并处理后的数据发送至各个计算节点。
示例的,参考图4中编号为7的虚线,目标交换设备SW1可以将计算得到的加权平均值分别发送至计算节点V1、计算节点V2和计算节点V7。例如,该目标交换设备SW1可以通过中间交换设备SW2将加权平均值分别转发至计算节点V1和计算节点V2,并可以通过交换设备SW5将加权平均值转发至计算节点V7。该计算节点V1、计算节点V2和计算节点V7用于根据该加权平均值继续进行图像识别应用的模型训练。
需要说明的是,本发明实施例提供的数据处理方法的步骤的先后顺序可以进行适当调整,步骤也可以根据情况进行相应增减。例如,步骤104可以根据情况进行删除,则在上述步骤105中,控制器可以直接从用于连接该多个计算节点的交换设备中确定目标交换设备;相应的,在上述步骤后106中,控制器可以将该目标交换设备与各个计算节点之间的数据转发路径上,用于连接该目标交换设备和至少两个计算节点的交换设备确定为目标交换设备。或者,上述步骤106和步骤111也可以根据情况进行删除,即控制器可以仅确定一个目标交换设备,由该目标交换设备对各个计算节点上报的数据进行合并处理。又或者,上述步骤1051和步骤1052也可以根据情况进行删除,即控制器在确定各个计算节点之间的拓扑结构之后,可以直接根据路由跳数之和确定目标交换设备。又或者,上述步骤1053和步骤1054也可以根据情况进行删除,即控制器可以直接基于各交换设备的性能参数(或路由跳数的均衡程度)确定目标交换设备。任何熟悉本技术领域的技术人员在本发明揭露的技术范围内,可轻易想到变化的方法,都应涵盖在本发明的保护范围之内,因此不再赘述。
综上所述,本发明实施例提供了一种数据处理方法,指定节点向控制器发送的针对指定计算任务的处理请求中包括用于执行该指定计算任务的多个计算节点的标识,控制器接收到该处理请求后,可以从用于连接该多个计算节点的交换设备中确定目标交换设备,并分别向该目标交换设备以及该指定节点发送用于指示该多个计算节点与该目标交换设备之间的数据转发路径的路由信息,以使得每个计算节点可以根据该路由信息向该目标交换设备上报数据,目标交换设备可以根据该路由信息对该多个计算节点上报的数据进行合并处理后再发送至每个计算节点。由于目标交换设备可以对多个计算节点上报的数据进行合并处理,因此各计算节点无需再通过交换设备向参数服务器发送数据,参数服务器也无需再通过交换设备将合并处理后的结果反馈至各计算节点,有效减小了数据中心网络中的数据传输量,降低了网络拥塞的概率以及数据传输的时延,提高了计算任务的执行算效率。
可选的,本发明实施例提供的方法还可以应用于HPC数据中心网络中,该HPC数据中心网络可以使用消息传输接口(MPI,Message Passing Interface)作为分布式信息交互的编程接口,并且可以采用可扩展分层聚合协议(Scalable Hierarchical Aggregation and Reduction Protocol,SHARP)技术将MPI的集合操作中的部分操作(例如Reduction操作和Aggregation操作)卸载(offload)到交换设备上,即由交换设备执行该部分操作。也即是,该数据中心网络即可以支持SHARP技术,也可以支持本申请所提供的数据处理方法,具体的,数据中心网络中的各个交换设备可以在管理服务器的控制下实现该SHARP技术,并可以在控制器的控制下实现本发明实施例提供的数据处理方法。由于SHARP技术受限于MPI集合操作,需要计算节点采用特定的MPI函数库,其应用灵活性较低;又由于MPI集合操作中并未设定根汇聚节点,导致管理服务器选择用于执行该部分操作的交换设备时的计算复杂度较高,因此仅采用该SHARP技术难以支持较大规模的数据中心网络。而该HPC数据中心网络在采用本发明实施例提供的数据处理方法后,由于交换设备可以实现参数服务器的功能,因此各个交换设备不再受限于MPI集合操作,有效提高了数据处理的灵活性;并且,由于控制器仅需选取用于实现参数服务器功能的目标交换设备和中间交换设备,因此其选取过程的计算复杂度较低,能够支持较大规模的数据中心网络。
FIG. 7 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention. The apparatus may be applied to the controller 01 of the data center network shown in FIG. 1A. Referring to FIG. 7, the apparatus may include:
a receiving module 201, configured to receive a processing request that is sent by a specified node and that is for a specified computing task, where the processing request includes the identifiers of a plurality of computing nodes used to execute the specified computing task. For specific implementation of the receiving module 201, refer to the detailed description of step 102 in the embodiment shown in FIG. 2; details are not described herein again.
a determining module 202, configured to determine a target switching device from the switching devices used to connect the plurality of computing nodes. For specific implementation of the determining module 202, refer to the detailed descriptions of step 104 and step 105 in the embodiment shown in FIG. 2; details are not described herein again.
a sending module 203, configured to separately send, to the target switching device and the specified node, routing information corresponding to the specified computing task, where the routing information is used to indicate the data forwarding paths between the plurality of computing nodes and the target switching device. For specific implementation of the sending module 203, refer to the detailed description of step 107 in the embodiment shown in FIG. 2; details are not described herein again.
The routing information is used by the target switching device to send, after the target switching device merges the data reported by the plurality of computing nodes, the merged data to each computing node based on the routing information. That is, the target switching device may merge the data reported by the plurality of computing nodes based on the routing information and then send the merged data to each computing node. In addition, after receiving the routing information, the specified node may send the routing information to each computing node, and each computing node reports data to the target switching device based on the routing information.
In addition, the functions of the receiving module 201 and the sending module 203 may be similar to those of the acceleration application 011 and the data channel 013 in the architecture shown in FIG. 1D, and the function of the determining module 202 may be similar to that of the acceleration controller 012 in the architecture shown in FIG. 1D.
Optionally, the data forwarding paths between the plurality of computing nodes and the target switching device may include at least one switching device.
Correspondingly, the determining module 202 may be further configured to determine at least one intermediate switching device from the at least one switching device, where each intermediate switching device is connected to at least two computing nodes. For specific implementation, refer to the detailed description of step 106 in the embodiment shown in FIG. 2; details are not described herein again.
The sending module 203 may be further configured to send the routing information to each intermediate switching device, and each intermediate switching device is configured to merge the data reported by the at least two computing nodes connected to the intermediate switching device based on the routing information and then send the merged data to the target switching device. For specific implementation, refer to the detailed description of step 107 in the embodiment shown in FIG. 2; details are not described herein again.
Optionally, the routing information may include the identifier of each computing node, the identifier of the target switching device, and the identifier of each intermediate switching device. FIG. 8 is a schematic structural diagram of a sending module according to an embodiment of the present invention. Referring to FIG. 8, the sending module 203 may include:
a first sending submodule 2031, configured to send, to the target switching device, the identifier of the directly connected device of the target switching device, where the directly connected device of the target switching device is a computing node or an intermediate switching device;
a second sending submodule 2032, configured to send, to the specified node, the identifier of the directly connected device of each computing node, where the directly connected device of each computing node is the target switching device or an intermediate switching device, and the specified node is configured to send the identifier of the directly connected device of each computing node to the corresponding computing node; and
a third sending submodule 2033, configured to send, to an intermediate switching device, the identifier of the directly connected device of the intermediate switching device, where the directly connected device of the intermediate switching device is a computing node, the target switching device, or another intermediate switching device.
For specific implementation of the foregoing sending submodules, refer to the detailed description of step 107 in the embodiment shown in FIG. 2; details are not described herein again.
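As one possible concrete encoding of this routing information, each recipient could receive only the identifiers of its directly connected devices, for example as small per-device records. The structure below is purely a hypothetical illustration (all field names are assumptions) built around the FIG. 4 style topology, where V1 and V2 hang off SW2 and V7 reaches SW1 through the transparently forwarding SW5.

```python
# Hypothetical per-task routing information as the controller might distribute it.
routing_info = {
    "task_id": "ai-train-001",                          # specified computing task
    "target_switch": {"id": "SW1", "downstream": ["SW2", "V7"]},
    "intermediate_switches": {
        "SW2": {"upstream": "SW1", "downstream": ["V1", "V2"]},
    },
    "compute_nodes": {                                   # each node's directly connected device
        "V1": {"report_to": "SW2"},
        "V2": {"report_to": "SW2"},
        "V7": {"report_to": "SW1"},
    },
    "merge_type": "weighted_average",
}
```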
FIG. 9 is a schematic structural diagram of a determining module according to an embodiment of the present invention. Referring to FIG. 9, the determining module 202 may include:
a calculation submodule 2021, configured to calculate, among the switching devices used to connect the plurality of computing nodes, the sum of the routing hop counts between each switching device and the computing nodes. For specific implementation of the calculation submodule 2021, refer to the detailed description of step 1053 in the embodiment shown in FIG. 5; details are not described herein again.
a first determining submodule 2022, configured to determine the switching device with the smallest sum of routing hop counts as the target switching device.
For specific implementation of the first determining submodule 2022, refer to the detailed descriptions of step 1054 to step 1056 in the embodiment shown in FIG. 5; details are not described herein again.
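The hop-count-sum criterion can be pictured with a short breadth-first-search sketch over the switch topology. This is only an assumed illustration of the selection rule on an invented toy topology, not the controller's actual implementation.

```python
from collections import deque

def hop_counts(graph, source):
    """BFS hop counts from one device to every reachable device."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def pick_target_switch(graph, switches, compute_nodes):
    """Return the switch whose total hop count to all computing nodes is smallest."""
    def hop_sum(sw):
        dist = hop_counts(graph, sw)
        return sum(dist[node] for node in compute_nodes)
    return min(switches, key=hop_sum)

# Toy topology: nodes N1 and N2 attach to switch S2, node N3 attaches to S3.
graph = {
    "S1": ["S2", "S3"], "S2": ["S1", "N1", "N2"], "S3": ["S1", "N3"],
    "N1": ["S2"], "N2": ["S2"], "N3": ["S3"],
}
target = pick_target_switch(graph, ["S1", "S2", "S3"], ["N1", "N2", "N3"])  # -> "S2" (hop sum 5)
```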
Optionally, as shown in FIG. 9, the determining module 202 may further include:
a detection submodule 2023, configured to implement the method shown in step 1051 in the embodiment shown in FIG. 5; and
a second determining submodule 2024, configured to implement the method shown in step 1052 in the embodiment shown in FIG. 5.
Correspondingly, the calculation submodule 2021 may be configured to implement the method shown in step 1053 in the embodiment shown in FIG. 5.
Optionally, the determining module 202 may be further configured to implement the methods shown in step 103 to step 105 in the embodiment shown in FIG. 2.
Optionally, the processing request may further include the merge processing type corresponding to the specified computing task.
Correspondingly, the sending module 203 may be further configured to send, to the target switching device, the merge processing type corresponding to the specified computing task, and the target switching device is configured to merge, according to the merge processing type, the data reported by the plurality of computing nodes. For specific implementation, refer to the detailed description of step 107 in the embodiment shown in FIG. 2; details are not described herein again.
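Supporting several merge processing types on the switch side can be pictured as a simple dispatch keyed by the type carried with the routing information. The mapping below is a hypothetical sketch; the concrete set of supported types is left to the implementation.

```python
import numpy as np

# Hypothetical table mapping a merge processing type to the actual reduction.
MERGE_OPS = {
    "sum":              lambda reports: sum(np.asarray(g) for g, _ in reports),
    "max":              lambda reports: np.maximum.reduce([np.asarray(g) for g, _ in reports]),
    "weighted_average": lambda reports: (
        sum(np.asarray(g) * w for g, w in reports) / sum(w for _, w in reports)
    ),
}

def merge(reports, merge_type):
    """Apply the merge processing type declared for the specified computing task."""
    return MERGE_OPS[merge_type](reports)
```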
In conclusion, the embodiments of the present invention provide a data processing apparatus. A processing request that the apparatus receives and that is for a specified computing task includes the identifiers of a plurality of computing nodes used to execute the specified computing task. The apparatus may determine a target switching device from the switching devices used to connect the plurality of computing nodes, and separately send, to the target switching device and the specified node, routing information used to indicate the data forwarding paths between the plurality of computing nodes and the target switching device, so that each computing node can report data to the target switching device based on the routing information, and the target switching device can merge the data reported by the plurality of computing nodes based on the routing information and then send the merged data to each computing node. Because the target switching device can merge the data reported by the plurality of computing nodes, the computing nodes no longer need to send data to a parameter server through switching devices, and the parameter server no longer needs to feed back the merged result to the computing nodes through switching devices. This effectively reduces the amount of data transmitted in the data center network, reduces the probability of network congestion and the latency of data transmission, and improves the execution efficiency of the computing task.
FIG. 10 is a schematic structural diagram of another data processing apparatus according to an embodiment of the present invention. The apparatus may be applied to the switching device 03 of the data center network shown in FIG. 1A. Referring to FIG. 10, the apparatus may include:
a receiving module 301, configured to receive routing information that is sent by the controller and that corresponds to a specified computing task, where the routing information is used to indicate the data forwarding paths between a plurality of computing nodes and a target switching device, and the plurality of computing nodes are used to execute the specified computing task. For specific implementation of the receiving module 301, refer to the detailed description of step 107 in the embodiment shown in FIG. 2; details are not described herein again.
a processing module 302, configured to merge the data reported by the plurality of computing nodes. For specific implementation of the processing module 302, refer to the detailed description of step 111 or step 113 in the embodiment shown in FIG. 2; details are not described herein again.
a sending module 303, configured to send the merged data based on the routing information. For specific implementation of the sending module 303, refer to the detailed description of step 112 or step 114 in the embodiment shown in FIG. 2; details are not described herein again.
The routing information is sent by the controller after the controller receives the processing request that is sent by the specified node and that is for the specified computing task and determines the target switching device from the switching devices used to connect the plurality of computing nodes.
In addition, the functions of the receiving module 301 and the sending module 303 may be similar to that of the switching function component 021 in the architecture shown in FIG. 1B, and the function of the processing module 302 may be similar to that of the network computing component 022 in the architecture shown in FIG. 1B.
Optionally, the receiving module 301 is further configured to receive, before the data reported by the plurality of computing nodes is merged, the merge processing type that is sent by the controller and that corresponds to the specified computing task. For specific implementation, refer to the detailed description of step 107 in the embodiment shown in FIG. 2; details are not described herein again.
Correspondingly, the processing module 302 may be configured to merge, according to the merge processing type, the data reported by the plurality of computing nodes. For specific implementation, refer to the detailed description of step 111 or step 113 in the embodiment shown in FIG. 2; details are not described herein again.
Optionally, when the switching device is the target switching device, the sending module 303 may be configured to implement the method shown in step 114 in the embodiment shown in FIG. 2.
Optionally, when the switching device is an intermediate switching device used to connect the target switching device and at least two of the computing nodes, the processing module 302 may be configured to implement the method shown in step 111 in the embodiment shown in FIG. 2.
The sending module 303 may be configured to implement the method shown in step 112 in the embodiment shown in FIG. 2.
In addition, in this embodiment of the present invention, the data processing apparatus may further include a topology awareness module, configured to obtain, after the topology of the data center network becomes stable, the identifiers of the other devices connected to the switching device and report them to the controller. The role of the topology awareness module may be similar to the function of the network management component 023 in the architecture shown in FIG. 1B.
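A topology report of this kind could be as simple as a per-switch list of neighbor identifiers pushed to the controller once the topology has stabilized. The snippet below is an assumed illustration only; the message fields and the control-channel `send` callback are placeholders rather than a defined interface.

```python
def build_topology_report(switch_id, neighbor_table):
    """Assemble the identifiers of directly connected devices for the controller."""
    return {
        "switch": switch_id,
        "neighbors": sorted(neighbor_table),   # e.g. ["SW1", "V1", "V2"] for SW2
    }

def report_to_controller(send, switch_id, neighbor_table):
    # 'send' abstracts the control-channel transport between switch and controller.
    send(build_topology_report(switch_id, neighbor_table))
```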
In conclusion, the embodiments of the present invention provide a data processing apparatus. Based on the routing information that is sent by the controller and that corresponds to a specified computing task, the apparatus can merge the data reported by the plurality of computing nodes used to execute the specified computing task and then send the merged data to each computing node. Therefore, the computing nodes no longer need to send data to a parameter server through switching devices, and the parameter server no longer needs to feed back the merged result to the computing nodes through switching devices. This effectively reduces the amount of data transmitted in the data center network, reduces the probability of network congestion and the latency of data transmission, and improves the execution efficiency of the computing task.
With respect to the apparatuses in the foregoing embodiments, the specific manner in which each module performs operations has been described in detail in the embodiments related to the method, and is not elaborated here.
Refer to FIG. 11, which is a schematic structural diagram of a data processing apparatus 600 according to an embodiment of this application. The data processing apparatus may be configured in the controller 01 shown in FIG. 1A. Referring to FIG. 11, the data processing apparatus 600 may include a processor 610, a communication interface 620, and a memory 630, where the communication interface 620 and the memory 630 are separately connected to the processor 610; for example, as shown in FIG. 11, the communication interface 620 and the memory 630 are connected to the processor 610 through a bus 640.
The processor 610 may be a central processing unit (CPU), and the processor 610 includes one or more processing cores. The processor 610 runs a computer program to execute various function applications and perform data processing. For specific implementation of the processor 610, refer to the detailed descriptions of step 103 to step 106 in the embodiment shown in FIG. 2 and the detailed description of the embodiment shown in FIG. 5; details are not described herein again.
There may be a plurality of communication interfaces 620, and the communication interface 620 is used by the data processing apparatus 600 to communicate with an external device, such as a display or a third-party device (for example, a storage device, a mobile terminal, or a switching device). For specific implementation of the communication interface 620, refer to the detailed descriptions of step 101, step 102, and step 107 in the embodiment shown in FIG. 2; details are not described herein again.
The memory 630 may include, but is not limited to, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, and an optical memory. The memory 630 is responsible for information storage; for example, the memory 630 is configured to store a computer program to be run by the processor 610.
Optionally, the data processing apparatus 600 may further include an input/output (I/O) interface (not shown in FIG. 11). The I/O interface is connected to the processor 610, the communication interface 620, and the memory 630. The I/O interface may be, for example, a universal serial bus (USB).
An embodiment of the present invention further provides a switching device. As shown in FIG. 1C, the switching device 02 may include a switching chip 02a, a CPU 02b, and a memory 02c. The memory 02c may store a computer program, and the CPU 02b may execute the computer program to implement the method shown in step 111 or step 113 in the embodiment shown in FIG. 2; the specific implementation process is not described herein again. The switching chip 02a may be configured to implement the methods shown in step 101, step 112, and step 114 in the embodiment shown in FIG. 2; the specific implementation process is not described herein again.
An embodiment of the present invention further provides a data processing system. Referring to FIG. 1A, the system may include a controller 01, a plurality of computing nodes 02, and at least one switching device 03.
The controller 01 may include the data processing apparatus shown in FIG. 7 or FIG. 11, and the data processing apparatus may include the sending module shown in FIG. 8 and the determining module shown in FIG. 9; alternatively, the controller 01 may be the controller shown in FIG. 1D. Each switching device may include the data processing apparatus shown in FIG. 10, or each switching device may be the switching device shown in FIG. 1B or FIG. 1C.
An embodiment of the present invention provides a computer-readable storage medium. The computer-readable storage medium stores instructions, and when the computer-readable storage medium runs on a computer, the computer is enabled to perform the data processing method provided in the foregoing method embodiments.
An embodiment of the present invention further provides a computer program product containing instructions. When the computer program product runs on a computer, the computer is enabled to perform the data processing method provided in the foregoing method embodiments.

Claims (28)

  1. A data processing method, applied to a controller of a data center network, wherein the method comprises:
    receiving a processing request that is sent by a specified node and that is for a specified computing task, wherein the processing request comprises identifiers of a plurality of computing nodes used to execute the specified computing task, and the plurality of computing nodes comprise the specified node;
    determining a target switching device from switching devices used to connect the plurality of computing nodes; and
    separately sending, to the target switching device and the specified node, routing information corresponding to the specified computing task, wherein the routing information is used to indicate data forwarding paths between the plurality of computing nodes and the target switching device;
    wherein the routing information is used by the target switching device to send, after the target switching device merges data reported by the plurality of computing nodes, the merged data to each of the computing nodes based on the routing information.
  2. The method according to claim 1, wherein the data forwarding paths between the plurality of computing nodes and the target switching device comprise at least one switching device, and the method further comprises:
    determining, among the at least one switching device comprised on the data forwarding paths, a switching device connected to at least two of the plurality of computing nodes as an intermediate switching device; and
    sending the routing information to the intermediate switching device, wherein the routing information is used by the intermediate switching device to merge data reported by the at least two computing nodes connected to the intermediate switching device and then send the merged data to the target switching device based on the routing information.
  3. The method according to claim 2, wherein the separately sending, to the target switching device and the specified node, routing information corresponding to the specified computing task comprises:
    sending, to the target switching device, routing information comprising an identifier of a directly connected device of the target switching device, wherein the directly connected device of the target switching device is a computing node or an intermediate switching device; and
    sending, to the specified node, routing information comprising an identifier of a directly connected device of each of the computing nodes, wherein the directly connected device of each computing node is the target switching device or an intermediate switching device, and the specified node is configured to send the identifier of the directly connected device of each computing node to the corresponding computing node; and
    the sending the routing information to the intermediate switching device comprises:
    sending, to the intermediate switching device, routing information comprising an identifier of a directly connected device of the intermediate switching device, wherein the directly connected device of the intermediate switching device is a computing node, the target switching device, or another intermediate switching device.
  4. The method according to any one of claims 1 to 3, wherein the determining a target switching device from switching devices used to connect the plurality of computing nodes comprises:
    separately calculating, among the switching devices used to connect the plurality of computing nodes, a sum of routing hop counts between each switching device and the computing nodes; and
    determining a switching device with a smallest sum of routing hop counts as the target switching device.
  5. The method according to claim 4, wherein the determining a switching device with a smallest sum of routing hop counts as the target switching device comprises:
    when there are a plurality of switching devices with the smallest sum of routing hop counts, separately determining a performance parameter of each of the switching devices with the smallest sum of routing hop counts, wherein the performance parameter comprises at least one of available bandwidth, computing load, throughput, and the number of times the switching device has been selected as a target switching device; and
    determining, among the plurality of switching devices with the smallest sum of routing hop counts, a switching device whose performance parameter meets a preset condition as the target switching device.
  6. The method according to claim 4, wherein the determining a switching device with a smallest sum of routing hop counts as the target switching device comprises:
    when there are a plurality of switching devices with the smallest sum of routing hop counts, separately determining a balance degree of the routing hop counts between each of the switching devices with the smallest sum of routing hop counts and the computing nodes; and
    determining, among the plurality of switching devices with the smallest sum of routing hop counts, a switching device with a highest balance degree of routing hop counts as the target switching device.
  7. The method according to claim 4, wherein before the calculating a sum of routing hop counts between each switching device and the computing nodes, the method further comprises:
    detecting whether the plurality of computing nodes are all directly connected to a same switching device; and
    when the plurality of computing nodes are all directly connected to a same switching device, determining the switching device to which the plurality of computing nodes are directly connected as the target switching device; and
    the calculating, among the switching devices used to connect the plurality of computing nodes, a sum of routing hop counts between each switching device and the computing nodes comprises:
    when the plurality of computing nodes are directly connected to different switching devices, calculating, among the switching devices used to connect the plurality of computing nodes, the sum of routing hop counts between each switching device and the computing nodes.
  8. The method according to any one of claims 1 to 3, wherein the determining a target switching device from switching devices used to connect the plurality of computing nodes comprises:
    determining at least one candidate switching device from the switching devices used to connect the plurality of computing nodes, wherein each candidate switching device is connected to at least two of the plurality of computing nodes through downlink paths; and
    determining the target switching device from the at least one candidate switching device.
  9. The method according to any one of claims 1 to 3, wherein the processing request further comprises a merge processing type corresponding to the specified computing task, and the method further comprises:
    sending, to the target switching device, the merge processing type corresponding to the specified computing task, wherein the target switching device is configured to merge, according to the merge processing type, the data reported by the plurality of computing nodes.
  10. A data processing method, applied to a switching device of a data center network, wherein the method comprises:
    receiving routing information that is sent by a controller and that corresponds to a specified computing task, wherein the routing information is used to indicate data forwarding paths between a plurality of computing nodes and a target switching device, and the plurality of computing nodes are used to execute the specified computing task;
    merging data reported by the plurality of computing nodes; and
    sending the merged data based on the routing information;
    wherein the routing information is sent by the controller after the controller receives a processing request that is sent by a specified node and that is for the specified computing task and determines the target switching device from switching devices used to connect the plurality of computing nodes.
  11. The method according to claim 10, wherein before the merging data reported by the plurality of computing nodes, the method further comprises:
    receiving a merge processing type that is sent by the controller and that corresponds to the specified computing task; and
    the merging data reported by the plurality of computing nodes comprises:
    merging, according to the merge processing type, the data reported by the plurality of computing nodes.
  12. The method according to claim 10 or 11, wherein the switching device is the target switching device, and the sending the merged data based on the routing information comprises:
    sending the merged data to each of the computing nodes based on the routing information.
  13. The method according to claim 10 or 11, wherein the switching device is an intermediate switching device used to connect the target switching device and at least two of the computing nodes;
    the merging data reported by the plurality of computing nodes comprises:
    merging data reported by the at least two computing nodes; and
    the sending the merged data based on the routing information comprises:
    sending the merged data to the target switching device based on the routing information.
  14. A data processing apparatus, applied to a controller of a data center network, wherein the apparatus comprises:
    a receiving module, configured to receive a processing request that is sent by a specified node and that is for a specified computing task, wherein the processing request comprises identifiers of a plurality of computing nodes used to execute the specified computing task, and the plurality of computing nodes comprise the specified node;
    a determining module, configured to determine a target switching device from switching devices used to connect the plurality of computing nodes; and
    a sending module, configured to separately send, to the target switching device and the specified node, routing information corresponding to the specified computing task, wherein the routing information is used to indicate data forwarding paths between the plurality of computing nodes and the target switching device;
    wherein the routing information is used by the target switching device to send, after the target switching device merges data reported by the plurality of computing nodes, the merged data to each of the computing nodes based on the routing information.
  15. The apparatus according to claim 14, wherein the data forwarding paths between the plurality of computing nodes and the target switching device comprise at least one switching device;
    the determining module is further configured to determine, among the at least one switching device comprised on the data forwarding paths, a switching device connected to at least two of the plurality of computing nodes as an intermediate switching device; and
    the sending module is further configured to send the routing information to the intermediate switching device, wherein the routing information is used by the intermediate switching device to merge data reported by the at least two computing nodes connected to the intermediate switching device and then send the merged data to the target switching device based on the routing information.
  16. The apparatus according to claim 15, wherein the sending module comprises:
    a first sending submodule, configured to send, to the target switching device, routing information comprising an identifier of a directly connected device of the target switching device, wherein the directly connected device of the target switching device is a computing node or an intermediate switching device;
    a second sending submodule, configured to send, to the specified node, routing information comprising an identifier of a directly connected device of each of the computing nodes, wherein the directly connected device of each computing node is the target switching device or an intermediate switching device, and the specified node is configured to send the identifier of the directly connected device of each computing node to the corresponding computing node; and
    a third sending submodule, configured to send, to the intermediate switching device, routing information comprising an identifier of a directly connected device of the intermediate switching device, wherein the directly connected device of the intermediate switching device is a computing node, the target switching device, or another intermediate switching device.
  17. The apparatus according to any one of claims 14 to 16, wherein the determining module comprises:
    a calculation submodule, configured to calculate, among the switching devices used to connect the plurality of computing nodes, a sum of routing hop counts between each switching device and the computing nodes; and
    a first determining submodule, configured to determine a switching device with a smallest sum of routing hop counts as the target switching device.
  18. The apparatus according to claim 17, wherein the first determining submodule is configured to:
    when there are a plurality of switching devices with the smallest sum of routing hop counts, separately determine a performance parameter of each of the switching devices with the smallest sum of routing hop counts, wherein the performance parameter comprises at least one of available bandwidth, computing load, throughput, and the number of times the switching device has been selected as a target switching device; and
    determine, among the plurality of switching devices with the smallest sum of routing hop counts, a switching device whose performance parameter meets a preset condition as the target switching device.
  19. The apparatus according to claim 17, wherein the first determining submodule is configured to:
    when there are a plurality of switching devices with the smallest sum of routing hop counts, separately determine a balance degree of the routing hop counts between each of the switching devices with the smallest sum of routing hop counts and the computing nodes; and
    determine, among the plurality of switching devices with the smallest sum of routing hop counts, a switching device with a highest balance degree of routing hop counts as the target switching device.
  20. The apparatus according to claim 17, wherein the determining module further comprises:
    a detection submodule, configured to detect whether the plurality of computing nodes are all directly connected to a same switching device; and
    a second determining submodule, configured to: when the plurality of computing nodes are all directly connected to a same switching device, determine the switching device to which the plurality of computing nodes are directly connected as the target switching device; and
    the calculation submodule is configured to: when the plurality of computing nodes are directly connected to different switching devices, calculate, among the switching devices used to connect the plurality of computing nodes, the sum of routing hop counts between each switching device and the computing nodes.
  21. The apparatus according to any one of claims 14 to 16, wherein the determining module is configured to:
    determine at least one candidate switching device from the switching devices used to connect the plurality of computing nodes, wherein each candidate switching device is connected to at least two computing nodes through downlink paths; and
    determine the target switching device from the at least one candidate switching device.
  22. The apparatus according to any one of claims 14 to 16, wherein the processing request further comprises a merge processing type corresponding to the specified computing task; and
    the sending module is further configured to send, to the target switching device, the merge processing type corresponding to the specified computing task, wherein the target switching device is configured to merge, according to the merge processing type, the data reported by the plurality of computing nodes.
  23. A data processing apparatus, applied to a switching device of a data center network, wherein the apparatus comprises:
    a receiving module, configured to receive routing information that is sent by a controller and that corresponds to a specified computing task, wherein the routing information is used to indicate data forwarding paths between a plurality of computing nodes and a target switching device, and the plurality of computing nodes are used to execute the specified computing task;
    a processing module, configured to merge data reported by the plurality of computing nodes; and
    a sending module, configured to send the merged data based on the routing information;
    wherein the routing information is sent by the controller after the controller receives a processing request that is sent by a specified node and that is for the specified computing task and determines the target switching device from switching devices used to connect the plurality of computing nodes.
  24. The apparatus according to claim 23, wherein
    the receiving module is further configured to receive, before the data reported by the plurality of computing nodes is merged, a merge processing type that is sent by the controller and that corresponds to the specified computing task; and
    the processing module is configured to:
    merge, according to the merge processing type, the data reported by the plurality of computing nodes.
  25. The apparatus according to claim 23 or 24, wherein the switching device is the target switching device, and the sending module is configured to:
    send the merged data to each of the computing nodes based on the routing information.
  26. The apparatus according to claim 23 or 24, wherein the switching device is an intermediate switching device used to connect the target switching device and at least two of the computing nodes;
    the processing module is configured to merge data reported by the at least two computing nodes; and
    the sending module is configured to send the merged data to the target switching device based on the routing information.
  27. A data processing system, wherein the system comprises: a controller, a plurality of computing nodes, and at least one switching device;
    the controller comprises the data processing apparatus according to any one of claims 14 to 22; and
    each switching device comprises the data processing apparatus according to any one of claims 23 to 26.
  28. A computer-readable storage medium, wherein the computer-readable storage medium stores instructions, and when the computer-readable storage medium runs on a computer, the computer is enabled to perform the data processing method according to any one of claims 1 to 13.


