WO2023130835A1 - Data exchange method and device

Data exchange method and device

Info

Publication number
WO2023130835A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
node
switching
information
block
Application number
PCT/CN2022/131459
Other languages
English (en)
French (fr)
Inventor
叶秋红
何子键
林云
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司
Publication of WO2023130835A1


Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 47/00 - Traffic control in data switching networks
    • H04L 47/10 - Flow control; Congestion control
    • H04L 47/11 - Identifying congestion
    • H04L 47/12 - Avoiding congestion; Recovering from congestion
    • H04L 49/00 - Packet switching elements
    • H04L 49/10 - Packet switching elements characterised by the switching fabric construction

Definitions

  • the present application relates to the technical field of communications, and in particular to a data exchange method and device.
  • the data exchange network generally adopts a multi-level (for example, two-level or three-level) switching node networking mode to provide a fully connected network for many servers (servers) in the access network, and exchange data between different servers.
  • each switching node has a cache with a certain capacity, which can be used to absorb bursty data flows.
  • However, the output port may become congested, causing buffer overflow and resulting in phenomena such as data packet loss.
  • Common flow control mechanisms include explicit congestion notification (ECN), priority-based flow control (PFC), and tail drop.
  • ECN can be used to implement flow control at the source node (such as a server or mobile phone): for example, a switching node in the network notifies the source node to reduce its transmission rate before congestion occurs, so as to mitigate network congestion.
  • PFC can be used to implement flow control between switching nodes. For example, a downstream switching node notifies an upstream switching node to stop sending data to avoid local buffer overflow.
  • Tail drop refers to a way to reduce congestion by dropping data packets. For example, a switching node directly discards newly received data packets when the cache is full.
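  • The following sketch illustrates, in simplified form, how a switching node might apply the three conventional mechanisms described above. The thresholds, message strings and notification callbacks are illustrative assumptions and are not specified in the present application.

```python
# Simplified sketch of the three conventional flow-control reactions described
# above (ECN, PFC, tail drop). Thresholds, message strings and the notify_*
# callbacks are illustrative assumptions, not part of the application.
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class EgressQueue:
    capacity: int          # maximum number of packets the buffer can hold
    ecn_threshold: int     # ask the source to slow down above this queue depth
    pfc_threshold: int     # ask the upstream node to pause above this queue depth
    packets: List[bytes] = field(default_factory=list)

    def enqueue(self, packet: bytes,
                notify_source: Callable[[str], None],
                notify_upstream: Callable[[str], None]) -> bool:
        if len(self.packets) >= self.capacity:
            # Tail drop: the buffer is full, so the newly received packet is discarded.
            return False
        if len(self.packets) >= self.pfc_threshold:
            # PFC: tell the upstream switching node to stop sending for this priority.
            notify_upstream("PAUSE")
        if len(self.packets) >= self.ecn_threshold:
            # ECN: ask the source node to reduce its transmission rate before the
            # congestion becomes severe.
            notify_source("ECN")
        self.packets.append(packet)
        return True
```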
  • the present application provides a data exchange method and device, which solves the problems of low exchange efficiency and prolonged exchange time in the data exchange network in the prior art.
  • A data exchange method includes: a source node receives flow indication information from a first switching node, where the flow indication information is used to indicate that a target data flow is congested, and the first switching node is a node in the switching path of the target data flow; the source node sends a plurality of pieces of write data information and a plurality of data blocks of the target data flow to a plurality of switching nodes, where the plurality of pieces of write data information are used to instruct the plurality of switching nodes to store the plurality of data blocks and stop forwarding the plurality of data blocks.
  • In this way, when the first switching node determines that the target data flow is congested or is about to be congested, the first switching node can notify the source node, so that the source node can store multiple data blocks of the target data flow in multiple switching nodes respectively.
  • That is, the source node can store the multiple data blocks in the larger-capacity buffer pool formed by the plurality of switching nodes, thereby providing a larger buffer, reducing congestion of the target data flow, avoiding head-of-line blocking, and improving the ability to absorb burst traffic, thereby improving data exchange efficiency and reducing exchange delay.
  • The method further includes: the source node receives a plurality of pieces of block description information from the plurality of switching nodes, where the plurality of pieces of block description information correspond one-to-one to the plurality of data blocks, and each piece of block description information is used to indicate the node information of the node storing the corresponding data block, such as the identifier of the node storing the data block and the storage address of the data block; and the source node sends the plurality of pieces of block description information to the destination node.
  • the block description information is also used to indicate the identifier of the data packet included in the corresponding data block; or, the source node sends the identifier of the data packet included in each of the multiple data blocks to the destination node.
  • In this way, the multiple switching nodes may return the block description information of the data blocks they store to the source node, and the source node then sends the multiple pieces of block description information to the destination node, so that the destination node can schedule the multiple data blocks in an orderly manner according to the multiple pieces of block description information.
  • Before the source node sends the multiple pieces of write data information and the multiple data blocks of the target data flow to the multiple switching nodes, the method further includes: the source node divides the data to be exchanged in the target data flow into the multiple data blocks, where the number of the multiple data blocks is greater than or equal to the number of the multiple switching nodes.
  • the source node may divide the data to be exchanged in the target data stream into the multiple data blocks according to actual conditions, so as to disperse and store the data blocks in multiple switching nodes.
  • The order of the data block stored by a switching node among the multiple data blocks is consistent with the order of the distance corresponding to that switching node among the multiple distances arranged in ascending order; the distance corresponding to a switching node is the distance between that switching node and the first switching node, and the multiple distances include the distance between each of the multiple switching nodes and the first switching node.
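  • As an illustration of the ordering described above, the following sketch assigns data blocks to caching switching nodes so that blocks that come earlier in the stream are stored on nodes closer to the first (congested) switching node. The hop-count values and node names are assumptions for the example only.

```python
# Illustrative assignment of data blocks to caching switching nodes: blocks that
# come earlier in the target data flow are stored on nodes whose distance to the
# first (congested) switching node is smaller. The hop counts below are assumptions.
def assign_blocks_by_distance(data_blocks, node_distances):
    """data_blocks: block identifiers in stream order.
    node_distances: switching-node id -> distance to the first switching node."""
    nodes_by_distance = sorted(node_distances, key=node_distances.get)
    assignment = {}
    for i, block in enumerate(data_blocks):
        # Earlier blocks go to nearer nodes; wrap around if there are more blocks
        # than caching nodes.
        node = nodes_by_distance[i % len(nodes_by_distance)]
        assignment.setdefault(node, []).append(block)
    return assignment


# Example loosely following FIG. 8: six blocks spread over six caching nodes.
blocks = [f"DB{i}" for i in range(1, 7)]
distances = {"C1": 0, "B1": 1, "B2": 1, "B3": 3, "B4": 3, "C2": 4}
print(assign_blocks_by_distance(blocks, distances))
# {'C1': ['DB1'], 'B1': ['DB2'], 'B2': ['DB3'], 'B3': ['DB4'], 'B4': ['DB5'], 'C2': ['DB6']}
```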
  • A method for exchanging data includes: a switching node sends flow indication information to a source node, where the flow indication information is used to indicate that a target data flow is congested, and the switching node is a node in the switching path of the target data flow; the switching node receives write data information and a data block of the target data flow from the source node, where the write data information is used to instruct the switching node to store the data block and stop forwarding the data block; the switching node stores the data block according to the write data information; the switching node receives scheduling information from the destination node, where the scheduling information is used to schedule the data block; and the switching node sends the data block to the destination node.
  • In this way, when the switching node determines that the target data flow is congested or is about to be congested, the switching node can notify the source node, so that the source node can store multiple data blocks of the target data flow in multiple switching nodes respectively; that is, the source node can store the multiple data blocks in the larger-capacity buffer pool formed by the plurality of switching nodes, thereby providing a larger buffer, reducing congestion of the target data flow, avoiding head-of-line blocking, and improving the ability to absorb burst traffic, thereby improving data exchange efficiency and reducing exchange delay.
  • The method further includes: the switching node sends block description information of the data block to the destination node; or, the switching node sends the block description information of the data block to the source node; where the block description information is used to indicate the information of the node storing the data block.
  • the block description information is also used to indicate the identifier of the data packet included in the corresponding data block.
  • A data exchange method includes: a first switching node sends flow indication information to a source node, where the flow indication information is used to indicate that a target data flow is congested, and the first switching node is a node in the switching path of the target data flow; when the source node receives the flow indication information, it sends a plurality of pieces of write data information and a plurality of data blocks of the target data flow to a plurality of switching nodes, where the plurality of pieces of write data information are used to instruct the plurality of switching nodes to store the plurality of data blocks and stop forwarding the plurality of data blocks; the plurality of switching nodes receive the plurality of pieces of write data information and the plurality of data blocks, and store the plurality of data blocks according to the plurality of pieces of write data information; the destination node sends a plurality of pieces of scheduling information to the plurality of switching nodes, where the plurality of pieces of scheduling information are used to schedule the plurality of data blocks; and the plurality of switching nodes receive the plurality of pieces of scheduling information and send the plurality of data blocks to the destination node according to the plurality of pieces of scheduling information.
  • In this way, the first switching node can notify the source node, so that the source node can store the multiple data blocks of the target data flow in the multiple switching nodes; that is, the source node can store the multiple data blocks in the larger-capacity buffer pool formed by the multiple switching nodes, and the destination node can schedule the corresponding data blocks from the multiple switching nodes. The data exchange network can thus provide a larger buffer, reduce congestion of the target data flow, avoid head-of-line blocking, and improve the ability to absorb burst traffic, thereby improving data exchange efficiency and reducing exchange delay.
  • the method further includes: the multiple switching nodes sending multiple pieces of block description information to the source node, where the multiple pieces of block description information correspond to the multiple data blocks one by one, Each block description information is used to indicate node information storing the corresponding data block; the source node receives the multiple block description information, and sends the multiple block description information to the destination node.
  • the multiple switching nodes send multiple pieces of block description information to the destination node, and each piece of block description information is used to indicate the node information storing the corresponding data block.
  • In this way, the multiple switching nodes can return the block description information of the data blocks they store to the source node or the destination node, so that the destination node can schedule the multiple data blocks in an orderly manner according to the multiple pieces of block description information.
  • Before the destination node sends a plurality of pieces of scheduling information to the plurality of switching nodes, the method further includes: when the destination node receives the plurality of pieces of block description information, it determines the scheduling sequence of the multiple data blocks according to the plurality of pieces of block description information, where the scheduling sequence is used to schedule the multiple data blocks from the multiple switching nodes.
  • the destination node may perform orderly scheduling on the multiple data blocks according to the multiple block description information or the storage indication information.
  • The method further includes: the source node divides the data to be exchanged in the target data flow into the multiple data blocks, and the number of the multiple data blocks is greater than or equal to the number of the multiple switching nodes.
  • the source node may divide the data to be exchanged in the target data stream into the multiple data blocks according to actual conditions, so as to disperse and store the data blocks in multiple switching nodes.
  • The order of the data block stored by a switching node among the multiple data blocks is consistent with the order of the distance corresponding to that switching node among the multiple distances arranged in ascending order; the distance corresponding to a switching node is the distance between that switching node and the first switching node, and the multiple distances include the distances between the multiple switching nodes and the first switching node.
  • In this way, the paths over which the destination node schedules the multiple data blocks can be shortened, improving the efficiency with which the destination node schedules the multiple data blocks.
  • A data switching device, serving as a source node, includes: a receiving unit, configured to receive flow indication information from a first switching node, where the flow indication information is used to indicate that a target data flow is congested, and the first switching node is a node in the switching path of the target data flow; and a sending unit, configured to send a plurality of pieces of write data information and a plurality of data blocks of the target data flow to a plurality of switching nodes, where the plurality of pieces of write data information are used to instruct the plurality of switching nodes to store the plurality of data blocks and stop forwarding the plurality of data blocks.
  • the receiving unit is further configured to: receive a plurality of block description information from the plurality of switching nodes, where the plurality of block description information corresponds to the plurality of data blocks one by one, Each block description information is used to indicate the node information storing the corresponding data block; the sending unit is also used to: send the multiple block description information to the destination node.
  • the device further includes: a processing unit, configured to divide the data to be exchanged in the target data stream into the multiple data blocks, and the number of the multiple data blocks is greater than or equal to the number of the plurality of switching nodes.
  • The order of the data block stored by a switching node among the multiple data blocks is consistent with the order of the distance corresponding to that switching node among the multiple distances arranged in ascending order; the distance corresponding to a switching node is the distance between that switching node and the first switching node, and the multiple distances include the distance between each of the multiple switching nodes and the first switching node.
  • A data switching device, serving as a switching node, includes: a sending unit, configured to send flow indication information to a source node, where the flow indication information is used to indicate that a target data flow is congested, and the switching node is a node in the switching path of the target data flow; a receiving unit, configured to receive write data information and a data block of the target data flow from the source node, where the write data information is used to instruct the switching node to store the data block and stop forwarding the data block; and a processing unit, configured to store the data block according to the write data information; the receiving unit is further configured to receive scheduling information from the destination node, where the scheduling information is used to schedule the data block; and the sending unit is further configured to send the data block to the destination node.
  • the sending unit is further configured to: send the block description information of the data block to the destination node; or send the block description information of the data block to the source node; wherein, The block description information is used to indicate the node information storing the data block.
  • A data switching network includes a source node, multiple switching nodes, and a destination node, where the multiple switching nodes include a first switching node.
  • The first switching node is configured to send flow indication information to the source node, where the flow indication information is used to indicate that a target data flow is congested, and the first switching node is a node in the switching path of the target data flow.
  • The source node is configured to receive the flow indication information and send multiple pieces of write data information and multiple data blocks of the target data flow to the multiple switching nodes, where the multiple pieces of write data information are used to instruct the multiple switching nodes to store the multiple data blocks and stop forwarding the multiple data blocks.
  • The multiple switching nodes are configured to receive the multiple pieces of write data information and the multiple data blocks, and store the multiple data blocks according to the multiple pieces of write data information.
  • The destination node is configured to send multiple pieces of scheduling information to the multiple switching nodes, where the multiple pieces of scheduling information are used to schedule the multiple data blocks, and the multiple switching nodes send the multiple data blocks to the destination node according to the scheduling information.
  • the multiple switching nodes are further configured to: send multiple pieces of block description information to the source node, where the multiple pieces of block description information are in one-to-one correspondence with the multiple data blocks, and each The pieces of block description information are used to indicate the node information storing the corresponding data block; the source node is also used to: receive the pieces of block description information, and send the pieces of block description information to the destination node.
  • the multiple switching nodes are further configured to: send the multiple pieces of block description information to the destination node.
  • the destination node is further configured to: receive the multiple block description information, and determine the scheduling sequence of the multiple data blocks according to the multiple block description information, and the scheduling sequence uses for scheduling the multiple data blocks from the multiple switching nodes.
  • the source node is further configured to: divide the data to be exchanged in the target data stream into the multiple data blocks, and the number of the multiple data blocks is greater than or equal to the multiple The number of switching nodes.
  • The order of the data block stored by a switching node among the multiple data blocks is consistent with the order of the distance corresponding to that switching node among the multiple distances arranged in ascending order; the distance corresponding to a switching node is the distance between that switching node and the first switching node, and the multiple distances include the distances between the multiple switching nodes and the first switching node.
  • In yet another aspect of the present application, a data exchange device includes: a processor, a memory, a communication interface, and a bus, where the processor, the memory, and the communication interface are connected through the bus; the memory is used to store program code, and when the program code is executed by the processor, the data exchange device executes the data exchange method provided in the first aspect or any possible implementation of the first aspect.
  • In yet another aspect of the present application, a data exchange device includes: a processor, a memory, a communication interface, and a bus, where the processor, the memory, and the communication interface are connected through the bus; the memory is used to store program code, and when the program code is executed by the processor, the data exchange device executes the data exchange method provided in the second aspect or any possible implementation of the second aspect.
  • In yet another aspect of the present application, a data exchange device includes: a processor, a memory, a communication interface, and a bus, where the processor, the memory, and the communication interface are connected through the bus; the memory is used to store program code, and when the program code is executed by the processor, the data exchange device executes the data exchange method provided in the third aspect or any possible implementation of the third aspect.
  • A computer-readable storage medium is provided, in which a computer program or instructions are stored; when the computer program or instructions are executed, the data exchange method provided in the first aspect or any possible implementation of the first aspect is performed.
  • A computer-readable storage medium is provided, in which a computer program or instructions are stored; when the computer program or instructions are executed, the data exchange method provided in the second aspect or any possible implementation of the second aspect is performed.
  • A computer-readable storage medium is provided, in which a computer program or instructions are stored; when the computer program or instructions are executed, the data exchange method provided in the third aspect or any possible implementation of the third aspect is performed.
  • A computer program product includes a computer program or instructions; when the computer program or instructions are executed, the data exchange method provided in the first aspect or any possible implementation of the first aspect is performed.
  • A computer program product includes a computer program or instructions; when the computer program or instructions are executed, the data exchange method provided in the second aspect or any possible implementation of the second aspect is performed.
  • A computer program product includes a computer program or instructions; when the computer program or instructions are executed, the data exchange method provided in the third aspect or any possible implementation of the third aspect is performed.
  • FIG. 1 is a schematic structural diagram of a data exchange network provided by an embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of another data exchange network provided by an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of another data exchange network provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of data flow congestion in a data exchange network provided by an embodiment of the present application.
  • FIG. 5 is a schematic flow chart of a data exchange method provided by an embodiment of the present application.
  • FIG. 6 is a schematic flow chart of another data exchange method provided by the embodiment of the present application.
  • FIG. 7 is a schematic diagram of a data transmission congestion notification provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of storing multiple data blocks at multiple switching nodes provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a plurality of switching nodes sending block description information provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a destination node scheduling multiple data blocks provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a source node provided by an embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of another source node provided by the embodiment of the present application.
  • FIG. 13 is a schematic structural diagram of a switching node provided in an embodiment of the present application.
  • FIG. 14 is a schematic structural diagram of another switching node provided in the embodiment of the present application.
  • FIG. 15 is a schematic structural diagram of a destination node provided by an embodiment of the present application.
  • FIG. 16 is a schematic structural diagram of another destination node provided by the embodiment of the present application.
  • "At least one (item) of a, b or c" can represent: a; b; c; a and b; a and c; b and c; or a, b and c, where each of a, b and c can be single or multiple.
  • words such as "first" and "second” do not limit the quantity and order.
  • the technical solution provided by this application can be applied to various data exchange networks.
  • the data exchange network can be a large data exchange network or a small data exchange network.
  • a small data exchange network can also be called a data exchange system.
  • the data switching network may include multiple switching nodes, and the switching nodes may also be referred to as nodes.
  • the switching node may be a switching device such as a switch or a router, or may be a switching board or a switching element (switch element, SE).
  • the switching board may also be called a switching network card or a network interface card (network interface card, NIC), and one switching board may include one or more switching units.
  • The data exchange network may include a data center network (data center network, DCN), a high performance computing (high performance computing, HPC) network, a cloud network, or a network on chip within a single chip or formed after multiple chips are packaged, etc.
  • the structure of the data exchange network will be illustrated below with reference to FIGS. 1-3 .
  • FIG. 1 is a schematic structural diagram of a data switching network provided by an embodiment of the present application, and the data switching network includes three switching layers.
  • Specifically, the data exchange network includes an access layer, an aggregation layer and a core layer: the access layer includes a plurality of access nodes, the aggregation layer includes a plurality of aggregation nodes, and the core layer includes a plurality of core nodes; the downlink ports of an access node are connected to the servers that need to exchange data traffic, the uplink ports of an access node are connected to the downlink ports of aggregation nodes, and the uplink ports of an aggregation node are connected to core nodes.
  • the aggregation layer and the access layer can be divided into multiple groups (pods), a group can include multiple access nodes and multiple aggregation nodes, and each access node is fully connected to multiple aggregation nodes .
  • Multiple core nodes connected to the same aggregation node may be referred to as a core plane, and each core plane is connected to a different aggregation node in each group.
  • the data exchange network includes 3 groups, a group includes 3 access nodes and 4 aggregation nodes, and each core plane includes 2 core nodes as an example for illustration.
  • The access nodes in FIG. 1 can be represented as A1-A9, the aggregation nodes as B1-B12, the core nodes as C1-C8, and the three groups as P1-P3.
  • When data traffic is exchanged between servers connected to different access nodes in the same group, the exchange can be realized through an aggregation node in that group. For example, if a server connected to access node A1 needs to exchange data traffic with a server connected to access node A3, access node A1 can send the data flow of its connected server to access node A3 through aggregation node B1.
  • When data traffic is exchanged between servers connected to access nodes in different groups, the exchange can be realized through an aggregation node in the same group as the access node and a core node connected to that aggregation node. For example, if a server connected to access node A1 needs to exchange data traffic with a server connected to access node A5, access node A1 can send the data flow of its connected server to aggregation node B1, aggregation node B1 forwards it to core node C1, and C1 then sends it to access node A5 through aggregation node B5.
  • FIG. 2 is a schematic structural diagram of another data switching network provided by an embodiment of the present application.
  • the data switching network includes two switching layers.
  • Specifically, the data exchange network includes a relay layer (also referred to as a TOR layer) and a backbone (spine) layer; the relay layer includes a plurality of leaf nodes, and the backbone layer includes a plurality of backbone nodes; the downlink ports of a leaf node are connected to servers that need to exchange data traffic, and the uplink ports of a leaf node are connected to the plurality of backbone nodes.
  • the data exchange network includes 4 leaf nodes and 2 backbone nodes as an example for illustration.
  • the leaf nodes in Fig. 2 can be expressed as A1-A4, and the backbone nodes can be expressed as C1-C2.
  • When data traffic is exchanged between two servers connected to the same leaf node, the exchange can be realized through that leaf node; for example, servers S1 and S2, both connected to leaf node A1, can exchange data traffic through A1.
  • When data traffic is exchanged between two servers connected to different leaf nodes, the exchange can be realized through the leaf nodes and a backbone node; for example, if server S1 connected to leaf node A1 needs to exchange data traffic with server S3 connected to leaf node A2, leaf node A1 can send the data flow from server S1 to backbone node C1, and backbone node C1 forwards the data flow to leaf node A2.
  • FIG. 3 is a schematic structural diagram of another data switching network provided by an embodiment of the present application.
  • the data switching network may be a network on chip.
  • the data switching network includes a plurality of switching chips, each of which includes a plurality of switching units, and all switching units in the data switching network can be interconnected in a certain manner.
  • the data switching network includes 4 switching chips D1-D4, each switching chip includes 9 switching units, and the switching units in the data switching network are respectively represented as 1-35 for illustration.
  • Each switching unit can have one or more input ports and one or more output ports; the input ports can be used to receive data packets or cells input from the outside, and the output ports can be used to output data packets or cells to the outside.
  • the interconnection between multiple switching units in the data switching network can be used to switch data packets or cells received at each input interface to corresponding output ports.
  • Each switching unit in the data switching network may include at least one buffer queue, and the at least one buffer queue may be used for buffering the data packets or cells destined for different output ports.
  • the interconnection relationship of multiple switching units shown in FIG. 3 is only exemplary, and does not limit the embodiment of the present application.
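  • A minimal sketch of a switching unit that keeps one buffer queue per output port, as described above, is shown below; the port count and the cell representation are illustrative assumptions.

```python
# Minimal sketch of a switching unit with one buffer queue per output port;
# the port count and the use of raw bytes for cells are assumptions.
from collections import deque
from typing import Optional


class SwitchingUnit:
    def __init__(self, num_output_ports: int):
        # One queue per output port, buffering cells or packets destined for that port.
        self.queues = [deque() for _ in range(num_output_ports)]

    def receive(self, cell: bytes, output_port: int) -> None:
        # Buffer an incoming cell in the queue of its destination output port.
        self.queues[output_port].append(cell)

    def transmit(self, output_port: int) -> Optional[bytes]:
        # Send the oldest buffered cell of the chosen output port, if any.
        queue = self.queues[output_port]
        return queue.popleft() if queue else None
```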
  • In the above data exchange networks, each switching node has a certain switching capability and a certain buffer capacity. If the traffic to be scheduled by a switching node exceeds its switching capability and buffering capability, head-of-line blocking, packet loss and other phenomena will occur, which not only affects the switching efficiency of the data switching network but also increases the switching delay. Therefore, how to reduce congestion in the data exchange network is a technical problem that urgently needs to be solved.
  • FIG. 4 shows a schematic diagram of exchanging data between a source server S0 and a target server D0 through a data exchange network.
  • In FIG. 4, the data exchange network including multiple switching nodes represented as A1-A2, B1-B4 and C1-C2, with the source server S0 interconnected with A1 and A2, the destination server D0 interconnected with C1 and C2, A1 connected to C1 through B1 and B2, and A2 connected to C2 through B3 and B4, is taken as an example for illustration.
  • the data exchange network may adopt the following several flow control methods to reduce the congestion.
  • the first is to reduce congestion through passive congestion control mechanisms. Specifically, if C1 is congested, C1 can notify the source server S0 to reduce the sending rate through explicit congestion notification (ECN), or through priority-based flow control (PFC) Notify B1 and B2 to stop sending data respectively, or discard newly received data packets by tail drop.
  • ECN explicit congestion notification
  • PFC priority-based flow control
  • different flow control methods are adopted for different data streams (the data streams can be distinguished by priority, ports, etc.).
  • ECN and PFC can be used for high-priority service data streams
  • ECN and tail drop can be used for low-priority service data streams.
  • However, the buffer capacity of a single switching node is limited and cannot grow indefinitely, so congestion will still occur when the amount of data is large. In addition, the growth rate of buffer capacity is much smaller than the growth rate of port bandwidth, which weakens the ability of a single switching node to withstand burst traffic. Moreover, if a certain switching node triggers flow control, the flow control may spread to the entire data switching network, further causing head-of-line blocking.
  • the second is to reduce congestion through active congestion control mechanisms.
  • Specifically, the source server S0 can actively obtain information such as link status or rate through active detection means such as probe packets and local flow tables (for example, when C1 is congested, it can return a detection result carrying this information to the source server S0), and the source server S0 then directly controls the sending rate of the local data flow according to this information.
  • However, the active congestion control mechanism is only suitable for small-scale data exchange networks: probe packets and local flow tables occupy a certain amount of bandwidth and cache, and the number of data flows in a large-scale data exchange network is very large, which increases the difficulty of implementation. In addition, the active congestion control mechanism cannot control bursts of a large number of data flows with small data volumes. Therefore, the scope of use of active congestion control mechanisms is limited.
  • the third is to reduce congestion through an adaptive path control mechanism.
  • Specifically, if C1 is congested, C1 can notify an upstream node to switch the data flow f to another available path (for example, C1 notifies the source server S0, and the source server S0 switches the data flow f to the path where A2, B4 and C2 are located).
  • the essence of the adaptive path control mechanism is to fully utilize the bandwidth of the available paths in light-loaded networks to improve bandwidth utilization.
  • However, this adaptive path control mechanism is generally applicable to scenarios where HPC and the network are convergent, and cannot handle the scenario where the destination itself is congested.
  • In view of this, the embodiment of the present application provides a data exchange method, which can make full use of the caches of multiple different switching nodes in the data exchange network to store the data of a data flow when that data flow is congested at a certain switching node, thereby reducing congestion, improving the switching efficiency of the data switching network and reducing the switching delay. In other words, the embodiment of the present application can pool the caches of all nodes in the data exchange network, that is, virtualize the caches of all nodes in the network into a large-capacity cache pool presented to users, so as to realize a virtual large-cache capability and improve the ability to absorb bursts.
  • each node can include a data plane and a control plane, the data plane is used to transmit data, and the control plane is used to transmit control signaling, so as to realize the transmission of data and signaling between different nodes.
  • FIG. 5 is a schematic flowchart of a data exchange method provided by an embodiment of the present application. The method can be applied to any data exchange network provided above, and the method includes the following steps.
  • FIG. 6 is an example of applying the data exchange method to a data exchange network.
  • the source node receives flow indication information from the first switching node, where the flow indication information is used to indicate that a target data flow is congested.
  • the source node may be a source server of the target data flow, or may be a switching node connected to the source server in the data switching network.
  • the destination node hereinafter may be the destination server of the target data flow, or may be a switching node accessed by the destination server in the data switching network.
  • the first switching node may be any switching node in the switching path where the target data flow is located in the data switching network.
  • the target data flow may be a data flow that is congested or will be congested in the first switching node, that is, the target data flow may be a data flow determined by the first switching node, and the first switching node is shown as a congested node in FIG. 6 .
  • the flow indication information may include a flow identifier of the target data flow.
  • the flow indication information may be carried in the congestion notification.
  • the first switching node may determine that the target data flow is congested according to one or more of parameters such as the transmission rate of the target data flow in the first switching node, real-time queue length, queue scheduling priority, and buffer occupancy status. For example, if the real-time queue length of the target data flow in the first switching node is greater than a preset length, or the buffer occupancy status is greater than a preset occupancy rate, the first switching node may determine that the target data flow is a congested flow. It should be noted that, for a specific process of determining that the target data flow is congested by the first switching node, reference may be made to the description in related technologies, which is not specifically limited in this embodiment of the present application.
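  • A hedged sketch of such a congestion decision is shown below; the application lists transmission rate, real-time queue length, queue scheduling priority and buffer occupancy as possible inputs, and the concrete threshold values used here are assumptions.

```python
# Hedged sketch of the congestion decision: real-time queue length and buffer
# occupancy are compared against thresholds, as in the examples above. The
# default threshold values are assumptions, not values from the application.
def is_congested(queue_length: int, buffer_occupancy: float,
                 max_queue_length: int = 1000, max_occupancy: float = 0.8) -> bool:
    """Return True if the target data flow should be treated as congested."""
    return queue_length > max_queue_length or buffer_occupancy > max_occupancy


# A switching node that detects congestion would then send flow indication
# information (e.g. carrying the flow identifier) back towards the source node.
if is_congested(queue_length=1500, buffer_occupancy=0.5):
    flow_indication = {"flow_id": "f", "congested": True}   # assumed message format
```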
  • In this case, the first switching node may send, to the source node, flow indication information used to indicate the target data flow, so that the source node receives the flow indication information.
  • Specifically, when the first switching node is directly interconnected with the source node, the first switching node may send the flow indication information to the source node directly; when the first switching node is interconnected with the source node through other switching nodes, the first switching node may send the flow indication information to the source node through those other switching nodes.
  • For example, suppose the source node is the source server S0, the switching path where the target data flow is located is S0-A1-B1-C1-D0, and the first switching node is C1. When the switching node C1 determines that the target data flow f is a congested flow, it can send a congestion notification towards the source server S0: the switching node B1 forwards the received congestion notification to the switching node A1, and the switching node A1 forwards it to the source server S0, so that the source server S0 receives the congestion notification.
  • It should be noted that the information transmission between any two nodes (for example, between a server and a switching node, or between two switching nodes) in the data switching network may include a control plane and a data plane, where the control plane is used to transmit control signaling and the data plane is used to transmit data.
  • the control signaling may include the above-mentioned flow indication information and congestion notification, and may also include block description information, storage indication information, scheduling information, and the like mentioned below.
  • the data transmitted on the data plane may include cells, data packets, and data blocks (data block, DB).
  • The source node sends multiple pieces of write data information and multiple data blocks of the target data flow to multiple switching nodes, where the multiple pieces of write data information are used to instruct the multiple switching nodes to store the multiple data blocks and stop forwarding the multiple data blocks.
  • the plurality of switching nodes may include some switching nodes in the data switching network, or may include all switching nodes in the data switching network.
  • the multiple switching nodes may include the first switching node, or may not include the first switching node.
  • the multiple switching nodes are shown as cache nodes, and the multiple switching nodes do not include the first switching node (ie, the congested node) as an example for illustration.
  • each data block of the plurality of data blocks may include a certain number of cells or data packets.
  • the number of cells or data packets included in different data blocks among the multiple data blocks may be the same or different.
  • the lengths of different data blocks may be the same or different; that is, the multiple data blocks may be fixed-length data blocks or variable-length data blocks.
  • Each piece of write data information can be used to instruct the corresponding switching node to store at least one of the plurality of data blocks; that is, the source node can send one or more data blocks to a switching node. Hereinafter, the case where the source node sends one data block to each switching node is taken as an example for illustration.
  • the write data information may include a data block identifier and a write data identifier.
  • the data block identifier may be used to identify the data block, and may be used to indicate the position of the data block in the plurality of data blocks, for example, the data block identifier may be a serial number of the data block.
  • the write data identifier may be used to instruct the switching node receiving the write data identifier to store the data block locally and stop forwarding the data block.
  • stopping forwarding the data block may mean that the switching node that receives the data block does not send the data block to the lower-level nodes when the switching node does not receive the scheduling information for scheduling the data block. That is, the switching node can send the data block to the lower-level node only after receiving the scheduling information for scheduling the data block.
  • the plurality of data blocks may be obtained by the source node by dividing the data to be exchanged in the target data stream, and the data to be exchanged may refer to unsent data of the target data stream stored in the source node .
  • Specifically, the source node may divide the data to be exchanged in the target data flow into the multiple data blocks, each of the multiple data blocks may correspond to a data block identifier, and the number of the multiple data blocks may be greater than or equal to the number of the multiple switching nodes.
  • Accordingly, the source node may divide the data to be exchanged in the target data flow indicated by the flow indication information into the multiple data blocks; for each of the multiple data blocks, the source node may send write data information and the data block to one of the plurality of switching nodes, so as to instruct that switching node, through the write data information, to store the data block and stop forwarding it.
  • the source node may send the multiple data blocks to the multiple switching nodes in a load balancing manner.
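  • The following sketch illustrates this step under assumed message field names: the source node divides the data still to be exchanged into numbered data blocks and pairs each block with write data information, distributing the blocks over the caching switching nodes in a simple load-balancing (round-robin) fashion.

```python
# Sketch of the distribution step under assumed field names: divide the data to
# be exchanged into numbered blocks, attach write data information to each block,
# and spread the blocks over the caching nodes in round-robin fashion.
def build_write_requests(pending_data: bytes, cache_nodes: list, block_size: int = 1024):
    # Divide the data to be exchanged into data blocks (fixed-length here;
    # variable-length blocks would work the same way).
    blocks = [pending_data[i:i + block_size]
              for i in range(0, len(pending_data), block_size)]
    requests = []
    for seq, block in enumerate(blocks, start=1):
        node = cache_nodes[(seq - 1) % len(cache_nodes)]   # simple load balancing
        write_data_info = {
            "data_block_id": seq,   # position of the block within the target data flow
            "write_flag": True,     # store the block locally and stop forwarding it
        }
        requests.append((node, write_data_info, block))
    return requests


# Example: spread 6 KiB of pending data over the caching nodes of FIG. 8.
for node, info, block in build_write_requests(b"x" * 6144, ["C1", "B1", "B2", "B3", "B4", "C2"]):
    pass  # the source node would send (info, block) towards `node` here
```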
  • It should be noted that, when the source node is directly interconnected with a switching node, the source node can send the write data information to that switching node directly; when the source node is interconnected with a switching node through other switching nodes, the source node may send the write data information to that switching node through those other switching nodes.
  • the multiple switching nodes include B1-B4 and C1-C2.
  • The source server S0 can divide the data to be exchanged in the target data flow into 6 data blocks, represented as DB1-DB6, and send DB1-DB6 to the multiple switching nodes as follows: send DB1 and the corresponding write data information to C1 through A1 and B1 (or B2), send DB2 and the corresponding write data information to B1 through A1, send DB3 and the corresponding write data information to B2 through A1, send DB4 and the corresponding write data information to B3 through A2, send DB5 and the corresponding write data information to B4 through A2, and send DB6 and the corresponding write data information to C2 through A2 and B3 (or B4).
  • It should be noted that FIG. 8 only shows the data blocks sent by the source server S0.
  • the write data information may also be used to indicate the identifier (for example, sequence number) of the data packet included in the data block.
  • the write data information may include the sequence number of the first data packet and the sequence number of the last data packet in the data block.
  • the data block identifier included in the write data information may be related to the identifier of the data packet included in the data block, so that the data block identifier can be used to determine the identifier of the data packet included in the data block .
  • After receiving the write data information and the data block, the switching node can parse the write data information to obtain the data block identifier and the write data identifier, and determine according to the write data identifier that the data block needs to be stored, so that the switching node stores the data block locally, for example in its cache, and does not forward the data block to lower-level switching nodes.
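  • A sketch of how a caching switching node might handle the write data information is shown below: it parses the data block identifier and write data identifier, stores the block locally instead of forwarding it, and returns block description information. The in-memory store and field names are assumptions consistent with the earlier sketch.

```python
# Sketch of a caching switching node handling write data information: store the
# block locally instead of forwarding it, and produce block description
# information for later scheduling. Field names mirror the earlier sketches and
# are assumptions.
class CacheNode:
    def __init__(self, node_id: str):
        self.node_id = node_id
        self.store = {}          # data_block_id -> (storage address, block)
        self.next_address = 0

    def handle_write(self, write_data_info: dict, block: bytes):
        block_id = write_data_info["data_block_id"]
        if not write_data_info.get("write_flag"):
            return None          # no write flag: normal forwarding would apply instead
        # Store the block in the local cache and do not forward it to lower-level nodes.
        address = self.next_address
        self.store[block_id] = (address, block)
        self.next_address += len(block)
        # Block description information: which node stored the block and where,
        # so that the destination node can schedule it later.
        return {"node_id": self.node_id, "data_block_id": block_id, "address": address}
```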
  • the multiple switching nodes send multiple block description information to the destination node, the multiple block description information corresponds to the multiple data blocks one by one, and each block description information is used to indicate the node information that stores the corresponding data block .
  • the block description information may be used to indicate the switching node storing the data block, and may also be used to indicate the storage address of the data block in the switching node, and the storage address may be a physical address or a logical address.
  • the block description information includes the identification of the switching node and the storage address of the data block in the switching node; further, the block description information may also include the identification of the data block.
  • the block description information may also include identifiers of data packets included in the data block.
  • Specifically, the switching node can generate block description information corresponding to the data block according to information such as the identifier of the switching node, the identifier of the data block, and the storage address of the data block, and send the block description information to the destination node.
  • the switching node can directly send the block description information to the destination node; when the switching node is interconnected with the destination node through other switching nodes, the switching node can send the block description information through other switching nodes to The destination node sends the block description information.
  • Optionally, the switching node may also send the block description information to the source node; after receiving multiple pieces of block description information, the source node may send storage indication information to the destination node according to the multiple pieces of block description information, where the storage indication information is used to indicate the node information for storing the multiple data blocks.
  • the storage indication information may be determined by the source node according to the plurality of block description information, and the source node may directly carry the plurality of block description information in the storage indication information, or may process the block description information (such as , the source node may carry the identifier of the data packet included in the corresponding data block in the block description information) and then send the storage indication information to the destination node.
  • Optionally, when the multiple switching nodes send the block description information to the destination node, or when the multiple switching nodes send the block description information to the source node so that the source node forwards it to the destination node, the source node may also separately send, to the destination node, packet description information used to indicate the data packets included in each of the plurality of data blocks; the packet description information may include the identifiers of the data packets.
  • For example, the process in which the multiple switching nodes (that is, B1-B4 and C1-C2) send the block description information to the destination node (that is, the destination server D0) includes: the switching node C1 sends the block description information BD1 of data block DB1 to the destination server D0; the switching node B1 sends BD2 to the destination server D0 through C1; the switching node B2 sends BD3 to the destination server D0 through C1; the switching node B3 sends BD4 to the destination server D0 through C2; the switching node B4 sends BD5 to the destination server D0 through C2; and the switching node C2 sends BD6 to the destination server D0. In addition, the source sends the packet description information PD1-PD6 corresponding to the multiple data blocks to the destination.
  • Alternatively, the process in which the multiple switching nodes (that is, B1-B4 and C1-C2) send the block description information to the source node (that is, the source server S0) includes: the switching node C1 sends the block description information BD1 to the source server S0 through the switching node B1 (or B2); the switching node B1 sends BD2 to the source server S0; the switching node B2 sends BD3 to the source server S0; the switching node B3 sends BD4 to the source server S0; the switching node B4 sends BD5 to the source server S0; and the switching node C2 sends BD6 to the source server S0 through the switching node B3 (or B4). The source server S0 then sends storage indication information to the destination server D0, and the storage indication information may carry the block description information BD1-BD6.
  • S205: The destination node receives a plurality of pieces of block description information from the plurality of switching nodes, and determines the order of the plurality of data blocks according to the plurality of pieces of block description information.
  • S205 may specifically be: the destination node receives storage indication information from the source node, where the storage indication information may include the multiple pieces of block description information, and the order of the multiple data blocks is determined according to the multiple pieces of block description information.
  • Specifically, the destination node can determine the order of the corresponding plurality of data blocks according to the plurality of pieces of block description information, that is, determine the order of the plurality of data blocks in the target data flow. Further, the destination node can also determine the packet description information corresponding to each data block according to the multiple pieces of block description information or the storage indication information, that is, determine the data packets included in each of the multiple data blocks and their order, etc.
  • For example, each piece of block description information includes the sequence number of the corresponding data block and the sequence numbers of the data packets included in that data block; the destination node determines the order of the multiple data blocks according to the data block sequence numbers in the multiple pieces of block description information, and determines the order of the data packets in the plurality of data blocks according to the packet sequence numbers included in each piece of block description information.
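  • The following sketch shows how the destination node might derive a scheduling order from the collected block description information, sorting by data block sequence number and, within a block, by packet sequence number; the field names follow the assumptions of the earlier sketches.

```python
# Sketch of deriving a scheduling order from the collected block descriptions:
# sort by data block sequence number; packet order within a block follows the
# packet sequence numbers carried in the description. Field names are assumptions.
def build_schedule(block_descriptions: list) -> list:
    """block_descriptions: dicts with 'data_block_id', 'node_id', 'address' and,
    optionally, 'packet_ids' (sequence numbers of the packets in the block)."""
    ordered = sorted(block_descriptions, key=lambda d: d["data_block_id"])
    schedule = []
    for desc in ordered:
        schedule.append({
            "target_node": desc["node_id"],          # switching node to read from
            "address": desc["address"],              # where the block is stored there
            "data_block_id": desc["data_block_id"],
            "packet_order": sorted(desc.get("packet_ids", [])),
        })
    # Scheduling information (e.g. read commands) is then issued in this order.
    return schedule
```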
  • the destination node sends scheduling information to each of the multiple switching nodes, where the scheduling information is used to schedule data blocks stored in the switching node.
  • Specifically, while ensuring throughput, the destination node can schedule the multiple data blocks sequentially through the scheduling information in accordance with the determined order of the multiple data blocks, so that the destination node obtains the multiple data blocks in that order; that is, the scheduling information is used to ensure the order in which the destination node receives the multiple data blocks.
  • The scheduling information sent by the destination node to a switching node may include the identifier of the switching node, the storage address of the scheduled data block in the switching node, the identifier of the data block, and the like.
  • FIG. 6 takes the scheduling information as a read command as an example for illustration.
  • It should be noted that, when the destination node reads a corresponding data block from a switching node through scheduling information, the destination node can obtain the data block through a single scheduling operation, or obtain the data block through multiple scheduling operations.
  • the scheduling information sent by the destination node each time may also be used to indicate the amount of data currently scheduled, or to indicate the identifier of the currently scheduled data packet, and the like.
  • When the destination node is directly interconnected with a switching node, the destination node can send the scheduling information to that switching node directly; when the destination node is interconnected with a switching node through other switching nodes, the destination node may send the scheduling information to that switching node through the other switching nodes.
  • For example, the process in which the destination node (that is, the destination server D0) sends scheduling information to the multiple switching nodes may include: sending, to the switching node C1, the read command RD1 for scheduling DB1; sending, to the switching node B1 through C1, the read command RD2 for scheduling DB2; sending, to the switching node B2 through C1, the read command RD3 for scheduling DB3; sending, to the switching node B3 through C2, the read command RD4 for scheduling DB4; sending, to the switching node B4 through C2, the read command RD5 for scheduling DB5; and sending, to the switching node C2, the read command RD6 for scheduling DB6.
  • the destination node can create a request link list based on the source node, and based on the request link list, the data flows of multiple source nodes can be fairly scheduled.
  • the destination node can also schedule data streams according to different scheduling levels, such as scheduling according to egress ports, queue priorities, data streams, and buffer pool linked lists.
  • the buffer pool linked list can be used to indicate the order and storage locations of the multiple data blocks of the same data stream that are stored in different switching nodes; one possible model is sketched below.
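One possible way to model the buffer pool linked list and the per-source request lists mentioned above is sketched here. The data structures and the round-robin policy are illustrative assumptions, not requirements of this application.

```python
from collections import OrderedDict, deque

class BufferPoolList:
    """Per-flow list: block order -> (switching node, storage address)."""
    def __init__(self):
        self.entries = OrderedDict()   # block_seq -> (node_id, address)

    def add(self, block_seq, node_id, address):
        self.entries[block_seq] = (node_id, address)

    def in_order(self):
        # Blocks of the flow in stream order, with where each one is cached.
        return [(seq, *loc) for seq, loc in sorted(self.entries.items())]

def fair_schedule(request_lists):
    """Round-robin over per-source request lists so every source gets a turn.

    request_lists: dict mapping source_id -> deque of pending read requests.
    Yields (source_id, request) pairs in a fair interleaving.
    """
    sources = deque(request_lists.keys())
    while sources:
        src = sources.popleft()
        pending = request_lists[src]
        if pending:
            yield src, pending.popleft()
            sources.append(src)   # source still has work: keep it in rotation

# Example:
# reqs = {"S0": deque(["RD1", "RD2"]), "S1": deque(["RDa"])}
# list(fair_schedule(reqs))  # [('S0', 'RD1'), ('S1', 'RDa'), ('S0', 'RD2')]
```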
  • when a switching node receives the corresponding scheduling information, it can read the corresponding data block locally according to the scheduling information and send the data block to the destination node, so that the destination node receives the data block.
  • when the destination node receives the multiple data blocks in their order, it can output each data block at a certain bandwidth or rate as it is received, thereby outputting the multiple data blocks and completing the exchange of the target data stream.
  • when the destination node obtains a data block through a single scheduling operation, the switching node can read the entire data block locally according to the scheduling information and send it to the destination node; when the destination node obtains the data block through multiple scheduling operations, the switching node may send the data block to the destination node over multiple transmissions according to the scheduling information.
  • when the destination node is directly interconnected with the switching node, the switching node can send the data block to the destination node directly; when the destination node is interconnected with the switching node through other switching nodes, the switching node can send the data block to the destination node through those other switching nodes; a simplified model of the switching-node side is sketched below.
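The switching-node side of this exchange can be pictured as follows: a block received with write data information is held locally instead of being forwarded, and is released towards the destination node only when a matching read command arrives. The sketch below is a simplified model under that assumption (ReadCommand refers to the hypothetical structure in the earlier sketch).

```python
class SwitchingNodeCache:
    """Very simplified cache behaviour of a switching node in this scheme."""
    def __init__(self):
        self.stored = {}   # address -> data block payload

    def on_write_data(self, address, block):
        # Write data information: store the block locally and stop forwarding it.
        self.stored[address] = block
        # deliberately no forwarding to the next hop here

    def on_read_command(self, cmd, send_to_destination):
        # Scheduling information from the destination node: release the block.
        block = self.stored.pop(cmd.address, None)
        if block is not None:
            send_to_destination(cmd.block_id, block)

# Example (using the hypothetical ReadCommand from the earlier sketch):
# node = SwitchingNodeCache()
# node.on_write_data(0x1000, b"...payload of DB1...")
# node.on_read_command(ReadCommand("C1", 0x1000, 1), send_to_destination=print)
```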
  • in the embodiments of this application, when the first switching node determines that the target data flow is congested, it can notify the source node, so that the source node can store the multiple data blocks of the target data flow in multiple switching nodes respectively; that is, the source node can store the multiple data blocks in the larger-capacity buffer pool formed by the multiple switching nodes, and the destination node can then schedule the corresponding data blocks from the multiple switching nodes. In this way the data switching network can provide a larger cache, reduce congestion of the target data flow, avoid head-of-line blocking, and improve the ability to absorb burst traffic, thereby improving data exchange efficiency and reducing exchange delay.
  • it can be understood that, in order to realize the above functions, each network element, such as the source node, a switching node, and the destination node, includes a corresponding hardware structure and/or software module for performing each function.
  • in combination with the units and algorithm steps of each example described in the embodiments disclosed herein, the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementations should not be regarded as going beyond the scope of the present application.
  • the embodiments of the present application can divide the source node, switching node, and destination node into function modules according to the above method examples; for example, each function module can be divided corresponding to each function, or two or more functions can be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or in the form of software function modules. It should be noted that the division of modules in the embodiment of the present application is schematic, and is only a logical function division, and there may be other division methods in actual implementation. The following is an example of dividing each function module by corresponding function:
  • FIG. 11 shows a possible structural diagram of the data exchange device involved in the above embodiment.
  • the data exchange device may be a source node or a built-in chip of the source node, and the data exchange device includes: a receiving unit 301 and a sending unit 302 .
  • the receiving unit 301 is used to support the data switching device to execute S201 in the method embodiment and/or receive the block description information sent by the switching node;
  • the sending unit 302 is used to support the data exchange device in executing S202 in the method embodiment and/or the step of sending storage indication information to the destination node.
  • the data exchange device may further include a processing unit 303, and the processing unit 303 is configured to perform the steps in which the data exchange device divides the data to be exchanged into multiple data blocks, parses the flow indication information, and/or parses the block description information (a sketch of the block-division step follows). For all relevant content of the steps involved in the above method embodiments, reference can be made to the function descriptions of the corresponding functional modules, which are not repeated here.
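As a rough illustration of the block-division step attributed to processing unit 303, the sketch below splits the pending data of the target data stream into at least as many blocks as there are available switching nodes (assuming the pending data is larger than the number of nodes); the sizing policy is an assumption for illustration only.

```python
def divide_into_blocks(pending_data: bytes, num_switching_nodes: int) -> list:
    """Split the data to be exchanged into data blocks.

    The number of blocks is kept greater than or equal to the number of
    switching nodes (assuming len(pending_data) >= num_switching_nodes),
    so that every caching node can be given at least one block.
    """
    if num_switching_nodes <= 0:
        raise ValueError("need at least one switching node")
    block_size = max(1, len(pending_data) // num_switching_nodes)
    return [pending_data[i:i + block_size]
            for i in range(0, len(pending_data), block_size)]

# Example: 6 caching nodes (B1-B4, C1-C2) -> at least 6 blocks
# blocks = divide_into_blocks(b"x" * 6000, 6)
# len(blocks) >= 6  -> True
```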
  • on the basis of a hardware implementation, the processing unit 303 in this application may be a processor of the data exchange device, the receiving unit 301 may be a receiver of the data exchange device, and the sending unit 302 may be a transmitter of the data exchange device; the transmitter can generally be integrated with the receiver as a transceiver, and such a transceiver may also be referred to as a communication interface.
  • FIG. 12 is a schematic diagram of a possible logical structure of the data exchange device involved in the above-mentioned embodiments provided by the embodiments of the present application.
  • the data exchange device may be a source node or a built-in chip of the source node, and the data exchange device includes: a processor 312 and a communication interface 313 .
  • the processor 312 is used to control and manage the actions of the data exchange device; for example, the processor 312 is used to support the data exchange device in executing the steps of dividing the data to be exchanged into multiple data blocks, parsing the flow indication information, and parsing the block description information in the method embodiment, and/or other processes of the techniques described herein.
  • the data exchange device may also include a memory 311 and a bus 314; the processor 312, the communication interface 313, and the memory 311 are connected to each other through the bus 314; the communication interface 313 is used to support the data exchange device in communicating; the memory 311 is used to store the program code and data of the data exchange device.
  • the processor 312 may be a central processing unit, a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic devices, transistor logic devices, hardware components or any combination thereof. It can implement or execute the various illustrative logical blocks, modules and circuits described in connection with the present disclosure.
  • the processor may also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, and the like.
  • the bus 314 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like.
  • FIG. 13 shows a possible structural diagram of the data exchange device involved in the above embodiment.
  • the data switching device may be a switching node or a built-in chip of the switching node, and the data switching device includes: a receiving unit 401 and a processing unit 402 .
  • the receiving unit 401 is used to support the data exchange device in performing the steps of receiving write data information and/or receiving scheduling information in the method embodiment;
  • the processing unit 402 supports the data exchange device to perform S203 in the method embodiment.
  • the data switching device may further include a sending unit 403, configured to support the data switching device to perform the steps of sending flow indication information, sending block description information, and/or sending data blocks in the above embodiments. All relevant content of the steps involved in the above method embodiments can be referred to the function descriptions of the corresponding functional modules, and will not be repeated here.
  • on the basis of a hardware implementation, the processing unit 402 in this application may be a processor of the data exchange device, the receiving unit 401 may be a receiver of the data exchange device, and the sending unit 403 may be a transmitter of the data exchange device; the transmitter can generally be integrated with the receiver as a transceiver, and such a transceiver may also be referred to as a communication interface.
  • FIG. 14 is a schematic diagram of a possible logical structure of the data exchange device involved in the above-mentioned embodiments provided by the embodiments of the present application.
  • the data exchange device may be a switching node or a built-in chip of the switching node, and the data exchange device includes: a processor 412 and a communication interface 413.
  • the processor 412 is used to control and manage the actions of the data exchange device, for example, the processor 412 is used to support the data exchange device to execute S203 in the method embodiment, and/or other processes for the technologies described herein.
  • the data exchange device may also include a memory 411 and a bus 414; the processor 412, the communication interface 413, and the memory 411 are connected to each other through the bus 414; the communication interface 413 is used to support the data exchange device in communicating; the memory 411 is used to store the program code and data of the data exchange device.
  • the processor 412 may be a central processing unit, a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic devices, transistor logic devices, hardware components or any combination thereof. It can implement or execute the various illustrative logical blocks, modules and circuits described in connection with the present disclosure.
  • the processor may also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, and the like.
  • the bus 414 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like.
  • FIG. 15 shows a possible structural diagram of the data exchange device involved in the above embodiment.
  • the data exchange device may be a destination node or a built-in chip of the destination node, and the data exchange device includes: a receiving unit 501 and a sending unit 502 .
  • the receiving unit 501 is used to support the data exchange device in executing the step of receiving block description information or storage indication information in the method embodiment;
  • the sending unit 502 supports the data exchange device to execute S206 in the method embodiment.
  • the data exchange device may further include a processing unit 503, and the processing unit 503 is configured to support the data exchange device in executing S205 in the method embodiment. For all relevant content of the steps involved in the above method embodiments, reference can be made to the function descriptions of the corresponding functional modules, which are not repeated here.
  • on the basis of a hardware implementation, the processing unit 503 in this application may be a processor of the data exchange device, the receiving unit 501 may be a receiver of the data exchange device, and the sending unit 502 may be a transmitter of the data exchange device; the transmitter can generally be integrated with the receiver as a transceiver, and such a transceiver may also be referred to as a communication interface.
  • FIG. 16 is a schematic diagram of a possible logical structure of the data exchange device involved in the above-mentioned embodiments provided by the embodiments of the present application.
  • the data exchange device may be a destination node or a built-in chip of the destination node, and the data exchange device includes: a processor 512 and a communication interface 513.
  • the processor 512 is used to control and manage the actions of the data exchange device, for example, the processor 512 is used to support the data exchange device to execute S205 in the method embodiment, and/or other processes for the technologies described herein.
  • the data exchange device may also include a memory 511 and a bus 514; the processor 512, the communication interface 513, and the memory 511 are connected to each other through the bus 514; the communication interface 513 is used to support the data exchange device in communicating; the memory 511 is used to store the program code and data of the data exchange device.
  • the processor 512 may be a central processing unit, a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic devices, transistor logic devices, hardware components or any combination thereof. It can implement or execute the various illustrative logical blocks, modules and circuits described in connection with the present disclosure.
  • the processor may also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, and the like.
  • the bus 514 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like.
  • in another embodiment, a data switching network is further provided, and the data switching network includes a source node, a switching node, and a destination node.
  • the source node may be the source node provided in the above device embodiments, and is used to support the source node in executing the steps of the source node in the method embodiments; and/or, the switching node is the switching node provided in the above device embodiments, and is used to support the switching node in executing the steps of the switching node in the method embodiments; and/or, the destination node is the destination node provided in the above device embodiments, and is used to support the destination node in executing the steps of the destination node in the method embodiments.
  • the source node, switching node, and destination node in the device embodiment of the present application may respectively correspond to the source node, switching node, and destination node in the method embodiment of the present application.
  • each module and other operations and/or functions of the source node, switching node, and destination node are respectively intended to implement the corresponding procedures of the above-mentioned method embodiment. This will not be repeated here.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • if the functions described above are realized in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application.
  • in another embodiment of the present application, a readable storage medium is further provided, and computer-executable instructions are stored in the readable storage medium; when a device (for example, a single-chip microcomputer or a chip) or a processor executes the computer-executable instructions, the steps of the source node in the data exchange method provided by the above method embodiments are implemented.
  • the above-mentioned readable storage medium may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disc.
  • in another embodiment of the present application, a readable storage medium is further provided, and computer-executable instructions are stored in the readable storage medium; when a device (for example, a single-chip microcomputer or a chip) or a processor executes the computer-executable instructions, the steps of the switching node in the data exchange method provided by the above method embodiments are implemented.
  • the above-mentioned readable storage medium may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disc.
  • in another embodiment of the present application, a readable storage medium is further provided, and computer-executable instructions are stored in the readable storage medium; when a device (for example, a single-chip microcomputer or a chip) or a processor executes the computer-executable instructions, the steps of the destination node in the data exchange method provided by the above method embodiments are implemented.
  • the above-mentioned readable storage medium may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disc.
  • in another embodiment, a computer program product is further provided; the computer program product includes computer-executable instructions, and the computer-executable instructions are stored in a computer-readable storage medium. At least one processor of a device can read the computer-executable instructions from the computer-readable storage medium, and the at least one processor executes the computer-executable instructions so that the device implements the steps of the source node in the data exchange method provided by the above method embodiments.
  • in another embodiment, a computer program product is further provided; the computer program product includes computer-executable instructions, and the computer-executable instructions are stored in a computer-readable storage medium. At least one processor of a device can read the computer-executable instructions from the computer-readable storage medium, and the at least one processor executes the computer-executable instructions so that the device implements the steps of the switching node in the data exchange method provided by the above method embodiments.
  • in another embodiment, a computer program product is further provided; the computer program product includes computer-executable instructions, and the computer-executable instructions are stored in a computer-readable storage medium. At least one processor of a device can read the computer-executable instructions from the computer-readable storage medium, and the at least one processor executes the computer-executable instructions so that the device implements the steps of the destination node in the data exchange method provided by the above method embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

本申请提供一种数据交换方法及装置,涉及通信技术领域,用于提高数据交换网络的交换效率,降低交换时延。所述方法包括:源节点接收来自第一交换节点的流指示信息,所述流指示信息用于指示目标数据流发生拥塞,第一交换节点是目标数据流的交换路径中的节点;所述源节点向多个交换节点发送多个写数据信息和目标数据流的多个数据块,所述多个写数据信息用于指示所述多个交换节点存储所述多个数据块且停止转发所述多个数据块;所述多个交换节点根据所述多个写数据信息存储所述多个数据块;所述多个交换节点接收来自目的节点的多个调度信息,所述多个调度信息用于调度所述多个数据块;所述多个交换节点向所述目的节点发送所述多个数据块。

Description

一种数据交换方法及装置
本申请要求于2022年01月05日提交国家知识产权局、申请号为202210010085.6、申请名称为“一种数据交换方法及装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及通信技术领域,尤其涉及一种数据交换方法及装置。
背景技术
数据交换网络一般采用多级(比如,两级或三级)交换节点的组网模式,为接入网络内的众多服务器(server)提供全连接的网络,将不同服务器之间的数据进行交换。在数据交换网络中,每个交换节点具有一定容量的缓存,该缓存可用于吸收突发的数据流。当一个交换节点的待调度流量超出了该交换节点的交换能力和缓存能力时,则会导致队列头阻和丢包等现象。比如,当多个源节点同时向目的节点的同一个输出端口发送数据包时,由于该输出端口对应的输出队列(也可以称为缓存队列)的缓存容量有限,该输出端口可能出现拥塞现象,导致缓存溢出,从而导致数据丢包等现象。
现有技术中,通常采用显式拥塞通知(explicit congestion notification,ECN)、基于优先级的流量控制(priority-based flow control,PFC)和尾部丢弃(tail drop)来控制数据交换网络中的流量,以避免缓存溢出。其中,ECN可用于实现源节点(比如,服务器或手机等)的流量控制,比如,网络中的交换节点在出现拥塞前通知源节点降低发送速率,以达到降低网络拥塞的效果。PFC可用于实现交换节点间的流量控制,比如,下游交换节点通知上游交换节点停止数据的发送,以避免本地缓存溢出。尾部丢弃是指通过丢弃数据包来降低拥塞的一种方式,比如,某一交换节点在缓存已经占满时直接丢弃新接收到的数据包。
但是,上述几种方式虽然能在一定程度上降低拥塞现象,但是效果不佳,同时也会影响数据交换网络的交换效率,增加交换时延。
发明内容
本申请提供一种数据交换方法及装置,解决了现有技术中数据交换网络的交换效率低、交换时延长的问题。
为达到上述目的,本申请采用如下技术方案:
第一方面,提供一种数据交换方法,该方法包括:源节点接收来自第一交换节点的流指示信息,该流指示信息用于指示目标数据流发生拥塞,第一交换节点为目标数据流的交换路径中的节点;该源节点向多个交换节点发送多个写数据信息和目标数据流的多个数据块,该多个写数据信息用于指示该多个交换节点存储该多个数据块且停止转发该多个数据块。
在上述技术方案中,当第一交换节点确定目标数据流发生拥塞或者即将拥塞时,第一交换节点可以通知源节点,这样源节点可以将目标数据流的多个数据块分别存储在多个交换节点中,即源节点可以将该多个数据块存储在该多个交换节点构成的更大容量的缓存池中,从而能够提供更大的缓存,降低目标数据流的拥塞、避免头阻,提高对突发流量的吸收能力,进而提高数据的交换效率、降低交换时延。
在第一方面的一种可能的实现方式中,该源节点向多个交换节点发送多个写数据信息和目标数据流的多个数据块之后,该方法还包括:该源节点接收来自该多个交换节点的多个块描述信息,该多个块描述信息与该多个数据块一一对应,每个块描述信息用于指示存储对应的数据块的节点信息,比如存储数据块的节点标识和数据块的存储地址;该源节点根据向目的节点发送该多个块描述信息。可选的,该块描述信息还用于指示对应的数据块包括的数据包的标识;或者,源节点向目的节点发送该多个数据块中每个数据块包括的数据包的标识。上述可能的实现方式中,当源节点将目标数据流的多个数据块分别存储在多个交换节点中时,该多个交换节点可以向源节点返回对应存储的数据块的块描述信息,以使源节点将多个块描述信息发送给目的节点,这样目的节点可以根据该多个块描述信息对该多个数据块进行有序调度。
在第一方面的一种可能的实现方式中,该源节点向多个交换节点发送多个写数据信息和目标数据流的多个数据块之前,该方法还包括:该源节点将该目标数据流中待交换的数据划分为该多个数据块,该多个数据块的数量大于或等于该多个交换节点的数量。上述可能的实现方式中,源节点可以根据实际情况将该目标数据流中待交换的数据划分为该多个数据块,以将数据块分散存储在多个交换节点中。
在第一方面的一种可能的实现方式中,该多个交换节点中一个交换节点对应的数据块在该多个数据块中的顺序与该交换节点对应的距离在按照从小到大的顺序排列的多个距离中的顺序一致,该交换节点对应的距离为该交换节点与第一交换节点之间的距离,该多个距离包括该多个交换节点中每个交换节点与第一交换节点之间的距离。上述可能的实现方式中,能够减小目的节点调度该多个数据块时的路径,以提高目的节点调度该多个数据块的效率。
第二方面,提供一种数据交换方法,该方法包括:交换节点向源节点发送流指示信息,该流指示信息用于指示目标数据流发生拥塞,该交换节点是该目标数据流的交换路径中的节点;该交换节点接收来自源节点的写数据信息和目标数据流的数据块,该写数据信息用于指示该交换节点存储该数据块且停止转发该数据块;该交换节点根据该写数据信息存储该数据块;该交换节点接收来自目的节点的调度信息,该调度信息用于调度该数据块;该交换节点向该目的节点发送该数据块。
在上述技术方案中,当该交换节点确定目标数据流发生拥塞或者即将拥塞时,该交换节点可以通知源节点,这样源节点可以将目标数据流的多个数据块分别存储在多个交换节点中,即源节点可以将该多个数据块存储在该多个交换节点构成的更大容量的缓存池中,从而能够提供更大的缓存,降低目标数据流的拥塞、避免头阻,提高对突发流量的吸收能力,进而提高数据的交换效率、降低交换时延。
在第二方面的一种可能的实现方式中,该交换节点根据该写数据信息存储该数据块之后,该方法还包括:该交换节点向该目的节点发送该数据块的块描述信息;或者,该交换节点向该源节点发送该数据块的块描述信息;其中,该块描述信息用于指示存储该数据块的节点信息。可选的,该块描述信息还用于指示对应的数据块包括的数据包的标识。上述可能的实现方式中,当源节点将目标数据流的多个数据块分别存储在多个交换节点中时,该多个交换节点中的每个交换节点可以向源节点或者目的节点返回对应存储的数据块的块描述信息,这样目的节点可以根据多个块描述信息对该多个数据块进行有序调度。
第三方面,提供一种数据交换方法,该方法包括:第一交换节点向源节点发送流指示信息,该流指示信息用于指示目标数据流发生拥塞,该第一交换节点是该目标数据流的交换路径中的节点;当该源节点接收到该流指示信息时,向多个交换节点发送多个写数据信息和该目标数据流的多个数据块,该多个写数据信息用于指示该多个交换节点存储该多个数据块且停止转发该多个数据块;该多个交换节点接收该多个写数据信息和该多个数据块,并根据该多个写数据信息存储该多个数据块;目的节点向该多个交换节点发送多个调度信息,该多个调度信息用于调度该多个数据块;该多个交换节点接收该多个调度信息,并根据该多个调度信息向该目的节点发送该多个数据块。
在上述技术方案中,当目标数据流在第一交换节点中发生拥塞或者即将拥塞时,第一交换节点可以通知源节点,这样源节点可以将目标数据流的多个数据块分别存储在多个交换节点中,即源节点可以将该多个数据块存储在该多个交换节点构成的更大容量的缓存池中,这样目的节点可以从该多个交换节点调度对应的数据块,从而使得该数据交换网络能够提供更大的缓存,降低目标数据流的拥塞、避免头阻,提高对突发流量的吸收能力,进而提高数据的交换效率、降低交换时延。
在第三方面的一种可能的实现方式中,该方法还包括:该多个交换节点向该源节点发送多个块描述信息,该多个块描述信息与该多个数据块一一对应,每个块描述信息用于指示存储对应的数据块的节点信息;该源节点接收该多个块描述信息,并向该目的节点发送该多个块描述信息。可替换的,该多个交换节点向该目的节点发送多个块描述信息,每个块描述信息用于指示存储对应的数据块的节点信息。上述可能的实现方式中,当源节点将目标数据流的多个数据块分别存储在多个交换节点中时,该多个交换节点可以向源节点或者目的节点返回对应存储的数据块的块描述信息,这样目的节点可以根据多个块描述信息对该多个数据块进行有序调度。
在第三方面的一种可能的实现方式中,该目的节点向多个交换节点发送多个调度信息之前,该方法还包括:当该目的节点接收到该多个块描述信息时,根据该多个块描述信息确定该多个数据块的调度顺序,该调度顺序用于从该多个交换节点中调度该多个数据块。上述可能的实现方式中,该目的节点可以根据多个块描述信息或者该存储指示信息,对该多个数据块进行有序调度。
在第三方面的一种可能的实现方式中,该方法还包括:该源节点将该目标数据流中待交换的数据划分为该多个数据块,该多个数据块的数量大于或等于该多个交换节点的数量。上述可能的实现方式中,源节点可以根据实际情况将该目标数据流中待交换的数据划分为该多个数据块,以将数据块分散存储在多个交换节点中。
在第三方面的一种可能的实现方式中,该多个交换节点中一个交换节点对应存储的数据块在该多个数据块中的顺序与该交换节点对应的距离在按照从小到大的顺序排列的多个距离中的顺序一致,该交换节点对应的距离为该交换节点与该第一交换节点之间的距离,该多个距离包括该多个交换节点与该第一交换节点之间的距离。上述可能的实现方式中,能够减小目的节点调度该多个数据块时的路径,以提高目的节点调度该多个数据块的效率。
第四方面,提供一种数据交换装置,该装置作为源节点,包括:接收单元,用于接收来自第一交换节点的流指示信息,该流指示信息用于指示目标数据流发生拥塞,第一交换节点是该目标数据流的交换路径中的节点;发送单元,用于向多个交换节点发送多个写数 据信息和该目标数据流的多个数据块,该多个写数据信息用于指示该多个交换节点存储该多个且停止转发该多个数据块。
在第四方面的一种可能的实现方式中,该接收单元还用于:接收来自该多个交换节点的多个块描述信息,该多个块描述信息与该多个数据块一一对应,每个块描述信息用于指示存储对应的数据块的节点信息;该发送单元还用于:向目的节点发送该多个块描述信息。
在第四方面的一种可能的实现方式中,该装置还包括:处理单元,用于将该目标数据流中待交换的数据划分为该多个数据块,该多个数据块的数量大于或等于该多个交换节点的数量。
在第四方面的一种可能的实现方式中,该多个交换节点中一个交换节点对应的数据块在该多个数据块中的顺序与该交换节点对应的距离在按照从小到大的顺序排列的多个距离中的顺序一致,该交换节点对应的距离为该交换节点与第一交换节点之间的距离,该多个距离包括该多个交换节点中每个交换节点与第一交换节点之间的距离。
第五方面,提供一种数据交换装置,该装置作为交换节点,包括:发送单元,用于向源节点发送流指示信息,该流指示信息用于指示目标数据流发生拥塞,该交换节点是该目标数据流的交换路径中的节点;接收单元,用于接收来自该源节点的写数据信息和目标数据流的数据块,该写数据信息用于指示该交换节点存储该数据块且停止转发该数据块;处理单元,用于根据该写数据信息存储该数据块;该接收单元,还用于接收来自目的节点的调度信息,该调度信息用于调度该数据块;该发送单元,还用于向该目的节点发送该数据块。
在第五方面的一种可能的实现方式中,该发送单元还用于:向该目的节点发送该数据块的块描述信息;或者,向该源节点发送该数据块的块描述信息;其中,该块描述信息用于指示存储该数据块的节点信息。
第六方面,提供一种数据交换网络,该数据交换网络包括源节点、多个交换节点和目的节点,该多个交换节点包括第一交换节点;其中,第一交换节点,用于向该源节点发送流指示信息,该流指示信息用于指示目标数据流发生拥塞,该第一交换节点是该目标数据流的交换路径中的节点;该源节点,用于接收该流指示信息,并向多个交换节点发送多个写数据信息和该目标数据流的多个数据块,该多个写数据信息用于指示该多个交换节点存储该多个数据块且停止转发该多个数据块;该多个交换节点,用于接收该多个写数据信息和该多个数据块,并根据该多个写数据信息存储该多个数据块;该目的节点,用于向该多个交换节点发送多个调度信息,该多个调度信息用于调度该多个数据块;该多个交换节点,还用于接收该多个调度信息,并根据该多个调度信息向该目的节点发送该多个数据块。
在第六方面的一种可能的实现方式中,该多个交换节点还用于:向该源节点发送多个块描述信息,该多个块描述信息与该多个数据块一一对应,每个块描述信息用于指示存储对应的数据块的节点信息;该源节点还用于:接收该多个块描述信息,并向该目的节点发送该多个块描述信息。
在第六方面的一种可能的实现方式中,该多个交换节点还用于:向该目的节点发送该多个块描述信息。
在第六方面的一种可能的实现方式中,该目的节点还用于:接收该多个块描述信息,并根据该多个块描述信息确定该多个数据块的调度顺序,该调度顺序用于从该多个交换节 点中调度该多个数据块。
在第六方面的一种可能的实现方式中,该源节点还用于:将该目标数据流中待交换的数据划分为该多个数据块,该多个数据块的数量大于或等于该多个交换节点的数量。
在第六方面的一种可能的实现方式中,该多个交换节点中一个交换节点对应存储的数据块在该多个数据块中的顺序与该交换节点对应的距离在按照从小到大的顺序排列的多个距离中的顺序一致,该交换节点对应的距离为该交换节点与该第一交换节点之间的距离,该多个距离包括该多个交换节点与该第一交换节点之间的距离。
在本申请的又一方面,提供一种数据交换装置,该数据交换装置包括:处理器、存储器、通信接口和总线,该处理器、该存储器和该通信接口通过总线连接;该存储器用于存储程序代码,当该程序代码被该处理器执行时,使得该数据交换装置执行如第一方面或者第一方面的任一种可能的实现方式所提供的数据交换方法。
在本申请的又一方面,提供一种数据交换装置,该数据交换装置包括:处理器、存储器、通信接口和总线,该处理器、该存储器和该通信接口通过总线连接;该存储器用于存储程序代码,当该程序代码被该处理器执行时,使得该数据交换装置执行如第二方面或者第二方面的任一种可能的实现方式所提供的数据交换方法。
在本申请的又一方面,提供一种数据交换装置,该数据交换装置包括:处理器、存储器、通信接口和总线,该处理器、该存储器和该通信接口通过总线连接;该存储器用于存储程序代码,当该程序代码被该处理器执行时,使得该数据交换装置执行如第三方面或者第三方面的任一种可能的实现方式所提供的数据交换方法。
在本申请的又一方面,提供一种计算机可读存储介质,该计算机可读存储介质中存储有计算机程序或指令,当该计算机程序或指令被运行时,实现如第一方面或者第一方面的任一种可能的实现方式所提供的数据交换方法。
在本申请的又一方面,提供一种计算机可读存储介质,该计算机可读存储介质中存储有计算机程序或指令,当该计算机程序或指令被运行时,实现如第二方面或者第二方面的任一种可能的实现方式所提供的数据交换方法。
在本申请的又一方面,提供一种计算机可读存储介质,该计算机可读存储介质中存储有计算机程序或指令,当该计算机程序或指令被运行时,实现如第三方面或者第三方面的任一种可能的实现方式所提供的数据交换方法。
在本申请的又一方面,提供一种计算机程序产品,该计算机程序产品包括计算机程序或指令,当计算机程序或指令被运行时,执行上述第一方面或第一方面的任一种可能的实现方式所提供的数据交换方法。
在本申请的又一方面,提供一种计算机程序产品,该计算机程序产品包括计算机程序或指令,当计算机程序或指令被运行时,执行上述第二方面或第二方面的任一种可能的实现方式所提供的数据交换方方法。
在本申请的又一方面,提供一种计算机程序产品,该计算机程序产品包括计算机程序或指令,当计算机程序或指令被运行时,执行上述第三方面或第三方面的任一种可能的实现方式所提供的数据交换方方法。
可以理解地,上述提供的任一种数据交换方法的装置、数据交换网络、计算机存储介质或者计算机程序产品均用于执行上文所提供的对应的方法,因此,其所能达到的有益效 果可参考上文所提供的对应的方法中的有益效果,此处不再赘述。
附图说明
图1为本申请实施例提供的一种数据交换网络的结构示意图;
图2为本申请实施例提供的另一种数据交换网络的结构示意图;
图3为本申请实施例提供的又一种数据交换网络的结构示意图;
图4为本申请实施例提供的一种数据交换网络中数据流拥塞的示意图;
图5为本申请实施例提供的一种数据交换方法的流程示意图;
图6为本申请实施例提供的另一种数据交换方法的流程示意图;
图7为本申请实施例提供的一种数发送拥塞通告的示意图;
图8为本申请实施例提供的一种在多个交换节点存储多个数据块的示意图;
图9为本申请实施例提供的一种多个交换节点发送块描述信息的示意图;
图10为本申请实施例提供的一种目的节点调度多个数据块的示意图;
图11为本申请实施例提供的一种源节点的结构示意图;
图12为本申请实施例提供的另一种源节点的结构示意图;
图13为本申请实施例提供的一种交换节点的结构示意图;
图14为本申请实施例提供的另一种交换节点的结构示意图;
图15为本申请实施例提供的一种目的节点的结构示意图;
图16为本申请实施例提供的另一种目的节点的结构示意图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行描述。在本申请中,“至少一个”是指一个或者多个,“多个”是指两个或两个以上。“和/或”,描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B的情况,其中A,B可以是单数或者复数。字符“/”一般表示前后关联对象是一种“或”的关系。“以下至少一项(个)”或其类似表达,是指的这些项中的任意组合,包括单项(个)或复数项(个)的任意组合。例如,a,b或c中的至少一项(个),可以表示:a,b,c,a和b,a和c,b和c或a、b和c,其中a、b和c可以是单个,也可以是多个。另外,在本申请的实施例中,“第一”、“第二”等字样并不对数量和次序进行限定。
需要说明的是,本申请中,“示例性的”或者“例如”等词用于表示作例子、例证或说明。本申请中被描述为“示例性的”或者“例如”的任何实施例或设计方案不应被解释为比其他实施例或设计方案更优选或更具优势。确切而言,使用“示例性的”或者“例如”等词旨在以具体方式呈现相关概念。
本申请提供的技术方案可以应用于多种不同的数据交换网络中。该数据交换网络可以为大型的数据交换网络,也可以为小型的数据交换网络。小型的数据交换网络也可以称为数据交换系统。该数据交换网络可以包括多个交换节点,该交换节点也可以称为节点。在实际应用中,该交换节点可以为交换机或者路由器等交换设备,也可以为交换板或者交换单元(switch element,SE)等。其中,交换板也可以称为交换网卡或者网络接口卡(network interface card,NIC),一个交换板中可以包括一个或者多个交换单元。可选的,该数据交换网络可以包括数据中心网络(dater center network,DCN)、高性能计算(high performance  computing,HPC)网络、云网络、以及单个芯片或者多个芯片合封后的片上网络等。
下面通过图1-图3对该数据交换网络的结构进行举例说明。
图1为本申请实施例提供的一种数据交换网络的结构示意图,该数据交换网络包括三个交换层。参见图1,该数据交换网络包括接入层、汇聚层和核心层,接入层中包括多个接入(access)节点,汇聚层中包括多个汇聚(aggregation)节点,核心层包括多个核心(core)节点,且接入节点的下行端口与需要进行数据流量交换的服务器(server)连接,接入节点的上行端口与汇聚节点的下行端口连接,汇聚节点的上行端口与核心节点连接。
其中,汇聚层和接入层可以被划分为多个群组(pod),一个群组中可以包括多个接入节点和多个汇聚节点,且每个接入节点与多个汇聚节点全连接。与同一个汇聚节点连接的多个核心节点可以称为一个核心(core)平面,每个核心平面分别和各个群组中的不同汇聚节点连接。图1中仅以该数据交换网络包括3个群组,一个群组内包括3个接入节点和4个汇聚节点,每个核心平面包括两个核心节点为例进行说明。图1中的接入节点可以表示为A1~A9,汇聚节点可以表示为B1~B12,核心节点可以表示为C1~C8,3个群组分别表示为P1~P3。
其中,当一个群组内不同接入节点连接的服务器之间进行数据流量交换时,可以通过与接入节点在同一群组内的汇聚节点实现,比如,接入节点A1和接入节点A3连接的服务器需要进行数据流量交换,则接入节点A1可以通过汇聚节点B1将其连接的服务器的数据流发送给接入节点A3。当不同群组内的接入节点连接的服务器之间进行数据流量交换时,可以通过与接入节点在同一群组内的汇聚节点、以及与汇聚节点连接的核心节点实现,比如,接入节点A1和接入节点A5连接的服务器需要进行数据流量交换,则接入节点A1可以将其连接的服务器的数据流发送给汇聚节点B1,由汇聚节点B1转发给核心节点C1,再由C1通过汇聚节点B5发送给接入节点A5。
图2为本申请实施例提供的另一种数据交换网络的结构示意图,该数据交换网络包括两个交换层。参见图2,该数据交换网络包括中继层(也可以称为TOR层)和骨干(spine)层,中继层中包括多个叶子(leaf)节点,骨干层包括多个骨干节点,且叶子节点的下行端口与需要进行数据流量交换的服务器(server)连接,叶子节点的上行端口与该多个骨干节点连接。图2中仅以该数据交换网络包括4个叶子节点和2个骨干节点为例进行说明。图2中的叶子节点可以表示为A1~A4,骨干节点可以表示为C1~C2。
其中,当接入同一叶子节点的两个服务器之间进行数据流量交换时,可以通过该叶子节点来实现,比如,接入叶子节点A1的两个服务器(比如,S1和S2)可以通过叶子节点A1进行数据流量交换。当接入不同叶子节点的两个服务器之间进行数据流量交换时,可以通过该叶子节点和骨干节点来实现,比如,接入叶子节点A1的服务器S1需要与接入叶子节点A2的服务器S3进行数据流量交换时,叶子节点A1可以将来自服务器S1的数据流发送给骨干节点C1,由骨干节点C1将该数据流转发给叶子节点A2。
图3为本申请实施例提供的又一种数据交换网络的结构示意图,该数据交换网络可以为片上网络。参见图3,该数据交换网络包括多个交换芯片,该多个交换芯片中的每个交换芯片包括多个交换单元,该数据交换网络中的所有交换单元可以按照一定方式互连。图3中以该数据交换网络包括4个交换芯片D1-D4,每个交换芯片包括9个交换单元,该数据交换网络中的交换单元分别表示为1-35为例进行说明。
其中,每个交换单元可以具有一个或者多个输入端口、以及一个或者多个输出端口,该输入接口可用于接收外部输入的数据包或信元,该输出端口可用于向外部输出数据包或者信元。该数据交换网络中的多个交换单元之间的互连可用于将每个输入接口接收到的数据包或者信元交换到对应的输出端口。该数据交换网络中的每个交换单元中可以包括至少一个缓存(queue)队列,该至少一个缓存队列可用于缓存不同去往不同输出端口的数据包或者信元。图3中示出了多个交换单元的互连关系仅为示例性的,并不对本申请实施例构成限制。
在上述数据交换网络中,每个交换节点均具有一定的交换能力和一定容量的缓存,当一个交换节点的待调度流量超出了该交换节点的交换能力和缓存能力时,则会导致队列头阻和丢包等现象,从而影响数据交换网络的交换效率,同时也增加交换时延。因此,如何降低数据交换网络中的拥塞是当前亟待解决的一个技术问题。
示例性,图4示出了一种源服务器S0与目标服务器D0之间通过数据交换网络交换数据的示意图。图4中以该数据交换网络包括多个交换节点且分别表示为A1-A2、B1-B4和C1-C2,源服务器S0分别与A1和A2互连,目的服务器D0分别与C1和C2互连,A1分别通过B1和B2与C1互连,A2分别通过B3和B4与C2互连为例进行说明。在该数据交换过程中,若该数据交换网络中的某一交换节点发生拥塞,该数据交换网络可以采用下面几种不同的流量控制方式来降低拥塞。
第一种、通过被动拥塞控制机制来降低拥塞。具体的,若C1出现拥塞,则C1可以通过显式拥塞通知(explicit congestion notification,ECN)方式通知源服务器S0降低发送速率,或者通过基于优先级的流量控制(priority-based flow control,PFC)方式分别通知B1和B2停止数据的发送,或者通过尾部丢弃(tail drop)方式丢弃新接收到的数据包。在实际应用中,针对不同的数据流(该数据流可以通过优先级和端口等进行区分),会采用不同的流量控制方式。例如,针对高优先级的业务的数据流,可采用ECN和PFC的方式;针对低优先级的业务的数据流,可采用ECN和尾部丢弃的方式。
但是,由于工艺和成本等因素,单个交换节点的缓存容量是有限,不可能无限增长,从而在数据量较大时仍会导致拥塞;另外,缓存容量的增长速率远小于端口带宽的增长,这将导致单个交换节点承受突发流量的能力减弱;再者,某一交换节点产生流量控制,可能会扩散整个数据交换网络中,从而进一步带来队列头阻问题。
第二种、通过主动拥塞控制机制来降低拥塞。具体的,源服务器S0可以主动通过探测报文和本地流表等主动探测手段获取链路状态或速率等信息(比如,当C1发生拥塞时可以向源服务器S0返回携带该信息的探测结果);源服务器S0根据该信息直接控制本地数据流的发送速率。其中,该主动拥塞控制机制仅适用于小型的数据交换网络中,这是因为探测报文和本地流表需要占用的一定的带宽和缓存,而大型的数据交换网络中的数据流的数量很大,这样会增加实现难度;此外,该主动拥塞控制机制无法对大量的小数据量的数据流的突发进行控制。因此,主动拥塞控制机制的使用范围受限。
第三种、通过自适应路径控制机制来降低拥塞。具体的,在源服务器S0通过A1、B1和C1向目的服务器D0交换数据流的过程中,若C1处的数据流f发生拥塞,则C1可以通知上游节点将数据流f切换到其他可用路径上(比如,C1通知源服务器S0,源服务器S0将数据流f切换到A2、B4和C2所在的路径)上。其中,该自适应路径控制机制的本 质是在轻载网络中,充分利用可用路径的带宽,以提高带宽利用率。该自适应路径控制机制一般适用于HPC且网络有收敛的场景中,且无法解决目的端拥塞的场景。
基于此,本申请实施例提供一种数据交换方法,能够在某一交换节点的某一数据流出现拥塞时,充分利用数据交换网络中多个不同交换节点的缓存来存储该数据流的数据,从而能够降低拥塞,进而提高数据交换网络的交换效率、降低交换时延。也即是,本申请实施例能够将数据交换网络中所有节点的缓存池化,即将网络中所有节点的缓存虚拟成一个大容量的缓存池呈现给用户,从而实现虚拟大缓存能力,提高对突发流量的吸收能力;此外,每个节点可以包括数据面和控制面,该数据面用于传输数据,该控制面用于传输控制信令,从而在不同节点间实现数据和信令的传输。
下面对本申请实施例所提供的数据交换方法进行介绍说明。
图5为本申请实施例提供的一种数据交换方法的流程示意图,该方法可以应用于上文所提供的任意一种数据交换网络中,该方法包括以下步骤。图6为该数据交换方法应用于数据交换网络中的一种示例。
S201:源节点接收来自第一交换节点的流指示信息,该流指示信息用于指示目标数据流发生拥塞。
其中,源节点可以为目标数据流的源服务器,也可以为该数据交换网络中该源服务器接入的交换节点。类似的,下文中的目的节点可以为目标数据流的目的服务器,也可以为该数据交换网络中该目的服务器接入的交换节点。
另外,第一交换节点可以为该数据交换网络中该目标数据流所在的交换路径中的任意一个交换节点。该目标数据流可以是在第一交换节点中发生拥塞或将要发生拥塞的数据流,即该目标数据流可以是第一交换节点确定的数据流,图6中将第一交换节点表示为拥塞节点。该流指示信息可以包括该目标数据流的流标识。可选的,该流指示信息可以携带在拥塞通告中。
再者,第一交换节点可以根据目标数据流在第一交换节点中的传输速率、实时队列长度、队列调度优先级和缓存占用状态等参数中的一个或者多个确定目标数据流发生拥塞。比如,若目标数据流在第一交换节点中的实时队列长度大于预设长度、或者缓存占用状态大于预设占用率,第一交换节点可以确定目标数据流为拥塞流。需要说明的是,第一交换节点确定目标数据流发生拥塞的具体过程可以参见相关技术中的描述,本申请实施例对此不作具体限制。
具体的,在源节点通过该数据交换网络交换该目标数据流的过程中,若该目标数据流所在的交换路径中的第一交换节点确定该目标数据流发生拥塞,则第一交换节点可以向该源节点发送用于指示目标数据流的流指示信息,以使该源节点接收到该流指示信息。当第一交换节点与该源节点互连时,第一交换节点可以直接向该源节点发送该流指示信息;当第一交换节点通过其他交换节点与该源节点互连时,第一交换节点可以通过该其他交换节点向该源节点发送该流指示信息。
示例性的,以图4所示的数据交换网络为例,若源节点为源服务器S0,目标数据流所在的交换路径为S0-A1-B1-C1-D0,第一交换节点为C1,则如图7所示,交换节点C1在确定该目标数据流f为拥塞流时可以向源服务器S0发送拥塞通告,具体可以过程为:交换节点C1向交换节点B1发送用于指示该目标数据流f的拥塞通告,交换节点B1将接收到 的该拥塞通告转发给交换节点A1,交换节点A1将该拥塞通告转发给源服务器S0,以使源服务器S0接收到该拥塞通告。
可选的,该数据交换网络中任意两个节点(比如,服务器与交换节点、以及交换节点与交换节点)之间的信息传输包括可以包括控制面和数据面,该控制面用于传输控制信令,该数据面用于传输数据。该控制信令可以包括上述流指示信息和拥塞通告,也可以包括下文中所涉及的块描述信息、存储指示信息和调度信息等。该数据面传输的数据可以包括信元、数据包和数据块(data block,DB)等。
S202:该源节点向多个交换节点发送多个写数据信息和目标数据流的多个数据块,该多个写数据信息用于指示该多个交换节点存储该多个数据块且停止转发该多个数据块。
其中,该多个交换节点(也可以称为缓存节点)可以包括该数据交换网络中的部分交换节点,也可以包括该数据交换网络中的所有交换节点。该多个交换节点可以包括第一交换节点,也可以不包括第一交换节点。图6中将该多个交换节点表示为缓存节点,且以该多个交换节点不包括第一交换节点(即拥塞节点)为例进行说明。
另外,该多个数据块中的每个数据块可以包括一定数量的信元或者数据包。该多个数据块中不同数据块包括的信元或数据包的数量可以相同,也可以不同。不同数据块的长度可以相同,也可以不同;也即是,该多个数据块可以是定长数据块或者变长数据块。
再者,每个写数据信息可以用于指示对应的交换节点存储该多个数据块中的至少一个数据块,即该源节点可以通过向一个交换节点发送一个或者多个数据块,下文中以该源节点向一个交换节点发送一个数据块为例进行说明。可选的,该写数据信息可以包括数据块标识和写数据标识。该数据块标识可以用于标识该数据块,且可以用于指示该数据块在该多个数据块中的位置,比如该数据块标识可以为该数据块的序号。该写数据标识可以用于指示接收该写数据标识的交换节点在本地存储该数据块且停止转发该数据块。
需要说明的是,停止转发该数据块可以是指接收到该数据块的交换节点在没有接收到用于调度该数据块的调度信息时,该交换节点不向下级的节点发送该数据块。即该交换节点只有接收到用于调度该数据块的调度信息,才能向下级的节点发送该数据块。
可选的,该多个数据块可以是该源节点对该目标数据流中待交换的数据划分得到的,该待交换的数据可以是指该源节点中存储的未发送的目标数据流的数据。在一种示例中,该源节点可以将该目标数据流中待交换的数据划分为该多个数据块,该多个数据块中的每个数据块可以对应一个数据块标识,该多个数据块的数量可以大于或等于该多个交换节点的数量相同。
具体的,当该源节点接收到该流指示信息时,该源节点可以将该流指示信息所指示的目标数据流中待交换的数据划分为该多个数据块;对于该多个数据块中的每个数据块,该源节点可以向该多个交换节点中的一个交换节点发送写数据信息和该数据块,以通过该写数据信息指示该交换节点存储该数据块且停止转发该数据块。可选的,该源节点可以按照负载均衡的方式将该多个数据块发送给该多个交换节点。
其中,对于该多个交换节点中的每个交换节点,当该源节点与该交换节点互连时,该源节点可以直接向该交换节点发送该写数据信息;当该源节点通过其他交换节点与该交换节点互连时,该源节点可以通过该其他交换节点向该交换节点发送该写数据信息。
示例性的,以图4所示的数据交换网络为例,若源节点为源服务器S0,该多个交换节 点包括B1-B4和C1-C2,如图8所示,源服务器S0可以将该目标数据流中待交换的数据划分为6个数据块且分别表示为DB1-DB6,并按照以下方法将DB1-DB6发送给该多个交换节点:通过A1和B1(或B2)向C1发送DB1和对应的写数据信息,通过A1向B1发送DB2和对应的写数据信息,通过A1向B2发送DB3和对应的写数据信息,通过A2向B3发送DB4和对应的写数据信息,通过A2向B4发送DB5和对应的写数据信息,通过A2和B3(或B4)向C2发送DB6和对应的写数据信息。图8中仅示出了源服务器S0发送的数据块。
可选的,该写数据信息还可以用于指示该数据块中包括的数据包的标识(比如,序号)。在一种示例中,该写数据信息中可以包括该数据块中第1个数据包的序号和最后一个数据包的序号。在另一种示例中,该写数据信息中包括的数据块标识可以与该数据块中包括的数据包的标识有关,以使该数据块标识可用于确定该数据块中包括的数据包的标识。
S203:当该多个交换节点接收到该多个写数据信息和该多个数据块时,该多个交换节点存储该多个数据块。
对于该多个交换节点中的每个交换节点,当该交换节点接收到写数据信息和该多个数据块中的数据块时,该交换节点可以解析该写数据信息,以得到该数据块的标识和写数据标识,并根据该写数据标识确定该数据块需要进行存储,从而该交换节点可以在本地存储该数据块,比如将该数据块存储在缓存中,且不将该数据块转发给下级的交换节点。
S204:该多个交换节点向目的节点发送多个块描述信息,该多个块描述信息与该多个数据块一一对应,每个块描述信息用于指示该存储对应的数据块的节点信息。
其中,该块描述信息可以用于指示存储该数据块的交换节点,还可以用于指示该数据块在该交换节点中的存储地址,该存储地址可以为物理地址或者逻辑地址等。在一种示例中,该块描述信息包括该交换节点的标识、该数据块在该交换节点中的存储地址;进一步的,该块描述信息还可以包括该数据块的标识。可选的,该块描述信息还可以包括该数据块中包括的数据包的标识。
具体的,当该交换节点存储该数据块后,该交换节点可以根据该交换节点的标识、数据块标识、该数据块的存储地址等信息,生成该数据块对应的块描述信息,并将该块描述信息发送给目的节点。当该交换节点与目的节点互连时,该交换节点可以直接向该目的节点发送该块描述信息;当该交换节点通过其他交换节点与目的节点互连时,该交换节点可以通过其他交换节点向该目的节点发送该块描述信息。
可替换的,当该多个交换节点中的每个交换节点生成块描述信息后,该交换节点也可以向源节点发送该块描述信息;当源节点接收到来自该多个交换节点的多个块描述信息后,该源节点可以根据该多个块描述信息向目的节点发送存储指示信息,该存储指示信息用于指示存储该多个数据块的节点信息。该存储指示信息可以是该源节点根据该多个块描述信息确定的,该源节点可以直接将该多个块描述信息携带在该存储指示信息中,也可以对该块描述信息进行处理(比如,该源节点可以将对应数据块包括的数据包的标识携带在块描述信息中)后通过存储指示信息发送给目的节点。
可选的,在该多个交换节点向目的节点发送块描述信息,或者该多个交换节点向源节点发送块描述信息以使源节点向目的节点发送块描述信息的情况下,该源节点也可以单独向该目的节点发送用于指示该多个数据块中每个数据块包括的数据包的包描述信息,该包 描述信息可以包括数据包的标识。
示例性的,结合图8,如图9中的(a)所示,该多个交换节点(即B1-B4和C1-C2)向目的节点(即目的服务器D0)发送块描述信息的过程包括:交换节点C1向目的服务器D0发送BD1的块描述信息,交换节点B1通过C1向目的服务器D0发送BD2的块描述信息,交换节点B2通过交换节点C1向目的服务器D0发送BD3的块描述信息,交换节点B3通过交换节点C2向目的服务器D0发送BD4的块描述信息,交换节点B4通过交换节点C2向目的服务器D0发送BD5的块描述信息,交换节点C2向目的服务器D0发送BD6的块描述信息,源服务器S0向目的服务器D0发送该多个数据块对应的包描述信息PD1-PD6。
或者,结合图8,如图9中的(b)所示,该多个交换节点(即B1-B4和C1-C2)向源节点(即源服务器S0)发送块描述信息的过程包括:交换节点C1通过交换节点B1(或B2)向源服务器S0发送BD1的块描述信息,交换节点B1向源服务器S0发送BD2的块描述信息,交换节点B2向源服务器S0发送BD3的块描述信息,交换节点B3向源服务器S0发送BD4的块描述信息,交换节点B4通向源服务器S0发送BD5的块描述信息,交换节点C2通过交换节点B3(或B4)向源服务器S0发送BD6的块描述信息。之后,源服务器S0向目的服务器D0发送存储指示信息,该存储指示信息可以包括BD1-BD6的块描述信息,还可以包括BD1-BD6对应的包描述信息PD1-PD6。
S205:该目的节点接收来自该多个交换节点的多个块描述信息,并根据该多个块描述信息确定该多个数据块的顺序。可替换的,当该多个交换节点向源节点发送块描述信息时,S205具体可以为:该目的节点接收来自该源节点的存储指示信息,该存储指示信息可以包括该多个块描述信息,根据该多个块描述信息确定该多个数据块的顺序。
当该目的节点接收到该多个块描述信息时,该目的节点可以根据该多个块描述信息确定对应的该多个数据块的顺序,即确定该多个数据块在该目标数据流中的顺序。进一步的,该目的节点还可以根据该多个块描述或者存储指示信息确定每个数据块对应的包描述信息,即确定该多个数据块中每个数据包所包括的数据包的数量或对应的数据包的顺序等。
比如,每个块描述信息中包括对应的数据块的序号、以及该数据块所包括的数据包的序号,该目的节点根据该多个块描述信息中数据块的序号确定该多个数据块的顺序,并根据每个数据块所包括的数据包的序号确定该多个数据块中数据包的顺序。
S206:该目的节点向该多个交换节点中的每个交换节点发送调度信息,该调度信息用于调度该交换节点中存储的数据块。
当该目的节点需要调度该目标数据流的该多个数据块时,该目的节点可以在保证吞吐量的情况下,按照该多个数据块的顺序,依次通过调度信息调度该多个数据块,以使该目的节点按照该多个数据块的顺序依次获取到该多个数据块,即通过调度信息来保证目的节点接收该多个数据块的顺序。在一种实施例中,对于该多个交换节点中的每个交换节点,该目的节点向该交换节点发送的该调度信息(比如,该调度信息为读命令)可以包括该交换节点的标识、调度的数据块在该交换节点中的存储地址、以及该数据块的标识等。图6中以该调度信息为读命令为例进行说明。
可选的,对于任意一个交换节点,当该目的节点通过调度信息从该交换节点中读取对应的数据块时,该目的节点可以通过一次调度来获取该数据块,也可以通过多次调度来获 取该数据块。当该目的节点通过多次调度来获取该数据块时,该目的节点每次发送的调度信息还可以用于指示当前调度的数据量,或者用于指示当前调度的数据包的标识等。
具体的,对于该多个交换节点中的每个交换节点,当该目的节点与该交换节点互连时,该目的节点可以直接向该交换节点发送该调度信息;当该目的节点通过其他交换节点与该交换节点互连时,该目的节点可以通过其他交换节点向该交换节点发送该调度信息。
示例性的,结合图7,如图10所示,该目的节点(即目的服务器D0)向该多个交换节点(即B1-B4和C1-C2)发送调度信息的过程可以包括:向交换节点C1发送用于调度BD1的读命令RD1,通过交换节点C1向交换节点B1发送用于调度BD2的读命令RD2,通过交换节点C1向交换节点B2发送用于调度BD3的读命令RD3,通过交换节点C2向交换节点B3发送用于调度BD4的读命令RD4,通过交换节点C2向交换节点B4发送用于调度BD5的读命令RD5,向交换节点C1发送用于调度BD6的读命令RD6。
在实际应用中,针对拥塞流,该目的节点可以创建一个基于源节点的请求链表,基于该请求链表对多个源节点的数据流进行公平调度。此外,该目的节点还可以按照不同的调度层次来进行调度数据流,比如按照出端口、队列优先级、数据流和缓存池链表等来进行调度,该缓存池链表可以用于指示在不同交换节点中存储在同一数据流的多个数据块的顺序和存储位置。
S207:当该多个交换节点中的一个交换节点接收到该调度信息时,该交换节点向该目的节点发送对应的数据块。
当该多个交换节点中的一个交换节点接收到对应的调度信息时,该交换节点可以根据调度信息从本地中读取对应的数据块,并将该数据块发送给该目的节点,以使目的节点接收到该数据块。当该目的节点按照该多个数据块的顺序接收该多个数据块时,该目的节点可以在接收到每个数据块时,按照一定带宽或速率向外输出该数据块,以输出该多个数据块,从而完成该目的数据流的交换。
可选的,对于任意一个交换节点,当该目的节点通过一次调度来获取该数据块时,该交换节点可以根据调度信息从本地获取整个数据块,并将该数据块发送给该目的节点;当该目的节点通过多次调度来获取该数据块时,该交换节点可以根据调度信息通过多次发送,以将该数据块发送给该目的节点。
对于该多个交换节点中的任意一个交换节点,当该目的节点与该交换节点互连时,该交换节点可以直接向该目的节点发送该数据块;当该目的节点通过其他交换节点与该交换节点互连时,该交换节点可以通过其他交换节点向该目的节点发送该数据块。
在本申请实施例中,当第一交换节点确定目标数据流发生拥塞时,第一交换节点可以通知源节点,这样源节点可以将目标数据流的多个数据块分别存储在多个交换节点中,即源节点可以将该多个数据块存储在该多个交换节点构成的更大容量的缓存池中,这样目的节点可以从该多个交换节点调度对应的数据块,从而使得该数据交换网络能够提供更大的缓存,降低目标数据流的拥塞、避免头阻,提高对突发流量的吸收能力,进而提高数据的交换效率、降低交换时延。
上述主要从各个节点之间交互的角度对本申请实施例提供的方案进行了介绍。可以理解的是,各个网元,例如源节点、交换节点和目的节点等,为了实现上述功能,其包含了执行各个功能相应的硬件结构和/或软件模块。本领域技术人员应该很容易意识到,结合本 文中所公开的实施例描述的各示例的单元及算法步骤,本申请能够以硬件或硬件和计算机软件的结合形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
本申请实施例可以根据上述方法示例对源节点、交换节点和目的节点进行功能模块的划分,例如,可以对应各个功能划分各个功能模块,也可以将两个或两个以上的功能集成在一个模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。需要说明的是,本申请实施例中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。下面以采用对应各个功能划分各个功能模块为例进行说明:
在采用集成的单元的情况下,图11示出了上述实施例中所涉及的数据交换装置的一种可能的结构示意图。该数据交换装置可以为源节点或者源节点内置的芯片,该数据交换装置包括:接收单元301和发送单元302。其中,接收单元301用于支持该数据交换装置执行方法实施例中的S201和/或接收交换节点发送的块描述信息的步骤;发送单元302支持该数据交换装置执行方法实施例中S202、和/或向目的节点发送存储指示信息的步骤。可选的,该数据交换装置还可以包括处理单元303,处理单元303用于执行该数据交换装置将待交换的数据划分为多个数据块、解析流指示信息、和/或解析块描述信息的步骤。上述方法实施例涉及的各步骤的所有相关内容均可以援引到对应功能模块的功能描述,在此不再赘述。
在采用硬件实现的基础上,本申请中的处理单元303可以为数据交换装置的处理器,接收单元301可以为该数据交换装置的接收器,发送单元302可以为该数据交换装置的发送器,发送器通常可以和接收器集成在一起用作收发器,具体的收发器还可以称为通信接口。
图12所示,为本申请的实施例提供的上述实施例中所涉及的数据交换装置的一种可能的逻辑结构示意图。该数据交换装置可以为源节点或者源节点内置的芯片,该数据交换装置包括:处理器312和通信接口313。处理器312用于对该数据交换装置动作进行控制管理,例如,处理器312用于支持该数据交换装置执行方法实施例中将待交换的数据划分为多个数据块、解析流指示信息、解析块描述信息的步骤,和/或用于本文所描述的技术的其他过程。此外,该数据交换装置还可以包括存储器311和总线314,处理器312、通信接口313以及存储器311通过总线314相互连接;通信接口313用于支持该数据交换装置进行通信;存储器311用于存储该数据交换装置的程序代码和数据。
其中,处理器312可以是中央处理器单元,通用处理器,数字信号处理器,专用集成电路,现场可编程门阵列或者其他可编程逻辑器件、晶体管逻辑器件、硬件部件或者其任意组合。其可以实现或执行结合本申请公开内容所描述的各种示例性的逻辑方框,模块和电路。所述处理器也可以是实现计算功能的组合,例如包含一个或多个微处理器组合,数字信号处理器和微处理器的组合等等。总线314可以是外设部件互连标准(peripheral component interconnect,PCI)总线或扩展工业标准结构(extended industry standard architecture,EISA)总线等。所述总线可以分为地址总线、数据总线、控制总线等。为便于表示,图12中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
在采用集成的单元的情况下,图13示出了上述实施例中所涉及的数据交换装置的一种可能的结构示意图。该数据交换装置可以为交换节点或者交换节点内置的芯片,该数据交换装置包括:接收单元401和处理单元402。其中,接收单元401用于支持该数据交换装置执行方法实施例中接收写数据信息、和/或接收调度信息的步骤;处理单元402支持该数据交换装置执行方法实施例中的S203。可选的,该数据交换装置还可以包括发送单元403,发送单元403用于支持该数据交换装置执行上述实施例中发送流指示信息、发送块描述信息、和/或发送数据块的步骤。上述方法实施例涉及的各步骤的所有相关内容均可以援引到对应功能模块的功能描述,在此不再赘述。
在采用硬件实现的基础上,本申请中的处理单元403可以为数据交换装置的处理器,接收单元401可以为该数据交换装置的接收器,发送单元403可以为该数据交换装置的发送器,发送器通常可以和接收器集成在一起用作收发器,具体的收发器还可以称为通信接口。
图14所示,为本申请的实施例提供的上述实施例中所涉及的数据交换装置的一种可能的逻辑结构示意图。该数据交换装置可以为目的节点或者目的节点内置的芯片,该数据交换装置包括:处理器412和通信接口413。处理器412用于对该数据交换装置动作进行控制管理,例如,处理器412用于支持该数据交换装置执行方法实施例中的S203,和/或用于本文所描述的技术的其他过程。此外,该数据交换装置还可以包括存储器411和总线414,处理器412、通信接口413以及存储器411通过总线414相互连接;通信接口413用于支持该数据交换装置进行通信;存储器411用于存储该数据交换装置的程序代码和数据。
其中,处理器412可以是中央处理器单元,通用处理器,数字信号处理器,专用集成电路,现场可编程门阵列或者其他可编程逻辑器件、晶体管逻辑器件、硬件部件或者其任意组合。其可以实现或执行结合本申请公开内容所描述的各种示例性的逻辑方框,模块和电路。所述处理器也可以是实现计算功能的组合,例如包含一个或多个微处理器组合,数字信号处理器和微处理器的组合等等。总线414可以是外设部件互连标准(PCI)总线或扩展工业标准结构(EISA)总线等。所述总线可以分为地址总线、数据总线、控制总线等。为便于表示,图14中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
在采用集成的单元的情况下,图15示出了上述实施例中所涉及的数据交换装置的一种可能的结构示意图。该数据交换装置可以为目的节点或者目的节点内置的芯片,该数据交换装置包括:接收单元501和发送单元502。其中,接收单元501用于支持该数据交换装置执行方法实施例中接收块描述信息或者存储指示信息的步骤;发送单元502支持该数据交换装置执行方法实施例中S206。可选的,该数据交换装置还可以包括处理单元503,处理单元503用于执行该数据交换装置执行方法实施例中的S205。上述方法实施例涉及的各步骤的所有相关内容均可以援引到对应功能模块的功能描述,在此不再赘述。
在采用硬件实现的基础上,本申请中的处理单元503可以为数据交换装置的处理器,接收单元501可以为该数据交换装置的接收器,发送单元502可以为该数据交换装置的发送器,发送器通常可以和接收器集成在一起用作收发器,具体的收发器还可以称为通信接口。
图16所示,为本申请的实施例提供的上述实施例中所涉及的数据交换装置的一种可能的逻辑结构示意图。该数据交换装置可以为目的节点或者目的节点内置的芯片,该数据 交换装置包括:处理器512和通信接口513。处理器512用于对该数据交换装置动作进行控制管理,例如,处理器512用于支持该数据交换装置执行方法实施例中的S205,和/或用于本文所描述的技术的其他过程。此外,该数据交换装置还可以包括存储器511和总线514,处理器512、通信接口513以及存储器511通过总线514相互连接;通信接口513用于支持该数据交换装置进行通信;存储器511用于存储该数据交换装置的程序代码和数据。
其中,处理器512可以是中央处理器单元,通用处理器,数字信号处理器,专用集成电路,现场可编程门阵列或者其他可编程逻辑器件、晶体管逻辑器件、硬件部件或者其任意组合。其可以实现或执行结合本申请公开内容所描述的各种示例性的逻辑方框,模块和电路。所述处理器也可以是实现计算功能的组合,例如包含一个或多个微处理器组合,数字信号处理器和微处理器的组合等等。总线514可以是外设部件互连标准(PCI)总线或扩展工业标准结构(EISA)总线等。所述总线可以分为地址总线、数据总线、控制总线等。为便于表示,图16中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
在本申请的另一实施例中,还提供一种数据交换网络,该数据交换网络包括源节点、交换节点和目的节点。其中,源节点可以为上述装置实施例中所提供的源节点,用于支持源节点执行方法实施例中源节点的步骤;和/或,交换节点为上述装置实施例所提供的交换节点,用于支持交换节点执行方法实施例中交换节点的步骤;和/或,目的节点为上述装置实施例所提供的目的节点,用于支持目的节点执行方法实施例中目的节点的步骤。
本申请装置实施例的源节点、交换节点和目的节点可分别对应于本申请方法实施例中的源节点、交换节点和目的节点。并且,源节点、交换节点和目的节点的各个模块和其它操作和/或功能分别为了实现上述方法实施例的相应流程,为了简洁,本申请方法实施例的描述可以适用于该装置实施例,在此不再赘述。
本申请装置实施例的有益效果可参考上述对应的方法实施例中的有益效果,此处不再赘述。另外,本申请装置实施例中相关内容的描述也可以参考上述对应的方法实施例。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。
在本申请的另一实施例中,还提供一种可读存储介质,可读存储介质中存储有计算机执行指令,当一个设备(可以是单片机,芯片等)或者处理器执行上述方法实施例所提供的数据交换方法中源节点的步骤。前述的可读存储介质可以包括:U盘、移动硬盘、只读存储器、随机存取存储器、磁碟或者光盘等各种可以存储程序代码的介质。
在本申请的另一实施例中,还提供一种可读存储介质,可读存储介质中存储有计算机执行指令,当一个设备(可以是单片机,芯片等)或者处理器执行上述方法实施例所提供的数据交换方法中交换节点的步骤。前述的可读存储介质可以包括:U盘、移动硬盘、只读存储器、随机存取存储器、磁碟或者光盘等各种可以存储程序代码的介质。
在本申请的另一实施例中,还提供一种可读存储介质,可读存储介质中存储有计算机执行指令,当一个设备(可以是单片机,芯片等)或者处理器执行上述方法实施例所提供的数据交换方法中目的节点的步骤。前述的可读存储介质可以包括:U盘、移动硬盘、只读存储器、随机存取存储器、磁碟或者光盘等各种可以存储程序代码的介质。
在本申请的另一实施例中,还提供一种计算机程序产品,该计算机程序产品包括计算机执行指令,该计算机执行指令存储在计算机可读存储介质中;设备的至少一个处理器可以从计算机可读存储介质读取该计算机执行指令,至少一个处理器执行该计算机执行指令使得设备上述方法实施例所提供的数据交换方法中源节点的步骤。
在本申请的另一实施例中,还提供一种计算机程序产品,该计算机程序产品包括计算机执行指令,该计算机执行指令存储在计算机可读存储介质中;设备的至少一个处理器可以从计算机可读存储介质读取该计算机执行指令,至少一个处理器执行该计算机执行指令使得设备上述方法实施所提供的数据交换方法中交换节点的步骤。
在本申请的另一实施例中,还提供一种计算机程序产品,该计算机程序产品包括计算机执行指令,该计算机执行指令存储在计算机可读存储介质中;设备的至少一个处理器可以从计算机可读存储介质读取该计算机执行指令,至少一个处理器执行该计算机执行指令使得设备上述方法实施所提供的数据交换方法中目的节点的步骤。
最后应说明的是:以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何在本申请揭露的技术范围内的变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (27)

  1. 一种数据交换方法,其特征在于,所述方法包括:
    源节点接收来自第一交换节点的流指示信息,所述流指示信息用于指示目标数据流发生拥塞,所述第一交换节点是所述目标数据流的交换路径中的节点;
    所述源节点向多个交换节点发送多个写数据信息和所述目标数据流的多个数据块,所述多个写数据信息用于指示所述多个交换节点存储所述多个数据块且停止转发所述多个数据块。
  2. 根据权利要求1所述的方法,其特征在于,所述源节点向多个交换节点发送多个写数据信息和所述目标数据流的多个数据块之后,所述方法还包括:
    所述源节点接收来自所述多个交换节点的多个块描述信息,所述多个块描述信息与所述多个数据块一一对应,每个块描述信息用于指示存储对应的数据块的节点信息;
    所述源节点向目的节点发送所述多个块描述信息。
  3. 根据权利要求1或2所述的方法,其特征在于,所述源节点向多个交换节点发送多个写数据信息和所述目标数据流的多个数据块之前,所述方法还包括:
    所述源节点将所述目标数据流中待交换的数据划分为所述多个数据块,所述多个数据块的数量大于或等于所述多个交换节点的数量。
  4. 根据权利要求1-3任一项所述的方法,其特征在于,所述多个交换节点中一个交换节点对应的数据块在所述多个数据块中的顺序与所述交换节点对应的距离在按照从小到大的顺序排列的多个距离中的顺序一致,所述交换节点对应的距离为所述交换节点与所述第一交换节点之间的距离,所述多个距离包括所述多个交换节点与所述第一交换节点之间的距离。
  5. 一种数据交换方法,其特征在于,所述方法包括:
    交换节点向源节点发送流指示信息,所述流指示信息用于指示目标数据流发生拥塞,所述交换节点是所述目标数据流的交换路径中的节点;
    所述交换节点接收来自所述源节点的写数据信息和所述目标数据流的数据块,所述写数据信息用于指示所述交换节点存储所述数据块且停止转发所述数据块;
    所述交换节点根据所述写数据信息存储所述数据块;
    所述交换节点接收来自目的节点的调度信息,所述调度信息用于调度所述数据块;
    所述交换节点向所述目的节点发送所述数据块。
  6. 根据权利要求5所述的方法,其特征在于,所述交换节点根据所述写数据信息存储所述数据块之后,所述方法还包括:
    所述交换节点向所述目的节点发送所述数据块的块描述信息;或者,
    所述交换节点向所述源节点发送所述数据块的块描述信息;
    其中,所述块描述信息用于指示存储所述数据块的节点信息。
  7. 一种数据交换方法,其特征在于,所述方法包括:
    第一交换节点向源节点发送流指示信息,所述流指示信息用于指示目标数据流发生拥塞,所述第一交换节点是所述目标数据流的交换路径中的节点;
    当所述源节点接收到所述流指示信息时,向多个交换节点发送多个写数据信息和所述 目标数据流的多个数据块,所述多个写数据信息用于指示所述多个交换节点存储所述多个数据块且停止转发所述多个数据块;
    所述多个交换节点接收所述多个写数据信息和所述多个数据块,并根据所述多个写数据信息存储所述多个数据块;
    目的节点向所述多个交换节点发送多个调度信息,所述多个调度信息用于调度所述多个数据块;
    所述多个交换节点接收所述多个调度信息,并根据所述多个调度信息向所述目的节点发送所述多个数据块。
  8. 根据权利要求7所述的方法,其特征在于,所述方法还包括:
    所述多个交换节点向所述源节点发送多个块描述信息,所述多个块描述信息与所述多个数据块一一对应,每个块描述信息用于指示存储对应的数据块的节点信息;
    所述源节点接收所述多个块描述信息,并向所述目的节点发送所述多个块描述信息。
  9. 根据权利要求7所述的方法,其特征在于,所述方法还包括:
    所述多个交换节点向所述目的节点发送所述多个块描述信息,每个块描述信息用于指示存储对应的数据块的节点信息。
  10. 根据权利要求8或9所述的方法,其特征在于,所述方法还包括:
    当所述目的节点接收到所述多个块描述信息时,根据所述多个块描述信息确定所述多个数据块的调度顺序,所述调度顺序用于从所述多个交换节点中调度所述多个数据块。
  11. 根据权利要求7-10任一项所述的方法,其特征在于,所述方法还包括:
    所述源节点将所述目标数据流中待交换的数据划分为所述多个数据块,所述多个数据块的数量大于或等于所述多个交换节点的数量。
  12. 根据权利要求7-11任一项所述的方法,其特征在于,所述多个交换节点中一个交换节点对应存储的数据块在所述多个数据块中的顺序与所述交换节点对应的距离在按照从小到大的顺序排列的多个距离中的顺序一致,所述交换节点对应的距离为所述交换节点与所述第一交换节点之间的距离,所述多个距离包括所述多个交换节点与所述第一交换节点之间的距离。
  13. 一种数据交换装置,其特征在于,所述装置作为源节点,包括:
    接收单元,用于接收来自第一交换节点的流指示信息,所述流指示信息用于指示目标数据流发生拥塞,所述第一交换节点是所述目标数据流的交换路径中的节点;
    发送单元,用于向多个交换节点发送多个写数据信息和所述目标数据流的多个数据块,所述多个写数据信息用于指示所述多个交换节点存储所述多个数据块且停止转发所述多个数据块。
  14. 根据权利要求13所述的装置,其特征在于,
    所述接收单元,还用于接收来自所述多个交换节点的多个块描述信息,所述多个块描述信息与所述多个数据块一一对应,每个块描述信息用于指示存储对应的数据块的节点信息;
    所述发送单元,还用于向目的节点发送所述多个块描述信息。
  15. 根据权利要求13或14所述的装置,其特征在于,所述装置还包括:
    处理单元,用于将所述目标数据流中待交换的数据划分为所述多个数据块,所述多个 数据块的数量大于或等于所述多个交换节点的数量。
  16. 根据权利要求13-15任一项所述的装置,其特征在于,所述多个交换节点中一个交换节点对应的数据块在所述多个数据块中的顺序与所述交换节点对应的距离在按照从小到大的顺序排列的多个距离中的顺序一致,所述交换节点对应的距离为所述交换节点与所述第一交换节点之间的距离,所述多个距离包括所述多个交换节点中每个交换节点与所述第一交换节点之间的距离。
  17. 一种数据交换装置,其特征在于,所述装置作为交换节点,包括:
    发送单元,用于向源节点发送流指示信息,所述流指示信息用于指示目标数据流发生拥塞,所述交换节点是所述目标数据流的交换路径中的节点;
    接收单元,用于接收来自所述源节点的写数据信息和所述目标数据流的数据块,所述写数据信息用于指示所述交换节点存储所述数据块且停止转发所述数据块;
    处理单元,用于根据所述写数据信息存储所述数据块;
    所述接收单元,还用于接收来自目的节点的调度信息,所述调度信息用于调度所述数据块;
    所述发送单元,还用于向所述目的节点发送所述数据块。
  18. 根据权利要求17所述的装置,其特征在于,所述发送单元还用于:
    向所述目的节点发送所述数据块的块描述信息;或者,
    向所述源节点发送所述数据块的块描述信息;
    其中,所述块描述信息用于指示存储所述数据块的节点信息。
  19. 一种数据交换网络,其特征在于,所述数据交换网络包括源节点、多个交换节点和目的节点,所述多个交换节点包括第一交换节点;其中,
    所述第一交换节点,用于向所述源节点发送流指示信息,所述流指示信息用于指示目标数据流发生拥塞,所述第一交换节点是所述目标数据流的交换路径中的节点;
    所述源节点,用于接收所述流指示信息,并向多个交换节点发送多个写数据信息和所述目标数据流的多个数据块,所述多个写数据信息用于指示所述多个交换节点存储所述多个数据块且停止转发所述多个数据块;
    所述多个交换节点,用于接收所述多个写数据信息和所述多个数据块,并根据所述多个写数据信息存储所述多个数据块;
    所述目的节点,用于向所述多个交换节点发送多个调度信息,所述多个调度信息用于调度所述多个数据块;
    所述多个交换节点,还用于接收所述多个调度信息,并根据所述多个调度信息向所述目的节点发送所述多个数据块。
  20. 根据权利要求19所述的数据交换网络,其特征在于,
    所述多个交换节点,还用于向所述源节点发送多个块描述信息,所述多个块描述信息与所述多个数据块一一对应,每个块描述信息用于指示存储对应的数据块的节点信息;
    所述源节点,还用于接收所述多个块描述信息,并向所述目的节点发送所述多个块描述信息。
  21. 根据权利要求19所述的数据交换网络,其特征在于,
    所述多个交换节点,还用于向所述目的节点发送所述多个块描述信息。
  22. 根据权利要求19或20所述的数据交换网络,其特征在于,
    所述目的节点,还用于接收所述多个块描述信息,并根据所述多个块描述信息确定所述多个数据块的调度顺序,所述调度顺序用于从所述多个交换节点中调度所述多个数据块。
  23. 根据权利要求19-22任一项所述的数据交换网络,其特征在于,
    所述源节点,还用于将所述目标数据流中待交换的数据划分为所述多个数据块,所述多个数据块的数量大于或等于所述多个交换节点的数量。
  24. 根据权利要求19-23任一项所述的数据交换网络,其特征在于,所述多个交换节点中一个交换节点对应存储的数据块在所述多个数据块中的顺序与所述交换节点对应的距离在按照从小到大的顺序排列的多个距离中的顺序一致,所述交换节点对应的距离为所述交换节点与所述第一交换节点之间的距离,所述多个距离包括所述多个交换节点与所述第一交换节点之间的距离。
  25. 一种数据交换装置,其特征在于,所述数据交换装置包括:处理器、存储器、通信接口和总线,所述处理器、所述存储器和所述通信接口通过总线连接;所述存储器用于存储程序代码,当所述程序代码被所述处理器执行时,使得所述数据交换装置执行权利要求1-4任一项所述的数据交换方法。
  26. 一种数据交换装置,其特征在于,所述数据交换装置包括:处理器、存储器、通信接口和总线,所述处理器、所述存储器和所述通信接口通过总线连接;所述存储器用于存储程序代码,当所述程序代码被所述处理器执行时,使得所述数据交换装置执行权利要求5-6任一项所述的数据交换方法。
  27. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质中存储有计算机程序或指令,当所述计算机程序或指令被运行时,实现如权利要求1-12任一项所述的数据交换方法。
PCT/CN2022/131459 2022-01-05 2022-11-11 一种数据交换方法及装置 WO2023130835A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210010085.6 2022-01-05
CN202210010085.6A CN116418745A (zh) 2022-01-05 2022-01-05 一种数据交换方法及装置

Publications (1)

Publication Number Publication Date
WO2023130835A1 true WO2023130835A1 (zh) 2023-07-13

Family

ID=87052007

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/131459 WO2023130835A1 (zh) 2022-01-05 2022-11-11 一种数据交换方法及装置

Country Status (2)

Country Link
CN (1) CN116418745A (zh)
WO (1) WO2023130835A1 (zh)

Citations (5)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000036849A (ja) * 1998-07-17 2000-02-02 Nec Eng Ltd 無手順通信システム及びその無手順通信方法並びにその制御プログラムを記録した記録媒体
CN107786440A (zh) * 2016-08-26 2018-03-09 华为技术有限公司 一种数据报文转发的方法及装置
CN108234320A (zh) * 2016-12-14 2018-06-29 华为技术有限公司 报文传输方法及交换机
US20190182161A1 (en) * 2017-12-09 2019-06-13 Intel Corporation Fast congestion response
US20190230053A1 (en) * 2018-01-25 2019-07-25 Excelero Storage Ltd. System and method for improving network storage accessibility

Also Published As

Publication number Publication date
CN116418745A (zh) 2023-07-11

Similar Documents

Publication Publication Date Title
EP1779607B1 (en) Network interconnect crosspoint switching architecture and method
Duato et al. A new scalable and cost-effective congestion management strategy for lossless multistage interconnection networks
US7227841B2 (en) Packet input thresholding for resource distribution in a network switch
US6084856A (en) Method and apparatus for adjusting overflow buffers and flow control watermark levels
US8867559B2 (en) Managing starvation and congestion in a two-dimensional network having flow control
US7274660B2 (en) Method of flow control
US6754222B1 (en) Packet switching apparatus and method in data network
US20210320866A1 (en) Flow control technologies
US20220303217A1 (en) Data Forwarding Method, Data Buffering Method, Apparatus, and Related Device
CN103534997A (zh) 用于无损耗以太网的基于端口和优先级的流控制机制
CN109861931B (zh) 一种高速以太网交换芯片的存储冗余系统
EP1891778A1 (en) Electronic device and method of communication resource allocation.
WO2012116655A1 (zh) 交换单元芯片、路由器及信元信息的发送方法
US8588239B2 (en) Relaying apparatus and packet relaying apparatus
EP3188419B1 (en) Packet storing and forwarding method and circuit, and device
Wu et al. Network congestion avoidance through packet-chaining reservation
CN114531488B (zh) 一种面向以太网交换器的高效缓存管理系统
JP4588259B2 (ja) 通信システム
US7990873B2 (en) Traffic shaping via internal loopback
WO2023130835A1 (zh) 一种数据交换方法及装置
US20220321478A1 (en) Management of port congestion
CN111434079B (zh) 一种数据通信方法及装置
CN114531399A (zh) 一种内存阻塞平衡方法、装置、电子设备和存储介质
WO2023202294A1 (zh) 一种数据流保序方法、数据交换装置及网络
Wu et al. Revisiting network congestion avoidance through adaptive packet-chaining reservation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22918287

Country of ref document: EP

Kind code of ref document: A1