CN117640530A - Method and device for low-delay reliable data transmission of cluster network - Google Patents

Info

Publication number
CN117640530A
Authority
CN
China
Prior art keywords
data
receiving
network card
transmitted
identification number
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311656018.2A
Other languages
Chinese (zh)
Inventor
董德尊
吴克
杨雨昂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN202311656018.2A
Publication of CN117640530A
Legal status: Pending

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a low-delay reliable data transmission method and device for a cluster network. The method is implemented on the end-side hosts of the cluster and comprises the following steps: acquiring a transmission task descriptor from the sending-end memory by using the sending-end network card, the transmission task descriptor comprising a sending address of the data to be sent, a receiving address of the data to be sent, a length of the data to be sent, and a descriptor identification number; processing the transmission task descriptor by using the sending-end network card to obtain connection resource information and data packets to be sent; and transmitting the connection resource information and the data packets to be sent by using the sending-end host and the receiving-end host to complete the data transmission process. The method removes the one-RTT connection overhead for almost all messages, and includes a complete mechanism for handling the special case in which the receiving end has no connection resources. Large-scale heavy load and traffic with severe endpoint congestion likewise do not degrade the performance of the method.

Description

Method and device for low-delay reliable data transmission of cluster network
Technical Field
The present invention relates to the field of high performance computing and data center networks, and in particular, to a method and apparatus for low latency reliable data transmission in a clustered network.
Background
As high performance computing systems grow in size and complexity, the likelihood that their hardware components fail increases, and system availability becomes a factor that high performance computer system design must prioritize. The interconnection network is an important component of a high performance computing system, and also one that is prone to transient or permanent failure. Interconnection networks typically provide a point-to-point reliable transport mechanism, but cannot cope with failures such as packet loss or link disconnection. Although the checkpointing mechanism of an HPC system can tolerate network failures to some extent, recording checkpoints affects program performance, and the software overhead of error recovery is relatively large.
The Remote Direct Memory Access (RDMA) protocol allows end hosts to read and write host memory directly, bypassing the kernel, while offloading the transport layer and the layers below it in the network protocol stack to an RDMA-enabled network interface card (RNIC). RDMA can therefore meet the high-bandwidth, low-latency, and low-CPU-overhead network stack requirements of modern high-performance computing (HPC) applications. Given these significant performance advantages, RDMA has been widely used in HPC interconnection networks in the form of proprietary networks.
Therefore, the RDMA private networks of HPC systems should provide a message-level reliable data transfer service in hardware to preserve their performance advantages. By providing an end-to-end reliable data transmission service, the system hardware can tolerate network faults such as message loss or link disconnection, and can detect and recover from errors in real time at the message level. Compared with software-based error recovery, a hardware reliable data transmission service does not affect the normal execution of the program, and the program does not need to be paused and restarted after an error occurs, so the cost of error detection and recovery is small.
In a practical HPC system, the RDMA protocol offloads message-level data recovery and retransmission to the NIC; connections are established and released on the NIC, and connection context information is also maintained on the NIC. Because no software involvement is required, this significantly reduces the overhead of data recovery. At the same time, because resources on the network card are limited, connections must be established and released quickly and dynamically so that the network card always has idle connection resources. The protocol therefore establishes a connection for each message. Except for a few long messages, the connection for a message is released soon after its transmission completes. In this scheme, establishing the connection takes at least one RTT between the sending-end and receiving-end network cards before each message is sent. Message failure is a low-probability event, yet every message pays the overhead of one connection setup. For short messages whose transmission lasts only a few RTTs, or even less than one RTT, this overhead seriously degrades latency.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a low-delay reliable data transmission method and device for a cluster network that help improve the transmission efficiency of a high-performance computing system, reduce the transmission failure rate, reduce the RTT overhead that short messages incur in the RDMA protocol, reduce transmission delay, and eliminate the impact of large-scale heavy load and severe endpoint congestion traffic on the high-performance computing system.
A first aspect of the embodiments of the invention discloses a low-delay reliable data transmission method for a cluster network. The method is implemented on the end-side hosts of the cluster, which comprise a sending-end host and a receiving-end host; the sending-end host comprises a sending-end network card and a sending-end memory; the receiving-end host comprises a receiving-end network card and a receiving-end memory. The method comprises the following steps:
S1, acquiring a transmission task descriptor from the sending-end memory by using the sending-end network card; the transmission task descriptor comprises a sending address of the data to be sent, a receiving address of the data to be sent, a length of the data to be sent, and a descriptor identification number; the descriptor identification number uniquely identifies the transmission task; the transmission task descriptor describes the transmission task information;
S2, processing the transmission task descriptor by using the sending-end network card to obtain connection resource information and data packets to be sent;
S3, transmitting the connection resource information and the data packets to be sent by using the sending-end host and the receiving-end host to complete the data transmission process.
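The four descriptor fields named in S1 can be pictured as a fixed-size record that the host writes into sending-end memory and the network card reads back. The sketch below is illustrative only: the field widths (64-bit addresses, 32-bit length and identification number), the little-endian layout, and the names `TransferDescriptor`, `pack`, and `unpack` are assumptions and are not specified in this disclosure.

```python
import struct
from dataclasses import dataclass

# Hypothetical layout: two 64-bit addresses, a 32-bit length, a 32-bit ID,
# little-endian, no padding. The disclosure does not fix any of these widths.
_DESC_FORMAT = "<QQII"


@dataclass(frozen=True)
class TransferDescriptor:
    send_addr: int  # sending address of the data to be sent
    recv_addr: int  # receiving address of the data to be sent
    length: int     # length of the data to be sent, in bytes
    desc_id: int    # descriptor identification number (unique per task)

    def pack(self) -> bytes:
        """Serialize the descriptor as the host might place it in memory."""
        return struct.pack(_DESC_FORMAT, self.send_addr, self.recv_addr,
                           self.length, self.desc_id)

    @classmethod
    def unpack(cls, raw: bytes) -> "TransferDescriptor":
        """Parse a descriptor as the network card might read it in S1."""
        return cls(*struct.unpack(_DESC_FORMAT, raw))


desc = TransferDescriptor(send_addr=0x1000, recv_addr=0x8000,
                          length=4096, desc_id=7)
raw = desc.pack()
```

Round-tripping `raw` through `TransferDescriptor.unpack` returns an equal descriptor, which is the property the network card relies on when it fetches the task from memory.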
Processing the transmission task descriptor by using the sending-end network card to obtain connection resource information and data packets to be sent comprises:
S21, setting the transmission task descriptor by using the sending-end network card to obtain the corresponding connection resource information; the connection resource information comprises a sending-end host identification number, a connection identification number, and a receiving-end host identification number; the connection identification number is stored in the sending-end network card; the sending-end network card and the receiving-end network card each comprise a plurality of connection resources; the connection identification number indicates the number of the sending-end network card connection resource used by the transmission task; the sending-end host identification number is the identification number that the cluster network assigns to the sending-end host;
S22, acquiring the data to be sent from the sending-end memory by using the sending-end network card, according to the sending address and the length of the data to be sent in the transmission task descriptor;
S23, performing segmented encapsulation on the receiving address of the data to be sent, the sending-end host identification number, the descriptor identification number, and the data to be sent by using the sending-end network card to obtain data packets to be sent; the data packets to be sent are sent to the receiving-end network card one by one; each data packet to be sent comprises the receiving address of the data to be sent, the sending-end host identification number, the descriptor identification number, and one segment of the data to be sent.
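The segmented encapsulation of S23 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the maximum payload size of 1024 bytes, the choice to advance the receiving address by each segment's offset, and the names `DataPacket` and `segment_and_encapsulate` are all hypothetical. The disclosure only states that each packet carries the receiving address, the sending-end host identification number, the descriptor identification number, and one segment of data.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataPacket:
    recv_addr: int       # receiving address for this segment (assumed: base + offset)
    sender_host_id: int  # sending-end host identification number
    desc_id: int         # descriptor identification number
    payload: bytes       # one segment of the data to be sent


def segment_and_encapsulate(data: bytes, recv_addr: int, sender_host_id: int,
                            desc_id: int, max_payload: int = 1024) -> List[DataPacket]:
    """Split `data` into segments of at most `max_payload` bytes and wrap each one."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    packets = []
    for offset in range(0, len(data), max_payload):
        packets.append(DataPacket(recv_addr=recv_addr + offset,
                                  sender_host_id=sender_host_id,
                                  desc_id=desc_id,
                                  payload=data[offset:offset + max_payload]))
    return packets


packets = segment_and_encapsulate(bytes(2500), recv_addr=0x8000,
                                  sender_host_id=3, desc_id=7)
```

All packets of one task share the same descriptor identification number, so the receiving end can group them without any prior handshake.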
Transmitting the connection resource information and the data packets to be sent by using the sending-end host and the receiving-end host to complete the data transmission process comprises:
S31, combining the sending-end host identification number, the descriptor identification number, the length of the data to be sent, and the connection identification number by using the sending-end network card to obtain connection event information; sending the connection event information to the receiving-end host;
S32, receiving a data packet to be sent by using the receiving-end network card, and treating it as a received data packet; judging whether the receiving-end network card has idle storage resources to obtain a first judgment result; if the first judgment result is yes, the receiving-end host performs first receiving processing on the received data packet to complete the data transmission process; if the first judgment result is no, the receiving-end host and the sending-end host perform second receiving processing on the received data packet to complete the data transmission process.
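A small sketch of S31 and the first judgment in S32 is given below. The type and function names (`ConnectionEvent`, `ReceivePath`, `build_connection_event`, `select_receive_path`) are hypothetical, and representing the idle storage resources as a simple counter is an assumption made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ReceivePath(Enum):
    FIRST = "first receiving processing"    # receiving end has idle resources
    SECOND = "second receiving processing"  # receiving end has none


@dataclass(frozen=True)
class ConnectionEvent:
    sender_host_id: int  # sending-end host identification number
    desc_id: int         # descriptor identification number
    total_length: int    # length of the data to be sent
    conn_id: int         # connection identification number on the sending end


def build_connection_event(sender_host_id: int, desc_id: int,
                           total_length: int, conn_id: int) -> ConnectionEvent:
    """S31: combine the four fields into connection event information."""
    return ConnectionEvent(sender_host_id, desc_id, total_length, conn_id)


def select_receive_path(idle_storage_resources: int) -> ReceivePath:
    """S32: the first judgment result decides which processing is used."""
    return ReceivePath.FIRST if idle_storage_resources > 0 else ReceivePath.SECOND


event = build_connection_event(sender_host_id=3, desc_id=7,
                               total_length=4096, conn_id=1)
```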
The receiving-end host performing first receiving processing on the received data packet to complete the data transmission process comprises:
S3201, the receiving-end network card combines the sending-end host identification number and the descriptor identification number in the received data packet to obtain index information; it judges whether storage resource information for the index information exists in the storage resource allocation information table of the receiving-end network card, obtaining a second judgment result; the storage resource allocation information table stores all index information and the corresponding allocated storage resource information;
if the second judgment result is yes, S3202 is executed; if the second judgment result is no, S3203 is executed;
S3202, the receiving-end network card updates the received data length in the receiving transmission task descriptor stored in the receiving-end network card according to the amount of data carried by the received data packet, and S3204 is executed;
S3203, the receiving-end network card updates the received data length in the receiving transmission task descriptor stored in the receiving-end network card according to the amount of data carried by the received data packet; the receiving-end network card allocates corresponding storage resources for the descriptor identification number according to the descriptor identification number and the sending-end host identification number in the index information; the index information and the information of the allocated storage resources are added to the storage resource allocation information table;
S3204, the receiving-end network card stores the segment of data to be sent into the receiving-end memory according to the receiving address of the data to be sent in the received data packet;
S3205, the receiving-end network card determines the corresponding length of the data to be sent according to the descriptor identification number in the connection event information; it updates the data length of the transmission task descriptor stored in the receiving-end network card with the length of the data to be sent; the data transmission process is completed.
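Steps S3201 to S3205 amount to a table lookup keyed by the pair (sending-end host identification number, descriptor identification number), with allocation on first sight. The following is a minimal sketch of that logic. It models the storage resource allocation information table as a Python dictionary and the receiving-end memory as an address-to-bytes map; both, along with the names `ReceiverNic`, `StorageResource`, `first_receive`, and `apply_connection_event`, are illustrative assumptions. Creating the table entry if the connection event arrives before any data packet is also an assumption, since the disclosure does not specify that ordering.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Index = Tuple[int, int]  # (sending-end host ID, descriptor ID)


@dataclass
class StorageResource:
    received_length: int = 0  # bytes received so far for this task
    total_length: int = 0     # total task length, taken from the connection event


@dataclass
class ReceiverNic:
    allocation_table: Dict[Index, StorageResource] = field(default_factory=dict)
    host_memory: Dict[int, bytes] = field(default_factory=dict)

    def first_receive(self, sender_host_id: int, desc_id: int,
                      recv_addr: int, payload: bytes) -> StorageResource:
        index = (sender_host_id, desc_id)            # S3201: build index information
        resource = self.allocation_table.get(index)  # S3201: second judgment
        if resource is None:                         # S3203: allocate and record
            resource = StorageResource()
            self.allocation_table[index] = resource
        resource.received_length += len(payload)     # S3202 / S3203: update length
        self.host_memory[recv_addr] = payload        # S3204: store the segment
        return resource

    def apply_connection_event(self, sender_host_id: int, desc_id: int,
                               total_length: int) -> None:
        # S3205: record the total length announced by the connection event.
        index = (sender_host_id, desc_id)
        self.allocation_table.setdefault(index, StorageResource()).total_length = total_length


nic = ReceiverNic()
nic.first_receive(3, 7, 0x8000, b"a" * 1024)
nic.first_receive(3, 7, 0x8400, b"b" * 476)
nic.apply_connection_event(3, 7, 1500)
```

After these calls the table holds a single entry for index `(3, 7)` whose received length equals the announced total length.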
The receiving-end host and the sending-end host performing second receiving processing on the received data packet to complete the data transmission process comprises:
S3211, deleting the received data packet from the receiving-end memory and generating connection failure event information; sending the connection failure event information to the sending-end network card by using the receiving-end network card; the connection failure event information notifies the sending-end host that connection establishment has failed; the connection failure event information comprises the descriptor identification number in the received data packet;
S3212, the sending-end network card generates connection application event information from the received connection failure event information, and sends the connection application event information to the receiving-end network card; the connection application event information comprises the descriptor identification number, the sending-end host identification number, the length of the data to be sent, and the connection identification number;
S3213, the receiving-end network card receives the connection application event information and judges whether the receiving-end network card has idle storage resources, obtaining a third judgment result; when the third judgment result is no, the judgment is repeated after waiting for a preset time; when the third judgment result is yes, the receiving-end host and the sending-end host perform third receiving processing on the received data packet to complete the data transmission process.
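The fallback of S3212 and the wait-and-retry judgment of S3213 can be sketched as below. The message types and the functions `on_connection_failure` and `wait_for_idle_resource` are hypothetical names; the polling loop with an injectable `sleep` is one possible reading of "waiting for a preset time", chosen here so that the example runs without real delays.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ConnectionFailure:
    desc_id: int  # descriptor ID copied from the dropped data packet (S3211)


@dataclass(frozen=True)
class ConnectionApplication:
    desc_id: int
    sender_host_id: int
    total_length: int
    conn_id: int


def on_connection_failure(failure: ConnectionFailure, sender_host_id: int,
                          total_length: int, conn_id: int) -> ConnectionApplication:
    """S3212: the sending end turns a failure notice into an explicit application."""
    return ConnectionApplication(failure.desc_id, sender_host_id,
                                 total_length, conn_id)


def wait_for_idle_resource(has_idle: Callable[[], bool], wait_seconds: float,
                           sleep: Callable[[float], None] = time.sleep) -> int:
    """S3213: re-check after each preset wait; return how many waits were needed."""
    waits = 0
    while not has_idle():
        sleep(wait_seconds)
        waits += 1
    return waits


# Simulated receiving end: busy on the first two checks, idle on the third.
states = iter([False, False, True])
slept = []
waits = wait_for_idle_resource(lambda: next(states), 0.001, sleep=slept.append)
application = on_connection_failure(ConnectionFailure(desc_id=7),
                                    sender_host_id=3, total_length=4096, conn_id=1)
```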
The receiving-end host and the sending-end host performing third receiving processing on the received data packet to complete the data transmission process comprises:
S32131, the receiving-end network card allocates corresponding storage resources for the descriptor identification number according to the descriptor identification number and the sending-end host identification number in the connection application event information; it updates the data length of the transmission task descriptor stored in the receiving-end network card with the length of the data to be sent in the connection application event information; it combines the descriptor identification number and the sending-end host identification number in the connection application event information to obtain index information; it adds the index information and the information of the storage resources allocated for the descriptor identification number to the storage resource allocation information table;
S32132, the receiving-end network card generates connection confirmation event information according to the connection application event information and sends it to the sending-end network card; the connection confirmation event information comprises the descriptor identification number in the connection application event information;
S32133, after receiving the connection confirmation event information, the sending-end network card performs first sending processing to obtain data packets to be sent; the data packets to be sent are sent to the receiving-end network card one by one;
S32134, after receiving a data packet to be sent, the receiving-end network card updates the received data length in the receiving transmission task descriptor stored in the receiving-end network card according to the amount of data carried by the data packet;
S32135, the receiving-end network card stores the data to be sent carried in the data packet into the receiving-end memory according to the receiving address of the data to be sent in the data packet, completing the data transmission process.
The sending-end network card performing first sending processing after receiving the connection confirmation event information to obtain data packets to be sent comprises:
S321331, after receiving the connection confirmation event information, the sending-end network card acquires the data to be sent from the sending-end memory according to the sending address and the length of the data to be sent in the transmission task descriptor;
S321332, the sending-end network card performs segmented encapsulation on the receiving address of the data to be sent, the sending-end host identification number, the descriptor identification number, and the data to be sent to obtain data packets to be sent; each data packet to be sent comprises the receiving address of the data to be sent, the sending-end host identification number, the descriptor identification number, and one segment of the data to be sent.
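Taken together, the first, second, and third receiving processing produce two possible message sequences for one task. The sketch below lists them in order. The message kind strings and the function `message_trace` are hypothetical labels for the events named in the steps above, and the connection event information of S31 is left out because its ordering relative to the data packets is not specified here.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (direction, message kind)


def message_trace(receiver_has_idle_resource: bool) -> List[Message]:
    """Return the ordered messages exchanged for one transmission task."""
    trace: List[Message] = [("sender->receiver", "DATA")]  # sent without a handshake
    if receiver_has_idle_resource:
        return trace  # first receiving processing: no extra round trip
    trace += [
        ("receiver->sender", "CONNECTION_FAILURE"),       # S3211: packet dropped
        ("sender->receiver", "CONNECTION_APPLICATION"),   # S3212
        ("receiver->sender", "CONNECTION_CONFIRMATION"),  # S32132: resource allocated
        ("sender->receiver", "DATA"),                     # S32133: data re-read and resent
    ]
    return trace


fast_path = message_trace(receiver_has_idle_resource=True)
slow_path = message_trace(receiver_has_idle_resource=False)
```

The fast path contains only the data message, while the slow path adds two full exchanges before the data is resent.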
A second aspect of the embodiments of the invention discloses a low-delay reliable data transmission device for a cluster network, the device comprising:
a memory storing executable program code;
a processor coupled to the memory;
the processor invokes the executable program code stored in the memory to perform the low latency reliable data transfer method of the clustered network.
In a third aspect of embodiments of the present invention, a computer-readable medium is disclosed, storing computer instructions that, when invoked, are operable to perform a method of low-latency reliable data transmission for a clustered network.
A fourth aspect of the embodiments of the invention discloses an information data processing terminal configured to implement the low-delay reliable data transmission method of the cluster network.
The beneficial effects of the invention are as follows:
1. The invention provides a connection protocol based on post-confirmation: the sending end assumes that the receiving end has connection resources and sends the data message directly, and the receiving end applies for connection resources after it receives the data message. If the receiving end has no connection resources, the invention also provides a complete mechanism to guarantee the reliability and correctness of data transmission. Compared with RCP, the invention saves one RTT of connection overhead in most cases. Furthermore, evaluation of the method shows that, because of the changed connection establishment mode, the invention uses fewer receiving-end connections than RCP.
2. The invention provides a low-delay reliable data transmission method and device for a cluster network that help improve the transmission efficiency of a high-performance computing system, reduce the transmission failure rate, reduce the RTT overhead that short messages incur in the RDMA protocol, reduce transmission delay, and eliminate the impact of large-scale heavy load and severe endpoint congestion traffic on the high-performance computing system.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a timing diagram of the processing flow of the method of the present invention when the receiving end has, and when it does not have, connection resources;
FIG. 3 is a schematic diagram of a 2x2 mesh network structure used in the test of the present invention;
FIG. 4 is a graph comparing the completion time of RDMAW descriptors under the present invention and under the RCP method;
FIG. 5 is a graph comparing the performance of the present invention and RCP in a light endpoint congestion scenario.
Detailed Description
For a better understanding of the present disclosure, an embodiment is presented herein.
Most of the time, the connection resources on the receiving-end network card are plentiful, which means that the sending end need not confirm with the receiving end before sending a message and can instead assume by default that the receiving end has connection resources. This removes the one-RTT connection overhead for almost all messages. Based on this idea, the invention provides a connection protocol based on post-confirmation. The method assumes that the receiving end has connection resources and sends the data message directly; the receiving end applies for connection resources after it receives the data message. If the receiving end has no connection resources, the method also provides a complete mechanism to guarantee the reliability and correctness of data transmission. Compared with the RCP (Reliable Connection Protocol) method, the method saves one RTT of connection overhead in most cases. Furthermore, evaluation of the method shows that, because of the changed connection establishment mode, it uses fewer receiving-end connections than RCP.
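The trade-off behind post-confirmation can be made concrete with a deliberately simplified model. The sketch assumes that RCP always pays one RTT of connection setup per message (the Background states at least one RTT), and that a failed optimistic send costs two RTTs before data can be resent: one for the dropped data packet and the returned failure notice, and one for the connection application and its confirmation. Waiting time at a busy receiving end is ignored. The two-RTT figure, the function name, and the parameter `p_idle` are assumptions made for illustration and do not come from the evaluation in this disclosure.

```python
RCP_OVERHEAD_RTTS = 1.0  # per-message connection setup cost under RCP (assumed exactly 1 RTT)


def expected_connection_overhead_rtts(p_idle: float,
                                      fallback_rtts: float = 2.0) -> float:
    """Expected connection overhead, in RTTs, of the optimistic scheme.

    p_idle is the probability that the receiving end has an idle connection
    resource when the first data packet arrives; in that case no extra RTT is
    paid. Otherwise the assumed fallback cost `fallback_rtts` applies.
    """
    if not 0.0 <= p_idle <= 1.0:
        raise ValueError("p_idle must be within [0, 1]")
    return (1.0 - p_idle) * fallback_rtts


overhead_mostly_idle = expected_connection_overhead_rtts(0.99)
overhead_break_even = expected_connection_overhead_rtts(0.5)
```

Under these assumptions the optimistic scheme has lower expected overhead than RCP whenever the receiving end is idle more than half of the time, which matches the observation that receiving-end resources are plentiful most of the time.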
To address the problems of how to improve the transmission efficiency of a high-performance computing system, reduce the transmission failure rate, reduce the RTT overhead that short messages incur in the RDMA protocol, reduce transmission delay, and eliminate the impact of large-scale heavy load and severe endpoint congestion traffic on the high-performance computing system, a first aspect of the embodiments of the invention discloses a low-delay reliable data transmission method and device for a cluster network. The method is implemented on the end-side hosts of the cluster, which comprise a sending-end host and a receiving-end host; the sending-end host comprises a sending-end network card and a sending-end memory; the receiving-end host comprises a receiving-end network card and a receiving-end memory. The method comprises the following steps:
S1, acquiring a transmission task descriptor from the sending-end memory by using the sending-end network card; the transmission task descriptor comprises a sending address of the data to be sent, a receiving address of the data to be sent, a length of the data to be sent, and a descriptor identification number; the descriptor identification number uniquely identifies the transmission task; the transmission task descriptor describes the transmission task information and is created by the sending-end host according to the transmission task;
S2, processing the transmission task descriptor by using the sending-end network card to obtain connection resource information and data packets to be sent;
S3, transmitting the connection resource information and the data packets to be sent by using the sending-end host and the receiving-end host to complete the data transmission process.
Processing the transmission task descriptor by using the sending-end network card to obtain connection resource information and data packets to be sent comprises:
S21, setting the transmission task descriptor by using the sending-end network card to obtain the corresponding connection resource information; the connection resource information comprises a sending-end host identification number, a connection identification number, and a receiving-end host identification number; the connection identification number is stored in the sending-end network card; the sending-end network card and the receiving-end network card each comprise a plurality of connection resources; a connection resource may be a storage space of a certain capacity; the connection identification number indicates the number of the sending-end network card connection resource used by the transmission task; the sending-end host identification number is the identification number that the cluster network assigns to the sending-end host, and can be obtained from the sending-end network card;
S22, acquiring the data to be sent from the sending-end memory by using the sending-end network card, according to the sending address and the length of the data to be sent in the transmission task descriptor;
S23, performing segmented encapsulation on the receiving address of the data to be sent, the sending-end host identification number, the descriptor identification number, and the data to be sent by using the sending-end network card to obtain data packets to be sent; the data packets to be sent are sent to the receiving-end network card one by one; each data packet to be sent comprises the receiving address of the data to be sent, the sending-end host identification number, the descriptor identification number, and one segment of the data to be sent.
Performing segmented encapsulation on the receiving address of the data to be sent, the sending-end host identification number, the descriptor identification number, and the data to be sent by using the sending-end network card to obtain data packets to be sent comprises:
dividing the data to be sent uniformly by using the sending-end network card to obtain a plurality of segments of data to be sent; encapsulating each segment together with the receiving address of the data to be sent, the sending-end host identification number, and the descriptor identification number to obtain a corresponding data packet to be sent. Segmented encapsulation of the data to be sent thus yields a plurality of data packets to be sent; different data packets to be sent may carry the same descriptor identification number and different segments of the data to be sent;
sending the data packets to be sent to the receiving-end network card one by one, the next data packet being sent after the previous one has been sent.
Transmitting the connection resource information and the data packets to be sent by using the sending-end host and the receiving-end host to complete the data transmission process comprises:
S31, combining the sending-end host identification number, the descriptor identification number, the length of the data to be sent, and the connection identification number by using the sending-end network card to obtain connection event information; sending the connection event information to the receiving-end host;
S32, receiving a data packet to be sent by using the receiving-end network card, and treating it as a received data packet; judging whether the receiving-end network card has idle storage resources to obtain a first judgment result; if the first judgment result is yes, the receiving-end host performs first receiving processing on the received data packet to complete the data transmission process; if the first judgment result is no, the receiving-end host and the sending-end host perform second receiving processing on the received data packet to complete the data transmission process.
the receiving end host machine performs a first receiving process on the received data packet to complete a data transmission process, and the method comprises the following steps:
s3201, a receiving end network card performs combination processing on a transmitting end host identification number and a descriptor identification number in the received data packet to obtain index information; judging whether the storage resource information of the index information exists in a storage resource allocation information table of the network card of the receiving end, and obtaining a second judging result; the storage resource allocation information table stores all index information and corresponding allocated storage resource information;
if the second determination result is yes, executing S3202; if the second judgment result is no, executing S3203;
s3202, the receiving end network card updates the received data length in the receiving transmission task descriptor stored in the receiving end network card according to the data quantity carried by the receiving data packet, and S3204 is executed;
the receiving end network card updates the received data length in the received transmission task descriptor stored in the receiving end network card according to the data quantity carried by the received data packet, and the method comprises the following steps:
the receiving end network card is utilized to increase the received data length by the data amount according to the data amount carried by the received data packet, and the updating of the received data length in the received transmission task descriptor stored in the receiving end network card is completed; the subsequent update process is similar.
S3203, the receiving end network card updates the received data length in the receive transmission task descriptor stored in the receiving end network card according to the amount of data carried by the received data packet; the receiving end network card allocates a corresponding storage resource for the descriptor identification number according to the descriptor identification number and the sending end host identification number in the index information; the index information and the information of the allocated storage resource are added to the storage resource allocation information table;
The storage resource allocation information table allows the sending end to retransmit data according to the table when the network fails, thereby achieving end-to-end low-delay reliable data transmission.
S3204, the receiving end network card stores the segmented data to be transmitted into the receiving end memory according to the receiving address of the data to be transmitted in the received data packet;
S3205, the receiving end network card determines the corresponding length of the data to be transmitted according to the descriptor identification number in the connection event information, and updates the transmission task descriptor data length stored in the receiving end network card using the length of the data to be transmitted; the transmission task descriptor data length is information maintained in the storage resource; it is the total length of the data transmitted in the transmission task and does not change as individual data packets are received; the data transmission process is then complete;
Updating the transmission task descriptor data length stored in the receiving end network card using the length of the data to be transmitted comprises: adding the length of the data to be transmitted to the transmission task descriptor data length to obtain the updated value of the transmission task descriptor data length.
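The first receiving processing (S3201 to S3205) can be sketched as follows. This is a minimal illustration rather than the network card's actual logic: the names `RecvTask`, `ReceiverNic`, `free_slots`, `on_data_packet` and `on_connection_event` are hypothetical, the idle-resource check is folded into the allocation step, each packet is assumed to carry the receive address of its own segment, and the connection event information is assumed to arrive after the first data packet of the task.

```python
from dataclasses import dataclass, field


@dataclass
class RecvTask:
    """State of one transmission task held in a storage resource (hypothetical layout)."""
    received_len: int = 0  # received data length, grows with every packet
    total_len: int = 0     # total task length, taken from the connection event information


@dataclass
class ReceiverNic:
    free_slots: int                                    # idle storage resources
    table: dict = field(default_factory=dict)          # (host_id, desc_id) -> RecvTask
    memory: bytearray = field(default_factory=lambda: bytearray(64))

    def on_data_packet(self, host_id, desc_id, recv_addr, payload):
        key = (host_id, desc_id)                       # S3201: combined index information
        task = self.table.get(key)
        if task is None:                               # S3203: no entry yet, allocate one
            if self.free_slots == 0:
                return False                           # left to the second receiving processing
            self.free_slots -= 1
            task = self.table[key] = RecvTask()
        task.received_len += len(payload)              # S3202 / S3203: add the carried amount
        # S3204: store the segment (assumes recv_addr is this segment's own address)
        self.memory[recv_addr:recv_addr + len(payload)] = payload
        return True

    def on_connection_event(self, host_id, desc_id, length):
        task = self.table.get((host_id, desc_id))
        if task is None:
            return False                               # assumption: event follows the first packet
        task.total_len += length                       # S3205: added once per task
        return True
```

In this sketch a task is finished once `received_len` equals `total_len`; the text does not spell out that comparison, so it is only one plausible use of the two fields.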
The receiving end host and the sending end host perform second receiving processing on the received data packet to complete a data transmission process, and the method comprises the following steps:
S3211, the received data packet is deleted from the receiving end memory and connection failure event information is generated; the receiving end network card sends the connection failure event information to the sending end network card; the connection failure event information notifies the sending end host that connection establishment has failed; the connection failure event information comprises the descriptor identification number in the received data packet;
S3212, the sending end network card generates connection application event information from the received connection failure event information and sends the connection application event information to the receiving end network card; the connection application event information comprises a descriptor identification number, a sending end host identification number, a length of the data to be transmitted and a connection identification number; the descriptor identification number can be obtained from the connection failure event information; the sending end host identification number, the length of the data to be transmitted and the connection identification number can be obtained from the connection resource information.
The connection identification number can be obtained from connection resource information obtained from the receiving end.
S3213, the receiving end network card receives the connection application event information and judges whether the receiving end network card has idle storage resources, obtaining a third judgment result; when the third judgment result is no, the judgment is repeated after waiting for a preset time; when the third judgment result is yes, the receiving end host and the sending end host perform third receiving processing on the received data packet, completing the data transmission process;
The receiving end host and the sending end host perform third receiving processing on the received data packet to complete the data transmission process, comprising the following steps:
S32131, the receiving end network card performs storage resource allocation and storage resource allocation information table update processing using the connection application event information;
S32132, the receiving end network card generates connection confirmation event information according to the connection application event information; the receiving end network card sends the connection confirmation event information to the sending end network card; the connection confirmation event information comprises the descriptor identification number in the connection application event information;
S32133, after receiving the connection confirmation event information, the sending end network card performs first sending processing to obtain data packets to be sent; the data packets to be sent are sent to the receiving end network card one by one;
S32134, after receiving a data packet to be sent, the receiving end network card updates the received data length in the receive transmission task descriptor stored in the receiving end network card according to the amount of data carried by the data packet to be sent;
S32135, the receiving end network card stores the data to be transmitted carried in the data packet to be sent into the receiving end memory according to the receiving address of the data to be transmitted in the data packet to be sent, completing the data transmission process.
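The message exchange of the second and third receiving processing (S3211 to S3213 and S32131 to S32132) can be sketched as below. It is an illustrative model only: the dictionary message layout, the function names and the `free_slots_over_time` list (one observation of the idle-resource count per preset waiting period) are assumptions made for the example, not part of the described method.

```python
def receiver_handle_data(free_slots, packet):
    """S3211: with no idle storage resource, drop the packet and build the failure event."""
    if free_slots > 0:
        return None  # first receiving processing applies instead
    return {"type": "conn_fail", "desc_id": packet["desc_id"]}


def sender_handle_conn_fail(nack, conn_info):
    """S3212: build the connection application from the failure event and local connection info."""
    return {
        "type": "conn_apply",
        "desc_id": nack["desc_id"],        # taken from the connection failure event
        "host_id": conn_info["host_id"],   # taken from the connection resource information
        "length": conn_info["length"],
        "conn_id": conn_info["conn_id"],
    }


def receiver_handle_conn_apply(free_slots_over_time, request):
    """S3213, S32131, S32132: re-check after each wait; confirm once a resource is idle."""
    for waits, free in enumerate(free_slots_over_time):
        if free > 0:
            return waits, {"type": "conn_ack", "desc_id": request["desc_id"]}
    return len(free_slots_over_time), None  # still no idle resource in the observed window
```

After the confirmation reaches the sending end, the data packets are generated and sent again (S32133), which is why this path costs at least one additional round trip compared with the case where the receiving end has idle resources.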
The receiving end network card performs storage resource allocation and storage resource allocation information table update processing using the connection application event information, comprising the following steps:
S321311, the receiving end network card allocates a corresponding storage resource for the descriptor identification number according to the descriptor identification number and the sending end host identification number in the connection application event information;
S321312, the transmission task descriptor data length stored in the receiving end network card is updated using the length of the data to be transmitted in the connection application event information;
S321313, the descriptor identification number and the sending end host identification number in the connection application event information are combined to obtain index information;
S321314, the index information and the information of the storage resource allocated for the descriptor identification number are added to the storage resource allocation information table;
After receiving the connection confirmation event information, the sending end network card performs first sending processing to obtain data packets to be sent, comprising:
S321331, after receiving the connection confirmation event information, the sending end network card acquires the data to be transmitted from the sending end memory according to the sending address of the data to be transmitted and the length of the data to be transmitted in the transmission task descriptor;
S321332, the sending end network card segments and packages the receiving address of the data to be transmitted, the sending end host identification number, the descriptor identification number and the data to be transmitted to obtain data packets to be sent; each data packet to be sent comprises the receiving address of the data to be transmitted, the sending end host identification number, the descriptor identification number and one segment of the data to be transmitted.
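The segmentation and packaging of S321331 and S321332 can be illustrated with the following sketch. The function name `build_packets`, the segment size parameter `mtu` and the choice to advance the receive address by the segment offset are assumptions for the example; the text only states which fields each data packet carries.

```python
def build_packets(memory, send_addr, recv_addr, length, host_id, desc_id, mtu=1024):
    """Read `length` bytes at `send_addr` and split them into per-segment packets."""
    data = bytes(memory[send_addr:send_addr + length])   # S321331: fetch from sender memory
    packets = []
    for offset in range(0, length, mtu):                 # S321332: segment and package
        packets.append({
            "recv_addr": recv_addr + offset,  # assumed: address of this segment at the receiver
            "host_id": host_id,               # sending end host identification number
            "desc_id": desc_id,               # descriptor identification number
            "payload": data[offset:offset + mtu],
        })
    return packets
```

Because every packet carries the host and descriptor identification numbers, the receiving end can rebuild the index information from any single packet without a prior connection handshake.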
In a second aspect of the embodiment of the present invention, a low-latency reliable data transmission device for a clustered network is disclosed, the device comprising:
a memory storing executable program code;
a processor coupled to the memory;
the processor invokes the executable program code stored in the memory to perform the low latency reliable data transfer method of the clustered network.
In a third aspect of embodiments of the present invention, a computer-readable medium is disclosed, storing computer instructions that, when invoked, are operable to perform a method of low-latency reliable data transmission for a clustered network.
In a fourth aspect of the embodiment of the present invention, an information data processing terminal is disclosed, where the information data processing terminal is configured to implement the low-latency reliable data transmission method of a clustered network.
Fig. 2 is a timing diagram of the processing flow of the method of the present invention when the receiving end does and does not have connection resources. If the receiving end has no connection resource, it discards the arriving data message, returns a NACK, and notifies the sending end to apply for a connection.
The performance of the method of the invention is tested as follows.
First, we use a small-scale network to verify the method of the present invention. The experiment employs a mesh topology with k=2, as shown in fig. 3. Node 0, node 1, node 3, and node 4 each send an RDMAW message of the same length to node 6. Node 6 informs the 4 sending nodes of its receive address before the RDMAW descriptors start. Fig. 4 compares the completion times of these 4 RDMAW descriptors. Here 32KB and 256KB refer to the length of the RDMAW message, "interval" means that the start times of the four sending nodes are separated by a certain interval, and "sametime" means that they start simultaneously.
In fig. 4, the difference between the RCP and the method of the present invention represents the reduction in descriptor completion time achieved by the method of the present invention. In a 2x2 mesh topology, node 0 and node 1 are 4 hops away from node 6, while node 3 and node 4 are only 3 hops away from node 6. In the "32KB-Interval" test, the descriptor completion times are reduced by 892ns and 670ns, respectively, which match the 4-hop and 3-hop RTT times (the one-hop link delay is 100ns according to the experimental configuration mentioned above, and the zero penetration delay of the router is also included). This shows that the method of the present invention does remove the one RTT otherwise required for setting up a connection. For the other three groups, either there is no time interval or the time interval is not long enough, so congestion exists in the network and the improvement fluctuates slightly around one RTT.
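As a rough check of these figures, the link-only part of the RTT can be computed from the stated 100ns one-hop delay. The sketch below assumes that a round trip traverses every hop once in each direction and that whatever remains of the measured saving is router delay; the per-router value is not given in the text, so the residual is reported rather than broken down further.

```python
LINK_DELAY_NS = 100  # one-hop link delay stated in the experimental configuration


def link_rtt_ns(hops):
    """Link-only round-trip time: each hop is crossed once per direction (assumption)."""
    return 2 * hops * LINK_DELAY_NS


# Reduction in descriptor completion time reported for the "32KB-Interval" test, keyed by hop count.
measured_saving_ns = {4: 892, 3: 670}

# Part of each saving not explained by link delay, attributed here to router delay.
residual_ns = {hops: saved - link_rtt_ns(hops) for hops, saved in measured_saving_ns.items()}

for hops, saved in measured_saving_ns.items():
    print(f"{hops} hops: link RTT {link_rtt_ns(hops)} ns, saving {saved} ns, "
          f"residual {residual_ns[hops]} ns")
```

Under these assumptions the link delay accounts for 800ns of the 892ns saving at 4 hops and 600ns of the 670ns saving at 3 hops, which is consistent with the saving being one RTT plus router delay.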
Next, the performance of the method in an endpoint congestion scenario is tested. The method of the present invention assumes that the receiving end has sufficient connection resources; if it does not, the method has to restart the connection after receiving the NACK, and in that case it is more costly than a reliable connection protocol. Endpoint congestion is the most extreme case, so we designed an endpoint congestion scenario to test the performance of the method. We increase the number of descriptors sent by the four sending nodes in the demo test to 10, still using the 2x2 mesh network. The number of processing units is the maximum number of descriptors that a single node can start simultaneously. The greater the number of processing units, the more severe the endpoint congestion, but in no case does it exceed the maximum number of connections that the receiving end can support. Fig. 5 shows the average completion time and total completion time of all descriptors when the network card has 2, 4 and 8 processing units, respectively. The method of the present invention behaves the same as a reliable connection protocol: because the receiving end has sufficient connection resources, with either the method of the present invention or the RCP almost all descriptors complete at the very end, and the data presented are almost identical.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.

Claims (10)

1. A method for low latency reliable data transmission in a clustered network, the method being implemented based on a clustered end-side host, the clustered end-side host comprising: a sending end host and a receiving end host; the sending end host comprises a sending end network card and a sending end memory; the receiving end host comprises a receiving end network card and a receiving end memory; the method comprises the following steps:
S1, acquiring a transmission task descriptor from the sending end memory by using the sending end network card; the transmission task descriptor comprises a sending address of data to be transmitted, a receiving address of the data to be transmitted, a length of the data to be transmitted and a descriptor identification number; the descriptor identification number is used for uniquely identifying the transmission task; the transmission task descriptor is used for describing transmission task information;
S2, processing the transmission task descriptor by using the sending end network card to obtain connection resource information and a data packet to be sent;
and S3, transmitting the connection resource information and the data packet to be sent by using the sending end host and the receiving end host, and completing the data transmission process.
2. The method for low-latency reliable data transmission in a clustered network according to claim 1, wherein said processing the transmission task descriptor by using the transmitting network card to obtain connection resource information and a data packet to be transmitted comprises:
S21, setting the transmission task descriptor by using the sending end network card to obtain corresponding connection resource information; the connection resource information comprises a sending end host identification number, a connection identification number and a receiving end host identification number; storing the connection identification number in the sending end network card; the sending end network card and the receiving end network card each comprise a plurality of connection resources; the connection identification number is used for indicating the number of the connection resource of the sending end network card used for the transmission task; the sending end host identification number is an identification number allocated by the cluster network to the sending end host;
S22, acquiring the data to be transmitted from the sending end memory by using the sending end network card according to the sending address of the data to be transmitted and the length of the data to be transmitted in the transmission task descriptor;
S23, carrying out segmented packaging processing on the receiving address of the data to be transmitted, the sending end host identification number, the descriptor identification number and the data to be transmitted by using the sending end network card to obtain data packets to be sent; each data packet to be sent is sent to the receiving end network card one by one; the data packet to be sent comprises the receiving address of the data to be transmitted, the sending end host identification number, the descriptor identification number and the segmented data to be transmitted.
3. The method for low-latency reliable data transmission in a clustered network according to claim 1, wherein said transmitting the connection resource information and the data packet to be sent by using the sending end host and the receiving end host, and completing the data transmission process, comprises:
S31, combining the sending end host identification number, the descriptor identification number, the length of the data to be transmitted and the connection identification number by using the sending end network card to obtain connection event information; transmitting the connection event information to the receiving end host;
S32, receiving the data packet to be sent by using the receiving end network card, and determining the data packet to be sent as a received data packet; judging whether the receiving end network card has idle storage resources to obtain a first judgment result; if the first judgment result is yes, the receiving end host performs first receiving processing on the received data packet to complete the data transmission process; if the first judgment result is no, the receiving end host and the sending end host perform second receiving processing on the received data packet to complete the data transmission process.
4. A method for low latency reliable data transmission in a clustered network as claimed in claim 3 wherein said receiving end host performs a first receiving process on said received data packet to complete a data transmission process comprising:
S3201, the receiving end network card performs combination processing on the sending end host identification number and the descriptor identification number in the received data packet to obtain index information; judging whether storage resource information of the index information exists in a storage resource allocation information table of the receiving end network card to obtain a second judgment result; the storage resource allocation information table stores all index information and the corresponding allocated storage resource information;
if the second judgment result is yes, executing S3202; if the second judgment result is no, executing S3203;
S3202, the receiving end network card updates the received data length in the receive transmission task descriptor stored in the receiving end network card according to the amount of data carried by the received data packet, and S3204 is executed;
S3203, the receiving end network card updates the received data length in the receive transmission task descriptor stored in the receiving end network card according to the amount of data carried by the received data packet; the receiving end network card allocates a corresponding storage resource for the descriptor identification number according to the descriptor identification number and the sending end host identification number in the index information; adding the index information and the information of the allocated storage resource to the storage resource allocation information table;
S3204, the receiving end network card stores the segmented data to be transmitted into the receiving end memory according to the receiving address of the data to be transmitted in the received data packet;
S3205, the receiving end network card determines the corresponding length of the data to be transmitted according to the descriptor identification number in the connection event information; updating the transmission task descriptor data length stored in the receiving end network card by using the length of the data to be transmitted; and completing the data transmission process.
5. A method for low latency reliable data transmission in a clustered network as claimed in claim 3 wherein said receiving end host and transmitting end host perform a second receiving process on said received data packets, completing a data transmission process comprising:
S3211, deleting the received data packet from the receiving end memory and generating connection failure event information; sending the connection failure event information to the sending end network card by using the receiving end network card; the connection failure event information is used for notifying the sending end host that connection establishment has failed; the connection failure event information comprises the descriptor identification number in the received data packet;
S3212, the sending end network card generates connection application event information by using the received connection failure event information, and sends the connection application event information to the receiving end network card; the connection application event information comprises a descriptor identification number, a sending end host identification number, a length of the data to be transmitted and a connection identification number;
S3213, the receiving end network card receives the connection application event information and judges whether the receiving end network card has idle storage resources to obtain a third judgment result; when the third judgment result is no, continuing to judge after waiting for a preset time; and when the third judgment result is yes, the receiving end host and the sending end host perform third receiving processing on the received data packet to complete the data transmission process.
6. The method for low-latency reliable data transmission in a clustered network of claim 5 wherein said receiving end host and said sending end host perform a third receiving process on said received data packet, completing a data transmission process, comprising:
S32131, the receiving end network card allocates a corresponding storage resource for the descriptor identification number according to the descriptor identification number and the sending end host identification number in the connection application event information; updating the transmission task descriptor data length stored in the receiving end network card by using the length of the data to be transmitted in the connection application event information; combining the descriptor identification number and the sending end host identification number in the connection application event information to obtain index information; adding the index information and the information of the storage resource allocated for the descriptor identification number to the storage resource allocation information table;
S32132, the receiving end network card generates connection confirmation event information according to the connection application event information; the receiving end network card sends the connection confirmation event information to the sending end network card; the connection confirmation event information comprises the descriptor identification number in the connection application event information;
S32133, after receiving the connection confirmation event information, the sending end network card performs first sending processing to obtain data packets to be sent; each data packet to be sent is sent to the receiving end network card one by one;
S32134, after receiving the data packet to be sent, the receiving end network card updates the received data length in the receive transmission task descriptor stored in the receiving end network card according to the amount of data carried by the data packet to be sent;
S32135, the receiving end network card stores the data to be transmitted in the data packet to be sent into the receiving end memory according to the receiving address of the data to be transmitted in the data packet to be sent, and completes the data transmission process.
7. The method for low-latency reliable data transmission in a clustered network of claim 6 wherein said sending network card performs a first sending process after receiving connection confirmation event information, and obtaining a data packet to be sent includes:
S321331, after receiving the connection confirmation event information, the sending end network card acquires the data to be transmitted from the sending end memory according to the sending address of the data to be transmitted and the length of the data to be transmitted in the transmission task descriptor;
S321332, carrying out segmented packaging processing on the receiving address of the data to be transmitted, the sending end host identification number, the descriptor identification number and the data to be transmitted by using the sending end network card to obtain data packets to be sent; the data packet to be sent comprises: the receiving address of the data to be transmitted, the sending end host identification number, the descriptor identification number and the segmented data to be transmitted.
8. A low latency reliable data transmission apparatus for a clustered network, the apparatus comprising:
a memory storing executable program code;
a processor coupled to the memory;
the processor invokes the executable program code stored in the memory to perform the low latency reliable data transfer method of the clustered network of any of claims 1-7.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions for performing the low-latency reliable data transmission method of a clustered network according to any of claims 1-7 when called.
10. An information data processing terminal, characterized in that the information data processing terminal is adapted to implement the low-latency reliable data transmission method of a clustered network according to any of claims 1-7.
CN202311656018.2A 2023-12-05 2023-12-05 Method and device for low-delay reliable data transmission of cluster network Pending CN117640530A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311656018.2A CN117640530A (en) 2023-12-05 2023-12-05 Method and device for low-delay reliable data transmission of cluster network

Publications (1)

Publication Number Publication Date
CN117640530A true CN117640530A (en) 2024-03-01

Family

ID=90023126

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311656018.2A Pending CN117640530A (en) 2023-12-05 2023-12-05 Method and device for low-delay reliable data transmission of cluster network

Country Status (1)

Country Link
CN (1) CN117640530A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination