Disclosure of Invention
The technical problem to be solved by the invention is to provide a data transmission method and system based on RDMA multicast which, by designing a data transmission mode based on an RDMA multicast framework, can save bandwidth and accelerate data transmission.
In order to solve the technical problem, the first aspect of the present invention discloses a data transmission method based on RDMA multicast, the method is applied to an RDMA multicast system, and the RDMA multicast system comprises a switch, a sending terminal, a first receiving terminal and a second receiving terminal; the sending terminal establishes bidirectional connection with the first receiving terminal, and the sending terminal establishes unidirectional connection with the second receiving terminal; the method comprises the following steps:
the sending terminal sends a data message to the switch;
the switch judges, according to pre-stored first terminal network information of the first receiving terminal, whether target terminal network information in the data message matches the first terminal network information, to obtain a first judgment result;
if the first judgment result is yes, the switch forwards the data message to the first receiving terminal and the second receiving terminal.
As an optional implementation manner, in the first aspect of the present invention, before the sending terminal sends the data packet to the switch, the method further includes:
the sending terminal sends its QP information to the first receiving terminal and the second receiving terminal;
each second receiving terminal binds its own QP with the QP information of the sending terminal, so as to establish a unidirectional connection with the sending terminal;
the first receiving terminal binds its QP with the QP information of the sending terminal, and sends its own QP information to the sending terminal;
the sending terminal receives QP information of the first receiving terminal, and binds the QP of the sending terminal with QP information of the first receiving terminal so as to establish bidirectional connection between the sending terminal and the first receiving terminal.
As an optional implementation manner, in the first aspect of the present invention, after the sending terminal receives the QP information of the first receiving terminal and binds the QP of the sending terminal with the QP information of the first receiving terminal to establish a bidirectional connection between the sending terminal and the first receiving terminal, and before the sending terminal sends a data packet to the switch, the method further includes:
the first receiving terminal and the second receiving terminal send terminal network information to the switch; the terminal network information comprises second terminal network information of the second receiving terminal and first terminal network information of the first receiving terminal;
the switch creates a multicast member table according to the terminal network information.
As an optional implementation manner, in the first aspect of the present invention, the multicast member table includes an exact table and a linear table; the switch creates a multicast member table according to the terminal network information, and the method comprises the following steps:
the switch creates the exact table according to the first terminal network information of the first receiving terminal, wherein the exact table is used for judging whether the target terminal network information in the data message matches the first terminal network information of the first receiving terminal;
the switch creates the linear table according to the first terminal network information of the first receiving terminal and the second terminal network information of all the second receiving terminals, and links the linear table to the exact table, wherein the linear table is used for copying and editing data messages.
As an optional implementation manner, in the first aspect of the present invention, the forwarding, by the switch, the data packet to the first receiving terminal and the second receiving terminal includes:
the switch replicates the data message by traversing the linear table to obtain a plurality of replicated messages, the number of which equals the sum of the numbers of the first receiving terminal and the second receiving terminal;
the switch modifies the target terminal network information in the replicated messages into the first terminal network information of the first receiving terminal and the second terminal network information of each of the second receiving terminals, respectively, to obtain a plurality of modified replicated messages;
the switch forwards the modified replicated messages to the first receiving terminal and the second receiving terminal, respectively.
As an optional implementation manner, in the first aspect of the present invention, the first terminal network information includes at least one of an IP address, QP information, RDMA transmission identification information, and device identification information of the first receiving terminal;
and/or the second terminal network information includes at least one of an IP address, QP information, RDMA transmission identification information and device identification information of the second receiving terminal;
and/or the target terminal network information includes at least one of an IP address, QP information, RDMA transmission identification information and device identification information of a target terminal of the data message.
As an optional implementation manner, in the first aspect of the present invention, after the switch forwards the data packet to the first receiving terminal and the second receiving terminal, the method further includes:
The first receiving terminal and all the second receiving terminals send respective ACK messages corresponding to the data messages to the switch;
for any one of the ACK messages, the switch judges, according to the pre-stored first terminal network information, whether the source terminal network information in the ACK message matches the terminal network information of the corresponding receiving terminal, to obtain a second judgment result;
if the second judgment result is yes, adding 1 to the ACK count value of the switch;
when the ACK count value is judged to be equal to the sum of the numbers of the first receiving terminal and all the second receiving terminals, the switch determines one of the plurality of ACK messages as a corrected ACK message, changes the source terminal network information of the corrected ACK message into first terminal network information of the first receiving terminal, and sends the corrected ACK message to the sending terminal.
As an optional implementation manner, in the first aspect of the present invention, the source terminal network information includes at least one of an IP address, QP information, RDMA transmission identification information, and device identification information of a source terminal of the ACK packet.
As an optional implementation manner, in the first aspect of the present invention, after the switch forwards the data packet to the first receiving terminal and the second receiving terminal, the method further includes:
when a timeout retransmission condition is met, the sending terminal triggers a timeout retransmission;
wherein the timeout retransmission condition includes: within a preset time period after the switch forwards the data message to the first receiving terminal and the second receiving terminal, the sending terminal has not received the ACK message corresponding to the data message.
The second aspect of the invention discloses a data transmission system based on RDMA multicast, which comprises a switch, a sending terminal, a first receiving terminal and a second receiving terminal; the sending terminal establishes bidirectional connection with the first receiving terminal, and the sending terminal establishes unidirectional connection with the second receiving terminal; wherein:
The sending terminal is used for sending a data message to the switch;
The switch is used for judging whether target terminal network information in the data message is matched with the first terminal network information according to the prestored first terminal network information of the first receiving terminal, so as to obtain a first judgment result;
and the switch is further used for forwarding the data message to the first receiving terminal and the second receiving terminal when the first judging result is yes.
As an alternative embodiment, in a second aspect of the invention,
The sending terminal is further configured to send QP information thereof to the first receiving terminal and the second receiving terminal;
The second receiving terminal is used for binding QPs corresponding to the second receiving terminal with QP information of the sending terminal so as to establish unidirectional connection with the sending terminal;
The first receiving terminal is used for binding the QP of the first receiving terminal with the QP information of the sending terminal, and sending the QP information of the first receiving terminal to the sending terminal;
The sending terminal is further configured to receive QP information of the first receiving terminal, and bind the QP of the sending terminal with the QP information of the first receiving terminal, so as to establish bidirectional connection between the sending terminal and the first receiving terminal.
As an optional implementation manner, in the second aspect of the present invention, the first receiving terminal and the second receiving terminal are further configured to send terminal network information to the switch; the terminal network information comprises second terminal network information of the second receiving terminal and first terminal network information of the first receiving terminal;
The switch is also used for creating a multicast member table according to the terminal network information.
As an alternative embodiment, in the second aspect of the present invention, the multicast member table includes an exact table and a linear table; the specific mode of the switch for creating the multicast member table according to the terminal network information comprises the following steps:
The switch creates the exact table according to the first terminal network information of the first receiving terminal, wherein the exact table is used for judging whether the target terminal network information in the data message matches the first terminal network information of the first receiving terminal;
The switch creates the linear table according to the first terminal network information of the first receiving terminal and the second terminal network information of all the second receiving terminals, and links the linear table to the exact table, wherein the linear table is used for copying and editing data messages.
As an optional implementation manner, in the second aspect of the present invention, the switch is further configured to forward the data packet to the first receiving terminal and the second receiving terminal, including:
The switch is further configured to replicate the data message by traversing the linear table, to obtain a plurality of replicated messages, the number of which equals the sum of the numbers of the first receiving terminal and the second receiving terminal;
The switch is further configured to modify the target terminal network information in the replicated messages into the first terminal network information of the first receiving terminal and the second terminal network information of each of the second receiving terminals, respectively, so as to obtain a plurality of modified replicated messages;
The switch is further configured to forward the modified replicated messages to the first receiving terminal and the second receiving terminal, respectively.
As an optional implementation manner, in the second aspect of the present invention, the first terminal network information includes at least one of an IP address, QP information, RDMA transmission identification information, and device identification information of the first receiving terminal;
and/or the second terminal network information includes at least one of an IP address, QP information, RDMA transmission identification information and device identification information of the second receiving terminal;
and/or the target terminal network information includes at least one of an IP address, QP information, RDMA transmission identification information and device identification information of a target terminal of the data message.
As an alternative embodiment, in a second aspect of the invention,
The first receiving terminal and all the second receiving terminals are used for sending respective ACK messages corresponding to the data messages to the switch;
The switch is further configured to determine, for any one of the ACK messages, whether source terminal network information in the ACK message matches with terminal network information of a corresponding receiving terminal according to the first terminal network information stored in advance, to obtain a second determination result, and if the second determination result is yes, trigger an ACK count value to be increased by 1;
The switch is further configured to determine one of the plurality of ACK messages as a corrected ACK message when the ACK count value is determined to be equal to a sum of the numbers of the first receiving terminal and all the second receiving terminals, change source terminal network information of the corrected ACK message to first terminal network information of the first receiving terminal, and send the corrected ACK message to the transmitting terminal.
As an optional implementation manner, in the second aspect of the present invention, the source terminal network information includes at least one of an IP address, QP information, RDMA transmission identification information, and device identification information of a source terminal of the ACK packet.
As an optional implementation manner, in the second aspect of the present invention, the sending terminal is further configured to trigger a timeout retransmission if a timeout retransmission condition is met;
Wherein the timeout retransmission condition includes: within a preset time period after the switch forwards the data message to the first receiving terminal and the second receiving terminal, the sending terminal has not received the ACK message corresponding to the data message.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
In the embodiment of the invention, a sending terminal sends a data message to a switch, and the switch judges, according to pre-stored first terminal network information of a first receiving terminal, whether target terminal network information in the data message matches the first terminal network information, so that a first judgment result is obtained; if the first judgment result is yes, the switch forwards the data message to the first receiving terminal and the second receiving terminal. Therefore, by designing a data transmission mode based on the RDMA multicast framework and relying on the binding relation between the sender and the receivers under that framework, the data sender only needs to send one data message during data transmission, and the message is copied and forwarded to a plurality of receivers using the replication capability of the switch, so that network bandwidth can be saved and data transmission can be accelerated.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of protection of the invention.
The terms "first", "second" and the like in the description, in the claims and in the above-described figures are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, apparatus, product, or device that comprises a list of steps or elements is not limited to only those listed but may optionally include other steps or elements not listed or inherent to such process, method, product, or device.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the invention. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by those skilled in the art that the embodiments described herein may be combined with other embodiments.
The invention discloses a data transmission method and system based on RDMA multicast, which can save network bandwidth and accelerate data transmission. Details are described below.
Embodiment 1
Referring to fig. 1, fig. 1 is a flow chart of a data transmission method based on RDMA multicast according to an embodiment of the present invention. The method is applied to an RDMA multicast system, and the RDMA multicast system comprises a switch, a sending terminal, a first receiving terminal and a second receiving terminal; the sending terminal establishes a bidirectional connection with the first receiving terminal, and the sending terminal establishes a unidirectional connection with the second receiving terminal. As shown in fig. 1, the data transmission method based on RDMA multicast may include the following operations:
101. The sending terminal sends the data message to the switch.
Here, the operation of the sending terminal sending the data message may occur in various application scenarios, such as offline searching, big data processing, high performance computing, distributed storage, and the like.
Optionally, the switch is one of a top-of-rack (TOR) switch, an end-of-row (EOR) switch and a middle-of-row (MOR) switch. Preferably, the switch is a TOR switch.
102. The switch judges, according to the pre-stored first terminal network information of the first receiving terminal, whether the target terminal network information in the data message matches the first terminal network information, and a first judgment result is obtained.
Optionally, the first terminal network information includes at least one of IP address, QP information, RDMA transmission identification information, and device identification information of the first receiving terminal; the target terminal network information includes at least one of IP address, QP information, RDMA transfer identification information, and device identification information of a target transfer terminal of the data packet.
103. If the first judgment result is yes, the switch forwards the data message to the first receiving terminal and the second receiving terminal.
Therefore, by implementing the embodiment of the invention, a data transmission mode based on the RDMA multicast framework is designed; relying on the binding relation between the sender and the receivers under that framework, the data sender only needs to send one data message during data transmission, and the message is copied and forwarded to a plurality of receivers using the replication capability of the switch, so that network bandwidth can be saved and data transmission can be accelerated.
In an alternative embodiment, before the step 101, the data transmission method based on RDMA multicast further includes:
The sending terminal sends its QP information to the first receiving terminal and the second receiving terminal;
Each second receiving terminal binds its own QP with the QP information of the sending terminal, so as to establish a unidirectional connection with the sending terminal;
The first receiving terminal binds its QP with the QP information of the sending terminal, and sends its own QP information to the sending terminal;
The sending terminal receives the QP information of the first receiving terminal, and binds the QP of the sending terminal with the QP information of the first receiving terminal, so as to establish a bidirectional connection between the sending terminal and the first receiving terminal.
Here, the QP in this alternative embodiment is briefly explained: RDMA uses three kinds of queues, namely a Send Queue (SQ), a Receive Queue (RQ) and a Completion Queue (CQ); an SQ and an RQ are typically created as a pair, referred to as a Queue Pair (QP), which is the QP in this alternative embodiment.
Optionally, the QP information includes identification information of the QP required for the RDMA connection.
It can be seen that this optional embodiment gives a specific way for the sending terminal to establish a bidirectional connection with the first receiving terminal and for the second receiving terminal to establish a unidirectional connection with the sending terminal. By implementing this optional embodiment, the bidirectional connection between the sending terminal and the first receiving terminal and the unidirectional connection between the sending terminal and the second receiving terminal are completed, so that the sending terminal subsequently only needs to send the data message once to implement RDMA data transmission with both the first receiving terminal and the second receiving terminal.
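By way of non-limiting illustration, the binding sequence described above can be sketched as follows. The `QueuePair` class, its fields and the addresses used are hypothetical stand-ins introduced only for this sketch; they model identification information only and do not correspond to any particular RDMA library.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class QueuePair:
    """Hypothetical stand-in for an RDMA QP; only identification is modelled."""
    ip: str
    qp_num: int
    remote: Optional[Tuple[str, int]] = None  # (ip, qp_num) of the bound peer

    def info(self) -> Tuple[str, int]:
        return (self.ip, self.qp_num)

    def bind(self, remote_info: Tuple[str, int]) -> None:
        self.remote = remote_info


def establish_connections(sender: QueuePair, first_rx: QueuePair,
                          second_rxs: List[QueuePair]) -> None:
    # The sending terminal distributes its QP information to every receiver.
    sender_info = sender.info()
    # Each second receiving terminal binds to the sender only (unidirectional).
    for rx in second_rxs:
        rx.bind(sender_info)
    # The first receiving terminal binds to the sender and returns its own QP info.
    first_rx.bind(sender_info)
    # The sender binds to the first receiving terminal (bidirectional).
    sender.bind(first_rx.info())


# Example addresses and QP numbers are arbitrary illustration values.
sender = QueuePair("10.0.0.1", 100)
first_rx = QueuePair("10.0.0.2", 200)
second_rxs = [QueuePair("10.0.0.3", 300), QueuePair("10.0.0.4", 400)]
establish_connections(sender, first_rx, second_rxs)
```

In this sketch the sending terminal holds a binding to the first receiving terminal only, which is why a single data message addressed to the first receiving terminal suffices on the sending side.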
In an alternative embodiment, after the sending terminal receives the QP information of the first receiving terminal and binds the QP of the sending terminal with the QP information of the first receiving terminal to establish a bidirectional connection between the sending terminal and the first receiving terminal, and before the sending terminal sends the data packet to the switch, the method further includes:
The first receiving terminal and the second receiving terminal send terminal network information to the switch;
The switch creates a multicast member table according to the terminal network information.
Optionally, the terminal network information includes second terminal network information of the second receiving terminal and first terminal network information of the first receiving terminal.
It can be seen that by implementing this optional embodiment, the switch may create the multicast member table in advance according to the network information of the first receiving terminal and the second receiving terminal, so that after receiving the data packet of the sending terminal, the switch may copy the packet according to the members of the multicast member table and forward the packet to all the members of the multicast member table, thereby implementing one-to-many data transmission.
In an alternative embodiment, the multicast member table includes an exact table and a linear table; the switch creates a multicast member table according to the terminal network information, including:
The switch creates an exact table according to the first terminal network information of the first receiving terminal, wherein the exact table is used for judging whether the target terminal network information in the data message matches the first terminal network information of the first receiving terminal;
The switch creates a linear table according to the first terminal network information of the first receiving terminal and the second terminal network information of all the second receiving terminals, wherein the linear table is used for copying and editing the data message, and the linear table is linked to the exact table.
Optionally, before the switch creates the multicast member table according to the terminal network information, the IGMP (Internet Group Management Protocol) multicast protocol is enabled, wherein the IGMP multicast protocol is one of the IGMPv1, IGMPv2 and IGMPv3 multicast protocols. By this arrangement, multicast membership is established and maintained between the receiving terminal and its directly adjacent multicast router.
Preferably, the switch enables IGMPv3 multicast protocol before creating the multicast member table according to the terminal network information.
Optionally, the first terminal network information includes at least one of IP address, QP information, RDMA transfer identity information, and device identity information of the first receiving terminal. Preferably, the first terminal network information must include QP information for the first receiving terminal and at least one of IP address, RDMA transfer identity information, and device identity information. By the arrangement, the switch can judge whether the data message is matched with the first receiving terminal or not more accurately.
Optionally, the second terminal network information includes at least one of IP address, QP information, RDMA transfer identity information, and device identity information of the second receiving terminal. Preferably, the second terminal network information must include QP information for the second receiving terminal and at least one of an IP address, RDMA transfer identification information, and device identification information. By the arrangement, the accuracy of copying and editing the data message by the linear table can be ensured.
It can be seen that this optional embodiment gives a specific way for the switch to create the multicast member table according to the first terminal network information and the second terminal network information, so that the created multicast member table can be used for copy editing and forwarding of the data packet.
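As a minimal sketch of one possible data layout, assuming that the (IP address, QP number) pair is used as the key, the exact table and the linked linear table could be modelled as below. The names `MulticastMemberTable`, `create_member_table` and `lookup` are hypothetical and are not taken from any switch SDK.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, int]  # (IP address, QP number): an assumed choice of key fields


@dataclass
class MulticastMemberTable:
    # Exact table: first receiving terminal's key -> index of its linear table.
    exact: Dict[Key, int] = field(default_factory=dict)
    # Linear tables: each one lists every member that must receive a copy.
    linear: List[List[Key]] = field(default_factory=list)


def create_member_table(first_rx: Key, second_rxs: List[Key]) -> MulticastMemberTable:
    table = MulticastMemberTable()
    # The linear table holds the first receiver followed by all second receivers.
    table.linear.append([first_rx, *second_rxs])
    # Link the exact-table entry to the linear table just created.
    table.exact[first_rx] = len(table.linear) - 1
    return table


def lookup(table: MulticastMemberTable, target: Key) -> Optional[List[Key]]:
    """Return the linked linear table if the target matches, otherwise None."""
    idx = table.exact.get(target)
    return None if idx is None else table.linear[idx]


table = create_member_table(("10.0.0.2", 200),
                            [("10.0.0.3", 300), ("10.0.0.4", 400)])
```

Keeping only the first receiving terminal in the exact table reflects the description above: the sending terminal addresses its message to the first receiving terminal, and the remaining members are reached through the linked linear table.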
In an alternative embodiment, the switch forwards the data message to the first receiving terminal and the second receiving terminal, including:
The switch replicates the data message by traversing the linear table to obtain a plurality of replicated messages, the number of which equals the sum of the numbers of the first receiving terminal and the second receiving terminal;
The switch modifies the target terminal network information in the replicated messages into the first terminal network information of the first receiving terminal and the second terminal network information of each of the second receiving terminals, respectively, to obtain a plurality of modified replicated messages;
The switch forwards the modified replicated messages to the first receiving terminal and the second receiving terminal, respectively.
It can be seen that this alternative embodiment gives a specific way for the switch to forward data messages to the first receiving terminal and the second receiving terminal, thus enabling a one-to-many RDMA data transfer.
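The replication and editing steps above can be illustrated with the following sketch, which assumes that the target terminal network information consists of a destination IP address and a destination QP number. `DataMessage` and `forward` are hypothetical names used for illustration only.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Key = Tuple[str, int]  # assumed key: (IP address, QP number)


@dataclass(frozen=True)
class DataMessage:
    dst_ip: str
    dst_qp: int
    payload: bytes


def forward(msg: DataMessage, first_rx: Key,
            linear_table: List[Key]) -> List[DataMessage]:
    # First judgment: only a message addressed to the first receiving
    # terminal is replicated; anything else is not handled in this sketch.
    if (msg.dst_ip, msg.dst_qp) != first_rx:
        return []
    # Traverse the linear table: one copy per member, with the target
    # network information rewritten to that member's own information.
    return [replace(msg, dst_ip=ip, dst_qp=qp) for ip, qp in linear_table]


members = [("10.0.0.2", 200), ("10.0.0.3", 300), ("10.0.0.4", 400)]
copies = forward(DataMessage("10.0.0.2", 200, b"hello"), members[0], members)
```

The number of copies equals the length of the linear table, that is, the number of first and second receiving terminals, while the payload is left unchanged.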
In an optional embodiment, after the step 103, the data transmission method based on RDMA multicast further includes:
When a timeout retransmission condition is met, the sending terminal triggers a timeout retransmission.
Wherein the timeout retransmission condition includes: within a preset time period after the switch forwards the data message to the first receiving terminal and the second receiving terminal, the sending terminal has not received the ACK message corresponding to the data message.
It can be seen that by implementing this alternative embodiment, lossless RDMA data transmission can be guaranteed in the event of packet loss by triggering retransmission.
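A minimal sketch of the timeout check on the sending side is given below. It assumes, as a simplification, that the preset time period is measured from the moment the sending terminal last sent the message, since the sending terminal cannot directly observe when the switch forwards it. `PendingMessage` and `poll_retransmit` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PendingMessage:
    payload: bytes
    sent_at: float        # time of the most recent (re)transmission
    retries: int = 0
    acked: bool = False   # set once the corresponding ACK message arrives


def poll_retransmit(pending: PendingMessage, now: float, timeout: float) -> bool:
    """Return True if the message must be retransmitted at time `now`."""
    if pending.acked:
        return False              # ACK received: nothing to do
    if now - pending.sent_at < timeout:
        return False              # preset time period has not yet elapsed
    pending.sent_at = now         # restart the timer for the retransmission
    pending.retries += 1
    return True


msg = PendingMessage(b"data", sent_at=0.0)
early = poll_retransmit(msg, now=0.5, timeout=1.0)
late = poll_retransmit(msg, now=1.0, timeout=1.0)
```

Because the sending terminal waits for a single ACK message, the loss of the data message, or of any one copy of it, leads to the same retransmission path.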
Embodiment 2
Referring to fig. 2, fig. 2 is a flow chart of a data transmission method based on RDMA multicast according to an embodiment of the present invention. The method is applied to an RDMA multicast system, and the RDMA multicast system comprises a switch, a sending terminal, a first receiving terminal and a second receiving terminal; the sending terminal establishes a bidirectional connection with the first receiving terminal, and the sending terminal establishes a unidirectional connection with the second receiving terminal. As shown in fig. 2, the RDMA multicast-based data transmission method may include the following operations:
201. The sending terminal sends the data message to the switch.
202. The switch judges, according to the pre-stored first terminal network information of the first receiving terminal, whether the target terminal network information in the data message matches the first terminal network information, and a first judgment result is obtained.
203. If the first judgment result is yes, the switch forwards the data message to the first receiving terminal and the second receiving terminal.
204. The first receiving terminal and all the second receiving terminals send their respective ACK messages corresponding to the data message to the switch.
205. For any ACK message, the switch judges, according to the pre-stored first terminal network information, whether the source terminal network information in the ACK message matches the terminal network information of the corresponding receiving terminal, and a second judgment result is obtained.
206. If the second judgment result is yes, the ACK count value of the switch is increased by 1.
207. When the ACK count value is judged to be equal to the sum of the numbers of the first receiving terminal and all the second receiving terminals, the switch determines one of the plurality of ACK messages as a corrected ACK message, changes the source terminal network information of the corrected ACK message into first terminal network information of the first receiving terminal, and sends the corrected ACK message to the sending terminal.
For the specific technical details and explanations of terms in steps 201 to 203, reference may be made to the descriptions of steps 101 to 103 in the first embodiment, which are not repeated here.
Optionally, the source terminal network information includes at least one of an IP address, QP information, RDMA transmission identification information, and device identification information of the source terminal of the ACK message. Preferably, the source terminal network information must include the QP information of the source terminal of the ACK message, and at least one of the IP address, RDMA transmission identification information, and device identification information. By this arrangement, the data transmission process can be made more reliable.
In a specific embodiment of the RDMA multicast-based data transmission method described in the present invention, the first terminal network information includes the QP information and IP address of the first receiving terminal, and the second terminal network information includes the QP information and IP address of the second receiving terminal. In the forward process of data transmission from the sending terminal to the receiving terminals, as shown in fig. 4, the IP field of the network layer and the QP Number field in the RDMA header are selected as the keys of the exact table. First, the switch parses the IP field of the network layer and the QP Number field of the RDMA header in the message sent by the sending terminal and performs matching; after confirming that the QP information and the IP address match, the switch finds the corresponding linear table. The linear table is then traversed when the data message is copied and edited.
As further shown in fig. 5, in the reverse process of transmitting the ACK acknowledgement message, the switch performs an exact table lookup according to the QP Number and IP fields in the ACK message and, after confirming that the QP information and the IP address match, obtains the QP information and the IP address of the first receiving terminal from the linear table, so as to send the corrected ACK message to the sending terminal.
It can be seen that, after receiving the data message sent by the sending terminal, each receiving terminal returns to the switch its own acknowledgement message indicating that the data message was received correctly, and the switch edits one of the acknowledgement messages according to the multicast member table and sends it to the sending terminal, so that the sending terminal can confirm lossless transmission of the data without receiving an acknowledgement message from every receiving terminal.
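The ACK aggregation of step 207 can be illustrated with a minimal Python sketch. It is not the switch implementation of the invention: the names `Endpoint`, `Ack`, `AckAggregator` and the `psn` field are hypothetical, terminal network information is assumed to be an (IP address, QP number) pair, and a per-message set is assumed in place of a bare counter so that a duplicated ACK is not counted twice.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, Optional, Set


@dataclass(frozen=True)
class Endpoint:
    """Terminal network information, assumed here to be IP address plus QP number."""
    ip: str
    qpn: int


@dataclass(frozen=True)
class Ack:
    src: Endpoint  # source terminal network information of the ACK message
    dst: Endpoint  # the sending terminal
    psn: int       # hypothetical identifier of the acknowledged data message


class AckAggregator:
    """Counts ACK messages per data message and emits one corrected ACK."""

    def __init__(self, first_receiver: Endpoint, second_receivers: Iterable[Endpoint]):
        self.first = first_receiver
        self.members: Set[Endpoint] = {first_receiver, *second_receivers}
        self.seen: Dict[int, Set[Endpoint]] = {}

    def on_ack(self, ack: Ack) -> Optional[Ack]:
        # Second judgment: does the ACK source match a stored receiving terminal?
        if ack.src not in self.members:
            return None
        seen = self.seen.setdefault(ack.psn, set())
        seen.add(ack.src)  # the size of this set plays the role of the ACK count value
        if len(seen) < len(self.members):
            return None
        del self.seen[ack.psn]
        # Corrected ACK: the source is rewritten to the first receiving terminal,
        # the only terminal with which the sending terminal has a bidirectional connection.
        return replace(ack, src=self.first)
```

In this sketch `on_ack` returns `None` until every member of the group has acknowledged the message, and only then returns the single corrected ACK to be sent to the sending terminal.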
Example III
Referring to fig. 3, fig. 3 is a schematic structural diagram of a data transmission system based on RDMA multicast according to an embodiment of the present invention. The data transmission system comprises a switch, a sending terminal, a first receiving terminal and a second receiving terminal; the sending terminal establishes a bidirectional connection with the first receiving terminal, and the sending terminal establishes a unidirectional connection with the second receiving terminal. The RDMA multicast data transmission system is as shown in FIG. 3:
the sending terminal is used for sending a data message to the switch;
The switch is used for judging whether target terminal network information in the data message is matched with the first terminal network information according to the first terminal network information of the first receiving terminal stored in advance, and obtaining a first judging result;
and the switch is also used for forwarding the data message to the first receiving terminal and the second receiving terminal when the first judging result is yes.
It can be seen that, by implementing the embodiment shown in fig. 3, through the cooperation of the switch, the sending terminal, the first receiving terminal and the second receiving terminal, according to the binding relationship between the sender and the receiver under the RDMA multicast frame, the data sender only needs to send one data message in the data transmission process, and the message is copied and forwarded to multiple receivers by using the copying capability of the switch, so that the network bandwidth can be saved and the data transmission can be quickened.
In an alternative embodiment, the transmitting terminal is further configured to transmit its QP information to the first receiving terminal and the second receiving terminal;
the second receiving terminal is used for binding QP information of the sending terminal with QP information corresponding to the second receiving terminal so as to establish unidirectional connection with the sending terminal;
the first receiving terminal is used for binding QP information of the first receiving terminal with QP information of the sending terminal and sending the QP information of the first receiving terminal to the sending terminal;
The sending terminal is further configured to receive QP information of the first receiving terminal, and bind the QP of the sending terminal with the QP information of the first receiving terminal, so as to establish bidirectional connection between the sending terminal and the first receiving terminal.
It can be seen that this optional embodiment gives a specific way for the sending terminal to establish a bidirectional connection with the first receiving terminal and for the second receiving terminal to establish a unidirectional connection with the sending terminal. By implementing this optional embodiment, the bidirectional connection between the sending terminal and the first receiving terminal and the unidirectional connection between the sending terminal and the second receiving terminal are completed, so that the sending terminal subsequently only needs to send the data message once to implement RDMA data transmission with the first receiving terminal and the second receiving terminal.
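The QP binding sequence described above can be modelled with a short Python sketch. It is a toy model under stated assumptions, not real RDMA verbs code: `QPInfo`, `Terminal`, `bind` and `connect_group` are hypothetical names, and "binding" is reduced to recording the remote QP information.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class QPInfo:
    ip: str
    qpn: int


class Terminal:
    def __init__(self, ip: str, qpn: int):
        self.qp = QPInfo(ip, qpn)
        self.peer: Optional[QPInfo] = None  # remote QP this terminal's QP is bound to

    def bind(self, remote: QPInfo) -> None:
        # Stand-in for the real binding operation of the RDMA stack.
        self.peer = remote


def connect_group(sender: Terminal, first: Terminal, seconds: Iterable[Terminal]) -> None:
    # 1. The sending terminal transmits its QP information to every receiving
    #    terminal, and each receiving terminal binds its own QP to it.
    for receiver in (first, *seconds):
        receiver.bind(sender.qp)
    # 2. Only the first receiving terminal returns its QP information, so only
    #    that connection becomes bidirectional.
    sender.bind(first.qp)
```

After `connect_group` runs, every receiving terminal is bound to the sending terminal, while the sending terminal is bound only to the first receiving terminal; the second receiving terminals are never known to the sending terminal.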
In an alternative embodiment, the first receiving terminal and the second receiving terminal are further configured to send terminal network information to the switch; the terminal network information comprises second terminal network information of a second receiving terminal and first terminal network information of a first receiving terminal;
The switch is also configured to create a multicast member table based on the terminal network information.
It can be seen that by implementing this optional embodiment, the switch may create the multicast member table in advance according to the network information of the first receiving terminal and the second receiving terminal, so that after receiving the data packet of the sending terminal, the switch may copy the packet according to the members of the multicast member table and forward the packet to all the members of the multicast member table, thereby implementing one-to-many data transmission.
In an alternative embodiment, the multicast member table includes an exact table and a linear table; the specific mode of the switch for creating the multicast member table according to the terminal network information comprises the following steps:
The switch creates an exact table according to the first terminal network information of the first receiving terminal, wherein the exact table is used for judging whether the target terminal network information in the data message matches the first terminal network information of the first receiving terminal;
The switch creates a linear table according to the first terminal network information of the first receiving terminal and the second terminal network information of all the second receiving terminals, wherein the linear table is used for copying and editing the data message, and the linear table is linked to the exact table.
It can be seen that this optional embodiment gives a specific way for the switch to create the multicast member table according to the first terminal network information and the second terminal network information, so that the created multicast member table can be used for copy editing and forwarding of the data packet.
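As an illustration of the two-part multicast member table, the following Python sketch models the exact table as a dictionary keyed on the first receiving terminal's network information, with the linear table as the value it links to. The names `TerminalInfo`, `MulticastMemberTable` and `lookup` are hypothetical, and terminal network information is assumed to be an IP address plus a QP number; a hardware switch would hold these tables in its forwarding pipeline rather than in Python objects.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class TerminalInfo:
    ip: str
    qpn: int


class MulticastMemberTable:
    def __init__(self, first: TerminalInfo, seconds: Iterable[TerminalInfo]):
        # Linear table: every member that must receive a copy of the data message.
        self.linear: List[TerminalInfo] = [first, *seconds]
        # Exact table: keyed only on the first receiving terminal, linked to the linear table.
        self.exact: Dict[Tuple[str, int], List[TerminalInfo]] = {
            (first.ip, first.qpn): self.linear
        }

    def lookup(self, dst_ip: str, dst_qpn: int) -> Optional[List[TerminalInfo]]:
        """Return the linked linear table on an exact match, otherwise None."""
        return self.exact.get((dst_ip, dst_qpn))
```

A data message addressed to the first receiving terminal hits the exact table and yields the full member list; a message addressed to any other terminal misses and, in this sketch, would fall through to ordinary forwarding.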
In an alternative embodiment, the switch is further configured to replicate the data packet by traversing the linear table to obtain a plurality of replicated packets having a number equal to a sum of the number of the first receiving terminal and the number of the second receiving terminal;
The switch is further configured to correspondingly modify target terminal network information in the multiple replication messages into first terminal network information of the first receiving terminal and second terminal network information corresponding to all the second receiving terminals, so as to obtain multiple modified replication messages;
The switch is further configured to forward the plurality of modified duplicate messages to the first receiving terminal and the second receiving terminal, respectively.
It can be seen that this alternative embodiment gives a specific way for the switch to forward data messages to the first receiving terminal and the second receiving terminal, thus enabling a one-to-many RDMA data transfer.
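The copy-and-edit step can be sketched in Python as follows. The `Packet` type and the `replicate` function are hypothetical, the linear table is assumed to be a list of (IP address, QP number) pairs, and any header checksum updates a real switch pipeline might need are left out.

```python
from dataclasses import dataclass, replace
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Packet:
    dst_ip: str
    dst_qpn: int
    payload: bytes


def replicate(packet: Packet, linear_table: Sequence[Tuple[str, int]]) -> List[Packet]:
    """Make one copy per linear-table entry and rewrite its target network information."""
    return [replace(packet, dst_ip=ip, dst_qpn=qpn) for ip, qpn in linear_table]
```

The number of copies equals the number of entries in the linear table, that is, the first receiving terminal plus all second receiving terminals, and each copy carries the same payload with its own destination.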
In an alternative embodiment, the first terminal network information includes at least one of IP address, QP information, RDMA transfer identity information, and device identity information of the first receiving terminal;
And/or the second terminal network information includes at least one of IP address, QP information, RDMA transmission identification information, and device identification information of the second receiving terminal;
And/or the destination terminal network information includes at least one of IP address, QP information, RDMA transfer identity information, and device identity information of a destination transfer terminal of the data packet.
It can be seen that by implementing this alternative embodiment, the first terminal network information, the second terminal network information, and the target terminal network information may respectively include one or more addresses or information of the present terminal that can be distinguished from other terminals, so as to ensure a more reliable data transmission process.
In an alternative embodiment, the first receiving terminal and all the second receiving terminals are configured to send respective ACK messages corresponding to the data messages to the switch;
The switch is further configured to determine, according to the pre-stored first terminal network information, whether source terminal network information in any ACK packet is matched with terminal network information of a corresponding receiving terminal, to obtain a second determination result, and if the second determination result is yes, trigger an ACK count value to be increased by 1;
The switch is further configured to determine one of the plurality of ACK messages as a corrected ACK message when the ACK count value is determined to be equal to a sum of the numbers of the first receiving terminal and all the second receiving terminals, change source terminal network information of the corrected ACK message to first terminal network information of the first receiving terminal, and send the corrected ACK message to the transmitting terminal.
Therefore, by implementing the alternative embodiment, after the receiving terminal receives the data message sent by the sending terminal, the receiving terminal can return the confirmation message to the sending terminal, which indicates that the receiving terminal confirms the receiving of the data message, thereby improving the reliability of data transmission.
In an alternative embodiment, the source terminal network information includes at least one of an IP address, QP information, RDMA transfer identity information, and device identity information of the source terminal of the ACK message.
It can be seen that by implementing this alternative embodiment, the source terminal network information may include one or more addresses or information of the present terminal that can be distinguished from other terminals, so as to ensure that the data transmission process is more reliable.
In an alternative embodiment, the sending terminal is further configured to trigger a timeout retransmission if a timeout retransmission condition is met;
Wherein the timeout retransmission condition includes: and in a preset time period after the switch forwards the data message to the first receiving terminal and the second receiving terminal, the sending terminal does not receive the ACK message corresponding to the data message.
It can be seen that by implementing this alternative embodiment, lossless RDMA data transmission can be guaranteed in the event of packet loss by triggering retransmission.
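The timeout retransmission condition can be illustrated with a minimal Python sketch. `RetransmitTimer` and the `psn` message identifier are hypothetical, and the timer is assumed to start when the sending terminal sends the data message, since the sending terminal cannot directly observe when the switch forwards it.

```python
from typing import Dict, List


class RetransmitTimer:
    """Tracks data messages that have not yet been acknowledged."""

    def __init__(self, timeout: float):
        self.timeout = timeout               # the preset time period, in seconds
        self.pending: Dict[int, float] = {}  # psn -> time the message was sent

    def on_send(self, psn: int, now: float) -> None:
        self.pending[psn] = now

    def on_ack(self, psn: int) -> None:
        # The corrected ACK from the switch clears the pending entry.
        self.pending.pop(psn, None)

    def due(self, now: float) -> List[int]:
        """Messages whose ACK has not arrived within the preset time period."""
        return sorted(p for p, sent in self.pending.items() if now - sent >= self.timeout)
```

The sending terminal would call `due` periodically with the current time and resend every message it returns; the clock value is passed in explicitly so that the sketch stays deterministic.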
The apparatus embodiments described above are merely illustrative, wherein the modules illustrated as separate components may or may not be physically separate, and the components shown as modules may or may not be physical, i.e., may be located in one place, or may be distributed over a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
In this specification, the embodiments are described in a progressive manner; identical and similar parts of the embodiments may be referred to each other, and each embodiment mainly describes its differences from the other embodiments. In particular, since the system embodiments are substantially similar to the method embodiments, their description is relatively brief; for relevant details, refer to the corresponding parts of the description of the method embodiments.
From the above detailed description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course by means of hardware. Based on such understanding, the foregoing technical solutions may be embodied essentially or in part in the form of a software product that may be stored in a computer-readable storage medium including read-only memory (ROM), random access memory (Random Access Memory, RAM), programmable read-only memory (Programmable Read-only memory, PROM), erasable programmable read-only memory (Erasable Programmable Read Only Memory, EPROM), one-time programmable read-only memory (OTPROM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (Compact Disc Read-only memory, CD-ROM) or other optical disc memory, magnetic disc memory, tape memory, or any other medium that can be used for computer-readable carrying or storing data.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.
Finally, it should be noted that the data transmission method and system based on RDMA multicast disclosed in the embodiments of the invention are only preferred embodiments of the invention, and are only used for illustrating the technical scheme of the invention rather than limiting it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical schemes recorded in the various embodiments can still be modified, or some of their technical features can be replaced equivalently, and that such modifications and substitutions do not cause the essence of the corresponding technical schemes to depart from the spirit and scope of the technical schemes of the embodiments of the invention.