CN114944966A - RDMA multicast-based data transmission method and system - Google Patents

Publication number
CN114944966A
Authority
CN
China
Prior art keywords
terminal
receiving terminal
network information
switch
receiving
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210414928.9A
Other languages
Chinese (zh)
Other versions
CN114944966B (en
Inventor
赵铭
林圳杰
王晓亮
刘德瑞
林强
王李明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Digital Power Grid Research Institute of China Southern Power Grid Co Ltd
Original Assignee
Shenzhen Digital Power Grid Research Institute of China Southern Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Digital Power Grid Research Institute of China Southern Power Grid Co Ltd
Priority to CN202210414928.9A
Publication of CN114944966A
Application granted
Publication of CN114944966B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/02 Details
    • H04L12/16 Arrangements for providing special services to substations
    • H04L12/18 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
    • H04L12/185 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast with management of multicast group membership
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/16 Multipoint routing

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a data transmission method and system based on RDMA multicast. The method comprises the following steps: the sending terminal sends a data message to the switch; the switch judges, according to prestored first terminal network information of the first receiving terminal, whether the target terminal network information in the data message matches the first terminal network information, to obtain a first judgment result; if the first judgment result is yes, the switch forwards the data message to the first receiving terminal and the second receiving terminal. By designing a data transmission mode based on an RDMA multicast framework, and relying on the binding relationship between the sender and the receivers under that framework, the data sender needs to send only one data message during data transmission; the replication capability of the switch is used to copy and forward the message to multiple receivers, which saves network bandwidth and speeds up data transmission.

Description

RDMA multicast-based data transmission method and system
Technical Field
The invention relates to the technical field of communication, in particular to a data transmission method and a data transmission system based on RDMA multicast.
Background
Existing RDMA (Remote Direct Memory Access) technology provides direct access to remote memory without involving the remote CPU, by means such as offloading the protocol stack and bypassing the kernel, thereby achieving low network latency and high throughput. However, in existing RDMA transmission methods, each receiving terminal establishes a one-to-one bidirectional connection with the sending terminal. When one sending terminal corresponds to multiple receiving terminals, the sending terminal must perform a large number of data copy operations, which wastes a large amount of network bandwidth, introduces extra delay, and increases the overall data processing time. It is therefore important to provide an RDMA solution that saves bandwidth and speeds up data transmission.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a data transmission method and system based on RDMA multicast, which can save bandwidth and accelerate data transmission by designing a data transmission mode based on an RDMA multicast framework.
In order to solve the technical problem, a first aspect of the present invention discloses a data transmission method based on RDMA multicast, which is applied to an RDMA multicast system including a switch, a transmitting terminal, a first receiving terminal and a second receiving terminal; the sending terminal establishes bidirectional connection with the first receiving terminal, and the sending terminal establishes unidirectional connection with the second receiving terminal; the method comprises the following steps:
the sending terminal sends a data message to the switch;
the switch judges whether the target terminal network information in the data message is matched with the first terminal network information according to the prestored first terminal network information of the first receiving terminal to obtain a first judgment result;
if the first judgment result is yes, the switch forwards the data packet to the first receiving terminal and the second receiving terminal.
As an optional implementation manner, in the first aspect of the present invention, before the sending terminal sends the data packet to the switch, the method further includes:
the sending terminal sends its QP information to the first receiving terminal and the second receiving terminal;
the second receiving terminal binds its own QP with the QP information of the sending terminal so as to establish a unidirectional connection with the sending terminal;
the first receiving terminal binds its own QP with the QP information of the sending terminal and sends the QP information of the first receiving terminal to the sending terminal;
and the sending terminal receives the QP information of the first receiving terminal and binds the QP of the sending terminal with the QP information of the first receiving terminal so as to establish the bidirectional connection between the sending terminal and the first receiving terminal.
As an optional implementation manner, in the first aspect of the present invention, after the sending terminal receives the QP information of the first receiving terminal, and binds the QP of the sending terminal with the QP information of the first receiving terminal to establish a bidirectional connection between the sending terminal and the first receiving terminal, and before the sending terminal sends a data packet to the switch, the method further includes:
the first receiving terminal and the second receiving terminal send terminal network information to the switch; the terminal network information includes second terminal network information of the second receiving terminal and first terminal network information of the first receiving terminal;
and the switch creates a multicast member table according to the terminal network information.
As an optional implementation manner, in the first aspect of the present invention, the multicast member table includes an accurate table and a linear table; the switch creating the multicast member table according to the terminal network information includes:
the switch creates the accurate table according to the first terminal network information of the first receiving terminal, and the accurate table is used for judging whether the target terminal network information in the data message is matched with the first terminal network information of the first receiving terminal;
and the switch creates the linear table according to the first terminal network information of the first receiving terminal and the second terminal network information of all the second receiving terminals, wherein the linear table is used for copying and editing the data message, and the linear table is linked to the accurate table.
As an optional implementation manner, in the first aspect of the present invention, the forwarding, by the switch, the data packet to the first receiving terminal and the second receiving terminal includes:
the switch copies the data message by traversing the linear table to obtain a plurality of copied messages of which the number is the sum of the numbers of the first receiving terminal and the second receiving terminal;
the switch correspondingly modifies the target terminal network information in the plurality of copied messages into first terminal network information of the first receiving terminal and second terminal network information corresponding to all the second receiving terminals respectively to obtain a plurality of modified copied messages;
and the switch respectively forwards the modified copy messages to the first receiving terminal and the second receiving terminal.
As an optional embodiment, in the first aspect of the present invention, the first terminal network information includes at least one of an IP address, QP information, RDMA transmission identification information, and device identification information of the first receiving terminal;
and/or the second terminal network information comprises at least one of an IP address, QP information, RDMA transmission identification information, and device identification information of the second receiving terminal;
and/or the target terminal network information comprises at least one of an IP address, QP information, RDMA transmission identification information and equipment identification information of a target transmission terminal of the data message.
As an optional implementation manner, in the first aspect of the present invention, after the switch forwards the data packet to the first receiving terminal and the second receiving terminal, the method further includes:
the first receiving terminal and all the second receiving terminals send respective ACK messages corresponding to the data messages to the switch;
for any ACK message, the switch judges whether the source terminal network information in the ACK message is matched with the terminal network information of the corresponding receiving terminal according to the prestored first terminal network information to obtain a second judgment result;
if the second judgment result is yes, adding 1 to the ACK count value of the switch;
when the ACK counting value is judged to be equal to the sum of the number of the first receiving terminals and the number of all the second receiving terminals, the switch determines one of the plurality of ACK messages as a corrected ACK message, changes the source terminal network information of the corrected ACK message into the first terminal network information of the first receiving terminal, and sends the corrected ACK message to the sending terminal.
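The ACK aggregation steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `AckAggregator`, the use of an (IP address, QP number) pair as terminal network information, and resetting the counter after the corrected ACK is emitted are all hypothetical choices that the source does not specify.

```python
from typing import Optional, Set, Tuple

NetInfo = Tuple[str, int]  # hypothetical (IP address, QP number) pair


class AckAggregator:
    """Counts matching ACKs and emits one corrected ACK once all have arrived."""

    def __init__(self, first: NetInfo, seconds: Set[NetInfo]) -> None:
        self.first = first
        self.members = {first} | set(seconds)
        # Expected count: number of first plus all second receiving terminals.
        self.expected = len(self.members)
        self.count = 0

    def on_ack(self, source: NetInfo) -> Optional[NetInfo]:
        """Return the source field of the corrected ACK, or None if still waiting."""
        if source not in self.members:
            return None          # second judgment result is "no": not counted
        self.count += 1          # second judgment result is "yes": count + 1
        if self.count < self.expected:
            return None
        self.count = 0           # assumption: reset for the next data message
        # The corrected ACK carries the first receiving terminal's network
        # information as its source, so the sender sees a single ACK on its
        # bidirectional connection.
        return self.first


agg = AckAggregator(("10.0.0.2", 17), {("10.0.0.3", 21), ("10.0.0.4", 33)})
arrivals = [("10.0.0.3", 21), ("10.0.0.9", 5), ("10.0.0.2", 17), ("10.0.0.4", 33)]
results = [agg.on_ack(src) for src in arrivals]
print(results)
```

Like the text, the sketch only counts ACKs; it does not check that each receiver has acknowledged exactly once.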
As an optional implementation manner, in the first aspect of the present invention, the source terminal network information includes at least one of an IP address, QP information, RDMA transmission identification information, and device identification information of a source terminal of the ACK packet.
As an optional implementation manner, in the first aspect of the present invention, after the switch forwards the data packet to the first receiving terminal and the second receiving terminal, the method further includes:
under the condition that an overtime retransmission condition is met, triggering overtime retransmission by the sending terminal;
wherein the timeout retransmission condition comprises: and within a preset time period after the switch forwards the data message to the first receiving terminal and the second receiving terminal, the sending terminal does not receive an ACK message corresponding to the data message.
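The timeout condition can be expressed as a small predicate. The following is only a sketch: the function name `should_retransmit` and its parameters are hypothetical, and because the sending terminal cannot directly observe when the switch forwards the message, the sketch assumes the preset period is measured from the moment the sender transmitted it.

```python
def should_retransmit(sent_at: float, now: float, timeout: float,
                      ack_received: bool) -> bool:
    """True when no ACK has arrived within the preset period after sending."""
    if ack_received:
        return False
    return (now - sent_at) >= timeout


# Timestamps are in seconds and purely illustrative.
print(should_retransmit(sent_at=0.0, now=0.5, timeout=1.0, ack_received=False))
print(should_retransmit(sent_at=0.0, now=1.5, timeout=1.0, ack_received=False))
print(should_retransmit(sent_at=0.0, now=1.5, timeout=1.0, ack_received=True))
```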
The second aspect of the invention discloses a data transmission system based on RDMA multicast, which comprises a switch, a sending terminal, a first receiving terminal and a second receiving terminal; the sending terminal establishes bidirectional connection with the first receiving terminal, and the sending terminal establishes unidirectional connection with the second receiving terminal; wherein:
the sending terminal is used for sending a data message to the switch;
the switch is used for judging whether the target terminal network information in the data message is matched with the first terminal network information according to the prestored first terminal network information of the first receiving terminal to obtain a first judgment result;
the switch is further configured to forward the data packet to the first receiving terminal and the second receiving terminal when the first determination result is yes.
As an alternative embodiment, in the second aspect of the invention,
the sending terminal is further configured to send its QP information to the first receiving terminal and the second receiving terminal;
the second receiving terminal is configured to bind its own QP with the QP information of the sending terminal so as to establish a unidirectional connection with the sending terminal;
the first receiving terminal is configured to bind its own QP with the QP information of the sending terminal and to send the QP information of the first receiving terminal to the sending terminal;
the sending terminal is further configured to receive the QP information of the first receiving terminal, and bind the QP of the sending terminal with the QP information of the first receiving terminal, so as to establish a bidirectional connection between the sending terminal and the first receiving terminal.
As an optional implementation manner, in the second aspect of the present invention, the first receiving terminal and the second receiving terminal are further configured to send terminal network information to the switch; the terminal network information includes second terminal network information of the second receiving terminal and first terminal network information of the first receiving terminal;
the switch is also used for creating a multicast member table according to the terminal network information.
As an optional implementation manner, in the second aspect of the present invention, the multicast member table includes an accurate table and a linear table; the specific mode that the switch establishes the multicast member table according to the terminal network information comprises the following steps:
the switch creates the accurate table according to the first terminal network information of the first receiving terminal, and the accurate table is used for judging whether the target terminal network information in the data message is matched with the first terminal network information of the first receiving terminal;
and the switch creates the linear table according to the first terminal network information of the first receiving terminal and the second terminal network information of all the second receiving terminals, wherein the linear table is used for copying and editing the data message, and the linear table is linked to the accurate table.
As an optional implementation manner, in the second aspect of the present invention, the switch is further configured to forward the data packet to the first receiving terminal and the second receiving terminal, and includes:
the switch is further configured to copy the data packet by traversing the linear table to obtain a plurality of copied packets of which the number is the sum of the numbers of the first receiving terminal and the second receiving terminal;
the switch is further configured to respectively modify the target terminal network information in the multiple duplicate messages into first terminal network information of the first receiving terminal and second terminal network information corresponding to all the second receiving terminals, so as to obtain multiple modified duplicate messages;
the switch is further configured to forward the modified duplicate packets to the first receiving terminal and the second receiving terminal, respectively.
As an optional implementation manner, in the second aspect of the present invention, the first terminal network information includes at least one of an IP address, QP information, RDMA transfer identification information, and device identification information of the first receiving terminal;
and/or the second terminal network information comprises at least one of an IP address, QP information, RDMA transport identification information, and device identification information of the second receiving terminal;
and/or the target terminal network information comprises at least one of an IP address, QP information, RDMA transmission identification information and equipment identification information of a target transmission terminal of the data message.
As an alternative embodiment, in the second aspect of the invention,
the first receiving terminal and all the second receiving terminals are used for sending respective ACK messages corresponding to the data messages to the switch;
the switch is further configured to, for any ACK message, determine, according to the prestored first terminal network information, whether source terminal network information in the ACK message matches terminal network information of a corresponding receiving terminal, to obtain a second determination result, and if the second determination result is yes, trigger an ACK count value to add 1;
and the switch is further used for determining one of the plurality of ACK messages as a corrected ACK message when the ACK count value is judged to be equal to the sum of the number of the first receiving terminals and the number of all the second receiving terminals, changing the source terminal network information of the corrected ACK message into the first terminal network information of the first receiving terminal, and sending the corrected ACK message to the sending terminal.
As an optional implementation manner, in the second aspect of the present invention, the source terminal network information includes at least one of an IP address, QP information, RDMA transmission identification information, and device identification information of the source terminal of the ACK packet.
As an optional implementation manner, in the second aspect of the present invention, the sending terminal is further configured to trigger the timeout retransmission if a timeout retransmission condition is met;
wherein the timeout retransmission condition comprises: and within a preset time period after the switch forwards the data message to the first receiving terminal and the second receiving terminal, the sending terminal does not receive an ACK message corresponding to the data message.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
in the embodiment of the invention, a data message is sent to a switch through a sending terminal, and the switch judges whether the target terminal network information in the data message is matched with the first terminal network information according to the prestored first terminal network information of a first receiving terminal, so that a first judgment result is obtained; if the first judgment result is yes, the switch forwards the data message to the first receiving terminal and the second receiving terminal. Therefore, by designing a data transmission mode based on the RDMA multicast framework and according to the binding relationship between the sender and the receiver under the RDMA multicast framework, the data sender only needs to send one data message in the data transmission process, and the message is copied and forwarded to a plurality of receivers by using the copying capability of the switch, so that the network bandwidth can be saved and the data transmission is accelerated.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a schematic flowchart of an RDMA multicast-based data transmission method disclosed in an embodiment of the present invention;
FIG. 2 is a schematic flowchart of another RDMA multicast-based data transmission method disclosed in an embodiment of the present invention;
FIG. 3 is a schematic diagram of an RDMA multicast-based data transmission system disclosed in an embodiment of the present invention;
FIG. 4 is a schematic diagram of a multicast member table disclosed in an embodiment of the present invention;
FIG. 5 is a schematic diagram of another multicast member table disclosed in an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive step based on the embodiments of the present invention, are within the scope of protection of the present invention.
The terms "first," "second," and the like in the description and claims of the present invention and in the above-described drawings are used for distinguishing between different objects and not necessarily for describing a particular sequential or chronological order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, apparatus, article, or device that comprises a list of steps or elements is not limited to only those steps or elements listed, but may optionally include other steps or elements not listed, or inherent to such process, method, apparatus, article, or device.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein may be combined with other embodiments.
The invention discloses a data transmission method and system based on RDMA multicast, which can save network bandwidth and speed up data transmission. Details are described below.
Example one
Referring to FIG. 1, FIG. 1 is a schematic flowchart of an RDMA multicast-based data transmission method according to an embodiment of the present invention. The method is applied to an RDMA multicast system, and the RDMA multicast system comprises a switch, a sending terminal, a first receiving terminal and a second receiving terminal; the sending terminal establishes a bidirectional connection with the first receiving terminal, and the sending terminal establishes a unidirectional connection with the second receiving terminal. As shown in FIG. 1, the RDMA multicast-based data transmission method may include the following operations:
101. The sending terminal sends the data message to the switch.
Here, the operation of the sending terminal sending the data packet may occur in various application scenarios, such as offline search, big data processing, high performance computing, distributed storage, and the like.
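Optionally, the switch is one of a switch deployed in a TOR (Top of Rack) architecture, a switch deployed in an EOR (End of Row) architecture, and a switch deployed in an MOR (Middle of Row) architecture. Preferably, the switch is a switch deployed in a TOR architecture.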
102. The switch judges whether the target terminal network information in the data message is matched with the first terminal network information according to the prestored first terminal network information of the first receiving terminal to obtain a first judgment result.
Optionally, the first terminal network information includes at least one of an IP address, QP information, RDMA transfer identification information, and device identification information of the first receiving terminal; the target terminal network information includes at least one of an IP address, QP information, RDMA transfer identification information, and device identification information of a target transmission terminal of the data packet.
103. If the first judgment result is yes, the switch forwards the data message to the first receiving terminal and the second receiving terminal.
Therefore, by designing a data transmission mode based on the RDMA multicast framework and according to the binding relationship between the sender and the receiver under the RDMA multicast framework, the embodiment of the invention ensures that the data sender only needs to send one data message in the data transmission process, and copies and forwards the message to a plurality of receivers by utilizing the copying capability of the switch, thereby saving the network bandwidth and accelerating the data transmission.
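The bandwidth saving on the sender's link can be illustrated with simple arithmetic. The following is a minimal sketch and not part of the patent text; the function name `sender_uplink_bytes` and the payload size and receiver count are hypothetical, and packet headers are ignored.

```python
def sender_uplink_bytes(payload_bytes: int, receivers: int,
                        switch_replicates: bool) -> int:
    """Bytes the sending terminal puts on its link to the switch for one message."""
    if receivers < 1:
        raise ValueError("at least one receiver is required")
    # With one-to-one connections the sender transmits one copy per receiver;
    # with switch-side replication it transmits a single copy.
    copies = 1 if switch_replicates else receivers
    return payload_bytes * copies


unicast = sender_uplink_bytes(4096, receivers=8, switch_replicates=False)
multicast = sender_uplink_bytes(4096, receivers=8, switch_replicates=True)
print(unicast, multicast)  # 32768 4096
```

In this simplified model the sender's uplink traffic no longer grows with the number of receivers; the copies are made on the switch's output ports instead.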
In an optional embodiment, before the step 101, the RDMA multicast-based data transmission method further includes:
the sending terminal sends its QP information to the first receiving terminal and the second receiving terminal;
the second receiving terminal binds its own QP with the QP information of the sending terminal so as to establish a unidirectional connection with the sending terminal;
the first receiving terminal binds its own QP with the QP information of the sending terminal and sends the QP information of the first receiving terminal to the sending terminal;
and the sending terminal receives the QP information of the first receiving terminal and binds the QP of the sending terminal with the QP information of the first receiving terminal so as to establish the bidirectional connection between the sending terminal and the first receiving terminal.
Here, the QP in this optional embodiment is briefly explained: RDMA commonly uses three kinds of queues, namely the Send Queue (SQ), the Receive Queue (RQ), and the Completion Queue (CQ). SQ and RQ are typically created in pairs, called Queue Pairs (QPs); these are the QPs referred to in this optional embodiment.
Optionally, the QP information includes identification information for the QP required for the RDMA connection.
It can be seen that this optional embodiment provides a specific way for the sending terminal to establish a bidirectional connection with the first receiving terminal and for the second receiving terminal to establish a unidirectional connection with the sending terminal. Once both connections are established, the sending terminal can subsequently perform RDMA data transmission to the first receiving terminal and the second receiving terminal by sending a data packet only once.
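The binding sequence can be modelled in a few lines. This is a sketch under stated assumptions rather than real verbs code: the `Endpoint` class, its `bind` method, and the integer QP numbers are hypothetical stand-ins for the QP information exchanged in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    name: str
    qp_num: int                      # hypothetical identifier of the local QP
    remote_qp: Optional[int] = None  # QP number the local QP is bound to

    def bind(self, remote_qp_num: int) -> None:
        self.remote_qp = remote_qp_num


def connect(sender: Endpoint, first: Endpoint, seconds: List[Endpoint]) -> None:
    # Step 1: the sender's QP information reaches every receiver.
    sender_info = sender.qp_num
    # Step 2: each second receiver binds to the sender's QP (unidirectional).
    for receiver in seconds:
        receiver.bind(sender_info)
    # Step 3: the first receiver binds and replies with its own QP information.
    first.bind(sender_info)
    reply = first.qp_num
    # Step 4: the sender binds to the first receiver's QP (bidirectional).
    sender.bind(reply)


sender = Endpoint("sender", qp_num=100)
first = Endpoint("first", qp_num=17)
seconds = [Endpoint("second-a", qp_num=21), Endpoint("second-b", qp_num=33)]
connect(sender, first, seconds)
print(sender.remote_qp, first.remote_qp, [r.remote_qp for r in seconds])
```

The sender ends up bound only to the first receiving terminal, which is why every message it sends is addressed to that terminal's network information.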
In an optional embodiment, after the sending terminal receives the QP information of the first receiving terminal, and binds the QP of the sending terminal with the QP information of the first receiving terminal to establish a bidirectional connection between the sending terminal and the first receiving terminal, and before the sending terminal sends the data packet to the switch, the method further includes:
the first receiving terminal and the second receiving terminal send terminal network information to the switch;
the switch creates a multicast member table according to the terminal network information.
Optionally, the terminal network information includes second terminal network information of the second receiving terminal and first terminal network information of the first receiving terminal.
It can be seen that, by implementing this optional embodiment, the switch can create the multicast member table in advance according to the network information of the first receiving terminal and the second receiving terminal, so that after receiving a data packet from the sending terminal, the switch can copy the packet according to the members of the multicast member table and forward it to all members of the table, thereby implementing one-to-many data transmission.
In an optional embodiment, the multicast member table comprises an accurate table and a linear table; the switch creating the multicast member table according to the terminal network information includes:
the switch creates an accurate table according to the first terminal network information of the first receiving terminal, and the accurate table is used for judging whether the target terminal network information in the data message is matched with the first terminal network information of the first receiving terminal;
the switch creates a linear table according to the first terminal network information of the first receiving terminal and the second terminal network information of all the second receiving terminals, the linear table is used for copying and editing the data message, and the linear table is linked to the accurate table.
Optionally, before the switch creates the multicast member table according to the terminal network information, the IGMP (Internet Group Management Protocol) multicast protocol is enabled. The IGMP multicast protocol is one of the IGMPv1, IGMPv2, and IGMPv3 multicast protocols. With this arrangement, multicast membership is established and maintained between the receiving terminal and the multicast router directly adjacent to it.
Preferably, the switch enables the IGMPv3 multicast protocol before creating the multicast member table according to the terminal network information.
Optionally, the first terminal network information includes at least one of an IP address, QP information, RDMA transmission identification information, and device identification information of the first receiving terminal. Preferably, the first terminal network information includes the QP information of the first receiving terminal and at least one of the IP address, the RDMA transmission identification information, and the device identification information. With this arrangement, the switch can judge more accurately whether the data message matches the first receiving terminal.
Optionally, the second terminal network information includes at least one of an IP address, QP information, RDMA transmission identification information, and device identification information of the second receiving terminal. Preferably, the second terminal network information includes the QP information of the second receiving terminal and at least one of the IP address, the RDMA transmission identification information, and the device identification information. With this arrangement, the accuracy of copying and editing the data message through the linear table can be ensured.
As can be seen, this optional embodiment provides a specific way for the switch to create the multicast member table according to the first terminal network information and the second terminal network information, so that the created multicast member table can be used for the copy editing and forwarding of the data packet.
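The relationship between the exact table and the linear table can be illustrated with a minimal Python sketch. It is only a software model under stated assumptions, not the switch implementation: the names `TerminalInfo`, `MulticastMemberTable`, `add_group`, and `lookup` are hypothetical, the terminal network information is assumed to consist of an IP address and a QP number, and the numeric group identifier used to link the two tables is an illustrative choice.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class TerminalInfo:
    """Assumed terminal network information: IP address plus QP number."""
    ip: str
    qp_number: int


@dataclass
class MulticastMemberTable:
    # Exact table: (IP, QP) of the first receiving terminal -> group id.
    exact: Dict[Tuple[str, int], int] = field(default_factory=dict)
    # Linear table: group id -> ordered members, first receiving terminal first.
    linear: Dict[int, List[TerminalInfo]] = field(default_factory=dict)

    def add_group(self, group_id: int, first: TerminalInfo,
                  seconds: List[TerminalInfo]) -> None:
        """Create the exact-table entry and the linear table it links to."""
        self.exact[(first.ip, first.qp_number)] = group_id
        self.linear[group_id] = [first, *seconds]

    def lookup(self, dst_ip: str, dst_qp: int) -> Optional[List[TerminalInfo]]:
        """Return the linked linear table if the target matches, else None."""
        group_id = self.exact.get((dst_ip, dst_qp))
        if group_id is None:
            return None
        return self.linear[group_id]
```

In this model a data message addressed to the first receiving terminal hits the exact table and yields the full member list, while any other destination misses and would be handled as ordinary traffic.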
In an optional embodiment, the switch forwarding the data message to the first receiving terminal and the second receiving terminal includes:
the switch copies the data message by traversing the linear table to obtain a plurality of copied messages, the number of which equals the total number of first and second receiving terminals;
the switch modifies the target terminal network information in each copied message to the first terminal network information of the first receiving terminal or to the second terminal network information of the corresponding second receiving terminal, obtaining a plurality of modified copied messages;
the switch forwards the modified copied messages to the first receiving terminal and the second receiving terminals respectively.
As can be seen, this alternative embodiment provides a specific way for the switch to forward the data packet to the first receiving terminal and the second receiving terminal, thereby implementing one-to-many RDMA data transmission.
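The copy-and-edit step described above can be sketched in Python as follows. This is an illustrative model only: `Packet`, `Member`, and `replicate` are hypothetical names, and the target terminal network information is assumed to be a destination IP address and a destination QP number.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Packet:
    dst_ip: str
    dst_qp: int
    payload: bytes


@dataclass(frozen=True)
class Member:
    """One linear-table entry (assumed to hold an IP address and a QP number)."""
    ip: str
    qp_number: int


def replicate(packet: Packet, linear_table: List[Member]) -> List[Packet]:
    """Produce one copy per linear-table entry and rewrite its destination.

    The payload is left untouched; only the target terminal network
    information is edited, so every receiver sees a message addressed to it.
    """
    return [replace(packet, dst_ip=m.ip, dst_qp=m.qp_number)
            for m in linear_table]
```

The number of copies equals the length of the linear table, that is, one for the first receiving terminal plus one for each second receiving terminal.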
In an optional embodiment, after step 103, the RDMA multicast-based data transmission method further includes:
the sending terminal triggers a timeout retransmission when a timeout retransmission condition is met.
The timeout retransmission condition includes: the sending terminal does not receive the ACK message corresponding to the data message within a preset time period after the switch forwards the data message to the first receiving terminal and the second receiving terminal.
Therefore, by implementing the optional embodiment, when packet loss occurs, RDMA lossless transmission of data can be guaranteed by triggering retransmission.
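The timeout retransmission logic can be modeled in a few lines of Python. The sketch rests on assumptions that the text does not state: the sending terminal starts its timer when it sends (it cannot observe the moment the switch forwards), the retry limit `max_attempts` is a hypothetical parameter, and `send` and `wait_for_ack` are caller-supplied callables standing in for the real transport, which in practice is typically handled by the network adapter rather than by application code.

```python
from typing import Callable


def send_with_retransmit(send: Callable[[], None],
                         wait_for_ack: Callable[[float], bool],
                         timeout_s: float,
                         max_attempts: int) -> int:
    """Send, wait up to timeout_s for the ACK, and resend on timeout.

    Returns the number of attempts used. Raises TimeoutError when no ACK
    arrives within max_attempts transmissions.
    """
    for attempt in range(1, max_attempts + 1):
        send()
        # wait_for_ack returns True if the ACK arrived within timeout_s.
        if wait_for_ack(timeout_s):
            return attempt
    raise TimeoutError(f"no ACK after {max_attempts} attempts")
```

Because the switch returns a single ACK only after every receiving terminal has acknowledged, a lost copy or a lost ACK on any branch appears to the sending terminal as one missing ACK and leads to one retransmission.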
Example Two
Referring to fig. 2, fig. 2 is a flowchart illustrating a method for data transmission based on RDMA multicast according to an embodiment of the present invention. The method is applied to an RDMA multicast system, and the RDMA multicast system comprises a switch, a sending terminal, a first receiving terminal and a second receiving terminal; the sending terminal establishes bidirectional connection with the first receiving terminal, and the sending terminal establishes unidirectional connection with the second receiving terminal. As shown in fig. 2, the RDMA multicast-based data transmission method may include the following operations:
201. The sending terminal sends the data message to the switch.
202. The switch judges, according to the prestored first terminal network information of the first receiving terminal, whether the target terminal network information in the data message matches the first terminal network information, obtaining a first judgment result.
203. If the first judgment result is yes, the switch forwards the data message to the first receiving terminal and the second receiving terminal.
204. The first receiving terminal and all the second receiving terminals send their respective ACK messages corresponding to the data message to the switch.
205. For any ACK message, the switch judges, according to the prestored first terminal network information, whether the source terminal network information in the ACK message matches the terminal network information of the corresponding receiving terminal, obtaining a second judgment result.
206. If the second judgment result is yes, the ACK count value of the switch is increased by 1.
207. When the ACK count value equals the total number of first and second receiving terminals, the switch selects one of the ACK messages as the modified ACK message, changes the source terminal network information of the modified ACK message to the first terminal network information of the first receiving terminal, and sends the modified ACK message to the sending terminal.
For the specific technical details and term explanations of steps 201-203, refer to the description of steps 101-103 in the first embodiment; they are not repeated here.
Optionally, the source terminal network information includes at least one of an IP address, QP information, RDMA transport identification information, and device identification information of the source terminal of the ACK message. Preferably, the source terminal network information includes the QP information of the source terminal of the ACK message together with at least one of the IP address, the RDMA transport identification information, and the device identification information. With this arrangement, the data transmission process is more reliable.
In a specific embodiment of the RDMA multicast-based data transmission method of the present invention, the first terminal network information includes the QP information and IP address of the first receiving terminal, and the second terminal network information includes the QP information and IP address of the second receiving terminal. As shown in fig. 4, in the forward direction, where data is transmitted from the sending terminal to the receiving terminals, the QP Number field in the RDMA message header and the IP field are used as the key of the exact table. The switch first parses the network-layer IP field of the message sent by the sending terminal and the QP Number field of the RDMA message and matches them against the exact table; once the QP information and the IP address are confirmed to match, the switch finds the corresponding linear table. The switch then traverses the linear table to copy and edit the data message.
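As an illustration of how the exact-table key could be extracted, the following Python sketch parses the destination QP number from a 12-byte InfiniBand Base Transport Header (BTH). It assumes the RDMA message carries a BTH, as in RoCEv2; the text only speaks of a QP Number field and does not name the encapsulation. The function names `parse_bth` and `exact_table_key` are hypothetical, and the IP address is assumed to have been extracted from the network-layer header already.

```python
import struct
from typing import Tuple

BTH_LEN = 12  # the Base Transport Header is 12 bytes long


def parse_bth(bth: bytes) -> Tuple[int, int, int]:
    """Return (opcode, destination QP number, packet sequence number).

    Layout in network byte order: opcode (8 bits), flags (8 bits),
    partition key (16 bits), reserved (8 bits) + destination QP (24 bits),
    ack-request and reserved (8 bits) + PSN (24 bits).
    """
    if len(bth) < BTH_LEN:
        raise ValueError("BTH too short")
    opcode, _flags, _pkey, qp_word, psn_word = struct.unpack(
        "!BBHII", bth[:BTH_LEN])
    dest_qp = qp_word & 0xFFFFFF  # low 24 bits hold the destination QP
    psn = psn_word & 0xFFFFFF     # low 24 bits hold the PSN
    return opcode, dest_qp, psn


def exact_table_key(dst_ip: str, bth: bytes) -> Tuple[str, int]:
    """Combine the destination IP and the QP Number into the exact-table key."""
    _, dest_qp, _ = parse_bth(bth)
    return (dst_ip, dest_qp)
```

A hardware switch would perform this extraction in its parser pipeline; the sketch only shows which fields form the key.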
As shown in fig. 5, in the reverse direction, where ACK messages are returned, the switch performs an exact-table lookup according to the QP Number and IP fields in each ACK message. After confirming that the QP information and the IP address match, the switch obtains the QP information and IP address of the first receiving terminal from the linear table so that it can send the modified ACK message to the sending terminal.
It can be seen that, by implementing the embodiment of the present invention, after receiving the data message sent by the sending terminal, the receiving terminals return the respective acknowledgement messages indicating that the receipt of the data message is correct to the switch, and the switch edits and sends one of the acknowledgement messages to the sending terminal according to the multicast member table, so that the sending terminal can judge the lossless transmission of the data without receiving the acknowledgement message sent by each receiving terminal.
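The ACK aggregation of steps 204-207 can be sketched as a small Python model. Several points are assumptions rather than statements from the text: the names `Ack` and `AckAggregator` are hypothetical, only one data message is outstanding at a time, and the sketch records which members have acknowledged in a set instead of a bare counter so that a duplicate ACK from the same terminal is not counted twice.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Ack:
    src_ip: str
    src_qp: int
    psn: int


class AckAggregator:
    """Collect ACKs on the switch and release one rewritten ACK."""

    def __init__(self, members: List[Tuple[str, int]]):
        # members[0] is the first receiving terminal (bidirectional peer).
        self.members = list(members)
        self.seen: Set[Tuple[str, int]] = set()

    def on_ack(self, ack: Ack) -> Optional[Ack]:
        key = (ack.src_ip, ack.src_qp)
        if key not in self.members:
            return None  # source does not match any group member
        self.seen.add(key)
        if len(self.seen) < len(self.members):
            return None  # still waiting for other receiving terminals
        self.seen.clear()
        first_ip, first_qp = self.members[0]
        # Rewrite the source so the ACK appears to come from the first
        # receiving terminal, the only peer the sending terminal is bound to.
        return replace(ack, src_ip=first_ip, src_qp=first_qp)
```

The sending terminal therefore processes a single ACK per data message regardless of the group size.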
Example Three
Referring to fig. 3, fig. 3 is a schematic structural diagram of an RDMA multicast-based data transmission system according to an embodiment of the present invention. The data transmission system comprises a switch, a sending terminal, a first receiving terminal and a second receiving terminal; the sending terminal establishes bidirectional connection with the first receiving terminal, and the sending terminal establishes unidirectional connection with the second receiving terminal. As shown in fig. 3, in the RDMA multicast-based data transmission system:
the sending terminal is used for sending the data message to the switch;
the switch is used for judging whether the target terminal network information in the data message is matched with the first terminal network information according to the prestored first terminal network information of the first receiving terminal to obtain a first judgment result;
the switch is further configured to forward the data packet to the first receiving terminal and the second receiving terminal when the first determination result is yes.
It can be seen that, with the embodiment shown in fig. 3, through cooperation of the switch, the sending terminal, the first receiving terminal and the second receiving terminal, according to the binding relationship between the sending party and the receiving party under the RDMA multicast framework, the data sending party only needs to send one data packet during data transmission, and copies and forwards the packet to multiple receiving parties by using the copying capability of the switch, so that network bandwidth can be saved and data transmission is accelerated.
In an optional embodiment, the sending terminal is further configured to send its QP information to the first receiving terminal and the second receiving terminal;
the second receiving terminal is used for binding its QP with the QP information of the sending terminal so as to establish a unidirectional connection with the sending terminal;
the first receiving terminal is used for binding its QP with the QP information of the sending terminal and sending the QP information of the first receiving terminal to the sending terminal;
the sending terminal is further configured to receive the QP information of the first receiving terminal, and bind the QP of the sending terminal with the QP information of the first receiving terminal, so as to establish a bidirectional connection between the sending terminal and the first receiving terminal.
It can be seen that, this optional embodiment provides a specific way for the sending terminal to establish a bidirectional connection with the first receiving terminal and for the second receiving terminal to be connected to the sending terminal in a unidirectional manner, and by implementing this optional embodiment, the bidirectional connection between the sending terminal and the first receiving terminal and the unidirectional connection between the sending terminal and the second receiving terminal are completed, so that the subsequent sending terminal can realize RDMA data transmission with the first receiving terminal and the second receiving terminal only by sending a data packet once.
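The asymmetric connection setup can be modeled with a short Python sketch. It is a simplified illustration under assumptions: `Endpoint`, `bind`, and `establish` are hypothetical names, QP information is reduced to a single QP number, and the exchange of QP information is modeled as direct function calls rather than an out-of-band channel.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    name: str
    qp_number: int
    remote_qp: Optional[int] = None  # QP number this endpoint is bound to

    def bind(self, remote_qp: int) -> None:
        self.remote_qp = remote_qp


def establish(sender: Endpoint, first: Endpoint,
              seconds: List[Endpoint]) -> None:
    """Set up one bidirectional and N unidirectional connections."""
    # The sending terminal announces its QP information to every receiver,
    # and each receiver binds its own QP to it.
    for receiver in [first, *seconds]:
        receiver.bind(sender.qp_number)
    # Only the first receiving terminal replies with its QP information,
    # so the sending terminal binds to exactly one remote QP.
    sender.bind(first.qp_number)
```

After setup, the sending terminal knows only the first receiving terminal, which is why the switch has to rewrite destinations on the forward path and ACK sources on the reverse path.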
In an optional embodiment, the first receiving terminal and the second receiving terminal are further configured to send terminal network information to the switch; the terminal network information includes second terminal network information of a second receiving terminal and first terminal network information of a first receiving terminal;
the switch is also used for creating a multicast member table according to the terminal network information.
It can be seen that, by implementing the optional embodiment, the switch may create the multicast member table in advance according to the network information of the first receiving terminal and the second receiving terminal, so that after receiving the data packet of the sending terminal, the switch may copy the packet according to the members of the multicast member table and forward the packet to the members of all multicast member tables, thereby implementing one-to-many data transmission.
In an optional embodiment, the multicast member table comprises an exact table and a linear table, and the switch creates the multicast member table according to the terminal network information specifically as follows:
the switch creates the exact table according to the first terminal network information of the first receiving terminal, where the exact table is used to judge whether the target terminal network information in the data message matches the first terminal network information of the first receiving terminal;
the switch creates the linear table according to the first terminal network information of the first receiving terminal and the second terminal network information of all the second receiving terminals, where the linear table is used for copying and editing the data message and is linked to the exact table.
As can be seen, this optional embodiment provides a specific way for the switch to create the multicast member table according to the first terminal network information and the second terminal network information, so that the created multicast member table can be used for the copy editing and forwarding of the data packet.
In an optional embodiment, the switch is further configured to copy the data message by traversing the linear table to obtain a plurality of copied messages, the number of which equals the total number of first and second receiving terminals;
the switch is further configured to modify the target terminal network information in each copied message to the first terminal network information of the first receiving terminal or to the second terminal network information of the corresponding second receiving terminal, obtaining a plurality of modified copied messages;
the switch is further configured to forward the modified copied messages to the first receiving terminal and the second receiving terminals respectively.
As can be seen, this alternative embodiment provides a specific way for the switch to forward the data packet to the first receiving terminal and the second receiving terminal, thereby implementing one-to-many RDMA data transmission.
In an alternative embodiment, the first terminal network information includes at least one of an IP address, QP information, RDMA transport identification information, and device identification information of the first receiving terminal;
and/or the second terminal network information comprises at least one of an IP address, QP information, RDMA transport identification information, and device identification information of the second receiving terminal;
and/or the target terminal network information comprises at least one of an IP address, QP information, RDMA transport identification information and device identification information of the target transport terminal of the data message.
It can be seen that, by implementing the optional embodiment, the first terminal network information, the second terminal network information, and the target terminal network information may respectively include one or more addresses or information of the terminal that can be distinguished from other terminals, thereby ensuring that the data transmission process is more reliable.
In an optional embodiment, the first receiving terminal and all the second receiving terminals are configured to send respective ACK packets corresponding to the data packets to the switch;
the switch is also used for judging whether the source terminal network information in the ACK message is matched with the terminal network information of the corresponding receiving terminal according to the prestored first terminal network information for any ACK message to obtain a second judgment result, and when the second judgment result is yes, triggering the ACK counting value to be added by 1;
and the switch is also used for determining one of the plurality of ACK messages as a modified ACK message when the ACK count value is judged to be equal to the sum of the number of the first receiving terminal and all the second receiving terminals, changing the source terminal network information of the modified ACK message into the first terminal network information of the first receiving terminal, and sending the modified ACK message to the sending terminal.
Therefore, by implementing the optional embodiment, after the receiving terminal receives the data message sent by the sending terminal, the receiving terminal can return the confirmation message to the sending terminal, which indicates that the receiving terminal confirms that the data message is received correctly, thereby improving the reliability of data transmission.
In an alternative embodiment, the source terminal network information includes at least one of an IP address, QP information, RDMA transport identification information, and device identification information of the source terminal of the ACK packet.
It can be seen that by implementing this alternative embodiment, the source terminal network information may contain one or more addresses or information of the terminal that can be distinguished from other terminals, thereby ensuring that the data transmission process is more reliable.
In an optional embodiment, the sending terminal is further configured to trigger a timeout retransmission when a timeout retransmission condition is met;
wherein the timeout retransmission condition includes: the sending terminal does not receive the ACK message corresponding to the data message within a preset time period after the switch forwards the data message to the first receiving terminal and the second receiving terminal.
Therefore, by implementing the optional embodiment, when packet loss occurs, RDMA lossless transmission of data can be guaranteed by triggering retransmission.
The above-described embodiments of the apparatus are merely illustrative, and the modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the system embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and reference may be made to the partial description of the method embodiment for relevant points.
Through the above detailed description of the embodiments, those skilled in the art will clearly understand that the embodiments may be implemented by software plus a necessary general hardware platform, and may also be implemented by hardware. Based on such understanding, the above technical solutions, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product, which may be stored in a computer-readable storage medium. The storage medium includes a Read-Only Memory (ROM), a Random Access Memory (RAM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), a One-time Programmable Read-Only Memory (OTPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Compact Disc Read-Only Memory (CD-ROM) or other optical disc memory, a magnetic disk memory, a tape memory, or any other computer-readable medium that can be used to carry or store data.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
Finally, it should be noted that the RDMA multicast-based data transmission method and system disclosed in the embodiments of the present invention are only preferred embodiments of the present invention and are used only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A RDMA multicast-based data transmission method is applied to an RDMA multicast system, and the RDMA multicast system comprises a switch, a sending terminal, a first receiving terminal and a second receiving terminal; the sending terminal establishes bidirectional connection with the first receiving terminal, and the sending terminal establishes unidirectional connection with the second receiving terminal; the method comprises the following steps:
the sending terminal sends a data message to the switch;
the switch judges whether the target terminal network information in the data message is matched with the first terminal network information according to the prestored first terminal network information of the first receiving terminal to obtain a first judgment result;
if the first judgment result is yes, the switch forwards the data packet to the first receiving terminal and the second receiving terminal.
2. The method of claim 1, wherein before the sending terminal sends the data message to the switch, the method further comprises:
the sending terminal sends the QP information to the first receiving terminal and the second receiving terminal;
the second receiving terminal binds the QPs corresponding to the second receiving terminal with the QP information of the sending terminal so as to establish one-way connection with the sending terminal;
the first receiving terminal binds the QP with the QP information of the sending terminal and sends the QP information of the first receiving terminal to the sending terminal;
and the sending terminal receives the QP information of the first receiving terminal and binds the QP of the sending terminal with the QP information of the first receiving terminal so as to establish the bidirectional connection between the sending terminal and the first receiving terminal.
3. The method according to claim 2, wherein after the sending terminal receives the QP information of the first receiving terminal and binds the QP of the sending terminal with the QP information of the first receiving terminal to establish the bidirectional connection between the sending terminal and the first receiving terminal, and before the sending terminal sends the data packet to the switch, the method further comprises:
the first receiving terminal and the second receiving terminal send terminal network information to the switch; the terminal network information includes second terminal network information of the second receiving terminal and first terminal network information of the first receiving terminal;
and the switch creates a multicast member table according to the terminal network information.
4. The method of claim 3, wherein the multicast member table comprises an exact table and a linear table; the switch creating the multicast member table according to the terminal network information comprises the following steps:
the switch creates the exact table according to the first terminal network information of the first receiving terminal, and the exact table is used for judging whether the target terminal network information in the data message is matched with the first terminal network information of the first receiving terminal;
and the switch creates the linear table according to the first terminal network information of the first receiving terminal and the second terminal network information of all the second receiving terminals, wherein the linear table is used for copying and editing the data message and is linked to the exact table.
5. The method of claim 4, wherein the switch forwards the data packet to the first receiving terminal and the second receiving terminal, and comprises:
the switch copies the data message by traversing the linear table to obtain a plurality of copied messages of which the number is the sum of the numbers of the first receiving terminal and the second receiving terminal;
the switch correspondingly modifies the target terminal network information in the plurality of copied messages into first terminal network information of the first receiving terminal and second terminal network information corresponding to all the second receiving terminals respectively to obtain a plurality of modified copied messages;
and the switch respectively forwards the modified copy messages to the first receiving terminal and the second receiving terminal.
6. The method of any of claims 1 to 5, wherein the first terminal network information comprises at least one of an IP address, QP information, RDMA transport identification information and device identification information of the first receiving terminal;
and/or the second terminal network information comprises at least one of an IP address, QP information, RDMA transport identification information, and device identification information of the second receiving terminal;
and/or the target terminal network information comprises at least one of an IP address, QP information, RDMA transmission identification information and equipment identification information of a target transmission terminal of the data message.
7. The method of claim 1, wherein after the switch forwards the data packet to the first receiving terminal and the second receiving terminal, the method further comprises:
the first receiving terminal and all the second receiving terminals send respective ACK messages corresponding to the data messages to the switch;
for any ACK message, the switch judges whether the source terminal network information in the ACK message is matched with the terminal network information of the corresponding receiving terminal according to the prestored first terminal network information to obtain a second judgment result;
if the second judgment result is yes, adding 1 to the ACK count value of the switch;
and when the ACK count value is judged to be equal to the sum of the number of the first receiving terminals and the number of all the second receiving terminals, the switch determines one of the plurality of ACK messages as a corrected ACK message, changes the source terminal network information of the corrected ACK message into the first terminal network information of the first receiving terminal, and sends the corrected ACK message to the sending terminal.
8. The method of claim 7, wherein the source terminal network information comprises at least one of an IP address, QP information, RDMA transport identification information, and device identification information of a source terminal of the ACK packet.
9. The method of claim 1, wherein after the switch forwards the data packet to the first receiving terminal and the second receiving terminal, the method further comprises:
when a timeout retransmission condition is met, the sending terminal triggers a timeout retransmission;
wherein the timeout retransmission condition comprises: within a preset time period after the switch forwards the data message to the first receiving terminal and the second receiving terminal, the sending terminal does not receive an ACK message corresponding to the data message.
10. An RDMA multicast-based data transmission system, characterized in that the data transmission system comprises a switch, a sending terminal, a first receiving terminal and a second receiving terminal; the sending terminal establishes bidirectional connection with the first receiving terminal, and the sending terminal establishes unidirectional connection with the second receiving terminal; wherein:
the sending terminal is used for sending a data message to the switch;
the switch is used for judging whether the target terminal network information in the data message is matched with the first terminal network information according to the prestored first terminal network information of the first receiving terminal to obtain a first judgment result;
the switch is further configured to forward the data packet to the first receiving terminal and the second receiving terminal when the first determination result is yes.
CN202210414928.9A 2022-04-20 2022-04-20 RDMA multicast-based data transmission method and system Active CN114944966B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210414928.9A CN114944966B (en) 2022-04-20 2022-04-20 RDMA multicast-based data transmission method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210414928.9A CN114944966B (en) 2022-04-20 2022-04-20 RDMA multicast-based data transmission method and system

Publications (2)

Publication Number Publication Date
CN114944966A true CN114944966A (en) 2022-08-26
CN114944966B CN114944966B (en) 2024-04-19

Family

ID=82906561

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210414928.9A Active CN114944966B (en) 2022-04-20 2022-04-20 RDMA multicast-based data transmission method and system

Country Status (1)

Country Link
CN (1) CN114944966B (en)

Also Published As

Publication number Publication date
CN114944966B (en) 2024-04-19

Similar Documents

Publication Publication Date Title
JP4495515B2 (en) Reliable delivery of multicast conference data
KR100935933B1 (en) Reliable multicast data retransmission method by grouping wireless terminal in wireless communication, and apparatus thereof
US7536622B2 (en) Data repair enhancements for multicast/broadcast data distribution
US9270475B2 (en) Network-based service for the repair of IP multicast sessions
CN110995697A (en) Big data transmission method and system
EP1552649B1 (en) Multicast data transfer
TW200537841A (en) Data repair
WO2003094449A1 (en) Method and apparatus for multicast delivery of information
JP2007522750A5 (en)
JP2003515269A (en) Transmission system
WO2013104241A1 (en) Data retransmission method and system, multicast server and user terminal
KR100883576B1 (en) Data repair enhancements for multicast/broadcast data distribution
Sabata et al. Transport protocol for reliable multicast: TRM
KR101600060B1 (en) Protocol booster for sctp in multicast networks
CN106034011A (en) Control method and system for multicast transport quality guarantee
CN1627725A (en) Method for guaranteeing reliability of data transmission from one point to multiple points
CN115189813A (en) OTT multicast method, system, device, multicast proxy and multicast server
CN114944966A (en) RDMA multicast-based data transmission method and system
KR100223014B1 (en) Method for controlling error
KR20080055202A (en) System for transfering a large-sized digital content to multi-point using ip-multicast and method thereof
MXPA06011288A (en) Data repair enhancements for multicast/broadcast data distribution

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
Country or region after: China
Address after: 518053 501, 502, 601 and 602, building D, wisdom Plaza, Qiaoxiang Road, Gaofa community, Shahe street, Nanshan District, Shenzhen, Guangdong
Applicant after: China Southern Power Grid Digital Platform Technology (Guangdong) Co.,Ltd.
Address before: 518053 501, 502, 601 and 602, building D, wisdom Plaza, Qiaoxiang Road, Gaofa community, Shahe street, Nanshan District, Shenzhen, Guangdong
Applicant before: China Southern Power Grid Shenzhen Digital Power Grid Research Institute Co.,Ltd.
Country or region before: China
GR01 Patent grant