CN115701063A - Message transmission method and communication device - Google Patents

Message transmission method and communication device

Info

Publication number
CN115701063A
Authority
CN
China
Prior art keywords
message
network device
address
header
lid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110872533.9A
Other languages
Chinese (zh)
Inventor
蒋有军
吴涛
郑合文
韩磊
范多亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202110872533.9A priority Critical patent/CN115701063A/en
Priority to PCT/CN2022/106368 priority patent/WO2023005723A1/en
Publication of CN115701063A publication Critical patent/CN115701063A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/66 Arrangements for connecting between networks having differing types of switching systems, e.g. gateways
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/26 Flow control; Congestion control using explicit feedback to the source, e.g. choke packets
    • H04L47/263 Rate modification at the source after receiving feedback
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/30 Flow control; Congestion control in combination with information about buffer occupancy at either end or at transit nodes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements
    • H04L49/10 Packet switching elements characterised by the switching fabric construction
    • H04L49/111 Switch interfaces, e.g. port details
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/22 Parsing or analysis of headers

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The embodiments of the present application disclose a message transmission method and a communication device for improving the message transmission rate. The method in the embodiments of the present application includes: a gateway parses a first message from a first network device to obtain the IP address of a second network device and matches the LID corresponding to that IP address in a lookup table; the gateway strips the first message header from the first message and encapsulates a second message header that includes the LID of the second network device to generate a second message, where the LID is stored in a local routing header of the second message; and the gateway then sends the second message according to the LID of the second network device.

Description

Message transmission method and communication device
Technical Field
The present application relates to the field of communications, and in particular, to a message transmission method and a communication apparatus.
Background
As data volumes have grown dramatically, the computing power that applications demand from processing systems has expanded exponentially, and high performance computing (HPC) cluster applications have increased sharply. Ethernet and InfiniBand (IB) together account for 78% of HPC interconnects, far ahead of other interconnect technologies. Remote direct memory access (RDMA) technology first appeared in IB networks, and IB has been the first choice for supercomputing interconnects since its introduction because of its high performance and low latency. However, since the advent of Ethernet-based RDMA over Converged Ethernet version 2 (RoCEv2) networks, RoCEv2 has been increasingly adopted for supercomputing interconnects because it is fully compatible with and supports Internet Protocol (IP) networks.
Currently, Ethernet IP messages are carried over an IB network through a gateway device by using the Internet Protocol over InfiniBand (IPoIB) technology, which runs on top of IB.
However, because the Ethernet IP message is encapsulated directly in an IB message by a tunneling technique, the message still passes through kernel copies and the software protocol stack. The remote direct access effect of the IB network therefore cannot be achieved, which reduces transmission efficiency.
Disclosure of Invention
The embodiment of the application provides a message transmission method and a communication device, which are used for improving the message transmission rate.
A first aspect of the embodiments of the present application provides a message transmission method, including: a gateway receives a first message from a first network device, where a first message header of the first message includes an Internet Protocol (IP) address of a second network device, and the second network device is the destination network device to which the first network device transmits the message; the gateway determines a local identifier (LID) of the second network device from the IP address of the second network device and a lookup table, where the lookup table includes an association between IP addresses and LIDs; the gateway strips the first message header of the first message and encapsulates a second message header to obtain a second message, where the second message header includes a local routing header, and the local routing header includes the LID of the second network device; and the gateway sends the second message according to the LID of the second network device.
In the first aspect, the gateway parses the first packet from the first network device to obtain the IP address of the second network device and matches the LID corresponding to that IP address in the lookup table. The gateway then strips the first packet header from the first packet, encapsulates a second packet header that includes the LID of the second network device to generate the second packet, and sends the second packet according to the LID of the second network device. The second packet is an ordinary IB-format packet, and the LID of the second network device is stored in its local routing header. Transmission of the second packet in the IB network does not require kernel copies, so the remote direct access effect of the IB network can be achieved and packet transmission efficiency is improved.
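The conversion in the first aspect can be illustrated with a minimal sketch. The class names, field names, addresses, and LID values below are hypothetical and chosen only for illustration; real headers carry many more fields, and the sketch does not model the checksum fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EthernetMessage:
    """Hypothetical, simplified view of the first message (RoCE side)."""
    dst_ip: str          # IP address of the second network device
    ib_transport: bytes  # IB transport header, carried through unchanged
    payload: bytes       # IB payload, carried through unchanged


@dataclass(frozen=True)
class IBMessage:
    """Hypothetical, simplified view of the second message (IB side)."""
    dlid: int            # LID of the second network device (local routing header)
    ib_transport: bytes
    payload: bytes


def ethernet_to_ib(msg: EthernetMessage, lookup_table: dict) -> IBMessage:
    """Strip the first message header and encapsulate a local routing header."""
    lid = lookup_table.get(msg.dst_ip)
    if lid is None:
        raise KeyError(f"no LID known for IP address {msg.dst_ip}")
    return IBMessage(dlid=lid, ib_transport=msg.ib_transport, payload=msg.payload)


# Assumed example values: one IP-to-LID association in the lookup table.
lookup_table = {"192.0.2.20": 0x0012}
first = EthernetMessage(dst_ip="192.0.2.20", ib_transport=b"BTH", payload=b"data")
second = ethernet_to_ib(first, lookup_table)
```

In this sketch the IB transport header and payload pass through untouched; only the outer addressing changes from an IP address to an LID.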
In a possible implementation, the step in which the gateway receives the first message from the first network device includes: the gateway receives the first message into a large-capacity buffer according to a credit flow control mechanism; and the gateway feeds back state information of the large-capacity buffer to the first network device through a pause message, so that the first network device adjusts its message transmission.
In this possible implementation, the gateway may receive messages from the first network device by combining the credit flow control mechanism of the IB network with a large-capacity buffer, which can hold more in-flight messages. The gateway may also indicate the state of the buffer to the first network device through a pause message, so that the first network device adjusts the transmission of subsequent messages, for example by reducing or stopping transmission, thereby avoiding congestion.
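One way to picture the interaction between the buffer, credits, and the pause message is the sketch below. It is not taken from the application: the class name, the fixed pause threshold, and the idea of counting credits as free buffer slots are all assumptions made for illustration.

```python
from collections import deque


class GatewayBuffer:
    """Hypothetical gateway receive buffer with a pause threshold."""

    def __init__(self, capacity: int, pause_threshold: int):
        if not 0 < pause_threshold <= capacity:
            raise ValueError("pause_threshold must be in 1..capacity")
        self.capacity = capacity
        self.pause_threshold = pause_threshold
        self.queue = deque()

    def credits(self) -> int:
        """Free slots, treated here as the credits that can still be granted."""
        return self.capacity - len(self.queue)

    def receive(self, message: bytes) -> bool:
        """Store a message if a credit is available; report whether it was stored."""
        if self.credits() == 0:
            return False
        self.queue.append(message)
        return True

    def pause_required(self) -> bool:
        """True when the gateway would send a pause message to the sender."""
        return len(self.queue) >= self.pause_threshold

    def forward_one(self):
        """Forward the oldest buffered message, freeing one slot."""
        return self.queue.popleft() if self.queue else None


buf = GatewayBuffer(capacity=4, pause_threshold=3)
for m in (b"a", b"b", b"c"):
    buf.receive(m)
paused_before = buf.pause_required()  # buffer has reached the threshold
buf.forward_one()
paused_after = buf.pause_required()   # one slot freed, below the threshold again
```

A larger capacity lets the sender keep more messages in flight before the threshold is reached, which is the benefit the paragraph above attributes to the large-capacity buffer.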
In a possible implementation, before the gateway receives the first packet from the first network device, the method further includes: the gateway applies to a subnet manager for an LID according to a route change, where the route change indicates that the first network device has joined the network; the gateway receives the LID of the first network device; the gateway obtains a response message from the second network device, where the response message includes the LID and the IP address of the second network device; and the gateway updates the lookup table based on the IP address and LID of the first network device and the IP address and LID of the second network device.
In this possible implementation, the gateway may apply to the subnet manager on the IB side for an LID for the first network device that joins the network, obtain the response packet that the second network device returns in reply to a broadcast packet of the first network device, and generate or update the lookup table according to the IP address and LID of the second network device in the response packet, which improves the feasibility of the solution.
In a possible implementation, the step in which the gateway applies to the subnet manager for the LID according to the route change includes: the gateway receives an Address Resolution Protocol (ARP) message from the first network device, where the ARP message includes the IP address of the first network device and the IP address of the second network device; and the gateway applies to the subnet manager for the LID of the first network device according to the ARP message.
In this possible implementation, the broadcast message may be an ARP message, and the route change may also be determined from the ARP message. The gateway converts the ARP message into an IB-format ARP message and sends it to the second network device, and the second network device returns an ARP response message that includes the IP address and LID of the second network device, which improves the feasibility of the solution.
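The way the lookup table is populated from ARP traffic can be sketched as follows. The class and function names are hypothetical, and the subnet manager is stood in for by a plain callable that hands out LIDs; the actual exchange with a subnet manager is not modelled.

```python
import itertools
from typing import Callable


class LookupTable:
    """Hypothetical two-way association between IP addresses and LIDs."""

    def __init__(self):
        self.ip_to_lid = {}
        self.lid_to_ip = {}

    def update(self, ip: str, lid: int) -> None:
        # Drop any stale association before recording the new one.
        old_lid = self.ip_to_lid.get(ip)
        if old_lid is not None:
            self.lid_to_ip.pop(old_lid, None)
        old_ip = self.lid_to_ip.get(lid)
        if old_ip is not None:
            self.ip_to_lid.pop(old_ip, None)
        self.ip_to_lid[ip] = lid
        self.lid_to_ip[lid] = ip


def handle_arp_request(table: LookupTable, sender_ip: str,
                       request_lid: Callable[[], int]) -> int:
    """On an ARP message from a newly joined device, obtain an LID for it."""
    lid = request_lid()  # stand-in for applying to the subnet manager
    table.update(sender_ip, lid)
    return lid


def handle_arp_response(table: LookupTable, responder_ip: str,
                        responder_lid: int) -> None:
    """Record the IP address and LID carried in the ARP response message."""
    table.update(responder_ip, responder_lid)


# Assumed example values.
lid_source = itertools.count(0x0010)
table = LookupTable()
handle_arp_request(table, "192.0.2.10", lambda: next(lid_source))
handle_arp_response(table, "192.0.2.20", 0x0012)
```

Keeping both directions of the association in one structure lets the same table serve the IP-to-LID lookup and the LID-to-IP lookup.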
In a possible implementation, after the gateway strips the first packet header of the first packet and encapsulates the second packet header to obtain the second packet, the method further includes: the gateway updates an invariant cyclic redundancy check (ICRC) and a variant cyclic redundancy check (VCRC) of the second packet.
In this possible implementation, after the gateway converts the first packet into the second packet, the gateway needs to update the ICRC and VCRC of the second packet so that the packet can still be checked for errors.
In a possible implementation manner, the first message is an ethernet message, and the second message is an IB message.
In one possible embodiment, the ethernet packet includes an ethernet header, an IP header, a UDP header, an IB transport header, an IB payload, an ICRC, and a Cyclic Redundancy Check (CRC).
In one possible implementation, the IB packet includes a local routing header, an IB transport header, an IB payload, an ICRC, and a VCRC.
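Comparing the two layouts listed above shows which fields the gateway strips, which it adds, and which it carries over. The short sketch below only restates the field lists from the text; the variable names are illustrative.

```python
# Field order as listed in the text for each message type.
ETHERNET_FIELDS = ["Ethernet header", "IP header", "UDP header",
                   "IB transport header", "IB payload", "ICRC", "CRC"]
IB_FIELDS = ["local routing header", "IB transport header",
             "IB payload", "ICRC", "VCRC"]

# Fields present in both layouts (the ICRC field is kept but its value is updated).
carried_over = [f for f in ETHERNET_FIELDS if f in IB_FIELDS]
# Fields removed when converting an Ethernet message into an IB message.
stripped = [f for f in ETHERNET_FIELDS if f not in IB_FIELDS]
# Fields added when converting an Ethernet message into an IB message.
added = [f for f in IB_FIELDS if f not in ETHERNET_FIELDS]

print(carried_over)  # ['IB transport header', 'IB payload', 'ICRC']
print(stripped)      # ['Ethernet header', 'IP header', 'UDP header', 'CRC']
print(added)         # ['local routing header', 'VCRC']
```

The IB transport header and IB payload are common to both formats, which is why the conversion only needs to replace the outer headers and refresh the checksums.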
A second aspect of the embodiments of the present application provides a packet transmission method, including: a gateway receives a third packet from a first network device, where a third packet header of the third packet includes a local routing header, the local routing header includes a local identifier (LID) of a second network device, and the second network device is the destination network device to which the first network device transmits the packet; the gateway determines the Internet Protocol (IP) address of the second network device from the LID of the second network device and a lookup table, where the lookup table includes an association between IP addresses and LIDs; the gateway strips the third packet header of the third packet and encapsulates a fourth packet header to obtain a fourth packet, where the fourth packet header includes the IP address of the second network device; and the gateway sends the fourth packet according to the IP address of the second network device.
In the second aspect, the gateway parses the third packet from the first network device to obtain the LID of the second network device and matches the IP address corresponding to that LID in the lookup table. The gateway then strips the third packet header from the third packet, encapsulates a fourth packet header that includes the IP address of the second network device to generate the fourth packet, and sends the fourth packet according to the IP address of the second network device. Transmission of the fourth packet in the Ethernet network does not require kernel copies, so the remote direct access effect of the Ethernet network can be achieved and packet transmission efficiency is improved.
In a possible implementation, the method further includes: the gateway obtains a queue pair number (QPN) of the first network device and a QPN of the second network device from a link establishment message; and the gateway obtains a User Datagram Protocol (UDP) port number of the second network device according to the QPN of the first network device and the QPN of the second network device. The lookup table further includes an association among the QPN, the UDP port number, the IP address, and the LID. The fourth packet header further includes a media access control (MAC) address and the UDP port number of the second network device, where the MAC address of the second network device is obtained by broadcasting according to the IP address of the second network device.
In this possible implementation, when the first network device and the second network device establish a transmission link, the gateway may also record the QPNs of the two devices and calculate the UDP port number of the second network device from them. The gateway may further send a broadcast packet to the RoCE network according to the IP address of the second network device, so that the second network device returns its MAC address. The gateway encapsulates the MAC address and the UDP port number of the second network device in the fourth packet, so that it can transmit the fourth packet according to the IP address, MAC address, and UDP port number of the second network device, which improves the reliability of packet transmission.
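The reverse conversion with the extended lookup table can be sketched as below. All names and values are hypothetical. In particular, the text does not say how the UDP port number is calculated from the QPNs, so the sketch simply stores a placeholder port in the table entry rather than guessing a formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableEntry:
    """Hypothetical lookup-table row for one RoCE-side device."""
    ip: str
    mac: str
    udp_port: int  # placeholder; derivation from the QPNs is not modelled


@dataclass(frozen=True)
class IBPacket:
    """Hypothetical, simplified view of the third packet (IB side)."""
    dlid: int
    ib_transport: bytes
    payload: bytes


@dataclass(frozen=True)
class EthernetPacket:
    """Hypothetical, simplified view of the fourth packet (RoCE side)."""
    dst_mac: str
    dst_ip: str
    udp_port: int
    ib_transport: bytes
    payload: bytes


def ib_to_ethernet(pkt: IBPacket, table: dict) -> EthernetPacket:
    """Strip the local routing header and encapsulate Ethernet, IP, and UDP fields."""
    entry = table.get(pkt.dlid)
    if entry is None:
        raise KeyError(f"no entry known for LID {pkt.dlid:#06x}")
    return EthernetPacket(entry.mac, entry.ip, entry.udp_port,
                          pkt.ib_transport, pkt.payload)


# Assumed example values.
table = {0x0021: TableEntry("192.0.2.10", "02:00:00:00:00:10", 49152)}
third = IBPacket(dlid=0x0021, ib_transport=b"BTH", payload=b"data")
fourth = ib_to_ethernet(third, table)
```

Here the LID is the lookup key, and the entry supplies every destination field that the fourth packet header needs.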
In a possible implementation manner, the step of sending, by the gateway according to the IP address of the second network device, the fourth packet includes: the gateway sends a fourth message in a high-capacity buffer area according to the credit flow control mechanism and the IP address of the second network equipment; the gateway feeds back the state information of the buffer with large capacity to the second network equipment through the pause message, so that the second network equipment adjusts the message transmission.
In a possible implementation manner, before the gateway receives the third packet from the first network device in the above steps, the method further includes: the gateway applies for LID from the subnet manager according to the route change, and the route change indicates that the second network equipment is added into the network; the gateway receiving the LID of the second network device; the gateway acquires a response message from the first network equipment, wherein the response message comprises the LID and the IP address of the first network equipment; the gateway updates the lookup table based on the IP address and LID of the first network device and the IP address and LID of the second network device.
In a possible implementation manner, the step of applying, by the gateway, the LID to the subnet manager according to the route change includes: the gateway receives an Address Resolution Protocol (ARP) message from the second network equipment, wherein the ARP message comprises an IP address of the first network equipment and an IP address of the second network equipment; and the gateway applies for the LID of the second network equipment to the subnet manager according to the ARP message.
In a possible implementation, after the gateway strips the third packet header of the third packet and encapsulates the fourth packet header to obtain the fourth packet, the method further includes: the gateway updates the ICRC and the CRC of the fourth packet.
In a possible implementation, the third message is an IB message, and the fourth message is an Ethernet message.
In one possible embodiment, the ethernet packet includes an ethernet header, an IP header, a UDP header, an IB transport header, an IB payload, an ICRC, and a CRC.
In one possible implementation, the IB packet includes a local routing header, an IB transport header, an IB payload, an ICRC, and a VCRC.
A third aspect of the embodiments of the present application provides a communication apparatus, including: a receiving unit, configured to receive a first packet from a first network device, where a first packet header of the first packet includes an Internet Protocol (IP) address of a second network device, and the second network device is the destination network device to which the first network device transmits the packet; a determining unit, configured to determine a local identifier (LID) of the second network device from the IP address of the second network device and a lookup table, where the lookup table includes an association between IP addresses and LIDs; an encapsulating unit, configured to strip the first packet header of the first packet and encapsulate a second packet header to obtain a second packet, where the second packet header includes a local routing header, and the local routing header includes the LID of the second network device; and a sending unit, configured to send the second packet according to the LID of the second network device.
The communication device is configured to perform the method of the first aspect or any one of the implementation manners of the first aspect.
A fourth aspect of the embodiments of the present application provides a communication apparatus, including: a receiving unit, configured to receive a third packet from a first network device, where a third packet header of the third packet includes a local routing header, the local routing header includes a local identifier (LID) of a second network device, and the second network device is the destination network device to which the first network device transmits the packet; a determining unit, configured to determine the Internet Protocol (IP) address of the second network device from the LID of the second network device and a lookup table, where the lookup table includes an association between IP addresses and LIDs; an encapsulating unit, configured to strip the third packet header of the third packet and encapsulate a fourth packet header to obtain a fourth packet, where the fourth packet header includes the IP address of the second network device; and a sending unit, configured to send the fourth packet according to the IP address of the second network device.
The communication device is configured to perform the method of the second aspect or any one of the embodiments of the second aspect.
A fifth aspect of an embodiment of the present application provides a communication device, including: a processor for executing instructions stored in the memory to cause a communication device to perform the method provided by the first aspect or any of the alternatives of the first aspect, and a communication interface for receiving or transmitting an indication. For specific details of the communication device provided by the fifth aspect, reference may be made to the first aspect or any optional manner of the first aspect, and details are not described here.
A sixth aspect of the embodiments of the present application provides a communication device, including: a processor for executing instructions stored in the memory to cause the communication device to perform the method provided by the second aspect or any of the alternatives of the second aspect, and a communication interface for receiving or transmitting the indication. The details of the communication device provided by the sixth aspect may be referred to the second aspect or any optional manner of the second aspect, and are not described here again.
A seventh aspect of embodiments of the present application provides a computer-readable storage medium, which stores a program, and when the computer executes the program, the computer performs the method provided in the first aspect or any one of the alternatives of the first aspect.
An eighth aspect of embodiments of the present application provides a computer-readable storage medium, which stores a program, and when the computer executes the program, the computer performs the method provided in the second aspect or any of the alternatives of the second aspect.
A ninth aspect of embodiments of the present application provides a computer program product, which when executed on a computer, executes the method provided in the first aspect or any one of the alternatives of the first aspect.
A tenth aspect of embodiments of the present application provides a computer program product, which when executed on a computer, executes the method provided in the second aspect or any one of the alternatives of the second aspect.
Drawings
FIG. 1 is a block diagram of an HPC system provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of a short-distance scenario inside a data center according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a long-distance scenario of interconnection between supercomputing centers provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of an embodiment of a message transmission method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of another embodiment of a message transmission method according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a gateway according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a gateway according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a gateway according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a communication device according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a communication device according to an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a communication device according to an embodiment of the present application;
FIG. 12 is another schematic structural diagram of a communication device according to an embodiment of the present application.
Detailed Description
The embodiment of the application provides a message transmission method and a communication device, which are used for improving the message transmission rate.
Embodiments of the present application will be described with reference to the accompanying drawings, and it is to be understood that the described embodiments are only some embodiments of the present application, and not all embodiments of the present application. As can be known to those skilled in the art, with the development of technology and the emergence of new scenarios, the technical solution provided in the embodiments of the present application is also applicable to similar technical problems.
The terms "first," "second," and the like in the description and in the claims of the present application and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Moreover, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The tables in the present application can be split and merged, but are not limited thereto, and only one example is given here.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present application. It will be understood by those skilled in the art that the present application may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present application.
As data volumes have grown dramatically, the computing power that applications demand from processing systems has expanded exponentially, and demand for high performance computing (HPC) applications has risen sharply. HPC is mainly used in scientific research institutions such as universities and research institutes, and in fields such as the petroleum industry, medical biology, computational chemistry, automobile and aerospace design, building structure design, and three-dimensional graphics computation. HPC refers to computing systems and environments that use many processors (as part of a single machine) or several computers organized in a cluster (operating as a single computing resource). For a large-scale computing task, a parallel algorithm splits the task into parts, distributes them to different nodes in the cluster for parallel computation, and then gathers the partial results to obtain the final result quickly. The performance of an HPC system is closely related to the computing capability and storage performance of the compute nodes and to the performance of the network that interconnects the nodes. InfiniBand (IB) has been the first choice for supercomputing interconnects since its introduction because of its high performance and low latency. However, with the advent of Ethernet-based RDMA over Converged Ethernet version 2 (RoCEv2) networks, RoCEv2 is increasingly adopted for supercomputing interconnects because it is fully compatible with Ethernet IP networks and supports the remote direct memory access (RDMA) protocol.
The ethernet packet is a packet transmitted in the ethernet network, and is transmitted in the ethernet network based on the IP address in the IP header.
The IB packet is a packet transmitted in the IB network, the IB network does not sense an IP address, a network device in the IB network is assigned with a Local Identifier (LID), and the IB packet is transmitted in the IB network based on the LID in the local routing header.
At present, there are data center scenarios in which an IB network and a RoCE network coexist and must interact, for example when storage and computing use different high-speed interconnection networks, so devices and apparatuses that support interconnection between the IB network and the RoCE network are required. FIG. 1 is a block diagram of the HPC system provided by an embodiment of the present application. In an HPC cluster environment, compute nodes and storage nodes are interconnected by a high-performance RDMA network, currently mainly IB and RoCEv2. For example, in the hybrid IB and RoCE interconnection shown in FIG. 1, an Ethernet message is converted into an IB message.
Most HPC clusters are networked with IB and RoCE, so data must be transmitted across IB and RoCE networks, and the IB network and the RoCE network need to interwork efficiently.
The method for converting between IB and Ethernet RoCE in the embodiments of the present application can be used where data must be exchanged efficiently across an IB network and a RoCE network. There are two main scenarios: a short-distance scenario inside a data center and a long-distance scenario between supercomputing centers. FIG. 2 is a schematic diagram of the short-distance scenario inside a data center provided in an embodiment of the present application. The IB network and the RoCE network communicate through a gateway: an IB message from the IB network is converted by the gateway into an Ethernet message for transmission in the RoCE network, or an Ethernet message from the RoCE network is converted by the gateway into an IB message for transmission in the IB network. FIG. 3 is a schematic diagram of the long-distance scenario of interconnection between supercomputing centers provided in an embodiment of the present application, which includes IB network 1, gateway 1, IB network 2, and gateway 2. Gateway 1 converts an IB message from IB network 1 into an Ethernet message and transmits it to gateway 2, and gateway 2 converts the Ethernet message into an IB message for transmission in IB network 2. The embodiments of the present application take the short-distance scenario as an example.
In the prior art, a gateway device uses the Internet Protocol over InfiniBand (IPoIB) technology, which runs on top of IB, to carry Ethernet IP messages over an IB network. A TCP/IP message is encapsulated directly in an IB message by a tunneling technique, so the message still passes through kernel copies and the software protocol stack and cannot benefit from the kernel bypass and zero copy features of the IB network; latency is high and CPU utilization is high. In addition, because a conventional message is encapsulated inside an IB message, the high bearing efficiency of the IB network cannot be exploited, and encapsulation efficiency drops markedly, especially for small messages.
In order to solve the foregoing problem, an embodiment of the present application provides a message transmission method, which is as follows.
Referring to fig. 4, an embodiment of the message transmission method provided in the present application includes:
401. The first network device sends a first message to the gateway.
In this embodiment, the first network device is a network device on a RoCE network side, and the second network device is a network device on an IB network side. Specifically, the first network device may send a first message including the data to the second network device through the gateway, where the first message is an ethernet message. The first packet includes a Media Access Control (MAC) address, an Internet Protocol (IP) address, a User Datagram Protocol (UDP) port number of the first network device, and an MAC address and an IP address of the second network device.
The packet format of the first packet is shown in Table 1 below. The first packet includes an Ethernet (ETH) header, an IP header, a UDP header, an IB transport header, an IB payload, an Invariant Cyclic Redundancy Check (ICRC) field, and a Cyclic Redundancy Check (CRC) field. The ETH header stores the MAC addresses of the first network device and the second network device, the IP header stores the IP addresses of the first network device and the second network device, the UDP header stores the UDP port number of the first network device, and the IB transport header stores the QPN of the first network device and the QPN of the second network device. The ICRC and CRC fields are used to check the data to ensure correct transmission.
TABLE 1
ETH header | IP header | UDP header | IB transport header | IB payload | ICRC | CRC
402. The gateway determines the LID of the second network device in conjunction with the lookup table based on the IP address of the second network device.
In this embodiment, after receiving the first packet, the gateway parses it to obtain the IP address of the second network device from the IP header. The gateway then matches this IP address against the lookup table, which records the association between IP addresses and LIDs, to obtain the LID of the second network device.
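The matching in step 402 can be pictured as a keyed lookup from destination IP address to LID. The following is a minimal sketch only: the dictionary layout, the function name `resolve_lid`, and the addresses and LID values are assumptions made for illustration and do not come from the application, which leaves the table implementation (for example, a hardware lookup table) open.

```python
from typing import Dict, Optional

# Hypothetical lookup table: destination IP address -> destination LID.
# The addresses and LID values below are made up for illustration.
ip_to_lid: Dict[str, int] = {
    "192.0.2.10": 0x0011,
    "192.0.2.11": 0x0012,
}


def resolve_lid(dst_ip: str, table: Dict[str, int]) -> Optional[int]:
    """Return the LID associated with dst_ip, or None when no entry matches."""
    return table.get(dst_ip)
```

A miss returns `None` here; how a gateway handles an unknown destination (for example, by triggering the table update described in the optional steps) is outside this sketch.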
Optionally, before step 401, the gateway applies to the subnet manager for an LID according to a route change; the gateway receives the LID of the first network device; the gateway acquires a response message from the second network device; and the gateway updates the lookup table based on the IP address and LID of the first network device and the IP address and LID of the second network device. Specifically, when the first network device joins the network, the gateway needs to add a routing path; that is, the gateway detects a route change and applies for an LID to the subnet manager, which is located in the IB network. The subnet manager randomly assigns an LID to the gateway, and the gateway associates this LID with the IP address of the first network device. The gateway may further receive a broadcast packet from the first network device, convert it into an IB packet, and forward it to the second network device, so that the second network device returns a response packet containing its IP address and LID. The gateway then associates the IP address and LID of the first network device with the IP address and LID of the second network device and stores them in the lookup table, or updates the lookup table accordingly.
Optionally, the gateway may apply to the subnet manager for the LID according to the route change as follows: the gateway receives an Address Resolution Protocol (ARP) packet from the first network device and, according to the ARP packet, applies to the subnet manager for the LID of the first network device. Specifically, the gateway may identify a terminal newly added on the Ethernet side by receiving an ARP packet, or a packet of another protocol, on the Ethernet side. When it receives the ARP packet, the gateway obtains the IP address in the ARP packet and applies to the subnet manager for an LID on behalf of the first network device. Correspondingly, the gateway converts the ARP packet into an IB-side ARP packet and sends it to the second network device, and the response packet sent by the second network device is an ARP response packet.
Optionally, after the lookup table is generated or updated, aging may be configured, using the source IP address and destination IP address in the lookup table as the criterion. In one option, if no message containing the source IP address and destination IP address is received within a set time, the association between that source IP address and destination IP address is deleted from the lookup table. In another option, if no such message is received within a set time, the entry enters an aging counting process, and the association is deleted when the aging counter reaches a preset value. If an Ethernet packet containing the specific IP address is received while the aging count is in progress, the aging counting process is restarted.
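The counter-based aging option can be sketched as follows. This is an illustrative model under stated assumptions: the class name `AgingTable`, the `touch`/`tick` methods, and the idea of driving the counter from a periodic tick are hypothetical and are not specified in the application.

```python
from typing import Dict, List, Tuple

FlowKey = Tuple[str, str]  # (source IP address, destination IP address)


class AgingTable:
    """Tracks one aging counter per (source IP, destination IP) association."""

    def __init__(self, max_age: int) -> None:
        self.max_age = max_age  # preset value at which an entry is deleted
        self.counters: Dict[FlowKey, int] = {}

    def touch(self, src_ip: str, dst_ip: str) -> None:
        """A packet for this pair was seen: (re)start its aging count."""
        self.counters[(src_ip, dst_ip)] = 0

    def tick(self) -> List[FlowKey]:
        """Advance one aging interval and return the associations deleted."""
        expired: List[FlowKey] = []
        for key in list(self.counters):
            self.counters[key] += 1
            if self.counters[key] >= self.max_age:
                expired.append(key)
                del self.counters[key]
        return expired
```

In this sketch a call to `touch` resets the counter to zero, which models restarting the aging count when a matching packet arrives during the counting period.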
403. The gateway strips the first message header of the first message and encapsulates the second message header to obtain the second message.
In this embodiment, after obtaining the LID of the second network device from the lookup table according to the IP address of the second network device, the gateway strips the first packet header from the first packet. The first packet header includes, for example, the ETH header, IP header, and UDP header in Table 1. The gateway then encapsulates the second packet header onto the stripped first packet to form the second packet. The second packet header includes the LID of the second network device, and the second packet is an IB packet. The conversion between IB packets and Ethernet packets is implemented by means of a hardware lookup table. The format of the second packet is shown in Table 2: the second packet header includes a Local Route Header (LRH), whose fields include the LIDs of the first network device and the second network device.
TABLE 2
Local Route Header | IB transport header | IB payload | ICRC | VCRC
After the gateway strips the first header and encapsulates the second header, the gateway may also update the UDP port number in the first header and the QPN in the IB transport header to the lookup table.
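The strip-and-encapsulate operation of step 403 can be illustrated with a simplified packet model. The sketch below is an assumption-laden illustration: the class names `RoceV2Packet` and `IbPacket`, the function `ethernet_to_ib`, and the representation of headers as opaque byte strings are all hypothetical, and the check fields (ICRC, VCRC) are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class RoceV2Packet:
    """Simplified Ethernet-side packet; headers are opaque byte strings."""
    eth_header: bytes
    ip_header: bytes
    udp_header: bytes
    ib_transport_header: bytes
    ib_payload: bytes


@dataclass
class IbPacket:
    """Simplified IB-side packet; the LRH is reduced to its two LIDs."""
    src_lid: int
    dst_lid: int
    ib_transport_header: bytes
    ib_payload: bytes


def ethernet_to_ib(pkt: RoceV2Packet, src_lid: int, dst_lid: int) -> IbPacket:
    """Drop the ETH/IP/UDP headers and prepend a local route header.

    The IB transport header and payload are carried over unchanged.
    Recomputing the check fields is not modeled here.
    """
    return IbPacket(src_lid, dst_lid, pkt.ib_transport_header, pkt.ib_payload)
```

The point of the sketch is that only the outer headers change: the IB transport header and payload pass through the gateway untouched.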
The format of the lookup table may be as shown in Table 3 below, where Src denotes the source device, Dst denotes the destination device, and Src UDP port1 denotes the UDP port number of the second network device.
TABLE 3
Src IP1 | Dst IP1 | Src LID1 | Dst LID1 | Dst QPN1 | Src QPN1 | Src UDP port1
Src IP2 | Dst IP2 | Src LID2 | Dst LID2 | Dst QPN2 | Src QPN2 | Src UDP port2
Src IP3 | Dst IP3 | Src LID3 | Dst LID3 | Dst QPN3 | Src QPN3 | Src UDP port3
Src IPn | Dst IPn | Src LIDn | Dst LIDn | Dst QPNn | Src QPNn | Src UDP portn
Optionally, before sending the second packet to the second network device, the gateway may further update the Invariant Cyclic Redundancy Check (ICRC) and the Variant Cyclic Redundancy Check (VCRC) of the second packet. Specifically, after converting the first packet into the second packet, the gateway needs to modify the ICRC and VCRC of the second packet, so as to increase the code distance and the error detection and correction capability of the overall coding scheme.
In this way, the Ethernet message is converted into an IB message, the ICRC and VCRC of the message are updated, and the UDP port number is recorded. In addition, the service level (SL) field of the IB message is mapped from the type of service (TOS)/differentiated services code point (DSCP) field of RoCEv2, so that quality of service (QoS) information is carried across the conversion. DSCP encodes priority using 6 bits of the TOS byte in the IP header of each packet, leaving the remaining 2 bits unused. Table 4 shows the value ranges of the DSCP field on the Ethernet side and the corresponding SL field values on the IB side.
TABLE 4
RoCEv2 side DSCP (6 bits) | IB side SL (4 bits) | Priority
0-7 | 0 | Priority 0
8-15 | 1 | Priority 1
16-23 | 2 | Priority 2
24-31 | 3 | Priority 3
32-39 | 4 | Priority 4
40-47 | 5 | Priority 5
48-55 | 6 | Priority 6
56-63 | 7 | Priority 7
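The ranges in Table 4 group the 64 DSCP values into eight blocks of eight, so the mapping is equivalent to taking the top three bits of the 6-bit DSCP value. A minimal sketch follows; the function names `dscp_from_tos` and `dscp_to_sl` are hypothetical, and the bit-shift formulation is an observation about the table rather than an implementation described in the application.

```python
def dscp_from_tos(tos: int) -> int:
    """Extract the 6-bit DSCP value from the upper six bits of the TOS byte."""
    return (tos & 0xFF) >> 2


def dscp_to_sl(dscp: int) -> int:
    """Map a DSCP value (0-63) to an SL value (0-7) following Table 4."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    # Each block of eight consecutive DSCP values shares one SL.
    return dscp >> 3
```

For example, a TOS byte of `0xB8` carries DSCP 46, which falls in the 40-47 row and maps to SL 5.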
404. The gateway sends the second message according to the LID of the second network device.
In this embodiment, after generating the second packet, the gateway may send the second packet to the second network device in the IB network according to the LID of the second network device.
For a long-distance interconnection scenario between supercomputing centers, the local gateway and the remote gateway need to negotiate a common subnet manager; that is, an LID assigned by the subnet manager needs to be unique across the IB networks in which the local gateway and the remote gateway are located. The local gateway converts the IB message into an Ethernet message, and the IP address and MAC address may be configured at the local gateway.
The local gateway receives messages from the remote gateway using two-level flow control:
Optionally, the gateway may receive the first packet from the first network device as follows: the gateway receives the first packet in a large-capacity buffer according to a credit flow control mechanism, and feeds back state information of the large-capacity buffer to the first network device through a Pause message, so that the first network device adjusts its message transmission. Specifically, the gateway uses the native credit flow control mechanism of the IB network together with a large-capacity buffer organized as first in, first out (FIFO) queues at virtual lane (VL) granularity. The FIFO watermark is configurable, the FIFO occupancy is monitored in real time, and the large-capacity buffer can absorb more in-flight messages. When the remote Ethernet side sends a large amount of data to the local gateway, the local gateway sends the FIFO state information to the Ethernet port of the remote gateway through an Ethernet flow control Pause message. The receiving end parses the Pause message and, in combination with the watermark setting, judges the congestion condition of the peer and adjusts its sending, which ensures efficient transmission and avoids congestion at the peer. Specifically, the priority-based flow control (PFC) function of the port may be used to apply flow control to packets based on 802.1p priority, with the pause time field in the PFC frame changed to carry the occupancy of the buffer.
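The interaction between a buffer with a configurable threshold and a pause signal can be sketched as a small model. Everything named here is an assumption for illustration: the class `WatermarkFifo`, the use of separate high and low thresholds (hysteresis), and the boolean `paused` flag standing in for the state that a Pause message would convey are not details given in the application.

```python
from collections import deque
from typing import Deque


class WatermarkFifo:
    """FIFO buffer that raises a pause flag when occupancy crosses a threshold."""

    def __init__(self, capacity: int, high_watermark: int, low_watermark: int) -> None:
        self.capacity = capacity
        self.high = high_watermark  # at or above this depth, ask the sender to pause
        self.low = low_watermark    # at or below this depth, let the sender resume
        self.queue: Deque[bytes] = deque()
        self.paused = False

    def _update_state(self) -> None:
        depth = len(self.queue)
        if depth >= self.high:
            self.paused = True
        elif depth <= self.low:
            self.paused = False
        # Between the two thresholds the previous state is kept (hysteresis).

    def enqueue(self, message: bytes) -> bool:
        """Store a message; return False if the buffer is already full."""
        if len(self.queue) >= self.capacity:
            return False
        self.queue.append(message)
        self._update_state()
        return True

    def dequeue(self) -> bytes:
        """Remove and return the oldest message."""
        message = self.queue.popleft()
        self._update_state()
        return message
```

Using two thresholds rather than one is a design choice of this sketch: it avoids toggling the pause state on every packet when the occupancy hovers around a single threshold.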
In the technical solution of this embodiment, the gateway parses the first message from the first network device to obtain the IP address of the second network device, matches the LID corresponding to that IP address in the lookup table, strips the first message header from the first message, and encapsulates a second message header that includes the LID of the second network device to generate the second message. The second message is then sent according to the LID of the second network device. Because the second message does not need to be copied through the kernel during transmission in the IB network, the remote direct access effect of the IB network can be achieved and message transmission efficiency is improved.
The above describes a method for converting an ethernet message into an IB message by a gateway, and the following describes a method for converting an IB message into an ethernet message by a gateway, where the first network device is a network device of an IB network and the second network device is a network device of an ethernet network.
Referring to fig. 5, fig. 5 is a schematic diagram of another embodiment of a message transmission method according to an embodiment of the present application, and the method is as follows.
501. The first network device sends a third message to the gateway.
In this embodiment of the application, the third packet is an IB packet, whose format may refer to the IB packet format in step 403. The third packet header of the third packet includes a Local Route Header, which includes the LID of the first network device and the LID of the second network device; the third packet header further includes the QPN of the first network device and the QPN of the second network device.
502. The gateway determines the internet protocol, IP, address of the second network device in conjunction with the lookup table based on the LID of the second network device.
For the manner of looking up the IP address according to the LID in step 502 and the manner of updating the lookup table, refer to the description in step 402 of looking up the LID according to the IP address and updating the lookup table; details are not repeated here.
503. The gateway strips the third message header of the third message and encapsulates a fourth message header to obtain a fourth message.
In this embodiment of the application, the fourth packet includes an ETH header, an IP header, a UDP header, an IB transport header, an IB payload, and a CRC field. The gateway receives the IB packet and parses the source LID and destination LID in the LRH field, as well as the destination QPN. It matches the source LID, destination LID, and destination QPN against the lookup table to obtain the source IP address and destination IP address for the IP header and the UDP port number for the UDP header of the Ethernet packet, and it places the IB transport header and IB payload directly into the corresponding fields of the RoCEv2 packet. The gateway may further send a broadcast packet to the RoCE network according to the destination IP address, so that the second network device returns its MAC address; this is the destination MAC address and is stored in the ETH header. Because the MAC address in the ETH header of an Ethernet packet is hop-by-hop, the MAC address of the gateway's Ethernet-side interface may be used as the source MAC address. The ICRC and CRC fields of the packet are also updated and written into the corresponding check fields. In this way, the gateway implements the conversion from the IB packet to the Ethernet packet. The fourth packet may refer to the format of the first packet in the embodiment of fig. 4, and details are not repeated here.
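The reverse lookup in step 503 is keyed on values taken from the IB packet and yields the fields needed for the Ethernet headers. A minimal sketch, assuming a dictionary keyed by the tuple (source LID, destination LID, destination QPN); the names `EthFields` and `lookup_eth_fields` and all addresses, LIDs, QPNs, and port numbers in the usage note are hypothetical.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class EthFields(NamedTuple):
    """Fields recovered from the lookup table for the Ethernet-side headers."""
    src_ip: str
    dst_ip: str
    udp_port: int


IbKey = Tuple[int, int, int]  # (source LID, destination LID, destination QPN)


def lookup_eth_fields(
    src_lid: int, dst_lid: int, dst_qpn: int, table: Dict[IbKey, EthFields]
) -> Optional[EthFields]:
    """Return the IP addresses and UDP port for an IB flow, or None on a miss."""
    return table.get((src_lid, dst_lid, dst_qpn))
```

With a table such as `{(0x21, 0x11, 0x000123): EthFields("192.0.2.20", "192.0.2.10", 49152)}`, a packet carrying those LIDs and that destination QPN resolves to the stored addresses and port, while any other combination returns `None`.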
Optionally, the gateway obtains a QPN of the first network device and a QPN of the second network device according to the link establishment message; and the gateway acquires the UDP port number of the second network device according to the QPN of the first network device and the QPN of the second network device. Specifically, when the first network device and the second network device establish a connection, the first network device may send a link establishment message to the gateway, the gateway converts the link establishment message into an IB format and sends the IB format to the second network device, and the second network device feeds back a link establishment response message; or the second network device sends the link establishment message to the gateway, the gateway converts the link establishment message into an ethernet format and sends the ethernet format to the first network device, and the first network device feeds back the link establishment response message. The gateway may obtain a QPN of the first network device and a QPN of the second network device according to the link establishment message, and calculate a UDP port number of the second network device by combining the QPN of the first network device and the QPN of the second network device according to a mapping relationship between the QPN and the UDP port number, and the gateway may store a corresponding relationship between the QPN of the first network device, the QPN of the second network device, and the UDP port number of the second network device in a lookup table. The gateway may determine, according to the third packet, the MAC address of the first network device and the MAC address of the second network device, the IP address of the first network device and the IP address of the second network device, the QPN of the first network device and the QPN of the second network device, and the UDP port number of the second network device, which are required by the fourth packet header.
504. The gateway sends the fourth message according to the IP address of the second network device.
In this embodiment, the gateway may send the fourth packet in the manner used for Ethernet packets, according to the IP address of the second network device carried in the fourth packet.
Optionally, the gateway sends the fourth message according to the IP address of the second network device using two-level flow control: the gateway sends the fourth message through a large-capacity buffer according to the credit flow control mechanism and the IP address of the second network device, and feeds back state information of the large-capacity buffer to the second network device through a Pause message, so that the second network device adjusts its message transmission. For details of two-level flow control transmission, refer to the description in step 404; details are not repeated here.
In the technical solution of this embodiment, the gateway parses the third message from the first network device to obtain the LID of the second network device, matches the IP address corresponding to that LID in the lookup table, strips the third message header from the third message, and encapsulates a fourth message header that includes the IP address of the second network device to generate the fourth message. The fourth message is then sent according to the IP address of the second network device. Because the fourth message does not need to be copied through the kernel during transmission in the Ethernet network, the remote direct access effect can be achieved in the Ethernet network and message transmission efficiency is improved.
The gateway in the embodiments of the present application may have the structure shown in fig. 6, in which the gateway includes a switch chip and a processing chip. The switch chip mainly implements the basic forwarding function. The processing chip may be a CPU, an FPGA, or the like, and is responsible for establishing and maintaining the lookup table required for converting between messages of different protocols, and for performing the conversion.
Alternatively, the gateway in the embodiments of the present application may have the structure shown in fig. 7, in which the gateway includes a switch chip, and the switch chip includes a receiving module, a processing module, and a sending module. The receiving module and the sending module implement the basic forwarding function, and the processing module is responsible for establishing and maintaining the lookup table required for converting between messages of different protocols, and for performing the conversion.
In this embodiment, a structure of a processing chip may refer to a schematic structural diagram of the processing chip shown in fig. 8, where the processing chip includes an ethernet interface, an encapsulation/decapsulation module, a buffer module, an IB credit flow control module, an IB interface, a quality of service module, and a management module.
The Ethernet interface is used for receiving or outputting Ethernet messages.
The encapsulation/decapsulation module is used for converting between Ethernet messages and IB messages.
The buffer module is used for storing in-flight messages and sending the state information of the buffer module to the peer through the RoCE network.
The IB credit flow control module is used for adjusting message transmission on the IB side.
The quality of service module is used for determining the virtual port used for transmission between the Ethernet side and the IB side.
The management module is used for managing LIDs; for example, it applies to the subnet manager for an LID and assigns the LID to a node.
Having described the message transmission method, a communication apparatus that can execute the message transmission method is described below.
Referring to fig. 9, which is a schematic structural diagram of a communication device according to an embodiment of the present application, the communication device 90 includes:
a receiving unit 901, configured to receive a first packet from a first network device, where a first packet header of the first packet includes an internet protocol IP address of a second network device, and the second network device is a destination network device for transmitting a packet by the first network device;
a determining unit 902, configured to determine, according to an IP address of a second network device, a local identifier LID of the second network device in combination with a lookup table, where the lookup table includes an association relationship between the IP address and the LID;
an encapsulating unit 903, configured to strip a first packet header of the first packet and encapsulate a second packet header to obtain a second packet, where the second packet header includes a local routing header, and the local routing header includes an LID of the second network device;
a sending unit 904, configured to send the second packet according to the LID of the second network device.
Optionally, the receiving unit 901 is specifically configured to:
receiving a first message in a large-capacity buffer area according to a credit flow control mechanism;
and feeding back the state information of the buffer with large capacity to the first network equipment through the pause message so as to enable the first network equipment to adjust message transmission.
Optionally, the sending unit 904 is further configured to:
applying LID to a subnet manager according to the route change, wherein the route change indicates that the first network equipment is added into the network;
the receiving unit 901 is further configured to:
receiving the LID of a first network device;
the obtaining unit 905 is further configured to:
acquiring a response message from the second network equipment, wherein the response message comprises the LID and the IP address of the second network equipment;
the communication device further comprises an updating unit 906, wherein the updating unit 906 is specifically configured to:
the lookup table is updated based on the IP address and LID of the first network device and the IP address and LID of the second network device.
Optionally, the sending unit 904 is further configured to:
receiving an Address Resolution Protocol (ARP) message from first network equipment, wherein the ARP message comprises an IP address of the first network equipment and an IP address of second network equipment;
and applying for the LID of the first network equipment to the subnet manager according to the ARP message.
Optionally, the updating unit 906 is further configured to:
and updating the invariant cyclic redundancy check (ICRC) and the variant cyclic redundancy check (VCRC) of the second message.
Optionally, the first message is an ethernet message, and the second message is an IB message.
Optionally, the ethernet packet includes an ethernet header, an IP header, a UDP header, an IB transmission header, an IB payload, an ICRC, and a CRC.
Optionally, the IB packet includes a local routing header, an IB transport header, an IB payload, an ICRC, and a VCRC.
Referring to fig. 10, which is another schematic structural diagram of a communication device according to an embodiment of the present application, the communication device 100 includes:
a receiving unit 1001, configured to receive a third packet from a first network device, where a third packet header of the third packet includes a local routing header, the local routing header includes a local identifier LID of a second network device, and the second network device is a destination network device for the first network device to transmit the packet;
a determining unit 1002, configured to determine, according to the LID of the second network device, an internet protocol IP address of the second network device in combination with a lookup table, where the lookup table includes an association relationship between the IP address and the LID;
an encapsulating unit 1003, configured to strip the third packet header of the third packet and encapsulate the fourth packet header to obtain a fourth packet, where the fourth packet header includes an IP address of the second network device;
a sending unit 1004, configured to send the fourth packet according to the IP address of the second network device.
Optionally, the communication device 100 further includes an obtaining unit 1005, where the obtaining unit 1005 is specifically configured to:
acquiring a QPN of the first network equipment and a QPN of the second network equipment according to the link establishment message;
obtaining the User Datagram Protocol (UDP) port number of the second network device according to the QPN of the first network device and the QPN of the second network device, where the lookup table further includes the association among the QPN, the UDP port number, the IP address, and the LID, the fourth message header further includes the media access control (MAC) address and the UDP port number of the second network device, and the MAC address of the second network device is obtained by broadcasting according to the IP address of the second network device.
Optionally, the sending unit 1004 is specifically configured to:
sending a fourth message in a large-capacity buffer area according to the credit flow control mechanism and the IP address of the second network equipment;
and feeding back the state information of the buffer with large capacity to the second network equipment through the pause message so as to enable the second network equipment to adjust message transmission.
Optionally, the sending unit 1004 is further configured to:
applying for LID from the subnet manager according to the route change, wherein the route change indicates that the second network equipment is added into the network;
the receiving unit 1001 is further configured to:
receiving the LID of the second network device;
the acquisition unit 1005 is further configured to:
acquiring a response message from the first network equipment, wherein the response message comprises the LID and the IP address of the first network equipment;
the communication device 100 further includes an updating unit 1006, where the updating unit 1006 is specifically configured to:
the lookup table is updated based on the IP address and LID of the first network device and the IP address and LID of the second network device.
Optionally, the sending unit 1004 is further configured to:
receiving an Address Resolution Protocol (ARP) message from the second network equipment, wherein the ARP message comprises an IP address of the first network equipment and an IP address of the second network equipment;
and applying for the LID of the second network equipment to the subnet manager according to the ARP message.
Optionally, the updating unit 1006 is further configured to:
and updating the invariant cyclic redundancy check (ICRC) and the cyclic redundancy check (CRC) of the fourth message.
Optionally, the third message is an IB message, and the fourth message is an Ethernet message.
Optionally, the ethernet packet includes an ethernet header, an IP header, a UDP header, an IB transmission header, an IB payload, an ICRC, and a CRC.
Optionally, the IB packet includes a local routing header, an IB transport header, an IB payload, an ICRC, and a VCRC.
Fig. 11 is a schematic diagram illustrating a possible logical structure of a communication device 110 according to an embodiment of the present application. The communication device 110 includes a processor 1101, a communication interface 1102, a storage system 1103, and a bus 1104. The processor 1101, the communication interface 1102, and the storage system 1103 are interconnected by the bus 1104. In an embodiment of the present application, the processor 1101 is configured to control and manage the actions of the communication device 110; for example, the processor 1101 is configured to perform the steps performed by the gateway in the method embodiment of fig. 4. The communication interface 1102 is used to support communication by the communication device 110. The storage system 1103 is used to store program code and data of the communication device 110.
The processor 1101 may be a central processing unit, a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor 1101 may also be a combination that implements computing functions, for example, a combination of one or more microprocessors, or a combination of a digital signal processor and a microprocessor. The bus 1104 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in fig. 11, but this does not mean that there is only one bus or one type of bus.
The receiving unit 901 and the sending unit 904 in the communication apparatus 90 correspond to a communication interface 1102 in the communication device 110, and the determining unit 902, the encapsulating unit 903, the obtaining unit 905, and the updating unit 906 in the communication apparatus 90 correspond to a processor 1101 in the communication device 110.
The communication device 110 of this embodiment may correspond to the gateway in the embodiment of the method in fig. 4, and the communication interface 1102 in the communication device 110 may implement the functions of the gateway and/or various steps implemented in the embodiment of the method in fig. 4, which are not described herein again for brevity.
Fig. 12 is a schematic diagram illustrating a possible logical structure of a communication device 120 according to an embodiment of the present application. The communication device 120 includes a processor 1201, a communication interface 1202, a storage system 1203, and a bus 1204. The processor 1201, the communication interface 1202, and the storage system 1203 are interconnected by the bus 1204. In an embodiment of the present application, the processor 1201 is configured to control and manage the actions of the communication device 120; for example, the processor 1201 is configured to perform the steps performed by the gateway in the method embodiment of fig. 5. The communication interface 1202 is used to support communication by the communication device 120. The storage system 1203 is used to store program code and data of the communication device 120.
The processor 1201 may be, for example, a central processing unit, a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor 1201 may also be a combination that implements computing functions, for example, a combination of one or more microprocessors, or a combination of a digital signal processor and a microprocessor. The bus 1204 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in fig. 12, but this does not mean that there is only one bus or one type of bus.
The receiving unit 1001 and the sending unit 1004 in the communication apparatus 100 correspond to the communication interface 1202 in the communication device 120, and the determining unit 1002, the encapsulating unit 1003, the obtaining unit 1005, and the updating unit 1006 in the communication apparatus 100 correspond to the processor 1201 in the communication device 120.
The communication device 120 of this embodiment may correspond to the gateway in the embodiment of the method in fig. 5, and the communication interface 1202 in the communication device 120 may implement the functions of the gateway and/or various steps implemented in the embodiment of the method in fig. 5, which are not described herein again for brevity.
In another embodiment of the present application, a computer-readable storage medium is further provided, where a computer-executable instruction is stored in the computer-readable storage medium, and when a processor of a device executes the computer-executable instruction, the device executes the steps of the message transmission method executed by the gateway device in the method embodiment in fig. 4.
In another embodiment of the present application, a computer-readable storage medium is further provided, in which computer-executable instructions are stored, and when a processor of a device executes the computer-executable instructions, the device executes the steps of the message transmission method executed by the gateway in the embodiment of the method in fig. 5.
In another embodiment of the present application, there is also provided a computer program product comprising computer executable instructions stored in a computer readable storage medium; when the processor of the device executes the computer-executable instructions, the device performs the steps of the message transmission method performed by the gateway in the method embodiment of fig. 4 described above.
In another embodiment of the present application, there is also provided a computer program product comprising computer-executable instructions stored in a computer-readable storage medium; when the processor of the device executes the computer-executable instructions, the device performs the steps of the message transmission method performed by the gateway in the method embodiment of fig. 5 described above.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative. The division of the units is only a logical division, and other divisions are possible in an actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be in an electrical, mechanical, or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application essentially, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.

Claims (38)

1. A message transmission method, comprising:
a gateway receives a first message from a first network device, wherein a first message header of the first message comprises an Internet Protocol (IP) address of a second network device, and the second network device is the destination network device to which the first network device transmits the message;
the gateway determines a local identifier (LID) of the second network device by looking up a lookup table according to the IP address of the second network device, wherein the lookup table comprises an association relationship between the IP address and the LID;
the gateway strips the first message header of the first message and encapsulates a second message header to obtain a second message, wherein the second message header comprises a local routing header, and the local routing header comprises the LID of the second network device;
and the gateway sends the second message according to the LID of the second network device.
2. The message transmission method according to claim 1, wherein the gateway receiving the first message from the first network device includes:
the gateway receives the first message in a large-capacity buffer according to a credit flow control mechanism;
and the gateway feeds back state information of the large-capacity buffer to the first network device through a pause message, so that the first network device adjusts its message transmission.
3. The message transmission method according to any of claims 1-2, wherein before the gateway receives the first message from the first network device, the method further comprises:
the gateway applies to a subnet manager for an LID according to a route change, wherein the route change indicates that the first network device joins the network;
the gateway receives the LID of the first network device;
the gateway acquires a response message from the second network device, wherein the response message comprises the LID and the IP address of the second network device;
the gateway updates the lookup table based on the IP address and LID of the first network device and the IP address and LID of the second network device.
4. The message transmission method according to claim 3, wherein the gateway applying to the subnet manager for the LID according to the route change comprises:
the gateway receives an Address Resolution Protocol (ARP) message from the first network device, wherein the ARP message comprises the IP address of the first network device and the IP address of the second network device;
and the gateway applies to the subnet manager for the LID of the first network device according to the ARP message.
5. The message transmission method according to any of claims 1-4, wherein after the gateway strips the first header of the first message and encapsulates the second header to obtain the second message, the method further comprises:
and the gateway updates the invariant cyclic redundancy check (ICRC) and the variant cyclic redundancy check (VCRC) of the second message.
6. The message transmission method according to any of claims 1 to 5, wherein the first message is an Ethernet message and the second message is an IB message.
7. The message transmission method according to claim 6, wherein the Ethernet message comprises an Ethernet header, an IP header, a UDP header, an IB transport header, an IB payload, an ICRC, and a cyclic redundancy check (CRC).
8. The message transmission method according to claim 6, wherein the IB message comprises the local routing header, IB transport header, IB payload, ICRC, and VCRC.
9. A message transmission method, comprising:
a gateway receives a third message from a first network device, wherein a third message header of the third message comprises a local routing header, the local routing header comprises a local identifier (LID) of a second network device, and the second network device is the destination network device to which the first network device transmits the message;
the gateway determines an Internet Protocol (IP) address of the second network device by looking up a lookup table according to the LID of the second network device, wherein the lookup table comprises an association relationship between the IP address and the LID;
the gateway strips the third message header of the third message and encapsulates a fourth message header to obtain a fourth message, wherein the fourth message header comprises the IP address of the second network device;
and the gateway sends the fourth message according to the IP address of the second network device.
10. The message transmission method according to claim 9, wherein the method further comprises:
the gateway acquires a queue pair number (QPN) of the first network device and a QPN of the second network device according to a link establishment message;
the gateway obtains a User Datagram Protocol (UDP) port number of the second network device according to the QPN of the first network device and the QPN of the second network device, wherein the lookup table further comprises an association relationship among the QPN, the UDP port number, the IP address, and the LID, the fourth message header further comprises a Media Access Control (MAC) address and the UDP port number of the second network device, and the MAC address of the second network device is obtained through a broadcast based on the IP address of the second network device.
11. The message transmission method according to claim 9 or 10, wherein the gateway sending the fourth message according to the IP address of the second network device includes:
the gateway sends the fourth message in a large-capacity buffer according to a credit flow control mechanism and the IP address of the second network device;
and the gateway feeds back state information of the large-capacity buffer to the second network device through a pause message, so that the second network device adjusts its message transmission.
12. The message transmission method according to any of claims 9-11, wherein before the gateway receives the third message from the first network device, the method further comprises:
the gateway applies to a subnet manager for an LID according to a route change, wherein the route change indicates that the second network device joins the network;
the gateway receives the LID of the second network device;
the gateway acquires a response message from the first network device, wherein the response message comprises the LID and the IP address of the first network device;
the gateway updates the lookup table according to the IP address and LID of the first network device and the IP address and LID of the second network device.
13. The message transmission method according to claim 12, wherein the gateway applying to the subnet manager for the LID according to the route change comprises:
the gateway receives an Address Resolution Protocol (ARP) message from the second network device, wherein the ARP message comprises the IP address of the first network device and the IP address of the second network device;
and the gateway applies to the subnet manager for the LID of the second network device according to the ARP message.
14. The message transmission method according to any of the claims 9-13, wherein after the gateway strips the first header of the first message and encapsulates the second header to obtain the second message, the method further comprises:
and the gateway updates the invariant cyclic redundancy check (ICRC) and the cyclic redundancy check (CRC) of the second message.
15. The message transmission method according to any of claims 9-14, wherein the first message is an ethernet message and the second message is an IB message.
16. The message transmission method according to claim 15, wherein the Ethernet message comprises an Ethernet header, an IP header, a UDP header, an IB transport header, an IB payload, an ICRC, and a CRC.
17. The message transmission method according to claim 15, wherein the IB message comprises the local routing header, an IB transport header, an IB payload, an ICRC, and a variant cyclic redundancy check (VCRC).
18. A communications apparatus, comprising:
a receiving unit, configured to receive a first message from a first network device, wherein a first message header of the first message comprises an Internet Protocol (IP) address of a second network device, and the second network device is the destination network device to which the first network device transmits the message;
a determining unit, configured to determine a local identifier (LID) of the second network device by looking up a lookup table according to the IP address of the second network device, wherein the lookup table comprises an association relationship between the IP address and the LID;
an encapsulating unit, configured to strip the first message header of the first message and encapsulate a second message header to obtain a second message, wherein the second message header comprises a local routing header, and the local routing header comprises the LID of the second network device;
and a sending unit, configured to send the second message according to the LID of the second network device.
19. The communications apparatus as claimed in claim 18, wherein the receiving unit is specifically configured to:
receive the first message in a large-capacity buffer according to a credit flow control mechanism;
and feed back state information of the large-capacity buffer to the first network device through a pause message, so that the first network device adjusts its message transmission.
20. The communication device according to any of claims 18-19, wherein the sending unit is further configured to:
apply to a subnet manager for an LID according to a route change, wherein the route change indicates that the first network device joins a network;
the receiving unit is further configured to:
receive the LID of the first network device;
the obtaining unit is further configured to:
acquire a response message from the second network device, wherein the response message comprises the LID and the IP address of the second network device;
the communication device further includes an updating unit, where the updating unit is specifically configured to:
update the lookup table according to the IP address and LID of the first network device and the IP address and LID of the second network device.
21. The communications apparatus of claim 20, wherein the sending unit is further configured to:
receive an Address Resolution Protocol (ARP) message from the first network device, wherein the ARP message comprises the IP address of the first network device and the IP address of the second network device;
and apply to the subnet manager for the LID of the first network device according to the ARP message.
22. A communication apparatus according to any of claims 18-21, wherein the updating unit is further configured to:
update the invariant cyclic redundancy check (ICRC) and the variant cyclic redundancy check (VCRC) of the second message.
23. The communications apparatus as claimed in any one of claims 18 to 22, wherein the first message is an ethernet message and the second message is an IB message.
24. The communications apparatus of claim 23, wherein the Ethernet message comprises an Ethernet header, an IP header, a UDP header, an IB transport header, an IB payload, an ICRC, and a cyclic redundancy check (CRC).
25. The communications apparatus of claim 23, wherein the IB message comprises the local routing header, an IB transport header, an IB payload, an ICRC, and a VCRC.
26. A communications apparatus, comprising:
a receiving unit, configured to receive a third message from a first network device, wherein a third message header of the third message comprises a local routing header, the local routing header comprises a local identifier (LID) of a second network device, and the second network device is the destination network device to which the first network device transmits the message;
a determining unit, configured to determine an Internet Protocol (IP) address of the second network device by looking up a lookup table according to the LID of the second network device, wherein the lookup table comprises an association relationship between the IP address and the LID;
an encapsulating unit, configured to strip the third message header of the third message and encapsulate a fourth message header to obtain a fourth message, wherein the fourth message header comprises the IP address of the second network device;
and a sending unit, configured to send the fourth message according to the IP address of the second network device.
27. The communications apparatus according to claim 26, wherein the communications apparatus further comprises an obtaining unit, the obtaining unit is specifically configured to:
acquire a queue pair number (QPN) of the first network device and a QPN of the second network device according to a link establishment message;
obtain a User Datagram Protocol (UDP) port number of the second network device according to the QPN of the first network device and the QPN of the second network device, wherein the lookup table further comprises an association relationship among the QPN, the UDP port number, the IP address, and the LID, the fourth message header further comprises a Media Access Control (MAC) address and the UDP port number of the second network device, and the MAC address of the second network device is obtained through a broadcast based on the IP address of the second network device.
28. The communications device according to claim 26 or 27, wherein the sending unit is specifically configured to:
send the fourth message in a large-capacity buffer according to a credit flow control mechanism and the IP address of the second network device;
and feed back state information of the large-capacity buffer to the second network device through a pause message, so that the second network device adjusts its message transmission.
29. The communication device according to any of claims 26-28, wherein the sending unit is further configured to:
apply to a subnet manager for an LID according to a route change, wherein the route change indicates that the second network device joins the network;
the receiving unit is further configured to:
receive the LID of the second network device;
the obtaining unit is further configured to:
acquire a response message from the first network device, wherein the response message comprises the LID and the IP address of the first network device;
the communication device further includes an updating unit, and the updating unit is specifically configured to:
update the lookup table according to the IP address and LID of the first network device and the IP address and LID of the second network device.
30. The communications apparatus of claim 29, wherein the sending unit is further configured to:
receive an Address Resolution Protocol (ARP) message from the second network device, wherein the ARP message comprises the IP address of the first network device and the IP address of the second network device;
and apply to the subnet manager for the LID of the second network device according to the ARP message.
31. The communication device according to any of claims 26-30, wherein the updating unit is further configured to:
update the invariant cyclic redundancy check (ICRC) and the cyclic redundancy check (CRC) of the second message.
32. The communications apparatus as claimed in any one of claims 26 to 31, wherein the first message is an ethernet message and the second message is an IB message.
33. The communications apparatus of claim 32, wherein the Ethernet message comprises an Ethernet header, an IP header, a UDP header, an IB transport header, an IB payload, an ICRC, and a CRC.
34. The communications apparatus of claim 32, wherein the IB message comprises the local routing header, an IB transport header, an IB payload, an ICRC, and a variant cyclic redundancy check (VCRC).
35. A communication device, comprising: a processor and a memory, wherein the processor is connected with the memory,
the processor is configured to execute instructions stored in the memory to cause the communication device to perform the method of any of claims 1 to 8.
36. A communication device, comprising: a processor and a memory, wherein the processor is connected to the memory,
the processor is configured to execute instructions stored in the memory to cause the communication device to perform the method of any of claims 9 to 17.
37. A computer-readable storage medium, in which a computer program is stored which, when run on the computer, causes the computer to carry out the method according to any one of claims 1 to 17.
38. A computer program product, characterized in that when the computer program product is executed on a computer, the computer performs the method according to any of claims 1 to 17.
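
For illustration only, and not as part of the claims, the header translation recited in claims 1 and 9 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the names Gateway, EthernetMessage, IBMessage, update_lookup_table, ethernet_to_ib, and ib_to_ethernet are hypothetical and do not appear in the application; the messages are modeled as simple records rather than wire-format bytes; and zlib.crc32 and binascii.crc_hqx are used only as stand-ins for the ICRC and VCRC computations, which a real implementation would perform according to the InfiniBand specification. The Ethernet header, UDP header, and Ethernet CRC are not modeled.

```python
import binascii
import zlib
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class EthernetMessage:
    """Hypothetical record for the Ethernet-side message (IP header reduced to dst_ip)."""
    dst_ip: str                  # IP address of the destination (second) network device
    ib_transport_header: bytes
    ib_payload: bytes


@dataclass(frozen=True)
class IBMessage:
    """Hypothetical record for the IB-side message (local routing header reduced to dst_lid)."""
    dst_lid: int                 # LID carried in the local routing header
    ib_transport_header: bytes
    ib_payload: bytes
    icrc: int
    vcrc: int


def checksums(lid: int, bth: bytes, payload: bytes) -> Tuple[int, int]:
    """Stand-in checksums: NOT the real ICRC/VCRC field coverage or polynomials."""
    invariant = bth + payload
    icrc = zlib.crc32(invariant) & 0xFFFFFFFF
    variant = lid.to_bytes(2, "big") + invariant + icrc.to_bytes(4, "big")
    return icrc, binascii.crc_hqx(variant, 0)


class Gateway:
    def __init__(self) -> None:
        # Lookup table holding the association between IP address and LID.
        self.ip_to_lid: Dict[str, int] = {}
        self.lid_to_ip: Dict[int, str] = {}

    def update_lookup_table(self, ip: str, lid: int) -> None:
        if not 0 < lid <= 0xFFFF:
            raise ValueError("LID must fit in 16 bits and be non-zero")
        self.ip_to_lid[ip] = lid
        self.lid_to_ip[lid] = ip

    def ethernet_to_ib(self, msg: EthernetMessage) -> Optional[IBMessage]:
        """Strip the Ethernet-side header and encapsulate a local routing header."""
        lid = self.ip_to_lid.get(msg.dst_ip)
        if lid is None:
            return None  # destination unknown; LID resolution is out of scope here
        icrc, vcrc = checksums(lid, msg.ib_transport_header, msg.ib_payload)
        return IBMessage(lid, msg.ib_transport_header, msg.ib_payload, icrc, vcrc)

    def ib_to_ethernet(self, msg: IBMessage) -> Optional[EthernetMessage]:
        """Strip the local routing header and encapsulate an IP-addressed header."""
        ip = self.lid_to_ip.get(msg.dst_lid)
        if ip is None:
            return None
        return EthernetMessage(ip, msg.ib_transport_header, msg.ib_payload)
```

In this sketch the IB transport header and IB payload pass through unchanged in both directions, and only the outer addressing (IP address or LID) and the checksums are rewritten, which mirrors the strip-and-encapsulate steps of the claims.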
CN202110872533.9A 2021-07-30 2021-07-30 Message transmission method and communication device Pending CN115701063A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110872533.9A CN115701063A (en) 2021-07-30 2021-07-30 Message transmission method and communication device
PCT/CN2022/106368 WO2023005723A1 (en) 2021-07-30 2022-07-19 Packet transmission method and communication apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110872533.9A CN115701063A (en) 2021-07-30 2021-07-30 Message transmission method and communication device

Publications (1)

Publication Number Publication Date
CN115701063A (en) 2023-02-07

Family

ID=85086271

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110872533.9A Pending CN115701063A (en) 2021-07-30 2021-07-30 Message transmission method and communication device

Country Status (2)

Country Link
CN (1) CN115701063A (en)
WO (1) WO2023005723A1 (en)

Also Published As

Publication number Publication date
WO2023005723A1 (en) 2023-02-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination