WO2021013046A1 - Communication method and network card - Google Patents

Communication method and network card

Info

Publication number
WO2021013046A1
Authority
WO
WIPO (PCT)
Prior art keywords
rnic
virtual
address
source
destination
Prior art date
Application number
PCT/CN2020/102466
Other languages
English (en)
French (fr)
Inventor
付斌章
谭焜
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP20844079.2A (patent EP3828709A4)
Publication of WO2021013046A1
Priority to US17/201,833 (patent US11431624B2)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/58 Association of routers
    • H04L45/586 Association of routers of virtual routers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45504 Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/0001 Systems modifying transmission characteristics according to link quality, e.g. power backoff
    • H04L1/0015 Systems modifying transmission characteristics according to link quality, e.g. power backoff characterised by the adaptation strategy
    • H04L1/0016 Systems modifying transmission characteristics according to link quality, e.g. power backoff characterised by the adaptation strategy involving special memory structures, e.g. look-up tables
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L12/46 Interconnection of networks
    • H04L12/4633 Interconnection of networks using encapsulation techniques, e.g. tunneling
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L12/46 Interconnection of networks
    • H04L12/4641 Virtual LANs, VLANs, e.g. virtual private networks [VPN]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/66 Layer 2 routing, e.g. in Ethernet based MAN's
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/74 Address processing for routing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45583 Memory management, e.g. access or allocation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45595 Network integration; Enabling network access in virtual machine instances
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2101/00 Indexing scheme associated with group H04L61/00
    • H04L2101/60 Types of network addresses
    • H04L2101/618 Details of network addresses
    • H04L2101/622 Layer-2 addresses, e.g. medium access control [MAC] addresses

Definitions

  • This application relates to the field of communication technology, and more specifically, to a communication method and a network card.
  • RDMA: Remote Direct Memory Access
  • RoCE: RDMA over Converged Ethernet
  • RDMA technology can be applied to cloud computing scenarios or similar scenarios.
  • hardware virtualization technology can be used to implement Input Output (IO) virtualization.
  • hardware virtualization technology can be used to abstract one physical network card into multiple virtual network cards supporting RDMA technology. By deploying a dedicated virtual network card driver inside the virtual machine, tenants can use the virtual network card like a physical network card.
  • the virtual network card can directly read the data from the virtual machine memory. It should be noted that, because the data is read directly from the memory of the virtual machine and the virtual network card completes the message encapsulation, the message sent from the virtual network card at this time is encapsulated with virtual network addresses.
  • That is, the source and destination media access control (MAC) addresses in the Ethernet Layer 2 protocol header of the message, and the source and destination IP addresses in the Internet Protocol (IP) header, are all addresses of virtual network cards.
  • Network devices (such as routers and switches) cannot identify the addresses of virtual network cards, so the message cannot be correctly routed to the target node.
  • the present application provides a communication method and a network card, which can improve the effective throughput in the network.
  • an embodiment of the present application provides a communication method.
  • the method includes: a source RNIC, that is, a source network card that supports remote direct memory access (RDMA), acquires data to be transmitted from a source virtual RNIC, where the source virtual RNIC is a virtual RNIC running on the source RNIC; the source RNIC obtains message forwarding information and destination virtual RNIC identity indication information, where the message forwarding information includes the Internet Protocol (IP) address of the source RNIC, the media access control (MAC) address of the source RNIC, the IP address of the destination RNIC, the MAC address of the destination RNIC, and a Layer 4 port number;
  • the source RNIC encapsulates the data to be transmitted to obtain a target message, where the target message includes the message forwarding information, the destination virtual RNIC identity indication information, and the data to be transmitted, and the target message does not include at least one of the following: the IP address of the source virtual RNIC, the IP address of the destination virtual RNIC, the MAC address of the source virtual RNIC, the port number of the source virtual RNIC, and the port number of the destination virtual RNIC; the source RNIC sends the target message to the destination RNIC, where the destination virtual RNIC is a virtual RNIC running on the destination RNIC.
  • the source RNIC may be a physical network card that supports RDMA technology, or a virtual network card that supports RDMA technology.
  • the destination RNIC may be a physical network card supporting RDMA technology, or a virtual network card supporting RDMA technology.
  • the data to be transmitted needs to be encapsulated only once. The source RNIC does not need to perform a first encapsulation that adds the source vRNIC's IP address, MAC address, and port number and the destination vRNIC's IP address and port number to the message, followed by a second encapsulation that adds the IP address of the source RNIC, the MAC address of the source RNIC, the IP address of the destination RNIC, the MAC address of the destination RNIC, and the Layer 4 port number.
  • Instead, the source RNIC encapsulates once, adding the IP address of the source RNIC, the MAC address of the source RNIC, the IP address of the destination RNIC, the MAC address of the destination RNIC, and the Layer 4 port number to the message. Because the encapsulated message no longer includes the source vRNIC's IP address, MAC address, and port number or the destination vRNIC's IP address and port number, more of the target message is available as payload for the data to be transmitted, which can increase the effective throughput in the network. In addition, the encapsulation can be completed by the RNIC itself, so no additional hardware is needed to encapsulate the data to be transmitted. This can reduce the cost of applying RDMA technology over Ethernet.
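  • The throughput argument above can be illustrated with simple arithmetic. The following is a rough sketch only: the header sizes (untagged Ethernet, IPv4 without options, UDP, and an 8-byte network virtualization header) and the 1514-byte frame size are assumptions chosen for illustration and are not values stated in this application; RDMA transport headers are ignored.

```python
# Illustrative header sizes in bytes (assumptions, not from the application).
ETH_HEADER = 14    # Ethernet header without a VLAN tag
IPV4_HEADER = 20   # IPv4 header without options
UDP_HEADER = 8     # UDP header
VIRT_HEADER = 8    # network virtualization header (assumed size)


def payload_bytes(frame_size: int, double_encapsulation: bool) -> int:
    """Return the bytes left for data in a frame of frame_size bytes."""
    # Outer headers carry the physical RNIC addresses in both schemes.
    outer = ETH_HEADER + IPV4_HEADER + UDP_HEADER + VIRT_HEADER
    # Double encapsulation also carries inner headers with vRNIC addresses.
    inner = ETH_HEADER + IPV4_HEADER + UDP_HEADER if double_encapsulation else 0
    return frame_size - outer - inner


frame = 1514  # assumed maximum frame size, excluding the frame check sequence
single = payload_bytes(frame, double_encapsulation=False)
double = payload_bytes(frame, double_encapsulation=True)
print(single, double, single - double)
```

  • Under these assumed sizes, dropping the inner virtual-address headers frees 42 bytes per frame for the data to be transmitted.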
  • the target message may be a message based on the RoCE standard format.
  • the source RNIC obtaining the message forwarding information and the destination virtual RNIC identity indication information includes: the source RNIC obtains the message forwarding information and the destination virtual RNIC identity indication information according to the identity of the source virtual RNIC and the transmission mode used to transmit the data to be transmitted.
  • the source RNIC can choose different ways to obtain the message forwarding information according to different transmission modes.
  • the source RNIC obtaining the message forwarding information and the destination virtual RNIC identity indication information according to the identity and transmission mode of the source virtual RNIC includes: when the transmission mode is reliable connection (RC) or unreliable connection (UC), the source RNIC determines the message forwarding information and the destination virtual RNIC identity indication information according to the target queue pair context, where the target queue pair context corresponds to the connection information and the identity of the source virtual RNIC. Based on the above technical solution, when the transmission mode is RC or UC, the source RNIC can directly obtain the message forwarding information and the identity indication information of the destination virtual RNIC from the queue pair context.
  • the source RNIC obtaining the message forwarding information and the destination virtual RNIC identity indication information according to the identity and transmission mode of the source virtual RNIC includes: when the transmission mode is reliable connection (RC) or unreliable connection (UC), the source RNIC determines the target virtual network address from the reference queue pair context or the reference work queue element (WQE), where the target virtual network address includes at least one of the IP address of the source virtual RNIC and the IP address of the destination virtual RNIC, the reference queue pair context corresponds to the connection information and the identity of the source virtual RNIC, and the reference WQE corresponds to the connection information and the identity of the source virtual RNIC; the source RNIC determines the message forwarding information and the destination virtual RNIC identity indication information from the tunnel table, where the tunnel table includes at least one tunnel entry, and each tunnel entry in the at least one tunnel entry is used to indicate: the identity of the first virtual RNIC, the virtual extensible LAN network identifier (VNI) to which the second virtual RNIC belongs, the virtual network address,
  • the source RNIC obtaining the message forwarding information and the destination virtual RNIC identity indication information according to the identity and transmission mode of the source virtual RNIC includes: when the transmission mode is unreliable datagram (UD) or reliable datagram (RD), the source RNIC determines the target virtual network address according to the target WQE corresponding to the data to be transmitted and the identifier corresponding to the source virtual RNIC, or according to the target WQE alone, where the target virtual network address includes at least one of the IP address of the source virtual RNIC and the IP address of the destination virtual RNIC; the source RNIC determines the message forwarding information and the destination virtual RNIC identity indication information from the tunnel table, where the tunnel table includes at least one tunnel entry, and each tunnel entry in the at least one tunnel entry is used to indicate: the identity of the first virtual RNIC, the virtual extensible LAN network identifier (VNI) to which the second virtual RNIC belongs, the virtual network address, address information of the first RNIC,
  • the source RNIC acquiring the message forwarding information and the destination virtual RNIC identity indication information includes: the source RNIC sends a request message to at least one target NIC, where each target NIC of the at least one target NIC runs at least one virtual RNIC, the at least one virtual RNIC and the source virtual RNIC belong to the same VNI, and the request message includes the source virtual RNIC identifier and the target virtual network address, where the target virtual network address includes at least one of the IP address of the source virtual RNIC and the IP address of the destination virtual RNIC; the source RNIC receives feedback information sent by the destination RNIC, where the feedback information includes the IP address of the destination RNIC and the MAC address of the destination RNIC; the source RNIC determines the message forwarding information and the destination virtual RNIC identity indication information according to the feedback information. Based on the above technical solution, if the source RNIC does not store the IP address of the destination RNIC and the MAC address of the destination RNIC, the source RNIC can
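  • The mode-dependent lookup described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the dictionary layouts, the key shapes, the field names (`forwarding`, `dst_vrnic_identity`, `dst_virtual_ip`) and the sample addresses are all hypothetical, and the tunnel entry is reduced to the fields needed for the example because the entry definition in the text above is truncated.

```python
from enum import Enum


class TransportMode(Enum):
    RC = "reliable connection"
    UC = "unreliable connection"
    UD = "unreliable datagram"
    RD = "reliable datagram"


def resolve_forwarding(mode, src_vrnic_id, connection_id, wqe,
                       qp_contexts, tunnel_table):
    """Return (forwarding_info, dst_vrnic_identity) for one send request."""
    if mode in (TransportMode.RC, TransportMode.UC):
        # Connected modes: the queue pair context for this connection and
        # source vRNIC already holds the forwarding information.
        context = qp_contexts[(src_vrnic_id, connection_id)]
        return context["forwarding"], context["dst_vrnic_identity"]
    # Datagram modes: take the virtual address from the WQE, then
    # consult the tunnel table.
    entry = tunnel_table[(src_vrnic_id, wqe["dst_virtual_ip"])]
    return entry["forwarding"], entry["dst_vrnic_identity"]


# Hypothetical sample data: physical RNIC addresses and a (VNI, MAC) identity.
forwarding = {"src_ip": "192.0.2.1", "dst_ip": "192.0.2.2",
              "src_mac": "02:00:00:00:00:01", "dst_mac": "02:00:00:00:00:02",
              "l4_port": 4791}
identity = (5001, "02:00:00:00:02:41")
qp_contexts = {("vrnic-221", 7): {"forwarding": forwarding,
                                  "dst_vrnic_identity": identity}}
tunnel_table = {("vrnic-221", "10.0.0.41"): {"forwarding": forwarding,
                                             "dst_vrnic_identity": identity}}

rc_result = resolve_forwarding(TransportMode.RC, "vrnic-221", 7, None,
                               qp_contexts, tunnel_table)
ud_result = resolve_forwarding(TransportMode.UD, "vrnic-221", None,
                               {"dst_virtual_ip": "10.0.0.41"},
                               qp_contexts, tunnel_table)
```

  • In this sketch both paths yield the same pair; the difference is only where the RNIC looks it up, which is the point of choosing the lookup by transmission mode.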
  • the target message includes: a MAC header, an IP header, a Layer 4 port number header, a network virtualization protocol header, and a payload field, where the MAC header includes the source RNIC's MAC address and the destination RNIC's MAC address; the IP header includes the source RNIC's IP address and the destination RNIC's IP address; the Layer 4 port number header includes the Layer 4 port number; the network virtualization protocol header includes the identity indication information of the destination virtual RNIC; and the payload field includes the data to be transmitted.
  • the message format of the target message is similar to the existing message format. Therefore, only minor changes to the existing message format are required, which facilitates the implementation of the technical solution of the present application.
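  • The layout of the target message can be sketched as a set of nested records. This is an illustrative model only: the class and field names, the `encapsulate` helper, and the sample addresses are hypothetical, and the sketch models which information each header carries rather than any on-wire encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MacHeader:
    src_rnic_mac: str
    dst_rnic_mac: str


@dataclass(frozen=True)
class IpHeader:
    src_rnic_ip: str
    dst_rnic_ip: str


@dataclass(frozen=True)
class L4Header:
    port: int


@dataclass(frozen=True)
class VirtHeader:
    # Identity indication of the destination vRNIC, for example (VNI, virtual MAC).
    dst_vrnic_identity: tuple


@dataclass(frozen=True)
class TargetMessage:
    mac: MacHeader
    ip: IpHeader
    l4: L4Header
    virt: VirtHeader
    payload: bytes


def encapsulate(data: bytes, fwd: dict, dst_vrnic_identity: tuple) -> TargetMessage:
    """Wrap data once, using only physical RNIC addresses in the headers."""
    return TargetMessage(
        mac=MacHeader(fwd["src_mac"], fwd["dst_mac"]),
        ip=IpHeader(fwd["src_ip"], fwd["dst_ip"]),
        l4=L4Header(fwd["l4_port"]),
        virt=VirtHeader(dst_vrnic_identity),
        payload=data,
    )


fwd = {"src_mac": "02:00:00:00:00:01", "dst_mac": "02:00:00:00:00:02",
       "src_ip": "192.0.2.1", "dst_ip": "192.0.2.2", "l4_port": 4791}
message = encapsulate(b"hello", fwd, (5001, "02:00:00:00:02:41"))
```

  • Note that no field of the record holds a virtual IP address or virtual port number; the only virtual-network information is the identity indication in the network virtualization header.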
  • the identity indication information of the destination virtual RNIC includes the VNI to which the destination virtual RNIC belongs and the virtual MAC address of the destination virtual RNIC.
  • the identity indication information of the destination virtual RNIC includes the number of the destination virtual RNIC.
  • the target message further includes identity indication information of the source virtual RNIC.
  • the network virtualization protocol header of the target message includes the identity indication information of the source virtual RNIC.
  • an embodiment of the present application provides a communication method. The method includes: a destination RNIC receives a message sent by a source RNIC, where the message includes message forwarding information, destination virtual RNIC identity indication information, and data; the message forwarding information includes the Internet Protocol (IP) address of the source RNIC, the media access control (MAC) address of the source RNIC, the IP address of the destination RNIC, the MAC address of the destination RNIC, and a Layer 4 port number; the message does not include at least one of the following: the IP address of the source virtual RNIC, the IP address of the destination virtual RNIC, the MAC address of the source virtual RNIC, the port number of the source virtual RNIC, and the port number of the destination virtual RNIC, where the source virtual RNIC is a virtual RNIC running on the source RNIC and the destination virtual RNIC is a virtual RNIC running on the destination RNIC; the destination RNIC determines the destination virtual RNIC according to the destination virtual RNIC identity indication information; and the destination RNIC sends the message to the destination virtual RNIC.
  • the message received by the destination RNIC no longer includes the IP address, MAC address, and port number information of the source vRNIC, and the IP address and port number information of the destination vRNIC. Therefore, more space can be used for data transmission in the payload of the message, which can improve the effective throughput in the network.
  • the message includes: a MAC header, an IP header, a Layer 4 port number header, a network virtualization protocol header, and a payload field, where the MAC header includes the source RNIC's MAC address and the destination RNIC's MAC address; the IP header includes the source RNIC's IP address and the destination RNIC's IP address; the Layer 4 port number header includes the Layer 4 port number; the network virtualization protocol header includes the identity indication information of the destination virtual RNIC; and the payload field includes the data.
  • the message format of the message is similar to the existing message format. Therefore, only minor changes to the existing message format are required, which facilitates the implementation of the technical solution of the present application.
  • the destination virtual RNIC identity indication information includes the VNI to which the destination virtual RNIC belongs and the virtual MAC address of the destination virtual RNIC. The destination RNIC determining the destination virtual RNIC according to the destination virtual RNIC identity indication information includes: the destination RNIC determines the destination virtual RNIC from a virtual device mapping table, where the virtual device mapping table includes at least one virtual device entry, and each virtual device entry includes a VNI, a MAC address, and an identifier. The identifier in the virtual device entry that matches the VNI to which the destination virtual RNIC belongs and the virtual MAC address of the destination virtual RNIC is the identifier of the destination virtual RNIC.
  • Using the VNI to which the destination virtual RNIC belongs and the MAC address of the destination virtual RNIC as the identity indication information avoids the situation in which the destination virtual RNIC cannot be found accurately because its identifier changed when it was migrated.
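  • The receive-side lookup in the virtual device mapping table can be sketched as below. This is a minimal sketch under assumptions: the list-of-dicts table layout, the key names (`vni`, `mac`, `id`), the function name, and the sample values are hypothetical; only the matching rule (VNI plus virtual MAC address selects the identifier) comes from the text above.

```python
from typing import Optional


def find_dst_vrnic(table: list, vni: int, virtual_mac: str) -> Optional[str]:
    """Return the identifier of the entry matching (vni, virtual_mac), if any."""
    for entry in table:
        if entry["vni"] == vni and entry["mac"] == virtual_mac:
            return entry["id"]
    return None  # no vRNIC on this RNIC matches the identity indication


# Hypothetical table held by the destination RNIC.
virtual_device_table = [
    {"vni": 5001, "mac": "02:00:00:00:02:41", "id": "vrnic-241"},
    {"vni": 5002, "mac": "02:00:00:00:02:41", "id": "vrnic-242"},
]

hit = find_dst_vrnic(virtual_device_table, 5001, "02:00:00:00:02:41")
miss = find_dst_vrnic(virtual_device_table, 5003, "02:00:00:00:02:41")
```

  • Because the lookup key is the (VNI, MAC) pair rather than a local identifier, the same virtual MAC address can appear in two different VNIs without ambiguity, and the sender does not need to know the identifier assigned on the destination RNIC.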
  • the identity indication information of the destination virtual RNIC includes the number of the destination virtual RNIC.
  • the target message further includes the identity indication information of the source virtual RNIC.
  • the network virtualization protocol header of the target packet includes the identity indication information of the source virtual RNIC.
  • an embodiment of the present application provides a network card, and the network card includes a unit for implementing the first aspect or any possible implementation manner of the first aspect.
  • the network card supports RDMA technology.
  • an embodiment of the present application provides a network card, and the network card includes a unit for implementing the second aspect or any possible implementation manner of the second aspect.
  • the network card supports RDMA technology.
  • an embodiment of the present application provides a computer-readable storage medium that stores instructions for implementing the first aspect or the method described in any one of the possible implementation manners of the first aspect.
  • an embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores instructions for implementing the second aspect or any one of the possible implementation manners of the second aspect.
  • the present application provides a computer program product containing instructions. When the computer program product is run on a computer, the computer executes the method of the first aspect or any one of the possible implementations of the first aspect.
  • the present application provides a computer program product containing instructions. When the computer program product is run on a computer, the computer executes the method of the second aspect or any one of the possible implementations of the second aspect.
  • the present application provides a communication device including a processing circuit and a storage medium, the storage medium storing program code, and the processing circuit is configured to call the program code in the storage medium to execute the method of the first aspect or any one of the possible implementations of the first aspect.
  • the communication device supports RDMA technology.
  • the present application provides a communication device including a processing circuit and a storage medium, the storage medium storing program code, and the processing circuit is configured to call the program code in the storage medium to execute the method of the second aspect or any one of the possible implementations of the second aspect.
  • the communication device supports RDMA technology.
  • Fig. 1 is a schematic diagram of a system architecture provided by an embodiment of the present application.
  • Fig. 2 is a schematic diagram of a system architecture provided by an embodiment of the present application.
  • Fig. 3 is a schematic structural block diagram of a communication method according to an embodiment of the present application.
  • Figure 4 is a schematic diagram of a target message.
  • Figure 5 is a schematic diagram of the VXLAN-GPE header.
  • Figure 6 is a schematic diagram of an encapsulated target message.
  • Fig. 7 is a schematic structural block diagram of a network card according to an embodiment of the present application.
  • Fig. 8 is a schematic structural block diagram of a network card according to an embodiment of the present application.
  • Fig. 9 is a structural block diagram of a network card provided according to an embodiment of the present application.
  • the technical solutions of the embodiments of the present application can be applied to network interface cards (NIC) and computing nodes that support remote direct memory access (RDMA) technology.
  • the computing node can be connected to a network card, and the computing node refers to an electronic device with computing capabilities, such as a server, a personal computer (such as a desktop computer device, a notebook computer), and the like.
  • the network card may be referred to as the network card of the computing node.
  • the network card may also be called a network interface card, a network adapter, a physical network interface, and so on.
  • the network card in the embodiment of the present application may be a network card supporting RDMA technology. Therefore, the network card may also be referred to as an RDMA network interface card (RNIC).
  • the RNIC of the computing node may be built in the computing node.
  • the RNIC of the computing node can be connected to the motherboard of the computing node through an interface such as a Peripheral Component Interconnect Express (PCIe) interface or a Cache Coherent Interconnect for Accelerators (CCIX) interface.
  • The computing node may be referred to as the host of the RNIC.
  • the RNIC of the computing node may be an external device of the computing node.
  • the RNIC may be connected to the computing node through a PCIe interface, a Quick Path Interconnect (QPI) interface, a Universal Serial Bus (USB) interface, etc.
  • the network card can be integrated with the CPU in a system on chip (SoC).
  • the computing node includes a hardware layer, an operating system layer running on the hardware layer, and an application layer running on the operating system layer.
  • the hardware layer includes hardware such as a central processing unit (CPU), a memory management unit (MMU), and memory (also referred to as main memory).
  • the operating system may be any one or more computer operating systems that implement business processing through processes, for example, the Linux operating system, Unix operating system, Android operating system, iOS operating system, or Windows operating system.
  • the application layer includes applications such as distributed databases, distributed storage, and distributed AI systems.
  • the embodiments of the application do not specifically limit the structure of the execution subject of the methods provided in the embodiments of the application, as long as the execution subject can run a program that records the code of those methods and thereby carry out the methods.
  • the execution subject of the method provided in the embodiment of the present application may be a computing node, or a functional module in the computing node that can call and execute the program.
  • Fig. 1 is a schematic diagram of a system architecture provided by an embodiment of the present application.
  • the system 100 shown in FIG. 1 includes a computing node 110, an RNIC 111, a computing node 120, and an RNIC 121.
  • the RNIC of the computing node 110 may be RNIC 111, and the RNIC of the computing node 120 may be RNIC 121.
  • the RNIC 111 and the RNIC 121 may be connected through a communication link, and the medium of the communication link may be an optical fiber, etc.
  • the embodiment of the present application does not limit the specific medium of the communication link between network devices.
  • the link between the RNIC 111 and the RNIC 121 may include one or more switching nodes, or the two RNICs may communicate directly.
  • the computing node 110 includes a storage device 112, and the storage device 112 may be used to store queue information and application data of the computing node 110.
  • the computing node 120 includes a storage device 122, and the storage device 122 may be used to store queue information of the computing node 120.
  • Although the storage device 112 in FIG. 1 is in the computing node 110 and the storage device 122 is in the computing node 120, the storage device 112 may also be a storage device that is externally attached to the computing node 110 or the RNIC 111, and the storage device 122 may also be a storage device externally attached to the computing node 120 or the RNIC 121.
  • FIG. 1 only shows the connection relationship between two computing nodes through a network device.
  • Some networks that support RDMA technology can include more computing nodes. Any two computing nodes in such a network can be connected by the method shown in Figure 1.
  • the system 100 shown in FIG. 1 may be a connection mode of any two computing nodes in a network supporting RDMA technology.
  • one physical RNIC can be abstracted into multiple virtual RNICs supporting RDMA technology.
  • the virtual NIC supporting the RDMA technology is referred to below as a virtual RNIC (vRNIC).
  • the RNIC in the embodiments of the present application refers to a physical RNIC.
  • Fig. 2 is a schematic diagram of a system architecture provided by an embodiment of the present application.
  • Three virtual machines (VMs) are deployed in the computing node 210.
  • Three vRNICs are deployed in RNIC 220, namely vRNIC 221, vRNIC 222, and vRNIC 223.
  • Two VMs are deployed in the computing node 230, namely VM 231 and VM 232.
  • RNIC 220 is the RNIC of the computing node 210
  • RNIC 240 is the RNIC of the computing node 230.
  • the computing node 210 is the host of the RNIC 220
  • the computing node 230 is the host of the RNIC 240.
  • the VM deployed in the computing node can have a one-to-one correspondence with the vRNIC in the RNIC, or one VM can be configured with multiple vRNICs.
  • the three VMs deployed in the computing node 210 correspond to the three vRNICs deployed in the RNIC 220 in a one-to-one correspondence.
  • the two VMs deployed in the computing node 230 correspond to the two vRNICs deployed in the RNIC 240 in a one-to-one correspondence.
  • the one-to-one correspondence between VM and vRNIC mentioned here means that the vRNIC is the vRNIC of the corresponding VM.
  • the vRNIC of VM 211 is vRNIC 221
  • the vRNIC of VM 231 is vRNIC 241.
  • the RDMA communication between VM 211 and VM 231 can be implemented through vRNIC 221 and vRNIC 241.
  • Figure 2 shows the correspondence between VM and vRNIC.
  • the tenant can be deployed in a container.
  • each container has a corresponding vRNIC.
  • RDMA communication between containers can be achieved through their corresponding vRNICs.
  • Fig. 3 is a schematic structural block diagram of a communication method according to an embodiment of the present application.
  • the source RNIC obtains data to be transmitted sent by the source vRNIC, where the source vRNIC is a vRNIC running on the source RNIC.
  • the data to be transmitted is data obtained by the source vRNIC from the storage device of the host of the source vRNIC. It is understandable that the host of the source vRNIC is a virtual machine deployed in a computing node. Therefore, the storage device of the host of the source vRNIC is the storage device of the computing node where the virtual machine is deployed.
  • the source RNIC obtains the message forwarding information and the destination vRNIC identity indication information.
  • the message forwarding information includes the IP address of the source RNIC, the MAC address of the source RNIC, the Layer 4 port number, the IP address of the destination RNIC, and the MAC address of the destination RNIC. More specifically, the message forwarding information includes a MAC header, an IP header, and a Layer 4 port number header.
  • the MAC header includes the MAC address of the source RNIC and the MAC address of the destination RNIC.
  • the IP header includes the IP address of the source RNIC and the IP address of the destination RNIC.
  • the Layer 4 port number header includes the Layer 4 port number.
  • the fourth layer refers to the fourth layer in the Open System Interconnection (OSI) model, that is, the transport layer. Therefore, the Layer 4 port number can also be called the transport layer port number.
  • the Layer 4 port number may be a User Datagram Protocol (UDP) port number, a Transmission Control Protocol (TCP) port number, or the like.
  • the Layer 4 port number may include a source port number and a destination port number.
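  • As a minimal sketch of the structure described above, the message forwarding information can be modelled as three nested headers. The class and field names are illustrative assumptions, and the addresses and port values are made up for the example (4789 is the IANA-assigned VXLAN UDP port).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MacHeader:
    src_mac: str  # MAC address of the source RNIC
    dst_mac: str  # MAC address of the destination RNIC


@dataclass(frozen=True)
class IpHeader:
    src_ip: str  # IP address of the source RNIC
    dst_ip: str  # IP address of the destination RNIC


@dataclass(frozen=True)
class L4PortHeader:
    src_port: int  # Layer 4 (transport layer) source port number
    dst_port: int  # Layer 4 (transport layer) destination port number


@dataclass(frozen=True)
class MessageForwardingInfo:
    mac: MacHeader
    ip: IpHeader
    l4: L4PortHeader


# Illustrative values only; none of them come from the application.
info = MessageForwardingInfo(
    mac=MacHeader("02:00:00:00:00:11", "02:00:00:00:00:22"),
    ip=IpHeader("192.0.2.1", "192.0.2.2"),
    l4=L4PortHeader(src_port=49152, dst_port=4789),
)
```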
  • the source RNIC obtaining the message forwarding information and the destination vRNIC identity indication information may include: the source RNIC obtains the identity of the source vRNIC; and the source RNIC obtains the message forwarding information and the destination vRNIC identity indication information according to the identity of the source vRNIC and the transmission mode of the data to be transmitted.
  • the source RNIC can use the doorbell mechanism to obtain the identity of the source vRNIC.
  • the source vRNIC can notify the source RNIC of the vRNIC that needs to send data through the doorbell mechanism.
  • the source vRNIC can store data in a preset format in a register or storage space agreed in advance with the source RNIC.
  • when the source RNIC detects that the content stored in the pre-agreed register or storage space has changed, the source RNIC reads the data in the preset format from the pre-agreed register or storage space.
  • the aforementioned doorbell mechanism can use a preset register or storage space to store data in a preset format.
  • the doorbell mechanism is implemented by registers.
  • the source vRNIC can write the identification of the source vRNIC into the register. After detecting the doorbell, the source RNIC can read the queue identification of the source vRNIC in the register, and record the read identification of the source vRNIC.
  • after the source RNIC reads the identification of the source vRNIC in the register and records the read identification of the source vRNIC, the source vRNIC is notified that the queue identification stored in the register can be deleted. After receiving the notification, the source vRNIC deletes the queue identifier stored in the register.
  • the queue identifier can be saved to the register based on a first-in first-out mechanism. In this way, after the queue identifier is read, the queue identifier is deleted from the register.
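  • The doorbell mechanism with a first-in first-out register can be sketched as follows. This is a software model under stated assumptions, not the hardware design: `DoorbellRegister` and `SourceRnic` are hypothetical names, the register is modelled as a queue of (vRNIC identifier, queue identifier) pairs, and reading an item removes it, as in the FIFO variant described above.

```python
from collections import deque


class DoorbellRegister:
    """Hypothetical stand-in for the pre-agreed register, modelled as a FIFO."""

    def __init__(self):
        self._fifo = deque()

    def ring(self, vrnic_id, queue_id):
        # The source vRNIC writes its identifier and a queue identifier.
        self._fifo.append((vrnic_id, queue_id))

    def pending(self):
        # True when the register holds content that has not been read yet.
        return bool(self._fifo)

    def read(self):
        # FIFO semantics: the item is removed once it has been read.
        return self._fifo.popleft()


class SourceRnic:
    def __init__(self, doorbell):
        self.doorbell = doorbell
        self.recorded = []  # identifiers read from the doorbell so far

    def poll(self):
        # Drain the register and record which vRNICs need to send data.
        while self.doorbell.pending():
            self.recorded.append(self.doorbell.read())
        return self.recorded


doorbell = DoorbellRegister()
doorbell.ring("vRNIC221", 7)
doorbell.ring("vRNIC222", 7)
rnic = SourceRnic(doorbell)
rnic.poll()
```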
  • the data transfer between the source vRNIC and the destination vRNIC is achieved through RDMA technology.
  • the transmission mode of RDMA transmission may be one of: reliable connection (RC), reliable datagram (RD), unreliable connection (UC), and unreliable datagram (UD).
  • the transmission mode of the data to be transmitted can be one of RC, UC, UD, or RD.
  • the source RNIC may adopt different strategies to obtain the message forwarding information and the destination vRNIC identity indication information according to different transmission modes.
  • the source RNIC may obtain connection information; the source RNIC may determine the target queue pair context (QPC) corresponding to the connection information and the identity of the source vRNIC; and the source RNIC can determine the message forwarding information and the destination vRNIC identity indication information from the target QPC.
  • connection information is information for indicating a queue pair.
  • the connection information may be a queue pair number (QPn).
  • the connection information may be other information that can indicate a queue pair.
  • different queue pairs can correspond to different identifiers.
  • the connection information may be the identity of the queue.
  • the connection information may be an identification of a communication endpoint, such as an identification of a receiving end.
  • the identification of the receiving end may be the identification of the destination vRNIC, or the identification of the virtual machine corresponding to the destination vRNIC.
  • the source RNIC may also use the doorbell mechanism to obtain the QPn.
  • the specific implementation manner is the same as the manner in which the source RNIC obtains the identity of the source vRNIC by using the doorbell mechanism. For the sake of brevity, details are not repeated here.
  • multiple vRNICs may run in the source RNIC, and the source vRNIC is only one of the multiple vRNICs.
  • the same QPn may exist in different vRNICs.
  • the identification of different vRNICs is different. Therefore, according to the identification of the QPn and the source vRNIC, a unique QPC can be determined, that is, the target QPC.
  • the target QPC may include the message forwarding information.
  • the source RNIC can directly obtain the message forwarding information from the target QPC.
  • the target QPC may include the address of the message forwarding information.
  • the source RNIC may obtain the message forwarding information according to the address of the message forwarding information.
  • the target QPC may include part of the message forwarding information and the address of another part of the message forwarding information.
  • the source RNIC can directly obtain part of the message forwarding information from the target QPC, and then obtain another part of the message forwarding information according to the address in the target QPC.
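  • The QPC cache mode described above can be sketched as a lookup keyed by the pair (source vRNIC identifier, QPn), since the same QPn may exist in different vRNICs. The names `Qpc`, `QPC_TABLE`, `MEMORY`, and `forwarding_info`, the dictionary keys, and all values are assumptions made for illustration. The sketch covers the three cases: the information is stored in the QPC, the QPC stores only its address, or the QPC stores part of it plus the address of the rest.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Qpc:
    # Forwarding information (or part of it) stored directly in the QPC.
    inline_info: Dict[str, str] = field(default_factory=dict)
    # Address of forwarding information stored elsewhere, if any.
    info_addr: Optional[int] = None


# Hypothetical memory holding forwarding information referenced by address.
MEMORY = {0x1000: {"dst_rnic_ip": "192.0.2.2", "dst_rnic_mac": "02:00:00:00:00:22"}}

# The same QPn (5) exists in two vRNICs; the vRNIC identifier disambiguates.
QPC_TABLE: Dict[Tuple[str, int], Qpc] = {
    ("vRNIC-A", 5): Qpc(inline_info={"dst_rnic_ip": "192.0.2.9",
                                     "dst_rnic_mac": "02:00:00:00:00:99"}),
    ("vRNIC-B", 5): Qpc(inline_info={"src_rnic_ip": "192.0.2.1"}, info_addr=0x1000),
}


def forwarding_info(vrnic_id: str, qpn: int) -> Optional[Dict[str, str]]:
    """Return the forwarding information of the target QPC, or None on a miss."""
    qpc = QPC_TABLE.get((vrnic_id, qpn))
    if qpc is None:
        return None
    info = dict(qpc.inline_info)            # part stored in the QPC itself
    if qpc.info_addr is not None:
        info.update(MEMORY[qpc.info_addr])  # part fetched through the address
    return info
```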
  • the source RNIC may obtain connection information; the source RNIC may determine the reference QPC or reference work queue element (WQE) corresponding to the connection information and the identifier of the source vRNIC.
  • the source RNIC may determine the target virtual network address from the reference QPC or the reference WQE, where the target virtual network address may include at least one of the IP address of the source vRNIC and the IP address of the destination vRNIC; the source RNIC then determines the message forwarding information and the destination vRNIC identity indication information from a tunnel table.
  • the tunnel table includes at least one tunnel entry, and each tunnel entry in the at least one tunnel entry is used to indicate: the identity of the first vRNIC, the VXLAN network identifier (VNI) to which the second vRNIC belongs, the virtual network address, the address information of the first RNIC, and the address information of the second RNIC. The first vRNIC runs in the first RNIC, and the second vRNIC runs in the second RNIC. The virtual network address includes at least one of the IP address of the first vRNIC and the IP address of the second vRNIC. The tunnel entry, among the at least one tunnel entry, that matches the identity of the source vRNIC and the target virtual network address includes the message forwarding information.
  • multiple vRNICs may run in the source RNIC.
  • the source vRNIC is only one of the multiple vRNICs.
  • the same QPn may exist in different vRNICs.
  • the identification of different vRNICs is different. Therefore, according to the identification of the QPn and the source vRNIC, a unique QPC can be determined, that is, the reference QPC.
  • the source RNIC can determine a unique WQE, that is, the reference WQE.
  • the source RNIC determines the target virtual network address according to the target WQE corresponding to the data to be transmitted and the identifier of the source vRNIC, or determines the target virtual network address according to the target WQE, where the target virtual network address includes at least one of the IP address of the source vRNIC and the IP address of the destination vRNIC; the source RNIC then determines the message forwarding information and the destination vRNIC identity indication information from the tunnel table.
  • the tunnel table includes at least one tunnel entry, and each tunnel entry in the at least one tunnel entry is used to indicate: the identity of the first vRNIC, the VNI to which the second vRNIC belongs, the virtual network address, the address information of the first RNIC, and the address information of the second RNIC. The first vRNIC runs in the first RNIC, and the second vRNIC runs in the second RNIC. The virtual network address includes at least one of the IP address of the first vRNIC and the IP address of the second vRNIC. The tunnel entry, among the at least one tunnel entry, that matches the identity of the source vRNIC and the target virtual network address includes the message forwarding information.
  • that the tunnel entry is used to indicate the identity of the first vRNIC, the VNI to which the second vRNIC belongs, the virtual network address, the address information of the first RNIC, and the address information of the second RNIC may mean: the tunnel entry includes the identifier of the first vRNIC, the VNI to which the second vRNIC belongs, the virtual network address, the address information of the first RNIC, and the address information of the second RNIC.
  • alternatively, it may mean: the tunnel entry includes location indication information (which may also be referred to as a pointer), and the location indication information is used to indicate the location and length, in the storage device, of the identity of the first vRNIC, the VNI to which the second vRNIC belongs, the virtual network address, the address information of the first RNIC, and the address information of the second RNIC.
  • in this case, the identifier of the corresponding first vRNIC, the VNI to which the second vRNIC belongs, the virtual network address, the address information of the first RNIC, and the address information of the second RNIC can be read from the location indicated by the location indication information.
  • tunnel entries include the identifier of the first vRNIC, the VNI to which the second vRNIC belongs, the virtual network address, the address information of the first RNIC, and the address information of the second RNIC.
  • WQE can carry data.
  • the target WQE may be the WQE carrying the data to be transmitted.
  • WQE may carry location indication information.
  • the location indication information is used to indicate the storage location and length of the data in the host.
  • the target WQE may be a WQE that carries location indication information indicating the storage location of the data to be transmitted.
  • the source RNIC can use the tunnel table to determine the message forwarding information in the message encapsulation information and the destination vRNIC identity indication information.
  • the address information of the first RNIC in each tunnel entry in the tunnel table may include the IP address and the MAC address of the first RNIC.
  • the address information of the second RNIC may include the IP address and MAC address of the second RNIC.
  • the virtual network address may include at least one of the IP address of the first vRNIC and the IP address of the second vRNIC.
  • each tunnel entry includes the related information of the first vRNIC (that is, the identification of the first vRNIC and the IP address of the first vRNIC) and the related information of the second vRNIC (that is, the VNI to which the second vRNIC belongs).
  • the “first” and “second” here only distinguish the related information of the two different vRNICs included in each tunnel entry. They do not require that the related information of the two first vRNICs belonging to any two tunnel entries in the tunnel table be related information of the same vRNIC, or that the related information of the two second vRNICs included in any two tunnel entries be related information of the same vRNIC.
  • the related information of the two first vRNICs respectively belonging to any two tunnel entries in the tunnel table may be related information of the same vRNIC, or related information of different vRNICs.
  • the related information of the two second vRNICs respectively belonging to any two tunnel entries in the tunnel table may be related information of the same vRNIC, or related information of different vRNICs.
  • each tunnel entry also includes the related information of the first RNIC (that is, the address information of the first RNIC) and the related information of the second RNIC (that is, the address information of the second RNIC). The “first” and “second” here likewise only distinguish the related information of the two different RNICs included in each tunnel entry. They do not require that the related information of the two first RNICs belonging to any two tunnel entries in the tunnel table be related information of the same RNIC, or that the related information of the two second RNICs included in any two tunnel entries be related information of the same RNIC.
  • the related information of the two first RNICs respectively belonging to any two tunnel entries in the tunnel table may be related information of the same RNIC, or related information of different RNICs.
  • the related information of the two second RNICs respectively belonging to any two tunnel entries in the tunnel table may be related information of the same RNIC, or related information of different RNICs.
  • a tunnel entry that matches the identifier of the source vRNIC and the target virtual network address may be referred to as a target tunnel entry.
  • the first vRNIC of the target tunnel entry is the source vRNIC
  • the second vRNIC of the target tunnel entry is the destination vRNIC
  • the first RNIC of the target tunnel entry is the source RNIC
  • the second RNIC of the target tunnel entry is the destination RNIC.
  • the identity of the first vRNIC in the target tunnel entry is the identity of the source vRNIC
  • the VNI to which the second vRNIC in the target tunnel entry belongs is the VNI to which the destination vRNIC belongs
  • the virtual network address in the target tunnel entry is the target virtual network address
  • the address information of the first RNIC in the target tunnel entry includes the IP address and MAC address of the source RNIC
  • the address information of the second RNIC in the target tunnel entry includes the IP address and MAC address of the destination RNIC.
  • the address information of the first RNIC may further include the port number of the first RNIC
  • the address information of the second RNIC may also include the port number of the second RNIC
  • Table 1 is an illustration of a tunnel table.
  • the tunnel table shown in Table 1 includes five tunnel entries. Assume that the identity of the source vRNIC is vNIC1, the IP address of the source vRNIC is 10.1.1.1, and the IP address of the destination vRNIC is 10.1.1.11. In this case, the first tunnel entry of the five tunnel entries shown in Table 1 is the target tunnel entry that matches the identity of the source vRNIC, the IP address of the source vRNIC, and the IP address of the destination vRNIC.
  • according to the target tunnel entry, it can be determined that the IP address of the source RNIC is 192.100.1.1, the MAC address of the source RNIC is X:Y:Z:M:N:11, the IP address of the destination RNIC is 192.100.2.2, and the MAC address of the destination RNIC is M:N:X:Y:Z:22.
  • the VNI to which the destination vRNIC belongs can also be determined: the Virtual eXtensible Local Area Network (VXLAN) network identifier (VNI) to which the destination vRNIC belongs is 1001.
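  • The matching described for Table 1 can be sketched as a search over tunnel entries by source vRNIC identity and virtual network address. The first entry below uses the values quoted above; the second entry, the field names, and the function name `find_target_entry` are assumptions added only so that the example has more than one row. The MAC strings are kept in the symbolic form used in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TunnelEntry:
    first_vrnic_id: str    # identity of the first vRNIC
    second_vrnic_vni: int  # VNI to which the second vRNIC belongs
    first_vrnic_ip: str    # virtual network address: IP of the first vRNIC
    second_vrnic_ip: str   # virtual network address: IP of the second vRNIC
    first_rnic_ip: str     # address information of the first RNIC
    first_rnic_mac: str
    second_rnic_ip: str    # address information of the second RNIC
    second_rnic_mac: str


TUNNEL_TABLE: List[TunnelEntry] = [
    # Values taken from the Table 1 example in the text.
    TunnelEntry("vNIC1", 1001, "10.1.1.1", "10.1.1.11",
                "192.100.1.1", "X:Y:Z:M:N:11", "192.100.2.2", "M:N:X:Y:Z:22"),
    # Hypothetical second row, not from the application.
    TunnelEntry("vNIC2", 1002, "10.1.2.1", "10.1.2.11",
                "192.100.1.1", "X:Y:Z:M:N:11", "192.100.3.3", "M:N:X:Y:Z:33"),
]


def find_target_entry(table: List[TunnelEntry], src_vrnic_id: str,
                      src_vrnic_ip: str, dst_vrnic_ip: str) -> Optional[TunnelEntry]:
    """Return the entry matching the source vRNIC identity and virtual addresses."""
    for entry in table:
        if (entry.first_vrnic_id == src_vrnic_id
                and entry.first_vrnic_ip == src_vrnic_ip
                and entry.second_vrnic_ip == dst_vrnic_ip):
            return entry
    return None  # no target tunnel entry; the slow processing flow would be needed


target = find_target_entry(TUNNEL_TABLE, "vNIC1", "10.1.1.1", "10.1.1.11")
```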
  • the tunnel table may be stored in the buffer of the processing circuit of the source RNIC.
  • the tunnel table may be stored in the memory of the source RNIC.
  • the tunnel table may be stored in a storage device of the host where the source RNIC is installed.
  • the tunnel table may be divided into three parts.
  • the first part may be stored in the cache of the source RNIC's processor
  • the second part may be stored in the memory of the source RNIC
  • the third part can be stored in the storage device of the host of the source RNIC.
  • the source RNIC may first check whether the first part of the tunnel table includes the target tunnel entry; if the first part of the tunnel table does not include the target tunnel entry, the source RNIC may check whether the second part of the tunnel table includes the target tunnel entry; and if the second part of the tunnel table does not include the target tunnel entry either, the source RNIC may check whether the third part of the tunnel table includes the target tunnel entry.
  • any two parts of the tunnel table in the first part of the tunnel table, the second part of the tunnel table, and the third part of the tunnel table may have an intersection or no intersection.
  • the first part of the tunnel table may include the 1st to 10th tunnel entries among the 100 tunnel entries
  • the second part of the tunnel table may include the 11th to 40th tunnel entries among the 100 tunnel entries.
  • the third part of the tunnel table may include the 41st to 100th tunnel entries among the 100 tunnel entries.
  • the first part of the tunnel table may be a subset of the second part of the tunnel table and/or the third part of the tunnel table, and/or the second part of the tunnel table may be a subset of the third part of the tunnel table.
  • the tunnel table includes a total of 100 tunnel entries
  • the first part of the tunnel table may include the 1st to 10th tunnel entries among the 100 tunnel entries
  • the second part of the tunnel table may include a superset of the first part among the 100 tunnel entries.
  • the third part of the tunnel table may include the 1st to 100th tunnel entries among the 100 tunnel entries.
  • the tunnel table may be divided into two parts, and the two parts of the tunnel table may be stored in any two of: the cache of the processor of the source RNIC, the memory of the source RNIC, and the storage device of the host of the source RNIC.
  • for example, the source RNIC can first check whether the tunnel table stored in the processor cache of the source RNIC includes the target tunnel entry; if the tunnel table stored in the cache of the source RNIC processor does not include the target tunnel entry, the source RNIC checks whether the tunnel table stored in the memory of the source RNIC includes the target tunnel entry.
  • as another example, the source RNIC may first check whether the tunnel table stored in the memory of the source RNIC includes the target tunnel entry; if the tunnel table in the memory of the source RNIC does not include the target tunnel entry, the source RNIC can check whether the tunnel table stored in the storage device of the host of the source RNIC includes the target tunnel entry. Similarly, the two parts of the tunnel table may or may not have an intersection.
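  • The ordered check across the parts of the tunnel table can be sketched as a tiered lookup that stops at the first tier containing the target tunnel entry. The function name `tiered_lookup`, the tier labels, and the placeholder entries are assumptions. The tiers here overlap, which the text above permits.

```python
def tiered_lookup(key, tiers):
    """Search each tier in order and return (tier name, entry) for the first hit."""
    for name, table in tiers:
        if key in table:
            return name, table[key]
    return None  # no tier holds the target tunnel entry


key = ("vNIC1", "10.1.1.1", "10.1.1.11")
other = ("vNIC2", "10.1.2.1", "10.1.2.11")

# Ordered from the fastest storage to the slowest.
tiers = [
    ("processor cache", {}),
    ("RNIC memory", {key: "entry-1"}),
    ("host storage", {key: "entry-1", other: "entry-2"}),
]

hit = tiered_lookup(key, tiers)
```

  • The same function covers the two-part case by passing a list with two tiers.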
  • the source RNIC may determine a target configuration entry corresponding to the identity of the source vRNIC, and determine the target virtual network address according to the target configuration entry and the target WQE corresponding to the data to be transmitted.
  • the source RNIC may determine the IP address of the destination vRNIC from the target WQE, and determine the IP address of the source vRNIC from the target configuration entry.
  • the target WQE may include the IP address of the destination vRNIC.
  • the source RNIC can directly obtain the IP address of the destination vRNIC from the target WQE.
  • the target WQE may include destination vRNIC IP address indication information, and the destination vRNIC IP address indication information is used to indicate the location where the IP address of the destination vRNIC is stored.
  • the source RNIC may obtain the IP address of the destination vRNIC according to the location indicated by the destination vRNIC IP address indication information.
  • the target configuration entry is an entry in the configuration table.
  • the configuration table is used to store the correspondence between the vRNIC identifier and the IP address of the vRNIC.
  • the configuration table may include at least one configuration entry, and each configuration entry in the at least one configuration entry includes the identifier of the vRNIC and the IP address of the vRNIC.
  • the source RNIC can use the identifier of the source vRNIC to find the target configuration entry corresponding to the source vRNIC identifier from the configuration table.
  • the identifier of the vRNIC in the target configuration entry is the identifier of the source vRNIC
  • the IP address in the target configuration entry is the IP address of the source vRNIC.
  • the location where the configuration table is saved may be similar to the location where the tunnel table is saved.
  • the configuration table may be stored in any one or more of the cache of the processor of the source RNIC, the memory of the source RNIC, and the storage device of the host of the source RNIC.
  • for the storage method of the configuration table and the method for the source RNIC to look up the configuration table, reference may be made to the above-mentioned storage method of the tunnel table and the method for the source RNIC to look up the tunnel table. For brevity, details are not described herein again.
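  • Determining the target virtual network address from the configuration table and the target WQE can be sketched as follows. The WQE is modelled as a dictionary, and the key names `dst_vrnic_ip` and `dst_vrnic_ip_location`, the `HOST_MEMORY` mapping, and the function name are assumptions for illustration. The IP addresses reuse the example values given for Table 1.

```python
# Configuration table: vRNIC identifier -> IP address of that vRNIC.
CONFIG_TABLE = {"vNIC1": "10.1.1.1"}

# Hypothetical storage holding an IP address referenced by location.
HOST_MEMORY = {0x2000: "10.1.1.11"}


def target_virtual_network_address(src_vrnic_id, wqe):
    """Return (source vRNIC IP, destination vRNIC IP) for the data to be sent."""
    # The source vRNIC IP comes from the target configuration entry.
    src_ip = CONFIG_TABLE[src_vrnic_id]
    if "dst_vrnic_ip" in wqe:
        # The WQE carries the destination vRNIC IP address directly.
        dst_ip = wqe["dst_vrnic_ip"]
    else:
        # The WQE carries indication information: where the IP address is stored.
        dst_ip = HOST_MEMORY[wqe["dst_vrnic_ip_location"]]
    return src_ip, dst_ip


direct = target_virtual_network_address("vNIC1", {"dst_vrnic_ip": "10.1.1.11"})
indirect = target_virtual_network_address("vNIC1", {"dst_vrnic_ip_location": 0x2000})
```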
  • the source RNIC can determine the target virtual network address according to the target WQE.
  • the target WQE may include the IP address of the source vRNIC and the IP address of the destination vRNIC.
  • the source RNIC can directly obtain the IP address of the source vRNIC and the IP address of the destination vRNIC from the target WQE.
  • the target WQE may include virtual address indication information, and the virtual address indication information is used to indicate the location where the IP address of the source vRNIC is saved and the location where the IP address of the destination vRNIC is saved.
  • the source RNIC may obtain the IP address of the source vRNIC and the IP address of the destination vRNIC according to the location indicated by the virtual address indication information.
  • the method of determining the message forwarding information by using QPC is referred to as the QPC cache mode for short, and the method of determining the message forwarding information by using the tunnel table may be referred to as the table lookup mode.
  • the source RNIC can determine the message forwarding information from different places.
  • the source RNIC may not be able to use the QPC cache mode or the table lookup mode to determine the message forwarding information and the destination vRNIC identity indication information.
  • the source RNIC may determine that there is no target queue pair context corresponding to the connection information and the identity of the source vRNIC, or that there is no target tunnel entry that matches the identity of the source vRNIC and the target virtual network address.
  • the message forwarding information includes the IP address of the source RNIC, the MAC address of the source RNIC, the Layer 4 port number, the IP address of the destination RNIC, and the MAC address of the destination RNIC.
  • the IP address of the source RNIC, the MAC address of the source RNIC, and the Layer 4 port number can all be stored in the storage device of the source RNIC. Therefore, the source RNIC can directly obtain this information. However, the source RNIC or the host of the source RNIC may not have saved the IP address of the destination RNIC and the MAC address of the destination RNIC. In this case, the source RNIC can obtain the IP address of the destination RNIC and the MAC address of the destination RNIC by using the slow processing flow.
  • the IP address of the destination RNIC and the MAC address of the destination RNIC can be referred to as the address information of the destination RNIC, and the IP address of the source RNIC and the MAC address of the source RNIC are referred to as the address information of the source RNIC.
  • the source RNIC may obtain the identity of the source vRNIC and the VNI corresponding to the identity of the source vRNIC.
  • the source RNIC may send a request message to at least one target NIC, and each target NIC of the at least one target NIC runs at least one vRNIC belonging to the VNI, and the request message includes the identity of the source vRNIC and the target virtual network address, where The target virtual network address includes at least one of the IP address of the source vRNIC and the IP address of the destination vRNIC.
  • the source RNIC receives the feedback information sent by the destination RNIC, and the feedback information includes the address information of the destination RNIC.
  • the source RNIC or the storage device of the host of the RNIC may store the IP address of the destination RNIC and the port number of the destination RNIC, but the MAC address of the destination RNIC is not stored.
  • the source RNIC may only need to obtain the MAC address of the destination RNIC.
  • the source RNIC can obtain the MAC address of the destination RNIC by using the Address Resolution Protocol (ARP).
  • the source RNIC may obtain the VNI corresponding to the identity of the source vRNIC, broadcast an ARP request to all vRNICs belonging to the VNI, and receive an ARP response sent by the destination vRNIC.
  • the ARP response includes the MAC address of the destination vRNIC.
  • after the source RNIC obtains the address information of the destination RNIC, it can encapsulate the data to be transmitted according to the address information of the destination RNIC and the address information of the source RNIC. In addition, after obtaining the address information of the destination RNIC and the address information of the source RNIC, the source RNIC adds the corresponding tunnel entry to the tunnel table.
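  • The fallback to the slow processing flow, followed by adding the learned entry to the tunnel table, can be sketched as below. The request and feedback exchange with the target RNICs is replaced by a stub, `fake_slow_path`, which is purely an assumption; the function name `resolve` and the dictionary layout are also illustrative.

```python
def resolve(key, tunnel_table, slow_path):
    """Return destination RNIC address information, learning it on a miss."""
    entry = tunnel_table.get(key)
    if entry is not None:
        return entry               # table lookup mode: the entry already exists
    entry = slow_path(key)         # slow processing flow (request and feedback)
    if entry is not None:
        tunnel_table[key] = entry  # add the corresponding tunnel entry
    return entry


calls = []


def fake_slow_path(key):
    # Stub standing in for the request message and the feedback information.
    calls.append(key)
    return {"dst_rnic_ip": "192.100.2.2", "dst_rnic_mac": "M:N:X:Y:Z:22"}


table = {}
key = ("vNIC1", "10.1.1.1", "10.1.1.11")
first = resolve(key, table, fake_slow_path)   # miss: goes through the slow path
second = resolve(key, table, fake_slow_path)  # hit: served from the tunnel table
```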
  • the source RNIC may also maintain the tunnel table. For example, the source RNIC can set a timeout period and start a timer after a tunnel entry is written into the tunnel table. Each time the tunnel entry is hit, the timer is restarted. If the timer exceeds the timeout period without the tunnel entry being hit, the tunnel entry is deleted.
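  • The timeout-based maintenance can be sketched with a per-entry timestamp that is refreshed on every hit. The class name `AgingTunnelTable`, its methods, and the injectable `clock` parameter are assumptions introduced so that the behaviour can be exercised without waiting; a real RNIC would presumably use hardware timers.

```python
import time


class AgingTunnelTable:
    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self._entries = {}  # key -> (entry, time the timer was last started)

    def add(self, key, entry):
        # Writing an entry starts its timer.
        self._entries[key] = (entry, self.clock())

    def expire(self):
        # Delete every entry whose timer has exceeded the timeout period.
        now = self.clock()
        stale = [k for k, (_, started) in self._entries.items()
                 if now - started > self.timeout_s]
        for k in stale:
            del self._entries[k]

    def lookup(self, key):
        self.expire()
        hit = self._entries.get(key)
        if hit is None:
            return None
        entry, _ = hit
        self._entries[key] = (entry, self.clock())  # a hit restarts the timer
        return entry


# Usage with a fake clock so that time can be advanced by hand.
now = [0.0]
table = AgingTunnelTable(timeout_s=10.0, clock=lambda: now[0])
table.add("k", "entry")
now[0] = 8.0
hit_at_8 = table.lookup("k")    # hit within the timeout; timer restarts at 8
now[0] = 17.0
hit_at_17 = table.lookup("k")   # 9 s since the last hit; timer restarts at 17
now[0] = 28.0
hit_at_28 = table.lookup("k")   # 11 s since the last hit; entry was deleted
```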
  • the destination vRNIC identity indication information is used to indicate the identity of the destination vRNIC.
  • the destination vRNIC identity indication information may include the VNI to which the destination vRNIC belongs and the virtual MAC address of the destination vRNIC.
  • the destination vRNIC identity indication information may include the number of the destination vRNIC.
  • the vRNIC number may be a virtual function identification (VFID).
  • the VNI to which the destination vRNIC belongs can be obtained from the target QPC or the target tunnel entry.
  • if the message forwarding information is determined using the QPC cache mode, the source RNIC can also determine the VNI to which the destination vRNIC belongs from the target QPC; if the message forwarding information is determined using the table lookup mode, the source RNIC may also determine the VNI to which the destination vRNIC belongs from the target tunnel entry.
  • the virtual MAC address of the destination vRNIC can be obtained from the correspondence table of vRNIC IP addresses and MAC addresses saved by the source RNIC.
  • the source RNIC may store a correspondence table of IP addresses and MAC addresses.
  • the correspondence table includes multiple entries, and each entry includes an IP address and a MAC address.
  • the source RNIC can query the correspondence table according to the IP address of the destination vRNIC obtained from the target QPC or target tunnel entry to determine the matching entry in the correspondence table (that is, the entry whose IP address is the IP address of the destination vRNIC). The MAC address in the matching entry is the MAC address of the destination vRNIC.
  • each entry in the correspondence table may also include the VNI to which the vRNIC belongs.
  • the source RNIC can use the correspondence table to determine the MAC address and VNI of the destination vRNIC according to the IP address of the destination vRNIC.
  • the destination vRNIC identity indication information may include the identity of the destination vRNIC.
  • the identity indication information of the destination vRNIC can be obtained from the correspondence table of vRNIC IP addresses and identifiers stored in the source RNIC.
  • the source RNIC may store a correspondence table of IP addresses and identifiers.
  • the correspondence table includes multiple entries, and each entry includes an IP address and an identifier.
  • the source RNIC can query the correspondence table according to the IP address of the destination vRNIC obtained from the target QPC or target tunnel entry to determine the matching entry in the correspondence table (that is, the entry whose IP address is the IP address of the destination vRNIC). The identifier in the matching entry is the identifier of the destination vRNIC.
  • each entry in the correspondence table may also include the VNI to which the vRNIC belongs.
  • the source RNIC can use the correspondence table to determine the identity and VNI of the destination vRNIC according to the IP address of the destination vRNIC.
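  • Both correspondence-table variants can be sketched with a single mapping keyed by the IP address of the destination vRNIC. For compactness the two tables are merged into one here, which is an assumption; the field names, the function name `destination_identity`, the virtual MAC address, and the VFID value are all made up for illustration.

```python
# Correspondence table: destination vRNIC IP -> virtual MAC, identifier, VNI.
CORRESPONDENCE = {
    "10.1.1.11": {"mac": "02:00:00:00:01:0b", "vfid": 3, "vni": 1001},
}


def destination_identity(dst_vrnic_ip, use_vfid=False):
    """Build destination vRNIC identity indication information from its IP."""
    row = CORRESPONDENCE.get(dst_vrnic_ip)
    if row is None:
        return None  # no matching entry in the correspondence table
    if use_vfid:
        # Variant in which the indication information carries the vRNIC number.
        return {"vni": row["vni"], "vfid": row["vfid"]}
    # Variant in which it carries the VNI and the virtual MAC address.
    return {"vni": row["vni"], "mac": row["mac"]}


by_mac = destination_identity("10.1.1.11")
by_vfid = destination_identity("10.1.1.11", use_vfid=True)
```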
  • the source RNIC encapsulates the data to be transmitted to obtain the target message.
  • the target message includes the message encapsulation information and the data to be transmitted.
  • the message encapsulation information includes message forwarding information and identity indication information of the destination vRNIC.
  • the message forwarding information and the identity indication information of the destination vRNIC are obtained in step 302.
  • the source RNIC can use the acquired message forwarding information and the destination vRNIC identity indication information to encapsulate the data to be transmitted to obtain the target message.
  • the target message may also include the identity indication information of the source vRNIC.
  • the message encapsulation information may also include InfiniBand (IB) information.
  • Fig. 4 is a schematic diagram of a target message.
  • the target message shown in Figure 4 includes: a MAC header (also known as the “outer MAC header”), an IP header (also known as the “outer IP header”), a UDP header, a VXLAN header, an IB header, and a payload.
  • MAC header also known as "outer MAC header”
  • IP header also known as “outer IP header”
  • UDP header UDP header
  • VXLAN header VXLAN header
  • IB message Head IB message Head and load.
  • the MAC header includes a source MAC address and a destination MAC address.
  • the source MAC address is the MAC address of the source RNIC
  • the destination MAC address is the MAC address of the destination RNIC.
  • the IP header includes a source IP address and a destination IP address.
  • the source IP address is the IP address of the source RNIC
  • the destination IP address is the IP address of the destination RNIC.
  • the Layer 4 port number header in Figure 4 is a UDP header, which includes a UDP source port number and a destination port number; the destination port number is the VXLAN port number.
  • the UDP source port number can be a value calculated according to a hash algorithm.
  • the method for determining the UDP source port number is the same as the existing method for determining the UDP source port number, so there is no need to repeat it here.
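
As a rough illustration of a hash-derived UDP source port (one common approach, not necessarily the one an RNIC uses), the sketch below hashes a flow key with CRC-32 and folds the result into the dynamic port range 49152 to 65535 that RFC 7348 recommends for VXLAN. The function name `udp_source_port`, the choice of CRC-32, and the contents of the flow key are assumptions.

```python
import zlib

DYNAMIC_PORT_BASE = 49152   # start of the dynamic/private port range
DYNAMIC_PORT_COUNT = 16384  # 65535 - 49152 + 1


def udp_source_port(flow_key: bytes) -> int:
    """Map a flow key to a UDP source port in the range 49152-65535."""
    return DYNAMIC_PORT_BASE + (zlib.crc32(flow_key) % DYNAMIC_PORT_COUNT)
```

Because the port depends only on the flow key, all messages of one flow take the same path through equal-cost multipath routing, while different flows are spread across paths.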
  • the VXLAN header includes the identity indication information of the destination vRNIC.
  • the IB header includes the IB information.
  • the IB information may be an IB Base Transport Header (BTH). It can be understood that the target message may also include a check field, such as a frame check sequence (FCS) (not shown in the figure).
  • the VXLAN header may also include the identity indication information of the source vRNIC.
  • the identity indication information of the source vRNIC may include the VNI to which the source vRNIC belongs and the virtual MAC address of the source vRNIC.
  • the source vRNIC identity indication information may include the source vRNIC number.
  • the target message may include at least one of the identity indication information of the source vRNIC and the identity indication information of the destination vRNIC.
  • the VXLAN header may include at least one of the identity indication information of the source vRNIC and the identity indication information of the destination vRNIC.
  • the MAC header, IP header, and UDP header also include other content.
  • the embodiments of the application do not improve these contents. Therefore, the specific format and content of the MAC header, IP header, and UDP header can refer to existing protocols. For brevity, it is not necessary to repeat them here.
  • the technical solution of this application does not improve the information transmitted in the IB header. Therefore, the specific information transmitted in the IB header can refer to the IB header specified in the existing RoCE protocol. For brevity, it is not necessary to repeat it here.
  • the target message may also include a check field (not shown in the figure) in addition to the fields shown in FIG. 4.
  • the determination method and specific content of the check field are the same as the determination method and specific content of the check field in the RoCE standard message. For the sake of brevity, it will not be repeated here.
  • the target message may be a message based on the RoCE standard message.
  • the target message shown in FIG. 4 is a message based on the RoCE version 2 (version 2, v2) standard message. It can be seen that the target message shown in Fig. 4 only has one more network virtualization protocol header (that is, the VXLAN header in Fig. 4) than the standard message of RoCEv2.
  • the target message may also be a message based on the RoCE version 1 (v1) standard message. In this case, the target message can have one more network virtualization protocol header than the RoCEv1 standard message.
  • the target message does not include at least one of the following information: the IP address of the source vRNIC, the IP address of the destination vRNIC, the MAC address of the source vRNIC, the port number of the source vRNIC, and the port number of the destination vRNIC.
  • in some embodiments, the target message includes none of the following: the IP address of the source vRNIC, the IP address of the destination vRNIC, the MAC address of the source vRNIC, the port number of the source vRNIC, and the port number of the destination vRNIC.
  • if the target message had to include the above information, the source RNIC would also need to encapsulate that information into the target message as an inner message header. This adds one more packet encapsulation.
  • the above information also occupies capacity in the target message. If the target message still needs to include one or more items of the above information, the field that carries the data to be transmitted shrinks; that is, the payload capacity of the target message is reduced.
  • as a result, data to be transmitted that would otherwise fit in one message may require two messages to complete the transmission. This increases the number of packets transmitted in the network.
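
The effect on payload capacity can be estimated with a short calculation. The sketch below assumes a 1500-byte IP MTU, IPv4 without options, and the usual header sizes (IPv4 20 bytes, UDP 8, VXLAN 8, BTH 12, RoCEv2 ICRC 4, Ethernet 14). The MTU, the set of inner headers, and all names are assumptions chosen for illustration, not figures from the application.

```python
import math

MTU = 1500  # assumed IP MTU in bytes (the outer MAC header is not counted against it)

# Headers present in the single-encapsulation target message, in bytes.
OUTER = {"IPv4": 20, "UDP": 8, "VXLAN": 8, "BTH": 12, "ICRC": 4}
# Extra inner headers that would be needed if the vRNIC addresses were also carried.
INNER = {"inner MAC": 14, "inner IPv4": 20, "inner UDP": 8}


def payload_capacity(include_inner: bool) -> int:
    """Bytes left for the data to be transmitted in one message."""
    used = sum(OUTER.values()) + (sum(INNER.values()) if include_inner else 0)
    return MTU - used


def messages_needed(data_len: int, include_inner: bool) -> int:
    """Number of messages required to carry data_len bytes."""
    return math.ceil(data_len / payload_capacity(include_inner))
```

Under these assumptions a 1440-byte transfer fits in one message with a single encapsulation but needs two messages once the inner headers are added, which is the situation the text describes.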
  • the source RNIC can obtain the IB information from QPC or WQE. Specifically, when the transmission mode is RC/UC, the source RNIC may obtain the IB information from the target QPC. When the transmission mode is UD/RD, the source RNIC can obtain the IB information from the reference WQE.
  • the specific implementation manner for the source RNIC to obtain the IB information is the same as the existing specific implementation manner for obtaining the IB information. For the sake of brevity, it is not necessary to repeat it here.
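
The dependence on transmission mode can be summarised in a few lines. This is only a sketch of the selection rule stated above; the function name `ib_info_source` and the string return values are hypothetical.

```python
def ib_info_source(transmission_mode: str) -> str:
    """Return where the source RNIC reads the IB information for a given transmission mode."""
    mode = transmission_mode.upper()
    if mode in ("RC", "UC"):
        return "QPC"  # connection-oriented modes: use the target QPC
    if mode in ("UD", "RD"):
        return "WQE"  # datagram modes: use the reference WQE
    raise ValueError(f"unknown transmission mode: {transmission_mode!r}")
```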
  • the source RNIC sends the target message to the destination RNIC, where the destination vRNIC is a vRNIC running on the destination RNIC.
  • the destination RNIC receives the target message.
  • the destination RNIC determines the destination vRNIC according to the destination vRNIC identity indication information in the target message.
  • the destination vRNIC identity indication information in the target message may be carried by the network virtualization protocol header.
  • the identity indication information of the destination vRNIC may include the VNI to which the destination vRNIC belongs and the virtual MAC address of the destination vRNIC.
  • the network virtualization protocol header may carry the VNI to which the destination vRNIC belongs and the virtual MAC address of the destination vRNIC.
  • the identity indication information of the destination vRNIC may be the number of the destination vRNIC.
  • the network virtualization protocol header may carry the destination vRNIC number.
  • the network virtualization protocol header may be one of: a VXLAN header (for example, Figure 4), a VXLAN Generic Protocol Extension (VXLAN-GPE) header, a Network Service Header (NSH), or a Generic Network Virtualization Encapsulation (Geneve) header.
  • Figure 5 uses the VXLAN-GPE header as an example to introduce how to use the VXLAN-GPE header to carry the VNI to which the destination vRNIC belongs and the virtual MAC address of the destination vRNIC.
  • the VXLAN-GPE header shown in FIG. 5 includes a flag (Flags) field, a reserved field, a VNI field, a Next Protocol (NP) field, and a reserved field.
  • the flag field shown in Figure 5 is RRLLIRRR, where R indicates a reserved bit, L indicates an indicator bit used to indicate the VXLAN-GPE format, and I indicates a bit already occupied by VXLAN-GPE. It can be understood that the flag field shown in FIG. 5 uses bits 2 and 3 (counting from the most significant bit, starting at 0) as the indicator bits of the VXLAN-GPE format. In other embodiments, other reserved bits of the flag field can also be used as indicator bits of the VXLAN-GPE format.
  • when the value of the LL bits is 01, the VXLAN-GPE header includes the identity indication information of the destination vRNIC.
  • the VNI field in the VXLAN-GPE header may carry the VNI to which the destination vRNIC belongs.
  • the second reserved field of the VXLAN-GPE header may carry the MAC address of the destination vRNIC.
  • after receiving the target message, the destination RNIC can determine, according to bits 2 and 3 of the flag field in the VXLAN-GPE header, that the VXLAN-GPE header carries the identity indication information of the destination vRNIC, and can then obtain the VNI and MAC address of the destination vRNIC from the VNI field and the second reserved field of the VXLAN-GPE header.
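
The flag check can be sketched with two bit masks. The sketch assumes the flag field is a single byte laid out as RRLLIRRR from most significant to least significant bit, as in Figure 5, and that LL = 01 marks a header carrying the destination vRNIC identity indication information. The constant and function names are hypothetical.

```python
LL_MASK = 0b00110000  # the two L bits of RRLLIRRR
LL_SHIFT = 4
I_MASK = 0b00001000   # the I bit of RRLLIRRR


def ll_bits(flags: int) -> int:
    """Extract the two-bit LL value from the flag byte."""
    return (flags & LL_MASK) >> LL_SHIFT


def carries_dest_vrnic_identity(flags: int) -> bool:
    """True when LL == 01, i.e. the header carries destination vRNIC identity information."""
    return ll_bits(flags) == 0b01
```

For example, a flag byte of `0b00011000` has the I bit set and LL equal to 01, so the destination RNIC would go on to read the VNI field and the second reserved field.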
  • the destination RNIC determining the destination vRNIC may mean determining the identifier of the destination vRNIC.
  • the destination RNIC may determine the identifier of the destination vRNIC according to the obtained VNI and MAC address.
  • the identifier of a vRNIC is assigned by the RNIC on which the vRNIC is located. After the vRNIC is migrated to another RNIC, the identifier of the vRNIC changes, whereas the VNI to which the vRNIC belongs and the MAC address of the vRNIC do not. Therefore, using the VNI to which the destination vRNIC belongs and the MAC address of the destination vRNIC as the identity indication information of the destination vRNIC avoids the situation in which the destination vRNIC cannot be accurately found because its identifier changed after migration.
  • the destination RNIC can determine the identifier of the destination vRNIC by looking up the virtual device mapping table.
  • the virtual device mapping table includes at least one virtual device table entry, and each table entry includes a VNI, a MAC address, and an identifier.
  • the destination RNIC can determine, from the virtual device mapping table, a target virtual device table entry that matches the obtained VNI and MAC address. The VNI in the target virtual device table entry is the VNI, obtained by the destination RNIC, to which the destination vRNIC belongs; the MAC address in the target virtual device table entry is the MAC address of the destination vRNIC obtained by the destination RNIC; and the identifier in the target virtual device table entry is the identifier of the destination vRNIC.
  • the destination RNIC can maintain the virtual device mapping table. Specifically, when a vRNIC is created in the destination RNIC, the destination RNIC can create a virtual device table entry corresponding to the vRNIC in the virtual device mapping table. The VNI in the entry is the VNI to which the vRNIC belongs, the MAC address in the entry is the MAC address of the vRNIC, and the identifier in the entry is the identifier of the vRNIC.
  • the destination RNIC may also delete the virtual device table entry corresponding to the vRNIC after the vRNIC is destroyed (for example, migrated to another RNIC or deleted from the destination RNIC).
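
The maintenance and lookup of the virtual device mapping table can be sketched as a small class. This is an illustration only, assuming the table is held in memory and keyed by the (VNI, MAC address) pair. The class and method names are hypothetical, and the identifier values are made up.

```python
from typing import Dict, Optional, Tuple


class VirtualDeviceMappingTable:
    """Maps (VNI, vRNIC MAC address) to the identifier the local RNIC assigned to the vRNIC."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[int, str], int] = {}

    def on_vrnic_created(self, vni: int, mac: str, identifier: int) -> None:
        # Create the table entry when a vRNIC is created on this RNIC.
        self._entries[(vni, mac.lower())] = identifier

    def on_vrnic_destroyed(self, vni: int, mac: str) -> None:
        # Delete the entry when the vRNIC is migrated away or deleted.
        self._entries.pop((vni, mac.lower()), None)

    def lookup(self, vni: int, mac: str) -> Optional[int]:
        # Return the identifier of the matching vRNIC, or None if there is no match.
        return self._entries.get((vni, mac.lower()))
```

Because the key is the VNI and MAC address rather than the locally assigned identifier, a vRNIC that migrates to another RNIC is found there under the same key even though its identifier differs.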
  • the identity indication information of the destination vRNIC may be the identifier of the destination vRNIC.
  • the identifier of the destination vRNIC can also be carried in the network virtualization protocol header.
  • the second reserved field in the VXLAN-GPE header may carry the identifier of the destination vRNIC.
  • bits 2 and 3 of the flag field of the VXLAN-GPE header can also be used to indicate that the VXLAN-GPE header carries the identity indication information of the destination vRNIC. In this way, the destination RNIC can directly determine the identifier of the destination vRNIC from the VXLAN-GPE header.
  • the destination RNIC sends the target message to the destination vRNIC.
  • the destination vRNIC processes the received target message.
  • the destination vRNIC can strip the message forwarding information, the identity indication information of the destination vRNIC, and the IB information from the target message, and process the data in the payload of the target message (that is, the data to be transmitted sent by the source vRNIC).
  • the specific process of vRNIC processing the to-be-transmitted data is the same as the existing vRNIC processing process of data transmitted using RDMA technology. For brevity, it is not necessary to repeat it here.
  • the data to be transmitted can be encapsulated only once, so the overhead of tunnel encapsulation is small.
  • the encapsulated message does not need to include the source vRNIC's IP address, MAC address, and port number information, nor does it need to include the destination vRNIC's IP address and port number information.
  • the destination RNIC and the destination vRNIC do not need the above information when processing the target message. Therefore, the target message does not need to carry this redundant information. In this case, more space is left in the payload of the target message for the data to be transmitted, which can increase the effective throughput in the network.
  • the encapsulation process of the data to be transmitted can be completed by RNIC. Therefore, there is no need to set up additional hardware for packaging the data to be transmitted. This can reduce the cost of applying RDMA technology in Ethernet.
  • a VXLAN tunnel endpoint (VTEP) can be a physical device or a virtual device.
  • in the method shown in FIG. 3, the RNIC serves as the VTEP.
  • the source RNIC in the method shown in FIG. 3 may also be called a source VTEP, and the destination RNIC may also be called a destination VTEP.
  • the IP address of the source RNIC may also be referred to as the IP address of the source VTEP, and the MAC address of the source RNIC may also be referred to as the MAC address of the source VTEP.
  • the IP address of the destination RNIC can also be referred to as the IP address of the destination VTEP, and the MAC address of the destination RNIC can also be referred to as the MAC address of the destination VTEP.
  • RNIC is a physical network card. Therefore, the embodiment shown in FIG. 3 is described with a physical network card as a VTEP. In other embodiments, VTEP can also be implemented by virtual devices. In other words, the RNIC in the method shown in Figure 3 can also be understood as a kind of vRNIC.
  • the specific implementation process of implementing the communication method provided in this application through a virtual device is the same as the specific implementation process of implementing the communication method provided in this application by a physical device. For brevity, it is not necessary to repeat it here.
  • the two communicating parties are VM 211 and VM 231 respectively.
  • the data to be transmitted is stored in the storage device of VM 211, and the data to be transmitted needs to be sent to the storage device of VM 231.
  • the data to be transmitted is "Hello".
  • the source vRNIC is the vRNIC corresponding to VM 211, namely vRNIC 221
  • the destination vRNIC is the vRNIC corresponding to VM 231, namely vRNIC 241
  • the source VTEP is RNIC 220
  • the destination VTEP is RNIC 240.
  • the IP address of vRNIC 221 is 192.168.0.1
  • the IP address of vRNIC 241 is 192.168.0.2
  • the MAC address of vRNIC 241 is 1:2:3:4:5:6
  • the IP address of RNIC 220 is 10.0.0.1
  • the MAC address of RNIC 220 is A:B:C:D:E:F
  • the IP address of RNIC 240 is 10.0.0.2
  • the MAC address of RNIC 240 is X:Y:Z:M:N:O
  • the VNI is xxx.
  • the source VTEP can obtain the target QPC from the QPC cache.
  • the target QPC stores the IP address and MAC address of the destination VTEP, and the MAC address and VNI of the destination vRNIC (or a pointer to this information).
  • the source VTEP can obtain the target tunnel entry from the tunnel table.
  • the target tunnel entry stores the IP address and MAC address of the destination VTEP, and the MAC address and VNI of the destination vRNIC (or a pointer to this information).
  • the source VTEP can encapsulate the data to be transmitted according to the obtained IP address and MAC address of the destination VTEP, the MAC address and VNI of the destination vRNIC, and the IP address and MAC address of the source VTEP.
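
Reusing the example addresses, the tunnel-table lookup at the source VTEP can be sketched as follows. The sketch assumes the table is keyed by the source vRNIC identifier and the destination vRNIC IP address. The names `TunnelEntry`, `tunnel_table`, and `find_tunnel_entry` are hypothetical, and `EXAMPLE_VNI` is a made-up value standing in for the VNI that the example leaves as "xxx".

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

EXAMPLE_VNI = 100  # hypothetical stand-in for the unspecified VNI "xxx"


@dataclass(frozen=True)
class TunnelEntry:
    vni: int             # VNI to which the destination vRNIC belongs
    dest_vtep_ip: str    # IP address of the destination VTEP (RNIC 240)
    dest_vtep_mac: str   # MAC address of the destination VTEP
    dest_vrnic_mac: str  # virtual MAC address of the destination vRNIC


# Key: (source vRNIC identifier, destination vRNIC IP address).
tunnel_table: Dict[Tuple[str, str], TunnelEntry] = {
    ("vRNIC 221", "192.168.0.2"): TunnelEntry(
        vni=EXAMPLE_VNI,
        dest_vtep_ip="10.0.0.2",
        dest_vtep_mac="X:Y:Z:M:N:O",
        dest_vrnic_mac="1:2:3:4:5:6",
    ),
}


def find_tunnel_entry(source_vrnic_id: str, dest_vrnic_ip: str) -> Optional[TunnelEntry]:
    """Return the tunnel entry for this source vRNIC and destination address, if any."""
    return tunnel_table.get((source_vrnic_id, dest_vrnic_ip))
```

The entry supplies everything the outer headers need except the source VTEP's own IP and MAC addresses, which the source VTEP already knows.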
  • Figure 6 is an encapsulated target message. The target message shown in Figure 6 is encapsulated according to the above content. It is understandable that FIG. 6 only shows some key information in the target message.
  • after the destination VTEP receives the target message, it can determine from the port number in the UDP header that the message is a virtual message encapsulated with VXLAN. According to the definition of the VXLAN header, the destination VTEP finds that bits 2 and 3 of the Flags field are enabled, so it can determine that the message uses the R_VXLAN protocol defined in this embodiment of the application. It can further determine that the content carried in the second reserved field is the MAC address of the vRNIC. The destination VTEP can determine that the destination vRNIC is vRNIC 241 according to the VNI and the vRNIC MAC address obtained from the VXLAN header. The destination VTEP sends the target message to vRNIC 241. vRNIC 241 strips the outer message headers and passes the IB header to the RDMA engine of VM 231 for processing.
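
The receive-side decision in this example can be sketched as a single function. The sketch assumes the header fields have already been parsed, that the VXLAN destination port is the IANA-assigned 4789, and that the indicator bits use the value 01. The function name, the parsed-field parameters, and the VNI value 100 (standing in for "xxx") are hypothetical.

```python
from typing import Dict, Optional, Tuple

VXLAN_UDP_PORT = 4789  # IANA-assigned VXLAN port


def select_destination_vrnic(
    udp_dst_port: int,
    flags: int,
    vni: int,
    second_reserved_mac: str,
    device_table: Dict[Tuple[int, str], str],
) -> Optional[str]:
    """Return the destination vRNIC for a received message, or None if it cannot be determined."""
    if udp_dst_port != VXLAN_UDP_PORT:
        return None  # not a VXLAN-encapsulated message
    if (flags >> 4) & 0b11 != 0b01:
        return None  # indicator bits not set: no vRNIC identity carried
    # The second reserved field carries the vRNIC MAC address; match on (VNI, MAC).
    return device_table.get((vni, second_reserved_mac))


# Virtual device table on RNIC 240 for the example above.
example_table = {(100, "1:2:3:4:5:6"): "vRNIC 241"}
```

Calling `select_destination_vrnic(4789, 0b00011000, 100, "1:2:3:4:5:6", example_table)` yields `"vRNIC 241"`, matching the outcome described in the example.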
  • Fig. 7 is a schematic structural block diagram of a network card according to an embodiment of the present application.
  • the network card 700 shown in FIG. 7 includes a processing unit 701 and a sending unit 702.
  • the network card 700 is a network card supporting RDMA technology.
  • the processing unit 701 is configured to obtain data to be transmitted from a source vRNIC, where the source vRNIC is a vRNIC running on the network card 700.
  • the processing unit 701 is further configured to obtain message forwarding information and destination vRNIC identity indication information.
  • the message forwarding information includes the IP address of the network card 700, the MAC address of the network card 700, the IP address of the destination RNIC, the MAC address of the destination RNIC, and a Layer 4 port number.
  • the processing unit 701 is further configured to encapsulate the data to be transmitted to obtain a target message. The target message includes the message forwarding information, the destination vRNIC identity indication information, and the data to be transmitted, and does not include at least one of the following information: the IP address of the source vRNIC, the IP address of the destination vRNIC, the MAC address of the source vRNIC, the port number of the source vRNIC, and the port number of the destination vRNIC.
  • the sending unit 702 is configured to send the target message to the destination RNIC, where the destination vRNIC is a vRNIC running on the destination RNIC.
  • the network card 700 may be the source RNIC in the above embodiment.
  • the processing unit 701 may be implemented by a processor, and the sending unit 702 may be implemented by a transmitter.
  • for the specific functions and beneficial effects of the processing unit 701 and the sending unit 702, reference may be made to the description of the foregoing embodiment.
  • Fig. 8 is a schematic structural block diagram of a network card according to an embodiment of the present application.
  • the network card 800 shown in FIG. 8 includes a receiving unit 801 and a processing unit 802.
  • the network card 800 is a network card supporting RDMA technology.
  • the receiving unit 801 is configured to receive a message sent by a source RNIC. The message includes message forwarding information, destination virtual RNIC identity indication information, and data. The message forwarding information includes the IP address of the source RNIC, the MAC address of the source RNIC, the IP address of the network card 800, the MAC address of the network card 800, and a Layer 4 port number.
  • the message does not include at least one of the following information: the IP address of the source virtual RNIC, the IP address of the destination virtual RNIC, the MAC address of the source virtual RNIC, the port number of the source virtual RNIC, and the port number of the destination virtual RNIC.
  • the source virtual RNIC is a virtual RNIC running in the source RNIC.
  • the destination virtual RNIC is a virtual RNIC running on the network card 800.
  • the processing unit 802 is configured to determine the destination virtual RNIC according to the identity indication information of the destination virtual RNIC.
  • the processing unit 802 is further configured to send the message to the destination vRNIC.
  • the network card 800 may be the destination RNIC in the above embodiment.
  • the receiving unit 801 may be implemented by a receiver, and the processing unit 802 may be implemented by a processor.
  • Fig. 9 is a structural block diagram of a network card provided according to an embodiment of the present application.
  • the network card 900 shown in FIG. 9 includes a processor 901, a memory 902, and a transceiver 903.
  • the network card 900 is a network card supporting RDMA technology.
  • the processor 901, the memory 902, and the transceiver 903 communicate with each other through an internal connection path to transfer control and/or data signals.
  • the method disclosed in the foregoing embodiments of the present application may be applied to the processor 901 or implemented by the processor 901.
  • the processor 901 may be an integrated circuit chip with signal processing capability. In the implementation process, each step of the foregoing method can be completed by an integrated logic circuit of hardware in the processor 901 or instructions in the form of software.
  • the aforementioned processor 901 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a storage medium that is mature in the field, such as random access memory (RAM), flash memory, read-only memory (ROM), programmable read-only memory, electrically erasable programmable memory, or a register.
  • the storage medium is located in the memory 902, and the processor 901 reads the instructions in the memory 902, and completes the steps of the foregoing method in combination with its hardware.
  • the memory 902 may exist independently of the processor 901. In this case, the memory 902 may be connected to the processor 901 through a connection path. In another possible design, the memory 902 may also be integrated with the processor 901, which is not limited in the embodiment of the present application.
  • the memory 902 may store instructions for executing the method executed by the source RNIC in the method shown in FIG. 3.
  • the processor 901 can execute the instructions stored in the memory 902 in combination with other hardware (such as the transceiver 903) to complete the steps of the source RNIC in the method shown in FIG. 3.
  • the memory 902 may store instructions for executing the method executed by the destination RNIC in the method shown in FIG. 3.
  • the processor 901 can execute the instructions stored in the memory 902 in combination with other hardware (such as the transceiver 903) to complete the steps of the destination RNIC in the method shown in FIG. 3.
  • An embodiment of the present application also provides a chip, which includes a transceiver unit and a processing unit.
  • the transceiver unit may be an input/output circuit or a communication interface
  • the processing unit is a processor or microprocessor or integrated circuit integrated on the chip.
  • the chip can execute the method of the source RNIC in the above method embodiment.
  • An embodiment of the present application also provides a chip, which includes a transceiver unit and a processing unit.
  • the transceiver unit may be an input/output circuit or a communication interface
  • the processing unit is a processor or microprocessor or integrated circuit integrated on the chip.
  • the chip can execute the method executed by the destination RNIC in the foregoing embodiment.
  • the embodiment of the present application also provides a computer-readable storage medium on which instructions are stored, and when the instructions are executed, the method of the source RNIC in the foregoing method embodiment is executed.
  • a computer-readable storage medium is provided with instructions stored thereon, and when the instructions are executed, the method of the destination RNIC in the foregoing method embodiment is executed.
  • the embodiment of the present application also provides a computer program product containing instructions that, when executed, execute the method of the source RNIC in the foregoing method embodiment.
  • a computer program product containing instructions is provided, and when the instructions are executed, the method of the destination RNIC in the foregoing method embodiment is executed.
  • the disclosed system, device, and method may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division, and there may be other divisions in actual implementation. For example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • each unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • if the functions are implemented in the form of a software functional unit and sold or used as an independent product, they can be stored in a computer-readable storage medium.
  • the technical solution of this application essentially, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the method described in each embodiment of the present application.
  • the aforementioned storage media include: a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

Abstract

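This application provides a communication method and a network card. The method includes: a source RNIC obtains data to be transmitted that is sent by a source virtual RNIC; the source RNIC obtains message forwarding information and destination virtual RNIC identity indication information; and the source RNIC encapsulates the data to be transmitted to obtain a target message and sends the target message to a destination RNIC, where the destination virtual RNIC is a virtual RNIC running on the destination RNIC. Based on this technical solution, the data to be transmitted can be encapsulated only once, so the overhead of tunnel encapsulation is small. More space can be left in the payload of the target message for the data to be transmitted, which can improve the effective throughput in the network.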

Description

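Communication method and network card
This application claims priority to Chinese Patent Application No. 201910655048.9, filed with the China Patent Office on July 19, 2019 and entitled "Communication method and network card", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of communications technologies, and more specifically, to a communication method and a network card.
Background
Remote direct memory access (RDMA) is a technology developed to address the data-processing latency of computing nodes during network transmission. With RDMA, data can be transferred directly from the memory of one computing node to another computing node without involving the peer's operating system. This enables high-throughput, low-latency network communication.
To be compatible with Ethernet, the industry proposed RDMA over Converged Ethernet (RoCE). In this way, RDMA can be applied in cloud computing scenarios or similar scenarios.
In cloud computing scenarios or similar scenarios, tenants mostly run in virtual machines or containers. In this case, RDMA capability needs to be provided to the virtual machines or containers. To this end, hardware virtualization technology can be used to implement input/output (IO) virtualization. Specifically, hardware virtualization technology can abstract one physical network card into multiple virtual network cards that support RDMA. By deploying a dedicated virtual network card driver inside the virtual machine, a tenant can use the virtual network card just like a physical network card.
When a tenant needs to send data, the virtual network card can read the data directly from the memory of the virtual machine. Note that because the data is read directly from the memory of the virtual machine and the virtual network card completes the message encapsulation, the message sent from the virtual network card is encapsulated using virtual network addresses. The source and destination media access control (MAC) addresses in the Ethernet layer 2 protocol header, and the source and destination Internet Protocol (IP) addresses in the IP header, are all addresses of virtual network cards. Network devices (such as routers and switches) cannot recognize the addresses of virtual network cards, so the message cannot be correctly routed to the target node.
Therefore, how to use RoCE for data transmission in a virtualization scenario is a problem that urgently needs to be solved.
Summary
This application provides a communication method and a network card, which can improve the effective throughput in a network.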
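In a first aspect, an embodiment of this application provides a communication method. The method includes: a source RDMA-capable network interface card (RNIC) obtains data to be transmitted that is sent by a source virtual RNIC, where the source virtual RNIC is a virtual RNIC running on the source RNIC; the source RNIC obtains message forwarding information and destination virtual RNIC identity indication information, where the message forwarding information includes the Internet Protocol (IP) address of the source RNIC, the media access control (MAC) address of the source RNIC, the IP address of a destination RNIC, the MAC address of the destination RNIC, and a Layer 4 port number; the source RNIC encapsulates the data to be transmitted to obtain a target message, where the target message includes the message forwarding information, the destination virtual RNIC identity indication information, and the data to be transmitted, and the target message does not include at least one of the following information: the IP address of the source virtual RNIC, the IP address of the destination virtual RNIC, the MAC address of the source virtual RNIC, the port number of the source virtual RNIC, and the port number of the destination virtual RNIC; and the source RNIC sends the target message to the destination RNIC, where the destination virtual RNIC is a virtual RNIC running on the destination RNIC.
Optionally, the source RNIC may be a physical network card supporting RDMA technology, or a virtual network card supporting RDMA technology.
Optionally, the destination RNIC may be a physical network card supporting RDMA technology, or a virtual network card supporting RDMA technology.
The foregoing technical solution enables data transmission using RoCE in a virtualization scenario. In addition, the data to be transmitted can be encapsulated only once. Specifically, the source RNIC does not need to first perform one encapsulation that places the IP address, MAC address, and port number information of the source vRNIC and the IP address and port number information of the destination vRNIC into the message, and then perform a second encapsulation that places the IP address of the source RNIC, the MAC address of the source RNIC, the IP address of the destination RNIC, the MAC address of the destination RNIC, and the Layer 4 port number into the message. The source RNIC can perform a single encapsulation that places the IP address of the source RNIC, the MAC address of the source RNIC, the IP address of the destination RNIC, the MAC address of the destination RNIC, and the Layer 4 port number into the message. Because the encapsulated message no longer includes the IP address, MAC address, and port number information of the source vRNIC or the IP address and port number information of the destination vRNIC, more space can be left in the payload of the target message for the data to be transmitted, which can improve the effective throughput in the network. In addition, in the foregoing solution, the encapsulation of the data to be transmitted can be completed by the RNIC. Therefore, no additional hardware is needed for encapsulating the data to be transmitted, which can reduce the cost of applying RDMA technology in Ethernet. The target message may be a message based on the RoCE standard format.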
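With reference to the first aspect, in a possible implementation of the first aspect, the source RNIC obtaining the message forwarding information and the destination virtual RNIC identity indication information includes: the source RNIC obtains the message forwarding information and the destination virtual RNIC identity indication information according to the identifier of the source virtual RNIC and the transmission mode used to transmit the data to be transmitted. Based on this technical solution, the source RNIC can choose different ways to obtain the message forwarding information depending on the transmission mode.
With reference to the first aspect, in a possible implementation of the first aspect, the source RNIC obtaining the message forwarding information and the destination virtual RNIC identity indication information according to the identifier of the source virtual RNIC and the transmission mode includes: when the transmission mode is reliable connection (RC) or unreliable connection (UC), the source RNIC determines the message forwarding information and the destination virtual RNIC identity indication information according to a target queue pair context, where the target queue pair context corresponds to connection information and the identifier of the source virtual RNIC. Based on this technical solution, when the transmission mode is RC or UC, the source RNIC can obtain the message forwarding information and the identity indication information of the destination virtual RNIC directly from the queue pair context.
With reference to the first aspect, in a possible implementation of the first aspect, the source RNIC obtaining the message forwarding information and the destination virtual RNIC identity indication information according to the identifier of the source virtual RNIC and the transmission mode includes: when the transmission mode is reliable connection (RC) or unreliable connection (UC), the source RNIC determines a target virtual network address from a reference queue pair context or a reference WQE, where the target virtual network address includes at least one of the IP address of the source virtual RNIC and the IP address of the destination virtual RNIC, the reference queue pair context corresponds to connection information and the identifier of the source virtual RNIC, and the reference WQE corresponds to the connection information and the identifier of the source virtual RNIC; and the source RNIC determines the message forwarding information and the destination virtual RNIC identity indication information from a tunnel table, where the tunnel table includes at least one tunnel entry, and each tunnel entry indicates: the identifier of a first virtual RNIC, the virtual extensible local area network network identifier (VNI) to which a second virtual RNIC belongs, a virtual network address, address information of a first RNIC, and address information of a second RNIC, where the first virtual RNIC runs in the first RNIC, the second virtual RNIC runs in the second RNIC, the virtual network address includes at least one of the IP address of the first virtual RNIC and the IP address of the second virtual RNIC, and the tunnel entry that matches the identifier of the source virtual RNIC and the target virtual network address includes the message forwarding information. Based on this technical solution, when the transmission mode is RC or UC, the source RNIC can obtain the message forwarding information and the identity indication information of the destination virtual RNIC from a pre-stored tunnel table.
With reference to the first aspect, in a possible implementation of the first aspect, the source RNIC obtaining the message forwarding information and the destination virtual RNIC identity indication information according to the identifier of the source virtual RNIC and the transmission mode includes: when the transmission mode is unreliable datagram (UD) or reliable datagram (RD), the source RNIC determines a target virtual network address according to a target WQE corresponding to the data to be transmitted and the identifier corresponding to the source virtual RNIC, or determines the target virtual network address according to the target WQE, where the target virtual network address includes at least one of the IP address of the source virtual RNIC and the IP address of the destination virtual RNIC; and the source RNIC determines the message forwarding information and the destination virtual RNIC identity indication information from a tunnel table, where the tunnel table includes at least one tunnel entry, and each tunnel entry indicates: the identifier of a first virtual RNIC, the virtual extensible local area network network identifier (VNI) to which a second virtual RNIC belongs, a virtual network address, address information of a first RNIC, and address information of a second RNIC, where the first virtual RNIC runs in the first RNIC, the second virtual RNIC runs in the second RNIC, the virtual network address includes at least one of the IP address of the first virtual RNIC and the IP address of the second virtual RNIC, and the tunnel entry that matches the identifier of the source virtual RNIC and the target virtual network address includes the message forwarding information. Based on this technical solution, when the transmission mode is RD or UD, the source RNIC can obtain the message forwarding information and the identity indication information of the destination virtual RNIC directly from the queue pair context.
With reference to the first aspect, in a possible implementation of the first aspect, the source RNIC obtaining the message forwarding information and the destination virtual RNIC identity indication information includes: the source RNIC sends a request message to at least one target NIC, where each of the at least one target NIC runs at least one virtual RNIC, the at least one virtual RNIC belongs to the same VNI as the source virtual RNIC, and the request message includes the identifier of the source virtual RNIC and a target virtual network address, where the target virtual network address includes at least one of the IP address of the source virtual RNIC and the IP address of the destination virtual RNIC; the source RNIC receives feedback information sent by the destination RNIC, where the feedback information includes the IP address of the destination RNIC and the MAC address of the destination RNIC; and the source RNIC determines the message forwarding information and the destination virtual RNIC identity indication information according to the feedback information. Based on this technical solution, when the source RNIC has not stored the IP address and MAC address of the destination RNIC, the source RNIC can actively obtain the IP address of the destination RNIC and the MAC address of the destination RNIC.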
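With reference to the first aspect, in a possible implementation of the first aspect, the target packet includes a MAC header, an IP header, a layer-4 port number header, a network virtualization protocol header, and a payload field, where the MAC header includes the MAC address of the source RNIC and the MAC address of the destination RNIC; the IP header includes the IP address of the source RNIC and the IP address of the destination RNIC; the layer-4 port number header includes the layer-4 port number; the network virtualization protocol header includes the identity indication information of the destination virtual RNIC; and the payload field includes the to-be-transmitted data. In this technical solution, the format of the target packet is similar to existing packet formats, so only small changes to existing formats are needed, which makes the solution of this application easier to implement.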
结合第一方面,在第一方面的一种可能的实现方式中,该目的虚拟RNIC的身份指示信息包括该目的虚拟RNIC所属的VNI和该目的虚拟RNIC的虚拟MAC地址。利用该目的虚拟RNIC所属的VNI和该目的虚拟RNIC的MAC地址作为该目的虚拟RNIC的身份指示信息可以避免因该目的虚拟RNIC发生迁移造成的该目的虚拟RNIC的标识发生变化导致的无法准确找到目的虚拟RNIC的情况发生。
结合第一方面,在第一方面的一种可能的实现方式中,该目的虚拟RNIC的身份指示信息包括该目的虚拟RNIC编号。
结合第一方面,在第一方面的一种可能的实现方式中,该目标报文还包括该源虚拟RNIC的身份指示信息。
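With reference to the first aspect, in a possible implementation of the first aspect, the network virtualization protocol header of the target packet includes identity indication information of the source virtual RNIC.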
第二方面,本申请实施例提供一种通信方法,该方法包括:目的RNIC接收源RNIC发送的报文,该报文包括报文转发信息、目的虚拟RNIC身份指示信息和数据,该报文转发信息包括该源RNIC的互联网协议IP地址、该源RNIC的媒体访问控制MAC地址、目的RNIC的IP地址、该目的RNIC的MAC地址和四层端口号,该目标报文不包括以下信息中的至少一个:源虚拟RNIC的IP地址、该目的虚拟RNIC的IP地址、源虚拟RNIC的MAC地址、该源虚拟RNIC的端口号和该目的虚拟RNIC的端口号,该源虚拟RNIC是运行在该源RNIC中的一个虚拟RNIC,该目的虚拟RNIC是运行在该目的RNIC中的一个虚拟RNIC;该目的RNIC根据该目的虚拟RNIC身份指示信息,确定该目的虚拟RNIC;该目的RNIC将该报文发送至该目的vRNIC。上述技术方案中,目的RNIC接收到的报文中不再包括源vRNIC的IP地址、MAC地址、端口号信息、目的vRNIC的IP地址和端口号信息。因此该报文的负载(payload)中可以空余更多的空间用于传输数据,这样可以提升网络中的有效吞吐。
结合第二方面,在第二方面的一种可能的实现方式中,该报文包括:MAC头、IP头、四层端口号头、网络虚拟化协议头和负载字段,其中,该MAC头中包括该源RNIC的MAC地址和该目的RNIC的MAC地址;该IP头中包括该源RNIC的IP地址和该目的RNIC的IP地址;该四层端口号头中包括四层端口号;该网络虚拟化头包括该目的虚拟RNIC的身份指示信息;该负载字段包括该数据。上述技术方案中,目标报文的报文格式与现有的报文格式类似。因此,对现有报文格式改动较小,便于本申请技术方案的实现。
结合第二方面,在第二方面的一种可能的实现方式中,该目的虚拟RNIC身份指示信息包括:该目的虚拟RNIC所属的VNI和该目的虚拟RNIC的虚拟MAC地址;该目的RNIC根据该目的虚拟RNIC身份指示信息,确定该目的虚拟RNIC,包括:该目的RNIC从虚拟设备映射表中确定该目的虚拟RNIC,其中该虚拟设备映射表中包括至少一个虚拟设备表项,该至少一个虚拟设备表项中的每个表项包括VNI、MAC地址和标识,其中该至少一个虚拟设备表项中与该目的虚拟RNIC所属的VNI和该目的虚拟RNIC的虚拟MAC地址匹配的虚拟设备表项中的标识为该目的虚拟RNIC的标识。利用该目的虚拟RNIC所属的VNI和该目的虚拟RNIC的MAC地址作为该目的虚拟RNIC的身份指示信息可以避 免因该目的虚拟RNIC发生迁移造成的该目的虚拟RNIC的标识发生变化导致的无法准确找到目的虚拟RNIC的情况发生。
结合第二方面,在第二方面的一种可能的实现方式中,该目的虚拟RNIC的身份指示信息包括该目的虚拟RNIC编号。
结合第二方面,在第二方面的一种可能的实现方式中,该目标报文还包括该源虚拟RNIC的身份指示信息。
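With reference to the second aspect, in a possible implementation of the second aspect, the network virtualization protocol header of the target packet includes identity indication information of the source virtual RNIC.
In a third aspect, an embodiment of this application provides a network interface card that includes units for implementing the first aspect or any possible implementation of the first aspect. The network interface card supports RDMA.
In a fourth aspect, an embodiment of this application provides a network interface card that includes units for implementing the second aspect or any possible implementation of the second aspect. The network interface card supports RDMA.
In a fifth aspect, an embodiment of this application provides a computer-readable storage medium that stores instructions for implementing the method of the first aspect or any possible implementation of the first aspect.
In a sixth aspect, an embodiment of this application provides a computer-readable storage medium that stores instructions for implementing the method of the second aspect or any possible implementation of the second aspect.
In a seventh aspect, this application provides a computer program product containing instructions that, when run on a computer, cause the computer to perform the method of the first aspect or any possible implementation of the first aspect.
In an eighth aspect, this application provides a computer program product containing instructions that, when run on a computer, cause the computer to perform the method of the second aspect or any possible implementation of the second aspect.
In a ninth aspect, this application provides a communication apparatus that includes a processing circuit and a storage medium. The storage medium stores program code, and the processing circuit is configured to invoke the program code in the storage medium to perform the method of the first aspect or any possible implementation of the first aspect. The communication apparatus supports RDMA.
In a tenth aspect, this application provides a communication apparatus that includes a processing circuit and a storage medium. The storage medium stores program code, and the processing circuit is configured to invoke the program code in the storage medium to perform the method of the second aspect or any possible implementation of the second aspect. The communication apparatus supports RDMA.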
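BRIEF DESCRIPTION OF DRAWINGS
FIG. 1 is a schematic diagram of a system architecture according to an embodiment of this application.
FIG. 2 is a schematic diagram of a system architecture according to an embodiment of this application.
FIG. 3 is a schematic flowchart of a communication method according to an embodiment of this application.
FIG. 4 is a schematic diagram of a target packet.
FIG. 5 is a schematic diagram of a VXLAN-GPE header.
FIG. 6 is a schematic diagram of an encapsulated target packet.
FIG. 7 is a schematic structural block diagram of a network interface card according to an embodiment of this application.
FIG. 8 is a schematic structural block diagram of a network interface card according to an embodiment of this application.
FIG. 9 is a structural block diagram of a network interface card according to an embodiment of this application.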
具体实施方式
下面将结合附图,对本申请中的技术方案进行描述。
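The technical solutions in the embodiments of this application can be applied to network interface cards (NICs) and compute nodes that support remote direct memory access (RDMA), for example, NICs and compute nodes in a data center that supports RDMA, or other NICs and compute nodes that support RDMA. A compute node can be connected to a NIC. A compute node is an electronic device with computing capability, such as a server or a personal computer (for example, a desktop computer or a laptop). The NIC may be referred to as the NIC of that compute node. A NIC may also be called a network interface card, a network adapter, a physical network interface, and so on. The NIC in the embodiments of this application may be a NIC that supports RDMA, and may therefore also be called an RDMA network interface card (RNIC).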
可选的,在一些实施例中,计算节点的RNIC可以是内置在该计算节点内部的。例如,该计算节点的RNIC可以通过高速串行计算机扩展总线标准(Peripheral Component Interconnect Express,PCIe)接口、或用于加速器的缓存一致互联(cache coherent interconnect for accelerator,CCIX)接口等接口与该计算节点的主板连接。该计算节点可以称为该RNIC的主机(host)。
可选的,在另一些实施例中,计算节点的RNIC可以是该计算节点的一个外置设备。例如,该RNIC可以通过PCIe接口、快速路径互联(Quick Path Interconnect,QPI)接口、通用串行总线(Universal Serial Bus,USB)接口等与计算节点连接。
可选的,网卡可以和CPU集成在一个SOC系统内。
在本申请实施例中,计算节点包括硬件层、运行在硬件层之上的操作系统层,以及运行在操作系统层上的应用层。该硬件层包括中央处理器(central processing unit,CPU)、内存管理单元(memory management unit,MMU)和内存(也称为主存)等硬件。该操作系统可以是任意一种或多种通过进程(process)实现业务处理的计算机操作系统,例如,Linux操作系统、Unix操作系统、Android操作系统、iOS操作系统或windows操作系统等。该应用层包含分布式数据库、分布式存储、分布式AI系统等应用。并且,本申请实施例并未对本申请实施例提供的方法的执行主体的具体结构特别限定,只要能够通过运行记录有本申请实施例的提供的方法的代码的程序,以根据本申请实施例提供的方法进行通信即可,例如,本申请实施例提供的方法的执行主体可以是计算节点,或者,是计算节点中能够调用程序并执行程序的功能模块。
图1是本申请实施例提供的一种系统架构的示意图。如图1所示的系统100中包括计算节点110、RNIC 111、计算节点120和RNIC 121。
计算节点110的RNIC可以是RNIC 111,计算节点120的RNIC可以是RNIC 121。
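RNIC 111 and RNIC 121 may be connected through a communication link. The medium of the link may be optical fiber or the like; the embodiments of this application do not limit the specific medium of the communication link between network devices. There may be one or more switching nodes between RNIC 111 and RNIC 121, or the two may communicate directly. As shown in FIG. 1, compute node 110 includes a storage apparatus 112, which may be used to store the queue information and application data of compute node 110. Compute node 120 includes a storage apparatus 122, which may be used to store the queue information of compute node 120.
In a possible embodiment, although FIG. 1 shows storage apparatus 112 inside compute node 110 and storage apparatus 122 inside compute node 120, storage apparatus 112 may also be a storage apparatus attached externally to compute node 110 or RNIC 111, and storage apparatus 122 may also be a storage apparatus attached externally to compute node 120 or RNIC 121.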
可以理解的是,图1仅示出了两个计算节点通过网络设备的连接关系。在一些支持RDMA技术的网络(例如数据中心网络)中可以包括更多的计算节点。这样的网络中的任意两个计算节点都可以通过如图1所示的方法连接。换句话说,图1所示的系统100可以是支持RDMA技术的网络中的任意两个计算节点的连接方式。
利用硬件虚拟化技术,可以将一个物理RNIC抽象为多个支持RDMA技术的虚拟RNIC。为了便于描述,以下可以将支持RDMA技术的虚拟NIC称为vRNIC(virtual RNIC,vRNIC)。除非特殊说明,本申请实施例中所称的RNIC均是指物理RNIC。
图2是本申请实施例提供的一种系统架构的示意图。如图2所示的计算节点210中部署有三个虚拟机(virtual machine,VM),分别为VM 211、VM 212和VM 213。RNIC 220中部署有三个vRNIC,分别为vRNIC 221、vRNIC 222和vRNIC 223。计算节点230中部署有两个VM,分别为VM 231和VM 232。RNIC 240中部署有两个vRNIC,分别为vRNIC 241和vRNIC 242。RNIC 220是计算节点210的RNIC,RNIC 240是计算节点230的RNIC。换句话说,计算节点210是RNIC 220的主机,计算节点230是RNIC 240的主机。
部署在计算节点中的VM与RNIC中的vRNIC可以是一一对应,也可以是一个VM配置多个vRNIC。为了便于描述,图2所示的系统中,部署在计算节点210中的三个VM与部署在RNIC 220中的三个vRNIC一一对应。部署在计算节点230中的两个VM与部署在RNIC 240中的两个vRNIC一一对应。这里所称的VM与vRNIC一一对应是指vRNIC是对应的VM的vRNIC。例如,VM 211的vRNIC是vRNIC 221,VM 231的vRNIC是vRNIC 241。VM 211和VM 231之间的RDMA通信可以通过vRNIC 221和vRNIC 241实现。
可以理解的是,图2所示的系统是假设租户部署在VM中。因此,图2示出了VM和vRNIC的对应关系。在另一些实现方式中,租户可以部署在容器(container)中。在此情况下,每个容器有一个对应的vRNIC。容器之间的RDMA通信可以通过各自对应的vRNIC实现。
FIG. 3 is a schematic flowchart of a communication method according to an embodiment of this application.
301,源RNIC获取源vRNIC发送的待传输数据,其中该源vRNIC是运行在该源RNIC上的一个vRNIC。
该待传输数据是该源vRNIC从该源vRNIC的主机的存储装置中获取的数据。可以理解的是,源vRNIC的主机是部署在一个计算节点中的虚拟机。因此,该源vRNIC的主机的存储装置是部署有该虚拟机的计算节点的存储装置。
302,源RNIC获取报文转发信息和目的vRNIC身份指示信息。
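The packet forwarding information includes the IP address of the source RNIC, the MAC address of the source RNIC, the layer-4 port number, the IP address of the destination RNIC, and the MAC address of the destination RNIC. More specifically, the packet forwarding information includes a MAC header, an IP header, and a layer-4 port number header. The MAC header includes the MAC address of the source RNIC and the MAC address of the destination RNIC. The IP header includes the IP address of the source RNIC and the IP address of the destination RNIC. The layer-4 port number header includes the layer-4 port number. Layer 4 refers to the fourth layer of the Open System Interconnection (OSI) model, that is, the transport layer, so the layer-4 port number may also be called the transport layer port number. The layer-4 port number may be a User Datagram Protocol (UDP) port number, a Transmission Control Protocol (TCP) port number, or the like, and may include a source port number and a destination port number.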
可选的,在一些实施例中,该源RNIC获取报文转发信息和目的vRNIC身份指示信息可以包括:该源RNIC获取该源vRNIC的标识;该源RNIC根据该源vRNIC的标识和该待传输数据的传输模式,获取该报文转发信息和该目的vRNIC身份指示信息。
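The source RNIC may obtain the identifier of the source vRNIC by using a doorbell mechanism. When the source vRNIC needs to send data, it can use the doorbell mechanism to tell the source RNIC which vRNIC needs to send data. The source vRNIC may store data in a preset format in a register or storage space agreed on in advance with the source RNIC; when the source RNIC detects that the content of that register or storage space has changed, it reads the preset-format data from it. In other words, the doorbell mechanism can use a preset register or storage space to hold data in a preset format.
For example, suppose the doorbell mechanism is implemented with a register. The source vRNIC may write its identifier to the register. After detecting the doorbell, the source RNIC may read the identifier of the source vRNIC from the register and record it.
Optionally, after reading and recording the identifier of the source vRNIC from the register, the source RNIC may notify the source vRNIC that the identifier stored in the register can be deleted. After receiving the notification, the source vRNIC deletes the identifier stored in the register.
Optionally, the identifier may be stored in the register on a first-in-first-out basis, so that once the identifier has been read it is removed from the register.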
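The doorbell flow described above can be pictured with a small model. This is only an illustrative sketch: the class name `Doorbell`, its methods, and the identifier strings are hypothetical, and a real RNIC would use a hardware register or agreed memory region rather than a Python queue.

```python
from collections import deque


class Doorbell:
    """Toy model of a doorbell register shared by vRNICs and the physical RNIC."""

    def __init__(self):
        # First-in-first-out storage standing in for the agreed register.
        self._fifo = deque()

    def ring(self, vrnic_id):
        # vRNIC side: write its identifier to announce that it has data to send.
        self._fifo.append(vrnic_id)

    def poll(self):
        # RNIC side: read the oldest identifier; reading also removes it.
        return self._fifo.popleft() if self._fifo else None


doorbell = Doorbell()
doorbell.ring("vRNIC-221")   # hypothetical identifiers
doorbell.ring("vRNIC-222")
first = doorbell.poll()
second = doorbell.poll()
empty = doorbell.poll()
```

With the first-in-first-out variant no separate delete notification is needed, because a read consumes the entry.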
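Data transmission between the source vRNIC and the destination vRNIC is implemented using RDMA. The transmission mode of an RDMA transfer may be one of: reliable connection (RC), reliable datagram (RD), unreliable connection (UC), and unreliable datagram (UD). In other words, the transmission mode of the to-be-transmitted data may be one of RC, UC, UD, or RD.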
该源RNIC可以根据不同的传输模式,采取不同的策略获取该报文转发信息和该目的vRNIC身份指示信息。
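Optionally, in some embodiments, when the transmission mode is RC or UC, the source RNIC may obtain connection information; the source RNIC may determine the target queue pair context (QPC) corresponding to the connection information and the identifier of the source vRNIC; and the source RNIC may determine the packet forwarding information and the destination vRNIC identity indication information from the target queue pair context.
The connection information is information that indicates a queue pair. For example, in some embodiments, the connection information may be a queue pair number (QPn). In other embodiments, the connection information may be other information that can indicate a queue pair. For example, different queue pairs may correspond to different identifiers, and the connection information may be the identifier of a queue pair. As another example, the connection information may be the identifier of a communication endpoint, such as the identifier of the receiving end. The identifier of the receiving end may be the identifier of the destination vRNIC or the identifier of the virtual machine corresponding to the destination vRNIC.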
该源RNIC也可以利用门铃(doorbell)机制,获取该QPn。具体实现方式与该源RNIC利用门铃机制获取该源vRNIC的标识的方式相同,为了简洁,在此就不再赘述。
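As described above, multiple vRNICs may run on the source RNIC, and the source vRNIC is only one of them. Different vRNICs may use the same QPn, but their identifiers differ. Therefore, the QPn and the identifier of the source vRNIC together determine a unique QPC, namely the target QPC.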
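A minimal sketch of why the pair (vRNIC identifier, QPn) is needed as the lookup key. The table layout, field names, and all values below are hypothetical and only illustrate the keying; the document does not specify how QPCs are stored.

```python
# QPC store keyed by (vRNIC identifier, QPn). Two vRNICs reuse QPn 7,
# so the QPn alone would be ambiguous.
qpc_table = {
    ("vRNIC-1", 7): {"dst_rnic_ip": "192.100.2.2", "vni": 1001},
    ("vRNIC-2", 7): {"dst_rnic_ip": "192.100.3.3", "vni": 1002},
}


def find_target_qpc(vrnic_id, qpn):
    """Return the unique QPC for this vRNIC and queue pair, or None."""
    return qpc_table.get((vrnic_id, qpn))


qpc_a = find_target_qpc("vRNIC-1", 7)
qpc_b = find_target_qpc("vRNIC-2", 7)
missing = find_target_qpc("vRNIC-3", 7)
```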
可选的,在一些实施例中,该目标QPC中可以包括该报文转发信息。换句话说,该源RNIC可以直接从该目标QPC中获取该报文转发信息。
可选的,在另一些实施例中,该目标QPC中可以包括该报文转发信息的地址。该源RNIC可以根据该报文转发信息的地址,获取该报文转发信息。
可选的,在另一些实施例中,该目标QPC中可以包括该报文转发信息的部分信息以及该报文转发信息的另一部分信息的地址。这样,该源RNIC可以直接从该目标QPC中获取一部分报文转发信息,然后根据该目标QPC中的地址获取另一部分报文转发信息。
可选的,在一些实施例中,在该传输模式为RC或UC的情况下,该源RNIC可以获取连接信息;该源RNIC可以确定与该连接信息和该源vRNIC标识对应的参考QPC或者参考工作队列元素(work queue element,WQE)。该源RNIC可以从该参考QPC或者该参考WQE中,确定目标虚拟网络地址,该目标虚拟网络地址可以包括该源vRNIC的IP地址和该目的vRNIC的IP地址中的至少一个;该源RNIC从隧道表确定该报文转发信息和该目的虚拟RNIC身份指示信息,其中该隧道表中包括至少一个隧道条目,该至少一个隧道条目中的每个隧道条目用于指示:第一虚拟RNIC的标识、第二虚拟RNIC的所属的虚拟扩展局域网网络标识VNI,虚拟网络地址、第一RNIC的地址信息和第二RNIC的地址信息,其中该第一虚拟RNIC运行在该第一RNIC中,该第二虚拟RNIC运行在该第二RNIC中,该虚拟网络地址包括该第一虚拟RNIC的IP地址和该第二虚拟RNIC的IP地址中的至少一个,该至少一个隧道条目中与该源虚拟RNIC的标识和该目标虚拟网络地址匹配的隧道条目包括该报文转发信息。
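As described above, multiple vRNICs may run on the source RNIC, and the source vRNIC is only one of them. Different vRNICs may use the same QPn, but their identifiers differ. Therefore, the QPn and the identifier of the source vRNIC together determine a unique QPC, namely the reference QPC. Similarly, the source RNIC can determine a unique WQE, namely the reference WQE.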
可选的,在一些实施例中,在该传输模式为UD或RD的情况下,该源RNIC根据对应于该待传输数据的目标WQE和对应于该源虚拟RNIC的标识,确定目标虚拟网络地址,或者根据该目标WQE,确定该目标虚拟网络地址,其中该目标虚拟网络地址包括该源虚拟RNIC的IP地址和该目的虚拟RNIC的IP地址中的至少一个;该源RNIC从隧道表确定该报文转发信息和该目的虚拟RNIC身份指示信息,其中该隧道表中包括至少一个隧道条目,该至少一个隧道条目中的每个隧道条目用于指示:第一虚拟RNIC的标识、第二虚拟RNIC的所属的虚拟扩展局域网网络标识VNI,虚拟网络地址、第一RNIC的地址信息和第二RNIC的地址信息,其中该第一虚拟RNIC运行在该第一RNIC中,该第二虚拟RNIC运行在该第二RNIC中,该虚拟网络地址包括该第一虚拟RNIC的IP地址和该第二虚拟RNIC的IP地址中的至少一个,该至少一个隧道条目中与该源虚拟RNIC的标识和该目标虚拟网络地址匹配的隧道条目包括该报文转发信息。
可选的,在一些实施例中,隧道条目用于指示第一vRNIC的标识、第二vRNIC所属的VNI,虚拟网络地址、第一RNIC的地址信息和第二RNIC的地址信息可以是:隧道条目包括第一vRNIC的标识、第二vRNIC所属的VNI,虚拟网络地址、第一RNIC的地址 信息和第二RNIC的地址信息。
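Optionally, in other embodiments, a tunnel entry indicating the identifier of the first vRNIC, the VNI to which the second vRNIC belongs, the virtual network address, the address information of the first RNIC, and the address information of the second RNIC may mean that the tunnel entry includes location indication information (also called a pointer). The location indication information indicates the location and length, in a storage apparatus, of the identifier of the first vRNIC, the VNI to which the second vRNIC belongs, the virtual network address, the address information of the first RNIC, and the address information of the second RNIC. These items can then be read from the location that the location indication information points to.
For ease of description, the following uses the case in which a tunnel entry directly includes the identifier of the first vRNIC, the VNI to which the second vRNIC belongs, the virtual network address, the address information of the first RNIC, and the address information of the second RNIC as an example.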
可选的,在一些实施例中,WQE中可以携带数据。在此情况下,该目标WQE可以是携带有该待传输数据的WQE。
可选的,在另一些实施例中,WQE中可以携带位置指示信息。该位置指示信息用于指示数据在主机中的存储位置和长度。在此情况下,该目标WQE可以是携带有指示该待传输数据的存储位置的位置指示信息的WQE。
可见,在传输模式为RC、UC、UD或RD的情况下,该源RNIC都可以利用隧道表确定报文封装信息中的报文转发信息和目的vRNIC身份指示信息。
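In each tunnel entry of the tunnel table, the address information of the first RNIC may include the IP address and MAC address of the first RNIC. The address information of the second RNIC may include the IP address and MAC address of the second RNIC. The virtual network address may include at least one of the IP address of the first vRNIC and the IP address of the second vRNIC.
As described above, each tunnel entry includes information about a first vRNIC (the identifier and the IP address of the first vRNIC) and information about a second vRNIC (the VNI to which the second vRNIC belongs and the IP address of the second vRNIC). The terms "first" and "second" only distinguish the two vRNICs within one tunnel entry. They do not require that the first-vRNIC information in any two tunnel entries refer to the same vRNIC, or that the second-vRNIC information in any two tunnel entries refer to the same vRNIC. The first-vRNIC information in any two tunnel entries of the tunnel table may describe the same vRNIC or different vRNICs, and likewise for the second-vRNIC information.
Similarly, each tunnel entry includes information about a first RNIC (the address information of the first RNIC) and information about a second RNIC (the address information of the second RNIC), and here too "first" and "second" only distinguish the two RNICs within one tunnel entry. They do not require that the first-RNIC information in any two tunnel entries refer to the same RNIC, or that the second-RNIC information in any two tunnel entries refer to the same RNIC. The first-RNIC information in any two tunnel entries of the tunnel table may describe the same RNIC or different RNICs, and likewise for the second-RNIC information.
Among the tunnel entries in the tunnel table, the entry that matches the identifier of the source vRNIC and the target virtual network address may be called the target tunnel entry. For the target tunnel entry, the first vRNIC is the source vRNIC, the second vRNIC is the destination vRNIC, the first RNIC is the source RNIC, and the second RNIC is the destination RNIC. Therefore, in the target tunnel entry, the identifier of the first vRNIC is the identifier of the source vRNIC, the VNI to which the second vRNIC belongs is the VNI to which the destination vRNIC belongs, the virtual network address is the target virtual network address, the address information of the first RNIC includes the IP address and MAC address of the source RNIC, and the address information of the second RNIC includes the IP address and MAC address of the destination RNIC.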
可选的,在一些实施例中,该第一RNIC的地址信息还可以包括该第一RNIC的端口号,该第二RNIC的地址信息还可以包括该第二RNIC的端口号。
表1是一个隧道表的示意。
表1
Figure PCTCN2020102466-appb-000001
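The tunnel table shown in Table 1 includes five tunnel entries. Suppose the identifier of the source vRNIC is vNIC 1, the IP address of the source vRNIC is 10.1.1.1, and the IP address of the destination vRNIC is 10.1.1.11. In this case, the first of the five tunnel entries in Table 1 is the target tunnel entry that matches the identifier of the source vRNIC, the IP address of the source vRNIC, and the IP address of the destination vRNIC. From the target tunnel entry it can be determined that the IP address of the source RNIC is 192.100.1.1, the MAC address of the source RNIC is X:Y:Z:M:N:11, the IP address of the destination RNIC is 192.100.2.2, and the MAC address of the destination RNIC is M:N:X:Y:Z:22. The virtual extensible local area network (VXLAN) network identifier (VNI) to which the destination vRNIC belongs is 1001.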
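The lookup just described can be sketched as follows. Only the first entry uses the values quoted in the text; the second entry, the field names, and the linear search are hypothetical, since Table 1 itself is available only as an image and the document does not prescribe a data structure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TunnelEntry:
    vrnic_id: str        # identifier of the first (source) vRNIC
    src_vip: str         # IP address of the first vRNIC
    dst_vip: str         # IP address of the second (destination) vRNIC
    vni: int             # VNI to which the second vRNIC belongs
    src_rnic_ip: str
    src_rnic_mac: str
    dst_rnic_ip: str
    dst_rnic_mac: str


TUNNEL_TABLE = [
    # Values taken from the worked example in the text.
    TunnelEntry("vNIC 1", "10.1.1.1", "10.1.1.11", 1001,
                "192.100.1.1", "X:Y:Z:M:N:11", "192.100.2.2", "M:N:X:Y:Z:22"),
    # Hypothetical second entry, present only to make the search meaningful.
    TunnelEntry("vNIC 2", "10.1.2.1", "10.1.2.11", 1002,
                "192.100.1.1", "X:Y:Z:M:N:11", "192.100.3.3", "M:N:X:Y:Z:33"),
]


def lookup(table: List[TunnelEntry], vrnic_id: str,
           src_vip: str, dst_vip: str) -> Optional[TunnelEntry]:
    """Return the entry matching the source vRNIC id and virtual addresses."""
    for entry in table:
        if (entry.vrnic_id, entry.src_vip, entry.dst_vip) == (vrnic_id, src_vip, dst_vip):
            return entry
    return None


target = lookup(TUNNEL_TABLE, "vNIC 1", "10.1.1.1", "10.1.1.11")
no_match = lookup(TUNNEL_TABLE, "vNIC 1", "10.1.1.1", "10.9.9.9")
```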
可选的,在一些实施例中,该隧道表可以保存在该源RNIC的处理电路的缓存中。
可选的,在另一些实施例中,该隧道表可以保存在该源RNIC的内存中。
可选的,在另一些实施例中,该隧道表可以保存在安装有该源RNIC的主机的存储装置中。
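Optionally, in other embodiments, the tunnel table may be divided into three parts: the first part may be stored in the cache of the processor of the source RNIC, the second part in the memory of the source RNIC, and the third part in the storage apparatus of the host of the source RNIC. In this case, the source RNIC may first check whether the first part of the tunnel table includes the target tunnel entry; if it does not, the source RNIC may check the second part; and if the second part does not include the target tunnel entry either, the source RNIC may check the third part. Any two of the three parts may or may not overlap. For example, suppose the tunnel table includes 100 tunnel entries in total: the first part may include entries 1 to 10, the second part entries 11 to 40, and the third part entries 41 to 100. Alternatively, the first part may be a subset of the second part and/or the third part, and/or the second part may be a subset of the third part. For example, again supposing 100 tunnel entries in total, the first part may include entries 1 to 10, the second part entries 1 to 40, and the third part entries 1 to 100.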
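The three-level search order can be sketched as below, using the non-overlapping split from the example (entries 1 to 10, 11 to 40, and 41 to 100). The integer keys, the placeholder entry strings, and the tier names are hypothetical stand-ins for real tunnel entries and storage locations.

```python
def tiered_lookup(key, tiers):
    """Search the tiers in order and report which tier held the entry."""
    for name, table in tiers:
        entry = table.get(key)
        if entry is not None:
            return name, entry
    return None, None


# Non-overlapping split of 100 entries, as in the example above.
cache = {i: f"entry-{i}" for i in range(1, 11)}
nic_memory = {i: f"entry-{i}" for i in range(11, 41)}
host_storage = {i: f"entry-{i}" for i in range(41, 101)}

TIERS = [("cache", cache), ("nic_memory", nic_memory), ("host_storage", host_storage)]

hit_cache = tiered_lookup(5, TIERS)
hit_memory = tiered_lookup(25, TIERS)
hit_host = tiered_lookup(77, TIERS)
miss = tiered_lookup(101, TIERS)
```

The same function also works for the subset arrangement, since the first tier that contains the key wins.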
可选的,在另一些实施例中,该隧道表可以分为两部分,这两部分隧道表可以保存在该源RNIC的处理器的缓存、该源RNIC的内存和该源RNIC的主机的存储装置中的任意两个中。类似的,若该两部分隧道表分别保存在该源RNIC的处理器的缓存和该源RNIC的内存中,则该源RNIC可以先检查保存在该源RNIC的处理器的缓存中隧道表是否包括目标隧道条目;若保存在该源RNIC的处理器的缓存中的隧道表不包括该目标隧道条目,可以检查保存在该源RNIC的内存中的隧道表是否包括该目标隧道条目。若该两部分隧道表分别保存在该源RNIC的内存和该源RNIC的主机的存储装置中,则该源RNIC可以先检查保存在该源RNIC的内存中隧道表是否包括目标隧道条目;若保存在该源RNIC的内存中的隧道表不包括该目标隧道条目,可以检查保存在该源RNIC的主机的存储装置中的隧道表是否包括该目标隧道条目。类似的,该两部分隧道表可以有交集也可以没有交集。
可选的,在一些实施例中,该源RNIC可以确定对应于该源vRNIC的标识的目标配置条目,根据该目标配置条目和对应于该待传输数据的目标WQE,确定该目标虚拟网络地址。可选的,该源RNIC可以从该目标WQE中确定该目的vRNIC的IP地址,从该目标配置条目中确定该源vRNIC的IP地址。
可选的,在一些实施例中,该目标WQE中可以包括该目的vRNIC的IP地址。这样,该源RNIC可以直接从该目标WQE中获取该目的vRNIC的IP地址。在另一些实施例中,该目标WQE中可以包括目的vRNIC IP地址指示信息,该目的vRNIC IP地址指示信息用于指示保存该目的vRNIC的IP地址的位置。该源RNIC可以根据该目的vRNIC IP地址指示信息所指示的位置中获取该目的vRNIC的IP地址。
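The target configuration entry is an entry in a configuration table. The configuration table stores the correspondence between vRNIC identifiers and vRNIC IP addresses. It may include at least one configuration entry, each of which includes the identifier of a vRNIC and the IP address of that vRNIC. Using the identifier of the source vRNIC, the source RNIC can find the corresponding target configuration entry in the configuration table. The vRNIC identifier in the target configuration entry is the identifier of the source vRNIC, and the IP address in the target configuration entry is the IP address of the source vRNIC.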
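As a rough sketch of combining the configuration table with the target WQE, assuming the WQE carries the destination vRNIC's IP address directly. The dictionary representation and the field name `dst_vip` are hypothetical; the identifier and addresses reuse the values from the Table 1 example.

```python
# Configuration table: vRNIC identifier -> IP address of that vRNIC.
CONFIG_TABLE = {"vNIC 1": "10.1.1.1"}


def target_virtual_address(vrnic_id, wqe):
    """Build (source vRNIC IP, destination vRNIC IP) for the tunnel lookup."""
    src_vip = CONFIG_TABLE[vrnic_id]   # from the target configuration entry
    dst_vip = wqe["dst_vip"]           # from the target WQE (assumed field)
    return src_vip, dst_vip


wqe = {"dst_vip": "10.1.1.11", "payload": b"hello"}
address = target_virtual_address("vNIC 1", wqe)
```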
该配置表保存的位置可以与该隧道表保存的位置类似。换句话说,该配置表可以保存在该源RNIC的处理器的缓存、在该源RNIC的内存,和该源RNIC的主机的存储装置中的任意一个或者多个中。该配置表的具体保存方式以及该源RNIC查找该配置表的方式可以参考上述隧道表的保存方式以及该源RNIC查找隧道表的方式,为了简洁,在此就不再赘述。
如上所述,在该传输模式为UD或RD的情况下,该源RNIC可以根据该目标WQE,确定目标虚拟网络地址。
可选的,在一些实施例中,该目标WQE中可以包括该源vRNIC的IP地址和该目的vRNIC的IP地址。这样,该源RNIC可以直接从该目标WQE中获取该源vRNIC的IP地址和该目的vRNIC的IP地址。在另一些实施例中,该目标WQE中可以包括虚拟地址指示信息,该虚拟地址指示信息用于指示保存该源vRNIC的IP地址的位置和保存该目的vRNIC的IP地址的位置。该源RNIC可以根据该虚拟地址指示信息所指示的位置中获取该源vRNIC的IP地址和该目的vRNIC的IP地址。
为了便于描述,以下将利用QPC确定出报文转发信息的方式简称为QPC缓存模式,将利用隧道表确定出报文转发信息的方式可以称为查表模式。
可以看出,根据传输模式,该源vRNIC可以从不同的地方确定该报文转发信息。
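Optionally, in some embodiments, the source RNIC may be unable to determine the packet forwarding information and the destination vRNIC identity indication information using either the QPC cache mode or the table lookup mode. In other words, the source RNIC may find that there is no target queue pair context corresponding to the connection information and the identifier of the source vRNIC, or that there is no target tunnel entry matching the identifier of the source vRNIC and the target virtual network address. As described above, the packet forwarding information includes the IP address of the source RNIC, the MAC address of the source RNIC, the layer-4 port number, the IP address of the destination RNIC, and the MAC address of the destination RNIC. The IP address of the source RNIC, the MAC address of the source RNIC, and the layer-4 port number can all be stored in the storage apparatus of the source RNIC, so the source RNIC can obtain them directly. What is missing in this situation is therefore the IP address and MAC address of the destination RNIC, which are stored neither in the source RNIC nor in its host. In this case, the source RNIC may use a slow-path procedure to obtain the IP address and MAC address of the destination RNIC. For ease of description, the IP address and MAC address of the destination RNIC may be called the address information of the destination RNIC, and the IP address and MAC address of the source RNIC may be called the address information of the source RNIC.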
可选的,在一些实施例中,该源RNIC可以获取该源vRNIC的标识以及对应于该源vRNIC的标识的VNI。该源RNIC可以向至少一个目标NIC发送请求消息,该至少一个目标NIC中的每个目标NIC运行有属于该VNI的至少一个vRNIC,该请求消息包括该源vRNIC的标识和目标虚拟网络地址,其中该目标虚拟网络地址包括该源vRNIC的IP地址和该目的vRNIC的IP地址中的至少一个。该源RNIC接收该目的RNIC发送的反馈信息,该反馈信息中包括该目的RNIC的地址信息。
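Optionally, in some embodiments, the source RNIC or the storage apparatus of the host of the source RNIC may store the IP address and port number of the destination RNIC but not its MAC address. In this case, the source RNIC only needs to obtain the MAC address of the destination RNIC, and may do so using the Address Resolution Protocol (ARP). The source RNIC may obtain the VNI corresponding to the identifier of the source vRNIC, broadcast an ARP request to all vRNICs that belong to that VNI, and receive the ARP response sent by the destination vRNIC. The ARP response includes the MAC address of the destination vRNIC.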
该源RNIC在获取了该目的RNIC的地址信息后,可以根据该目的RNIC的地址信息和该源RNIC的地址信息,封装该待传输数据。此外,该源RNIC在获取了该目的RNIC的地址信息和源RNIC的地址信息后,在隧道表中添加对应的隧道条目。
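The source RNIC may also maintain the tunnel table. For example, the source RNIC may set a timeout and start a timer when a tunnel entry is written to the tunnel table. Each time the tunnel entry is hit, the timer is restarted. If the timer exceeds the timeout without the tunnel entry being hit, the tunnel entry is deleted.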
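The aging rule above can be sketched as follows. The class name, the explicit `now` parameter (used instead of a real clock so the behaviour is deterministic), and the keys are all hypothetical; the document does not specify how the timer is implemented.

```python
class AgingTunnelTable:
    """Tunnel table whose entries expire if not hit within `timeout`."""

    def __init__(self, timeout):
        self.timeout = timeout
        self._entries = {}  # key -> (entry, time of insertion or last hit)

    def add(self, key, entry, now):
        # Writing an entry starts its timer.
        self._entries[key] = (entry, now)

    def hit(self, key, now):
        # A hit returns the entry and restarts its timer.
        item = self._entries.get(key)
        if item is None:
            return None
        entry, _ = item
        self._entries[key] = (entry, now)
        return entry

    def expire(self, now):
        # Delete every entry whose timer has exceeded the timeout.
        stale = [k for k, (_, t) in self._entries.items() if now - t > self.timeout]
        for k in stale:
            del self._entries[k]
        return stale


table = AgingTunnelTable(timeout=10)
table.add("a", "entry-a", now=0)
table.add("b", "entry-b", now=0)
kept = table.hit("a", now=8)        # restarts the timer of "a"
removed = table.expire(now=12)      # "b" was idle for 12 > 10
still_there = table.hit("a", now=12)
gone = table.hit("b", now=12)
```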
该目的vRNIC身份指示信息用于指示该目的vRNIC的身份。
可选的,在一些实施例中,该目的vRNIC身份指示信息可以包括该目的vRNIC所属的VNI和该目的vRNIC的虚拟MAC地址。
可选的,在另一些实施例中,该目的vRNIC身份指示信息可以包括该目的vRNIC编号。vRNIC编号可以是虚拟功能标识(virtual function identification,VFID)。
该目的vRNIC所属的VNI可以从目标QPC中或者目标隧道条目中获取。换句话说,若报文转发信息是采用QPC缓存模式确定的,则该源RNIC还可以从该目标QPC中确定该目的vRNIC所属的VNI;若报文转发信息是采用查表模式确定的,则该源RNIC还可以从该目标隧道条目中确定该目的vRNIC所属的VNI。
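The MAC address of the destination vRNIC can be obtained from a table, stored by the source RNIC, of correspondences between vRNIC IP addresses and MAC addresses. The source RNIC may store such a correspondence table, which includes multiple entries, each containing one IP address and one MAC address. Using the IP address of the destination vRNIC obtained from the target QPC or the target tunnel entry, the source RNIC can query the correspondence table and determine that the MAC address in the matching entry (the entry whose IP address is that of the destination vRNIC) is the MAC address of the destination vRNIC.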
可选的,在一些实施例中,该对应关系表的每个表项中还可以包括vRNIC所属的VNI。换句话说,该源RNIC可以利用该对应关系表,根据该目的vRNIC的IP地址,确定该目的vRNIC的MAC地址和VNI。
可选的,在另一些实施例中,该目的vRNIC身份指示信息可以包括该目的vRNIC的标识。目的vRNIC的身份指示信息可以从源RNIC保存的vRNIC的IP地址和标识的对应关系表中获取。源RNIC可以保存一个IP地址和标识的对应关系表,该对应关系表中包括多个表项,每个表项包括一个IP地址和一个标识。该源RNIC可以根据从目标QPC或目标隧道条目中获取的该目的vRNIC的IP地址,查询该对应关系表,确定该对应关系表中的匹配表项(即IP地址为该目的vRNIC的表项)中的标识为该目的vRNIC的标识。可选的,在一些实施例中,该对应关系表的每个表项中还可以包括vRNIC所属的VNI。换句话说,该源RNIC可以利用该对应关系表,根据该目的vRNIC的IP地址,确定该目的vRNIC的标识和VNI。
303,源RNIC对该待传输数据进行封装,得到目标报文。
该目标报文包括该报文封装信息和该待传输数据。该报文封装信息包括报文转发信息,目的vRNIC的身份指示信息。该报文转发信息和该目的vRNIC的身份指示信息是步骤302中获取到的。换句话说,该源RNIC可以利用获取到的报文转发信息和目的vRNIC身份指示信息对该待传输数据进行封装,得到该目标报文。
可选的,在一些实施例中,该目标报文还可以包括源vRNIC的身份指示信息。
The packet encapsulation information may also include InfiniBand (IB) information.
图4是一个目标报文的示意图。如图4所示的目标报文包括:MAC头(也可以称为“外层MAC头”)、IP头(也可以称为“外层IP头”)、UDP头、VXLAN头、IB报文头和负载。
如图4所示,MAC头中包括源MAC地址和目的MAC地址,该源MAC地址是该源RNIC的MAC地址,该目的MAC地址是该目的RNIC的MAC地址。IP头中包括源IP地址和目的IP地址,该源IP地址是该源RNIC的IP地址,该目的IP地址是该目的RNIC的IP地址。图4中的四层端口号头为UDP头,该UDP头中包括UDP源端口号和目的端口号,且该目的端口号为VXLAN端口号。UDP源端口号可以是根据哈希算法计算得到的值。UDP源端口号的确定方式与现有的UDP源端口号的确定方式相同,在此就不必赘述。该VXLAN头中包括该目的vRNIC的身份指示信息。该IB头中包括该IB信息。该IB信息可以是IB基本传输头(Base Transport Header,BTH)。可以理解的是,该目标报文还可以包括校验位,例如帧校验序列(Frame Check Sequence,FCS)(图中未示出)。
可选的,在一些实施例中,该VXLAN头中还可以包括该源vRNIC的身份指示信息。
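Optionally, in some embodiments, the identity indication information of the source vRNIC may include the VNI to which the source vRNIC belongs and the virtual MAC address of the source vRNIC.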
可选的,在另一些实施例中,该源vRNIC身份指示信息可以包括该源vRNIC编号。
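Optionally, in some embodiments, the target packet may include at least one of the identity indication information of the source vRNIC and the identity indication information of the destination vRNIC. In this case, the VXLAN header may include at least one of the identity indication information of the source vRNIC and the identity indication information of the destination vRNIC.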
MAC头、IP头和UDP头中除了图4所示的内容外,还包括其他内容。本申请实施例对这些内容并未进行改进。因此,MAC头、IP头和UDP头的具体格式和内容可以参考现有协议,为了简洁,在此就不必赘述。
本申请技术方案并未对IB头中所传输的信息进行改进。因此IB头中所传输的具体信息可以参考现有RoCE协议所规定的IB头,为了简洁,在此就不必赘述。
此外,该目标报文中除了如图4的各个字段外还可以包括校验字段(图中未示出)。该校验字段的确定方式和具体内容与RoCE标准报文中的校验字段的确定方式和具体内容相同,为了简洁,在此就不再赘述。
该目标报文可以是基于RoCE标准报文的报文。
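For example, the target packet shown in FIG. 4 is based on the RoCE version 2 (RoCEv2) standard packet. As can be seen, the target packet in FIG. 4 has only one more header than a standard RoCEv2 packet: the network virtualization protocol header (the VXLAN header in FIG. 4). As another example, the target packet may be based on the RoCE version 1 (RoCEv1) standard packet, in which case the target packet may have one more network virtualization protocol header than a standard RoCEv1 packet.
Optionally, in some embodiments, the target packet does not include at least one of the following: the IP address of the source vRNIC, the IP address of the destination vRNIC, the MAC address of the source vRNIC, the port number of the source vRNIC, and the port number of the destination vRNIC.
Optionally, in some embodiments, the target packet includes none of the following: the IP address of the source vRNIC, the IP address of the destination vRNIC, the MAC address of the source vRNIC, the port number of the source vRNIC, and the port number of the destination vRNIC.
If the target packet included the above information, the source RNIC would also have to encapsulate it into the target packet as an inner header, which adds one more encapsulation pass. The information would also consume capacity in the target packet: if the target packet had to carry one or more of the above items, the field available for the to-be-transmitted data would shrink, that is, the payload capacity of the target packet would be reduced. Data of a given size might then need two packets to be transmitted, which increases the number of packets in the network.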
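To give a feel for the size of this effect, the following back-of-the-envelope calculation compares the payload room with and without an inner header. It is a sketch under stated assumptions, not a figure from the document: it assumes a 1500-byte IP MTU, option-free IPv4, an untagged inner Ethernet header, a standard 8-byte VXLAN header, a 12-byte IB base transport header, and a 4-byte invariant CRC. The network virtualization header actually proposed here may have a different size.

```python
# Assumed standard header sizes in bytes (not taken from the document).
IPV4, UDP, ETH, VXLAN, BTH, ICRC = 20, 8, 14, 8, 12, 4
MTU = 1500  # assumed IP MTU


def payload_room(inner_headers):
    """Bytes left for data after outer IP/UDP/VXLAN, inner headers, BTH, ICRC."""
    overhead = IPV4 + UDP + VXLAN + sum(inner_headers) + BTH + ICRC
    return MTU - overhead


single = payload_room([])                # one encapsulation, no inner header
double = payload_room([ETH, IPV4, UDP])  # inner MAC/IP/UDP headers also carried
saved = single - double
```

Under these assumptions, omitting the inner header leaves 42 more bytes per packet for data.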
该源RNIC可以从QPC或者WQE中获取该IB信息。具体地,在传输模式为RC/UC的情况下,该源RNIC可以从目标QPC中获取该IB信息。在传输模式为UD/RD的情况下,该源RNIC可以从参考WQE中获取该IB信息。该源RNIC获取该IB信息的具体实现方式与现有的获取IB信息的具体实现方式相同,为了简洁,在此就不必赘述。
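304. The source RNIC sends the target packet to the destination RNIC, where the destination vRNIC is a vRNIC running on the destination RNIC. Correspondingly, the destination RNIC receives the target packet.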
305,该目的RNIC根据该目标报文中的目的vRNIC身份指示信息,确定该目的vRNIC。
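The destination vRNIC identity indication information in the target packet may be carried in the network virtualization protocol header.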
可选的,在一些实施例中,该目的vRNIC的身份指示信息可以包括该目的vRNIC所属的VNI和该目的vRNIC的虚拟MAC地址。该网络虚拟化协议头中可以携带该目的vRNIC所属的VNI和该目的vRNIC的虚拟MAC地址。
可选的,在另一些实施例中,该目的vRNIC的身份指示信息可以是该目的vRNIC编号。该网络虚拟化协议头中可以携带该目的vRNIC编号。
可选的,在一些实施例中,该网络虚拟化协议头可以是VXLAN头(例如图4)、VXLAN-通用协议扩展(Generic Protocol Extension,GPE)头、网络服务头(Network Service Header)和通用网络虚拟化封装(Generic Network Virtualization Encapsulation,Geneve)头中的一个。
图5是以VXLAN-GPE头为例介绍如何利用VXLAN-GPE头携带该目的vRNIC所属的VNI和该目的vRNIC的虚拟MAC地址。
如图5所示的VXLAN-GPE头包括标志(Flags)字段、保留字段、VNI字段、下一协议(Next Protocol,NP)字段和保留字段。
如图5所示的标志字段为RRLLIRRR,其中R表示该位为保留位,L表示该位是用于指示该VXLAN-GPE格式的指示位,I表示已经被VXLAN-GPE占用的位。可以理解的是,如图5所示的标志字段占用了高2、3位作为该VXLAN-GPE格式的指示位。在另一些实施例中,也可以使用该标志字段的其他保留位作为该VXLAN-GPE格式的指示位。可选的,在一些实施例中,如果LL位的值为01,则表示该VXLAN-GPE头中包括该目的vRNIC的身份指示信息。该VXLAN-GPE头中的VNI字段可以携带该目的vRNIC所属的VNI。该VXLAN-GPE头的第二个保留字段可以携带该目的vRNIC的MAC地址。
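Suppose the value of the second and third high-order bits of the flags field in the VXLAN-GPE header (the LL bits in FIG. 5) is 01. After receiving the target packet, the destination RNIC can determine from these bits that the VXLAN-GPE header carries the identity indication information of the destination vRNIC, and can obtain the VNI to which the destination vRNIC belongs and the MAC address of the destination vRNIC from the VNI field and the second reserved field of the VXLAN-GPE header, respectively.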
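The flag check can be sketched with simple bit masks, assuming the flags field is a single byte laid out as RRLLIRRR from the most significant bit down, as described for FIG. 5. The constant and function names are hypothetical, and the sketch deliberately does not parse the rest of the header, whose exact field widths are defined by FIG. 5 rather than by the text.

```python
LL_MASK = 0b0011_0000   # the two "L" bits of RRLLIRRR
LL_SHIFT = 4
I_FLAG = 0b0000_1000    # the "I" bit of RRLLIRRR


def ll_bits(flags: int) -> int:
    """Extract the 2-bit LL value from the flags byte."""
    return (flags & LL_MASK) >> LL_SHIFT


def carries_vrnic_identity(flags: int) -> bool:
    """True when LL == 01, meaning the header carries vRNIC identity info."""
    return ll_bits(flags) == 0b01


with_identity = 0b0001_1000   # LL = 01, I = 1
plain_gpe = 0b0000_1000       # LL = 00, I = 1
flag_a = carries_vrnic_identity(with_identity)
flag_b = carries_vrnic_identity(plain_gpe)
```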
可选的,该目的RNIC确定该目的vRNIC可以是确定该目的vRNIC的标识。
该目的RNIC可以根据获取到的VNI和MAC地址,确定该目的vRNIC的标识。vRNIC的标识是该vRNIC所在的RNIC分配的。该vRNIC迁移到其他RNIC后,该vRNIC的标识会发生改变,该vRNIC所属的VNI和该vRNIC的MAC地址不会发生变化。因此,利用该目的vRNIC所属的VNI和该目的vRNIC的MAC地址作为该目的vRNIC的身份指示信息可以避免因该目的vRNIC发生迁移造成的该目的vRNIC的标识发生变化导致的无法准确找到目的vRNIC的情况发生。
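The destination RNIC may determine the identifier of the destination vRNIC by looking up a virtual device mapping table. The virtual device mapping table includes at least one virtual device entry, and each entry includes a VNI, a MAC address, and an identifier. The destination RNIC can find in the virtual device mapping table the target virtual device entry that matches the obtained VNI and MAC address: the VNI in the target virtual device entry is the VNI, obtained by the destination RNIC, to which the destination vRNIC belongs, and the MAC address in the target virtual device entry is the MAC address of the destination vRNIC obtained by the destination RNIC. Accordingly, the identifier in the target virtual device entry is the identifier of the destination vRNIC.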
该目的RNIC可以维护该虚拟设备映射表。具体地,在该目的RNIC中创建了一个vRNIC的情况下,该目的RNIC可以在该虚拟设备映射表中创建与该vRNIC对应的虚拟设备表项,该虚拟设备表项中的VNI为该vRNIC所属的VNI,该虚拟设备表项中的MAC地址为该vRNIC的MAC地址,该虚拟设备表项中的标识为该vRNIC的标识。该目的RNIC还可以在该vRNIC销毁(例如迁移到其他RNIC或者从该目的RNIC删除)后,将与该vRNIC对应的虚拟设备表项删除。
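The lookup and maintenance of the virtual device mapping table can be sketched as a dictionary keyed by (VNI, MAC address). The class and method names and all values are hypothetical; the MAC normalization to lower case is an added convenience, not something the document requires.

```python
class VirtualDeviceMap:
    """Maps (VNI, vRNIC MAC address) to the locally assigned vRNIC identifier."""

    def __init__(self):
        self._by_vni_mac = {}

    def on_vrnic_created(self, vni, mac, vrnic_id):
        # Create the entry when a vRNIC is created on this RNIC.
        self._by_vni_mac[(vni, mac.lower())] = vrnic_id

    def on_vrnic_destroyed(self, vni, mac):
        # Remove the entry when the vRNIC is deleted or migrates away.
        self._by_vni_mac.pop((vni, mac.lower()), None)

    def resolve(self, vni, mac):
        # Find the destination vRNIC identifier for an incoming packet.
        return self._by_vni_mac.get((vni, mac.lower()))


devices = VirtualDeviceMap()
devices.on_vrnic_created(1001, "00:11:22:33:44:55", "vRNIC 241")
found = devices.resolve(1001, "00:11:22:33:44:55")
wrong_vni = devices.resolve(1002, "00:11:22:33:44:55")
devices.on_vrnic_destroyed(1001, "00:11:22:33:44:55")
after_destroy = devices.resolve(1001, "00:11:22:33:44:55")
```

Because the key is (VNI, MAC) rather than the locally assigned identifier, a sender does not need to learn a new identifier when the vRNIC migrates; only the destination RNIC's own table changes.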
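Optionally, in other embodiments, the identity indication information of the destination vRNIC may be the identifier of the destination vRNIC. Similarly, the identifier of the destination vRNIC may be carried in the network virtualization protocol header. Take the VXLAN-GPE header shown in FIG. 5 as an example again. The second reserved field of the VXLAN-GPE header may carry the identifier of the destination vRNIC, and the second and third high-order bits of the flags field may likewise indicate that the VXLAN-GPE header carries the identity indication information of the destination vRNIC. In this way, the destination RNIC can determine the identifier of the destination vRNIC directly from the VXLAN-GPE header.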
306,该目的RNIC将该目标报文发送至该目的vRNIC。该目的vRNIC对接收到的该目标报文进行处理。
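Specifically, the destination vRNIC may strip the packet forwarding information, the identity indication information of the destination vRNIC, and the IB information from the target packet, and process the data in the payload of the target packet (that is, the to-be-transmitted data sent by the source vRNIC). The way the destination vRNIC processes the data is the same as the way existing vRNICs process data transmitted using RDMA and, for brevity, is not described here.
With the method shown in FIG. 3, the to-be-transmitted data needs to be encapsulated only once, so the tunnel encapsulation overhead is small. The encapsulated packet does not need to include the IP address, MAC address, and port number information of the source vRNIC, or the IP address and port number information of the destination vRNIC. The destination RNIC and the destination vRNIC do not need this information when processing the target packet, so the target packet does not have to carry it as redundant information. More room is therefore left in the payload of the target packet for the to-be-transmitted data, which can improve effective throughput in the network. In addition, the encapsulation of the to-be-transmitted data can be completed by the RNIC, so no additional encapsulation hardware is needed, which can reduce the cost of applying RDMA in Ethernet.
It can be understood that a VXLAN tunnel end point (VTEP) may be a physical device or a virtual device. The method shown in FIG. 3 uses the RNIC as the VTEP. In other words, the source RNIC in the method of FIG. 3 may also be called the source VTEP, and the destination RNIC may also be called the destination VTEP. Accordingly, the IP address and MAC address of the source RNIC may also be called the IP address and MAC address of the source VTEP, and the IP address and MAC address of the destination RNIC may also be called the IP address and MAC address of the destination VTEP. As described above, an RNIC is a physical NIC, so the embodiment shown in FIG. 3 is described with a physical NIC as the VTEP. In other embodiments, the VTEP may be implemented by a virtual device. In other words, the RNIC in the method of FIG. 3 may also be understood as a kind of vRNIC. The procedure for implementing the communication method of this application with a virtual device is the same as with a physical device and, for brevity, is not described here.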
为了便于本领域技术人员更好地理解本申请的技术方案,下面将结合图2和图3,对本申请的技术方案进行进一步描述。
假设通信双方分别为VM 211和VM 231。待传输数据是保存在VM 211的存储装置中,该待传输数据需要发送至VM 231的存储装置中。假设该待传输数据为“你好”。在此情况下,源vRNIC为VM 211对应的vRNIC,即vRNIC 221,目的vRNIC为VM 231对应的vRNIC,即vRNIC 241,源VTEP为RNIC 220,目的VTEP为RNIC 240。
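Suppose the IP address of vRNIC 221 is 192.168.0.1, the IP address of vRNIC 241 is 192.168.0.2, the MAC address of the vRNIC is 1:2:3:4:5:6, the IP address of RNIC 220 is 10.0.0.1, the MAC address of RNIC 220 is A:B:C:D:E:F, the IP address of RNIC 240 is 10.0.0.2, the MAC address of RNIC 240 is X:Y:Z:M:N:O, and the VNI is xxx.
In RC/UC mode, the source VTEP may obtain the target QPC from the QPC cache. The target QPC stores the IP address and MAC address of the destination VTEP and the MAC address and VNI of the destination vRNIC (or pointers to this information).
In RC/UC or UD/RD mode, the source VTEP may obtain the target tunnel entry from the tunnel table. The target tunnel entry stores the IP address and MAC address of the destination VTEP and the MAC address and VNI of the destination vRNIC (or pointers to this information).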
源VTEP可以根据获取到的目的VTEP的IP地址和MAC地址,目的vRNIC的MAC地址和VNI以及源VTEP的IP地址和MAC地址,对待传输数据进行封装。图6是一个封装好的目标报文。图6所示的目标报文是根据上述内容进行封装的。可以理解的是,图6中仅示出了该目标报文中的一些关键信息。
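After receiving the target packet, the destination VTEP can determine from the port number in the UDP header that the packet is a virtual packet encapsulated with VXLAN. According to the definition of the VXLAN header, the destination VTEP finds that the second and third high-order bits of the flags field are set, and can therefore determine that the packet uses the R_VXLAN protocol defined in the embodiments of this application. It can then determine that the second reserved field carries the MAC address of a vRNIC. Based on the VNI and vRNIC MAC address obtained from the VXLAN header, the destination VTEP can determine that the destination vRNIC is vRNIC 241, and sends the target packet to vRNIC 241. vRNIC 241 strips the outer headers and passes the IB header to the RDMA engine of VM 231 for processing.
FIG. 7 is a schematic structural block diagram of a network interface card according to an embodiment of this application. The network interface card 700 shown in FIG. 7 includes a processing unit 701 and a sending unit 702. The network interface card 700 supports RDMA.
The processing unit 701 is configured to obtain to-be-transmitted data sent by a source vRNIC, where the source vRNIC is a vRNIC running on the network interface card 700.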
处理单元701,还用于获取报文转发信息和目的vRNIC身份指示信息,该报文转发信息包括网卡700的IP地址,网卡700的MAC地址,目的RNIC的IP地址,目的RNIC的MAC地址和四层端口号。
处理单元701,还用于对该待传输数据进行封装,得到目标报文,该目标报文包括该报文转发信息、该目的虚拟RNIC身份指示信息和该待传输数据,该目标报文不包括以下信息中的至少一个:该源vRNIC的IP地址、该目的vRNIC的IP地址、该源vRNIC的MAC地址、该源vRNIC的端口号和该目的虚拟RNIC的端口号。
发送单元702,用于将该目标报文发送至该目的RNIC,其中,该目的vRNIC是运行在该目的RNIC上的一个vRNIC。
网卡700可以是上述实施例中的源RNIC。处理单元701可以由处理器实现,发送单元702可以由发送器实现。处理单元701和发送单元702的具体功能和有益效果,可以参见上述实施例的描述。
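FIG. 8 is a schematic structural block diagram of a network interface card according to an embodiment of this application. The network interface card 800 shown in FIG. 8 includes a receiving unit 801 and a processing unit 802. The network interface card 800 supports RDMA.
The receiving unit 801 is configured to receive a packet sent by a source RNIC. The packet includes packet forwarding information, destination virtual RNIC identity indication information, and data. The packet forwarding information includes the Internet Protocol (IP) address of the source RNIC, the media access control (MAC) address of the source RNIC, the port number of the source RNIC, the IP address of the network interface card, the MAC address of the network interface card, and the layer-4 port number. The packet does not include at least one of the following: the IP address of the source virtual RNIC, the IP address of the destination virtual RNIC, the MAC address of the source virtual RNIC, the port number of the source virtual RNIC, and the port number of the destination virtual RNIC. The source virtual RNIC is a virtual RNIC running on the source RNIC, and the destination virtual RNIC is a virtual RNIC running on the network interface card.
The processing unit 802 is configured to determine the destination virtual RNIC based on the destination virtual RNIC identity indication information.
The processing unit 802 is further configured to send the packet to the destination vRNIC.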
网卡800可以是上述实施例中的目的RNIC。接收单元801可以由接收器实现,处理单元802可以由处理器实现。接收单元801和处理单元802的具体功能和有益效果,可以参见上述实施例的描述。
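FIG. 9 is a structural block diagram of a network interface card according to an embodiment of this application. The network interface card 900 shown in FIG. 9 includes a processor 901, a memory 902, and a transceiver 903. The network interface card 900 supports RDMA.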
处理器901、存储器902和收发器903之间通过内部连接通路互相通信,传递控制和/或数据信号。
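The methods disclosed in the foregoing embodiments of this application may be applied to, or implemented by, the processor 901. The processor 901 may be an integrated circuit chip with signal processing capability. During implementation, the steps of the foregoing methods may be completed by integrated logic circuits in the hardware of the processor 901 or by instructions in the form of software. The processor 901 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. A software module may be located in a storage medium that is mature in the art, such as a random access memory (RAM), a flash memory, a read-only memory (ROM), a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 902; the processor 901 reads the instructions in the memory 902 and completes the steps of the foregoing methods in combination with its hardware. The memory 902 may exist independently of the processor 901, in which case it may be connected to the processor 901 through a connection path. In another possible design, the memory 902 may be integrated with the processor 901. The embodiments of this application do not limit this.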
可选的,在一些实施例中,存储器902可以存储用于执行如图3所示方法中源RNIC执行的方法的指令。处理器901可以执行存储器902中存储的指令结合其他硬件(例如收发器903)完成如图3所示方法中源RNIC的步骤,具体工作过程和有益效果可以参见图3所示实施例中的描述。
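Optionally, in some embodiments, the memory 902 may store instructions for performing the method performed by the destination RNIC in the method shown in FIG. 3. The processor 901 may execute the instructions stored in the memory 902 and, together with other hardware (for example, the transceiver 903), complete the steps of the destination RNIC in the method shown in FIG. 3. For the specific working process and beneficial effects, refer to the description of the embodiment shown in FIG. 3.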
本申请实施例还提供一种芯片,该芯片包括收发单元和处理单元。其中,收发单元可以是输入输出电路、通信接口;处理单元为该芯片上集成的处理器或者微处理器或者集成电路。该芯片可以执行上述方法实施例中源RNIC的方法。
本申请实施例还提供一种芯片,该芯片包括收发单元和处理单元。其中,收发单元可以是输入输出电路、通信接口;处理单元为该芯片上集成的处理器或者微处理器或者集成电路。该芯片可以执行上述实施例中目的RNIC执行的方法。
本申请实施例还提供一种计算机可读存储介质,其上存储有指令,该指令被执行时执行上述方法实施例中源RNIC的方法。
作为本实施例的另一种形式,提供一种计算机可读存储介质,其上存储有指令,该指令被执行时执行上述方法实施例中目的RNIC的方法。
本申请实施例还提供一种包含指令的计算机程序产品,该指令被执行时执行上述方法实施例中源RNIC的方法。
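As another form of this embodiment, a computer program product containing instructions is provided. When executed, the instructions perform the method of the destination RNIC in the foregoing method embodiments.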
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本 申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (26)

  1. 一种通信方法,其特征在于,所述方法包括:
    源支持远程直接内存存取的网卡RNIC获取源虚拟RNIC发送的待传输数据,其中所述源虚拟RNIC是运行在所述源RNIC上的一个虚拟RNIC;
    所述源RNIC获取报文转发信息和目的虚拟RNIC身份指示信息,所述报文转发信息包括所述源RNIC的互联网协议IP地址、所述源RNIC的媒体访问控制MAC地址、目的RNIC的IP地址、所述目的RNIC的MAC地址和四层端口号;
    所述源RNIC对所述待传输数据进行封装,以得到目标报文,所述目标报文包括所述报文转发信息、所述目的虚拟RNIC身份指示信息和所述待传输数据,所述目标报文不包括以下信息中的至少一个:所述源虚拟RNIC的IP地址、所述目的虚拟RNIC的IP地址、所述源虚拟RNIC的MAC地址、所述源虚拟RNIC的端口号和所述目的虚拟RNIC的端口号;
    所述源RNIC向所述目的RNIC发送所述目标报文,其中,所述目的虚拟RNIC是运行在所述目的RNIC上的一个虚拟RNIC。
  2. 如权利要求1所述的方法,其特征在于,所述源RNIC获取报文转发信息和目的虚拟RNIC身份指示信息,包括:
    所述源RNIC根据源虚拟RNIC的标识和用于传输所述待传输数据的传输模式,获取所述报文转发信息和所述目的虚拟RNIC身份指示信息。
  3. 如权利要求2所述的方法,其特征在于,所述源RNIC根据所述源虚拟RNIC的标识和传输模式,获取所述报文转发信息和所述目的虚拟RNIC身份指示信息,包括:
    在所述传输模式为可靠连接RC或不可靠连接UC的情况下,所述源RNIC根据目标队列对上下文,确定所述报文转发信息和所述目的虚拟RNIC身份指示信息,其中所述目标队列对上下文对应于连接信息和所述源虚拟RNIC的标识。
  4. 如权利要求2所述的方法,其特征在于,所述源RNIC根据所述源虚拟RNIC的标识和传输模式,获取所述报文转发信息和所述目的虚拟RNIC身份指示信息,包括:
    在所述传输模式为可靠连接RC或不可靠连接UC的情况下,所述源RNIC从参考队列对上下文或参考WQE中确定目标虚拟网络地址,其中所述目标虚拟网络地址包括所述源虚拟RNIC的IP地址和所述目的虚拟RNIC的IP地址中的至少一个,所述参考队列对上下文与连接信息和所述源虚拟RNIC的标识对应,所述参考WQE与所述连接信息和所述源虚拟RNIC的标识对应;
    所述源RNIC从隧道表确定所述报文转发信息和所述目的虚拟RNIC身份指示信息,其中所述隧道表中包括至少一个隧道条目,所述至少一个隧道条目中的每个隧道条目用于指示:第一虚拟RNIC的标识、第二虚拟RNIC的所属的虚拟扩展局域网网络标识VNI,虚拟网络地址、第一RNIC的地址信息和第二RNIC的地址信息,其中所述第一虚拟RNIC运行在所述第一RNIC中,所述第二虚拟RNIC运行在所述第二RNIC中,所述虚拟网络地址包括所述第一虚拟RNIC的IP地址和所述第二虚拟RNIC的IP地址中的至少一个,所述至少一个隧道条目中与所述源虚拟RNIC的标识和所述目标虚拟网络地址匹配的隧道 条目包括所述报文转发信息。
  5. 如权利要求2所述的方法,其特征在于,所述源RNIC根据所述源虚拟RNIC的标识和传输模式,获取所述报文转发信息和所述目的虚拟RNIC身份指示信息,包括:
    在所述传输模式为不可靠数据包UD或可靠数据包RD的情况下,所述源RNIC根据对应于所述待传输数据的目标WQE和对应于所述源虚拟RNIC的标识,确定目标虚拟网络地址,或者根据所述目标WQE,确定所述目标虚拟网络地址,其中所述目标虚拟网络地址包括所述源虚拟RNIC的IP地址和所述目的虚拟RNIC的IP地址中的至少一个;
    所述源RNIC从隧道表确定所述报文转发信息和所述目的虚拟RNIC身份指示信息,其中所述隧道表中包括至少一个隧道条目,所述至少一个隧道条目中的每个隧道条目用于指示:第一虚拟RNIC的标识、第二虚拟RNIC的所属的虚拟扩展局域网网络标识VNI,虚拟网络地址、第一RNIC的地址信息和第二RNIC的地址信息,其中所述第一虚拟RNIC运行在所述第一RNIC中,所述第二虚拟RNIC运行在所述第二RNIC中,所述虚拟网络地址包括所述第一虚拟RNIC的IP地址和所述第二虚拟RNIC的IP地址中的至少一个,所述至少一个隧道条目中与所述源虚拟RNIC的标识和所述目标虚拟网络地址匹配的隧道条目包括所述报文转发信息。
  6. 如权利要求1所述的方法,其特征在于,所述源RNIC获取报文转发信息和目的虚拟RNIC身份指示信息,包括:
    所述源RNIC向至少一个目标NIC发送请求消息,所述至少一个目标NIC中的每个目标NIC运行有至少一个虚拟RNIC,所述至少一个虚拟RNIC与所述源虚拟RNIC属于同一VNI,所述请求消息包括所述源虚拟RNIC的标识和目标虚拟网络地址,其中所述目标虚拟网络地址包括所述源虚拟RNIC的IP地址和所述目的虚拟RNIC的IP地址中的至少一个;
    所述源RNIC接收所述目的RNIC发送的反馈信息,所述反馈信息中包括所述目的RNIC的IP地址和所述目的RNIC的MAC地址;
    所述源RNIC根据所述反馈信息确定所述报文转发信息和所述目的虚拟RNIC身份指示信息。
  7. 如权利要求1至6中任一项所述的方法,其特征在于,所述目标报文包括:MAC头、IP头、四层端口号头、网络虚拟化协议头和负载字段,其中,
    所述MAC头中包括所述源RNIC的MAC地址和所述目的RNIC的MAC地址;
    所述IP头中包括所述源RNIC的IP地址和所述目的RNIC的IP地址;
    所述四层端口号头中包括四层端口号;
    所述网络虚拟化头包括所述目的虚拟RNIC的身份指示信息;
    所述负载字段包括所述待传输数据。
  8. 如权利要求1至7中任一项所述的方法,其特征在于,所述目的虚拟RNIC的身份指示信息包括所述目的虚拟RNIC所属的VNI和所述目的虚拟RNIC的虚拟MAC地址。
  9. 一种通信方法,其特征在于,所述方法包括:
    目的RNIC接收源RNIC发送的报文,所述报文包括报文转发信息、目的虚拟RNIC身份指示信息和数据,所述报文转发信息包括所述源RNIC的互联网协议IP地址、所述源RNIC的媒体访问控制MAC地址、目的RNIC的IP地址、所述目的RNIC的MAC地 址和四层端口号,所述目标报文不包括以下信息中的至少一个:源虚拟RNIC的IP地址、所述目的虚拟RNIC的IP地址、源虚拟RNIC的MAC地址、所述源虚拟RNIC的端口号和所述目的虚拟RNIC的端口号,所述源虚拟RNIC是运行在所述源RNIC中的一个虚拟RNIC,所述目的虚拟RNIC是运行在所述目的RNIC中的一个虚拟RNIC;
    所述目的RNIC根据所述目的虚拟RNIC身份指示信息,确定所述目的虚拟RNIC;
    所述目的RNIC将所述报文发送至所述目的vRNIC。
  10. 如权利要求9所述的方法,其特征在于,所述报文包括:MAC头、IP头、四层端口号头、网络虚拟化协议头和负载字段,其中,
    所述MAC头中包括所述源RNIC的MAC地址和所述目的RNIC的MAC地址;
    所述IP头中包括所述源RNIC的IP地址和所述目的RNIC的IP地址;
    所述四层端口号头中包括四层端口号;
    所述网络虚拟化头包括所述目的虚拟RNIC的身份指示信息;
    所述负载字段包括所述数据。
  11. 如权利要求9或10所述的方法,其特征在于,所述目的虚拟RNIC身份指示信息包括:所述目的虚拟RNIC所属的VNI和所述目的虚拟RNIC的虚拟MAC地址;
    所述目的RNIC根据所述目的虚拟RNIC身份指示信息,确定所述目的虚拟RNIC,包括:
    所述目的RNIC从虚拟设备映射表中确定所述目的虚拟RNIC,其中所述虚拟设备映射表中包括至少一个虚拟设备表项,所述至少一个虚拟设备表项中的每个表项包括VNI、MAC地址和标识,其中所述至少一个虚拟设备表项中与所述目的虚拟RNIC所属的VNI和所述目的虚拟RNIC的虚拟MAC地址匹配的虚拟设备表项中的标识为所述目的虚拟RNIC的标识。
  12. 一种网卡,其特征在于,所述网卡支持远程直接内存存取技术,所述网卡包括:
    处理单元,用于获取源虚拟RNIC发送的待传输数据,其中所述源虚拟RNIC是运行在所述网卡上的一个虚拟RNIC;
    所述处理单元,还用于获取报文转发信息和目的虚拟RNIC身份指示信息,所述报文转发信息包括所述网卡的互联网协议IP地址、所述网卡的媒体访问控制MAC地址、目的RNIC的IP地址、所述目的RNIC的MAC地址和四层端口号;
    所述处理单元,还用于对所述待传输数据进行封装,以得到目标报文,所述目标报文包括所述报文转发信息、所述目的虚拟RNIC身份指示信息和所述待传输数据,所述目标报文不包括以下信息中的至少一个:所述源虚拟RNIC的IP地址、所述目的虚拟RNIC的IP地址、所述源虚拟RNIC的MAC地址、所述源虚拟RNIC的端口号和所述目的虚拟RNIC的端口号;
    发送单元,用于向所述目的RNIC发送所述目标报文,其中,所述目的虚拟RNIC是运行在所述目的RNIC上的一个虚拟RNIC。
  13. 如权利要求12所述的网卡,其特征在于,所述处理单元,具体用于根据源虚拟RNIC的标识和用于传输所述待传输数据的传输模式,获取所述报文转发信息和所述目的虚拟RNIC身份指示信息。
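  14. The network interface card according to claim 13, wherein the processing unit is specifically configured to: when the transmission mode is reliable connection (RC) or unreliable connection (UC), determine the packet forwarding information and the destination virtual RNIC identity indication information based on a target queue pair context, wherein the target queue pair context corresponds to connection information and the identifier of the source virtual RNIC.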
  15. 如权利要求13所述的网卡,其特征在于,所述处理单元,具体用于在所述传输模式为可靠连接RC或不可靠连接UC的情况下,从参考队列对上下文或参考WQE中确定目标虚拟网络地址,其中所述目标虚拟网络地址包括所述源虚拟RNIC的IP地址和所述目的虚拟RNIC的IP地址中的至少一个,所述参考队列对上下文与连接信息和所述源虚拟RNIC的标识对应,所述参考WQE与所述连接信息和所述源虚拟RNIC的标识对应;
    从隧道表确定所述报文转发信息和所述目的虚拟RNIC身份指示信息,其中所述隧道表中包括至少一个隧道条目,所述至少一个隧道条目中的每个隧道条目用于指示:第一虚拟RNIC的标识、第二虚拟RNIC的所属的虚拟扩展局域网网络标识VNI,虚拟网络地址、第一RNIC的地址信息和第二RNIC的地址信息,其中所述第一虚拟RNIC运行在所述第一RNIC中,所述第二虚拟RNIC运行在所述第二RNIC中,所述虚拟网络地址包括所述第一虚拟RNIC的IP地址和所述第二虚拟RNIC的IP地址中的至少一个,所述至少一个隧道条目中与所述源虚拟RNIC的标识和所述目标虚拟网络地址匹配的隧道条目包括所述报文转发信息。
  16. 如权利要求13所述的网卡,其特征在于,所述处理单元,具体用于在所述传输模式为不可靠数据包UD或可靠数据包RD的情况下,根据对应于所述待传输数据的目标WQE和对应于所述源虚拟RNIC的标识,确定目标虚拟网络地址,或者根据所述目标WQE,确定所述目标虚拟网络地址,其中所述目标虚拟网络地址包括所述源虚拟RNIC的IP地址和所述目的虚拟RNIC的IP地址中的至少一个;
    从隧道表确定所述报文转发信息和所述目的虚拟RNIC身份指示信息,其中所述隧道表中包括至少一个隧道条目,所述至少一个隧道条目中的每个隧道条目用于指示:第一虚拟RNIC的标识、第二虚拟RNIC的所属的虚拟扩展局域网网络标识VNI,虚拟网络地址、第一RNIC的地址信息和第二RNIC的地址信息,其中所述第一虚拟RNIC运行在所述第一RNIC中,所述第二虚拟RNIC运行在所述第二RNIC中,所述虚拟网络地址包括所述第一虚拟RNIC的IP地址和所述第二虚拟RNIC的IP地址中的至少一个,所述至少一个隧道条目中与所述源虚拟RNIC的标识和所述目标虚拟网络地址匹配的隧道条目包括所述报文转发信息。
  17. 如权利要求12所述的网卡,其特征在于,所述发送单元,还用于向至少一个目标NIC发送请求消息,所述至少一个目标NIC中的每个目标NIC运行有至少一个虚拟RNIC,所述至少一个虚拟RNIC与所述源虚拟RNIC属于同一VNI,所述请求消息包括所述源虚拟RNIC的标识和目标虚拟网络地址,其中所述目标虚拟网络地址包括所述源虚拟RNIC的IP地址和所述目的虚拟RNIC的IP地址中的至少一个;
    所述网卡还包括:接收单元,用于接收所述目的RNIC发送的反馈信息,所述反馈信息中包括所述目的RNIC的IP地址和所述目的RNIC的MAC地址;
    所述处理单元,具体用于根据所述反馈信息确定所述报文转发信息和所述目的虚拟RNIC身份指示信息。
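  18. The network interface card according to any one of claims 12 to 17, wherein the target packet comprises a MAC header, an IP header, a layer-4 port number header, a network virtualization protocol header, and a payload field, wherein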
    所述MAC头中包括所述网卡的MAC地址和所述目的RNIC的MAC地址;
    所述IP头中包括所述网卡的IP地址和所述目的RNIC的IP地址;
    所述四层端口号头中包括四层端口号;
    所述网络虚拟化头包括所述目的虚拟RNIC的身份指示信息;
    所述负载字段包括所述待传输数据。
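  19. The network interface card according to any one of claims 12 to 18, wherein the identity indication information of the destination virtual RNIC comprises the VNI to which the destination virtual RNIC belongs and the virtual MAC address of the destination virtual RNIC.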
  20. 一种网卡,其特征在于,所述网卡支持远程直接内存存取技术,所述网卡包括:
    接收单元,用于接收源RNIC发送的报文,所述报文包括报文转发信息、目的虚拟RNIC身份指示信息和数据,所述报文转发信息包括所述源RNIC的互联网协议IP地址、所述源RNIC的媒体访问控制MAC地址、所述源RNIC的端口号、所述网卡的IP地址、所述网卡的MAC地址和四层端口号,所述目标报文不包括以下信息中的至少一个:源虚拟RNIC的IP地址、所述目的虚拟RNIC的IP地址、源虚拟RNIC的MAC地址、所述源虚拟RNIC的端口号和所述目的虚拟RNIC的端口号,所述源虚拟RNIC是运行在所述源RNIC中的一个虚拟RNIC,所述目的虚拟RNIC是运行在所述网卡中的一个虚拟RNIC;
    处理单元,用于根据所述目的虚拟RNIC身份指示信息,确定所述目的虚拟RNIC;
    所述处理单元,还用于将所述报文发送至所述目的vRNIC。
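  21. The network interface card according to claim 20, wherein the packet comprises: a MAC header, an IP header, a layer-4 port number header, a network virtualization protocol header, and a payload field, wherein
    the MAC header comprises the MAC address of the source RNIC and the MAC address of the network interface card;
    the IP header comprises the IP address of the source RNIC and the IP address of the network interface card;
    the layer-4 port number header comprises a layer-4 port number;
    the network virtualization protocol header comprises the identity indication information of the destination virtual RNIC; and
    the payload field comprises the data.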
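  22. The network interface card according to claim 20 or 21, wherein the destination virtual RNIC identity indication information comprises: the VNI to which the destination virtual RNIC belongs and the virtual MAC address of the destination virtual RNIC; and
    the processing unit is specifically configured to determine the destination virtual RNIC from a virtual device mapping table, wherein the virtual device mapping table comprises at least one virtual device entry, each of the at least one virtual device entry comprises a VNI, a MAC address, and an identifier, and the identifier in the virtual device entry that matches the VNI to which the destination virtual RNIC belongs and the virtual MAC address of the destination virtual RNIC is the identifier of the destination virtual RNIC.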
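On the receiving side, the virtual device mapping table of claim 22 amounts to a lookup keyed by (VNI, virtual MAC address). A minimal Python sketch follows, with the hypothetical names `VirtualDeviceEntry` and `find_destination_vrnic`, and assuming that MAC addresses are compared case-insensitively.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VirtualDeviceEntry:
    vni: int        # VNI the virtual RNIC belongs to
    mac: str        # virtual MAC address of the virtual RNIC
    vrnic_id: int   # identifier of the virtual RNIC on this network interface card


def find_destination_vrnic(table: List[VirtualDeviceEntry], vni: int, vmac: str) -> Optional[int]:
    """Return the identifier of the entry whose VNI and MAC both match, else None."""
    for entry in table:
        if entry.vni == vni and entry.mac.lower() == vmac.lower():
            return entry.vrnic_id
    return None


# The same virtual MAC may appear under different VNIs; the pair is what must be unique.
device_table = [
    VirtualDeviceEntry(5001, "02:00:00:00:00:07", 11),
    VirtualDeviceEntry(5002, "02:00:00:00:00:07", 12),
]
dest_id = find_destination_vrnic(device_table, vni=5002, vmac="02:00:00:00:00:07")
```

Once the identifier is found, the packet can be handed to that virtual RNIC, as recited in claim 20.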
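  23. A communication apparatus, comprising a processing circuit and a storage medium, wherein the storage medium stores program code, and the processing circuit is configured to invoke the program code in the storage medium to perform the method according to any one of claims 1 to 8.
  24. A communication apparatus, comprising a processing circuit and a storage medium, wherein the storage medium stores program code, and the processing circuit is configured to invoke the program code in the storage medium to perform the method according to any one of claims 9 to 10.
  25. A computer-readable storage medium, wherein the computer-readable storage medium stores instructions for implementing the method according to any one of claims 1 to 8.
  26. A computer-readable storage medium, wherein the computer-readable storage medium stores instructions for performing the method according to any one of claims 9 to 10.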
PCT/CN2020/102466 2019-07-19 2020-07-16 Communication method and network interface card WO2021013046A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP20844079.2A EP3828709A4 (en) 2019-07-19 2020-07-16 COMMUNICATION PROCESS AND NETWORK CARD
US17/201,833 US11431624B2 (en) 2019-07-19 2021-03-15 Communication method and network interface card

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910655048.9 2019-07-19
CN201910655048.9A CN112243046B (zh) 2019-07-19 2019-07-19 Communication method and network interface card

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/201,833 Continuation US11431624B2 (en) 2019-07-19 2021-03-15 Communication method and network interface card

Publications (1)

Publication Number Publication Date
WO2021013046A1 true WO2021013046A1 (zh) 2021-01-28

Family

ID=74167436

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/102466 WO2021013046A1 (zh) 2019-07-19 2020-07-16 Communication method and network interface card

Country Status (4)

Country Link
US (1) US11431624B2 (zh)
EP (1) EP3828709A4 (zh)
CN (1) CN112243046B (zh)
WO (1) WO2021013046A1 (zh)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11895092B2 (en) * 2019-03-04 2024-02-06 Appgate Cybersecurity, Inc. Network access controller operation
US11652666B2 (en) * 2019-07-30 2023-05-16 Vmware, Inc. Methods for identifying a source location in a service chaining topology
US11444790B1 (en) * 2021-07-09 2022-09-13 International Business Machines Corporation Dynamic exclusion of RDMA-based shared memory communication based on performance-related data
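CN113556265B (zh) * 2021-07-14 2024-02-20 国家计算机网络与信息安全管理中心 Data processing method, computer device, and readable storage medium
CN113312155B (zh) * 2021-07-29 2022-02-01 阿里云计算有限公司 Virtual machine creation method, apparatus, device, system, and computer program product
CN113326101B (zh) * 2021-08-02 2022-04-12 阿里云计算有限公司 Live migration method, apparatus, and device based on remote direct data storage
CN113824622B (zh) * 2021-09-13 2023-06-27 京东科技信息技术有限公司 Method and apparatus for controlling communication between containers, computer device, and storage medium
CN117675258A (zh) * 2022-09-06 2024-03-08 华为技术有限公司 Network isolation method, system, and related device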

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150012606A1 (en) * 2013-07-02 2015-01-08 Dell Products, Lp System and Method to Trap Virtual Functions of a Network Interface Device for Remote Direct Memory Access
CN107766261A (zh) * 2017-09-22 2018-03-06 华为技术有限公司 Data verification method and apparatus, and network interface card
WO2018119774A1 (en) * 2016-12-28 2018-07-05 Intel Corporation Virtualized remote direct memory access
CN109491809A (zh) * 2018-11-12 2019-03-19 西安微电子技术研究所 Communication method for reducing high-speed bus latency

Family Cites Families (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7551614B2 (en) * 2004-12-14 2009-06-23 Hewlett-Packard Development Company, L.P. Aggregation over multiple processing nodes of network resources each providing offloaded connections between applications over a network
US9331963B2 (en) 2010-09-24 2016-05-03 Oracle International Corporation Wireless host I/O using virtualized I/O controllers
CA2951970C (en) 2011-03-30 2018-02-13 Amazon Technologies, Inc. Frameworks and interfaces for offload device-based packet processing
US20130107889A1 (en) * 2011-11-02 2013-05-02 International Business Machines Corporation Distributed Address Resolution Service for Virtualized Networks
US9348649B2 (en) * 2013-07-22 2016-05-24 International Business Machines Corporation Network resource management system utilizing physical network identification for converging operations
US9306916B2 (en) * 2013-12-25 2016-04-05 Cavium, Inc. System and a method for a remote direct memory access over converged ethernet
CN103763173B (zh) * 2013-12-31 2017-08-25 华为技术有限公司 Data transmission method and computing node
US10635316B2 (en) * 2014-03-08 2020-04-28 Diamanti, Inc. Methods and systems for data storage using solid state drives
CN105227464B (zh) 2014-06-23 2019-01-18 新华三技术有限公司 Packet forwarding method and apparatus in a VCF system
US20160026605A1 (en) 2014-07-28 2016-01-28 Emulex Corporation Registrationless transmit onload rdma
US9747249B2 (en) 2014-12-29 2017-08-29 Nicira, Inc. Methods and systems to achieve multi-tenancy in RDMA over converged Ethernet
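CN105472023B (zh) 2014-12-31 2018-11-20 华为技术有限公司 Method and apparatus for remote direct memory access
CN104636185B (zh) * 2015-01-27 2018-03-02 华为技术有限公司 Service context management method, physical host, PCIe device, and migration management device
CN105404542A (zh) * 2015-08-14 2016-03-16 国家超级计算深圳中心(深圳云计算中心) Cloud computing system and method for running high-performance computing on it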
US10333865B2 (en) * 2015-08-21 2019-06-25 Cisco Technology, Inc. Transformation of peripheral component interconnect express compliant virtual devices in a network environment
US10367733B2 (en) * 2017-03-30 2019-07-30 Nicira, Inc. Identifier-based virtual networking
CN106953797B (zh) * 2017-04-05 2020-05-26 苏州浪潮智能科技有限公司 Method and apparatus for RDMA data transmission based on dynamic connections
US10614356B2 (en) * 2017-04-24 2020-04-07 International Business Machines Corporation Local multicast in single-host multi-GPU machine for distributed deep learning systems
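CN109213702B (zh) * 2017-06-30 2022-08-30 伊姆西Ip控股有限责任公司 Communication between virtual dual control modules in a virtual machine environment
CN107357660A (zh) * 2017-07-06 2017-11-17 华为技术有限公司 Method and apparatus for allocating virtual resources
CN107508828B (zh) 2017-09-18 2019-10-18 南京斯坦德云科技股份有限公司 Ultra-long-distance data interaction system and method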
US10992590B2 (en) * 2018-04-09 2021-04-27 Nicira, Inc. Path maximum transmission unit (PMTU) discovery in software-defined networking (SDN) environments
US11184295B2 (en) * 2018-12-28 2021-11-23 Vmware, Inc. Port mirroring based on remote direct memory access (RDMA) in software-defined networking (SDN) environments

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150012606A1 (en) * 2013-07-02 2015-01-08 Dell Products, Lp System and Method to Trap Virtual Functions of a Network Interface Device for Remote Direct Memory Access
WO2018119774A1 (en) * 2016-12-28 2018-07-05 Intel Corporation Virtualized remote direct memory access
CN107766261A (zh) * 2017-09-22 2018-03-06 华为技术有限公司 Data verification method and apparatus, and network interface card
CN109491809A (zh) * 2018-11-12 2019-03-19 西安微电子技术研究所 Communication method for reducing high-speed bus latency

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
IBM: "IBM's Shared Memory Communications over RDMA (SMC-R) Protocol", INDEPENDENT SUBMISSION REQUEST FOR COMMENTS:7609, 31 August 2015 (2015-08-31), XP015107657 *
See also references of EP3828709A4

Also Published As

Publication number Publication date
US20210226892A1 (en) 2021-07-22
CN112243046A (zh) 2021-01-19
EP3828709A1 (en) 2021-06-02
EP3828709A4 (en) 2021-11-17
CN112243046B (zh) 2021-12-14
US11431624B2 (en) 2022-08-30

Similar Documents

Publication Publication Date Title
WO2021013046A1 (zh) Communication method and network interface card
US20240171507A1 (en) System and method for facilitating efficient utilization of an output buffer in a network interface controller (nic)
EP3042298B1 (en) Universal pci express port
US7996569B2 (en) Method and system for zero copy in a virtualized network environment
US9450780B2 (en) Packet processing approach to improve performance and energy efficiency for software routers
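CN113326228B (zh) Packet forwarding method, apparatus, and device based on remote direct data storage
WO2017113306A1 (zh) Virtual extensible local area network packet sending method, computer device, and readable medium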
US20060067346A1 (en) System and method for placement of RDMA payload into application memory of a processor system
CN112398817B (zh) Data sending method and device
US20050223118A1 (en) System and method for placement of sharing physical buffer lists in RDMA communication
US10880204B1 (en) Low latency access for storage using multiple paths
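WO2016191990A1 (zh) Packet conversion method and apparatus
WO2020063298A1 (zh) Method for processing TCP packets, TOE component, and network device
WO2014079005A1 (zh) MAC address forced forwarding apparatus and method
WO2022068744A1 (zh) Method, device, and storage medium for obtaining packet header information and generating packets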
US12003417B2 (en) Communication method and apparatus
CN115827549A (zh) Network interface card, message sending method, and storage apparatus
US10185675B1 (en) Device with multiple interrupt reporting modes
WO2023010730A1 (zh) Data packet parsing method and server
US20080056263A1 (en) Efficient transport layer processing of incoming packets
WO2024016975A1 (zh) Packet forwarding method, apparatus, device, and chip system
US20240220347A1 (en) Network interface card, message sending method, and storage apparatus
CN115701063A (zh) Packet transmission method and communication apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20844079

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2020844079

Country of ref document: EP

Effective date: 20210226

NENP Non-entry into the national phase

Ref country code: DE