CN118093468A - PCIe exchange chip with RDMA acceleration function and PCIe switch - Google Patents

PCIe exchange chip with RDMA acceleration function and PCIe switch

Info

Publication number
CN118093468A
Authority
CN
China
Prior art keywords
rdma
memory
pcie
virtual
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410490685.6A
Other languages
Chinese (zh)
Other versions
CN118093468B (en)
Inventor
张洪波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shudu Information Technology Co ltd
Original Assignee
Beijing Shudu Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shudu Information Technology Co ltd filed Critical Beijing Shudu Information Technology Co ltd
Priority to CN202410490685.6A priority Critical patent/CN118093468B/en
Publication of CN118093468A publication Critical patent/CN118093468A/en
Application granted granted Critical
Publication of CN118093468B publication Critical patent/CN118093468B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Bus Control (AREA)

Abstract

The invention relates to a PCIe switch chip with an RDMA acceleration function and to a PCIe switch. The chip comprises an internally integrated PCIe endpoint device, a virtual-physical address translation table, a memory permission table, and a DMA unit. The virtual-physical address translation table records the mapping between virtual and physical addresses of the memory regions that RDMA uses to read and write data, and the memory permission table records the access permissions of registered memory. The DMA unit in the PCIe switch chip parses the WQE structures in an RDMA QP, generates the corresponding memory access TLPs and, once the data has been copied from the source to the destination, generates CQEs and places them in the CQs of both the sender and the receiver, completing the data transfer. Because the chip parses the WQEs in an RDMA QP directly, it reduces the software encapsulation and conversion workload and supports the RDMA application ecosystem more directly and effectively. It also reduces the constraints on transmission latency and rate, can reach the maximum PCIe transfer rate, and is expected to outperform RDMA network devices that attach through a PCIe interface.

Description

PCIe exchange chip with RDMA acceleration function and PCIe switch
Technical Field
The invention relates to a PCIe switch chip with an RDMA acceleration function and to a PCIe switch, and belongs to the technical field of PCIe switch chips, RDMA communication software interfaces, and their hardware implementation.
Background
PCIe is a computer expansion bus standard widely used to connect a host to peripheral devices. The PCIe controller on the host side is called the RC (Root Complex), and the PCIe controller on the device side is called the EP (Endpoint). Each EP device has a standard configuration space, a set of registers in a specified format that describes the basic properties of the device. Within it, the BAR registers identify the device's address space resources, such as registers and memory, which are referred to as BAR space. After the PCIe physical layer establishes the link, the RC maps the device's address spaces into the host's address space, and the host driver can then access and operate the device. In a PCIe bus topology, each PCIe device has a unique BDF number.
Requests and data are carried over the PCIe physical link as TLPs (Transaction Layer Packets), of which there are four types: memory access TLPs read and write the registers or memory of a device, IO TLPs access the IO space used on some architectures, configuration TLPs access a device's configuration space registers, and message TLPs carry messages such as user-defined messages.
PCIe is a point-to-point interconnect, so a PCIe switch chip is needed when the host does not have enough RC ports to connect all of its devices. A PCIe switch chip typically has one upstream port (USP), which connects to the RC on the host side, and multiple downstream ports (DSPs), which connect to EP devices. During PCIe enumeration, a lookup table is built in the PCIe switch chip that records the bus numbers, address ranges, and other information for each port. After enumeration, a TLP arriving at any port is looked up in this table according to its type and routed to the corresponding port by its BDF number or address. This lets one RC connect to multiple EP devices and also lets EP devices communicate with each other. A PCIe switch chip can also implement the NTB (Non-Transparent Bridge) function, that is, provide multiple USPs so that it can connect several hosts and let them access one another's resources.
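The table lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `PortEntry`, `Tlp`, and `route`, the two TLP kinds modeled, and the rule that an unmatched TLP is forwarded to the upstream port are all hypothetical simplifications, not the layout of any real switch chip.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PortEntry:
    """One row of the (hypothetical) lookup table built during enumeration."""
    port: str
    bus_lo: int    # lowest bus number reachable through this port
    bus_hi: int    # highest bus number reachable through this port
    addr_lo: int   # start of the address window (inclusive)
    addr_hi: int   # end of the address window (inclusive)


@dataclass(frozen=True)
class Tlp:
    kind: str          # "memory" (routed by address) or "config" (routed by BDF)
    address: int = 0   # used when kind == "memory"
    bus: int = 0       # bus part of the BDF, used when kind == "config"


def route(table: List[PortEntry], tlp: Tlp, upstream: str = "USP") -> str:
    """Return the port a TLP should leave through."""
    for entry in table:
        if tlp.kind == "memory" and entry.addr_lo <= tlp.address <= entry.addr_hi:
            return entry.port
        if tlp.kind == "config" and entry.bus_lo <= tlp.bus <= entry.bus_hi:
            return entry.port
    # Assumption: anything that matches no downstream port goes upstream.
    return upstream


table = [
    PortEntry("DSP0", 2, 2, 0x9000_0000, 0x900F_FFFF),
    PortEntry("DSP1", 3, 3, 0x9010_0000, 0x901F_FFFF),
]
print(route(table, Tlp("memory", address=0x9010_0040)))
```

A memory TLP is matched against the address windows and a configuration TLP against the bus ranges, which mirrors the statement that routing depends on the TLP type.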
RDMA (Remote Direct Memory Access) is a technology that, with the support of suitable software and hardware, lets a host directly access the memory of a remote host.
The basic operations of RDMA are called verbs. Two kinds of verbs relate to data transfer: memory verbs and messaging verbs. Memory verbs comprise remote memory reads and writes, namely RDMA Read and RDMA Write. Once a connection has been established and suitable access permissions granted, they read and write virtual memory addresses on the remote host directly, without involving the remote host. Messaging verbs comprise sending and receiving messages, namely RDMA Send and RDMA Receive. Unlike memory reads and writes, the sender of a messaging verb cannot write data directly into the receiver's memory; instead the data is placed in a buffer queue of the receiver's network device, which then writes it to main memory.
RDMA uses queues to manage data transmission and reception. As shown in FIG. 1, after the sender and receiver register the memory regions they will use and establish a connection, work queues (WQs) are created: a send queue (SQ) and a receive queue (RQ), which together form a queue pair (QP). Completion queues (CQs) are created at the same time. To transmit data, the sender's software writes a descriptor of the data to be sent, a work queue element (WQE), into the SQ. The sender's network hardware transmits the data to the receiver as the WQE describes and, when the transfer finishes, writes a completion descriptor, a CQE, into the CQ. Software polls the CQ for the CQE to learn that the transfer is complete. In addition, RDMA-capable network devices can store virtual-to-physical address translation tables, so users only need to pass virtual addresses and the hardware looks up the translation table to obtain the physical addresses used for the transfer. In short, software only has to write WQEs describing the data into the send queue; the RDMA-capable hardware (such as the RDMA hardware in FIG. 1) carries out the whole transfer and reports the completion status, so data transfer is highly efficient.
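The division of labor between software and hardware in this queue model can be illustrated with a toy simulation. The sketch below is not the verbs API: `post_send`, `hardware_process`, and `poll_cq`, as well as the fields of `Wqe` and `Cqe`, are hypothetical names chosen for illustration, and the "hardware" simply drains the send queue without moving any data.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Wqe:
    wr_id: int    # identifier chosen by software for this request
    vaddr: int    # virtual address of the data to send
    length: int   # number of bytes to send


@dataclass
class Cqe:
    wr_id: int
    status: str


@dataclass
class QueuePair:
    sq: deque = field(default_factory=deque)  # send queue
    rq: deque = field(default_factory=deque)  # receive queue


def post_send(qp: QueuePair, wqe: Wqe) -> None:
    """Software side: describe the transfer and return immediately."""
    qp.sq.append(wqe)


def hardware_process(qp: QueuePair, cq: deque) -> None:
    """Stand-in for the RDMA hardware: consume WQEs and report completions."""
    while qp.sq:
        wqe = qp.sq.popleft()
        # A real device would move wqe.length bytes here.
        cq.append(Cqe(wqe.wr_id, "success"))


def poll_cq(cq: deque):
    """Software side: return the next CQE, or None if nothing has completed."""
    return cq.popleft() if cq else None


qp, cq = QueuePair(), deque()
post_send(qp, Wqe(wr_id=7, vaddr=0x1000, length=64))
before = poll_cq(cq)       # nothing has completed yet
hardware_process(qp, cq)
after = poll_cq(cq)        # completion for wr_id 7
```

Software touches only the queues; everything between posting the WQE and seeing the CQE belongs to the hardware, which is the property the later sections rely on.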
RDMA was originally proposed by Mellanox on the basis of its proprietary InfiniBand network. It was later extended to Ethernet, which requires upgraded network card hardware. RoCE (RDMA over Converged Ethernet) implements RDMA on the Ethernet data link layer, while iWARP (Internet Wide Area RDMA Protocol) implements RDMA on the TCP transport layer protocol.
In the traditional architecture, sending and receiving network data involves multiple copies between user mode, kernel mode, and hardware, frequent hardware interrupts, and the repeated software context switches those interrupts cause. These lower the system cache hit rate and hurt performance, and the kernel's complex protocol stack also consumes processor computing resources.
RDMA technology was introduced to solve these problems. It lets an application submit accesses to remote host memory directly from user space to the hardware, eliminating extra data copies; it bypasses the operating system kernel, reducing context switches; and the data transfer is performed directly by hardware without CPU involvement, greatly reducing CPU load. Because of this performance, many network applications can now use RDMA communication, a broad application ecosystem has formed around RDMA, and it has in practice become a standard interface supported by a large number of applications.
RDMA requires hardware with the corresponding functions, such as InfiniBand or RoCE network cards. In some cases, however, a group of closely spaced servers in the same cabinet can be interconnected with PCIe switch chips. From the viewpoint of the hardware transport protocol, an RDMA network device is itself a PCIe device on the server host, so traffic passes from a PCIe interface to a network interface, which adds hardware latency and cost. Connecting the hosts directly through PCIe switch chips is therefore the most efficient option.
Because RDMA-based software already runs on the servers, the RDMA verbs interface (a low-level programming interface) can be ported to PCIe links. This keeps RDMA software compatible and avoids purchasing expensive RDMA hardware. Examples of such attempts include an IEEE publication, "Using PCIe-Based RDMA to Accelerate Rack-Scale Communications in Data Centers", and "InfiniBand RDMA over PCI Express Networks" from the University of Oslo, which experimented with porting RDMA software interfaces to a network interconnected through PCIe NTB.
In theory, the RDMA interface can be implemented on a PCIe interconnect through interface calls and encapsulation in the software layers, and the work cited above has demonstrated this in practice. Such a pure software wrapper achieves the functional goal but may not deliver optimal performance. The core problem is the gap between the conventional DMA unit built into a PCIe switch chip and what the RDMA scenario requires. First, an RDMA device parses a data transfer request, a WQE of a defined format in the SQ, sends the data, and then places a completion identifier, a CQE, in the CQ. The DMA unit built into a PCIe device, by contrast, is controlled in the conventional way: it is configured with basic commands such as source address, destination address, and transfer length; after the data copy completes it generally raises an interrupt, and the processor's interrupt service routine determines the cause of the interrupt and carries out further processing. The conventional PCIe DMA unit cannot recognize the WQEs in an RDMA QP, so software must translate them before every transfer. Likewise, after a conventional DMA transfer completes, the CQE has to be generated in the interrupt service routine. These software operations reduce RDMA transfer efficiency. In addition, a conventional DMA unit works directly with physical addresses, whereas RDMA applications pass the virtual addresses of the data, so software must translate them to physical addresses before handing them to the DMA unit, which further affects performance. Finally, when RDMA accesses remote memory, access to that memory must be permission-controlled or security problems may result, and the conventional DMA unit built into PCIe devices has no such function.
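To make the software overhead concrete, the sketch below shows the kind of per-transfer work a software adaptation layer would have to do in front of a conventional DMA unit. It is a hypothetical illustration only: the dictionary-based WQE, the page tables keyed by virtual page number, the 4 KiB page size, and the function names are assumptions, and buffers are assumed not to cross a page boundary.

```python
PAGE = 4096  # assumed page size


def va_to_pa(page_table: dict, vaddr: int) -> int:
    """Software address translation: virtual page number -> physical page number."""
    return page_table[vaddr // PAGE] * PAGE + vaddr % PAGE


def wqe_to_dma_command(wqe: dict, src_pages: dict, dst_pages: dict) -> dict:
    """Before every transfer: turn a WQE into the source/destination/length
    command that a conventional DMA unit understands."""
    return {
        "src": va_to_pa(src_pages, wqe["local_va"]),
        "dst": va_to_pa(dst_pages, wqe["remote_va"]),
        "len": wqe["length"],
    }


def on_dma_interrupt(wqe: dict, cq: list) -> None:
    """After every transfer: the interrupt service routine builds the CQE."""
    cq.append({"wr_id": wqe["wr_id"], "status": "success"})


wqe = {"wr_id": 1, "local_va": 0x2010, "remote_va": 0x5020, "length": 64}
cmd = wqe_to_dma_command(wqe, src_pages={2: 0x80}, dst_pages={5: 0x90})
cq = []
on_dma_interrupt(wqe, cq)
```

All three functions run on the CPU for every request; the invention's stated aim is to move this work into the DMA unit.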
Based on this, the present invention has been proposed.
Disclosure of Invention
The RDMA mechanism offers high transfer efficiency. Where the transmission distance and number of nodes permit, interconnecting hosts directly through PCIe switch chips is more direct and efficient than converting to another network protocol. The RDMA software interface can also run over PCIe links through software encapsulation, but its efficiency is limited in some respects by conventional DMA hardware.
While developing a PCIe switch chip, the inventor analyzed the full software-to-hardware flow of the application scenario and found that the working mode of the conventional DMA unit can be improved so that RDMA runs better on a PCIe switch chip.
Running RDMA applications on hosts connected by a PCIe switching network can be achieved with software encapsulation and conversion, but the functional limits of the conventional DMA unit impair overall system efficiency. The invention provides a new PCIe-based DMA unit implementation that lets a PCIe switch chip better support RDMA applications, reduces the software processing workload, and thereby improves data transfer efficiency. The specific technical scheme is as follows:
A PCIe switch chip with an RDMA acceleration function, comprising:
an internally integrated PCIe endpoint device, used for the exchange of control commands;
a virtual-physical address translation table, which is established in host memory when the driver is initialized and records the mapping between virtual addresses and physical addresses of the memory regions used by RDMA to read and write data;
a memory permission table, which is established in host memory when the RDMA sender and receiver establish a connection and register memory, and records the access permissions of the registered memory;
and a DMA unit in the PCIe switch chip, which parses the WQE structures in an RDMA QP to generate the corresponding memory access TLPs and, after the data has been copied from the source to the destination, generates CQEs and places them in the CQs of both the sender and the receiver, thereby completing the data transfer.
In a further improvement, the interactions handled by the internally integrated PCIe endpoint device include programming the virtual-physical address translation table, programming the memory permission table, managing QP and CQ information, and controlling DMA unit transfers.
In a further improvement, a memory inside the PCIe switch chip caches the virtual-physical address translation table; after system initialization, the DMA unit caches the translation table from the external host into the PCIe switch chip by issuing memory access TLP requests.
In a further improvement, after system initialization, the DMA unit caches the memory permission table from the external host into the PCIe switch chip by issuing memory access TLP requests.
In a further improvement, the DMA unit performs the translation from virtual addresses to physical addresses and, when reading or writing remote memory data, enforces access control by verifying a key.
In a further improvement, after the driver is initialized, the DMA unit reads the QP and QPC and the CQ and CQC from host memory into its internal memory, and the virtual-physical address translation table and the memory permission table are created; the DMA unit then completes data copies automatically according to the QP and QPC, the CQ and CQC, the virtual-physical address translation table, and the memory permission table, and generates copy completion information.
In a further improvement, a PCIe switch comprises at least one of the PCIe switch chips with an RDMA acceleration function described above.
The invention has the beneficial effects that:
1. The invention parses the WQEs in an RDMA QP directly and generates CQEs in the CQ after a transfer completes. It can store the virtual-physical address translation table for the relevant host memory regions, so applications pass virtual addresses and the DMA unit looks them up and converts them to physical addresses itself. It protects access to local and remote memory using the keys generated when the connection is established.
2. The invention adds a DMA unit with these functions to an ordinary PCIe switch chip, so that the PCIe switch chip better supports RDMA.
3. Within the distance and node count that PCIe can support, interconnecting hosts with the improved PCIe switch chip reduces the software encapsulation and conversion workload, thanks to the dedicated DMA unit, and supports the RDMA application ecosystem more directly and effectively. Because no additional RDMA network device is needed, the constraints on transmission latency and rate are reduced, the maximum PCIe transfer rate can be reached, performance is expected to exceed that of RDMA network devices with a PCIe interface, and the purchase cost of network equipment is saved.
Drawings
FIG. 1 shows the RDMA work queues;
FIG. 2 shows the core data fields of a transfer work request WQE;
FIG. 3 is a functional block diagram and control scheme;
FIG. 4 shows the workflow of the DMA unit built into the PCIe switch chip.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Abbreviations and key term definitions
RDMA: Remote Direct Memory Access, a mechanism for directly accessing the memory of a remote host;
InfiniBand: also called IB, a network protocol for high-speed data transmission;
DMA: Direct Memory Access, a mode in which a dedicated hardware unit copies data without going through the CPU;
IO: Input/Output;
BAR: Base Address Register, a register that indicates the memory resources of a PCIe device;
BDF: Bus, Device, Function, the unique ID of a PCIe device;
TLP: Transaction Layer Packet, the packet format specified by the PCIe protocol;
USP: Upstream Port, the upstream port of a PCIe switch chip, used to connect to the RC device;
DSP: Downstream Port, a downstream port of a PCIe switch chip, used to connect to EP devices;
NTB: Non-Transparent Bridge; the PCIe switch modifies the addresses of TLPs as configured when forwarding them;
Enumeration: the process by which the host at the PCIe RC end discovers the PCIe bus topology and maps its devices into the host address space;
RoCE: RDMA over Converged Ethernet, an implementation of RDMA on the Ethernet data link layer;
iWARP: Internet Wide Area RDMA Protocol, another implementation of RDMA, based on the TCP transport layer protocol;
WQ: Work Queue, a queue storing RDMA work requests; includes the SQ and RQ;
WQE: Work Queue Element, an element in an RDMA work queue;
SQ: Send Queue, the data send queue;
RQ: Receive Queue, the data receive queue;
QP: Queue Pair, comprising a data send queue and a data receive queue;
QPC: Queue Pair Context, which stores QP-related attributes such as the addresses of the SQ and RQ;
CQ: Completion Queue;
CQE: Completion Queue Element, an element in the completion queue;
CQC: Completion Queue Context, which stores CQ-related attributes such as the address of the CQ;
WR: Work Request; the user layer uses WRs, which correspond to WQEs at the driver and hardware layer;
WC: Work Completion; the user layer uses WCs, which correspond to CQEs at the driver and hardware layer;
MR: Memory Region, a memory region used for RDMA data;
MTT: Memory Translation Table, a virtual-to-physical address translation table;
MPT: Memory Protection Table, a table recording the access permissions of memory regions;
OPCODE: Operation Code, the operation instruction.
Example 1
To better support RDMA applications, the invention adds the corresponding hardware functional units to an ordinary PCIe switch chip. It comprises the following parts:
1. Internally integrated PCIe endpoint device (DSPiEP)
In the PCIe architecture, a functional unit or group of functional units needs a BDF number for PCIe packets to be routed to it, which means adding a new PCIe endpoint device (EP). In a PCIe switching network, an EP must also be attached below a DSP to comply with the specification. In practice this is achieved by a DSP with an integrated EP, the DSPiEP, as shown in FIG. 3. The "internet" in FIG. 3 refers to the internal bus that carries data between the USPs and DSPs.
The DSPiEP is attached to the original switching network, so it is discovered during enumeration and allocated resources such as a BDF number. The registers and memory newly added for the DMA unit are mapped into the EP's BAR space and can therefore be accessed after enumeration. This supports all control-command interactions, including programming the virtual-physical address translation table and memory permission table inside the PCIe switch chip, managing QP and CQ information, and controlling DMA unit transfers.
2. Virtual-physical address translation table (MTT)
When the driver is initialized, a virtual-physical address translation table is established in host memory; it records the mapping between virtual and physical addresses of the memory regions that RDMA uses to read and write data. For better performance, a memory inside the PCIe switch chip caches this table: after system initialization, the DMA unit issues memory access TLP requests to copy the translation table from the external host into the PCIe switch chip.
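A translation table of this kind can be modeled as a map from virtual page numbers to physical page numbers. The sketch below is a minimal illustration, assuming 4 KiB pages and a dictionary-backed table; the class name `Mtt` and its methods are hypothetical and say nothing about the table format used in the actual chip. It also shows why one virtual buffer may turn into several physical segments, each of which would need its own memory access TLPs.

```python
PAGE_SIZE = 4096  # assumed page size


class Mtt:
    """Toy virtual-to-physical translation table for registered memory."""

    def __init__(self):
        self.entries = {}  # virtual page number -> physical page number

    def register(self, vaddr: int, phys_pages: list) -> None:
        """Record the physical pages backing a region that starts at vaddr."""
        first = vaddr // PAGE_SIZE
        for i, ppn in enumerate(phys_pages):
            self.entries[first + i] = ppn

    def translate(self, vaddr: int, length: int) -> list:
        """Split [vaddr, vaddr + length) into (physical address, length) segments."""
        segments = []
        while length > 0:
            vpn, offset = divmod(vaddr, PAGE_SIZE)
            if vpn not in self.entries:
                raise KeyError(f"virtual page {vpn:#x} is not registered")
            chunk = min(length, PAGE_SIZE - offset)
            segments.append((self.entries[vpn] * PAGE_SIZE + offset, chunk))
            vaddr += chunk
            length -= chunk
        return segments


mtt = Mtt()
mtt.register(0x10000, [0x200, 0x7F])    # two virtual pages, scattered physically
segments = mtt.translate(0x10FF0, 0x20)  # 32 bytes straddling the page boundary
```

Here the 32-byte buffer is contiguous in virtual memory but maps to two unrelated physical pages, so the lookup returns two segments.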
3. Memory Permission Table (MPT)
When the RDMA sender and receiver establish a connection and register memory, a memory permission table is established in host memory; it records the access permissions of the registered memory.
For better performance, a memory inside the PCIe switch chip caches the memory permission table: after system initialization, the DMA unit issues memory access TLP requests to copy the memory permission table from the external host into the PCIe switch chip.
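One way to picture the memory permission table is as a map from a key to a registered region and its access rights. The sketch below is a hypothetical model: the `Access` flags, the `MptEntry` fields, and the rule that a request must fall entirely inside the registered region are illustrative assumptions, since the source only states that the table records access permissions and that keys are verified.

```python
from dataclasses import dataclass
from enum import Flag, auto


class Access(Flag):
    LOCAL_WRITE = auto()
    REMOTE_READ = auto()
    REMOTE_WRITE = auto()


@dataclass(frozen=True)
class MptEntry:
    base: int        # virtual start address of the registered region
    length: int      # size of the region in bytes
    access: Access   # rights granted at registration time


class Mpt:
    """Toy memory permission table keyed by the region's access key."""

    def __init__(self):
        self.by_key = {}

    def register(self, key: int, entry: MptEntry) -> None:
        self.by_key[key] = entry

    def check(self, key: int, vaddr: int, length: int, needed: Access) -> bool:
        entry = self.by_key.get(key)
        if entry is None:
            return False  # unknown key
        in_range = entry.base <= vaddr and vaddr + length <= entry.base + entry.length
        return in_range and (needed & entry.access) == needed


mpt = Mpt()
mpt.register(0x1234, MptEntry(base=0x10000, length=0x2000, access=Access.REMOTE_WRITE))
allowed = mpt.check(0x1234, 0x10000, 0x100, Access.REMOTE_WRITE)
```

A request is accepted only if the key is known, the address range lies inside the registered region, and every requested right was granted.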
4. DMA unit
The core function of the DMA unit in the PCIe switch chip is to parse the WQE structures in an RDMA QP, generate the corresponding memory access TLPs and, after the data has been copied from the source to the destination, generate CQEs and place them in the CQs of both the sender and the receiver, thereby completing the data transfer. Because the application passes virtual addresses, the DMA unit performs the virtual-to-physical translation itself, so the application can use virtual addresses directly. When remote memory is read or written, the DMA unit also verifies the key and enforces access control, which keeps the system secure.
The WQEs that drive DMA transfers reside in the QP, made up of the SQ and RQ, in host memory. The basic attributes of the QP (such as the addresses of the SQ and RQ and the current WQE) are held in the QPC, and the basic attributes of the CQ (such as the address of the CQ and the current CQE) are held in the CQC. After initialization, the driver creates the QP and QPC, the CQ and CQC, the virtual-physical address translation table, and the memory permission table in host memory. Before a DMA transfer starts, the basic attributes in the QPC and CQC, the virtual-physical address translation table, and the memory permission table are loaded into memory inside the PCIe switch chip.
Taking RDMA Send/Receive as an example, as shown in FIG. 4, a WQE in memory is only a descriptor of the data, while the actual data resides in a separate data area. The internal modules and workflow of the DMA unit are as follows:
1) After the DMA unit starts, it locates the send request's work descriptor (WQE) in the SQ according to the QPC;
2) It reads and parses the WQE in the send queue;
3) It looks up the virtual-physical address translation table with the source data address given in the WQE to obtain the physical address of the data to send;
4) It issues a memory read TLP request and reads the source data from the data area of the sending host;
5) It looks up the memory permission table of the receiving port with the remote key R_KEY and confirms the access permission for the receiving port;
6) It locates the receive work descriptor (WQE) according to the QPC of the receiving port;
7) It reads the WQE from the receive queue and parses information such as the receive address;
8) It looks up the MTT of the receiving end and translates the virtual address into a physical address;
9) It packs the source data into memory write request TLPs, writes them to the data area of the receiving host, and waits for the returned completion TLP as the acknowledgment of the data transfer;
10) After receiving the completion TLP, the DMA unit obtains the CQ address information from the CQC of the receiving end;
11) It writes a CQE to the receiving end to indicate that the data transfer is complete;
12) It obtains the sending end's CQ address information from the CQC of the sending end;
13) It writes a CQE to the sending end to indicate that the data transfer is complete.
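The thirteen steps above can be traced in a small simulation. This is a sketch under loud assumptions, not the chip's implementation: each host is a flat `bytearray`, the MTT is a dictionary of page numbers, the permission table is reduced to a set of valid keys, WQEs and CQEs are plain dictionaries, buffers are assumed to fit in one 4 KiB page, and the error CQE written on a failed key check is an invented behavior. TLPs are not modeled; a slice copy stands in for the memory read and write requests.

```python
PAGE = 4096  # assumed page size


class Host:
    """One host: flat physical memory plus the tables the DMA unit consults."""

    def __init__(self, size: int):
        self.mem = bytearray(size)
        self.mtt = {}      # virtual page number -> physical page number
        self.mpt = set()   # keys that grant access to this host's memory
        self.sq, self.rq, self.cq = [], [], []

    def translate(self, vaddr: int) -> int:
        return self.mtt[vaddr // PAGE] * PAGE + vaddr % PAGE


def dma_send_receive(sender: Host, receiver: Host) -> bool:
    send_wqe = sender.sq.pop(0)                                   # steps 1-2
    src_pa = sender.translate(send_wqe["vaddr"])                  # step 3
    data = bytes(sender.mem[src_pa:src_pa + send_wqe["length"]])  # step 4
    if send_wqe["r_key"] not in receiver.mpt:                     # step 5
        # Assumption: a failed check is reported to the sender only.
        sender.cq.append({"wr_id": send_wqe["wr_id"], "status": "access_error"})
        return False
    recv_wqe = receiver.rq.pop(0)                                 # steps 6-7
    dst_pa = receiver.translate(recv_wqe["vaddr"])                # step 8
    receiver.mem[dst_pa:dst_pa + len(data)] = data                # step 9
    receiver.cq.append({"wr_id": recv_wqe["wr_id"], "status": "success"})  # steps 10-11
    sender.cq.append({"wr_id": send_wqe["wr_id"], "status": "success"})    # steps 12-13
    return True


a, b = Host(16 * PAGE), Host(16 * PAGE)
a.mtt[0x10] = 2          # sender: virtual page 0x10 -> physical page 2
b.mtt[0x20] = 3          # receiver: virtual page 0x20 -> physical page 3
b.mpt.add(0xBEEF)        # key registered by the receiver
a.mem[2 * PAGE + 8:2 * PAGE + 13] = b"hello"
a.sq.append({"wr_id": 1, "vaddr": 0x10 * PAGE + 8, "length": 5, "r_key": 0xBEEF})
b.rq.append({"wr_id": 9, "vaddr": 0x20 * PAGE, "length": 64})
ok = dma_send_receive(a, b)
```

After the call, the payload sits at the receiver's physical page 3 and each side has one CQE, without either host's software touching the data path.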
Because the DMA unit built into the PCIe switch chip fully handles the RDMA formats involved, parsing the WQEs in the QP and generating the CQEs in the CQ, and supports virtual addresses, a PCIe switch chip interconnect supports the RDMA application ecosystem better, reduces the workload of the software adaptation layer, and improves efficiency. Within the transmission distance and node count that PCIe switch chips can support, it can also achieve lower latency and cost than an RDMA network card attached through a PCIe interface.
The invention parses the WQE in an RDMA QP directly (see FIG. 2): it determines the type of data transfer from the opcode, then reads the data according to the local address and transfer length and sends it to the remote address, or receives data from the remote end into the local address. The remote key is used to verify read and write access permission at the remote end, and the local key is used to verify read and write access permission at the local end. After the transfer completes, a CQE is generated in the CQ; it contains the unique ID of the current work request and indicates that the request has completed.
The invention supports the RDMA verbs interface efficiently: after an application on the operating system establishes a connection, it places data send and receive requests into the send queue SQ and receive queue RQ using virtual addresses directly, all remaining operations are completed by hardware, and software waits for the completion message in the completion queue CQ. Once the RDMA verbs interface is supported, the software ecosystem built on it is compatible as well.
The foregoing describes preferred embodiments of the invention and is not intended to limit it; all modifications, equivalents, and alternatives falling within the spirit and principles of the invention are intended to be covered.
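Parsing a WQE amounts to decoding a fixed binary layout and dispatching on the opcode. The sketch below uses the fields named in the text (opcode, local address, transfer length, remote address, local and remote keys, work request ID), but the 40-byte little-endian layout, the field order, and the opcode numbering are purely hypothetical; FIG. 2 is not reproduced here and the real format may differ.

```python
import struct
from typing import NamedTuple

# Hypothetical layout: opcode, length (u32 each), local and remote address (u64 each),
# local key, remote key (u32 each), work request ID (u64). Little-endian, no padding.
WQE_FORMAT = "<IIQQIIQ"
OPCODES = {0: "SEND", 1: "RECV", 2: "RDMA_WRITE", 3: "RDMA_READ"}  # assumed numbering


class Wqe(NamedTuple):
    opcode: str
    length: int
    local_addr: int
    remote_addr: int
    l_key: int
    r_key: int
    wr_id: int


def parse_wqe(raw: bytes) -> Wqe:
    """Decode one WQE from its binary form and name the opcode."""
    op, length, local, remote, l_key, r_key, wr_id = struct.unpack(WQE_FORMAT, raw)
    if op not in OPCODES:
        raise ValueError(f"unknown opcode {op}")
    return Wqe(OPCODES[op], length, local, remote, l_key, r_key, wr_id)


def make_cqe(wqe: Wqe, status: str = "success") -> dict:
    """Build the completion entry carrying the ID of the finished request."""
    return {"wr_id": wqe.wr_id, "status": status}


raw = struct.pack(WQE_FORMAT, 2, 256, 0x7F0000001000, 0x7F0000002000, 0x11, 0x22, 42)
wqe = parse_wqe(raw)
cqe = make_cqe(wqe)
```

The work request ID travels unchanged from the WQE into the CQE, which is how software matches a completion to the request it posted.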

Claims (7)

1. A PCIe switch chip with an RDMA acceleration function, characterized by comprising:
an internally integrated PCIe endpoint device, used for the exchange of control commands;
a virtual-physical address translation table, which is established in host memory when the driver is initialized and records the mapping between virtual addresses and physical addresses of the memory regions used by RDMA to read and write data;
a memory permission table, which is established in host memory when the RDMA sender and receiver establish a connection and register memory, and records the access permissions of the registered memory;
and a DMA unit in the PCIe switch chip, which parses the WQE structures in an RDMA QP to generate the corresponding memory access TLPs and, after the data has been copied from the source to the destination, generates CQEs and places them in the CQs of both the sender and the receiver, thereby completing the data transfer.
2. The PCIe switch chip with an RDMA acceleration function of claim 1, wherein the interactions handled by the internally integrated PCIe endpoint device include programming the virtual-physical address translation table, programming the memory permission table, managing QP and CQ information, and controlling DMA unit transfers.
3. The PCIe switch chip with an RDMA acceleration function of claim 1, wherein, after system initialization, the DMA unit caches the translation table from the external host into the PCIe switch chip by issuing memory access TLP requests.
4. The PCIe switch chip with an RDMA acceleration function of claim 1, wherein, after system initialization, the DMA unit caches the memory permission table from the external host into the PCIe switch chip by issuing memory access TLP requests.
5. The PCIe switch chip with an RDMA acceleration function of claim 1, wherein the DMA unit performs the translation from virtual addresses to physical addresses and, when reading or writing remote memory data, enforces access control by verifying a key.
6. The PCIe switch chip with an RDMA acceleration function of claim 5, wherein, after the driver is initialized, the DMA unit reads the QP and QPC and the CQ and CQC from host memory into its internal memory, and the virtual-physical address translation table and the memory permission table are created; the DMA unit then completes data copies automatically according to the QP and QPC, the CQ and CQC, the virtual-physical address translation table, and the memory permission table, and generates copy completion information.
7. A PCIe switch, characterized by comprising at least one PCIe switch chip with an RDMA acceleration function as claimed in any one of claims 1-6.
CN202410490685.6A 2024-04-23 2024-04-23 PCIe exchange chip with RDMA acceleration function and PCIe switch Active CN118093468B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410490685.6A CN118093468B (en) 2024-04-23 2024-04-23 PCIe exchange chip with RDMA acceleration function and PCIe switch

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410490685.6A CN118093468B (en) 2024-04-23 2024-04-23 PCIe exchange chip with RDMA acceleration function and PCIe switch

Publications (2)

Publication Number Publication Date
CN118093468A (en) 2024-05-28
CN118093468B (en) 2024-07-02

Family

ID=91144238

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410490685.6A Active CN118093468B (en) 2024-04-23 2024-04-23 PCIe exchange chip with RDMA acceleration function and PCIe switch

Country Status (1)

Country Link
CN (1) CN118093468B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111221758A (en) * 2019-09-30 2020-06-02 华为技术有限公司 Method and computer equipment for processing remote direct memory access request
CN115309665A (en) * 2021-05-07 2022-11-08 华为技术有限公司 Computer equipment and memory registration method
CN116016570A (en) * 2022-12-29 2023-04-25 深圳云豹智能有限公司 Message processing method, device and system
CN116578504A (en) * 2023-05-26 2023-08-11 深圳云豹智能有限公司 Method, device, system, chip and storage medium for improving efficiency of access of RDMA engine to memory area

Also Published As

Publication number Publication date
CN118093468B (en) 2024-07-02

Similar Documents

Publication Publication Date Title
CN1647054B (en) Double-mode network device driving device, system and method
US6757768B1 (en) Apparatus and technique for maintaining order among requests issued over an external bus of an intermediate network node
US10152441B2 (en) Host bus access by add-on devices via a network interface controller
US7404190B2 (en) Method and apparatus for providing notification via multiple completion queue handlers
US8244826B2 (en) Providing a memory region or memory window access notification on a system area network
US9430432B2 (en) Optimized multi-root input output virtualization aware switch
US7103744B2 (en) Binding a memory window to a queue pair
KR100773013B1 (en) Method and Apparatus for controlling flow of data between data processing systems via a memory
US6832279B1 (en) Apparatus and technique for maintaining order among requests directed to a same address on an external bus of an intermediate network node
Dubnicki et al. Design and implementation of virtual memory-mapped communication on myrinet
TWI244288B (en) Network interface and protocol
US7617376B2 (en) Method and apparatus for accessing a memory
JP4755391B2 (en) Method and apparatus for controlling the flow of data between data processing systems via a memory
US6487619B1 (en) Multiprocessor system that communicates through an internal bus using a network protocol
US6901451B1 (en) PCI bridge over network
US20090043886A1 (en) OPTIMIZING VIRTUAL INTERFACE ARCHITECTURE (VIA) ON MULTIPROCESSOR SERVERS AND PHYSICALLY INDEPENDENT CONSOLIDATED VICs
US20090077567A1 (en) Adaptive Low Latency Receive Queues
US20050228930A1 (en) Programmable inter-virtual channel and intra-virtual channel instructions issuing rules for an I/O bus of a system-on-a-chip processor
US20100329275A1 (en) Multiple Processes Sharing a Single Infiniband Connection
JP2004520646A (en) Method and apparatus for transferring an interrupt from a peripheral device to a host computer system
US6816889B1 (en) Assignment of dual port memory banks for a CPU and a host channel adapter in an InfiniBand computing node
US7710990B2 (en) Adaptive low latency receive queues
US7089378B2 (en) Shared receive queues
CN118093468B (en) PCIe exchange chip with RDMA acceleration function and PCIe switch
US20020049875A1 (en) Data communications interfaces

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant