WO2022237695A1 - High-concurrency protocol stack offloading method, device, and medium based on host-side large-capacity memory - Google Patents

High-concurrency protocol stack offloading method, device, and medium based on host-side large-capacity memory

Info

Publication number
WO2022237695A1
WO2022237695A1 (PCT/CN2022/091531)
Authority
WO
WIPO (PCT)
Prior art keywords
data, toe, sent, amount information, data amount
Prior art date
Application number
PCT/CN2022/091531
Other languages
English (en)
French (fr)
Inventor
金浩
杨洪章
屠要峰
蒋德钧
韩银俊
郭斌
陈峰峰
Original Assignee
ZTE Corporation
Institute of Computing Technology, Chinese Academy of Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corporation and Institute of Computing Technology, Chinese Academy of Sciences
Publication of WO2022237695A1

Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04L — TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 — Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/30 — Definitions, standards or architectural aspects of layered protocol stacks
    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D — CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present application relates to, but is not limited to, the field of network transmission technology, and in particular to a high-concurrency protocol stack offloading method, device, and medium based on host-side large-capacity memory.
  • With the rapid development of network technology, the overhead of the Transmission Control Protocol (TCP) stack implemented by the central processing unit (CPU) keeps growing; software overhead and buffering latency have become the bottleneck of high-performance network services. To free up CPU resources, the TCP Offload Engine (TOE) technique emerged, which offloads the protocol stack to FPGA hardware for processing.
  • When many network links are concurrent, the buffer queues consume a large amount of memory. In common TOE implementations, the running space is usually provided by the mounted Double Data Rate (DDR) memory, and the hardware characteristics of DDR memory limit how far that space can be expanded, making it hard to meet the capacity demands of high-concurrency scenarios.
  • Embodiments of the present application provide a high-concurrency protocol stack offloading method, device, and medium based on host-side large-capacity memory.
  • An embodiment of the present application provides a high-concurrency protocol stack offloading method based on host-side large-capacity memory, applied to the central processing unit CPU on the host side, in which the transceiver buffers of the TOE hardware logic are offloaded into the RAM memory space on the host side and the CPU is communicatively connected to the TOE hardware. The method includes: acquiring data to be sent, and determining first data amount information of the data to be sent; buffering the data to be sent into a send buffer in shared memory; and sending the first data amount information to the TOE hardware, so that the TOE hardware acquires the data to be sent from the shared memory according to the first data amount information and performs TOE offloading on the data to be sent.
  • An embodiment of the present application provides a high-concurrency protocol stack offloading method based on host-side large-capacity memory, applied to TOE hardware, in which the transceiver buffers of the TOE hardware logic are offloaded into the RAM memory space on the host side and the TOE hardware is communicatively connected to the central processing unit CPU on the host side. The method includes: acquiring first data amount information sent by the CPU; acquiring data to be sent from a send buffer in shared memory according to the first data amount information, wherein the data to be sent is acquired by the CPU and buffered into the send buffer; and performing TOE offloading on the data to be sent.
  • An embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor; when the processor executes the computer program, it implements the high-concurrency protocol stack offloading method based on host-side large-capacity memory of the first aspect, or of the second aspect.
  • Figure 1 is a flowchart of a high-concurrency protocol stack offloading method based on host-side large-capacity memory provided by an embodiment of the present application;
  • Figure 2 is a schematic structural diagram of a server provided by another embodiment of the present application;
  • Figure 3 is a flowchart of acquiring data to be sent provided by another embodiment of the present application;
  • Figure 4 is a flowchart of synchronizing first address information provided by another embodiment of the present application;
  • Figure 5 is a flowchart of performing data reception provided by another embodiment of the present application;
  • Figure 6 is a flowchart of acquiring data to be received provided by another embodiment of the present application;
  • Figure 7 is a flowchart of synchronizing second address information provided by another embodiment of the present application;
  • Figure 8 is a flowchart of a high-concurrency protocol stack offloading method based on host-side large-capacity memory provided by another embodiment of the present application;
  • Figure 9 is a flowchart of acquiring data to be sent provided by another embodiment of the present application;
  • Figure 10 is a flowchart of synchronizing first address information provided by another embodiment of the present application;
  • Figure 11 is a flowchart of performing data reception provided by another embodiment of the present application;
  • Figure 12 is a flowchart of synchronizing second address information provided by another embodiment of the present application;
  • Figure 13 is a flowchart of Example 1 provided by another embodiment of the present application;
  • Figure 14 is a schematic diagram of a send buffer provided by another embodiment of the present application;
  • Figure 15 is a flowchart of Example 2 provided by another embodiment of the present application;
  • Figure 16 is a schematic diagram of a receive buffer provided by another embodiment of the present application;
  • Figure 17 is a structural diagram of an electronic device provided by another embodiment of the present application.
  • The present application provides a high-concurrency protocol stack offloading method, device, and storage medium based on host-side large-capacity memory.
  • The protocol stack offloading method includes: acquiring data to be sent, and determining first data amount information of the data to be sent; buffering the data to be sent into a send buffer in shared memory; and sending the first data amount information to the TOE hardware, so that the TOE hardware acquires the data to be sent from the shared memory according to the first data amount information and performs TOE offloading on the data to be sent.
  • In this way the shared memory can be used as the buffer space for the data to be sent; compared with the DDR memory of the TOE hardware, the storage capacity is greatly increased, which improves the offloading capability of the protocol stack in high-concurrency scenarios and thereby improves network performance.
  • Figure 1 shows a high-concurrency protocol stack offloading method based on host-side large-capacity memory provided by an embodiment of the present application, applied to the central processing unit CPU on the host side; the transceiver buffers of the TOE hardware logic are offloaded into the RAM memory space on the host side, and the CPU communicates with the TOE hardware. The method includes, but is not limited to, step S110, step S120, and step S130.
  • Step S110: acquire data to be sent, and determine first data amount information of the data to be sent.
  • The data to be sent can be obtained from any application in the application layer of the host server generating a network link or a network packet; this embodiment does not limit the specific source or type of the network link or packet, as long as it can be sent through TOE offloading.
  • The first data amount information may be the number of bytes of data of each network link. After a network link is received, the first data amount information is determined for that link, and the host memory is used as shared memory, so that a corresponding buffer space is allocated in host memory for each network link; under high concurrency this buffers data in an orderly fashion and improves the efficiency with which the TOE hardware locates data when fetching it.
  • Step S120: buffer the data to be sent into the send buffer of the shared memory.
  • For the host server, storage space is divided into internal memory and external storage; external storage includes the DDR memory of the TOE hardware, whose small capacity, high cost, and low performance make it easy to run out of space under highly concurrent traffic, hurting network performance.
  • The memory of the host server, such as random access memory (RAM), usually has a large storage space and can hold far more buffered data than the DDR memory of the TOE hardware. Using host memory as the data buffer therefore physically provides a large-capacity buffer space and a storage foundation for high-concurrency scenarios.
  • Memory generally includes volatile memory and non-volatile memory, and the storage capacity of non-volatile memory is usually larger. Common volatile memory includes dynamic random access memory (DRAM); common non-volatile memory includes the non-volatile dual in-line memory module (NVDIMM). For TOE offloading, even a high-concurrency scenario is a process in which data acquisition and data sending proceed synchronously, so the buffered data does not need to be stored permanently; either volatile or non-volatile memory can therefore be used in this embodiment, selected according to actual needs.
  • The send buffer can take the form of a buffer queue to ensure that the data to be sent is processed in the order of acquisition. The CPU can access RAM directly and thus perform read and write operations directly, while the TOE hardware can access the RAM data through direct memory access (DMA); this embodiment does not involve improvements to the specific data access mechanism, so it is not elaborated here.
  • Step S130: send the first data amount information to the TOE hardware, so that the TOE hardware acquires the data to be sent from the shared memory according to the first data amount information and performs TOE offloading on the data to be sent.
  • The CPU and the TOE hardware can communicate through the hardware descriptor queue, for example via messages (msg) in the queue of the network interface card (NIC), to exchange information.
  • The first data amount information can be pushed into the hardware descriptor queue in the form of a send instruction; for example, after the CPU receives the data to be sent, the agent layer software constructs a send instruction carrying the first data amount information, so that the TOE hardware, after fetching the send instruction by polling the hardware descriptor queue, obtains the first data amount information.
  • In this way the TOE hardware is informed of the amount of data to be sent.
  • The TOE hardware can then fetch the corresponding amount of data from the send buffer according to the first data amount information, ensuring that all the data of each network link is obtained.
  • The length of data the TOE hardware fetches each time can be arbitrary and determined by its actual processing capability, until all the data in the send buffer has been fetched and sent.
  • Once all the data in the send buffer has been sent, the agent layer software can send a completion notification to the application layer, so that the application that generated the network link knows the data has been sent and can perform subsequent operations.
  • In the server structure shown in Figure 2, the software part can be divided into an application layer and an agent layer, where the application layer includes multiple applications 210. The agent layer and the driver layer can run as one network service while providing a LIB library file; application software calls the socket interface provided by the library functions to access the network service, avoiding the overhead of system calls and multiple data copies, effectively shortening the packet send/receive path, and achieving efficient packet transmission and reception.
  • The driver layer implements all interaction interfaces between the applications and the hardware, for example including the basic driver software 231; it also provides the upper-layer software with receive-agent and send-agent interfaces for msg messages, and sends and receives messages through the hardware descriptor queue.
  • The agent layer implements the receive and send agent services and maintains and manages the transceiver buffer 221, where the transceiver buffer 221 includes a send buffer 222 and a receive buffer 223; it can also provide a POSIX-compatible socket interface to the application layer, support multiple processes accessing the network service, implement network management interfaces, and provide ifconfig- and ethtool-like tools for configuring and managing the network protocol stack.
  • The hardware layer implements a complete network protocol stack, such as the network hardware 232 shown in the figure, including an Ethernet card, an IP protocol stack for IP protocol processing, a TCP protocol stack for TCP protocol processing, link management, port management, routing tables, and so on; specific hardware can be added or removed according to actual needs, which is not limited here.
  • The hardware layer is communicatively connected to the transport network: it can send the data of a network link to the transport network and can also obtain network packets from the transport network.
  • Referring to Figure 3, step S130 of the embodiment shown in Figure 1 further includes, but is not limited to, the following steps:
  • Step S310: acquire first address information, the first address information describing the region of the send buffer in which data is buffered;
  • Step S320: send the first data amount information and the first address information to the TOE hardware, so that the TOE hardware acquires the data to be sent from the send buffer according to the first address information and the first data amount information, and performs TOE offloading on the data to be sent.
  • Since the send buffer can be a buffer queue in RAM, the first address information can be the pointer information of the buffer queue. In the send buffer shown in Figure 14, for example, the pointers move from the start position toward the end position: after the data to be sent of the first network link begins to be written, the head pointer slides toward end, forming the buffer space.
  • After data has been fetched by the TOE hardware and TOE offloading has completed, the tail pointer slides toward end, releasing usable buffer space so that the space in RAM can be reused, improving space utilization.
  • The position of the head pointer is where writing starts, and the position of the tail pointer is where reading starts. Therefore, once the first address information is obtained, the TOE hardware can determine the storage location of each network link's data to be sent from the tail pointer position and each link's first data amount information.
  • After step S320 of the embodiment shown in Figure 3 is executed, the following steps are further included but not limited to:
  • Step S410: acquire second data amount information fed back by the TOE hardware, the second data amount information representing the amount of sent data newly added by the TOE hardware through performing TOE offloading;
  • Step S420: update the first address information according to the second data amount information, and synchronize the updated first address information to the TOE hardware.
  • The TOE hardware can fetch any amount of data and perform TOE offloading, as long as the amount fetched each time does not exceed the total amount of data buffered in the send buffer queue.
  • To keep the software's and the hardware's information about the send buffer queue consistent, after each TOE offload completes, the sliding-window information in the TOE hardware and the agent layer can be updated according to the amount of data completed; that is, the first address information is updated by moving the tail pointer, and the distance the tail pointer moves equals the number of bytes carried in the second data amount information, which prevents the TOE hardware and the CPU from accessing the same memory.
  • After step S130 of the embodiment shown in Figure 1 is executed, the following steps are further included but not limited to:
  • Step S510: acquire third data amount information sent by the TOE hardware, the third data amount information representing the amount of data to be received obtained by the TOE hardware through TOE offloading;
  • Step S520: acquire the data to be received from the receive buffer of the shared memory according to the third data amount information, wherein the data to be received is buffered into the receive buffer by the TOE hardware;
  • Step S530: complete the reception processing of the data to be received.
  • Besides sending data, TOE offloading can also be used to obtain network packet data from the network. After packet data is obtained, it must be received by an application in the application layer, and reception may likewise be highly concurrent; therefore, similarly to sending, the data to be received can be buffered into RAM, and the CPU fetches it from RAM for reception processing.
  • For the specific interaction, refer to the description of the embodiment shown in Figure 1, which is not repeated here.
  • The send buffer and the receive buffer of the shared memory can be different queues, preventing the TOE hardware and the CPU from accessing the same memory in different processes.
  • The third data amount information can also be a byte-count length, pushed into the hardware descriptor queue as an instruction and reported to the agent layer, which is not elaborated here.
  • Step S520 of the embodiment shown in Figure 5 further includes, but is not limited to, the following steps:
  • Step S610: determine second address information according to the third data amount information, wherein the second address information describes the region of the receive buffer in which data is buffered;
  • Step S620: acquire the data to be received from the receive buffer region according to the third data amount information and the second address information.
  • Unlike the sending flow, the TOE hardware writes the data to be received directly into RAM. Since the sliding window of the RAM is maintained by the agent layer software, the TOE hardware cannot maintain the receive buffer's pointers itself; the agent layer software therefore needs to determine the second address information, i.e. the head and tail pointer positions of the receive buffer, from the third data amount, and maintain them in real time.
  • After step S530 of the embodiment shown in Figure 5 is executed, the following steps are further included but not limited to:
  • Step S710: determine fourth data amount information, the fourth data amount information representing the amount of processed data newly added through reception processing;
  • Step S720: update the second address information according to the fourth data amount information;
  • Step S730: send the updated second address information to the TOE hardware.
  • Updating the second address information from the fourth data amount information works like updating the first address information from the second data amount information shown in Figure 4: the tail pointer of the receive buffer queue slides, and the sliding distance equals the number of bytes carried in the fourth data amount information.
  • Since the TOE hardware writes data just ahead of sliding the receive buffer's head pointer, and after the write completes it triggers the agent layer via an instruction to maintain the head pointer position, synchronizing the second address information to the TOE hardware after updating it both prevents the TOE hardware and the CPU from accessing the same memory and ensures that the position at which the TOE hardware writes newly acquired data is accurate.
  • An embodiment of the present application further provides a high-concurrency protocol stack offloading method based on host-side large-capacity memory, applied to TOE hardware, in which the transceiver buffers of the TOE hardware logic are offloaded into the RAM memory space on the host side and the TOE hardware communicates with the central processing unit CPU on the host side. The method includes, but is not limited to, the following steps:
  • Step S810: acquire the first data amount information sent by the CPU;
  • Step S820: acquire the data to be sent from the send buffer of the shared memory according to the first data amount information, wherein the data to be sent is acquired by the CPU and buffered into the send buffer;
  • Step S830: perform TOE offloading on the data to be sent.
  • Step S820 of the embodiment shown in Figure 8 further includes, but is not limited to, the following steps:
  • Step S910: acquire the first address information sent by the CPU, wherein the first address information describes the region of the send buffer in which data is buffered;
  • Step S920: acquire the data to be sent from the send buffer according to the first address information and the first data amount information.
  • After step S830 of the embodiment shown in Figure 8 is executed, the following steps are further included but not limited to:
  • Step S1010: determine second data amount information, the second data amount information representing the amount of sent data newly added by performing TOE offloading;
  • Step S1020: feed the second data amount information back to the CPU, so that the CPU updates the first address information according to the second data amount information;
  • Step S1030: acquire the updated first address information sent by the CPU.
  • After step S830 of the embodiment shown in Figure 8 is executed, the following steps are further included but not limited to:
  • Step S1110: obtain the data to be received through TOE offloading, and determine the third data amount information of the data to be received;
  • Step S1120: buffer the data to be received into the receive buffer of the shared memory;
  • Step S1130: send the third data amount information to the CPU, so that the CPU acquires the data to be received from the receive buffer according to the third data amount information and performs reception processing of the data to be received.
  • After step S830 of the embodiment shown in Figure 11 is executed, the following steps are further included but not limited to:
  • Step S1210: acquire the updated second address information sent by the CPU, wherein the second address information is determined by the CPU according to the third data amount information and updated according to the fourth data amount information, the fourth data amount information representing the amount of processed data newly added through the CPU's reception processing.
  • The technical solution of the present application is illustrated here by taking the TCP protocol stack as an example, through the two specific flows of data sending and data receiving.
  • The software/hardware architecture may be the architecture shown in Figure 2, which is not repeated here.
  • Example 1: data sending flow.
  • Step S1310: the application sends data of length L; the send buffer head pointer p_app_send slides L bytes to the right while ensuring that p_acked is not overwritten, rolling back if it reaches the boundary. After the head pointer slides successfully, the data is copied into the newly allocated buffer space and a TxSend instruction is sent to the agent layer;
  • Step S1320: the agent layer builds a TX send-request message and pushes a TX request instruction into the NIC's hardware descriptor queue;
  • Step S1330: the NIC polls the TX send instruction received in the hardware descriptor queue, computes the corresponding send buffer address from the content of the TX send instruction, triggers the TCP protocol stack to execute the send flow, and sends that link's send buffer data to the network;
  • Step S1340: after the NIC receives the ack confirmation message, it constructs a send-completion message TxComp and pushes it to the agent layer through the descriptor queue;
  • Step S1350: the agent layer receives the send-completion message TxComp, updates the sliding window of the TCP link's send queue, and slides the p_acked pointer X bytes to the right, ensuring that p_acked does not pass the p_app_send pointer;
  • Step S1360: once all the data in the send buffer has been sent, the agent layer sends a completion notification to the application layer, informing it that the data was sent successfully.
  • Figure 14 is a schematic diagram of the send buffer. In the send buffer, the head pointer p_app_send slides from the queue's start toward end, forming by its movement the buffer space used to hold the data to be sent; after the data to be sent has been sent, the tail pointer p_acked likewise slides from start toward end, so the region between them holds the buffered data to be sent.
  • The head and tail pointers keep sliding simultaneously in high-concurrency scenarios to acquire and send data, and roll back to start after reaching end; this recycles the RAM capacity and helps improve buffering capability in high-concurrency scenarios.
  • Example 2: data receiving flow.
  • Step S1510: the NIC receives a packet from the network, obtains L bytes of application data after processing by the protocol stack module, and calls the DMA interface to write the L bytes of data at the p_toe_rx position, ensuring that the p_app_read pointer is not overwritten;
  • Step S1520: the NIC sends an RX receive instruction to the agent layer through the descriptor queue;
  • Step S1530: the agent layer receives the RX instruction, updates the receive buffer sliding window of the specified link according to the content of the RX instruction, slides the head pointer p_toe_rx L bytes to the right, and at the same time notifies the application that the link has data to be read;
  • Step S1540: the application calls the socket interface to read X bytes of data; the tail pointer p_app_read slides X bytes to the right, ensuring that p_app_read does not pass p_toe_rx, and a reception-completion message RxComp is sent to the hardware layer at the same time;
  • Step S1550: the NIC receives the RxComp message, updates the buffer state of the corresponding link, and slides the tail pointer p_app_read X bytes to the right, completing the synchronization of the software and hardware sliding-window state.
  • Figure 16 is a schematic diagram of the receive buffer. In the receive buffer, the head pointer p_toe_rx slides from the queue's start toward end, forming by its movement the buffer space used to hold the data to be received; after the data to be received has been received, the tail pointer p_app_read likewise slides from start toward end, so the region between them holds the buffered data to be received.
  • The head and tail pointers keep sliding simultaneously in high-concurrency scenarios to acquire and receive data, and roll back to start after reaching end; this recycles the RAM capacity and helps improve buffering capability in high-concurrency scenarios.
  • An embodiment of the present application also provides an electronic device.
  • The electronic device 1700 includes: a memory 1710, a processor 1720, and a computer program stored in the memory 1710 and runnable on the processor 1720.
  • The processor 1720 and the memory 1710 may be connected through a bus or in other ways.
  • The non-transitory software programs and instructions required to implement the high-concurrency protocol stack offloading method based on host-side large-capacity memory of the above embodiments are stored in the memory 1710; when executed by the processor 1720, they perform the method of the above embodiments applied to the host-side central processing unit CPU, for example method steps S110 to S130 in Figure 1, steps S310 to S320 in Figure 3, steps S410 to S420 in Figure 4, steps S510 to S530 in Figure 5, steps S610 to S620 in Figure 6, or steps S710 to S730 in Figure 7.
  • The device embodiments described above are merely illustrative; units described as separate components may or may not be physically separate, i.e. they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of this embodiment's solution.
  • An embodiment of the present application also provides a computer-readable storage medium storing computer-executable instructions. When executed by a processor or controller, for example by a processor in the above electronic device embodiment, the instructions can cause the processor to perform the high-concurrency protocol stack offloading method based on host-side large-capacity memory of the above embodiments, for example method steps S110 to S130 in Figure 1, steps S310 to S320 in Figure 3, steps S410 to S420 in Figure 4, steps S510 to S530 in Figure 5, steps S610 to S620 in Figure 6, or steps S710 to S730 in Figure 7; or to perform the method of the above embodiments applied to the TOE hardware, for example method steps S810 to S830 in Figure 8, steps S910 to S920 in Figure 9, steps S1010 to S1030 in Figure 10, steps S1110 to S1130 in Figure 11, or step S1210 in Figure 12.
  • Computer storage media include, but are not limited to, shared memory, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • Communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
  • The embodiments of the present application include: acquiring data to be sent, and determining first data amount information of the data to be sent; buffering the data to be sent into a send buffer in shared memory; and sending the first data amount information to the TOE hardware, so that the TOE hardware acquires the data to be sent from the shared memory according to the first data amount information and performs TOE offloading on the data to be sent.
  • In this way the shared memory can be used as the buffer space for the data to be sent; compared with the DDR memory of the TOE hardware, the storage capacity is greatly increased, which improves the offloading capability of the protocol stack in high-concurrency scenarios and thereby improves network performance.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer And Data Communications (AREA)

Abstract

A high-concurrency protocol stack offloading method, device, and medium based on host-side large-capacity memory. The method includes: acquiring data to be sent, and determining first data amount information of the data to be sent (S110); buffering the data to be sent into a send buffer in shared memory (S120); and sending the first data amount information to the TOE hardware, so that the TOE hardware acquires the data to be sent from the shared memory according to the first data amount information and performs TOE offloading on the data to be sent (S130).

Description

High-concurrency protocol stack offloading method, device, and medium based on host-side large-capacity memory
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is based on, and claims priority to, Chinese patent application No. 202110527515.7 filed on May 14, 2021, the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
The present application relates to, but is not limited to, the field of network transmission technology, and in particular to a high-concurrency protocol stack offloading method, device, and medium based on host-side large-capacity memory.
BACKGROUND
With the rapid development of network technology, ever faster Ethernet is becoming widespread, and the overhead of the Transmission Control Protocol (TCP) stack implemented by the central processing unit (CPU) keeps growing; software overhead and buffering latency have become the technical bottleneck of high-performance network services. To free up CPU resources, the TCP Offload Engine (TOE) technique emerged, which offloads the protocol stack to a field-programmable gate array (FPGA) hardware device for processing.
When many network links are concurrent, the buffer queues consume a large amount of memory. In common TOE implementations, the running space is usually provided by the mounted Double Data Rate (DDR) memory, and the hardware characteristics of DDR memory limit how far that space can be expanded, making it hard to meet the capacity demands of high-concurrency networking scenarios and affecting buffering latency and network performance.
SUMMARY
The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the scope of protection of the claims.
Embodiments of the present application provide a high-concurrency protocol stack offloading method, device, and medium based on host-side large-capacity memory.
In a first aspect, an embodiment of the present application provides a high-concurrency protocol stack offloading method based on host-side large-capacity memory, applied to the central processing unit CPU on the host side, in which the transceiver buffers of the TOE hardware logic are offloaded into the RAM memory space on the host side and the CPU is communicatively connected to the TOE hardware. The method includes: acquiring data to be sent, and determining first data amount information of the data to be sent; buffering the data to be sent into a send buffer in shared memory; and sending the first data amount information to the TOE hardware, so that the TOE hardware acquires the data to be sent from the shared memory according to the first data amount information and performs TOE offloading on the data to be sent.
In a second aspect, an embodiment of the present application provides a high-concurrency protocol stack offloading method based on host-side large-capacity memory, applied to TOE hardware, in which the transceiver buffers of the TOE hardware logic are offloaded into the RAM memory space on the host side and the TOE hardware is communicatively connected to the central processing unit CPU on the host side. The method includes: acquiring first data amount information sent by the CPU; acquiring data to be sent from a send buffer in shared memory according to the first data amount information, wherein the data to be sent is acquired by the CPU and buffered into the send buffer; and performing TOE offloading on the data to be sent.
In a third aspect, an embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor; when the processor executes the computer program, it implements the high-concurrency protocol stack offloading method based on host-side large-capacity memory of the first aspect, or, when executing the computer program, implements the method of the second aspect.
Other features and advantages of the present application will be set forth in the description that follows and will in part become apparent from the description or be learned by practicing the present application. The objectives and other advantages of the present application can be realized and obtained through the structures particularly pointed out in the description, the claims, and the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings are provided for a further understanding of the technical solution of the present application and constitute a part of the specification. Together with the embodiments of the present application, they serve to explain the technical solution and do not limit it.
Figure 1 is a flowchart of a high-concurrency protocol stack offloading method based on host-side large-capacity memory provided by an embodiment of the present application;
Figure 2 is a schematic diagram of a server structure provided by another embodiment of the present application;
Figure 3 is a flowchart of acquiring data to be sent provided by another embodiment of the present application;
Figure 4 is a flowchart of synchronizing first address information provided by another embodiment of the present application;
Figure 5 is a flowchart of performing data reception provided by another embodiment of the present application;
Figure 6 is a flowchart of acquiring data to be received provided by another embodiment of the present application;
Figure 7 is a flowchart of synchronizing second address information provided by another embodiment of the present application;
Figure 8 is a flowchart of a high-concurrency protocol stack offloading method based on host-side large-capacity memory provided by another embodiment of the present application;
Figure 9 is a flowchart of acquiring data to be sent provided by another embodiment of the present application;
Figure 10 is a flowchart of synchronizing first address information provided by another embodiment of the present application;
Figure 11 is a flowchart of performing data reception provided by another embodiment of the present application;
Figure 12 is a flowchart of synchronizing second address information provided by another embodiment of the present application;
Figure 13 is a flowchart of Example 1 provided by another embodiment of the present application;
Figure 14 is a schematic diagram of a send buffer provided by another embodiment of the present application;
Figure 15 is a flowchart of Example 2 provided by another embodiment of the present application;
Figure 16 is a schematic diagram of a receive buffer provided by another embodiment of the present application;
Figure 17 is a structural diagram of an electronic device provided by another embodiment of the present application.
DETAILED DESCRIPTION
To make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present application and are not intended to limit it.
It should be noted that although functional modules are divided in the device diagrams and a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed with a module division different from that in the device, or in an order different from that in the flowcharts. The terms "first", "second", and the like in the specification, the claims, and the drawings are used to distinguish similar objects and are not necessarily used to describe a particular sequence or chronological order.
The present application provides a high-concurrency protocol stack offloading method, device, and storage medium based on host-side large-capacity memory. The protocol stack offloading method includes: acquiring data to be sent, and determining first data amount information of the data to be sent; buffering the data to be sent into a send buffer in shared memory; and sending the first data amount information to the TOE hardware, so that the TOE hardware acquires the data to be sent from the shared memory according to the first data amount information and performs TOE offloading on the data to be sent. With the scheme provided by the embodiments of the present application, shared memory can be used as the buffer space for the data to be sent; compared with the DDR memory of the TOE hardware, this greatly increases the storage capacity and improves the offloading capability of the protocol stack in high-concurrency scenarios, thereby improving network performance.
The embodiments of the present application are further described below with reference to the accompanying drawings.
As shown in Figure 1, Figure 1 shows a high-concurrency protocol stack offloading method based on host-side large-capacity memory provided by an embodiment of the present application. It is applied to the central processing unit CPU on the host side; the transceiver buffers of the TOE hardware logic are offloaded into the RAM memory space on the host side, and the CPU is communicatively connected to the TOE hardware. The method includes, but is not limited to, step S110, step S120, and step S130.
Step S110: acquire data to be sent, and determine first data amount information of the data to be sent.
It should be noted that the data to be sent may be obtained from any application in the application layer of the host server generating a network link or a network packet; this embodiment does not limit the specific source or type of the network link or packet, as long as it can be sent through TOE offloading.
It should be noted that the first data amount information may be the number of bytes of data of each network link. After a network link is received, the first data amount information is determined for that link, and the host memory is used as shared memory, so that a corresponding buffer space is allocated in host memory for each network link. Under high concurrency this allows data to be buffered in an orderly fashion and improves the efficiency with which the TOE hardware locates data when fetching it.
Step S120: buffer the data to be sent into the send buffer of the shared memory.
It is worth noting that, for the host server, storage space is divided into internal memory and external storage; external storage includes the DDR memory of the TOE hardware, whose drawbacks are small capacity, high cost, and low performance, so under highly concurrent traffic it easily runs out of space, hurting network performance. The memory of the host server, by contrast, usually has a large storage space, such as random access memory (RAM), and can hold far more buffered data than the DDR memory of the TOE hardware. Using host memory as the data buffer therefore physically provides a large-capacity buffer space and a storage foundation for high-concurrency scenarios.
Those skilled in the art will understand that memory generally includes volatile memory and non-volatile memory, with non-volatile memory usually having the larger capacity. Common volatile memory includes dynamic random access memory (DRAM); common non-volatile memory includes the non-volatile dual in-line memory module (NVDIMM). For TOE offloading, even a high-concurrency scenario is a process in which acquiring data and sending data proceed synchronously, so the buffered data does not need to be kept permanently; either volatile or non-volatile memory may therefore be used in this embodiment, selected according to actual needs.
It should be noted that, in the shared memory, the send buffer may take the form of a buffer queue to ensure that the data to be sent is processed in the order in which it was acquired. Understandably, the CPU can access RAM directly and thus perform read and write operations directly, while the TOE hardware can access the RAM data through direct memory access (DMA); this embodiment does not involve improvements to the specific data access mechanism, so it is not elaborated further here.
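The following is a minimal sketch of what such a per-link send buffer queue in host RAM could look like, assuming a fixed power-of-two ring; the names send_ring, RING_SIZE, and the helper functions are illustrative assumptions, not the patent's actual definitions, though the pointer names follow p_app_send and p_acked from Example 1 below:

```c
#include <stdint.h>

#define RING_SIZE (1u << 20)   /* hypothetical 1 MiB of shared RAM per link */

struct send_ring {
    uint8_t  buf[RING_SIZE];   /* buffer space carved out of host RAM       */
    uint32_t p_app_send;       /* head: next write offset, moved by the CPU */
    uint32_t p_acked;          /* tail: first unacked byte, moved on TxComp */
};

/* bytes buffered but not yet acknowledged */
static inline uint32_t ring_used(const struct send_ring *r)
{
    return (r->p_app_send - r->p_acked) & (RING_SIZE - 1);
}

/* free space; one byte stays unused so the head never catches the tail */
static inline uint32_t ring_free(const struct send_ring *r)
{
    return RING_SIZE - 1 - ring_used(r);
}
```

Keeping one byte unused is a common convention so that a full ring and an empty ring are distinguishable from the two offsets alone; the patent itself only requires that the head and tail never overlap.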
Step S130: send the first data amount information to the TOE hardware, so that the TOE hardware acquires the data to be sent from the shared memory according to the first data amount information and performs TOE offloading on the data to be sent.
It should be noted that the CPU and the TOE hardware can communicate through a hardware descriptor queue, for example via messages (msg) in the queue of the network interface card (NIC), to exchange information.
Understandably, the first data amount information can be pushed into the hardware descriptor queue in the form of a send instruction. For example, after the CPU receives the data to be sent, the agent layer software constructs a send instruction carrying the first data amount information, so that the TOE hardware, after fetching the send instruction by polling the hardware descriptor queue, can obtain the first data amount information. Of course, other forms can be adopted according to actual needs, as long as the TOE hardware is informed of the amount of data to be sent.
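As an illustration only, such a descriptor-queue message might carry a type, a link identifier, and the byte count; the field layout below is an assumption made for the sketch, not the patent's actual message format:

```c
#include <stdint.h>

enum toe_msg_type { TX_SEND, TX_COMP, RX_RECV, RX_COMP };

struct toe_msg {
    uint16_t type;      /* one of toe_msg_type                              */
    uint16_t link_id;   /* which TCP link the instruction refers to        */
    uint32_t nbytes;    /* first/second/third/fourth data amount, in bytes */
};

/* single-producer push into a polled descriptor ring; `slots`, `head`, and
 * `mask` stand in for the real NIC doorbell mechanism */
static void queue_push(struct toe_msg *slots, uint32_t *head, uint32_t mask,
                       struct toe_msg m)
{
    slots[(*head)++ & mask] = m;   /* the TOE hardware polls for new entries */
}
```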
Understandably, after obtaining the first data amount information, the TOE hardware can fetch the corresponding amount of data from the send buffer according to that information, ensuring that all the data of each network link is obtained. The length of data the TOE hardware fetches at a time can be arbitrary and determined by its actual processing capability, until all the data in the send buffer has been fetched and sent.
Understandably, when it is detected that all the data in the send buffer has been sent, the agent layer software can send a completion notification to the application layer, so that the application that generated the network link knows the data has been sent and can perform subsequent operations.
For example, in the server structure shown in Figure 2, the software part can be divided into an application layer and an agent layer, where the application layer includes multiple applications 210. The agent layer and the driver layer can run as a single network service while providing a LIB library file; application software calls the socket interface provided by the library functions to access the network service, avoiding the overhead of system calls and multiple data copies, effectively shortening the packet send/receive path, and achieving efficient packet transmission and reception.
The driver layer implements all interaction interfaces between the applications and the hardware, for example including the basic driver software 231; it also provides the upper-layer software with receive-agent and send-agent interfaces for msg messages, and sends and receives messages through the hardware descriptor queue.
The agent layer implements the receive and send agent services and maintains and manages the transceiver buffer 221, where the transceiver buffer 221 includes the send buffer 222 and the receive buffer 223. It can also provide a POSIX-compatible socket interface to the application layer, support multiple processes accessing the network service, implement network management interfaces, and provide ifconfig- and ethtool-like tools for configuring and managing the network protocol stack.
The hardware layer implements a complete network protocol stack, such as the network hardware 232 shown in the figure, including an Ethernet card, an IP protocol stack for IP protocol processing, a TCP protocol stack for TCP protocol processing, link management, port management, routing tables, and so on; specific hardware can be added or removed according to actual needs, which is not limited here. The hardware layer is also communicatively connected to the transport network: it can send the data of a network link to the transport network and can also obtain network packets from the transport network.
In addition, referring to Figure 3, in an embodiment, step S130 of the embodiment shown in Figure 1 further includes, but is not limited to, the following steps:
Step S310: acquire first address information, the first address information describing the region of the send buffer in which data is buffered;
Step S320: send the first data amount information and the first address information to the TOE hardware, so that the TOE hardware acquires the data to be sent from the send buffer according to the first address information and the first data amount information, and performs TOE offloading on the data to be sent.
It should be noted that, since the send buffer may be a buffer queue in RAM, the first address information may be the pointer information of that buffer queue. In the send buffer shown in Figure 14, for example, the pointers move from the start position toward the end position: once the data to be sent of the first network link begins to be written, the head pointer slides toward end, forming the buffer space; after the data has been fetched by the TOE hardware and TOE offloading has completed, the tail pointer slides toward end, releasing usable buffer space so that the space in RAM can be reused, improving space utilization.
Understandably, the position of the head pointer is where writing begins and the position of the tail pointer is where reading begins. Therefore, once the first address information has been obtained, the TOE hardware can determine the storage location of each network link's data to be sent from the tail pointer position and each link's first data amount information.
It should be noted that while the pointers slide, the head pointer and the tail pointer must not overlap each other, to prevent the TOE hardware and the CPU from accessing the same memory at the same time.
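Purely as an illustration of the pointer arithmetic involved, the sketch below computes the region or regions a reader would fetch given the tail offset and a link's first data amount; locate_pending and its two-region result are assumptions for the sketch (a ring whose data crosses the end boundary yields at most two contiguous spans), not the patent's actual logic:

```c
#include <stdint.h>

struct region { uint32_t off; uint32_t len; };

/* Given tail offset `tail` and `nbytes` of buffered data in a ring of
 * `ring_size` bytes, return the 1 or 2 contiguous regions to read. */
static int locate_pending(uint32_t tail, uint32_t nbytes, uint32_t ring_size,
                          struct region out[2])
{
    uint32_t before_wrap = ring_size - tail;   /* room up to `end` */
    if (nbytes <= before_wrap) {
        out[0] = (struct region){ tail, nbytes };
        return 1;                              /* one contiguous region */
    }
    out[0] = (struct region){ tail, before_wrap };
    out[1] = (struct region){ 0, nbytes - before_wrap };
    return 2;                                  /* data rolls back to start */
}
```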
In addition, referring to Figure 4, in an embodiment, after step S320 of the embodiment shown in Figure 3 has been executed, the following steps are further included but not limited to:
Step S410: acquire second data amount information fed back by the TOE hardware, the second data amount information representing the amount of sent data newly added by the TOE hardware through performing TOE offloading;
Step S420: update the first address information according to the second data amount information, and synchronize the updated first address information to the TOE hardware.
It should be noted that the TOE hardware can fetch any amount of data and perform TOE offloading, as long as the amount fetched each time does not exceed the total amount of data buffered in the send buffer queue. To keep the software's and the hardware's information about the send buffer queue consistent, each time a TOE offload completes, the sliding-window information in the TOE hardware and the agent layer can be updated according to the amount of data completed; that is, the first address information is updated by moving the tail pointer, and the distance the tail pointer moves equals the number of bytes carried in the second data amount information, which prevents the TOE hardware and the CPU from accessing the same memory.
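A minimal sketch of this update on the CPU side, assuming the masked-offset ring convention of the earlier sketches; sync_window_to_toe is a hypothetical placeholder for the real descriptor-queue synchronization message:

```c
#include <stdint.h>
#include <assert.h>

/* Advance the tail (p_acked) by the acked byte count carried in the second
 * data amount information, then mirror the new window to the TOE hardware. */
static void on_tx_comp(uint32_t *p_acked, uint32_t p_app_send,
                       uint32_t acked_bytes, uint32_t ring_mask)
{
    uint32_t used = (p_app_send - *p_acked) & ring_mask;
    assert(acked_bytes <= used);          /* tail must never pass the head */
    *p_acked = (*p_acked + acked_bytes) & ring_mask;
    /* sync_window_to_toe(*p_acked);  -- hypothetical descriptor-queue sync */
}
```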
In addition, referring to Figure 5, in an embodiment, after step S130 of the embodiment shown in Figure 1 has been executed, the following steps are further included but not limited to:
Step S510: acquire third data amount information sent by the TOE hardware, the third data amount information representing the amount of data to be received that the TOE hardware obtained through TOE offloading;
Step S520: acquire the data to be received from the receive buffer in shared memory according to the third data amount information, wherein the data to be received is buffered into the receive buffer by the TOE hardware;
Step S530: complete the reception processing of the data to be received.
It should be noted that, besides sending data, TOE offloading can also be used to obtain network packet data from the network. After packet data is obtained, it must be received by an application in the application layer, and reception may likewise be highly concurrent. Therefore, similarly to sending, the data to be received can be buffered into RAM, and the CPU fetches it from RAM for reception processing; for the specific interaction, refer to the description of the embodiment shown in Figure 1, which is not repeated here.
Understandably, the send buffer and the receive buffer of the shared memory can be different queues, preventing the TOE hardware and the CPU from accessing the same memory in different processes.
Understandably, the third data amount information can also be a byte-count length, pushed into the hardware descriptor queue as an instruction and reported to the agent layer, which is not elaborated here.
In addition, referring to Figure 6, in an embodiment, step S520 of the embodiment shown in Figure 5 further includes, but is not limited to, the following steps:
Step S610: determine second address information according to the third data amount information, wherein the second address information describes the region of the receive buffer in which data is buffered;
Step S620: acquire the data to be received from the receive buffer region according to the third data amount information and the second address information.
It should be noted that, unlike the sending flow, after a network packet is received the TOE hardware writes the data to be received directly into RAM. Since the sliding window of the RAM is maintained by the agent layer software, the TOE hardware cannot maintain the receive buffer's pointers itself; the agent layer software therefore needs to determine the second address information, i.e. the head and tail pointer positions of the receive buffer, from the third data amount, and maintain them in real time.
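A minimal sketch of that maintenance, again under the masked-offset convention; the names mirror the p_toe_rx and p_app_read pointers of Figure 16, and notify_app_readable is a hypothetical hook, not part of the patent:

```c
#include <stdint.h>

/* Agent-layer handler for an RX instruction carrying the third data amount:
 * the TOE has already DMA-written `nbytes` at the head, so slide the head. */
static int on_rx_instruction(uint32_t *p_toe_rx, uint32_t p_app_read,
                             uint32_t nbytes, uint32_t ring_mask)
{
    uint32_t used = (*p_toe_rx - p_app_read) & ring_mask;
    if (nbytes > ring_mask - used)
        return -1;            /* would overwrite unread data at p_app_read */
    *p_toe_rx = (*p_toe_rx + nbytes) & ring_mask;
    /* notify_app_readable(nbytes);  -- hypothetical wake-up of the reader */
    return 0;
}
```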
In addition, referring to Figure 7, in an embodiment, after step S530 of the embodiment shown in Figure 5 has been executed, the following steps are further included but not limited to:
Step S710: determine fourth data amount information, the fourth data amount information representing the amount of processed data newly added through reception processing;
Step S720: update the second address information according to the fourth data amount information;
Step S730: send the updated second address information to the TOE hardware.
It should be noted that updating the second address information from the fourth data amount information works like updating the first address information from the second data amount information shown in Figure 4: the tail pointer of the receive buffer queue slides, and the sliding distance equals the number of bytes carried in the fourth data amount information.
It should be noted that the TOE hardware writes data just ahead of sliding the receive buffer's head pointer, and after the write completes it triggers the agent layer, via an instruction, to maintain the head pointer position. Therefore, synchronizing the second address information to the TOE hardware after updating it both prevents the TOE hardware and the CPU from accessing the same memory and ensures that the position at which the TOE hardware writes newly acquired data is accurate.
In addition, referring to Figure 8, an embodiment of the present application further provides a high-concurrency protocol stack offloading method based on host-side large-capacity memory, applied to TOE hardware, in which the transceiver buffers of the TOE hardware logic are offloaded into the RAM memory space on the host side, and the TOE hardware is communicatively connected to the central processing unit CPU on the host side. The method includes, but is not limited to, the following steps:
Step S810: acquire the first data amount information sent by the CPU;
Step S820: acquire the data to be sent from the send buffer in shared memory according to the first data amount information, wherein the data to be sent is acquired by the CPU and buffered into the send buffer;
Step S830: perform TOE offloading on the data to be sent.
It should be noted that the technical solution and principle of this embodiment follow the embodiment shown in Figure 1; the main difference is that the executing entity here is the TOE hardware. For brevity, this is not repeated.
It can be understood that when multiple pieces of hardware are present, any one of them can trigger the process.
In addition, referring to Figure 9, in an embodiment, step S820 of the embodiment shown in Figure 8 further includes, but is not limited to, the following steps:
Step S910: acquire the first address information sent by the CPU, wherein the first address information describes the region of the send buffer in which data is buffered;
Step S920: acquire the data to be sent from the send buffer according to the first address information and the first data amount information.
It should be noted that the technical solution and principle of this embodiment follow the embodiment shown in Figure 3; the main difference is that the executing entity is the TOE hardware, acting as the fetching end of the data to be sent. It is otherwise similar to the embodiment shown in Figure 3 and, for brevity, is not repeated.
In addition, referring to Figure 10, in an embodiment, after step S830 of the embodiment shown in Figure 8 has been executed, the following steps are further included but not limited to:
Step S1010: determine second data amount information, the second data amount information representing the amount of sent data newly added by performing TOE offloading;
Step S1020: feed the second data amount information back to the CPU, so that the CPU updates the first address information according to the second data amount information;
Step S1030: acquire the updated first address information sent by the CPU.
It should be noted that the technical solution and principle of this embodiment follow the embodiment shown in Figure 4; the main difference is that the executing entity is the TOE hardware, acting as the receiving end of the first address information. It is otherwise similar in principle to the embodiment shown in Figure 4 and, for brevity, is not repeated.
In addition, referring to Figure 11, in an embodiment, after step S830 of the embodiment shown in Figure 8 has been executed, the following steps are further included but not limited to:
Step S1110: obtain data to be received through TOE offloading, and determine the third data amount information of the data to be received;
Step S1120: buffer the data to be received into the receive buffer in shared memory;
Step S1130: send the third data amount information to the CPU, so that the CPU acquires the data to be received from the receive buffer according to the third data amount information and performs reception processing of the data to be received.
It should be noted that the technical solution and principle of this embodiment follow the embodiment shown in Figure 5; the main difference is that the executing entity is the TOE hardware, acting as the acquiring end of the data. It is otherwise similar in principle to the embodiment shown in Figure 5 and, for brevity, is not repeated.
In addition, referring to Figure 12, in an embodiment, after step S830 of the embodiment shown in Figure 11 has been executed, the following steps are further included but not limited to:
Step S1210: acquire the updated second address information sent by the CPU, wherein the second address information is determined by the CPU according to the third data amount information and is updated according to the fourth data amount information, the fourth data amount information representing the amount of processed data newly added through the CPU's reception processing.
It should be noted that the technical solution and principle of this embodiment follow the embodiments shown in Figures 6 and 7; the main difference is that the executing entity is the TOE hardware, acting as the receiving end of the second address information. It is otherwise similar in principle to the embodiments shown in Figures 6 and 7 and, for brevity, is not repeated.
In addition, to describe the technical solution of the present application in more detail, the TCP protocol stack is taken as an example here, and the technical solution is illustrated through two specific flows: data sending and data receiving.
It should be noted that in both examples the software/hardware architecture may be the architecture shown in Figure 2, which is not repeated here.
Example 1: data sending flow.
Referring to Figure 13, the data sending flow of the TCP protocol stack includes, but is not limited to, the following steps:
Step S1310: the application sends data of length L; the send buffer head pointer p_app_send slides L bytes to the right while ensuring that p_acked is not overwritten, rolling back if it reaches the boundary. After the head pointer slides successfully, the data is copied into the newly allocated buffer space and a TxSend instruction is sent to the agent layer;
Step S1320: the agent layer builds a TX send-request message and pushes a TX request instruction into the NIC's hardware descriptor queue;
Step S1330: the NIC polls the TX send instruction received in the hardware descriptor queue, computes the corresponding send buffer address from the content of the TX send instruction, triggers the TCP protocol stack to execute the send flow, and sends that link's send buffer data to the network;
Step S1340: after the NIC receives the ack confirmation message, it constructs a send-completion message TxComp and pushes it to the agent layer through the descriptor queue;
Step S1350: the agent layer receives the send-completion message TxComp, updates the sliding window of the TCP link's send queue, and slides the p_acked pointer X bytes to the right, ensuring that p_acked does not pass the p_app_send pointer;
Step S1360: once all the data in the send buffer has been sent, the agent layer sends a completion notification to the application layer, informing it that the data was sent successfully.
It should be noted that, referring to Figure 14, Figure 14 is a schematic diagram of the send buffer. In the send buffer, the head pointer p_app_send slides from the queue's start toward end, forming by its movement the buffer space used to hold the data to be sent; after the data to be sent has been sent, the tail pointer p_acked likewise slides from start toward end, so that the region between them holds the buffered data to be sent. The head and tail pointers keep sliding simultaneously in high-concurrency scenarios to acquire and send data, and roll back to start after reaching end; this recycles the RAM capacity and helps improve buffering capability in high-concurrency scenarios.
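The sketch below walks through the application-side half of this flow (steps S1310 and the hand-off of S1320) under the illustrative ring layout assumed earlier, repeated here so the sketch stands alone; memcpy stands in for the copy into the newly allocated buffer space, the wrap is realized as a split copy (one possible reading of "roll back at the boundary"), and tx_send is a hypothetical stand-in for the TxSend/TX-request messages:

```c
#include <stdint.h>
#include <string.h>

#define RING_SIZE (1u << 20)            /* illustrative, as in earlier sketch */

struct send_ring {
    uint8_t  buf[RING_SIZE];
    uint32_t p_app_send;                /* head, advanced here (step S1310)  */
    uint32_t p_acked;                   /* tail, advanced on TxComp (S1350)  */
};

static int app_send(struct send_ring *r, const void *data, uint32_t len)
{
    uint32_t used = (r->p_app_send - r->p_acked) & (RING_SIZE - 1);
    if (len > RING_SIZE - 1 - used)
        return -1;                      /* would overwrite p_acked: back off */

    uint32_t head = r->p_app_send;
    uint32_t before_wrap = RING_SIZE - head;
    if (len <= before_wrap) {
        memcpy(r->buf + head, data, len);
    } else {                            /* roll back to start at the boundary */
        memcpy(r->buf + head, data, before_wrap);
        memcpy(r->buf, (const uint8_t *)data + before_wrap, len - before_wrap);
    }
    r->p_app_send = (head + len) & (RING_SIZE - 1);

    /* tx_send(link_id, len);  -- hypothetical TxSend to the agent layer,
     * which then pushes the TX request into the NIC descriptor queue */
    return 0;
}
```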
Example 2: data receiving flow.
Referring to Figure 15, the data receiving flow of the TCP protocol stack includes, but is not limited to, the following steps:
Step S1510: the NIC receives a packet from the network, obtains L bytes of application data after processing by the protocol stack module, and calls the DMA interface to write the L bytes of data at the p_toe_rx position, ensuring that the p_app_read pointer is not overwritten;
Step S1520: the NIC sends an RX receive instruction to the agent layer through the descriptor queue;
Step S1530: the agent layer receives the RX instruction, updates the receive buffer sliding window of the specified link according to the content of the RX instruction, slides the head pointer p_toe_rx L bytes to the right, and at the same time notifies the application that the link has data to be read;
Step S1540: the application calls the socket interface to read X bytes of data; the tail pointer p_app_read slides X bytes to the right, ensuring that p_app_read does not pass p_toe_rx, and a reception-completion message RxComp is sent to the hardware layer at the same time;
Step S1550: the NIC receives the RxComp message, updates the buffer state of the corresponding link, and slides the tail pointer p_app_read X bytes to the right, completing the synchronization of the software and hardware sliding-window state.
It should be noted that, referring to Figure 16, Figure 16 is a schematic diagram of the receive buffer. In the receive buffer, the head pointer p_toe_rx slides from the queue's start toward end, forming by its movement the buffer space used to hold the data to be received; after the data to be received has been received, the tail pointer p_app_read likewise slides from start toward end, so that the region between them holds the buffered data to be received. The head and tail pointers keep sliding simultaneously in high-concurrency scenarios to acquire and receive data, and roll back to start after reaching end; this recycles the RAM capacity and helps improve buffering capability in high-concurrency scenarios.
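For illustration, the sketch below shows the application-side read of step S1540 against the same masked-offset ring convention; rx_comp is a hypothetical placeholder for the RxComp message that lets the NIC mirror the window in step S1550:

```c
#include <stdint.h>
#include <string.h>

/* Copy up to `want` bytes out of the receive ring and slide p_app_read. */
static uint32_t app_read(uint8_t *ring, uint32_t ring_size,
                         uint32_t *p_app_read, uint32_t p_toe_rx,
                         void *out, uint32_t want)
{
    uint32_t avail = (p_toe_rx - *p_app_read) & (ring_size - 1);
    uint32_t n = want < avail ? want : avail;   /* never pass p_toe_rx */
    uint32_t tail = *p_app_read;
    uint32_t before_wrap = ring_size - tail;

    if (n <= before_wrap) {
        memcpy(out, ring + tail, n);
    } else {                                    /* wrapped data: two copies */
        memcpy(out, ring + tail, before_wrap);
        memcpy((uint8_t *)out + before_wrap, ring, n - before_wrap);
    }
    *p_app_read = (tail + n) & (ring_size - 1);
    /* rx_comp(link_id, n);  -- hypothetical RxComp so the NIC can slide its
     * copy of p_app_read (step S1550) */
    return n;
}
```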
In addition, referring to Figure 17, an embodiment of the present application further provides an electronic device. The electronic device 1700 includes: a memory 1710, a processor 1720, and a computer program stored in the memory 1710 and runnable on the processor 1720.
The processor 1720 and the memory 1710 may be connected by a bus or in other ways.
The non-transitory software programs and instructions required to implement the high-concurrency protocol stack offloading method based on host-side large-capacity memory of the above embodiments are stored in the memory 1710. When executed by the processor 1720, they perform the method of the above embodiments applied to the host-side central processing unit CPU, for example method steps S110 to S130 in Figure 1, steps S310 to S320 in Figure 3, steps S410 to S420 in Figure 4, steps S510 to S530 in Figure 5, steps S610 to S620 in Figure 6, or steps S710 to S730 in Figure 7; or they perform the method of the above embodiments applied to the TOE hardware, for example method steps S810 to S830 in Figure 8, steps S910 to S920 in Figure 9, steps S1010 to S1030 in Figure 10, steps S1110 to S1130 in Figure 11, or step S1210 in Figure 12.
The device embodiments described above are merely illustrative; units described as separate components may or may not be physically separate, i.e. they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of this embodiment's solution.
In addition, an embodiment of the present application further provides a computer-readable storage medium storing computer-executable instructions. When executed by a processor or controller, for example by a processor in the above electronic device embodiment, the computer-executable instructions can cause the processor to perform the high-concurrency protocol stack offloading method based on host-side large-capacity memory of the above embodiments, for example method steps S110 to S130 in Figure 1, steps S310 to S320 in Figure 3, steps S410 to S420 in Figure 4, steps S510 to S530 in Figure 5, steps S610 to S620 in Figure 6, or steps S710 to S730 in Figure 7; or to perform the method of the above embodiments applied to the TOE hardware, for example method steps S810 to S830 in Figure 8, steps S910 to S920 in Figure 9, steps S1010 to S1030 in Figure 10, steps S1110 to S1130 in Figure 11, or step S1210 in Figure 12.
Those of ordinary skill in the art will understand that all or some of the steps and systems of the methods disclosed above may be implemented as software, firmware, hardware, or an appropriate combination thereof. Some or all of the physical components may be implemented as software executed by a processor such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include, but are not limited to, shared memory, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
The embodiments of the present application include: acquiring data to be sent, and determining first data amount information of the data to be sent; buffering the data to be sent into a send buffer in shared memory; and sending the first data amount information to the TOE hardware, so that the TOE hardware acquires the data to be sent from the shared memory according to the first data amount information and performs TOE offloading on the data to be sent. With the scheme provided by the embodiments of the present application, shared memory can be used as the buffer space for the data to be sent; compared with the DDR memory of the TOE hardware, this greatly increases the storage capacity and improves the offloading capability of the protocol stack in high-concurrency scenarios, thereby improving network performance.
Several embodiments of the present application have been described above in detail, but the present application is not limited to those embodiments. Those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present application, and such equivalent modifications or substitutions are all included within the scope defined by the claims of the present application.

Claims (13)

  1. A high-concurrency protocol stack offloading method based on host-side large-capacity memory, applied to a central processing unit CPU on the host side, wherein transceiver buffers of TOE hardware logic are offloaded into RAM memory space on the host side and the CPU is communicatively connected to Transmission Control Protocol offload engine TOE hardware, the method comprising:
    acquiring data to be sent, and determining first data amount information of the data to be sent;
    buffering the data to be sent into a send buffer in shared memory; and
    sending the first data amount information to the TOE hardware, so that the TOE hardware acquires the data to be sent from the shared memory according to the first data amount information and performs TOE offloading on the data to be sent.
  2. The method of claim 1, wherein sending the first data amount information to the TOE hardware, so that the TOE hardware acquires the data to be sent from the shared memory according to the first data amount information and performs TOE offloading on the data to be sent, comprises:
    acquiring first address information, the first address information describing the region of the send buffer in which data is buffered; and
    sending the first data amount information and the first address information to the TOE hardware, so that the TOE hardware acquires the data to be sent from the send buffer according to the first address information and the first data amount information and performs TOE offloading on the data to be sent.
  3. The method of claim 2, wherein after the sending of the first data amount information and the first address information to the TOE hardware, the method further comprises:
    acquiring second data amount information fed back by the TOE hardware, the second data amount information representing an amount of sent data newly added by the TOE hardware through performing TOE offloading; and
    updating the first address information according to the second data amount information, and synchronizing the updated first address information to the TOE hardware.
  4. The method of claim 1, wherein after sending the first data amount information to the TOE hardware, so that the TOE hardware acquires the data to be sent from the shared memory according to the first data amount information and performs TOE offloading on the data to be sent, the method further comprises:
    acquiring third data amount information sent by the TOE hardware, the third data amount information representing an amount of data to be received obtained by the TOE hardware through TOE offloading;
    acquiring the data to be received from a receive buffer in the shared memory according to the third data amount information, wherein the data to be received is buffered into the receive buffer by the TOE hardware; and
    completing reception processing of the data to be received.
  5. The method of claim 4, wherein acquiring the data to be received from the receive buffer in the shared memory according to the third data amount information comprises:
    determining second address information according to the third data amount information, wherein the second address information describes the region of the receive buffer in which data is buffered; and
    acquiring the data to be received from the receive buffer region according to the third data amount information and the second address information.
  6. The method of claim 5, wherein after the completing of the reception processing of the data to be received, the method further comprises:
    determining fourth data amount information, the fourth data amount information representing an amount of processed data newly added through reception processing;
    updating the second address information according to the fourth data amount information; and
    sending the updated second address information to the TOE hardware.
  7. A high-concurrency protocol stack offloading method based on host-side large-capacity memory, applied to TOE hardware, wherein transceiver buffers of TOE hardware logic are offloaded into RAM memory space on the host side and the TOE hardware is communicatively connected to a central processing unit CPU on the host side, the method comprising:
    acquiring first data amount information sent by the CPU;
    acquiring data to be sent from a send buffer in shared memory according to the first data amount information, wherein the data to be sent is acquired by the CPU and buffered into the send buffer; and
    performing TOE offloading on the data to be sent.
  8. The method of claim 7, wherein acquiring the data to be sent from the send buffer in shared memory according to the first data amount information comprises:
    acquiring first address information sent by the CPU, wherein the first address information describes the region of the send buffer in which data is buffered; and
    acquiring the data to be sent from the send buffer according to the first address information and the first data amount information.
  9. The method of claim 8, wherein after the performing of TOE offloading on the data to be sent, the method further comprises:
    determining second data amount information, the second data amount information representing an amount of sent data newly added by performing TOE offloading;
    feeding the second data amount information back to the CPU, so that the CPU updates the first address information according to the second data amount information; and
    acquiring the updated first address information sent by the CPU.
  10. The method of claim 7, wherein after the performing of TOE offloading on the data to be sent, the method further comprises:
    obtaining data to be received through TOE offloading, and determining third data amount information of the data to be received;
    buffering the data to be received into a receive buffer in the shared memory; and
    sending the third data amount information to the CPU, so that the CPU acquires the data to be received from the receive buffer according to the third data amount information and performs reception processing of the data to be received.
  11. The method of claim 10, wherein after sending the third data amount information to the CPU, so that the CPU acquires the data to be received from the receive buffer according to the third data amount information and performs reception processing of the data to be received, the method further comprises:
    acquiring updated second address information sent by the CPU, wherein the second address information is determined by the CPU according to the third data amount information and is updated according to fourth data amount information, the fourth data amount information representing an amount of processed data newly added through the CPU's reception processing.
  12. An electronic device, comprising: a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the computer program, implements the high-concurrency protocol stack offloading method based on host-side large-capacity memory of any one of claims 1 to 6, or, when executing the computer program, implements the high-concurrency protocol stack offloading method based on host-side large-capacity memory of any one of claims 7 to 11.
  13. A computer-readable storage medium, wherein the computer-readable storage medium stores computer-executable instructions for performing the high-concurrency protocol stack offloading method based on host-side large-capacity memory of any one of claims 1 to 11.
PCT/CN2022/091531 2021-05-14 2022-05-07 High-concurrency protocol stack offloading method, device, and medium based on host-side large-capacity memory WO2022237695A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110527515.7 2021-05-14
CN202110527515.7A CN113179327B (zh) High-concurrency protocol stack offloading method, device, and medium based on large-capacity memory

Publications (1)

Publication Number Publication Date
WO2022237695A1 WO2022237695A1 (zh)

Family

ID=76928984

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/091531 WO2022237695A1 (zh) High-concurrency protocol stack offloading method, device, and medium based on host-side large-capacity memory

Country Status (2)

Country Link
CN (1) CN113179327B (zh)
WO (1) WO2022237695A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113179327B (zh) ZTE Corporation High-concurrency protocol stack offloading method, device, and medium based on large-capacity memory
CN117155729A (zh) 北京有竹居网络技术有限公司 Communication method, system, apparatus, and electronic device
CN115208830B (zh) Shanghai University High-performance non-blocking data sending method and apparatus

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1864376A (zh) * Intel Corporation Method, system, and article for utilizing host memory from an offload adapter
CN1910869A (zh) * Alacritech, Inc. TCP/IP offload device with simplified sequential processing
US20070255866A1 (en) * Eliezer Aloni Method and system for a user space TCP offload engine (TOE)
CN101616194A (zh) * University of Science and Technology of China Host network performance optimization system and method
US20150205632A1 (en) * Qualcomm Incorporated System and method for synchronous task dispatch in a portable device
CN109413106A (zh) * 中国航空工业集团公司西安航空计算技术研究所 TCP/IP protocol stack implementation method
CN113179327A (zh) * ZTE Corporation High-concurrency protocol stack offloading method, device, and medium based on large-capacity memory

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070162639A1 (en) * 2005-11-30 2007-07-12 Chu Hsiao-Keng J TCP-offload-engine based zero-copy sockets
CN101853238A (zh) * Huawei Technologies Co., Ltd. Method and system for message communication between communication processors
CN105516191B (zh) * 成都市智讯联创科技有限责任公司 FPGA-based 10-gigabit Ethernet TCP offload engine (TOE) system
CN110958213B (zh) * Huawei Technologies Co., Ltd. Method for processing TCP packets, TOE component, and network device
CN111327603B (zh) * 中科驭数(北京)科技有限公司 Data transmission method, apparatus, and system
CN112583935B (zh) * Sangfor Technologies Inc. Buffer window adjustment method, gateway device, and storage medium


Also Published As

Publication number Publication date
CN113179327A (zh) 2021-07-27
CN113179327B (zh) 2023-06-02

Similar Documents

Publication Publication Date Title
WO2022237695A1 (zh) High-concurrency protocol stack offloading method, device, and medium based on host-side large-capacity memory
US10609150B2 (en) Lock management method in cluster, lock server, and client
US11023411B2 (en) Programmed input/output mode
US10642777B2 (en) System and method for maximizing bandwidth of PCI express peer-to-peer (P2P) connection
WO2021254330A1 (zh) Memory management method, system, client, server, and storage medium
EP3826267B1 (en) File sending method, file receiving method and file transceiving apparatus
EP4318251A1 (en) Data access system and method, and device and network card
US9311044B2 (en) System and method for supporting efficient buffer usage with a single external memory interface
CN109564502B Method and apparatus for processing access requests in a storage device
WO2020199760A1 (zh) Data storage method, memory, and server
WO2014180397A1 (zh) Method and apparatus for sending network data packets
CN115270033A Data access system, method, device, and network card
WO2022257587A1 (zh) Data processing method, TOE hardware, and computer-readable storage medium
CN110445580B Data sending method and apparatus, storage medium, and electronic apparatus
CN108063737B FCoE storage area network read request processing method and system
WO2023065809A1 CDN network element container configuration method, read/write method, apparatus, device, and storage medium
US8898353B1 (en) System and method for supporting virtual host bus adaptor (VHBA) over infiniband (IB) using a single external memory interface
US10289550B1 (en) Method and system for dynamic write-back cache sizing in solid state memory storage
US11886938B2 (en) Message communication between integrated computing devices
US11121956B1 (en) Methods and systems for optimizing bidirectional forwarding detection in hardware
CN114401072A Dynamic buffer control method and system for a deframing and reordering queue based on the HINOC protocol
US20110258282A1 (en) Optimized utilization of dma buffers for incoming data packets in a network protocol
CN107615259A Data processing method and system
US20190050274A1 (en) Technologies for synchronizing triggered operations
US9104637B2 (en) System and method for managing host bus adaptor (HBA) over infiniband (IB) using a single external memory interface

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22806661

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE