WO2021208092A1 - Method and device for processing stateful service - Google Patents

Publication number
WO2021208092A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
connection
processing
host
instruction
Prior art date
Application number
PCT/CN2020/085418
Other languages
French (fr)
Chinese (zh)
Inventor
李君瑛
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2020/085418 (published as WO2021208092A1)
Priority to CN202080099188.3A (published as CN115349121A)
Publication of WO2021208092A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00: Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14: Handling requests for interconnection or transfer

Definitions

  • This application relates to the field of communication technology, and in particular to a method and device for processing stateful services.
  • The host consumes a large amount of central processing unit (CPU) resources when executing a network protocol or a storage protocol, which places a heavy CPU load on the host.
  • A smart network interface card (smart NIC) is a high-performance network access card built around a network processor. Its multi-core, multi-threaded network processor architecture can be used to process the various network protocols or storage protocols separated from the host, which can greatly reduce the CPU load of the host. Separating the relevant protocol processing from the host and executing it on the smart network card is called offload.
  • The host can communicate with the smart network card through a doorbell (DB) mechanism.
  • The processing in the smart network card, from receiving the DB sent by the host to sending the message to the network port, usually includes three stages.
  • Stage 1 is DB processing: obtain the connection context corresponding to the DB, obtain the working queue element (WQE) of the connection based on the context, and generate a direct memory access (DMA) command.
  • The DMA command carries associated data, which guides how the message subsequently downloaded from the host is forwarded and edited.
  • Stage 2 is execution of the DMA command: the message is downloaded and processed together with its associated data; the size of the downloaded message is usually the maximum transmission unit (MTU).
  • Stage 3 is the editing of the message and the construction of the DMA command.
  • Stage 1 above is a stateful stage, that is, the processing of a DB strictly depends on the context of the same connection. Only after the core or thread corresponding to the previous DB has updated the context can the core or thread corresponding to the next DB start processing based on the updated context. Because the message downloaded for each DB is only one MTU in size, the processing performance of a single service is low.
  • The present application provides a method and device for processing a stateful service, which improve the processing performance of a single service when the network card processes the offloading of stateful services.
  • a method for processing a stateful service is provided, which is applied to a network card, and the network card is connected to a host.
  • the network card can be connected to the host via a PCIe bus.
  • The method includes: receiving an identifier of the first connection from the host.
  • For example, the identifier of the first connection can be carried in a first doorbell (DB) that the network card receives from the host, and the first connection is the connection where the stateful service is located. The context of the first connection is obtained according to the identifier of the first connection, and the context of the first connection is used to indicate the relevant information of the first connection.
  • The context of the first connection is the context updated after the previous data block of the first connection was processed. A data processing instruction is generated according to the context of the first connection, and the data processing instruction can be used to process data of the first data volume. The data block of the first connection is downloaded from the host using the sending bandwidth corresponding to the first data volume; the data block is processed according to the data processing instruction to obtain multiple messages; and the multiple messages are sent.
  • When the network card receives the first DB carrying the identifier of the first connection, it can generate, according to the context of the first connection, a data processing instruction that can be used to process data of the first data volume, download the data block of the first connection from the host, and process the data block according to the data processing instruction to obtain multiple messages. The network card thus obtains the context of the first connection once based on the first DB and, based on that context, processes and sends multiple messages of the first connection, thereby improving the processing performance of a single stateful service.
  • The data processing instruction includes a direct memory access (DMA) instruction corresponding to the first data volume.
  • Downloading the data block of the first connection from the host includes: downloading the data block of the first connection from the host according to the DMA instruction corresponding to the first data volume.
  • The smart network card may generate a DMA instruction corresponding to the first data volume based on the first DB and download the data block of the first connection from the host based on that DMA instruction, so that the smart network card can process multiple messages of the first connection based on the context of the first connection, thereby improving the processing performance of a single stateful service.
  • The data processing instruction further includes associated data.
  • Processing the data block of the first connection according to the data processing instruction includes: fragmenting the data block of the first connection according to the associated data to obtain multiple data fragments.
  • The smart network card may generate associated data corresponding to the first data volume based on the first DB and process the data block of the first connection based on the associated data, so that the smart network card can process multiple messages of the first connection based on the context of the first connection, thereby improving the processing performance of a single stateful service.
  • The fragmentation processing includes at least one of the following: inserting multiple markers, deleting part of the header data, deleting part of the tail data, determining the message header, determining the cyclic redundancy check (CRC) of the payload, and determining the checksum (CS).
  • The data processing instruction further includes a message editing instruction.
  • Processing the data block of the first connection according to the data processing instruction further includes: editing and encapsulating the multiple data fragments according to the message editing instruction to obtain multiple messages.
  • The smart network card can generate the message editing instruction at the same time as the DMA instruction, which avoids waking up the core/thread of the smart network card corresponding to the first DB when the multiple data fragments are later edited and encapsulated, thereby further improving processing performance.
  • the message editing instructions include: editing instructions for the head slice, editing instructions for the middle slice, and editing instructions for the tail slice.
  • the complexity of the message editing instructions can be reduced.
  • Before the data block of the first connection is downloaded from the host using the sending bandwidth corresponding to the first data volume, the method further includes: allocating a sending bandwidth to the first DB from the available bus bandwidth, where the sending bandwidth is equal to the first data volume.
  • The smart network card can download multiple messages of the first connection from the host at one time, so that the multiple messages can subsequently be processed together based on the context of the first connection, thereby improving the processing performance of a single stateful service.
  • The method further includes: adjusting the sending bandwidth according to the difference between the first data volume and the actual data volume of the data block of the first connection.
  • a device for processing stateful services is provided, which is applied to a network card.
  • the network card is connected to a host.
  • the network card is connected to the host via a PCIe bus.
  • The device includes an acquiring unit, configured to receive the identifier of the first connection from the host. For example, the identifier of the first connection can be carried in a first doorbell (DB) that the network card receives from the host, and the first connection is the connection where the stateful service is located. The acquiring unit is also configured to obtain the context of the first connection according to the identifier of the first connection.
  • The context of the first connection is used to indicate the relevant information of the first connection, and is the context updated after the previous data block of the first connection was processed. The device also includes a processing unit, configured to generate a data processing instruction according to the context of the first connection.
  • The data processing instruction can be used to process data of the first data amount. The acquiring unit is also configured to download the data block of the first connection from the host using the sending bandwidth corresponding to the first data amount. The processing unit is also configured to process the data block of the first connection according to the data processing instruction to obtain multiple messages.
  • The data processing instruction includes a direct memory access (DMA) instruction corresponding to the first data amount.
  • The acquiring unit is further configured to download the data block of the first connection from the host according to the DMA instruction corresponding to the first data amount.
  • The data processing instruction further includes associated data.
  • The processing unit is further configured to fragment the data block of the first connection according to the associated data to obtain multiple data fragments.
  • The fragmentation processing includes at least one of the following: inserting multiple markers, deleting part of the header data, deleting part of the tail data, determining the message header, determining the cyclic redundancy check (CRC) of the payload, and determining the checksum (CS).
  • The data processing instruction further includes a message editing instruction.
  • The processing unit is further configured to edit and encapsulate the multiple data fragments according to the message editing instruction to obtain multiple messages.
  • the message editing instructions include: editing instructions for the head slice, editing instructions for the middle slice, and editing instructions for the tail slice.
  • the device further includes: a bandwidth allocation unit, configured to allocate a transmission bandwidth to the first DB from the available bus bandwidth, and the transmission bandwidth is equal to the first data amount.
  • the bandwidth allocation unit is further configured to: adjust the transmission bandwidth according to the difference between the first data amount and the actual data amount of the data block of the first connection.
  • a device for processing stateful services is provided.
  • the device is a network card or a chip built in the network card.
  • The device includes a memory and a processor coupled with the memory. The memory stores code and data, and the processor runs the code in the memory so that the device executes the stateful service processing method provided by the first aspect or any one of the possible implementations of the first aspect.
  • In another aspect of the present application, a communication system is provided. The communication system includes a network card and a host, and the network card is connected to the host through a bus. The network card is the network card provided in any one of the above aspects, and is used to execute the stateful service processing method provided by the first aspect or any possible implementation of the first aspect.
  • In another aspect of the present application, a computer-readable storage medium is provided. The computer-readable storage medium stores instructions which, when run on a computer, cause the computer to execute the stateful service processing method provided by the first aspect or any possible implementation of the first aspect.
  • In another aspect of the present application, a computer program product is provided. The computer program product includes computer-executable instructions stored in a computer-readable storage medium. At least one processor of the device can read the computer-executable instructions from the computer-readable storage medium, and the at least one processor executes them so that the device executes the stateful service processing method provided by the first aspect or any one of the possible implementations of the first aspect.
  • Any device, computer storage medium, or computer program product for processing stateful services provided above is used to execute the corresponding method provided above. For the beneficial effects that can be achieved, refer to the beneficial effects of the corresponding method provided above; they are not repeated here.
  • FIG. 1 is a schematic structural diagram of a communication system provided by an embodiment of this application.
  • FIG. 2 is a schematic flowchart of a method for processing a stateful service according to an embodiment of the application
  • FIG. 3 is a schematic diagram of a fragmentation process provided by an embodiment of the application.
  • FIG. 4 is a schematic structural diagram of a smart network card provided by an embodiment of this application.
  • FIG. 5 is a schematic diagram of multi-message processing provided by an embodiment of this application.
  • FIG. 6 is a schematic structural diagram of a processing device for a stateful service provided by an embodiment of this application.
  • FIG. 7 is a schematic structural diagram of another device for processing a stateful service provided by an embodiment of the application.
  • "At least one" refers to one or more, and "multiple" refers to two or more.
  • "And/or" describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: A exists alone, A and B exist at the same time, or B exists alone, where A and B can be singular or plural.
  • The character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
  • "At least one of the following items" or a similar expression refers to any combination of these items, including a single item or any combination of multiple items.
  • For example, at least one of a, b, or c can mean: a; b; c; a and b; a and c; b and c; or a, b, and c, where a, b, and c can each be single or multiple.
  • The embodiments of the present application use words such as "first" and "second" to distinguish between identical or similar items that have substantially the same function and effect.
  • For example, the first threshold and the second threshold are only for distinguishing different thresholds, and their order is not limited. Those skilled in the art can understand that words such as "first" and "second" do not limit the number or the execution order.
  • FIG. 1 is a schematic structural diagram of a communication system provided by an embodiment of the application.
  • the communication system includes a host and a network card, and the host and the network card are connected through a bus.
  • the network card is a smart network interface card (smart NIC), and the smart network card is connected to the host through a peripheral component interconnect express (PCIe) bus.
  • the communication system may include one or more hosts, and the one or more hosts may all be connected to the smart network card.
  • the embodiment of the present application will be introduced and explained by taking the network card as a smart network card as an example.
  • the host is provided with multiple virtual machines (virtual machines, VMs), and each VM can run one or more virtual functions (VF), and the one or more VFs can correspond to different functions.
  • a VF may correspond to one or more queues, and the input or output mechanism of the VF is realized through the one or more queues.
  • the multiple queues may include a sending queue and a receiving queue.
  • the smart network card can be used to process various network protocols or storage protocols separated from the host, and it can also be called protocol offload.
  • Network offloading can include virtualization I/O (Virt IO) offloading, single-root I/O virtualization (SR-IOV) offloading, user datagram protocol (UDP)/transmission control protocol (TCP)/Internet protocol (IP) checksum (CS) offloading, receive side scaling (RSS) offloading, TCP segmentation offload (TSO)/large receive offload (LRO), virtual extensible local area network (VxLAN)/generic network virtualization encapsulation (Geneve) offloading, stateful open virtual switch (OVS) offloading, IP security (IPSec) offloading, TCP offload engine (TOE) offloading, and RDMA over converged Ethernet version 2 offloading, among others.
  • The smart network card may include: a transmit bandwidth provision (TX bandwidth provision) module, a receive bandwidth provision (RX bandwidth provision) module, a transmit processing (TX processing) module, a receive processing (RX processing) module, a scheduler, a processor pool including multiple processor cores, a traffic manager, a TX port for sending messages to the network (Eth), and a receiving virtual machine (RX VM) used to send messages to the host.
  • The processor pool in the smart network card may be an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), or the like. The embodiments of the application do not specifically limit this.
  • FIG. 2 is a schematic flowchart of a method for processing a stateful service according to an embodiment of the application.
  • the method can be applied to the communication system including the host and the smart network card shown in FIG. 1.
  • the method includes the following steps.
  • S201: The smart network card receives the identifier of the first connection from the host, and the first connection is the connection where the stateful service is located.
  • A connection refers to a logical link established between the two ends of a session.
  • The connection may be a TCP connection, a UDP connection, or a RoCE queue pair (QP) connection.
  • Multiple connections may be established between the host and the network (Eth) through the smart network card, the first connection may be any one of the multiple connections, and the identifier of the first connection may be used to identify the first connection.
  • the identification of the first connection may be carried in the first DB.
  • the smart network card receives the first DB from the host, and the first DB carries the identification of the first connection of the host.
  • Stateful services are the counterpart of stateless services.
  • A stateless service is one in which a single message can be processed based on the header of the message itself, and there is no association between one message and another.
  • A stateful service is one in which a single message is not sufficient to decide how that message is processed.
  • The processing behavior for a message can be decided only from the state of the "connection" to which the message belongs together with the information of the message itself; that is, there is an association between the messages of a stateful service.
  • The state information of the "connection" includes but is not limited to: the expected sequence number of the next message, the acknowledgement (ACK) sequence number, receive window updates, and statistical information.
  • For example, firewalls, including firewalls at different levels such as security groups in OpenStack, can be stateless or stateful.
  • A stateless firewall filters or blocks network data packets based on static values, for example addresses, ports, and protocols; that is, the stateless firewall itself does not care about the current network connection status.
  • A stateful firewall can distinguish the state of a network connection. For example, a stateful firewall can tell which stage a TCP connection is currently in, and filter or block network data packets accordingly.
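  • The contrast between the two kinds of firewall can be illustrated with a minimal sketch. The `Packet` fields, the three connection stages, and the rule that only a SYN may open a connection are simplifying assumptions made for illustration; they are not taken from this application and do not model the full TCP state machine.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    dst_port: int
    proto: str
    flags: frozenset = frozenset()  # TCP flags, e.g. frozenset({"SYN"})


def stateless_allow(pkt, allowed_ports):
    """Decide from static header values only (protocol and destination port)."""
    return pkt.proto == "tcp" and pkt.dst_port in allowed_ports


@dataclass
class StatefulFirewall:
    allowed_ports: set
    # Per-connection stage, keyed by (src, dst, dst_port). Hypothetical layout.
    state: dict = field(default_factory=dict)

    def allow(self, pkt):
        key = (pkt.src, pkt.dst, pkt.dst_port)
        stage = self.state.get(key, "CLOSED")
        if "SYN" in pkt.flags and stage == "CLOSED":
            if pkt.dst_port not in self.allowed_ports:
                return False
            self.state[key] = "SYN_SEEN"
            return True
        if "ACK" in pkt.flags and stage in ("SYN_SEEN", "ESTABLISHED"):
            self.state[key] = "ESTABLISHED"
            return True
        # Anything else, such as an ACK on a connection never opened, is dropped.
        return False
```

  • In this sketch the same ACK packet passes the stateless check on its own, while the stateful check rejects it until a SYN for that connection has been seen.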
  • The first DB carrying the identifier of the first connection may be sent to the smart network card by a first VF running on a first VM in the host, and the first VF may be a VF used to perform tasks related to the first connection.
  • The first VF may post the corresponding working queue element (WQE) to the first sending queue corresponding to the first VF.
  • The first sending queue can be a queue for sending messages of the first connection.
  • The WQE can be used to describe information such as the size of the message to be sent and the storage address of that message in the memory of the host; the WQE can also be called a descriptor.
  • The first VF can then send the first DB to the smart network card, so that the smart network card receives the first DB, which carries the identifier of the first connection.
  • The first DB may also carry an identifier of the first VF, and the identifier of the first VF may be used to uniquely identify the first VF among multiple VFs in the host.
  • When the first DB does not carry the identifier of the first VF, the identifier of the first connection can be used to uniquely identify the first connection among the connections corresponding to the multiple VFs of the host. When the first DB carries the identifier of the first VF, the identifier of the first VF can be used to uniquely identify the first VF among the multiple VFs in the host, and the identifier of the first connection can be used to uniquely identify the first connection among the multiple connections corresponding to the first VF.
  • S202: The smart network card obtains the context of the first connection according to the identifier of the first connection.
  • The context of the first connection may or may not be stored in the cache of the smart network card.
  • When the context of the first connection is not stored in the smart network card, the smart network card can obtain the context of the first connection from the host according to the identifier of the first connection; when it is stored in the smart network card, the smart network card can get the context from its own cache.
  • The context of the first connection is used to record related information about the first connection, such as the address of the first sending queue in the memory of the host, the current sending position in the first sending queue, the sending window, and the sequence number of the sent message.
  • The context of the first connection is the updated context obtained after the previous data block of the first connection was processed.
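  • The context lookup described above can be sketched as follows. The field names of `ConnectionContext`, the dictionary used as the cache, and the `fetch_from_host` callback are hypothetical names introduced for illustration; caching the context after a miss is also an assumption, not something the text states.

```python
from dataclasses import dataclass


@dataclass
class ConnectionContext:
    send_queue_addr: int  # address of the first sending queue in host memory
    send_position: int    # current sending position in that queue
    send_window: int      # sending window
    sent_seq: int         # sequence number of the sent message


def get_context(conn_id, nic_cache, fetch_from_host):
    """Return the context for conn_id, preferring the card's own cache."""
    ctx = nic_cache.get(conn_id)
    if ctx is None:
        # Not cached on the card: read it from the host (hypothetical callback).
        ctx = fetch_from_host(conn_id)
        # Keeping a copy for later doorbells is an assumption of this sketch.
        nic_cache[conn_id] = ctx
    return ctx
```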
  • S203: The smart network card generates a data processing instruction according to the context of the first connection, and the data processing instruction can be used to process data of the first data amount.
  • The first data amount may be set in advance, and the first data amount is greater than the maximum transmission unit (MTU).
  • The specific value of the first data amount can be fixed or variable, and can be set by those skilled in the art based on experience or actual conditions.
  • For example, the first data amount can be 64KB. This is not specifically limited in the embodiments of this application.
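  • A rough calculation shows why a first data amount larger than the MTU reduces the number of stateful context acquisitions. The 1500-byte MTU and the 1 MiB transfer size below are assumed values chosen for illustration; only the 64KB figure comes from the text.

```python
def doorbell_rounds(total_bytes, bytes_per_doorbell):
    """Number of doorbells (context acquisitions) needed to send total_bytes."""
    return (total_bytes + bytes_per_doorbell - 1) // bytes_per_doorbell


MTU = 1500                    # assumed Ethernet MTU in bytes
FIRST_DATA_AMOUNT = 64 * 1024 # 64KB example value from the text
TOTAL = 1024 * 1024           # assumed 1 MiB transfer

print(doorbell_rounds(TOTAL, MTU))                # 700
print(doorbell_rounds(TOTAL, FIRST_DATA_AMOUNT))  # 16
```

  • Under these assumptions the context of the connection is obtained 16 times instead of 700 times for the same transfer.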
  • the smart network card may obtain WQE from the first sending queue of the host according to the context of the first connection, for example, based on the address of the first sending queue indicated by the context of the first connection, from the first sending queue of the host Acquire the WQE, and then generate a data processing instruction corresponding to the first data volume based on information such as the size of the sent message indicated by the WQE (for example, the first data volume) and the storage address of the sent message in the memory of the host.
  • The data processing instruction may include a direct memory access (DMA) instruction (command) corresponding to the first data volume, associated data, and a packet edit (PE) instruction.
  • The associated data is used to indicate segmentation information.
  • The message editing instruction can be a three-stage PE instruction, which can specifically include an editing instruction for the head slice, an editing instruction for at least one middle slice, and an editing instruction for the tail slice.
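  • One possible shape for such a data processing instruction is sketched below. All class and field names are hypothetical, the PE instructions are reduced to placeholder strings, and capping the DMA length at the first data amount is an assumption of this sketch rather than a rule stated in the text.

```python
from dataclasses import dataclass


@dataclass
class Wqe:
    msg_size: int   # size of the message to be sent
    host_addr: int  # storage address of the message in host memory


@dataclass
class DataProcessingInstruction:
    dma_host_addr: int  # DMA instruction: where to read from
    dma_length: int     # DMA instruction: how many bytes to read
    mss: int            # associated data: segmentation information
    pe: dict            # three-stage PE instruction: head / middle / tail


def build_instruction(wqe, first_data_amount, mss):
    """Build one instruction covering up to first_data_amount bytes."""
    length = min(wqe.msg_size, first_data_amount)  # assumed cap
    pe = {"head": "edit-head", "middle": "edit-middle", "tail": "edit-tail"}
    return DataProcessingInstruction(wqe.host_addr, length, mss, pe)
```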
  • The smart network card may also update the context of the first connection after performing step S203, and either store the updated context of the first connection in its cache or send the updated context of the first connection to the host.
  • S204: The smart network card uses the sending bandwidth corresponding to the first data volume to download the data block of the first connection from the host.
  • the transmission bandwidth corresponding to the first data amount may be equal to the first data amount.
  • the smart network card may allocate the sending bandwidth to the first DB from the available bus bandwidth (available bus bandwidth corresponding to the PCIe bus).
  • the smart network card can use the sending bandwidth to download the data block of the first connection from the host, and the data block of the first connection can include multiple packets of data.
  • the data processing instruction may include a DMA instruction, and the DMA instruction carries the storage address of the data block of the first connection in the memory of the host, so that the smart network card can download the first connection from the host through the DMA instruction Data block.
  • The actual data volume of the data block of the first connection may be equal to or smaller than the first data volume.
  • the smart network card can reduce the transmission bandwidth according to the difference between the first data volume and the actual data volume, thereby saving the available bus bandwidth corresponding to the PCIe bus.
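  • The allocation and later reduction of the sending bandwidth can be sketched as simple credit accounting. Treating bandwidth as a byte budget, and the `BandwidthNode` class and its method names, are assumptions made for illustration.

```python
class BandwidthNode:
    """Tracks the available bus bandwidth as a byte budget (assumed model)."""

    def __init__(self, available):
        self.available = available

    def allocate(self, first_data_amount):
        """Reserve a sending bandwidth equal to the first data amount."""
        if first_data_amount > self.available:
            raise ValueError("not enough bus bandwidth available")
        self.available -= first_data_amount
        return first_data_amount

    def adjust(self, allocated, actual):
        """Give back the difference between the allocation and the actual size."""
        unused = allocated - actual
        self.available += unused
        return allocated - unused
```

  • For example, with 100000 bytes available, allocating 65536 bytes leaves 34464; if the data block turns out to be 40000 bytes, the adjustment returns 25536 bytes and the available budget becomes 60000.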
  • S205: The smart network card processes the data block of the first connection according to the data processing instruction to obtain multiple messages.
  • The data processing instruction may include associated data, and the associated data may be used to guide the smart network card to perform fragmentation according to a maximum segment size (MSS) and calculate the CS.
  • The associated data may include information indicating the size of each data fragment (that is, the MSS), information indicating how the checksum of each data fragment is calculated, information indicating the interval or length at which markers are inserted, and instructions to delete some data. The smart network card can therefore perform fragmentation processing on the data block of the first connection according to the associated data to obtain multiple data fragments.
  • The fragmentation processing may include at least one of the following: inserting multiple markers, deleting part of the header data, deleting part of the tail data, determining the header, determining the cyclic redundancy check (CRC) of the payload, and determining the CS.
  • The marker can be DIF, DIX, or the like.
  • The deleted part of the header data can be redundant data, for example, header data that has already been sent.
  • The deleted part of the tail data can be redundant data, for example, tail data that is smaller than the payload of one message.
  • As shown in FIG. 3, the smart network card can insert multiple DIFs in the iSCSI PDU payload, determine that the header is the iSCSI HDR, determine the CRC of the iSCSI HDR (represented as HDR CRC in FIG. 3) and the CRC of the payload (represented as data CRC in FIG. 3), and then divide the iSCSI PDU payload into multiple data fragments according to the MSS.
  • The PAD in FIG. 3 is the abbreviation of padding, which indicates the padded data part and can be specifically indicated by the associated data.
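  • The marker insertion, payload CRC, and MSS-based fragmentation steps can be sketched as below. The function names are hypothetical, the marker is an arbitrary byte string rather than a real DIF, and `zlib.crc32` (the standard CRC-32) is used only as a stand-in for whichever CRC the protocol actually requires.

```python
import zlib


def insert_markers(block, interval, marker):
    """Append a marker after every `interval` bytes of the block."""
    out = bytearray()
    for i in range(0, len(block), interval):
        out += block[i:i + interval] + marker
    return bytes(out)


def payload_crc(block):
    """CRC of the payload; standard CRC-32 used as a stand-in."""
    return zlib.crc32(block) & 0xFFFFFFFF


def fragment(block, mss):
    """Split the block into data fragments of at most mss bytes."""
    return [block[i:i + mss] for i in range(0, len(block), mss)]
```

  • For instance, `fragment(b"x" * 10, 4)` yields fragments of 4, 4, and 2 bytes; the last fragment is simply shorter, and whether it is padded would depend on the associated data.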
  • The data processing instruction may also include a message editing instruction.
  • The smart network card may edit and encapsulate the multiple data fragments obtained by the fragmentation processing, according to the message editing instruction, to obtain multiple messages.
  • When the message editing instruction is a three-segment PE instruction, the smart network card can edit and encapsulate the first of the multiple data fragments (also called the head data slice or head slice) according to the editing instruction for the head slice, edit and encapsulate the second through the penultimate data fragments (also called intermediate data slices or intermediate slices) according to the editing instruction for the middle slice, and edit and encapsulate the last data fragment (also called the tail data slice or tail slice) according to the editing instruction for the tail slice, to obtain multiple messages.
  • The editing instruction for the at least one intermediate slice can also indicate how the header changes from the second data slice to the penultimate data slice during editing and encapsulation; for example, the TCP sequence number (SN) is incremented, and the IP ID is handled in the same way.
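  • A minimal sketch of three-segment editing follows. The message layout (a dictionary with `role`, `seq`, and `payload`) is hypothetical; the only header rule modeled is that the TCP sequence number advances by the payload length of each fragment and wraps at 32 bits.

```python
def edit_and_encapsulate(fragments, start_seq):
    """Label each fragment head/middle/tail and assign TCP sequence numbers."""
    messages = []
    seq = start_seq
    last = len(fragments) - 1
    for i, payload in enumerate(fragments):
        if i == 0:
            role = "head"
        elif i == last:
            role = "tail"
        else:
            role = "middle"
        messages.append({"role": role, "seq": seq, "payload": payload})
        # TCP sequence numbers count payload bytes and wrap at 2**32.
        seq = (seq + len(payload)) % 2**32
    return messages
```

  • Because the rule for the middle slices is the same for every intermediate fragment, one editing instruction can cover all of them, which is consistent with the reduced instruction complexity mentioned earlier.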
  • S206 The smart network card sends multiple messages.
  • the smart network card can send multiple messages to the network through the network port.
  • the smart network card may also send feedback information to the host, and the feedback information may be used to indicate that the host has successfully sent multiple packets.
  • the TX bandwidth allocation module of the smart network card may include multiple DB queues (DB Q), queue mapping (QM) modules, bandwidth allocation nodes (represented as vNIC in Figure 4), and round robin (RR). ) Scheduler for scheduling (represented as RR in Figure 4).
  • DB Q DB queues
  • QM queue mapping
  • RR round robin
  • Scheduler for scheduling represented as RR in Figure 4.
  • in FIG. 4, the description takes as an example multiple DB queues corresponding to multiple hosts (that is, the smart network card is connected to multiple hosts, for example H0 to H3).
  • the first DB can be queued in multiple DB queues according to certain rules and mapped to the corresponding bandwidth allocation node through the queue mapping module to complete the allocation of the sending bandwidth. Then, the RR allocates processor cores or threads in the processor cores from the processor pool to the first DB, and the allocated processor cores or threads may be referred to as cores/threads in the following.
  • the core/thread can obtain the context of the first connection based on the identifier of the first connection carried by the first DB, generate the DMA instruction for the first data volume, the associated data, and the message editing instruction according to the context of the first connection, and download the data block of the first connection from the host through the DMA instruction; optionally, the core/thread may also send the difference between the first data volume and the actual data volume of the data block of the first connection to the corresponding bandwidth allocation node, so that the bandwidth allocation node adjusts the sending bandwidth of the first DB.
  • the TX processing module can complete the fragmentation processing of the data block of the first connection according to the associated data, and the obtained multiple data fragments can be stored in the memory of the smart network card.
  • the traffic manager can schedule multiple data fragments from the memory and edit and encapsulate the data fragments according to the message editing instructions, and the resulting message can be sent to the network through the TX port.
  • the context of the first connection may be decomposed into multiple sub-contexts, so that different processor cores or threads of the smart network card concurrently process different DBs according to different sub-contexts, thereby improving the throughput of the first connection's message.
  • the context of the first connection can be decomposed into four sub-contexts, denoted S0, S1, S2, and S3, so that the smart network card can process 4 DBs of the first connection at the same time, denoted DB1, DB2, DB3, and DB4.
  • if the processor cores allocated by the scheduler to the 4 DBs are Core1, Core2, Core3, and Core4, then after Core1 finishes processing DB1S0, Core1 can continue with DB1S1 while Core2 processes DB2S0; when Core1 finishes DB1S1, Core1 can continue with DB1S2, Core2 (having finished DB2S0) can start DB2S1, and Core3 can start DB3S0; and so on.
  • if the first connection also corresponds to DB5, then once Core1 has finished DB1S3, Core1 can start processing DB5S0 while Core2 starts DB2S3, Core3 starts DB3S2, and Core4 starts DB4S1. In this way, four cores of the smart network card concurrently participate in the context processing of the first connection, and each core can process a data block of the first connection, so that the throughput of the first connection can be improved.
  • when the smart network card receives the first DB carrying the identifier of the first connection, the smart network card may generate a data processing instruction for the first data volume according to the context of the first connection, download the data block of the first connection from the host, and process the data block according to the data processing instruction to obtain multiple messages. The smart network card can thus obtain the context of the first connection once based on the first DB and process and send multiple messages of the first connection based on that context, thereby improving the processing performance of a single stateful service.
  • the smart network card can generate the message editing instruction at the same time as it generates the DMA instruction, which avoids waking up the core/thread corresponding to the first DB again when the multiple data fragments are later edited and encapsulated, thereby further improving the processing performance.
  • the embodiments of the present application may divide the functional modules of the smart network card according to the foregoing method examples.
  • each functional module may be divided corresponding to each function, or two or more functions may be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or software functional modules. It should be noted that the division of modules in the embodiments of the present application is illustrative, and is only a logical function division, and there may be other division methods in actual implementation. The following is an example of dividing each function module corresponding to each function.
  • FIG. 6 shows a possible structural schematic diagram of the stateful service processing apparatus involved in the foregoing embodiment.
  • the device may be a smart network card or a chip built in the smart network card, and the device includes: an acquiring unit 601, a processing unit 602, and a sending unit 603.
  • the acquiring unit 601 is used to support the device to perform S201, S202, and S204 in the above method embodiment
  • the processing unit 602 is used to support the device to perform S203 and S205 in the above method embodiment
  • the sending unit 603 is used to support the device to perform S206 in the foregoing method embodiment.
  • the device may further include: a bandwidth allocating unit 604, configured to support the device to perform the step of allocating the sending bandwidth in the foregoing method embodiment.
  • the processing unit 602 may be the integration of the TX processing module, scheduler, processor pool, and traffic manager in the smart network card described in the foregoing method embodiment, and the bandwidth allocation unit 604 may be the TX bandwidth allocation module in the smart network card described in the foregoing method embodiment.
  • the sending unit 603 may be the TX port in the smart network card described in the foregoing method embodiment.
  • the processing unit 602 in this application may be the processor of the device, the acquiring unit 601 may be the receiver of the device, and the sending unit 603 may be the transmitter of the device. The receiver and the transmitter can usually be integrated together as the communication interface of the device.
  • the device may be a smart network card or a chip built in the smart network card, and the device includes a processor 702 and a communication interface 703.
  • the processor 702 is used to control and manage the actions of the device.
  • the processor 702 is used to support the device to execute S203 and S205 in the foregoing method embodiments, and/or other processes used in the technology described herein.
  • the device may also include a memory 701 and a bus 704.
  • the processor 702, the communication interface 703, and the memory 701 are connected to each other through the bus 704; the communication interface 703 is used to support the device to communicate, for example, with a host or a network; the memory 701 is used to store the program code and data of the device.
  • the processor 702 may be a central processing unit, a general-purpose processor, a baseband processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various exemplary logical blocks, modules, and circuits described in conjunction with the disclosure of this application.
  • the processor may also be a combination that implements computing functions, for example, a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, and so on.
  • the bus 704 may be a peripheral component interconnect standard (PCI) bus or an extended industry standard architecture (EISA) bus, etc.
  • the bus 704 can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is used in FIG. 7, but it does not mean that there is only one bus or one type of bus.
  • in another aspect of the present application, a communication system is provided. The communication system includes a network card and a host, and the network card is connected to the host via a bus; the network card is any one of the network cards provided above and is used to perform the steps of the network card in the above method embodiments.
  • the disclosed device and method may be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of the modules or units is only a logical function division.
  • there may be other division methods in actual implementation; for example, multiple units or components may be combined or integrated into another device, or some features may be omitted or not implemented.
  • the units described as separate parts may or may not be physically separate.
  • the parts displayed as units may be one physical unit or multiple physical units, that is, they may be located in one place, or they may be distributed to multiple different places. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a readable storage medium.
  • the readable storage medium may include various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory, random access memory, a magnetic disk, or an optical disk.
  • in another embodiment, a readable storage medium stores computer-executable instructions; when a device (which may be a single-chip microcomputer, a chip, etc.) or a processor executes them, the stateful service processing method provided by the above method embodiments is performed.
  • in another embodiment of the present application, a computer program product is provided. The computer program product includes computer-executable instructions stored in a computer-readable storage medium; at least one processor of the device can read the computer-executable instructions from the computer-readable storage medium, and the at least one processor executes the computer-executable instructions to make the device execute the stateful service processing method provided by the foregoing method embodiment.
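
The sub-context pipelining described above (DB1 to DB5 spread over Core1 to Core4, with the connection context split into S0 to S3) can be sketched as a small scheduling table. This is only an illustrative model, not the smart network card's actual scheduler: the function name `pipeline_schedule` is hypothetical, and it assumes that every sub-context stage takes one equal time step and that the number of cores does not exceed the number of sub-contexts, so that no two cores need the same sub-context in the same step.

```python
def pipeline_schedule(num_cores, num_stages, num_dbs, steps):
    """Return one dict per time step mapping a core name to the work item it runs.

    A work item "DB2S1" means: doorbell DB2 being processed against sub-context S1.
    Assumes equal-length stages and num_cores <= num_stages (illustrative only).
    """
    schedule = []
    for t in range(steps):
        row = {}
        for core in range(num_cores):
            elapsed = t - core              # core k starts one step after core k-1
            if elapsed < 0:
                continue                    # this core has not started yet
            rounds, stage = divmod(elapsed, num_stages)
            db = core + rounds * num_cores  # zero-based index of the DB on this core
            if db < num_dbs:
                row[f"Core{core + 1}"] = f"DB{db + 1}S{stage}"
        schedule.append(row)
    return schedule


# Four cores, four sub-contexts, five doorbells, as in the text.
table = pipeline_schedule(num_cores=4, num_stages=4, num_dbs=5, steps=5)
for step, row in enumerate(table):
    print(step, row)
```

In this model, the row for step 4 matches the situation in the text: Core1 has moved on to DB5S0 while Core2, Core3, and Core4 are at DB2S3, DB3S2, and DB4S1.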

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present application relates to the technical field of communications, and provides a method and device for processing a stateful service, which are used to improve the processing performance of a single service when a network card handles the offloading of a stateful service. The method is applied in a network card, and the network card is connected to a host; the method comprises: receiving an identifier of a first connection from the host, the first connection being a connection in which a stateful service is located; obtaining the context of the first connection according to the identifier of the first connection, the context of the first connection being used to indicate related information of the first connection, and the context of the first connection being a context obtained by updating after the previous data block of the first connection is processed; generating a data processing instruction according to the context of the first connection, wherein the data processing instruction can be used to process data of a first data volume; downloading a data block of the first connection from the host by using a transmission bandwidth corresponding to the first data volume; processing the data block of the first connection according to the data processing instruction to obtain multiple messages; and sending the multiple messages.

Description

Method and device for processing stateful services
Technical field
This application relates to the field of communication technology, and in particular to a method and device for processing stateful services.
Background
With the continuous increase of service types and data volumes in cloud networks, the execution of network protocols or storage protocols has become a computationally intensive operation. The host occupies a large amount of central processing unit (CPU) resources when executing a network protocol or a storage protocol, which places a heavy CPU load on the host. A smart network interface card (smart NIC) is a high-performance network access card built around a network processor. It has a multi-core, multi-threaded network processor architecture and can be used to process the various network protocols or storage protocols separated from the host, which can greatly reduce the CPU load of the host. This way of separating the relevant protocol processing from the host and having the smart network card execute it is called offload.
In the prior art, the host can communicate with the smart network card through a doorbell (DB) mechanism. For stateful service offloading, the path from the smart network card receiving a DB sent by the host to sending a message to the network port usually includes three stages. Stage 1 is DB processing: obtain the context of the connection corresponding to the DB, obtain the working queue element (WQE) of the connection based on the context, and generate a direct memory access (DMA) command; the DMA command carries associated data that indicates how to forward and edit the message subsequently downloaded from the host. Stage 2 is DMA command execution: the message is downloaded and processed along the way, and the size of the downloaded message is usually the maximum transmission unit (MTU). Stage 3 is message editing and DMA command construction.
Stage 1 above is a stateful stage: the processing of a DB strictly depends on the context of the same connection. Only after the core or thread corresponding to the previous DB has finished updating the context can the core or thread corresponding to the next DB start processing according to the updated context. In addition, the message downloaded for each DB is only one MTU in size. As a result, the processing performance of a single service is low.
Summary of the invention
The present application provides a method and device for processing a stateful service, which are used to improve the processing performance of a single service when a network card handles stateful service offloading.
To achieve the above objective, this application adopts the following technical solutions:
In a first aspect, a method for processing a stateful service is provided. The method is applied to a network card that is connected to a host; for example, the network card can be connected to the host via a PCIe bus. The method includes: receiving an identifier of a first connection from the host, where, for example, the identifier can be carried in a first doorbell (DB) that the network card receives from the host, and the first connection is the connection where the stateful service is located; obtaining the context of the first connection according to the identifier of the first connection, where the context of the first connection indicates related information of the first connection and is the context obtained by updating after the previous data block of the first connection was processed; generating a data processing instruction according to the context of the first connection, where the data processing instruction can be used to process data of a first data volume; downloading a data block of the first connection from the host using the sending bandwidth corresponding to the first data volume; processing the data block of the first connection according to the data processing instruction to obtain multiple messages; and sending the multiple messages.
In the above technical solution, when the network card receives the first DB carrying the identifier of the first connection, the network card can generate, according to the context of the first connection, a data processing instruction that can be used to process data of the first data volume, download the data block of the first connection from the host, and process the data block according to the data processing instruction to obtain multiple messages. The network card thus obtains the context of the first connection once based on the first DB and processes and sends multiple messages of the first connection based on that context, which improves the processing performance of a single stateful service.
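
To make the benefit concrete, the following back-of-the-envelope sketch compares how many doorbells (and therefore context acquisitions) are needed to send the same amount of data when each doorbell covers one MTU versus one block of the first data volume. The numbers are assumptions chosen for illustration (a 1500-byte MTU, a 64 KiB first data volume, 1 MiB of data); the patent does not fix these values, and the helper `context_fetches` is hypothetical.

```python
def context_fetches(total_bytes, bytes_per_doorbell):
    """Doorbells (one context acquisition each) needed to cover total_bytes."""
    # Integer ceiling division, avoiding floating point.
    return -(-total_bytes // bytes_per_doorbell)


MTU = 1500                     # assumed MTU in bytes
FIRST_DATA_VOLUME = 64 * 1024  # assumed first data volume in bytes
TOTAL = 1024 * 1024            # assumed amount of data to send

per_mtu = context_fetches(TOTAL, MTU)                  # prior-art style: one MTU per DB
per_block = context_fetches(TOTAL, FIRST_DATA_VOLUME)  # one data block per DB

print(per_mtu, per_block)
```

Under these assumed numbers the serialized, stateful step runs 700 times in the per-MTU case and 16 times in the per-block case.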
In a possible implementation of the first aspect, the data processing instruction includes a direct memory access (DMA) instruction corresponding to the first data volume, and downloading the data block of the first connection from the host includes: downloading the data block of the first connection from the host according to the DMA instruction corresponding to the first data volume. In this implementation, the smart network card can generate the DMA instruction corresponding to the first data volume based on the first DB and download the data block of the first connection from the host based on that DMA instruction, so that the smart network card processes multiple messages of the first connection based on the context of the first connection, which improves the processing performance of a single stateful service.
In a possible implementation of the first aspect, the data processing instruction further includes associated data, and processing the data block of the first connection according to the data processing instruction includes: performing fragmentation processing on the data block of the first connection according to the associated data to obtain multiple data fragments. In this implementation, the smart network card can generate the associated data corresponding to the first data volume based on the first DB and process the data block of the first connection based on that associated data, so that the smart network card processes multiple messages of the first connection based on the context of the first connection, which improves the processing performance of a single stateful service.
In a possible implementation of the first aspect, the fragmentation processing includes at least one of the following: inserting multiple markers, deleting part of the header data, deleting part of the tail data, determining the message header, determining the cyclic redundancy check (CRC) of the payload, and determining the checksum (CS).
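
A minimal sketch of one part of this fragmentation processing, splitting a downloaded data block into payload-sized fragments and determining a CRC for each payload, might look as follows. The names `Fragment` and `fragment_block` are hypothetical, the 1460-byte payload size is an assumed value, and `zlib.crc32` is used only as a readily available stand-in because the text does not say which CRC variant is used.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Fragment:
    index: int      # position of the fragment within the data block
    payload: bytes  # the fragment's share of the data block
    crc: int        # CRC of the payload (CRC-32 here; an assumption)


def fragment_block(block: bytes, max_payload: int) -> list[Fragment]:
    """Split a data block into fragments of at most max_payload bytes each."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    fragments = []
    for index, offset in enumerate(range(0, len(block), max_payload)):
        payload = block[offset:offset + max_payload]
        fragments.append(Fragment(index, payload, zlib.crc32(payload)))
    return fragments


# Example: a 4096-byte block split with an assumed 1460-byte payload size.
block = bytes(range(256)) * 16
fragments = fragment_block(block, 1460)
print([len(f.payload) for f in fragments])
```

Marker insertion, header or tail trimming, and checksum computation would be further per-fragment steps driven by the associated data; they are left out here to keep the sketch short.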
In a possible implementation of the first aspect, the data processing instruction further includes a message editing instruction, and processing the data block of the first connection according to the data processing instruction further includes: editing and encapsulating the multiple data fragments according to the message editing instruction to obtain multiple messages. In this implementation, the smart network card can generate the message editing instruction at the same time as it generates the DMA instruction, which avoids waking up the core/thread corresponding to the first DB again when the multiple data fragments are later edited and encapsulated, thereby further improving processing performance.
In a possible implementation of the first aspect, the message editing instruction includes: an editing instruction for the head slice, an editing instruction for the middle slice, and an editing instruction for the tail slice. This implementation can reduce the complexity of the message editing instruction.
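
The three-part editing instruction can be illustrated with a short sketch that labels each fragment as head, middle, or tail and derives per-message header fields from a single starting state. Everything named here (`Header`, `edit_fragments`) is hypothetical. Advancing the TCP sequence number by the payload length follows normal TCP behaviour; incrementing the IP ID by one per message is an assumption made for the example, since the text only gives the header change rule in outline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Header:
    kind: str   # which editing instruction applies: "head", "middle", or "tail"
    seq: int    # TCP sequence number of the first payload byte
    ip_id: int  # IP identification field (per-message increment is an assumption)


def edit_fragments(payload_lengths, first_seq, first_ip_id):
    """Derive one header per fragment from a single starting header state."""
    headers = []
    seq = first_seq
    last = len(payload_lengths) - 1
    for i, length in enumerate(payload_lengths):
        if i == 0:
            kind = "head"    # a single-fragment block is treated as a head slice
        elif i == last:
            kind = "tail"
        else:
            kind = "middle"
        headers.append(Header(kind, seq % 2**32, (first_ip_id + i) % 2**16))
        seq += length        # the next message starts after this payload
    return headers


for header in edit_fragments([1460, 1460, 1176], first_seq=1000, first_ip_id=7):
    print(header)
```

Because the middle slices all follow the same change rule, one instruction can cover any number of them, which is one way to read the claim that the three-part form keeps the editing instruction simple.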
In a possible implementation of the first aspect, before the data block of the first connection is downloaded from the host using the sending bandwidth corresponding to the first data volume, the method further includes: allocating a sending bandwidth for the first DB from the available bus bandwidth, where the sending bandwidth is equal to the first data volume. In this implementation, the smart network card can download multiple messages of the first connection from the host at one time, so that these messages can subsequently be processed together based on the context of the first connection, which improves the processing performance of a single stateful service.
In a possible implementation of the first aspect, after the data block of the first connection is downloaded from the host using the sending bandwidth corresponding to the first data volume, the method further includes: adjusting the sending bandwidth according to the difference between the first data volume and the actual data volume of the data block of the first connection. This implementation can improve the flexibility of bandwidth allocation and the bandwidth utilization of the smart network card.
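
One way to picture the allocate-then-adjust behaviour is a simple credit model: bandwidth equal to the first data volume is reserved before the download, and the unused part is returned once the actual size of the data block is known. This is a sketch under assumptions, not the bandwidth allocation node's real algorithm; the class `BandwidthNode`, its byte-credit accounting, and the example numbers are all hypothetical.

```python
class BandwidthNode:
    """Toy bandwidth allocation node that tracks available credit in bytes."""

    def __init__(self, available: int):
        self.available = available

    def allocate(self, first_data_volume: int) -> bool:
        """Reserve sending bandwidth equal to the first data volume, if possible."""
        if first_data_volume > self.available:
            return False
        self.available -= first_data_volume
        return True

    def adjust(self, first_data_volume: int, actual_volume: int) -> int:
        """Return the unused reservation once the actual block size is known."""
        difference = first_data_volume - actual_volume
        self.available += difference
        return difference


node = BandwidthNode(available=100_000)
node.allocate(65_536)                   # reserve for a 64 KiB first data volume
returned = node.adjust(65_536, 60_000)  # the block turned out to be 60,000 bytes
print(returned, node.available)
```

Returning the difference right after the download keeps the reservation from starving other doorbells when a data block is smaller than the first data volume.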
In a second aspect, a device for processing a stateful service is provided. The device is applied to a network card that is connected to a host; for example, the network card is connected to the host via a PCIe bus. The device includes: an acquiring unit, configured to receive an identifier of a first connection from the host, where, for example, the identifier can be carried in a first doorbell (DB) that the network card receives from the host, and the first connection is the connection where the stateful service is located; the acquiring unit is further configured to obtain the context of the first connection according to the identifier of the first connection, where the context of the first connection indicates related information of the first connection and is the context obtained by updating after the previous data block of the first connection was processed; a processing unit, configured to generate a data processing instruction according to the context of the first connection, where the data processing instruction can be used to process data of a first data volume; the acquiring unit is further configured to download a data block of the first connection from the host using the sending bandwidth corresponding to the first data volume; the processing unit is further configured to process the data block of the first connection according to the data processing instruction to obtain multiple messages; and a sending unit, configured to send the multiple messages.
In a possible implementation of the second aspect, the data processing instruction includes a direct memory access (DMA) instruction corresponding to the first data volume, and the acquiring unit is further configured to: download the data block of the first connection from the host according to the DMA instruction corresponding to the first data volume.
In a possible implementation of the second aspect, the data processing instruction further includes associated data, and the processing unit is further configured to: perform fragmentation processing on the data block of the first connection according to the associated data to obtain multiple data fragments.
In a possible implementation of the second aspect, the fragmentation processing includes at least one of the following: inserting multiple markers, deleting part of the header data, deleting part of the tail data, determining the message header, determining the cyclic redundancy check (CRC) of the payload, and determining the checksum (CS).
In a possible implementation of the second aspect, the data processing instruction further includes a message editing instruction, and the processing unit is further configured to: edit and encapsulate the multiple data fragments according to the message editing instruction to obtain multiple messages. Optionally, the message editing instruction includes: an editing instruction for the head slice, an editing instruction for the middle slice, and an editing instruction for the tail slice.
In a possible implementation of the second aspect, the device further includes: a bandwidth allocation unit, configured to allocate a sending bandwidth for the first DB from the available bus bandwidth, where the sending bandwidth is equal to the first data volume.
In a possible implementation of the second aspect, the bandwidth allocation unit is further configured to: adjust the sending bandwidth according to the difference between the first data volume and the actual data volume of the data block of the first connection.
In yet another aspect of the present application, a device for processing a stateful service is provided. The device is a network card or a chip built into a network card, and includes a memory and a processor coupled to the memory. The memory stores code and data, and the processor runs the code in the memory so that the device executes the stateful service processing method provided by the first aspect or any possible implementation of the first aspect.
In yet another aspect of the present application, a communication system is provided. The communication system includes a network card and a host, and the network card is connected to the host through a bus; the network card is the network card provided in any one of the above aspects and is used to execute the stateful service processing method provided by the first aspect or any possible implementation of the first aspect.
In yet another aspect of the present application, a computer-readable storage medium is provided. The computer-readable storage medium stores instructions which, when run on a computer, cause the computer to execute the stateful service processing method provided by the first aspect or any possible implementation of the first aspect.
In yet another aspect of the present application, a computer program product is provided. The computer program product includes computer-executable instructions stored in a computer-readable storage medium; at least one processor of a device can read the computer-executable instructions from the computer-readable storage medium, and the at least one processor executes the computer-executable instructions so that the device executes the stateful service processing method provided by the first aspect or any possible implementation of the first aspect.
It can be understood that any of the stateful service processing devices, computer storage media, or computer program products provided above is used to execute the corresponding method provided above. Therefore, for the beneficial effects they can achieve, refer to the beneficial effects of the corresponding method provided above; details are not repeated here.
Description of the drawings
FIG. 1 is a schematic structural diagram of a communication system provided by an embodiment of this application;
FIG. 2 is a schematic flowchart of a method for processing a stateful service provided by an embodiment of this application;
FIG. 3 is a schematic diagram of fragmentation processing provided by an embodiment of this application;
FIG. 4 is a schematic structural diagram of a smart network card provided by an embodiment of this application;
FIG. 5 is a schematic diagram of multi-message processing provided by an embodiment of this application;
FIG. 6 is a schematic structural diagram of a device for processing a stateful service provided by an embodiment of this application;
FIG. 7 is a schematic structural diagram of another device for processing a stateful service provided by an embodiment of this application.
具体实施方式Detailed ways
In this application, "at least one" means one or more, and "a plurality of" means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist. For example, "A and/or B" may indicate: A alone, both A and B, or B alone, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following items" or a similar expression refers to any combination of those items, including a single item or any combination of multiple items. For example, at least one of a, b, or c may indicate: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or multiple. In addition, the embodiments of this application use terms such as "first" and "second" to distinguish between identical or similar items whose functions and effects are basically the same. For example, a first threshold and a second threshold are named only to distinguish different thresholds and do not imply any order. A person skilled in the art will understand that terms such as "first" and "second" do not limit quantity or execution order.
It should be noted that, in this application, words such as "exemplary" or "for example" are used to indicate an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" in this application should not be construed as preferred over, or more advantageous than, other embodiments or designs. Rather, these words are intended to present the related concept in a concrete manner.
FIG. 1 is a schematic structural diagram of a communication system provided by an embodiment of this application. The communication system includes a host and a network interface card (NIC), which are connected through a bus. For example, the NIC is a smart network interface card (smart NIC) connected to the host through a Peripheral Component Interconnect Express (PCIe) bus. Optionally, the communication system may include one or more hosts, each of which may be connected to the smart NIC. The embodiments below are described using a smart NIC as an example.
The host is provided with multiple virtual machines (VMs). Each VM can run one or more virtual functions (VFs), and these VFs may correspond to different functions. Each VF may correspond to one or more queues, through which the input and output mechanisms of the VF are implemented; the queues may include send queues and receive queues. The smart NIC can be used to process various network protocols or storage protocols separated from the host, which is also referred to as protocol offload. For example, network offloads may include: virtualization I/O (Virt IO) offload; single-root I/O virtualization (SR-IOV) offload; User Datagram Protocol (UDP)/Transmission Control Protocol (TCP)/Internet Protocol (IP) checksum (CS) offload; receive side scaling (RSS) offload, TCP segmentation offload (TSO), and large receive offload (LRO); virtual extensible local area network (VXLAN)/generic network virtualization encapsulation (Geneve) offload; stateful Open vSwitch (OVS) offload; IP security (IPsec) offload; TCP offload engine (TOE) offload; and RDMA over Converged Ethernet version 2 (RoCEv2) offload. Storage offloads may include: erasure coding (EC) offload; virtual block service (VBS) offload; T10 data integrity field (DIF)/data integrity extension (DIX) offload; Fibre Channel (FC) offload; Non-Volatile Memory Express (NVMe) offload; and NVMe over Fabrics (NoF) offload.
In addition, when the host needs to send packets to the Ethernet (Eth) through the smart NIC, the host can communicate with the smart NIC through a doorbell (DB) mechanism; the smart NIC processes the relevant packets and sends the processed packets to the Ethernet. In one possible embodiment, the smart NIC may include: a transmit bandwidth provision (TX bandwidth provision) module, a receive bandwidth provision (RX bandwidth provision) module, a transmit processing (TX processing) module, a receive processing (RX processing) module, a scheduler, a processor pool including multiple processor cores, a traffic manager, a transmit port (TX port) for sending packets to the Ethernet, and a receive virtual machine (RX VM) for sending packets to the host. Optionally, the processor pool in the smart NIC may be an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or the like; this is not specifically limited in the embodiments of this application.
FIG. 2 is a schematic flowchart of a method for processing a stateful service provided by an embodiment of this application. The method can be applied to the communication system shown in FIG. 1, which includes a host and a smart NIC, and includes the following steps.
S201: The smart NIC receives an identifier of a first connection from the host, where the first connection is the connection on which the stateful service runs.
A connection is a logical link established by a session between two endpoints. For example, the connection may be a TCP connection, a UDP connection, or a RoCE queue pair (QP) connection. Multiple connections may be established between the host and the network (Eth) through the smart NIC; the first connection may be any one of them, and the identifier of the first connection is used to identify the first connection. Optionally, the identifier of the first connection may be carried in a first DB. For example, the smart NIC receives the first DB from the host, and the first DB carries the identifier of the first connection of the host.
In addition, the stateful service described above is the counterpart of a stateless service. In a stateless service, a single packet can be processed based on its own header alone, and packets are not related to one another. In a stateful service, a single packet alone cannot determine how it should be processed: the processing behavior depends on the state of the "connection" to which the packet belongs as well as on the information in the packet itself. In other words, the packets of a stateful service are related to one another. The state information of the "connection" includes but is not limited to: the sequence number of the next expected packet, the acknowledgment (ACK) sequence number, receive window updates, and statistics. For ease of understanding, the following uses firewalls to illustrate the difference. The firewalls referred to here may include a firewall proper as well as firewalls at other levels, such as security groups in OpenStack. A stateless firewall filters or blocks network packets based on static values such as addresses, ports, and protocols; that is, a stateless firewall does not care about the current state of the network connection. A stateful firewall can distinguish the state of a network connection. For example, it can identify a TCP connection and the stage that the TCP connection is currently in. In other words, a stateful firewall filters or blocks network packets based on the connection state in addition to static values.
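The contrast between the two kinds of firewall can be sketched in a few lines of Python. This is only an illustrative model, not part of the described method: the `Packet` fields, the `StatefulFilter` class, and the simplified two-step handshake tracking are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    # Hypothetical minimal packet model used only for this sketch.
    src: str
    dst: str
    dst_port: int
    flags: frozenset  # e.g. frozenset({"SYN"}); empty for plain data


def stateless_allow(pkt, allowed_ports):
    """Stateless rule: decide from static header values only."""
    return pkt.dst_port in allowed_ports


class StatefulFilter:
    """Stateful rule: static values plus the tracked connection state."""

    def __init__(self, allowed_ports):
        self.allowed_ports = allowed_ports
        self.state = {}  # (src, dst, port) -> "SYN_SEEN" | "ESTABLISHED"

    def allow(self, pkt):
        if pkt.dst_port not in self.allowed_ports:
            return False
        key = (pkt.src, pkt.dst, pkt.dst_port)
        current = self.state.get(key)
        if pkt.flags == {"SYN"}:
            self.state[key] = "SYN_SEEN"
            return True
        if current == "SYN_SEEN" and pkt.flags == {"ACK"}:
            self.state[key] = "ESTABLISHED"
            return True
        # Any other packet passes only on an established connection.
        return current == "ESTABLISHED"
```

With this model, a data packet that arrives before any handshake passes the stateless check but is rejected by the stateful one, because its fate depends on the state of the connection rather than on its header alone.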
When the host needs to send packets of the first connection to the network through the smart NIC, the host can send the first DB carrying the identifier of the first connection to the smart NIC. Specifically, the first DB may be sent by a first VF running on a first VM in the host, where the first VF is a VF used to perform tasks related to the first connection. When the first VF needs to send packets of the first connection to the network through the smart NIC, the first VF can post a corresponding work queue element (WQE) to a first send queue corresponding to the first VF. The first send queue is a queue used to send packets of the first connection. The WQE describes, among other things, the size of the data to be sent and its storage address in host memory; the WQE may also be called a descriptor. The first VF can then send the first DB to the smart NIC, so that the smart NIC receives the first DB carrying the identifier of the first connection.
Optionally, the first DB may also carry an identifier of the first VF, which uniquely identifies the first VF among the multiple VFs in the host. When the first DB carries only the identifier of the first connection, that identifier uniquely identifies the first connection (the one corresponding to the first VF) across the multiple VFs of the host. When the first DB carries both the identifier of the first connection and the identifier of the first VF, the identifier of the first VF uniquely identifies the first VF among the multiple VFs in the host, and the identifier of the first connection uniquely identifies the first connection among the multiple connections corresponding to the first VF.
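The host-side sequence in S201 (post a WQE to the send queue, then ring the doorbell) can be modeled roughly as follows. This is a minimal sketch under stated assumptions: the class names, the field layout of `WQE` and `Doorbell`, and the `post_send` helper are hypothetical and are not defined by this application.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class WQE:
    # Descriptor: what to send and where it lives in host memory.
    length: int     # size of the data to send, in bytes
    host_addr: int  # storage address of the data in host memory


@dataclass
class Doorbell:
    conn_id: int                 # identifier of the connection
    vf_id: Optional[int] = None  # optional identifier of the VF


class VirtualFunction:
    """Hypothetical model of a VF with one send queue per connection."""

    def __init__(self, vf_id):
        self.vf_id = vf_id
        self.send_queues = {}  # conn_id -> deque of WQE

    def post_send(self, conn_id, length, host_addr):
        # Step 1: post the WQE to the send queue of this connection.
        queue = self.send_queues.setdefault(conn_id, deque())
        queue.append(WQE(length, host_addr))
        # Step 2: produce the doorbell that would be sent to the smart NIC.
        return Doorbell(conn_id=conn_id, vf_id=self.vf_id)
```

In this model the doorbell carries both identifiers, which corresponds to the case where the connection identifier only needs to be unique within one VF.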
S202: The smart NIC obtains the context of the first connection according to the identifier of the first connection.
Specifically, the cache of the smart NIC may or may not already hold the context of the first connection. If the smart NIC does not hold the context, it can obtain the context from the host according to the identifier of the first connection; if it does, it can read the context from its own cache. The context of the first connection records information about the first connection, such as the address of the first send queue in host memory, the current send position in the first send queue, the send window, and the sequence numbers of the packets sent. In addition, the context of the first connection is the context as updated after the previous data block of the first connection was processed.
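The lookup in S202 is a cache-with-fallback pattern, sketched below. The `ContextCache` class, the dictionary-based context, and the `fetch_from_host` callback are assumptions for illustration; on a real smart NIC the fetch would be a DMA read of host memory.

```python
class ContextCache:
    """Hypothetical on-NIC cache of per-connection contexts."""

    def __init__(self, fetch_from_host):
        self._cache = {}               # conn_id -> context dict
        self._fetch = fetch_from_host  # callback modeling a read from the host
        self.host_fetches = 0          # how many times the host was consulted

    def get(self, conn_id):
        # Cache miss: obtain the context from the host by connection id.
        if conn_id not in self._cache:
            self._cache[conn_id] = self._fetch(conn_id)
            self.host_fetches += 1
        # Cache hit (or just filled): return the cached context.
        return self._cache[conn_id]

    def update(self, conn_id, **fields):
        # Record the state left behind by the data block just processed.
        self.get(conn_id).update(fields)
```

The `update` method reflects the point that the context seen by one data block is the context as updated after the previous data block of the same connection.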
S203: The smart NIC generates a data processing instruction according to the context of the first connection, where the data processing instruction can be used to process data of a first data amount.
The first data amount may be set in advance and is greater than the maximum transmission unit (MTU). Its value may be fixed or variable and can be set by a person skilled in the art based on experience or the actual situation. For example, the first data amount may be 64 KB; this is not specifically limited in the embodiments of this application.
Specifically, the smart NIC can obtain a WQE from the first send queue of the host according to the context of the first connection, for example by using the address of the first send queue indicated by the context. Based on the information indicated by the WQE, such as the size of the data to be sent (for example, the first data amount) and its storage address in host memory, the smart NIC then generates the data processing instruction corresponding to the first data amount. Optionally, the data processing instruction may include at least one of the following for the first data amount: a direct memory access (DMA) command, path-associated data, and a packet edit (PE) instruction. The path-associated data indicates fragmentation information. The packet edit instruction may be a three-stage PE instruction, which may include an edit instruction for the head slice, an edit instruction for at least one middle slice, and an edit instruction for the tail slice.
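A rough sketch of S203 is shown below: the fields of a WQE and the connection context are combined into one instruction covering a whole data block, and the context is then advanced. All names (`DataProcessingInstruction`, `generate_instruction`, the context keys) and the default MSS of 1460 bytes are assumptions for illustration, and the three-stage PE instruction is reduced here to a starting sequence number.

```python
from dataclasses import dataclass

FIRST_DATA_AMOUNT = 64 * 1024  # example value from the text (64 KB)


@dataclass
class DataProcessingInstruction:
    dma_addr: int   # DMA command: where the data block lives in host memory
    dma_len: int    # DMA command: how many bytes to download
    mss: int        # path-associated data: fragment size
    start_seq: int  # packet edit input: sequence number of the first packet


def generate_instruction(ctx, wqe_addr, wqe_len, mss=1460):
    """Build one instruction for a whole data block, then update the context."""
    if wqe_len > FIRST_DATA_AMOUNT:
        raise ValueError("data block exceeds the first data amount")
    instr = DataProcessingInstruction(wqe_addr, wqe_len, mss, ctx["next_seq"])
    # Update the context so the next data block starts where this one ends.
    ctx["sq_pos"] += 1
    ctx["next_seq"] = (ctx["next_seq"] + wqe_len) % 2**32
    return instr
```

Because the instruction covers the whole block, the context is read and updated once per data block rather than once per packet.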
Further, after performing step S203, the smart NIC may update the context of the first connection and store the updated context in its buffer, or it may send the updated context to the host.
S204: The smart NIC downloads the data block of the first connection from the host using the transmit bandwidth corresponding to the first data amount.
The transmit bandwidth corresponding to the first data amount may be equal to the first data amount. Before downloading the data block of the first connection from the host, the smart NIC can allocate this transmit bandwidth to the first DB out of the available bus bandwidth (that is, the available bandwidth of the PCIe bus). When the smart NIC needs to download the data block of the first connection from the host, it uses this transmit bandwidth to do so. The data block of the first connection may include the data of multiple packets. Optionally, the data processing instruction may include a DMA command that carries the storage address of the data block in host memory, so that the smart NIC can download the data block from the host by executing the DMA command.
In addition, the actual data amount of the data block of the first connection may be equal to or less than the first data amount. When the actual data amount is less than the first data amount, the smart NIC can reduce the transmit bandwidth by the difference between the two, thereby saving available bandwidth on the PCIe bus.
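The reserve-then-shrink accounting described for S204 can be illustrated with a small model. The `BandwidthNode` class and its byte-based budget are assumptions for this sketch; the application does not specify how the bandwidth allocation node keeps its books.

```python
class BandwidthNode:
    """Hypothetical bandwidth allocation node with a byte budget."""

    def __init__(self, available):
        self.available = available  # unreserved bus bandwidth, in bytes
        self.reserved = {}          # doorbell id -> reserved bytes

    def reserve(self, db_id, amount):
        # Reserve the full first data amount before the download starts.
        if amount > self.available:
            raise ValueError("not enough bus bandwidth available")
        self.available -= amount
        self.reserved[db_id] = amount

    def shrink(self, db_id, actual):
        # Give back the difference when the block is smaller than reserved.
        surplus = self.reserved[db_id] - actual
        if surplus > 0:
            self.reserved[db_id] = actual
            self.available += surplus
        return max(surplus, 0)
```

For example, reserving 65536 bytes for a doorbell whose data block turns out to hold 40000 bytes returns 25536 bytes to the pool.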
S205: The smart NIC processes the data block of the first connection according to the data processing instruction to obtain multiple packets.
The data processing instruction may include path-associated data, which guides the smart NIC in fragmenting the data according to the maximum segment size (MSS), calculating the checksum (CS), and so on. For example, the path-associated data may include information indicating the size of each data fragment (that is, the MSS), information indicating that the checksum of each data fragment is to be calculated, information indicating the interval or length at which markers are to be inserted, and information indicating that part of the data is to be deleted. The smart NIC can therefore fragment the data block of the first connection according to the path-associated data to obtain multiple data fragments. The fragmentation processing may include at least one of the following: inserting multiple markers, deleting part of the head data, deleting part of the tail data, determining the packet header, determining the cyclic redundancy check (CRC) of the payload, and determining the CS. Optionally, a marker may be a DIF, a DIX, or the like. The deleted head data may be redundant data, for example data that has already been sent. The deleted tail data may also be redundant data, for example data that does not fill the payload of one packet.
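The MSS-based fragmentation and per-fragment checksum can be sketched as follows. This is a simplified illustration: marker insertion and head/tail trimming are omitted, and the use of the 16-bit one's-complement Internet checksum (RFC 1071) for the CS is an assumption, since the text does not say which checksum algorithm is meant.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(int.from_bytes(data[i:i + 2], "big")
                for i in range(0, len(data), 2))
    while total >> 16:   # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def fragment(block: bytes, mss: int):
    """Split a data block into fragments of at most mss bytes each."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [block[i:i + mss] for i in range(0, len(block), mss)]


def fragment_with_checksums(block: bytes, mss: int):
    """Return (fragment, checksum) pairs for one data block."""
    return [(frag, internet_checksum(frag)) for frag in fragment(block, mss)]
```

A 4096-byte block with an MSS of 1460 bytes yields fragments of 1460, 1460, and 1176 bytes; only the last fragment is shorter than the MSS.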
As an example, as shown in FIG. 3, suppose the data block of the first connection is an Internet Small Computer System Interface (iSCSI) protocol data unit (PDU) from the CPI interface, and the iSCSI PDU includes an iSCSI header (HDR) and an iSCSI PDU payload (also called iSCSI PDU data). The smart NIC can insert multiple DIFs into the iSCSI PDU payload, determine that the packet header is the iSCSI HDR, and determine the CRC of the iSCSI HDR (shown as HDR CRC in FIG. 3) and the CRC of the payload (shown as data CRC in FIG. 3). The iSCSI PDU payload can then be split into multiple data fragments according to the MSS. PAD in FIG. 3 is short for padding and denotes the padded part of the data, which can be indicated by the path-associated data.
In addition, the data processing instruction may include a packet edit instruction, according to which the smart NIC can edit and encapsulate the data fragments obtained by the fragmentation processing to obtain multiple packets. Specifically, when the packet edit instruction is a three-stage PE instruction, the smart NIC can encapsulate the first data fragment (also called the head fragment) according to the edit instruction for the head slice, encapsulate the second through second-to-last data fragments (also called middle fragments) according to the edit instruction for the at least one middle slice, and encapsulate the last data fragment (also called the tail fragment) according to the edit instruction for the tail slice, thereby obtaining multiple packets. Optionally, the edit instruction for the at least one middle slice can also indicate how the packet header changes from the second to the second-to-last data fragment during encapsulation. For example, the TCP sequence number (SN) increases from one fragment to the next, while the IP ID stays the same.
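The three-stage editing step can be sketched as a loop that tags each fragment as head, middle, or tail and derives its header fields from a rule instead of per-packet state. The dictionary-based packet layout and the function name are assumptions for illustration; advancing the TCP sequence number by the payload length follows standard TCP behavior, whereas the text only states that the sequence number increases.

```python
def edit_and_encapsulate(fragments, start_seq, ip_id):
    """Apply head/middle/tail editing rules to a list of payload fragments."""
    packets = []
    seq = start_seq
    last = len(fragments) - 1
    for i, payload in enumerate(fragments):
        if i == 0:
            stage = "head"    # first fragment (also used for a lone fragment)
        elif i == last:
            stage = "tail"    # last fragment
        else:
            stage = "middle"  # second through second-to-last fragments
        packets.append({
            "stage": stage,
            "tcp_seq": seq % 2**32,  # sequence number increases per fragment
            "ip_id": ip_id,          # IP ID stays the same across fragments
            "payload": payload,
        })
        seq += len(payload)
    return packets
```

Because the middle-stage rule describes how the header changes from one fragment to the next, a single instruction is enough to cover any number of middle fragments.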
S206: The smart NIC sends the multiple packets.
After the smart NIC has produced the multiple packets by editing, it can send them to the network through a network port. Optionally, after sending the multiple packets to the network, the smart NIC may also send feedback information to the host to indicate that the multiple packets were sent successfully.
For ease of understanding, the following uses the smart NIC structure shown in FIG. 4 as an example to illustrate the solution provided by the embodiments of this application.
The TX bandwidth provision module of the smart NIC may include multiple DB queues (DB Q), a queue mapping (QM) module, bandwidth allocation nodes (shown as vNIC in FIG. 4), and a round-robin (RR) scheduler (shown as RR in FIG. 4). FIG. 4 uses the case in which the multiple DB queues correspond to multiple hosts, that is, the smart NIC is connected to multiple hosts (for example, H0 to H3).
Specifically, after the host sends the first DB to the smart NIC, the first DB is enqueued in one of the DB queues according to certain rules and mapped by the queue mapping module to the corresponding bandwidth allocation node, which completes the allocation of the transmit bandwidth. The RR scheduler then allocates to the first DB a processor core, or a thread of a processor core, from the processor pool; the allocated core or thread is referred to below as the core/thread. The core/thread can obtain the context of the first connection based on the identifier of the first connection carried in the first DB, generate the DMA command, the path-associated data, and the packet edit instruction for the first data amount according to that context, and download the data block of the first connection from the host by means of the DMA command. Optionally, the core/thread can also send the difference between the first data amount and the actual data amount of the data block to the corresponding bandwidth allocation node, so that the node adjusts the transmit bandwidth of the first DB. The TX processing module can fragment the data block of the first connection according to the path-associated data, and the resulting data fragments can be stored in the memory of the smart NIC. The traffic manager can schedule the data fragments from the memory and edit and encapsulate them according to the packet edit instruction; the resulting packets can be sent to the network through the TX port.
Further, the context of the first connection can be decomposed into multiple sub-contexts, so that different processor cores or threads of the smart NIC can concurrently process different DBs according to different sub-contexts, which improves the packet throughput of the first connection. As an example, as shown in FIG. 5, the context of the first connection can be decomposed into four sub-contexts, denoted S0, S1, S2, and S3, so that the smart NIC can process four DBs of the first connection at the same time, denoted DB1, DB2, DB3, and DB4. Assume that the scheduler allocates processor cores Core1, Core2, Core3, and Core4 to the four DBs. After Core1 finishes processing DB1S0, Core1 can continue with DB1S1 while Core2 processes DB2S0. After Core1 finishes DB1S1, Core1 can continue with DB1S2; by then Core2 has finished DB2S0 and can start DB2S1, and Core3 can start DB3S0; and so on. Further, if the first connection also has a DB5, then after Core1 finishes DB1S3, Core1 can start DB5S0 while Core2 starts DB2S3, Core3 starts DB3S2, and Core4 starts DB4S1. In this way, all four cores of the smart NIC take part in the context processing of the first connection concurrently, each core handling one data block of the first connection, which improves the packet throughput of the first connection.
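The staggered schedule described for FIG. 5 can be reproduced with a short simulation. This is a sketch under simplifying assumptions: every sub-context takes exactly one time step, doorbell n starts one step after doorbell n-1, and the function name and the string labels are chosen only for this example.

```python
def pipeline_schedule(num_dbs, num_subctx=4, num_cores=4):
    """Return {time_step: {core_name: "DB<n>S<k>"}} for a staggered pipeline."""
    if num_subctx > num_cores:
        # With fewer cores than sub-contexts, a core would still be busy
        # when its next doorbell is due, so this simple model does not apply.
        raise ValueError("this sketch assumes num_subctx <= num_cores")
    steps = {}
    for db in range(num_dbs):
        core = f"Core{db % num_cores + 1}"  # cores are reused round-robin
        for sub in range(num_subctx):
            # Doorbell db starts at step db and walks through S0..S(k-1),
            # so at any step each sub-context is used by at most one doorbell.
            steps.setdefault(db + sub, {})[core] = f"DB{db + 1}S{sub}"
    return steps
```

Calling `pipeline_schedule(5)` and inspecting step 4 (counting from step 0) shows Core1 on DB5S0, Core2 on DB2S3, Core3 on DB3S2, and Core4 on DB4S1, which matches the example in the text: all four cores are busy on the same connection, yet no two of them touch the same sub-context at the same time.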
In the embodiments of this application, when the smart NIC receives the first DB carrying the identifier of the first connection, it can generate a data processing instruction for the first data amount according to the context of the first connection, download the data block of the first connection from the host, and process the data block according to the data processing instruction to obtain multiple packets. The smart NIC therefore obtains the context of the first connection once per DB and, based on that context, processes and sends multiple packets of the first connection, which improves the processing performance of a single stateful service. In addition, the smart NIC can generate the packet edit instruction at the same time as the DMA command. This avoids waking up the core/thread corresponding to the first DB again when the data fragments are later edited and encapsulated, which further improves processing performance.
The foregoing mainly describes the solutions provided in the embodiments of this application from the perspective of interaction between devices. It can be understood that, to implement the foregoing functions, each device, such as the host and the smart network card, includes corresponding hardware structures and/or software modules for performing each function. A person skilled in the art should readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented by hardware or by a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of this application.
In the embodiments of this application, the smart network card may be divided into functional modules according to the foregoing method examples. For example, each functional module may be divided to correspond to one function, or two or more functions may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division of modules in the embodiments of this application is illustrative and is merely a division by logical function; other divisions are possible in actual implementation. The following description uses the case in which each functional module corresponds to one function as an example.
In the case of using integrated units, FIG. 6 shows a possible schematic structural diagram of the stateful service processing apparatus involved in the foregoing embodiments. The apparatus may be a smart network card or a chip built into a smart network card, and includes an obtaining unit 601, a processing unit 602, and a sending unit 603. The obtaining unit 601 is configured to support the apparatus in performing S201, S202, and S204 in the foregoing method embodiment; the processing unit 602 is configured to support the apparatus in performing S203 and S205 in the foregoing method embodiment; and the sending unit 603 is configured to support the apparatus in performing S206 in the foregoing method embodiment. Further, the apparatus may also include a bandwidth allocation unit 604, configured to support the apparatus in performing the step of allocating the sending bandwidth in the foregoing method embodiment.
In practical applications, the processing unit 602 may be an integration of the TX processing module, the scheduler, the processor pool, and the traffic manager in the smart network card described in the foregoing method embodiment; the bandwidth allocation unit 604 may be the TX bandwidth allocation module in the smart network card described in the foregoing method embodiment; and the sending unit 603 may be the TX port in the smart network card described in the foregoing method embodiment.
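As an illustration of the role of the bandwidth allocation unit 604, the sketch below models the two bandwidth steps recited in claims 8 and 9: reserving a sending bandwidth equal to the first data amount for a DB, and later adjusting it by the difference between the first data amount and the actual size of the downloaded data block. The class `TxBandwidthAllocator`, its byte-credit model, and the choice to return the unused difference to the pool are assumptions made for this example; replenishment of the pool over time is omitted.

```python
class TxBandwidthAllocator:
    """Hypothetical byte-credit model of a TX bandwidth allocation module."""

    def __init__(self, available_bytes):
        self.available = available_bytes  # unreserved bus bandwidth, in bytes
        self.reserved = {}                # DB id -> bytes reserved for it

    def allocate(self, db_id, first_data_amount):
        """Reserve a sending bandwidth equal to the first data amount."""
        if first_data_amount > self.available:
            return False  # not enough bandwidth; the DB has to wait
        self.available -= first_data_amount
        self.reserved[db_id] = first_data_amount
        return True

    def adjust(self, db_id, actual_bytes):
        """Settle the reservation once the real block size is known.

        The difference (reserved minus actual) is credited back to the pool;
        a negative difference means the block was larger than estimated.
        """
        difference = self.reserved.pop(db_id) - actual_bytes
        self.available += difference
        return difference
```

For example, reserving 600 bytes from a 1000-byte pool and then downloading a block of only 450 bytes returns 150 bytes to the pool, which can let a waiting DB proceed.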
It should be noted that all related content of the steps in the foregoing method embodiment may be cited in the functional descriptions of the corresponding functional modules. For details, refer to the description in the foregoing method embodiment; it is not repeated here.
On the basis of hardware implementation, the processing unit 602 in this application may be the processor of the apparatus, the obtaining unit 601 may be the receiver of the apparatus, and the sending unit 603 may be the transmitter of the apparatus. The receiver and the transmitter may usually be integrated together as the communication interface of the apparatus.
FIG. 7 is a schematic diagram of a possible logical structure of the stateful service processing apparatus involved in the foregoing embodiments, according to an embodiment of this application. The apparatus may be a smart network card or a chip built into a smart network card, and includes a processor 702 and a communication interface 703. The processor 702 is configured to control and manage actions of the apparatus; for example, the processor 702 is configured to support the apparatus in performing S203 and S205 in the foregoing method embodiment, and/or other processes of the technology described herein. In addition, the apparatus may further include a memory 701 and a bus 704; the processor 702, the communication interface 703, and the memory 701 are connected to each other through the bus 704. The communication interface 703 is configured to support communication of the apparatus, for example, communication with the host or a network. The memory 701 is configured to store program code and data of the apparatus.
The processor 702 may be a central processing unit, a general-purpose processor, a baseband processor, a digital signal processor, an application-specific integrated circuit, a field programmable gate array or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various exemplary logical blocks, modules, and circuits described in connection with the disclosure of this application. The processor may also be a combination that implements computing functions, for example, a combination of one or more microprocessors, or a combination of a digital signal processor and a microprocessor. The bus 704 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus 704 may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in FIG. 7, but this does not mean that there is only one bus or one type of bus.
In yet another aspect of this application, a communication system is provided. The communication system includes a network card and a host, and the network card is connected to the host through a bus. The network card is any network card provided above and is configured to perform the steps of the network card in the foregoing method embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division of the modules or units is merely a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another apparatus, or some features may be ignored or not performed.
The units described as separate parts may or may not be physically separate, and the parts displayed as units may be one physical unit or multiple physical units; that is, they may be located in one place or distributed across multiple places. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a readable storage medium. The readable storage medium may include any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disc. Based on such an understanding, the technical solutions of the embodiments of this application essentially, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product.
In another embodiment of this application, a readable storage medium is further provided. The readable storage medium stores computer-executable instructions; when a device (which may be a single-chip microcomputer, a chip, or the like) or a processor executes the computer-executable instructions, the stateful service processing method provided in the foregoing method embodiments is performed.
In another embodiment of this application, a computer program product is further provided. The computer program product includes computer-executable instructions stored in a computer-readable storage medium. At least one processor of a device can read the computer-executable instructions from the computer-readable storage medium, and the at least one processor executes the computer-executable instructions so that the device performs the stateful service processing method provided in the foregoing method embodiments.
Finally, it should be noted that the foregoing is merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any change or replacement within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (21)

  1. A method for processing a stateful service, characterized in that the method is applied to a network card, the network card is connected to a host, and the method comprises:
    receiving an identifier of a first connection from the host, wherein the first connection is a connection on which a stateful service is located;
    obtaining a context of the first connection according to the identifier of the first connection, wherein the context of the first connection is used to indicate related information of the first connection, and the context of the first connection is the context obtained by updating after a previous data block of the first connection is processed;
    generating a data processing instruction according to the context of the first connection, wherein the data processing instruction is usable for processing data of a first data amount;
    downloading a data block of the first connection from the host by using a sending bandwidth corresponding to the first data amount;
    processing the data block of the first connection according to the data processing instruction to obtain multiple packets; and
    sending the multiple packets.
  2. The method according to claim 1, characterized in that the data processing instruction comprises a direct memory access (DMA) instruction corresponding to the first data amount, and the downloading a data block of the first connection from the host comprises:
    downloading the data block of the first connection from the host according to the DMA instruction corresponding to the first data amount.
  3. The method according to claim 1 or 2, characterized in that the data processing instruction further comprises path-associated data, and the processing the data block of the first connection according to the data processing instruction comprises:
    performing fragmentation processing on the data block of the first connection according to the path-associated data to obtain multiple data fragments.
  4. The method according to claim 3, characterized in that the fragmentation processing comprises at least one of the following:
    inserting multiple markers, deleting part of the header data, deleting part of the tail data, determining a packet header, determining a cyclic redundancy check (CRC) of the payload, and determining a checksum (CS).
  5. The method according to claim 3 or 4, characterized in that the data processing instruction further comprises a packet editing instruction, and the processing the data block of the first connection according to the data processing instruction further comprises:
    editing and encapsulating the multiple data fragments according to the packet editing instruction to obtain the multiple packets.
  6. The method according to claim 5, characterized in that the packet editing instruction comprises: an editing instruction for a head slice, an editing instruction for a middle slice, and an editing instruction for a tail slice.
  7. The method according to any one of claims 1-6, characterized in that the receiving an identifier of a first connection from the host comprises:
    receiving a first doorbell (DB) from the host, wherein the first DB carries the identifier of the first connection in the host.
  8. The method according to claim 7, characterized in that before the downloading a data block of the first connection from the host by using a sending bandwidth corresponding to the first data amount, the method further comprises:
    allocating the sending bandwidth to the first DB from an available bus bandwidth, wherein the sending bandwidth is equal to the first data amount.
  9. The method according to claim 8, characterized in that after the downloading a data block of the first connection from the host by using a sending bandwidth corresponding to the first data amount, the method further comprises:
    adjusting the sending bandwidth according to a difference between the first data amount and an actual data amount of the data block of the first connection.
  10. An apparatus for processing a stateful service, characterized in that the apparatus is applied to a network card, the network card is connected to a host, and the apparatus comprises:
    an obtaining unit, configured to receive an identifier of a first connection from the host, wherein the first connection is a connection on which a stateful service is located;
    the obtaining unit being further configured to obtain a context of the first connection according to the identifier of the first connection, wherein the context of the first connection is used to indicate related information of the first connection, and the context of the first connection is the context obtained by updating after a previous data block of the first connection is processed;
    a processing unit, configured to generate a data processing instruction according to the context of the first connection, wherein the data processing instruction is usable for processing data of a first data amount;
    the obtaining unit being further configured to download a data block of the first connection from the host by using a sending bandwidth corresponding to the first data amount;
    the processing unit being further configured to process the data block of the first connection according to the data processing instruction to obtain multiple packets; and
    a sending unit, configured to send the multiple packets.
  11. The apparatus according to claim 10, characterized in that the data processing instruction comprises a direct memory access (DMA) instruction corresponding to the first data amount, and the obtaining unit is further configured to:
    download the data block of the first connection from the host according to the DMA instruction corresponding to the first data amount.
  12. The apparatus according to claim 10 or 11, characterized in that the data processing instruction further comprises path-associated data, and the processing unit is further configured to:
    perform fragmentation processing on the data block of the first connection according to the path-associated data to obtain multiple data fragments.
  13. The apparatus according to claim 12, characterized in that the fragmentation processing comprises at least one of the following: inserting multiple markers, deleting part of the header data, deleting part of the tail data, determining a packet header, determining a cyclic redundancy check (CRC) of the payload, and determining a checksum (CS).
  14. The apparatus according to claim 12 or 13, characterized in that the data processing instruction further comprises a packet editing instruction, and the processing unit is further configured to:
    edit and encapsulate the multiple data fragments according to the packet editing instruction to obtain the multiple packets.
  15. The apparatus according to claim 14, characterized in that the packet editing instruction comprises: an editing instruction for a head slice, an editing instruction for a middle slice, and an editing instruction for a tail slice.
  16. The apparatus according to any one of claims 10-15, characterized in that the obtaining unit is further configured to:
    receive a first doorbell (DB) from the host, wherein the first DB carries the identifier of the first connection in the host.
  17. The apparatus according to claim 16, characterized in that the apparatus further comprises:
    a bandwidth allocation unit, configured to allocate the sending bandwidth to the first DB from an available bus bandwidth, wherein the sending bandwidth is equal to the first data amount.
  18. The apparatus according to claim 17, characterized in that the bandwidth allocation unit is further configured to:
    adjust the sending bandwidth according to a difference between the first data amount and an actual data amount of the data block of the first connection.
  19. An apparatus for processing a stateful service, characterized in that the apparatus is a network card or a chip built into a network card, and the apparatus comprises a memory and a processor coupled to the memory, wherein the memory stores code and data, and the processor runs the code in the memory so that the apparatus performs the method for processing a stateful service according to any one of claims 1-9.
  20. A computer-readable storage medium, wherein the computer-readable storage medium stores instructions that, when run on a computer, cause the computer to perform the method for processing a stateful service according to any one of claims 1-9.
  21. A computer program product, characterized in that when the computer program product runs on a device, the device is caused to perform the method for processing a stateful service according to any one of claims 1-9.
PCT/CN2020/085418 2020-04-17 2020-04-17 Method and device for processing stateful service WO2021208092A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2020/085418 WO2021208092A1 (en) 2020-04-17 2020-04-17 Method and device for processing stateful service
CN202080099188.3A CN115349121A (en) 2020-04-17 2020-04-17 Method and device for processing stateful service

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/085418 WO2021208092A1 (en) 2020-04-17 2020-04-17 Method and device for processing stateful service

Publications (1)

Publication Number Publication Date
WO2021208092A1 true WO2021208092A1 (en) 2021-10-21

Family

ID=78083826

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/085418 WO2021208092A1 (en) 2020-04-17 2020-04-17 Method and device for processing stateful service

Country Status (2)

Country Link
CN (1) CN115349121A (en)
WO (1) WO2021208092A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114301575A (en) * 2021-12-21 2022-04-08 阿里巴巴(中国)有限公司 Data processing method, system, device and medium
CN114553635A (en) * 2022-02-18 2022-05-27 珠海星云智联科技有限公司 Data processing method, data interaction method and product in DPU network equipment
CN115052055A (en) * 2022-08-17 2022-09-13 北京左江科技股份有限公司 Network message checksum unloading method based on FPGA
CN117857649A (en) * 2024-03-07 2024-04-09 西安众望能源科技有限公司 Transmission method and system for transmission control protocol data packet

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116170770B (en) * 2023-04-25 2024-04-12 杭州镭湖科技有限公司 Automatic sensing method and device for realizing sensor based on Internet of things

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070255802A1 (en) * 2006-05-01 2007-11-01 Eliezer Aloni Method and system for transparent TCP offload (TTO) with a user space library
US20080155154A1 (en) * 2006-12-21 2008-06-26 Yuval Kenan Method and System for Coalescing Task Completions
US20080270599A1 (en) * 2007-04-30 2008-10-30 Eliezer Tamir Method and system for configuring a plurality of network interfaces that share a physical interface
CN103248467A (en) * 2013-05-14 2013-08-14 中国人民解放军国防科学技术大学 In-chip connection management-based RDMA communication method
US9378049B1 (en) * 2015-02-12 2016-06-28 Amazon Technologies, Inc. Servicing I/O requests in an I/O adapter device

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114301575A (en) * 2021-12-21 2022-04-08 阿里巴巴(中国)有限公司 Data processing method, system, device and medium
CN114301575B (en) * 2021-12-21 2024-03-29 阿里巴巴(中国)有限公司 Data processing method, system, equipment and medium
CN114553635A (en) * 2022-02-18 2022-05-27 珠海星云智联科技有限公司 Data processing method, data interaction method and product in DPU network equipment
CN115052055A (en) * 2022-08-17 2022-09-13 北京左江科技股份有限公司 Network message checksum unloading method based on FPGA
CN115052055B (en) * 2022-08-17 2022-11-11 北京左江科技股份有限公司 Network message checksum unloading method based on FPGA
CN117857649A (en) * 2024-03-07 2024-04-09 西安众望能源科技有限公司 Transmission method and system for transmission control protocol data packet
CN117857649B (en) * 2024-03-07 2024-04-30 西安众望能源科技有限公司 Transmission method and system for transmission control protocol data packet

Also Published As

Publication number Publication date
CN115349121A (en) 2022-11-15

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20931186

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20931186

Country of ref document: EP

Kind code of ref document: A1