WO2024041140A1 - Data processing method, accelerator, and computing device - Google Patents

Data processing method, accelerator, and computing device

Info

Publication number
WO2024041140A1
Authority
WO
WIPO (PCT)
Prior art keywords
accelerator
block
hash
operation request
memory
Prior art date
Application number
PCT/CN2023/101332
Other languages
French (fr)
Chinese (zh)
Inventor
毛修斌 (Mao Xiubin)
何泽耀 (He Zeyao)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Publication of WO2024041140A1 publication Critical patent/WO2024041140A1/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/10 Address translation
    • G06F 12/1072 Decentralised address translation, e.g. in distributed shared memory systems
    • G06F 13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F 13/38 Information transfer, e.g. on bus
    • G06F 13/42 Bus transfer protocol, e.g. handshake; Synchronisation
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/22 Indexing; Data structures therefor; Storage structures
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present application relates to the field of computer technology, and in particular, to a data processing method, an accelerator and a computing device.
  • the KV storage process is often implemented based on the central processing unit (CPU).
  • the CPU needs to calculate and determine the KV data structure, which occupies the CPU's computing resources and network bandwidth.
  • the KV service that relies on the CPU can provide limited throughput and cannot meet the performance requirements of high-concurrency KV operations.
  • This application provides a data processing method, which is completed by offloading KV operation requests to the accelerator.
  • Data plane processing within the offloading capability range completely bypasses the CPU, which not only improves the throughput of the system but also reduces the occupation of the CPU, meeting the performance requirements of high-concurrency KV operations.
  • This application also provides corresponding data processing devices, accelerators, computing equipment, computer-readable storage media, and computer program products.
  • this application provides a data processing method.
  • the method is applied to computing devices supporting key-value KV services.
  • the computing device includes an accelerator and a processor.
  • the accelerator can obtain the KV operation request, and then determine the execution mode based on the KV operation request.
  • the execution mode includes an offloading mode and a non-offloading mode.
  • the offloading mode is used to instruct the accelerator to perform the operation requested by the KV operation
  • the non-offloading mode is used to instruct the processor to perform the operation requested by the KV operation. Then the accelerator performs the processing of the KV operation request according to the execution mode.
  • This method offloads the basic operations of distributed KV data to the accelerator by leveraging the accelerator's programmability and on-path processing capabilities. Data plane processing within the offloading capability range completely bypasses the CPU, which can improve the throughput of the system and reduce CPU usage. Moreover, this method retains all the functions of KV on the CPU side, and a small number of KV operations that exceed the offload specifications are still forwarded to the CPU for processing, forming a hierarchical KV service that takes into account both performance and versatility.
  • the accelerator can obtain the operation metadata in the KV operation request and use the operation metadata to determine how to execute the KV operation request. Specifically, when the operation metadata satisfies the preset conditions, the execution mode is determined to be the offloading mode; otherwise, the execution mode is determined to be the non-offloading mode.
  • This method filters KV operation requests through operation metadata: the execution mode of KV operation requests within the range of the accelerator's offloading capability is determined to be the offloading mode, and the execution mode of KV operation requests outside that range is determined to be the non-offloading mode. This forms a hierarchical KV service, which not only improves the performance of the KV service through the accelerator and reduces CPU usage, but also lets the CPU process complex operations, ensuring versatility.
  • operation metadata includes operation type and key length.
  • The operation metadata satisfying the preset conditions may be that the operation type is one or more of add, delete, query, modify, batch add, batch delete, batch query or batch modify, and that the key length is less than the preset length.
  • This method sets the filtering conditions based on the operation types of KV operations that the accelerator itself can handle and the maximum key length of the KV data those operations can act on, so that KV operation requests within the accelerator's offloading capability can be accurately filtered out and offloaded, avoiding the extra resources and time needed to forward requests to the CPU side because of low filtering precision.
  • the accelerator when the execution mode determined according to the KV operation request is the offloading mode, the accelerator can perform the target KV operation on the memory of the computing device according to the KV operation request.
  • the accelerator can perform add operations, delete operations, modify operations, query operations, or batch add operations, batch delete operations, batch modify operations, and batch query operations on the memory of the computing device.
  • the accelerator can effectively reduce pressure on the CPU side, reduce CPU usage, and improve KV operation performance by performing the above basic operations on the memory.
  • the memory can use KV blocks to store KV data.
  • A KV block includes a key content field and a value content field. Based on this, when the accelerator performs a target KV operation, it can write the target KV block to the memory of the computing device or query the target KV block from the memory of the computing device according to the KV operation request.
  • The accelerator can write the target KV block to the memory or query the target KV block from the memory according to the relevant information in the KV operation request to complete the KV operation. Since the KV operation request is offloaded to the accelerator, the performance of the KV operation is improved.
  • the memory uses a hash table to store the metadata of the KV data.
  • The hash table includes multiple hash buckets, each hash bucket in the multiple hash buckets includes multiple hash slots, and multiple hash slots belonging to the same hash bucket are used to store the fingerprints, key lengths and corresponding block addresses of multiple keys with the same hash value. In this way, it is helpful to quickly search for the target KV block or quickly write the target KV block, which improves the efficiency of KV operations.
  • the KV operation request is a KV modification request
  • The accelerator can determine the hash value and the fingerprint based on the key content of the target KV block, for example, by performing hash operations on the key content of the target KV block separately according to different hash algorithms. It then determines the hash bucket corresponding to the target KV block based on the hash value, and updates the hash slot in that hash bucket whose fingerprint matches the fingerprint determined by the key content of the target KV block.
  • This method further improves KV operation performance by designing a data structure suitable for accelerator processing to store KV data and its metadata.
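  • As a minimal sketch of this update path (the concrete hash algorithms, bucket count and slot count are not specified in this application and are assumed here), the lookup and refresh of a matching hash slot could look roughly as follows:

```c
#include <stdint.h>
#include <stddef.h>

/* Assumed slot layout: fingerprint, key length and block address of one key. */
struct hash_slot {
    uint16_t fingerprint;   /* result of hash algorithm 2 */
    uint16_t klen;          /* key length */
    uint64_t block_addr;    /* address of the KV block holding key and value */
};

#define SLOTS_PER_BUCKET 8      /* assumed value */
#define NUM_BUCKETS      1024   /* assumed value */

struct hash_bucket { struct hash_slot slots[SLOTS_PER_BUCKET]; };

/* Two independently seeded FNV-1a variants stand in for "hash algorithm 1/2";
 * the application does not name the algorithms, so this is only illustrative. */
static uint64_t fnv1a(const uint8_t *p, size_t n, uint64_t h) {
    for (size_t i = 0; i < n; i++) { h ^= p[i]; h *= 1099511628211ULL; }
    return h;
}

/* KV modify: locate the bucket via hash algorithm 1, then refresh the slot
 * whose fingerprint (hash algorithm 2) matches the target KV block's key. */
int kv_update_slot(struct hash_bucket *table, const uint8_t *key, size_t klen,
                   uint64_t new_block_addr) {
    uint64_t hv = fnv1a(key, klen, 14695981039346656037ULL);          /* hash 1 */
    uint16_t fp = (uint16_t)fnv1a(key, klen, 0x9e3779b97f4a7c15ULL);  /* hash 2 */
    struct hash_bucket *b = &table[hv % NUM_BUCKETS];

    for (int i = 0; i < SLOTS_PER_BUCKET; i++) {
        if (b->slots[i].block_addr != 0 &&
            b->slots[i].fingerprint == fp && b->slots[i].klen == klen) {
            b->slots[i].block_addr = new_block_addr;   /* refresh matching slot */
            return 0;
        }
    }
    return -1;   /* no match: handled by the insert/conflict path, not shown */
}
```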
  • the KV block also includes a lower-level KV block identification field and a lower-level KV block address field.
  • KV operation request is a KV increase request.
  • the accelerator may determine the hash value and fingerprint based on the key content of the target KV block, and then determine the hash bucket corresponding to the target KV block based on the hash value.
  • When the fingerprint stored in the target hash slot in the hash bucket matches the fingerprint determined by the key content of the target KV block, the accelerator reads the lower-level KV block identification field and the lower-level KV block address field in the target hash slot, determines the KV block at the end of the linked list based on the field values of those two fields, writes the block address of the target KV block into the lower-level KV block address field of the KV block at the end of the linked list, and marks the field value of the lower-level KV block identification field of that KV block as valid.
  • This method manages KV data with the same fingerprint by setting up a linked list, and solves the conflict problem of adding (inserting) operations.
  • the KV operation request is a KV increase request.
  • The accelerator can determine the hash value and the fingerprint based on the key content of the target KV block, then determine the hash bucket corresponding to the target KV block based on the hash value, and then write into an empty hash slot in that hash bucket the block address of the target KV block and the fingerprint determined by the key content of the target KV block.
  • This method writes the block address of the target KV block and the fingerprint determined by the key content in the target KV block into the empty hash slot in the hash bucket, so that subsequent data query and modification can be performed based on the above block address and fingerprint.
  • the KV operation request is a KV query request.
  • the KV operation request includes the block address and key content of the target KV block to be queried.
  • The accelerator can determine the hash value according to the key content of the target KV block, determine the corresponding hash bucket according to the hash value, perform address translation according to the block address of the target KV block to obtain the physical address, and read the hash bucket according to the physical address. The accelerator can then determine the fingerprint based on the key content of the target KV block, query the hash bucket based on that fingerprint, and obtain the value content corresponding to the key content.
  • The accelerator can quickly query the target KV block based on the data structure designed for the accelerator, which improves operating performance. Moreover, this data structure fully considers conflict situations: even if there is a conflict, the target KV block being looked for can be accurately found, meeting business needs.
  • this application provides a data processing device.
  • the data processing device includes various units for executing the data processing method in the first aspect or any possible implementation of the first aspect.
  • this application provides an accelerator.
  • the accelerator includes a processing module and a communication interface.
  • the communication interface is used to provide network communication for the processing module, and the accelerator is used to execute the data processing method in the first aspect or any possible implementation of the first aspect.
  • the present application provides a computing device including an accelerator.
  • computing devices include accelerators and processors.
  • the processor can be a central processing unit, and the central processing unit can also provide KV services.
  • the accelerator is used to execute the data processing method in the first aspect or any possible implementation manner of the first aspect to accelerate the KV service.
  • the present application provides a computer-readable storage medium in which instructions are stored, and the instructions instruct the computing device to execute the data processing method of the above-mentioned first aspect or any implementation of the first aspect.
  • the present application provides a computer program product containing instructions that, when run on a computing device, cause the computing device to execute the data processing method described in the above first aspect or any implementation of the first aspect.
  • Figure 1 is a schematic architectural diagram of a data processing system provided by this application.
  • Figure 2 is a flow chart of a data processing method provided by this application.
  • FIG. 3 is a schematic structural diagram of a KV server node provided by this application.
  • Figure 4 is a schematic flow chart of a data processing method provided by this application.
  • FIG. 5 is a schematic structural diagram of a data processing system provided by this application.
  • FIG. 6 is a schematic structural diagram of KV data provided by this application.
  • Figure 7 is a schematic flow chart of a data processing method provided by this application.
  • Figure 8 is a schematic flow chart of a data processing method provided by this application.
  • Figure 9 is a schematic flow chart of a data processing method provided by this application.
  • Figure 10 is a schematic structural diagram of a data processing device provided by this application.
  • Figure 11 is a schematic structural diagram of a computing device provided by this application.
  • Distributed storage refers to the distributed storage of data on multiple independent devices (such as storage servers and other storage devices).
  • Distributed storage system refers to a storage system that uses distributed storage for data storage.
  • Distributed storage systems usually have a scalable system structure, which can use multiple storage servers to share the storage load and use location servers to locate storage information. In this way, it not only improves the reliability, availability and access efficiency of the storage system, but also makes it easy to expand.
  • Data or metadata in distributed storage can be organized using a tree structure or key value (KV).
  • Local key data, such as hotspot data, can be organized using key-value data structures to obtain lower query latency and higher concurrency.
  • Key value (KV) specifically describes a mapping relationship between elements that are related to each other: each pair of elements contains a key and a value, and the corresponding value can be retrieved by combining the specific key with the data structure used.
  • this application provides a data processing method.
  • This method can be applied to computing devices that support KV services, where supporting KV services includes supporting addition, deletion, query or modification of KV data, or batch addition, batch deletion, batch query or batch modification of KV data.
  • Computing devices include accelerators and processors.
  • the processor can be a CPU, and the CPU can also support KV services.
  • An accelerator refers to a device that works with a processor to accelerate services.
  • the accelerator can be a data processing unit (DPU) or an infrastructure processing unit (IPU).
  • the accelerator is used as an example of DPU in the following description.
  • A DPU is specifically a system on chip that focuses on on-path (associated) processing and calculation of data.
  • Associated processing refers to associated-signaling processing. Associated signaling means that the various kinds of signaling required for call connection are transmitted over the trunk circuit occupied by that connection.
  • DPU usually has certain programmability capabilities and can perform customized offload acceleration according to application scenarios.
  • the DPU can deploy an operating system (also called a small system) independently and provide KV services.
  • the computing device where the DPU is located includes two operating systems, namely a general operating system running in the CPU and a small system running in the DPU.
  • The DPU can also exist as an external device to the CPU and form a heterogeneous system together with processors such as a graphics processing unit (GPU).
  • the DPU can obtain the KV operation request, and then the DPU can determine the execution mode according to the KV operation request.
  • the execution mode includes an offloading mode and a non-offloading mode.
  • the offloading mode is used to instruct an accelerator such as the DPU to perform the operation requested by the KV operation.
  • the non-offloading mode is used to instruct the processor, such as the CPU, to perform the operation requested by the KV operation.
  • the DPU can perform the processing of the KV operation request according to the execution mode.
  • This method uses the programmability and on-path processing capabilities of data processors such as the DPU to offload the basic operations of distributed KV data to the DPU.
  • Data plane processing within the offloading capability completely bypasses the CPU, which can improve the throughput of the system and reduce CPU usage.
  • this method retains all the functions of KV on the CPU side, and a small number of KV operations that exceed the offload specifications are still forwarded to the CPU for processing, forming a hierarchical KV service that takes into account both performance and versatility.
  • the data processing system 10 includes a KV server node 100 and a KV client node 200.
  • the KV server node 100 is a computing device that supports KV services, such as a server that supports KV services.
  • the KV client node 200 is a device that supports access to the KV server node 100.
  • the KV client node 200 may be a lightweight device, including but not limited to a laptop, a tablet, or a smartphone.
  • the KV server node 100 and the KV client node 200 may be interconnected through a network, for example, through a high-performance network. It should be noted that according to the network scale of different business scenarios, the data processing system 10 may include one or more KV server nodes 100. Similarly, the data processing system 10 may include one or more KV client nodes 200.
  • the KV server node 100 includes DPU102 and CPU104.
  • the CPU 104 is placed in the host, and the DPU 102 is used as an external device of the host.
  • the KV server node 100 also includes a memory 106, which is used to store KV data to speed up the access efficiency of KV data.
  • the memory 106 can be externally connected to the host. Based on this, the memory 106 can also be called host memory. Each host can be connected to multiple external memories 106, and the multiple memories 106 can be used to form a memory pool.
  • the KV client node 200 is deployed with applications. Application processes can be spawned when the application is running. The application process can call the KV service interface to initiate a KV operation.
  • the KV client for example, the KV client process
  • the KV client process on the KV client node 200 can generate a KV operation request based on the KV operation.
  • the KV operation request can be in remote direct memory access (RDMA) message format.
  • the KV client node 200 sends the KV operation request to the KV server node 100.
  • the DPU 102 of the KV server node 100 is responsible for receiving and processing various KV operation requests.
  • the DPU 102 may determine the execution mode according to the KV operation request, and then perform the processing of the KV operation request according to the execution mode. For example, the DPU 102 can obtain the operation metadata in the KV operation request, including the operation type and key length (Klen).
  • When the operation metadata meets the preset conditions, for example, the operation type is one or more of addition, deletion, query, modification, batch addition, batch deletion, batch query or batch modification, and the key length is less than the preset length, the execution mode is determined to be the offloading mode, and the DPU 102 can perform the target KV operation on the memory 106 according to the KV operation request.
  • Otherwise, the execution mode is determined to be the non-offloading mode, the DPU 102 forwards the KV operation request to the CPU 104, and the CPU 104 performs the target KV operation on the memory 106 according to the KV operation request. That is, KV operation requests within the processing capability of the DPU 102 are processed by the DPU 102, and KV operation requests beyond the processing capability of the DPU 102 are forwarded to the CPU 104 for processing.
  • an embodiment of the present application also provides a data processing method.
  • the data processing method of the embodiment of the present application is introduced below with reference to the accompanying drawings.
  • the method includes:
  • S202 The DPU 102 receives the KV operation request sent by the KV client node 200.
  • KV operation requests include operation metadata.
  • the operation metadata can include operation type and key length.
  • Operation types may include basic operations, such as one or more of add (create), delete (delete), query (read), and modify (update), collectively referred to as CRUD.
  • multiple KV operations may be encapsulated in the KV operation request.
  • the operation type may include batch basic operations, such as one or more of batch addition, batch deletion, batch query, or batch modification.
  • the key length refers to the length of the key, that is, the length of the key content.
  • S204 DPU102 obtains the operation metadata in the KV operation request. When the operation metadata meets the preset conditions, S206 is executed. When the operation metadata does not satisfy the preset conditions, S208 is executed.
  • S206 DPU 102 determines that the execution mode is the offloading mode. Then S210 is executed.
  • S208 DPU 102 determines that the execution mode is the non-offloading mode. Then S214 is executed.
  • the DPU 102 can parse the KV operation request and obtain the operation metadata in the KV operation request.
  • the operation metadata includes the operation type and key length.
  • DPU 102 has the ability to process basic operations, while CPU 104 has the ability to process complex operations; DPU 102 is limited by its hardware capabilities and is usually used to process KV data whose key length is within the preset length (key length less than the preset length).
  • the DPU 102 can determine whether the DPU 102 has the ability to process the KV operation in the KV operation request according to the operation type and key length, thereby determining the execution method of the KV operation.
  • The DPU 102 can compare the operation type in the KV operation request with the operation types of the basic operations (such as add, delete, query, modify, batch add, batch delete, batch query, batch modify), and compare the key length in the KV operation request with the preset length.
  • the operation type matches the operation type of the basic operation and the key length is less than the preset length, it indicates that the DPU 102 has the ability to process the KV operation in the KV operation request, and the execution mode can be determined to be the offloading mode.
  • the offloading mode is used to instruct the DPU 102 to perform the operation requested by the KV operation.
  • When the operation type does not match the operation type of a basic operation, and/or the key length is not less than the preset length, it means that the DPU 102 does not have the ability to process the KV operation in the KV operation request, and the execution mode can be determined to be the non-offloading mode.
  • the non-offloading mode is used to instruct the CPU 104 to execute the operation requested by the KV operation.
  • The above preset length can be set according to the hardware type of the DPU 102; depending on the hardware type, the preset length can differ. For example, the preset length can be set to 128 bytes (B) or to 1 kilobyte (KB).
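  • A minimal sketch of this filtering step is given below; the operation-type encoding and the choice of 128 B as the preset length are assumptions for illustration, not part of this application:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding of the operation types the DPU can offload. */
enum kv_op_type {
    KV_OP_CREATE, KV_OP_DELETE, KV_OP_READ, KV_OP_UPDATE,
    KV_OP_BATCH_CREATE, KV_OP_BATCH_DELETE, KV_OP_BATCH_READ, KV_OP_BATCH_UPDATE,
    KV_OP_OTHER                        /* complex operations handled by the CPU */
};

struct kv_op_metadata {
    enum kv_op_type type;
    uint32_t klen;                     /* key length in bytes */
};

#define KLEN_PRESET_LIMIT 128u         /* preset length; depends on DPU hardware */

/* Returns true when the request is within the DPU's offload capability
 * (offloading mode); false means it is forwarded to the host CPU. */
bool kv_use_offloading_mode(const struct kv_op_metadata *md) {
    bool basic_op = md->type >= KV_OP_CREATE && md->type <= KV_OP_BATCH_UPDATE;
    return basic_op && md->klen < KLEN_PRESET_LIMIT;
}
```

  • In this sketch, a false return corresponds to the non-offloading mode described above, i.e. the request would be handed to the host CPU.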
  • The above-mentioned S204 to S208 are a specific implementation manner in which the DPU 102 determines the execution mode according to the KV operation request in the embodiment of the present application.
  • The DPU may not perform the above steps, or may use other implementation methods.
  • the DPU 102 can directly try to perform the target KV operation. When the execution is successful, the result is returned. When the execution is unsuccessful, the KV operation request is forwarded to the CPU for processing by the CPU.
  • the DPU 102 performs the target KV operation on the memory 106 according to the KV operation request.
  • the memory 106 uses KV blocks to store KV data. Based on this, the DPU 102 can write the target KV block to the memory 106 or query the target KV block from the memory 106 according to the KV operation request.
  • the operation type in the KV operation request is add, modify, or batch add or batch modify
  • the DPU 102 performs the operation of writing the target KV block into the memory 106 .
  • the operation type in the KV operation request is query or batch query
  • the DPU 102 performs the operation of querying the target KV block from the memory 106 .
  • the operation type in the KV operation request is delete or batch delete
  • the DPU 102 may perform an operation of deleting the target KV block from the memory 106 .
  • the KV operation request may also include the value content in the KV data to be added or modified.
  • the KV operation request may include the key content "name” and the value content "Zhang San” to request the addition of the KV data "name, Zhang San”.
  • the KV operation results may include different information.
  • the KV operation result can include operation success or operation failure.
  • the KV operation result may include the queried KV data.
  • the DPU 102 can encapsulate the queried KV data and the operation success field value in the same response message, and then return the response message to the client node 200 .
  • the DPU 102 can also encapsulate the queried KV data and the operation success field value in different response messages, and then return them to the client node 200 respectively.
  • the CPU 104 performs the target KV operation on the memory 106 according to the KV operation request and returns the KV operation result to the KV client node 200.
  • For the CPU 104 performing the target KV operation and returning the KV operation result, refer to the related description of the DPU 102 performing the target KV operation and returning the KV operation result, which will not be repeated here.
  • The above-mentioned S210 to S212 and S214 to S218 are specific implementation manners of the process in which the DPU 102 executes the KV operation request according to the execution mode. In other possible implementations of the embodiments of this application, the processing of the KV operation request may also be performed in other ways.
  • This method uses the programmability and on-path processing capabilities of DPU102 to offload the basic operations of distributed KV data to DPU102.
  • Data plane processing within the offloading capability completely bypasses the host CPU 104, which can improve the throughput of the system and reduce the occupation of the host CPU 104.
  • this method retains all functions of KV on the CPU 104 side, and a small number of KV operations that exceed the offloading specifications are still forwarded to the host CPU 104 for processing, forming a hierarchical KV service that takes into account both performance and versatility.
  • the DPU 102 can be logically divided into a data plane and a control plane.
  • the embodiment shown in Figure 2 mainly introduces the data processing method from the perspective of the data plane. The method of the embodiment of the present application will be described in detail from the perspective of the control plane and the data plane.
  • the DPU 102 is logically divided into two parts: the data plane and the control plane.
  • the data plane is responsible for network IO communication and on-path processing of KV transactions.
  • the control plane is responsible for managing the context information of communication and transaction processing and managing the state of the transaction.
  • CPU104 runs the KV server and generates a KV server process.
  • The KV server process can be responsible for connection management with the KV client process and for KV resource management and scheduling in the memory 106, and has complete KV operation processing capabilities. A small number of KV operation requests that cannot be accelerated by the DPU 102 can be processed by the KV server process.
  • The KV data structure resides in the memory 106.
  • the KV data structure is used to describe the organizational form of KV data.
  • KV data can be organized in the form of KV blocks.
  • a hash table also resides in the memory 106 .
  • the embodiment of the present application designs the corresponding KV data structure in a manner suitable for acceleration by the DPU 102 so that it can be processed more efficiently by the DPU 102 .
  • Step 1 The KV operation request from the KV client node 200 passes through the switching network in the form of an RDMA message and reaches the network port of the DPU102 in the KV server node 100.
  • The associated processing unit located in the data plane of the DPU 102 parses the message to obtain the operation metadata.
  • the switching network can also be called the connection network. Specifically, it is a network that establishes a communication channel between the source and destination of communication to realize information transmission.
  • a switched network is usually implemented by switching equipment, which can include switches, routers and other equipment that implements information exchange.
  • Operation metadata may include one or more of operation types or key lengths. Further, the KV operation request may also include key content. For add, modify, batch add, and batch modify operations, the KV operation request can also include value content. Considering that multiple versions of data can exist, operational metadata can also include version numbers. Similarly, operational metadata can also include sequence numbers.
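  • For illustration only, a request header carrying this operation metadata might be laid out as in the following sketch; the actual field order, widths and on-wire RDMA encoding are not specified in this application and are assumed here:

```c
#include <stdint.h>

/* Assumed layout of the metadata carried in a KV operation request.
 * Key content (klen bytes) and, for add/modify operations, value content
 * (vlen bytes) would follow this header in the message payload. */
struct kv_request_header {
    uint8_t  op_type;      /* add / delete / query / modify / batch variants */
    uint8_t  op_count;     /* greater than 1 for batch operations */
    uint16_t klen;         /* key length */
    uint32_t vlen;         /* value length (0 for delete/query) */
    uint32_t version;      /* optional version number for multi-version data */
    uint32_t sequence;     /* optional sequence number */
};
```

  • With the earlier "name"/"Zhang San" example, a KV add request under these assumptions would carry the add operation type, klen = 4 for the key "name", the corresponding vlen, and the key and value bytes in the payload.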
  • Step 2 The associated processing unit checks whether the current KV operation can be offloaded and accelerated on the DPU102. If it exceeds the offloading capability of the DPU102, it is forwarded to the host CPU 104 for processing by the KV server process.
  • Step 3 The path-associated processing unit writes the context to the control plane to record the network connection status and the status of the current KV operation.
  • the context includes the processing information necessary to perform the current KV operation and the status of the current KV operation.
  • the processing information may include key content, and further, the processing information may also include value content.
  • the status of the current KV operation includes the execution node and execution result of the current KV operation.
  • Step 4 According to the data requirements of the KV operation, the associated processing unit pulls the necessary operation data from the memory 106 over the high-speed bus through the IO processing module for processing.
  • the high-speed bus can be a standard bus such as Peripheral Component Interconnect Express (PCIe) or Compute Express Link (CXL), or it can be a private bus type.
  • The operation of pulling data can use direct memory access (DMA) or other memory access semantics, such as load/store.
  • the associated processing unit may need to complete data extraction and processing several times.
  • Step 5 The path-associated processing unit writes the intermediate results and final results of processing into the context, and updates the operation status.
  • the KV operation is an add operation, a query operation, a batch add operation, or a batch query operation
  • There may be conflicts: for example, if the characteristic value of the key content of the KV data added by the add operation is the same as the characteristic value of the key content of other KV data in the hash slot, or the characteristic value of the key content of the KV data to be queried by the query operation is the same as the characteristic value of the key content of other KV data in the hash slot, this represents a conflict.
  • the associated processing unit can perform conflict resolution operations, and the results generated during this operation can be intermediate results.
  • the intermediate results may include a linked list recording conflict information.
  • the KV operation is an add operation, a delete operation, etc.
  • the intermediate results may not be included, and the associated processing unit may write the final results into the context.
  • Step 6 The associated processing unit encapsulates the final result of the KV operation into an RDMA message and sends it back to the KV client node 200 to complete a KV operation.
  • the processing of the KV operation can be completed by the associated processing unit on the DPU 102, and the data interaction with the memory 106 is completed through the high-speed bus pass-through, without the participation of the computing power of the CPU 104.
  • Compared with Figure 2, Figure 4 not only describes the flow of the data processing method from the data plane, but also introduces the data processing method in detail from the control plane. It should also be noted that Figure 4 illustrates the interaction between one KV client node 200 and one KV server node 100. In actual application, multiple KV server nodes 100 can store KV data in a distributed manner and provide services for one or more KV client nodes 200.
  • KV client nodes 200 can be interconnected with several KV server nodes 100 through the RDMA network.
  • the entire KV data can be divided into different domain segments and stored in the memory 106 of different KV server nodes 100.
  • the KV data can be divided into different domain segments according to the key range, or divided into different domain segments in other ways, and distributed and stored in different KV server nodes 100 to achieve load balancing.
  • the KV server process in the KV server node 100 may include multiple execution threads to support certain concurrency.
  • Each KV server node 100 establishes at least one RDMA connection with each KV client node 200 to complete message transmission.
  • the KV client process can initiate a KV operation request on the corresponding RDMA queue pair (QP) according to the division of data domain segments.
  • QP is a virtual interface between hardware and software.
  • QP is a queue structure which stores, in order, the tasks issued by software to hardware, that is, work queue elements (WQEs). A WQE contains information such as where the data is to be taken from, how long it is, and to which destination it is to be sent.
  • Each QP is independent and isolated from each other by a Protection Domain (PD). Therefore, a QP can be regarded as a resource exclusive to a user, and a user can also use multiple QPs at the same time.
  • QP has many service types, including reliable connection (RC), reliable datagram (RD), unreliable connection (UC) and unreliable datagram (UD). Data interaction is possible only when the source QP and destination QP are of the same type.
  • The DPU 102 on the KV server node 100 side receives the KV operation request and checks whether the operation type, key length and other information of the KV operation are within the acceleration range supported by the DPU 102. If supported, the DPU 102 completes the corresponding request processing; otherwise the request is forwarded to the host CPU 104 for processing.
  • the KV data structure resident in the memory 106 can be seen in Figure 6 .
  • The hash table includes multiple hash buckets, such as the multiple columns of the hash table in Figure 6, denoted as Entry 0...Entry M. Each hash bucket in the multiple hash buckets includes multiple hash slots, corresponding to multiple rows in a column, denoted as Slot 0...Slot N. Multiple hash slots belonging to the same hash bucket are used to store the fingerprints (Fingerprint), key lengths (Klen) and corresponding block addresses (KV Block address) of multiple keys with the same hash value.
  • the result calculated by hash algorithm 1 (i.e., the hash value described above) can be used to index the hash bucket entry, and each hash bucket contains For multiple slots, keys with the same calculation result of hash algorithm 1 will be placed in different slots in the same hash bucket.
  • the header of each Slot includes a Fingerprint field, which stores the result of key calculation through hash algorithm 2 (i.e., the fingerprint mentioned above).
  • DPU102 or CPU104 can distinguish and search multiple keys stored in the same hash bucket through the Fingerprint field.
  • the Slot also stores the virtual address and Key length (Klen) information of the KV Block, as well as the verification key required for address translation. The corresponding KV Block can be read and written through the virtual address.
  • the header field segment in KV Block includes the complete Key content and Value length information (Vlen). Further, the header field segment of the KV Block may also include a subordinate KV block identification field, such as the flags field in Figure 6. In the case of serious conflicts, the Fingerprint calculated by different keys in the same hash bucket may be the same. At this time, these keys share the same Slot, and their corresponding KV Blocks are managed in the form of a linked list.
  • the Flags field identifies whether a next-level KV Block exists, and stores the virtual address and verification key of the next-level KV Block in the previous-level KV Block (for example, the Next field segment of the KV Block). If no next-level KV Block exists, the corresponding address and verification key fields are invalid, but the corresponding space can be reserved. For example, the field value can be set to rsvd for possible subsequent insertion operations.
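  • Putting the Figure 6 description together, a plain-C rendering of these resident structures could look like the sketch below; the field widths, the number of slots per bucket and the verification-key size are assumptions, and only the set of fields follows the text:

```c
#include <stdint.h>

/* One hash slot: fingerprint (hash algorithm 2), key length, and the
 * virtual address plus verification key used to reach the KV Block. */
struct kv_slot {
    uint32_t fingerprint;
    uint32_t klen;
    uint64_t block_vaddr;     /* virtual address of the KV Block */
    uint32_t verify_key;      /* verification key for address translation */
};

#define SLOTS_PER_BUCKET 8    /* Slot 0 .. Slot N in Figure 6 (N assumed) */

struct kv_bucket {            /* Entry 0 .. Entry M in Figure 6 */
    struct kv_slot slots[SLOTS_PER_BUCKET];
};

/* KV Block: a header (flags, value length, complete key content), a Next
 * segment that chains blocks whose keys share the same fingerprint, and
 * then the value content itself. */
struct kv_block_header {
    uint32_t flags;           /* identifies whether a next-level KV Block exists */
    uint32_t vlen;            /* value length */
    uint32_t klen;
    uint8_t  key[];           /* complete key content, klen bytes */
};

struct kv_block_next {
    uint64_t next_vaddr;      /* virtual address of next-level KV Block, or rsvd */
    uint32_t next_verify_key; /* verification key of the next-level KV Block */
};
```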
  • The following describes the processes of adding (also called inserting), modifying (also called updating), querying, and batch operations.
  • This method mainly describes the insertion or update process from the perspective of interaction between the KV client node 200 and the KV server node 100, specifically including the following steps:
  • Step 1 The KV client node 200 sends a KV operation request to the KV server node 100.
  • the KV client node 200 may send a KV insert request or a KV update request to the KV server node 100 through an RDMA write operation.
  • the KV insertion request or KV update request carries the necessary information to generate the KV Block and the corresponding write address (virtual address).
  • the necessary information to generate a KV Block may include the key content and value content of the target KV block to be added or modified, and the write address may be the block address of the target KV block.
  • the KV insertion request or KV update request may also include a verification key.
  • When the request exceeds the maximum transmission unit (MTU), the KV client node 200 can complete the sending through multiple messages.
  • Step 2 DPU 102 receives the KV operation request, writes the value content to the memory 106 according to the KV operation request, and determines the hash value and fingerprint according to the key content and the hash algorithm.
  • The DPU 102 directly writes the value content (Value part) of the target KV block into the memory 106 through DMA according to the block address of the target KV block in the KV insertion request or KV update request, then determines the hash value according to the key content (Key part) and hash algorithm 1, determines the hash bucket corresponding to the target KV block based on the hash value, and then determines the fingerprint of the target KV block based on hash algorithm 2.
  • Step 3 DPU 102 reads the hash bucket corresponding to the target KV block and sequentially compares the fingerprint in each hash slot (Slot) of the hash bucket with the fingerprint of the target KV block calculated in step 2. If the current operation is an update, the Slot whose fingerprint matches successfully is refreshed; if the current operation is an insert, DPU 102 can try to find a blank Slot to write the block address and fingerprint of the KV Block.
  • In the case of a conflict, DPU 102 can read the KV Block header information in the conflicting Slot and check whether the next-level KV Block address is valid through the field value of the lower-level KV block identification field. If it is valid, DPU 102 reads the lower-level KV block address field to obtain the block address of the next-level KV Block, and repeats this until the KV Block at the end of the linked list is found. The address and verification key in the KV insertion request are then written into the next-level KV Block address and verification key fields of that KV Block, and the Flags field is modified to indicate that the next-hop address of the current KV Block is valid.
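  • A sketch of this conflict path is shown below; the flag encoding is assumed, and reading a KV Block through its address is reduced to a plain pointer dereference, whereas on the DPU 102 it would be a DMA read:

```c
#include <stdint.h>

#define KV_FLAG_NEXT_VALID 0x1u   /* assumed bit meaning "next-level block exists" */

struct kv_block {                 /* simplified in-memory view of one KV Block */
    uint32_t flags;
    uint64_t next_vaddr;          /* address of the next-level KV Block */
    uint32_t next_verify_key;
};

/* Stand-in for DMA-reading a KV Block through its virtual address; in this
 * sketch the "address" is simply a host pointer. */
static struct kv_block *read_block(uint64_t vaddr) {
    return (struct kv_block *)(uintptr_t)vaddr;
}

/* Walk the linked list that starts at the conflicting slot's block and append
 * the newly inserted block at the tail. */
void kv_append_on_conflict(uint64_t head_vaddr,
                           uint64_t new_block_vaddr, uint32_t new_verify_key) {
    struct kv_block *blk = read_block(head_vaddr);
    while (blk->flags & KV_FLAG_NEXT_VALID)        /* follow valid next pointers */
        blk = read_block(blk->next_vaddr);
    blk->next_vaddr = new_block_vaddr;             /* write Next of the tail block */
    blk->next_verify_key = new_verify_key;
    blk->flags |= KV_FLAG_NEXT_VALID;              /* mark the next hop as valid */
}
```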
  • Step 4 The KV server node 100 returns the KV operation result to the KV client node 200.
  • the KV server node 100 can notify the KV client node 200 that the current KV insert/update operation is completed through RDMA Send.
  • the DPU 102 can return a response message ACK to the KV client node 200 after receiving the KV operation request.
  • After the KV server node 100 (for example, the DPU 102) notifies completion through RDMA Send, the KV client node 200 can also return a response message ACK to the KV server node 100.
  • The KV write operation involves more than three read and write operations on the memory 106 (more when conflicts occur) and corresponding logical processing operations.
  • In this application, data reads and writes can be completed directly through DMA on the DPU 102, and the KV logic processing can also be completed by the cores on the DPU.
  • The path for DMA data between the memory 106 and the DPU 102 in this application is shorter and more efficient, and can significantly shorten operation latency.
  • This application is based on the on-path processing of the DPU 102, which completes the KV operation while processing RDMA protocol forwarding. It can not only bypass and release host CPU resources, but also obtain better throughput and latency performance.
  • this method mainly describes the query process from the perspective of interaction between the KV client node 200 and the KV server node 100, specifically including the following steps:
  • Step 1 The KV client node 200 sends a KV operation request to the KV server node 100.
  • the KV client node 200 may send a KV query request to the KV server node 100 through an RDMA write operation.
  • the KV query request carries the key content (specifically, the key part in the KV data) to request the query for the value content corresponding to the key content (specifically, the value part in the KV data).
  • the KV client node 200 carries the key content and the operation type in the payload of the RDMA write operation, thereby sending the KV query request through the RDMA write operation.
  • Step 2 The DPU 102 in the KV server node 100 receives the KV operation request and loads the hash bucket according to the KV operation request.
  • the operation type in the KV operation request is query, which is used to request the value content corresponding to the query key content.
  • DPU 102 can calculate the corresponding hash bucket based on the key content in the KV query request and hash algorithm 1, complete the translation from the virtual address to the physical address based on the virtual address and verification key in the KV query request, and then load the hash bucket from the memory 106 according to the physical address.
  • Step 3 When the hash bucket returns, DPU 102 can calculate the fingerprint based on the key content and hash algorithm 2, and search the hash bucket for the Slot whose fingerprint is the same as the calculated fingerprint to obtain the virtual address and verification key of the KV Block in that Slot.
  • Step 4 DPU102 can complete the translation from the virtual address of the corresponding KV Block to the physical address based on the virtual address and verification key, and read the Header field and Next field.
  • Step 5 After the Header field and Next field of the KV Block are returned, DPU102 can compare whether the key content in the Header and the key content requested by the KV query are the same. If they are the same, go to step 6; if they are not the same, get the virtual address and verification key of the next KV Block in the linked list from the Next field, and return to step 4.
  • Step 6 DPU102 can load value content, generate KV operation results, and return KV operation results to KV client node 200.
  • DPU 102 can assemble the RDMA message according to the value content, send the value content back to the KV client node 200 through the RDMA Send operation, and complete the KV query operation.
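  • The query path of steps 2 to 6 could be sketched as follows; the address translation checked by the verification key is reduced to a stub, and the structure layouts repeat the assumptions of the earlier sketches rather than anything specified in this application:

```c
#include <stdint.h>
#include <string.h>
#include <stddef.h>

struct slot  { uint32_t fp, klen; uint64_t block_vaddr; uint32_t verify_key; };
struct block { uint32_t flags, vlen, klen; uint64_t next_vaddr;
               const uint8_t *key, *value; };

#define SLOTS      8
#define NEXT_VALID 0x1u

/* Stub for the pieces the DPU provides: virtual-to-physical translation checked
 * by the verification key, plus the DMA read; here it is a plain pointer cast. */
static struct block *load_block(uint64_t vaddr, uint32_t verify_key) {
    (void)verify_key;                      /* translation/verification omitted */
    return (struct block *)(uintptr_t)vaddr;
}

/* Steps 3-6: match the fingerprint in the bucket, then compare full keys along
 * the linked list until the requested key is found. */
const uint8_t *kv_query(const struct slot bucket[SLOTS], uint32_t fp,
                        const uint8_t *key, uint32_t klen, uint32_t *vlen_out) {
    for (int i = 0; i < SLOTS; i++) {
        if (bucket[i].block_vaddr == 0 || bucket[i].fp != fp) continue;
        struct block *b = load_block(bucket[i].block_vaddr, bucket[i].verify_key);
        for (;;) {
            if (b->klen == klen && memcmp(b->key, key, klen) == 0) {
                *vlen_out = b->vlen;
                return b->value;           /* step 6: load and return the value */
            }
            if (!(b->flags & NEXT_VALID)) break;
            b = load_block(b->next_vaddr, 0);  /* step 5: follow the Next field */
        }
    }
    return NULL;                           /* key not present */
}
```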
  • the DPU 102 can return a response message ACK to the KV client node 200 after receiving the KV operation request.
  • After the DPU 102 of the KV server node 100 notifies the completion of the operation through RDMA Send, the KV client node 200 can also return a response message ACK to the KV server node 100.
  • the KV read operation involves more than two (more times when conflicts occur) read and write operations to the memory 106 and corresponding logical processing operations. Similar to the KV write operation, data reading and writing in this application can be completed directly through DMA on the DPU102, and KV logic processing can also be completed through the core on the DPU102.
  • this application can shorten the data transmission from: server memory -> client memory to: server memory -> server DPU cache, greatly shortening the data transmission delay.
  • this application can achieve CPU bypass, save CPU resources, and improve the overall throughput and latency performance of the system through the multi-core and associated path processing capabilities of DPU102.
  • This application also proposes a design that supports transaction batch processing, allowing the KV client node 200 to encapsulate multiple KV operations in a single KV operation request, for example by carrying the operation metadata of multiple operations. The DPU 102 generates multiple transactions according to the KV operation request, and each transaction in the multiple transactions is used to perform one of the multiple KV operations.
  • DPU102 can execute multiple transactions in parallel through multiple cores, thereby improving the overall efficiency.
  • this method mainly describes the batch operation process from the perspective of interaction between the KV client node 200 and the KV server node 100, and specifically includes the following steps:
  • Step 1 After the KV operation request is received by DPU102, a thread of DPU102 is responsible for processing.
  • KV operation requests can be handled by Thread 0. Specifically, Thread 0 obtains the operation type and number of operations of the current Batch operation by parsing the message header.
  • Step 2 DPU 102 parses each operation domain segment in turn through the distribution and status management logic on Thread 0, and initializes the status information of each operation in the status table. It then generates a new operation transaction and puts it in the transaction queue, and generates an operation notification, which is enqueued into the notification queue.
  • the notification queue can be implemented in the form of a doorbell queue in the figure.
  • Step 3 The scheduler on the DPU102 can schedule the generated operation notifications to different threads, and the different threads can perform corresponding KV operations respectively.
  • the scheduler on DPU102 schedules the doorbell to Thread 1 to Thread N.
  • Thread 1 to Thread N read the information needed to complete the operation from the transaction queue by parsing the doorbell, and then perform the corresponding operation, for example, a KV insertion operation.
  • Step 4 After the operation is completed, each thread writes back the corresponding status table.
  • the queue or order-preserving resources inside the DPU102 ensure the atomicity of the status update.
  • the last completed thread generates a Doorbell that replies to the request and merges it into the Doorbell queue.
  • Step 5 The scheduler on the DPU 102 schedules the Doorbell that responds to the request to a certain thread for execution.
  • the thread collects status information.
  • a completion message is generated and sent back to the KV client node 200.
  • The requesting end can encapsulate multiple KV operations into one message, and the DPU 102 distributes the multiple transactions in the message to different processing cores for parallel processing, implementing batch processing operations through process combination and orchestration, thus improving overall transaction processing efficiency.
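  • A compact sketch of this batch dispatch structure is given below; on the DPU 102 the doorbells would be scheduled onto multiple cores in parallel, whereas this sketch iterates the queues sequentially, so it illustrates only the control flow and data structures, and all names are assumptions:

```c
#include <stdint.h>
#include <stdio.h>

#define MAX_OPS 16

enum op_state { OP_PENDING, OP_DONE, OP_FAILED };

struct kv_txn   { uint8_t op_type; uint32_t klen; /* plus key/value offsets */ };
struct doorbell { int txn_index; };

struct batch_ctx {
    struct kv_txn   txn_queue[MAX_OPS];      /* transaction queue */
    struct doorbell db_queue[MAX_OPS];       /* notification (doorbell) queue */
    enum op_state   status[MAX_OPS];         /* per-operation status table */
    int             op_count;
};

/* Thread 0: parse the batch header, initialize the status table, and enqueue
 * one transaction plus one doorbell per operation domain segment. */
static void distribute(struct batch_ctx *c, const struct kv_txn *ops, int n) {
    c->op_count = n;
    for (int i = 0; i < n; i++) {
        c->txn_queue[i] = ops[i];
        c->status[i] = OP_PENDING;
        c->db_queue[i].txn_index = i;
    }
}

/* Worker threads: consume doorbells, perform the KV operation, write back the
 * status table; the last one to finish would trigger the completion reply. */
static void process(struct batch_ctx *c) {
    for (int i = 0; i < c->op_count; i++) {
        struct kv_txn *t = &c->txn_queue[c->db_queue[i].txn_index];
        (void)t;                             /* perform insert/query/... here */
        c->status[i] = OP_DONE;
    }
    printf("batch of %d operations complete\n", c->op_count);  /* completion msg */
}
```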
  • the data processing device 1000 can be a software device or a hardware device in an accelerator.
  • the device 1000 includes:
  • Determining unit 1004 configured to determine an execution mode according to the KV operation request.
  • the execution mode includes an offloading mode and a non-offloading mode.
  • the offloading mode is used to instruct the accelerator to perform the operation requested by the KV operation.
  • the non-offloading mode is used to instruct the processor to perform the operation requested by the KV operation;
  • the execution unit 1006 is configured to execute the processing of the KV operation request according to the execution mode.
  • The device 1000 in the embodiments of the present application can be implemented by a central processing unit (CPU), an application-specific integrated circuit (ASIC), or a programmable logic device (PLD).
  • The above PLD can be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), a data processing unit (DPU), a system on chip (SoC), or any combination thereof.
  • the determining unit 1004 is specifically used to:
  • when the operation metadata in the KV operation request satisfies the preset conditions, determine the execution mode to be the offloading mode; otherwise, determine the execution mode to be the non-offloading mode.
  • the operation metadata includes operation type and key length
  • the operation metadata satisfies the preset conditions, including: the operation type is one or more of add, delete, query, modify, batch add, batch delete, batch query or batch modify, and the key length is less than the preset length.
  • the execution unit 1006 when the execution mode determined according to the KV operation request is the offloading mode, the execution unit 1006 is specifically configured to:
  • a target KV operation is performed on the memory of the computing device.
  • the memory uses KV blocks to store KV data, and the KV blocks include key content fields and value content fields;
  • the execution unit 1006 is specifically used for:
  • a target KV block is written to or queried from the memory of the computing device.
  • the memory uses a hash table to store the metadata of the KV data.
  • the hash table includes multiple hash buckets, each hash bucket in the multiple hash buckets includes multiple hash slots, and multiple hash slots belonging to the same hash bucket are used to store the fingerprints, key lengths and corresponding block addresses of multiple keys with the same hash value.
  • the KV operation request is a KV modification request
  • the execution unit 1006 is also used to:
  • the KV block also includes a lower-level KV block identification field and a lower-level KV block address field
  • the KV operation request is a KV increase request
  • the execution unit 1006 is also used to:
  • when the fingerprint stored in the target hash slot in the hash bucket matches the fingerprint determined by the key content of the target KV block, read the lower-level KV block identification field and the lower-level KV block address field in the target hash slot, determine the KV block at the end of the linked list based on the field values of the lower-level KV block identification field and the lower-level KV block address field, write the block address of the target KV block into the lower-level KV block address field of the KV block at the end of the linked list, and mark the field value of the lower-level KV block identification field of the KV block at the end of the linked list as valid.
  • the KV operation request is a KV increase request
  • the execution unit 1006 is also used to:
  • the KV operation request is a KV query request
  • the KV operation request includes the block address and key content of the target KV block to be queried
  • the execution unit 1006 is specifically used to:
  • Figure 11 is a hardware structure diagram of a computing device 1100 provided by this application.
  • the computing device 1100 can be the aforementioned KV server node 100, used to implement the functions of the data processing device 1000 in the embodiment shown in Figure 10.
  • the computing device 1100 includes a processor 1101 , an accelerator 1102 and a memory 1103 .
  • the processor 1101, the accelerator 1102 and the memory 1103 can communicate through the bus 1104, or can also communicate through other means such as wireless transmission.
  • the computing device 1100 also includes a communication interface 1105, which is used for communicating with external devices, such as the KV client node 200 and other devices.
  • computing device 1100 may also include memory 1106 .
  • the processor 1101 may be a central processing unit (CPU), and the accelerator 1102 may be a data processing unit (DPU) or an infrastructure processing unit (IPU). The accelerator 1102 is used to offload the workload of the processor 1101 to realize the acceleration function. It should be noted that the accelerator 1102 can independently deploy an operating system (that is, a small system) to provide the KV service. In this case, the computing device 1100 includes two operating systems, namely a general operating system running on the processor 1101 and a small system running on the accelerator 1102. In addition, the accelerator 1102 may also exist as an external device of the processor 1101, and the processor 1101 and the accelerator 1102 may constitute a heterogeneous system.
  • Memory 1103 refers to the internal memory that directly exchanges data with the processor 1101 or the accelerator 1102. It can read and write data at any time and very quickly, and serves as a temporary data storage for the operating system or other running programs.
  • the memory 1103 may include at least two types of storage;
  • for example, the memory 1103 can be either random access memory (RAM) or read-only memory (ROM).
  • the random access memory can be dynamic random access memory (Dynamic RAM, DRAM), static random access memory (Static RAM, SRAM), or storage class memory (Storage Class Memory, SCM), etc.
  • the read-only memory can be a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable Programmable ROM, EPROM), etc.
  • the memory 1103 can also be a dual in-line memory module (Dual In-line Memory Module, DIMM), that is, a module composed of DRAM, or a solid state drive (Solid State Disk, SSD).
  • the memory 1103 can be configured to have a power-protection function.
  • the power-protection function means that the data stored in the memory 1103 will not be lost when the system is powered off and then powered on again.
  • memory with a power-protection function is called non-volatile memory.
  • bus 1104 may also include a power bus, a control bus, a status signal bus, etc.
  • the various buses are labeled bus 1104 in the figure.
  • the communication interface 1105 is used to communicate with external devices such as the KV client node 200.
  • the communication interface 1105 may be a network card, such as the RDMA network card mentioned above.
  • the communication interface 1105 may be used to receive a KV operation request sent by the KV client node 200 through RDMA Write.
  • Memory 1106, also known as external memory or external storage, is typically used to persistently store data or instructions.
  • the memory 1106 may include a magnetic disk or other type of storage medium, such as a solid state drive or a shingled magnetic recording hard drive.
  • when the computing device 1100 also includes the memory 1106, the memory 1103 is also used to temporarily store data exchanged with the memory 1106.
  • the memory 1103 is used to store instructions, which can be instructions built into the memory 1103 or instructions swapped in from the memory 1106.
  • the accelerator 1102 is used to execute the instructions stored in the memory 1103 to perform the following operations:
  • the execution mode is determined according to the KV operation request.
  • the execution mode includes an offloading mode and a non-offloading mode.
  • the offloading mode is used to instruct the accelerator to perform the operation requested by the KV operation.
  • the non-offloading mode is used to instruct the processor to perform the operation requested by the KV operation;
  • the KV operation request is processed according to the execution mode.
  • the accelerator 1102 is also used to execute instructions stored in the memory 1103 to execute other steps of the data processing method in the embodiment of the present application.
  • the computing device 1100 may correspond to the data processing device 1000 in the embodiment of the present application, and may correspond to the KV server node 100 executing the method shown in Figure 2 according to the embodiment of the present application, and the above and other operations and/or functions implemented by the computing device 1100 are respectively for implementing the corresponding processes of the method in Figure 2; for the sake of brevity, they are not described again here.
  • An embodiment of the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium may be any available medium that can be stored by a computing device, or a data storage device, such as a data center, containing one or more available media.
  • the available media may be magnetic media (eg, floppy disk, hard disk, tape), optical media (eg, DVD), or semiconductor media (eg, solid state drive), etc.
  • the computer-readable storage medium includes instructions that instruct the computing device to perform the above data processing method applied to the data processing apparatus 1000.
  • An embodiment of the present application also provides a computer program product containing instructions.
  • the computer program product may be a software or program product containing instructions capable of running on a computing device or stored in any available medium.
  • when the computer program product is run on at least one computing device, the at least one computing device is caused to execute the above data processing method.

Abstract

A data processing method and an accelerator (1102), applied to a computing device (1100) supporting a key-value (KV) service. The computing device comprises the accelerator (1102) and a processor (1101). The method comprises: the accelerator (1102) acquiring a KV operation request, and determining an execution mode according to the KV operation request, the execution mode comprising an offloading mode and a non-offloading mode; and then processing the KV operation request according to the execution mode. In this way, the KV operation request is offloaded to the accelerator (1102) for completion, and data-plane processing within the offloading capability range completely bypasses the CPU, so that the throughput of the system can be improved and the occupation of the CPU can be reduced.

Description

Data processing method, accelerator, and computing device
This application claims priority to the Chinese patent application submitted to the State Intellectual Property Office of China on August 23, 2022, with application number 202211017087.4 and the invention title "Data processing method, accelerator and computing device", the entire content of which is incorporated by reference in this application.
Technical Field
The present application relates to the field of computer technology, and in particular, to a data processing method, an accelerator, and a computing device.
Background
With the continuous development of distributed storage technology, scenarios such as databases, big data, high performance computing (HPC), and artificial intelligence (AI) have begun to widely use distributed storage technology to store data, so as to support better scalability and improve resource utilization.
In distributed storage, data is distributed and stored on multiple nodes and is accessed across nodes through a high-performance network. For the management and indexing of data/metadata, in addition to the more general tree structures, local key data such as hotspot data can be organized using a key-value (KV) data structure to obtain lower query latency and higher concurrency. The use of non-volatile storage media as memory makes it possible to build larger memory pools. On this basis, storing all data/metadata in the memory pool in a KV data structure can effectively improve the efficiency of management or indexing.
However, in distributed storage systems, the KV storage process is often implemented on the central processing unit (CPU). The CPU needs to compute and determine the KV data structures, which occupies the CPU's computing resources and network bandwidth. When a large amount of data needs to be stored concurrently, a KV service that relies on the CPU can provide only limited throughput and cannot meet the performance requirements of highly concurrent KV operations.
Summary
This application provides a data processing method. In the method, a KV operation request is offloaded to an accelerator for completion; data-plane processing within the offloading capability range completely bypasses the CPU, which can both improve the throughput of the system and reduce the occupation of the CPU, meeting the performance requirements of highly concurrent KV operations. This application also provides a corresponding data processing apparatus, accelerator, computing device, computer-readable storage medium, and computer program product.
In a first aspect, this application provides a data processing method. The method is applied to a computing device supporting a key-value (KV) service. The computing device includes an accelerator and a processor. Specifically, the accelerator may obtain a KV operation request and then determine an execution mode according to the KV operation request. The execution mode includes an offloading mode and a non-offloading mode. The offloading mode is used to instruct the accelerator to perform the operation requested by the KV operation request, and the non-offloading mode is used to instruct the processor to perform the operation requested by the KV operation request. The accelerator then processes the KV operation request according to the execution mode.
By virtue of the programmability and on-path processing capability of the accelerator, the method offloads basic operations on distributed KV data to the accelerator; data-plane processing within the offloading capability range completely bypasses the CPU, which can both improve the throughput of the system and reduce the occupation of the CPU. Moreover, the method retains the full KV functionality on the CPU side, and the small number of KV operations that exceed the offloading specification are still forwarded to the CPU for processing, forming a hierarchical KV service that takes into account both performance and generality.
In some possible implementations, the accelerator may obtain operation metadata in the KV operation request and use the operation metadata to determine the execution mode for the KV operation request. Specifically, when the operation metadata satisfies a preset condition, the execution mode is determined to be the offloading mode; otherwise, the execution mode is determined to be the non-offloading mode.
The method filters KV operation requests based on the operation metadata: the execution mode for KV operation requests within the accelerator's offloading capability is determined to be the offloading mode, and the execution mode for KV operation requests beyond the accelerator's offloading capability is determined to be the non-offloading mode, forming a hierarchical KV service. This not only improves the performance of the KV service through the accelerator, but also reduces the occupation of the CPU so that the CPU can handle complex operations, ensuring generality.
In some possible implementations, the operation metadata includes an operation type and a key length. The operation metadata satisfying the preset condition may be that the operation type is one or more of add, delete, query, modify, batch add, batch delete, batch query, or batch modify, and the key length is less than a preset length.
Starting from the operation types of KV operations that the accelerator itself can handle and the maximum key length of the KV data those operations can act on, the method sets the conditions for filtering KV operation requests within the accelerator's offloading capability. Qualifying KV operation requests can therefore be accurately selected for offloading, avoiding the extra resources and time needed to forward requests to the CPU side that would result from imprecise filtering.
In some possible implementations, when the execution mode determined according to the KV operation request is the offloading mode, the accelerator may perform a target KV operation on the memory of the computing device according to the KV operation request. For example, the accelerator may perform an add, delete, modify, or query operation, or a batch add, batch delete, batch modify, or batch query operation on the memory of the computing device.
In this method, by performing the above basic operations on the memory, the accelerator can effectively reduce the pressure on the CPU side, reduce the occupation of the CPU, and improve KV operation performance.
In some possible implementations, the memory may use KV blocks to store KV data. A KV block includes a key content field and a value content field. On this basis, when performing the target KV operation, the accelerator may write a target KV block to the memory of the computing device or query a target KV block from the memory of the computing device according to the KV operation request.
In this method, the accelerator can write the target KV block to the memory or query the target KV block from the memory according to the relevant information in the KV operation request to complete the KV operation. Since the KV operation request is offloaded to the accelerator, the performance of the KV operation is improved.
In some possible implementations, the memory uses a hash table to store the metadata of the KV data. The hash table includes multiple hash buckets, each of the multiple hash buckets includes multiple hash slots, and the multiple hash slots belonging to the same hash bucket are used to store the fingerprints, key lengths, and corresponding block addresses of multiple keys with the same hash value. In this way, the target KV block can be quickly found or quickly written, improving the efficiency of KV operations.
In some possible implementations, the KV operation request is a KV modification request. The accelerator may determine a hash value and a fingerprint according to the key content of the target KV block, for example, by hashing the key content of the target KV block with different hash algorithms, then determine the hash bucket corresponding to the target KV block according to the hash value, and update the hash slot in the hash bucket whose fingerprint matches the fingerprint determined from the key content of the target KV block.
By designing data structures suitable for accelerator processing to store the KV data and its metadata, this method further improves KV operation performance.
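As a rough illustration of the two-hash design described above, the following C sketch computes two independent results over the key content: one selects the hash bucket, the other serves as the fingerprint compared against the slots in that bucket. The FNV-style and multiplicative hash functions and the field widths are illustrative assumptions, not the algorithms specified by this application.

```c
#include <stdint.h>
#include <stddef.h>

/* Hash algorithm 1: the result is used to index the hash bucket (assumed FNV-1a). */
static uint32_t hash_bucket_index(const uint8_t *key, size_t klen, uint32_t num_buckets) {
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < klen; i++) {
        h ^= key[i];
        h *= 16777619u;
    }
    return h % num_buckets;            /* selects one of Entry 0 .. Entry M */
}

/* Hash algorithm 2: the result is stored in the slot header as the fingerprint. */
static uint16_t key_fingerprint(const uint8_t *key, size_t klen) {
    uint32_t h = 0;
    for (size_t i = 0; i < klen; i++)
        h = h * 131u + key[i];         /* a different, independent hash function */
    return (uint16_t)(h & 0xFFFF);
}
```

For a KV modification request, the accelerator would locate the bucket with the first result and then update the slot whose stored fingerprint equals the second result.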
In some possible implementations, the KV block further includes a lower-level KV block identification field and a lower-level KV block address field, and the KV operation request is a KV add request. The accelerator may determine a hash value and a fingerprint according to the key content of the target KV block, and then determine the hash bucket corresponding to the target KV block according to the hash value. When the fingerprint stored in a target hash slot in the hash bucket matches the fingerprint determined from the key content of the target KV block, the accelerator reads the lower-level KV block identification field and the lower-level KV block address field in the target hash slot, determines the KV block at the tail of the linked list according to the field values of the lower-level KV block identification field and the lower-level KV block address field, writes the block address of the target KV block into the lower-level KV block address field of the KV block at the tail of the linked list, and marks the field value of the lower-level KV block identification field of the KV block at the tail of the linked list as valid.
By managing KV data with the same fingerprint through a linked list, the method solves the conflict problem of add (insert) operations.
In some possible implementations, the KV operation request is a KV add request. Specifically, the accelerator may determine a hash value and a fingerprint according to the key content of the target KV block, then determine the hash bucket corresponding to the target KV block according to the hash value, and then write the block address of the target KV block and the fingerprint determined from the key content of the target KV block into an empty hash slot in the hash bucket.
By writing the block address of the target KV block and the fingerprint determined from the key content of the target KV block into an empty hash slot in the hash bucket, the method enables subsequent data queries and modifications to be performed based on the above block address and fingerprint.
In some possible implementations, the KV operation request is a KV query request. The KV operation request includes the block address and key content of the target KV block to be queried. Correspondingly, the accelerator may determine a hash value according to the key content of the target KV block, determine the corresponding hash bucket according to the hash value, perform address translation according to the block address of the target KV block to obtain a physical address, and read the hash bucket according to the physical address. The accelerator may then determine a fingerprint according to the key content of the target KV block, query the hash bucket according to the fingerprint determined from the key content of the target KV block, and obtain the value content corresponding to the key content.
In this method, the accelerator can quickly find the target KV block based on the data structure designed for the accelerator, which improves operation performance. Moreover, the data structure designed for the accelerator fully considers conflict situations; even if a conflict exists, the target KV block to be found can be accurately located, meeting service requirements.
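A minimal sketch of the query path just described, assuming simplified slot and KV-block layouts (the layouts in Figure 6 are richer) and reusing the two hash helpers from the earlier sketch. The `read_block` helper, the use of a zero block address to mark an empty slot, and the omission of address translation and of the collision chain are all assumptions made for brevity.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define SLOTS_PER_BUCKET 8                       /* assumed bucket width */

struct slot     { uint16_t fp; uint16_t klen; uint64_t block_addr; };   /* simplified */
struct bucket   { struct slot slots[SLOTS_PER_BUCKET]; };
struct kv_block { uint16_t klen; uint32_t vlen; uint8_t payload[]; };   /* key then value */

/* Assumed helpers: bucket index and fingerprint (see the earlier sketch) and a
 * DMA-style read of a KV block given its block address. */
uint32_t hash_bucket_index(const uint8_t *key, size_t klen, uint32_t num_buckets);
uint16_t key_fingerprint(const uint8_t *key, size_t klen);
struct kv_block *read_block(uint64_t block_addr);

/* Returns a pointer to the value content, or NULL if the key is not present. */
const uint8_t *kv_query(struct bucket *table, uint32_t num_buckets,
                        const uint8_t *key, uint16_t klen, uint32_t *vlen_out) {
    struct bucket *b = &table[hash_bucket_index(key, klen, num_buckets)];
    uint16_t fp = key_fingerprint(key, klen);
    for (int i = 0; i < SLOTS_PER_BUCKET; i++) {
        struct slot *s = &b->slots[i];
        if (s->block_addr == 0 || s->fp != fp || s->klen != klen)
            continue;                            /* fingerprint and key-length filter */
        struct kv_block *blk = read_block(s->block_addr);
        if (blk && blk->klen == klen && memcmp(blk->payload, key, klen) == 0) {
            *vlen_out = blk->vlen;
            return blk->payload + klen;          /* value content follows the key content */
        }
    }
    return NULL;
}
```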
In a second aspect, this application provides a data processing apparatus. The data processing apparatus includes units for performing the data processing method in the first aspect or any possible implementation of the first aspect.
In a third aspect, this application provides an accelerator. The accelerator includes a processing module and a communication interface. The communication interface is used to provide network communication for the processing module, and the accelerator is used to perform the data processing method in the first aspect or any possible implementation of the first aspect.
In a fourth aspect, this application provides a computing device including an accelerator. The computing device includes an accelerator and a processor. The processor may be a central processing unit, and the central processing unit can also provide the KV service. The accelerator is used to perform the data processing method in the first aspect or any possible implementation of the first aspect, so as to accelerate the KV service.
In a fifth aspect, this application provides a computer-readable storage medium storing instructions that instruct a computing device to perform the data processing method described in the first aspect or any implementation of the first aspect.
In a sixth aspect, this application provides a computer program product containing instructions that, when run on a computing device, cause the computing device to perform the data processing method described in the first aspect or any implementation of the first aspect.
On the basis of the implementations provided in the above aspects, this application can be further combined to provide more implementations.
Description of Drawings
Figure 1 is a schematic architectural diagram of a data processing system provided by this application;
Figure 2 is a flow chart of a data processing method provided by this application;
Figure 3 is a schematic structural diagram of a KV server node provided by this application;
Figure 4 is a schematic flow chart of a data processing method provided by this application;
Figure 5 is a schematic structural diagram of a data processing system provided by this application;
Figure 6 is a schematic structural diagram of KV data provided by this application;
Figure 7 is a schematic flow chart of a data processing method provided by this application;
Figure 8 is a schematic flow chart of a data processing method provided by this application;
Figure 9 is a schematic flow chart of a data processing method provided by this application;
Figure 10 is a schematic structural diagram of a data processing device provided by this application;
Figure 11 is a schematic structural diagram of a computing device provided by this application.
Detailed Description
To facilitate understanding, some technical terms involved in the embodiments of this application are first introduced.
Distributed storage refers to storing data in a distributed manner on multiple independent devices (for example, storage devices such as storage servers). A distributed storage system is a storage system that stores data in a distributed manner. A distributed storage system usually has a scalable system structure; it can use multiple storage servers to share the storage load and use location servers to locate stored information. This not only improves the reliability, availability, and access efficiency of the storage system, but also makes it easy to expand.
Data or metadata in distributed storage can be organized using a tree structure or a key-value (KV) structure. Considering query performance and query cost, local key data (for example, hotspot data) can usually be organized using a key-value data structure to obtain lower query latency and higher concurrency.
A key-value (KV) pair is a way of describing the mapping relationship between elements that are related to each other. Each pair of elements contains a key and a value, and the corresponding value can be retrieved through a specific key combined with the data structure used.
To meet the performance requirements of highly concurrent KV operations, this application provides a data processing method. The method can be applied to a computing device supporting a KV service, where supporting a KV service includes supporting adding, deleting, querying, or modifying KV data, or batch adding, batch deleting, batch querying, or batch modifying KV data. The computing device includes an accelerator and a processor. The processor may be a CPU, and the CPU can also support the KV service. An accelerator is a device that cooperates with the processor to accelerate services. The accelerator may be a data processing unit (DPU) or an infrastructure processing unit (IPU). For ease of description, the accelerator is described below by taking a DPU as an example.
Specifically, a DPU is a system on chip that focuses on on-path processing and computation of data, where on-path processing refers to channel-associated signaling processing: channel-associated signaling is the signaling required for call connection that is transmitted over the trunk circuit occupied by that connection. In addition to its ability to accelerate scenarios such as network forwarding, virtualization, and storage, a DPU usually has a certain degree of programmability and can perform customized offload acceleration according to the application scenario. A DPU can independently deploy an operating system (which may also be called a small system) to provide the KV service. In this case, the computing device where the DPU is located includes two operating systems, namely a general operating system running on the CPU and a small system running on the DPU. In addition, the DPU can also exist as an external device of the CPU and, together with processors such as a graphics processing unit (GPU), form a heterogeneous system.
Specifically, the DPU may obtain a KV operation request, and then the DPU may determine an execution mode according to the KV operation request. The execution mode includes an offloading mode and a non-offloading mode. The offloading mode is used to instruct an accelerator such as the DPU to perform the operation requested by the KV operation request, and the non-offloading mode is used to instruct a processor such as the CPU to perform the operation requested by the KV operation request. The DPU may then process the KV operation request according to the execution mode.
Through the programmability and on-path processing capability of a data processor such as a DPU, the method offloads basic operations on distributed KV data to the DPU; data-plane processing within the offloading capability range completely bypasses the CPU, which can both improve the throughput of the system and reduce the occupation of the CPU. Moreover, the method retains the full KV functionality on the CPU side, and the small number of KV operations that exceed the offloading specification are still forwarded to the CPU for processing, forming a hierarchical KV service that takes into account both performance and generality.
Next, the system architecture of the embodiments of this application is introduced with reference to the accompanying drawings.
Referring to the schematic architectural diagram of the data processing system shown in Figure 1, the data processing system 10 includes a KV server node 100 and a KV client node 200. The KV server node 100 is a computing device supporting the KV service, for example, a server supporting the KV service. The KV client node 200 is a device that supports accessing the KV server node 100; the KV client node 200 may be a lightweight device, including but not limited to a laptop, a tablet, or a smartphone. The KV server node 100 and the KV client node 200 may be interconnected through a network, for example, a high-performance network. It should be noted that, depending on the networking scale of different business scenarios, the data processing system 10 may include one or more KV server nodes 100; similarly, the data processing system 10 may include one or more KV client nodes 200.
The KV server node 100 includes a DPU 102 and a CPU 104. In the example of Figure 1, the CPU 104 is placed in the host, and the DPU 102 serves as an external device of the host. The KV server node 100 further includes a memory 106, which is used to store KV data to speed up the access efficiency of KV data. The memory 106 may be attached to the host; on this basis, the memory 106 may also be called host memory. Each host may be attached to multiple memories 106, and the multiple memories 106 may be used to form a memory pool.
An application is deployed on the KV client node 200. An application process may be spawned when the application runs. The application process may call the KV service interface to initiate a KV operation. Correspondingly, the KV client on the KV client node 200 (for example, a KV client process) may generate a KV operation request based on the KV operation. The KV operation request may use the remote direct memory access (RDMA) message format. The KV client node 200 then sends the KV operation request to the KV server node 100. The DPU 102 of the KV server node 100 is responsible for receiving and processing various KV operation requests.
Specifically, the DPU 102 may determine an execution mode according to the KV operation request, and then process the KV operation request according to the execution mode. For example, the DPU 102 may obtain the operation metadata in the KV operation request, including the operation type and the key length (Klen). When the operation metadata satisfies a preset condition, for example, the operation type is one or more of add, delete, query, modify, batch add, batch delete, batch query, or batch modify, and the key length is less than a preset length, the execution mode is determined to be the offloading mode, and the DPU 102 may perform a target KV operation on the memory 106 according to the KV operation request. When the operation metadata does not satisfy the preset condition, the execution mode is determined to be the non-offloading mode; the DPU 102 forwards the KV operation request to the CPU 104, and the CPU 104 performs the target KV operation on the memory 106 according to the KV operation request. In other words, KV operation requests within the processing capability of the DPU 102 are processed by the DPU 102, and KV operation requests beyond the processing capability of the DPU 102 may be forwarded to the CPU 104 for processing.
Based on the data processing system 10 shown in Figure 1, an embodiment of this application further provides a data processing method. The data processing method of the embodiment of this application is introduced below with reference to the accompanying drawings.
Referring to the flow chart of the data processing method shown in Figure 2, the method includes:
S202: The DPU 102 receives a KV operation request sent by the KV client node 200.
The KV operation request includes operation metadata. The operation metadata may include an operation type and a key length. The operation type may include basic operations, such as one or more of add (create), delete, query (read), and modify (update). Add, delete, query, and modify may be collectively referred to as CRUD. In some embodiments, multiple KV operations may be encapsulated in one KV operation request; on this basis, the operation type may include batch basic operations, such as one or more of batch add, batch delete, batch query, or batch modify. The key length refers to the length of the key, that is, the length of the key content.
S204: The DPU 102 obtains the operation metadata in the KV operation request. When the operation metadata satisfies the preset condition, S206 is performed; when the operation metadata does not satisfy the preset condition, S208 is performed.
S206: The DPU 102 determines that the execution mode is the offloading mode, and then performs S210.
S208: The DPU 102 determines that the execution mode is the non-offloading mode, and then performs S214.
Specifically, the DPU 102 may parse the KV operation request to obtain the operation metadata in the KV operation request. The operation metadata includes an operation type and a key length. Generally, the DPU 102 is capable of handling basic operations, while the CPU 104 is capable of handling complex operations; moreover, the DPU 102 is limited by its hardware capabilities and is usually used to process KV data whose key length is within a preset length (that is, the key length is less than the preset length). On this basis, the DPU 102 can determine, according to the operation type and the key length, whether the DPU 102 is capable of handling the KV operation in the KV operation request, thereby determining the execution mode of the KV operation.
For example, the DPU 102 may compare the operation type in the KV operation request with the operation types of basic operations (such as add, delete, query, modify, batch add, batch delete, batch query, or batch modify), and compare the key length in the KV operation request with the preset length.
When the operation type matches an operation type of a basic operation and the key length is less than the preset length, it indicates that the DPU 102 is capable of handling the KV operation in the KV operation request, and the execution mode can be determined to be the offloading mode. The offloading mode is used to instruct the DPU 102 to perform the operation requested by the KV operation request.
When the operation type does not match any operation type of the basic operations, and/or the key length is not less than the preset length, it indicates that the DPU 102 is not capable of handling the KV operation in the KV operation request, and the execution mode can be determined to be the non-offloading mode. The non-offloading mode is used to instruct the CPU 104 to perform the operation requested by the KV operation request.
The preset length can be set according to the hardware type of the DPU 102; different hardware types of the DPU 102 may correspond to different preset lengths. For example, the preset length may be set to 128 bytes (B) or to 1 kilobyte (KB).
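A sketch of this decision in C, assuming a 128-byte preset length and the operation-type encoding shown; both are illustrative choices, since the application only requires that the preset length match the DPU's hardware capability.

```c
#include <stdbool.h>
#include <stddef.h>

enum kv_op_type {                       /* assumed encoding of the operation type */
    KV_OP_CREATE, KV_OP_DELETE, KV_OP_READ, KV_OP_UPDATE,
    KV_OP_BATCH_CREATE, KV_OP_BATCH_DELETE, KV_OP_BATCH_READ, KV_OP_BATCH_UPDATE,
    KV_OP_OTHER                         /* any operation the DPU does not accelerate */
};

enum exec_mode { EXEC_OFFLOAD, EXEC_NON_OFFLOAD };

#define PRESET_KEY_LEN 128              /* e.g. 128 B; depends on the DPU hardware type */

/* Offload only (batch) basic operations whose key length is below the preset length. */
static enum exec_mode decide_exec_mode(enum kv_op_type type, size_t klen) {
    bool basic = (type >= KV_OP_CREATE && type <= KV_OP_BATCH_UPDATE);
    return (basic && klen < PRESET_KEY_LEN) ? EXEC_OFFLOAD : EXEC_NON_OFFLOAD;
}
```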
It should be noted that the above S204 to S208 are one specific implementation in which the DPU 102 determines the execution mode according to the KV operation request in this embodiment. In other possible implementations of the embodiments of this application, the DPU may skip the above steps or use other implementations. For example, the DPU 102 may directly attempt to perform the target KV operation; if the execution succeeds, the result is returned, and if the execution fails, the KV operation request is forwarded to the CPU for processing.
S210: The DPU 102 performs the target KV operation on the memory 106 according to the KV operation request.
The memory 106 uses KV blocks to store KV data. On this basis, the DPU 102 may write a target KV block to the memory 106 or query a target KV block from the memory 106 according to the KV operation request. When the operation type in the KV operation request is add, modify, batch add, or batch modify, the DPU 102 performs the operation of writing the target KV block to the memory 106. When the operation type in the KV operation request is query or batch query, the DPU 102 performs the operation of querying the target KV block from the memory 106. When the operation type in the KV operation request is delete or batch delete, the DPU 102 may perform the operation of deleting the target KV block from the memory 106.
It should be noted that, when the KV operation is add, modify, batch add, or batch modify, the KV operation request may also include the value content of the KV data to be added or modified. For example, the KV operation request may include the key content "name" and the value content "Zhang San", so as to request adding the KV data "name, Zhang San".
S212: The DPU 102 returns a KV operation result to the KV client node 200.
Depending on the operation type, the KV operation result may include different information. For example, when the operation type is add, modify, delete, batch add, batch modify, or batch delete, the KV operation result may include operation success or operation failure. For another example, when the operation type is query or batch query, the KV operation result may include the queried KV data.
To facilitate understanding, a specific example is given. For a query or batch query operation, the DPU 102 may encapsulate the queried KV data and the operation-success field value in the same response message and then return the response message to the client node 200. In some embodiments, the DPU 102 may also encapsulate the queried KV data and the operation-success field value in different response messages and return them to the client node 200 separately.
S214: The DPU 102 forwards the KV operation request to the CPU 104.
S216: The CPU 104 performs the target KV operation on the memory 106 according to the KV operation request.
S218: The CPU 104 returns a KV operation result to the KV client node 200.
For the specific implementation in which the CPU 104 performs the target KV operation on the memory 106 according to the KV operation request and returns the KV operation result to the KV client node 200, reference may be made to the description of the DPU 102 performing the target KV operation and returning the KV operation result, which is not repeated here.
The above S210 to S212 and S214 to S218 are some specific implementations in which the DPU 102 processes the KV operation request according to the execution mode. In other possible implementations of the embodiments of this application, the processing of the KV operation request may also be performed in other ways.
As a possible implementation, the DPU 102 can be logically divided into a data plane and a control plane. The embodiment shown in Figure 2 introduces the data processing method mainly from the perspective of the data plane. The method of the embodiments of this application is described in detail below from the perspectives of the control plane and the data plane.
First, refer to the schematic structural diagram of the KV server node 100 shown in Figure 3. As shown in Figure 3, the DPU 102 is logically divided into two parts, a data plane and a control plane. The data plane is responsible for network I/O communication and for the on-path processing of KV transactions, and the control plane is responsible for managing the context information of communication and transaction processing and for managing the state of transactions. The CPU 104 runs the KV server and spawns a KV server process. The KV server process may be responsible for connection management with KV client processes and for the management and scheduling of KV resources in the memory 106, and has complete KV operation processing capabilities; the small number of KV operation requests that cannot be accelerated by the DPU 102 can be processed by the KV server process.
KV data structures reside in the memory 106. The KV data structure describes the organizational form of KV data. KV data may be organized in the form of KV blocks. To improve lookup efficiency, a hash table also resides in the memory 106. In the embodiments of this application, the corresponding KV data structures are designed in a manner suitable for acceleration by the DPU 102, so that they can be processed more efficiently by the DPU 102.
Next, refer to the schematic flow chart of the data processing method shown in Figure 4, which includes the following steps:
Step 1: A KV operation request from the KV client node 200, in the form of an RDMA message, passes through the switching network and reaches the network port of the DPU 102 in the KV server node 100. The on-path processing unit located in the data processing plane of the DPU 102 parses the message to obtain the operation metadata.
The switching network, which may also be called the connection network, is a network that establishes a communication channel between the source and the destination of communication to realize information transmission. A switching network is usually implemented by switching devices, which may include switches, routers, and other devices that implement information exchange.
The operation metadata may include one or more of the operation type or the key length. Further, the KV operation request may also include the key content. For add, modify, batch add, and batch modify operations, the KV operation request may also include the value content. Considering that data may exist in multiple versions, the operation metadata may also include a version number. Similarly, the operation metadata may also include a sequence number.
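One possible layout for the fields just listed, carried in the RDMA payload of a KV operation request. The field widths, ordering, and names are assumptions for illustration; this application does not fix a wire format.

```c
#include <stdint.h>

/* Hypothetical header of a KV operation request carried in an RDMA message. */
struct kv_request_hdr {
    uint8_t  op_type;      /* add / delete / query / modify, or a batch variant */
    uint8_t  flags;        /* reserved */
    uint16_t klen;         /* key length */
    uint32_t vlen;         /* value length; 0 for delete and query operations */
    uint32_t version;      /* optional data version number */
    uint32_t sequence;     /* optional request sequence number */
    /* followed by klen bytes of key content and vlen bytes of value content */
};
```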
Step 2: The on-path processing unit checks whether the current KV operation can be offloaded and accelerated on the DPU 102. If it exceeds the offloading capability of the DPU 102, it is forwarded to the host CPU 104 to be processed by the KV server process.
Step 3: The on-path processing unit writes a context into the control plane to record the network connection state and the state of the current KV operation.
The context includes the processing information necessary to perform the current KV operation and the state of the current KV operation. The processing information may include the key content; further, the processing information may also include the value content. The state of the current KV operation includes the execution node and execution result of the current KV operation.
Step 4: According to the data requirements of the KV operation, the on-path processing unit fetches the necessary operation data from the memory 106 through the I/O processing module over the high-speed bus for processing.
The high-speed bus may be a standard bus such as Peripheral Component Interconnect Express (PCIe) or Compute Express Link (CXL), or a proprietary bus type. The operation of fetching data may use direct memory access (DMA) or other memory access semantics, such as load/store. Depending on the complexity of different KV operations, the on-path processing unit may need to complete several rounds of data fetching and processing.
Step 5: The on-path processing unit writes the intermediate results and final results of the processing into the context, and updates the operation state.
When the KV operation is an add operation, a query operation, a batch add operation, or a batch query operation, conflicts may occur. For example, if the characteristic value of the key content of the KV data to be added in an add operation is the same as that of the key content of other KV data in the hash slot, or the characteristic value of the key content of the KV data to be queried in a query operation is the same as that of the key content of other KV data in the hash slot, a conflict occurs. The on-path processing unit may perform a conflict resolution operation, and the results generated during this operation may be intermediate results. For example, the intermediate results may include a linked list recording conflict information.
It should be noted that when the KV operation is an add operation, a delete operation, or the like, there may be no intermediate results, and the on-path processing unit may write the final result into the context.
Step 6: The on-path processing unit encapsulates the final result of the KV operation into an RDMA message and sends it back to the KV client node 200, completing one KV operation.
Throughout the process, the processing of the KV operation can be completed by the on-path processing unit on the DPU 102, and the data interaction with the memory 106 is completed directly over the high-speed bus; neither requires the participation of the computing power of the CPU 104.
Compared with Figure 2, Figure 4 not only describes the flow of the data processing method from the data plane, but also introduces the data processing method in detail from the control plane. It should also be noted that Figure 4 illustrates the interaction between one KV client node 200 and one KV server node 100; in practical applications, multiple KV server nodes 100 may store KV data in a distributed manner and provide services for one or more KV client nodes 200.
Referring to the schematic structural diagram of the data processing system 10 shown in Figure 5, several KV client nodes 200 can be interconnected with several KV server nodes 100 through an RDMA network. The full set of KV data can be divided into different domain segments and stored in the memory 106 of different KV server nodes 100. The KV data can be divided into different domain segments according to key ranges, or divided into different domain segments in other ways, and stored across different KV server nodes 100 to achieve load balancing.
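A sketch of routing a key to its owning KV server node when the full key space is split into domain segments by key range. The segment table, its string-range representation, and the linear scan are illustrative assumptions; the application leaves the partitioning scheme open.

```c
#include <stddef.h>
#include <string.h>

/* One domain segment: keys in [lo, hi) are owned by server_id (illustrative layout). */
struct domain_segment { const char *lo; const char *hi; int server_id; };

/* Pick the KV server node that stores a given key; returns -1 if no segment matches. */
static int route_key(const struct domain_segment *segs, size_t nsegs, const char *key) {
    for (size_t i = 0; i < nsegs; i++) {
        if (strcmp(key, segs[i].lo) >= 0 && strcmp(key, segs[i].hi) < 0)
            return segs[i].server_id;   /* issue the request on the RDMA QP to this node */
    }
    return -1;
}
```

A KV client process could keep such a table locally and use the returned node identifier to choose the RDMA connection on which to send the KV operation request.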
KV服务端节点100中KV服务端进程可以包括多个执行线程,以支持一定的并发。每个KV服务端节点100至少和每个KV客户端节点200之间建立一条RDMA连接以完成消息传输。例如图5中有N个KV客户端节点200和M个KV服务端节点100,相应的,KV服务端节点和KV客户端节点200上分别至少有N条和M条RDMA连接。The KV server process in the KV server node 100 may include multiple execution threads to support certain concurrency. Each KV server node 100 establishes at least one RDMA connection with each KV client node 200 to complete message transmission. For example, there are N KV client nodes 200 and M KV server nodes 100 in Figure 5. Correspondingly, there are at least N and M RDMA connections on the KV server nodes and KV client nodes 200 respectively.
当应用触发KV操作时,KV客户端进程可以根据数据域段的划分,在对应的RDMA队列对(queue pair,QP)上发起KV操作请求。QP是硬件和软件之间的一个虚拟接口。QP是队列结构,按顺序存储着软件给硬件下发的任务,也即工作队列元素(Work Queue Ellement,WQE),WQE中包含从哪里取出多长的数据,并且发送给哪个目的地等等信息。每个QP间都是独立的,彼此通过保护域(Protection Domain,PD)隔离,因此一个QP可以被视为某个用户独占的一种资源,一个用户也可以同时使用多个QP。QP有很多种服务类型,包括可靠连接(reliable connection,RC)、可靠数据报(reliable datagram,RD)、不可靠连接(unreliable connection,UC)和不可靠数据报(unreliable datagram,UD)等,所有的源QP和目的QP为同一种类型时可以进行数据交互。When an application triggers a KV operation, the KV client process can initiate a KV operation request on the corresponding RDMA queue pair (QP) according to the division of data domain segments. QP is a virtual interface between hardware and software. QP is a queue structure, which stores tasks issued by software to hardware in order, that is, Work Queue Element (WQE). WQE contains information such as where the data is taken out and how long it is sent to, and to which destination it is sent. . Each QP is independent and isolated from each other by a Protection Domain (PD). Therefore, a QP can be regarded as a resource exclusive to a user, and a user can also use multiple QPs at the same time. QP has many service types, including reliable connection (RC), reliable datagram (RD), unreliable connection (UC) and unreliable datagram (UD), etc., all Data interaction is possible when the source QP and destination QP are of the same type.
In this example, on receiving a KV operation request, the DPU 102 on the KV server node 100 side checks whether the operation type, key length and other information of the KV operation fall within the acceleration range supported by the DPU 102. If they do, the DPU 102 completes the processing of the request; otherwise, the request is forwarded to the host CPU 104 for processing.
The KV data structure resident in the memory 106 can be seen in Figure 6. As shown in Figure 6, the hash table includes multiple hash buckets, for example the columns of the hash table in Figure 6, denoted Entry 0 ... Entry M. Each of the hash buckets includes multiple hash slots, corresponding to the rows within a column, denoted Slot 0 ... Slot N. The hash slots belonging to the same hash bucket are used to store the fingerprints (Fingerprint), key lengths (Klen) and corresponding block addresses (KV Block addresses) of multiple keys that share the same hash value.

In some possible implementations, for the key of a piece of KV data, the result computed by hash algorithm 1 (i.e., the hash value described above) can be used to index the hash bucket entry. Each hash bucket contains multiple slots, and keys that produce the same result under hash algorithm 1 are placed into different slots of the same hash bucket. The header of each slot includes a Fingerprint field, which stores the result computed from the key by hash algorithm 2 (i.e., the fingerprint described above). In a specific implementation, the DPU 102 or the CPU 104 can distinguish and look up multiple keys stored in the same hash bucket through the Fingerprint field. Further, a slot also stores the virtual address of the KV Block and the key length (Klen) information, as well as the check key needed for address translation; the corresponding KV Block can be read and written through the virtual address.
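For illustration only, a minimal C-language sketch of the hash bucket and hash slot layout described above is given below; the field widths and the bucket depth are assumptions of the sketch, not values specified by this application.

    #include <stdint.h>

    #define SLOTS_PER_BUCKET 8        /* assumed bucket depth */

    /* One hash slot: fingerprint, key length, check key and KV Block address. */
    struct hash_slot {
        uint16_t fingerprint;         /* key hashed with hash algorithm 2 */
        uint16_t klen;                /* key length */
        uint32_t check_key;           /* check key used for address translation */
        uint64_t kv_block_va;         /* virtual address of the KV Block */
    };

    /* One hash bucket, indexed by the key hashed with hash algorithm 1. */
    struct hash_bucket {
        struct hash_slot slots[SLOTS_PER_BUCKET];
    };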
The header field segment in a KV Block includes the complete key content and the length information of the value (Vlen). Further, the header field segment of the KV Block may also include a lower-level KV block identification field, such as the flags field in Figure 6. In the case of severe collisions, different keys in the same hash bucket may also yield the same fingerprint; these keys then share the same slot, and their corresponding KV Blocks are managed in the form of a linked list. Specifically, the flags field indicates whether a next-level KV Block exists, and the virtual address and check key of the next-level KV Block are stored in the previous-level KV Block (for example, in the Next field segment of the KV Block). If no next-level KV Block exists, the corresponding address and check key fields are invalid, but the corresponding space can be reserved (for example, the field values can be set to rsvd) for use in possible later insert operations.
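Continuing the sketch, the header and Next field segments of a KV Block described above can be pictured as the following C structures; the inline key buffer size, the flag bit and the exact field order are assumptions of the sketch.

    #include <stdint.h>

    #define KEY_MAX          256      /* assumed maximum key length */
    #define FLAG_NEXT_VALID  0x1      /* assumed bit meaning "next-level KV Block exists" */

    /* Header field segment: complete key content and value length. */
    struct kv_block_header {
        uint32_t flags;               /* lower-level KV block identification field */
        uint32_t vlen;                /* length of the value content */
        uint16_t klen;                /* length of the key content */
        uint8_t  key[KEY_MAX];        /* complete key content */
    };

    /* Next field segment: address and check key of the next-level KV Block. */
    struct kv_block_next {
        uint64_t next_va;             /* virtual address of the next-level KV Block */
        uint32_t next_check_key;      /* check key of the next-level KV Block */
    };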
With this data structure design, when the DPU 102 performs a KV Block lookup it only needs to load the header field and the Next field of Figure 6 from the memory 106, rather than the entire KV Block. Once the key check confirms that the lookup has succeeded, it pulls the value content directly from the memory 106 to assemble an RDMA message and sends the RDMA message to return the KV operation result. In most scenarios the key is relatively short, for example within 1 KB, and can be loaded into the DPU 102 for processing, whereas the value may reach the MB level, and loading the value into the DPU 102 would put great pressure on the on-chip buffer. This data structure design reduces the specification requirements on the DPU 102 and is better suited to offload acceleration.
To make the technical solution of the present application clearer and easier to understand, the following uses add (also called insert), modify (also called update), query and batch operations as examples.

First, refer to the flow diagram of the data processing method shown in Figure 7. This method mainly describes the insert or update flow from the perspective of the interaction between the KV client node 200 and the KV server node 100, and specifically includes the following steps:
Step 1: The KV client node 200 sends a KV operation request to the KV server node 100.

Specifically, the KV client node 200 can send a KV insert request or a KV update request to the KV server node 100 through an RDMA write operation. The KV insert request or KV update request carries the information necessary to generate the KV Block and the corresponding write address (a virtual address).

The information necessary to generate the KV Block may include the key content and value content of the target KV block to be added or modified, and the write address may be the block address of the target KV block. Further, the KV insert request or KV update request may also include a check key.

It should be noted that if the message length of the KV insert request or KV update request exceeds the size of one maximum transmission unit (MTU), the KV client node 200 can complete the transmission with multiple packets.
Step 2: The DPU 102 receives the KV operation request, writes the value content into the memory 106 according to the KV operation request, and determines the hash value and the fingerprint according to the key content and the hash algorithms.

Specifically, the DPU 102 directly writes the value content (the Value part) of the target KV block into the memory 106 by DMA according to the block address of the target KV block carried in the KV insert request or KV update request, then determines the hash value from the key content (the Key part) with hash algorithm 1, determines the hash bucket corresponding to the target KV block from the hash value, and then determines the fingerprint of the target KV block with hash algorithm 2.
Step 3: The DPU 102 reads the hash bucket corresponding to the target KV block and compares, slot by slot, whether the fingerprint in each hash slot of the bucket is the same as the fingerprint of the target KV block computed in step 2. If the current operation is an update, the slot whose fingerprint matches is refreshed; if the current operation is an insert, the DPU 102 can try to find an empty slot and write the block address and fingerprint of the KV Block into it.

If a fingerprint collision occurs, collision resolution can be performed. The DPU 102 can read the KV Block header information referenced by the conflicting slot, check through the field value of the lower-level KV block identification field whether the next-level KV Block address is valid, and if it is valid, read the lower-level KV block address field to obtain the block address of the next-level KV Block, repeating this until the KV Block at the tail of the linked list is found. It then writes the address and check key carried in the KV insert request into the next-level KV Block address and check key fields of that tail KV Block, and modifies the flags field to indicate that the next-hop address of the current KV Block is valid.
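For illustration only, the collision-resolution walk described above can be sketched as follows, reusing the hypothetical hash_slot, kv_block_header and kv_block_next layouts from the earlier sketches; the dma_read and dma_write helpers stand in for the DPU's DMA path and simply treat the virtual address as a host pointer, with address translation and the check key omitted.

    #include <stdint.h>
    #include <string.h>

    /* Placeholder DMA helpers: in the sketch the virtual address is treated as a
     * host pointer; a real accelerator would translate it and issue DMA. */
    static void dma_read(uint64_t va, uint32_t check_key, void *dst, size_t len) {
        (void)check_key;
        memcpy(dst, (const void *)(uintptr_t)va, len);
    }
    static void dma_write(uint64_t va, uint32_t check_key, const void *src, size_t len) {
        (void)check_key;
        memcpy((void *)(uintptr_t)va, src, len);
    }

    /* Append a new KV Block behind the tail of the linked list referenced by a
     * conflicting slot, then mark the tail's next-hop address as valid. */
    static void resolve_fingerprint_conflict(const struct hash_slot *slot,
                                             uint64_t new_block_va,
                                             uint32_t new_check_key)
    {
        uint64_t cur_va  = slot->kv_block_va;
        uint32_t cur_key = slot->check_key;
        struct kv_block_header hdr;
        struct kv_block_next   nxt;

        for (;;) {                                   /* walk to the tail of the list */
            dma_read(cur_va, cur_key, &hdr, sizeof(hdr));
            dma_read(cur_va + sizeof(hdr), cur_key, &nxt, sizeof(nxt));
            if (!(hdr.flags & FLAG_NEXT_VALID))
                break;
            cur_va  = nxt.next_va;
            cur_key = nxt.next_check_key;
        }

        nxt.next_va        = new_block_va;           /* link the new KV Block */
        nxt.next_check_key = new_check_key;
        hdr.flags         |= FLAG_NEXT_VALID;        /* next hop is now valid */
        dma_write(cur_va + sizeof(hdr), cur_key, &nxt, sizeof(nxt));
        dma_write(cur_va, cur_key, &hdr, sizeof(hdr));
    }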
Step 4: The KV server node 100 returns the KV operation result to the KV client node 200.

Specifically, the KV server node 100 can notify the KV client node 200 through an RDMA Send that the current KV insert/update operation has completed.

It should be noted that when the KV client node 200 sends the KV operation request through an RDMA Write, the DPU 102 can return an acknowledgement message ACK to the KV client node 200 upon receiving the KV operation request. Similarly, when the KV server node 100 (for example, the DPU 102) notifies the completion of the operation through an RDMA Send, the KV client node 200 can also return an acknowledgement message ACK to the KV server node 100.
In the above scenario, a KV write operation (insert or update) involves three or more read/write accesses to the memory 106 (more are needed when a collision occurs) and the corresponding logical processing. The data reads and writes can all be completed directly on the DPU 102 by DMA, and the KV logic processing can also be completed by the cores on the DPU 102. Compared with the one-sided RDMA approach, which reads data from the remote end to the KV client node 200 for processing through multiple RDMA operations, the path in this application, which moves data from the memory 106 to the DPU 102 by DMA for processing, is shorter and more efficient, and can greatly reduce operation latency. Compared with a CPU-based RPC solution, this application relies on the on-path processing of the DPU 102 and completes the KV operation while handling RDMA protocol forwarding, which both bypasses and frees the host CPU resources and achieves higher throughput and better latency.
Referring further to the flow diagram of the data processing method shown in Figure 8, this method mainly describes the query flow from the perspective of the interaction between the KV client node 200 and the KV server node 100, and specifically includes the following steps:

Step 1: The KV client node 200 sends a KV operation request to the KV server node 100.

Specifically, the KV client node 200 can send a KV query request to the KV server node 100 through an RDMA write operation. The KV query request carries the key content (specifically, the key part of the KV data) to request the value content corresponding to that key content (specifically, the value part of the KV data).

It should be noted that an RDMA read operation usually carries no payload. Therefore, the KV client node 200 carries the key content and the operation type in the payload of an RDMA write operation, thereby sending the KV query request through an RDMA write operation.
Step 2: The DPU 102 in the KV server node 100 receives the KV operation request and loads the hash bucket according to the KV operation request.

Specifically, the operation type in the KV operation request is query, which requests the value content corresponding to the key content. Based on this, the DPU 102 can compute the corresponding hash bucket from the key content in the KV query request using hash algorithm 1, complete the translation from virtual address to physical address using the virtual address and check key in the KV query request, and then load the hash bucket from the memory 106 according to the physical address.
Step 3: When the hash bucket is returned, the DPU 102 can compute the fingerprint from the key content using hash algorithm 2, search the hash bucket for a slot whose stored fingerprint equals the computed fingerprint, and obtain the virtual address and check key of the KV Block from that slot.

Step 4: Using the virtual address and check key, the DPU 102 can translate the virtual address of the corresponding KV Block into a physical address and read the Header field and the Next field.

Step 5: After the Header field and Next field of the KV Block are returned, the DPU 102 can compare whether the key content in the Header is the same as the key content in the KV query request. If they are the same, go to step 6; if not, obtain the virtual address and check key of the next KV Block in the linked list from the Next field and return to step 4.
Step 6: The DPU 102 can load the value content, generate the KV operation result, and return the KV operation result to the KV client node 200.

Specifically, the DPU 102 can assemble an RDMA message from the value content and send the value content back to the KV client node 200 through an RDMA Send operation, completing the KV query operation.
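For illustration only, steps 2 to 6 of the query flow can be sketched as the following C function, again reusing the hypothetical structures and dma_read helper above; hash1, hash2 and send_value are placeholder declarations standing in for the two hash algorithms and the RDMA Send path, and the assumption that the value content immediately follows the Next field is made only for the sketch.

    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    /* Placeholder declarations for the two hash algorithms and the reply path. */
    uint64_t hash1(const uint8_t *key, uint16_t klen);
    uint16_t hash2(const uint8_t *key, uint16_t klen);
    void     send_value(uint64_t value_va, uint32_t check_key, uint32_t vlen);

    /* Look up a key: load the bucket, match the fingerprint, then walk the KV Block
     * chain loading only the header and Next fields until the key matches. */
    static bool kv_query(uint64_t table_va, uint32_t table_check_key,
                         uint64_t num_buckets, const uint8_t *key, uint16_t klen)
    {
        struct hash_bucket bucket;
        uint64_t idx = hash1(key, klen) % num_buckets;
        dma_read(table_va + idx * sizeof(bucket), table_check_key,
                 &bucket, sizeof(bucket));

        uint16_t fp = hash2(key, klen);
        for (int i = 0; i < SLOTS_PER_BUCKET; i++) {
            const struct hash_slot *slot = &bucket.slots[i];
            if (slot->fingerprint != fp || slot->klen != klen)
                continue;

            uint64_t va = slot->kv_block_va;
            uint32_t ck = slot->check_key;
            struct kv_block_header hdr;
            struct kv_block_next   nxt;
            for (;;) {
                dma_read(va, ck, &hdr, sizeof(hdr));
                dma_read(va + sizeof(hdr), ck, &nxt, sizeof(nxt));
                if (hdr.klen == klen && memcmp(hdr.key, key, klen) == 0) {
                    /* Sketch assumption: value content follows the Next field. */
                    send_value(va + sizeof(hdr) + sizeof(nxt), ck, hdr.vlen);
                    return true;
                }
                if (!(hdr.flags & FLAG_NEXT_VALID))
                    return false;            /* end of the collision chain */
                va = nxt.next_va;
                ck = nxt.next_check_key;
            }
        }
        return false;
    }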
It should be noted that when the KV client node 200 sends a KV operation request for querying data through an RDMA Write, the DPU 102 can return an acknowledgement message ACK to the KV client node 200 upon receiving the KV operation request. Similarly, when the DPU 102 of the KV server node 100 notifies the completion of the operation through an RDMA Send, the KV client node 200 can also return an acknowledgement message ACK to the KV server node 100.
In the above scenario, a KV read operation involves two or more read/write accesses to the memory 106 (more are needed when a collision occurs) and the corresponding logical processing. Similar to the KV write operation, in this application the data reads and writes can all be completed directly on the DPU 102 by DMA, and the KV logic processing can also be completed by the cores on the DPU 102. Compared with a one-sided RDMA implementation, this application shortens the data transfer path from server memory -> client memory to server memory -> server DPU cache, greatly reducing data transfer latency. Compared with a CPU-based RPC solution, this application can bypass the CPU, saving CPU resources, and improve the overall throughput and latency of the system through the multiple cores and on-path processing capability of the DPU 102.
As another possible implementation, this application also proposes a design that supports transaction batching, which allows the KV client node 200 to encapsulate multiple KV operations in a single KV operation request, for example by carrying the operation metadata of multiple operations. The DPU 102 generates multiple transactions from the KV operation request, each of which is used to perform one of the multiple KV operations; correspondingly, the DPU 102 can execute the multiple transactions in parallel on multiple cores, thereby improving overall efficiency.

Referring to the flow diagram of the data processing method shown in Figure 9, this method mainly describes the batch operation flow from the perspective of the interaction between the KV client node 200 and the KV server node 100, and specifically includes the following steps:
Step 1: After the KV operation request is received by the DPU 102, one thread of the DPU 102 is responsible for processing it.

In Figure 9, the KV operation request can be handled by Thread 0. Specifically, Thread 0 obtains the operation type and the number of operations of the current batch operation by parsing the message header.

Step 2: Through the distribution and state-management logic on Thread 0, the DPU 102 parses each operation field segment in turn and initializes the state information of each operation in the state table, then generates a new operation transaction and enqueues it into the transaction queue, and generates an operation notification and enqueues it into the notification queue.

The notification queue can be implemented in the form of the doorbell queue shown in the figure.
Step 3: The scheduler on the DPU 102 can dispatch the generated operation notifications to different threads, and the different threads perform the corresponding KV operations respectively.

In Figure 9, the scheduler on the DPU 102 dispatches the doorbells to Thread 1 through Thread N. Correspondingly, Thread 1 through Thread N parse the doorbells, read the information needed to complete the operation from the transaction queue, and then perform the corresponding operation, for example a KV insert operation.
Step 4: After an operation completes, each thread writes back the corresponding state table.

While the threads write back the state table, the queues or order-preserving resources inside the DPU 102 ensure the atomicity of the state updates. The last thread to finish generates a doorbell for replying to the request and enqueues it into the doorbell queue.

Step 5: The scheduler on the DPU 102 dispatches the doorbell for the reply request to one thread for execution. That thread collects the state information and, when the state information indicates that the operations have completed, generates a completion message and sends it back to the KV client node 200.
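For illustration only, the state-table bookkeeping behind steps 1 to 5 can be sketched as follows; the structure layouts, the one-byte operation encoding and the callback parameters are assumptions of the sketch, and thread scheduling, the doorbell queues and the RDMA plumbing are omitted.

    #include <stdatomic.h>
    #include <stdint.h>

    #define MAX_BATCH 32               /* assumed maximum operations per request */

    struct kv_txn {                    /* one entry of the transaction queue */
        uint8_t  op_type;              /* insert, update, query, ... */
        uint32_t payload_off;          /* offset of this operation in the message */
    };

    struct batch_state {
        atomic_uint pending;           /* operations not yet completed */
        uint8_t     status[MAX_BATCH];
        struct kv_txn txns[MAX_BATCH];
    };

    /* Thread 0: parse the header, initialize the state table and enqueue one
     * transaction plus one doorbell per operation. */
    static uint32_t dispatch_batch(struct batch_state *st, const uint8_t *msg,
                                   uint32_t op_count,
                                   void (*ring_doorbell)(uint32_t txn_idx))
    {
        if (op_count > MAX_BATCH)
            op_count = MAX_BATCH;                  /* sketch guard */
        atomic_store(&st->pending, op_count);
        uint32_t off = 0;              /* assumed: operations packed after the header */
        for (uint32_t i = 0; i < op_count; i++) {
            st->status[i] = 0;                     /* not started */
            st->txns[i].op_type = msg[off];        /* assumed 1-byte op type */
            st->txns[i].payload_off = off;
            ring_doorbell(i);                      /* notify the worker threads */
            off += 1 /* + operation-specific length, omitted in the sketch */;
        }
        return op_count;
    }

    /* Worker threads: write back the state table after finishing a transaction;
     * the last thread to finish triggers the completion reply. */
    static void complete_txn(struct batch_state *st, uint32_t txn_idx,
                             void (*send_completion)(void))
    {
        st->status[txn_idx] = 1;                   /* done */
        if (atomic_fetch_sub(&st->pending, 1) == 1)
            send_completion();                     /* last completed transaction replies */
    }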
In this scenario, the requesting end can encapsulate multiple KV operations into one message, and the DPU 102 distributes the multiple transactions in the message to different processing cores for parallel processing, implementing the batch operation through flow combination and orchestration and thereby improving overall transaction processing efficiency.

It is worth noting that other reasonable combinations of steps that a person skilled in the art can derive from the above description also fall within the protection scope of this application. Furthermore, a person skilled in the art should also be aware that the embodiments described in the specification are all optional embodiments, and the actions involved are not necessarily required by this application.
The data processing method provided by the embodiments of this application has been described above with reference to Figures 1 to 9. Next, the functions of the data processing apparatus provided by the embodiments of this application and the computing device that implements the data processing apparatus are described with reference to the accompanying drawings.

Referring to Figure 10, a schematic structural diagram of a data processing apparatus is shown. The data processing apparatus 1000 may be a software apparatus or a hardware apparatus in an accelerator, and the apparatus 1000 includes:
an obtaining unit 1002, configured to obtain a KV operation request;

a determining unit 1004, configured to determine an execution mode according to the KV operation request, where the execution mode includes an offload mode and a non-offload mode, the offload mode is used to indicate that the operation requested by the KV operation request is performed by the accelerator, and the non-offload mode is used to indicate that the operation requested by the KV operation request is performed by the processor; and

an execution unit 1006, configured to perform the processing of the KV operation request according to the execution mode.
It should be understood that the apparatus 1000 in the embodiments of this application may be implemented by a central processing unit (CPU), an application-specific integrated circuit (ASIC) or a programmable logic device (PLD). The PLD may be a complex programmable logical device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), a data processing unit (DPU), a system on chip (SoC), or any combination thereof. When the data processing methods shown in Figures 2 to 9 are implemented in software, the apparatus 1000 and its modules may also be software modules.
In some possible implementations, the determining unit 1004 is specifically configured to:

obtain the operation metadata in the KV operation request; and

when the operation metadata satisfies a preset condition, determine that the execution mode is the offload mode, and otherwise, determine that the execution mode is the non-offload mode.
In some possible implementations, the operation metadata includes an operation type and a key length, and the operation metadata satisfying the preset condition includes: the operation type is one or more of add, delete, query, modify, batch add, batch delete, batch query or batch modify, and the key length is less than a preset length.
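For illustration only, the preset condition described above can be sketched as the following C predicate; the enumeration and the 1 KB threshold are assumptions of the sketch (drawn from the earlier remark that keys are typically within 1 KB), not values fixed by this application.

    #include <stdbool.h>
    #include <stdint.h>

    enum kv_op_type {
        KV_OP_ADD, KV_OP_DELETE, KV_OP_QUERY, KV_OP_MODIFY,
        KV_OP_BATCH_ADD, KV_OP_BATCH_DELETE, KV_OP_BATCH_QUERY, KV_OP_BATCH_MODIFY,
        KV_OP_OTHER
    };

    #define KEY_LEN_LIMIT 1024  /* assumed preset length */

    static bool use_offload_mode(enum kv_op_type op, uint32_t key_len)
    {
        bool supported_type = (op != KV_OP_OTHER);
        /* true  -> the accelerator handles the request (offload mode)
         * false -> forward to the host processor (non-offload mode) */
        return supported_type && key_len < KEY_LEN_LIMIT;
    }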
In some possible implementations, when the execution mode determined according to the KV operation request is the offload mode, the execution unit 1006 is specifically configured to:

perform a target KV operation on the memory of the computing device according to the KV operation request.

In some possible implementations, the memory stores KV data in KV blocks, and a KV block includes a key content field and a value content field;

the execution unit 1006 is specifically configured to:

write a target KV block into the memory of the computing device or query the target KV block from the memory of the computing device according to the KV operation request.

In some possible implementations, the memory stores the metadata of the KV data in a hash table. The hash table includes multiple hash buckets, each of the multiple hash buckets includes multiple hash slots, and multiple hash slots belonging to the same hash bucket are used to store the fingerprints, key lengths and corresponding block addresses of multiple keys that have the same hash value.
In some possible implementations, the KV operation request is a KV modify request, and the execution unit 1006 is further configured to:

determine a hash value and a fingerprint according to the key content of the target KV block;

determine the hash bucket corresponding to the target KV block according to the hash value; and

update the hash slot in the hash bucket whose fingerprint matches the fingerprint determined from the key content of the target KV block.
In some possible implementations, the KV block further includes a lower-level KV block identification field and a lower-level KV block address field, the KV operation request is a KV add request, and the execution unit 1006 is further configured to:

determine a hash value and a fingerprint according to the key content of the target KV block;

determine the hash bucket corresponding to the target KV block according to the hash value; and

when the fingerprint stored in a target hash slot in the hash bucket matches the fingerprint determined from the key content of the target KV block, read the lower-level KV block identification field and the lower-level KV block address field in the target hash slot, determine the KV block at the tail of the linked list according to the field values of the lower-level KV block identification field and the lower-level KV block address field, write the block address of the target KV block into the lower-level KV block address field of the KV block at the tail of the linked list, and mark the field value of the lower-level KV block identification field of the KV block at the tail of the linked list as valid.
In some possible implementations, the KV operation request is a KV add request, and the execution unit 1006 is further configured to:

determine a hash value and a fingerprint according to the key content of the target KV block;

determine the hash bucket corresponding to the target KV block according to the hash value; and

write the block address of the target KV block and the fingerprint determined from the key content of the target KV block into an empty hash slot in the hash bucket.
In some possible implementations, the KV operation request is a KV query request, the KV operation request includes the block address and key content of the target KV block to be queried, and the execution unit 1006 is specifically configured to:

determine a hash value according to the key content of the target KV block, and determine the corresponding hash bucket according to the hash value;

perform address translation according to the block address of the target KV block to obtain a physical address, and read the hash bucket according to the physical address; and

determine a fingerprint according to the key content of the target KV block, query the hash bucket according to the fingerprint determined from the key content of the target KV block, and obtain the value content corresponding to the key content.
Since the data processing apparatus 1000 shown in Figure 10 corresponds to the methods shown in Figures 2, 4, 7, 8 and 9, for the specific implementation of the data processing apparatus 1000 shown in Figure 10 and its technical effects, reference may be made to the relevant descriptions in the foregoing embodiments, which are not repeated here.
Figure 11 is a hardware structural diagram of a computing device 1100 provided by this application. The computing device 1100 may be the aforementioned KV server node 100 and is used to implement the functions of the data processing apparatus 1000 in the embodiment shown in Figure 10.

As shown in Figure 11, the computing device 1100 includes a processor 1101, an accelerator 1102 and a memory 1103. The processor 1101, the accelerator 1102 and the memory 1103 may communicate through a bus 1104, or through other means such as wireless transmission. The computing device 1100 further includes a communication interface 1105, which is used for communicating with the outside, for example with other devices such as the KV client node 200. In some possible implementations, the computing device 1100 may further include a storage 1106.
The processor 1101 may be a central processing unit (CPU), and the accelerator 1102 may be a data processing unit (DPU) or an infrastructure processing unit (IPU). The accelerator 1102 is used to offload the workload of the processor 1101 so as to provide acceleration. It should be noted that the accelerator 1102 can run its own operating system (i.e., a small system) to provide the KV service; in that case the computing device 1100 contains two operating systems, namely the general-purpose operating system running on the processor 1101 and the small system running on the accelerator 1102. In addition, the accelerator 1102 may also exist as an external device of the processor 1101, and the processor 1101 and the accelerator 1102 may constitute a heterogeneous system.
The memory 1103 refers to the internal memory that exchanges data directly with the processor 1101 or the accelerator 1102. It can read and write data at any time and very quickly, and serves as temporary data storage for the operating system or other running programs. The memory may include at least two types of memory; for example, it may be a random access memory (RAM) or a read-only memory (ROM). For example, the random access memory may be a dynamic random access memory (DRAM), a static random access memory (SRAM) or a storage class memory (SCM), and the read-only memory may be a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), and so on. In addition, the memory 1103 may also be a dual in-line memory module (DIMM), that is, a module composed of DRAM, or a solid state disk (SSD). Furthermore, the memory 1103 may be configured to have a power-protection function, which means that the data stored in the memory 1103 is not lost when the system is powered off and then powered on again. Memory with a power-protection function is called non-volatile memory.

In addition to a data bus, the bus 1104 may also include a power bus, a control bus, a status signal bus and so on. For clarity, however, the various buses are all labeled as the bus 1104 in the figure.
The communication interface 1105 is used to communicate with external devices such as the KV client node 200. Specifically, the communication interface 1105 may be a network card, such as the RDMA network card described above. The communication interface 1105 may be used to receive the KV operation request sent by the KV client node 200 through RDMA Write.

The storage 1106, also called external memory or external storage, is usually used to store data or instructions persistently. The storage 1106 may include a magnetic disk or another type of storage medium, such as a solid state disk or a shingled magnetic recording hard disk. When the computing device 1100 also includes the storage 1106, the memory 1103 is also used to temporarily hold data exchanged with the storage 1106.
The memory 1103 is used to store instructions, which may be instructions fixed in the memory 1103 or instructions swapped in from the storage 1106. The accelerator 1102 is used to execute the instructions stored in the memory 1103 to perform the following operations:

obtaining a KV operation request;

determining an execution mode according to the KV operation request, where the execution mode includes an offload mode and a non-offload mode, the offload mode is used to indicate that the operation requested by the KV operation request is performed by the accelerator, and the non-offload mode is used to indicate that the operation requested by the KV operation request is performed by the processor; and

performing the processing of the KV operation request according to the execution mode.

Optionally, the accelerator 1102 is further used to execute the instructions stored in the memory 1103 to perform the other steps of the data processing method in the embodiments of this application.
It should be understood that the computing device 1100 according to the embodiments of this application may correspond to the data processing apparatus 1000 in the embodiments of this application, and may correspond to the KV server node 100 that performs the method shown in Figure 2 according to the embodiments of this application. The above and other operations and/or functions implemented by the computing device 1100 are respectively intended to implement the corresponding flows of the method in Figure 2 and, for brevity, are not repeated here.
An embodiment of this application further provides a computer-readable storage medium. The computer-readable storage medium may be any available medium that a computing device can store, or a data storage device such as a data center containing one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk or a magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a solid state disk), and so on. The computer-readable storage medium includes instructions that instruct a computing device to perform the data processing method described above as applied to the data processing apparatus 1000.

An embodiment of this application further provides a computer program product containing instructions. The computer program product may be software or a program product that contains instructions and can run on a computing device or be stored in any available medium. When the computer program product runs on at least one computing device, the at least one computing device is caused to perform the above data processing method.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solution of the present invention rather than to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of the technical features thereof may be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the protection scope of the technical solutions of the embodiments of the present invention.

Claims (10)

  1. A data processing method, characterized in that the method is applied to a computing device supporting a key-value (KV) service, the computing device comprises an accelerator and a processor, and the method is performed by the accelerator and comprises:
    obtaining a KV operation request;
    determining an execution mode according to the KV operation request, wherein the execution mode comprises an offload mode and a non-offload mode, the offload mode is used to indicate that the operation requested by the KV operation request is performed by the accelerator, and the non-offload mode is used to indicate that the operation requested by the KV operation request is performed by the processor; and
    performing the processing of the KV operation request according to the execution mode.
  2. The method according to claim 1, characterized in that the determining an execution mode according to the KV operation request comprises:
    obtaining operation metadata in the KV operation request; and
    when the operation metadata satisfies a preset condition, determining that the execution mode is the offload mode, and otherwise, determining that the execution mode is the non-offload mode.
  3. The method according to claim 2, characterized in that the operation metadata comprises an operation type and a key length, and the operation metadata satisfying the preset condition comprises: the operation type is one or more of add, delete, query, modify, batch add, batch delete, batch query or batch modify, and the key length is less than a preset length.
  4. The method according to any one of claims 1 to 3, characterized in that when the execution mode determined according to the KV operation request is the offload mode, the performing the processing of the KV operation request according to the execution mode comprises:
    performing a target KV operation on a memory of the computing device according to the KV operation request.
  5. The method according to claim 4, characterized in that the memory stores KV data in KV blocks, and a KV block comprises a key content field and a value content field; and
    the performing a target KV operation on a memory of the computing device according to the KV operation request comprises:
    writing a target KV block into the memory of the computing device or querying the target KV block from the memory of the computing device according to the KV operation request.
  6. The method according to claim 5, characterized in that the memory stores metadata of the KV data in a hash table, the hash table comprises a plurality of hash buckets, each of the plurality of hash buckets comprises a plurality of hash slots, and a plurality of hash slots belonging to the same hash bucket are used to store fingerprints, key lengths and corresponding block addresses of a plurality of keys having the same hash value.
  7. The method according to claim 6, characterized in that the KV operation request is a KV modify request, and the method further comprises:
    determining a hash value and a fingerprint according to the key content of the target KV block;
    determining the hash bucket corresponding to the target KV block according to the hash value; and
    updating a hash slot in the hash bucket whose fingerprint matches the fingerprint determined from the key content of the target KV block.
  8. The method according to claim 6, characterized in that a plurality of KV operations are encapsulated in the KV operation request; and
    the performing a target KV operation on a memory of the computing device according to the KV operation request comprises:
    generating a plurality of transactions according to the KV operation request, wherein each of the plurality of transactions is used to perform one KV operation among the plurality of KV operations; and
    executing the plurality of transactions in parallel through a plurality of cores.
  9. An accelerator, characterized in that the accelerator comprises a processing module and a communication interface, the communication interface is configured to provide network communication for the processing module, and the accelerator is configured to perform the method according to any one of claims 1 to 8.
  10. A computing device comprising an accelerator, characterized in that the computing device comprises the accelerator and a processor, and the accelerator is configured to perform the method according to any one of claims 1 to 8.
PCT/CN2023/101332 2022-08-23 2023-06-20 Data processing method, accelerator, and computing device WO2024041140A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211017087.4 2022-08-23
CN202211017087.4A CN117666921A (en) 2022-08-23 2022-08-23 Data processing method, accelerator and computing device

Publications (1)

Publication Number Publication Date
WO2024041140A1 true WO2024041140A1 (en) 2024-02-29

Family

ID=90012338

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/101332 WO2024041140A1 (en) 2022-08-23 2023-06-20 Data processing method, accelerator, and computing device

Country Status (2)

Country Link
CN (1) CN117666921A (en)
WO (1) WO2024041140A1 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160321294A1 (en) * 2015-04-30 2016-11-03 Vmware, Inc. Distributed, Scalable Key-Value Store
DE102018113885A1 (en) * 2017-06-13 2018-12-13 Western Digital Technologies, Inc. Memory-efficient persistent key-value memory for non-volatile memory
CN113821311A (en) * 2020-06-19 2021-12-21 Huawei Technologies Co., Ltd. Task execution method and storage device
US20220114270A1 (en) * 2020-12-26 2022-04-14 Intel Corporation Hardware offload circuitry
US20220012095A1 (en) * 2021-09-22 2022-01-13 Intel Corporation Metrics and security-based accelerator service rescheduling and auto-scaling using a programmable network device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANG, TENG; WANG, JIANYING; CHENG, XUNTAO; XU, HAO; HUANG, GUI; ZHANG, TIEYING; HE, DENGCHENG; LI, FEIFEI; CAO, WEI: "FPGA-Accelerated Compactions for LSM-based Key-Value Store", 18th USENIX Conference on File and Storage Technologies (FAST '20), 26 February 2020 (2020-02-26), pages 225-237, XP093143594 *

Also Published As

Publication number Publication date
CN117666921A (en) 2024-03-08

Similar Documents

Publication Publication Date Title
CN108028833B (en) NAS data access method, system and related equipment
US10175891B1 (en) Minimizing read latency for solid state drives
CN114201421B (en) Data stream processing method, storage control node and readable storage medium
EP4318251A1 (en) Data access system and method, and device and network card
US11025564B2 (en) RDMA transport with hardware integration and out of order placement
US9092275B2 (en) Store operation with conditional push of a tag value to a queue
US10802753B2 (en) Distributed compute array in a storage system
US20200272579A1 (en) Rdma transport with hardware integration
CN110119304B (en) Interrupt processing method and device and server
US10162775B2 (en) System and method for efficient cross-controller request handling in active/active storage systems
WO2020199760A1 (en) Data storage method, memory and server
EP4357901A1 (en) Data writing method and apparatus, data reading method and apparatus, and device, system and medium
WO2022017475A1 (en) Data access method and related device
CN115129621B (en) Memory management method, device, medium and memory management module
CN115270033A (en) Data access system, method, equipment and network card
CN115934623A (en) Data processing method, device and medium based on remote direct memory access
WO2014154045A1 (en) Method, apparatus and system for implementing multicore operating system
US20240061802A1 (en) Data Transmission Method, Data Processing Method, and Related Product
CN116204487A (en) Remote data access method and device
WO2024041140A1 (en) Data processing method, accelerator, and computing device
Wu et al. RF-RPC: Remote fetching RPC paradigm for RDMA-enabled network
US10289550B1 (en) Method and system for dynamic write-back cache sizing in solid state memory storage
CN116049085A (en) Data processing system and method
Sun et al. A comprehensive study on optimizing systems with data processing units
US8819302B2 (en) System to manage input/output performance and/or deadlock in network attached storage gateway connected to a storage area network environment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23856232

Country of ref document: EP

Kind code of ref document: A1