WO2024060934A1 - Data processing method and device

Data processing method and device

Info

Publication number
WO2024060934A1
Authority
WIPO (PCT)
Prior art keywords
data, processed, data operation, key, request
Application number
PCT/CN2023/115226
Other languages
English (en)
French (fr)
Inventor
黄凯欣
Original Assignee
北京火山引擎科技有限公司
Application filed by 北京火山引擎科技有限公司
Publication of WO2024060934A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 Details of database functions independent of the retrieved data types
    • G06F 16/903 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers

Definitions

  • the present disclosure relates to the field of data processing technology, and in particular, to a data processing method and device.
  • A key-value database (KV-store, KVS for short) is a type of non-relational database (NoSQL database) that mainly stores data in the form of key-value pairs.
  • Key-value databases are more flexible than the relational databases of related technologies; in terms of access interfaces, key-value databases use simple data access interfaces such as write (PUT) and read (GET).
  • key-value databases are widely used in various fields.
  • In a first aspect, the present disclosure provides a data processing method, which includes: receiving, through a network card module in a smart network card, a data operation request sent by a client; calling a request analysis module in the smart network card to parse the data operation request to obtain data to be processed and data operation type information, and inputting the data to be processed and the data operation type information into an execution engine module in the smart network card; calling the execution engine module to perform, based on the data to be processed, the data operation indicated by the data operation type information to obtain a data operation result; and calling the request analysis module to encapsulate the data operation result to obtain a response to the data operation request, and sending the response to the client through the network card module.
  • In a second aspect, the present disclosure provides a data processing device, including: a network card module, used to receive a data operation request sent by a client; a request analysis module, used to parse the received data operation request to obtain data to be processed and data operation type information, and to input the data to be processed and the data operation type information into an execution engine module; and the execution engine module, used to perform, based on the data to be processed, the data operation indicated by the data operation type information to obtain a data operation result. The request analysis module is further used to encapsulate the data operation result into a response to the data operation request, and the network card module is further used to send the response to the client.
  • In a third aspect, the present disclosure provides an electronic device, including a memory and a processor; the memory is configured to store computer program instructions, and the processor is configured to execute the computer program instructions so that the electronic device implements the data processing method as described in the first aspect.
  • In a fourth aspect, the present disclosure provides a readable storage medium, including computer program instructions; when at least one processor of an electronic device executes the computer program instructions, the electronic device implements the data processing method as described in the first aspect.
  • In a fifth aspect, the present disclosure provides a computer program product; when the computer program product is executed by an electronic device, the electronic device implements the data processing method as described in the first aspect.
  • In a sixth aspect, the present disclosure provides a computer program, including instructions that, when executed by a processor, cause the processor to perform the data processing method as described in the first aspect.
  • Figure 1 is a schematic framework diagram of a data processing system provided by an embodiment of the present disclosure
  • Figure 2 is a data structure diagram of a data operation request/response provided by an embodiment of the present disclosure
  • Figure 3 is a schematic flowchart of a data processing method provided by an embodiment of the present disclosure
  • Figure 4 is a schematic framework diagram of a data processing system provided by another embodiment of the present disclosure.
  • Figure 5 is a schematic framework diagram illustrating the hash index mechanism of the present disclosure
  • Figure 6 is a schematic diagram of the data structure of the index slot in the index structure exemplified by the present disclosure
  • Figure 7 is a schematic flowchart of a data processing method provided by another embodiment of the present disclosure.
  • Figure 8 is a schematic diagram of a data storage method provided by an embodiment of the present disclosure.
  • Figure 9 is a schematic diagram of the structure of a data processing system provided by an embodiment of the present disclosure.
  • Figure 10 is a schematic structural diagram of the memory management mechanism used by the memory allocator provided by an embodiment of the present disclosure
  • Figure 11 is a schematic structural diagram of a data processing system provided by another embodiment of the present disclosure.
  • Figure 12 is a schematic structural diagram of a data relay station provided by an embodiment of the present disclosure.
  • Figure 13 is a schematic flowchart of a data processing method provided by another embodiment of the present disclosure.
  • Figure 14 is a schematic flowchart of a data processing method provided by another embodiment of the present disclosure.
  • Figure 15 is a flow chart of a data processing method provided in another embodiment of the present disclosure.
  • KVS has been widely used in many fields due to its unique advantages.
  • A distributed in-memory key-value database transmits data through the network and allows data to be stored on multiple nodes. This not only provides larger storage space for key-value pair data, but also offers more flexible expansion capabilities (i.e., dynamically adding or deleting storage nodes).
  • Remote Direct Memory Access (RDMA) technology fits KVS's goal of pursuing high throughput and low latency.
  • RDMA technology not only supports two-sided message semantics, similar to the socket-based network transmission mechanism of related technologies, but also supports one-sided memory semantics.
  • The main interaction process of data processing using two-sided message semantics is as follows: the client node sends an operation request to the server node; the server node performs the corresponding data operation, such as PUT/GET, locally and returns the operation result to the client node; the operation is complete once the client node receives the result fed back by the server node.
  • One-sided memory semantics refers to an access method in which the client can directly read and write the memory space of the server in a server-bypass manner; its application programming interface provides a more convenient way to access data.
  • The main interaction process of data processing using one-sided memory semantics is as follows: for GET operations, the RDMA READ action is used to complete the reading of key-value data; for PUT operations, a reasonable combination of RDMA ATOMIC, RDMA WRITE, and RDMA READ actions is required to support consistent data writes.
  • Apart from participating in data storage and initial communication establishment, the server node basically does not need to respond on subsequent critical data paths.
  • Two-sided message semantics can support richer and more flexible interface definitions (because the data operation process can be hidden from the user side), while one-sided memory semantics can achieve faster and more efficient data access within a single network round trip (RTT).
  • An in-memory key-value database using an RPC mechanism based on two-sided message semantics needs to send each operation request to the server node.
  • the CPU of the server node is responsible for executing the specific data storage logic and returning the data operation results to the client.
  • This processing mechanism will cause the server node CPU to become a performance bottleneck on the critical path in high-concurrency scenarios (such as hundreds or even thousands of clients accessing simultaneously), which may result in higher tail latency. This is not only due to the multi-core frequency limit of a single server, but also related to the interaction mode between the CPU and the network card.
  • For each client node's data operation request, the CPU of the server node must not only prepare receive work requests (RECV WR) and poll the completion queue (CQ), but also process the operation request, copy data, prepare send work requests (SEND WR), and so on, which generates a great deal of additional memory access overhead.
  • Moreover, because the server node CPU is the core device for performing KVS access, more expensive and more capable CPU components and matching motherboards must be deployed on the server. This runs contrary to the compute-storage separation architecture currently advocated in the field of cloud computing (that is, decoupling computing resources from storage resources so that storage nodes can focus on data storage), and makes it difficult to control the total cost of ownership (TCO) of storage nodes.
  • the in-memory key-value database based on one-sided memory semantics allows the client to perform index queries and locate the memory address of the server node where the corresponding key-value pair data is located, and complete GET/PUT operations through actions such as READ and WRITE.
  • However, this not only requires the client node to cache the index structure of the server node's key-value pair data, but also poses a greater challenge to the consistency of concurrent operations: the centralized consistency guarantee mechanism on the server node in related technologies must be upgraded to guarantee distributed consistency, which is more complex and makes correctness harder to ensure. And even if consistent data writes are supported by combining multiple WRITE, ATOMIC, and READ operations, multiple network round-trip overheads remain. Lacking transaction support, the memory-level abstraction provided by RDMA is not suitable for building an efficient KVS.
  • Hardware bandwidths, such as network card line speed, PCIe (Peripheral Component Interconnect Express) bandwidth, and memory bandwidth, determine the upper limit of in-memory key-value database access performance.
  • However, related technologies have performance bottlenecks, for example CPU processing bottlenecks and multiple network round-trips, which make them unable to efficiently utilize the hardware's bandwidth resources. For example, when the CPU becomes the processing bottleneck, memory bandwidth and network bandwidth are wasted; when network round-trips become the bottleneck, PCIe bandwidth and network bandwidth are wasted. This waste of bandwidth naturally caps the throughput rate, and thus cannot support building an efficient in-memory key-value database under ideal conditions.
  • NIC: Network Interface Controller. RoCE: RDMA over Converged Ethernet. InfiniBand HCA: Host Channel Adapter.
  • The core component of a smart network card (SmartNIC, also known as a programmable network card) is a field-programmable gate array (FPGA), with an embedded NIC chip to connect to the network and a PCIe connector to connect to the server (host).
  • The inventor of the present disclosure has found that, with the rapid development of the Internet and the explosive growth of network usage, the access pressure on databases continues to increase; high-concurrency access scenarios, such as hundreds or thousands of users accessing the database at the same time, lead to database performance degradation.
  • embodiments of the present disclosure provide a data processing method to improve the performance of the key-value database.
  • In the data processing method provided by the present disclosure, some of the workload of the server's CPU is migrated to the smart network card: the FPGA in the smart network card serves as the processor chip that handles the data operation requests sent by the client, and the smart network card presents the KVS access abstraction to the client node, without requiring the server's CPU to participate in the processing of data operation requests, effectively reducing the processing pressure on the server's CPU. Moreover, both PUT and GET operations complete within a single network round trip of operation latency, maximizing the use of network bandwidth and improving access throughput. It should be noted that the data processing method provided by this disclosure is also applicable to other databases with similar problems.
  • the data processing method can be executed by the data processing device provided by the present disclosure, and the device can be implemented by any software and/or hardware.
  • For example, the data processing device can be a software system; in the following, the data processing device is referred to as a data processing system, and the database deployed in the server is an in-memory key-value database.
  • Figure 1 is a schematic architectural diagram of a data processing system provided by an embodiment of the present disclosure.
  • the data processing system 100 is deployed in a smart network card, and the data processing system 100 can access and operate the database in the memory of the server.
  • the network card module in the smart network card is mainly used to receive the data operation request sent by the client and send it to the corresponding module of the data processing system 100 and to pass the response generated by the data processing system 100 to the client.
  • the data processing system 100 includes: a request analysis module 101 and an execution engine module 102.
  • the request analysis module 101 is mainly used to parse, identify, encapsulate and deliver data operation requests transmitted by the network card module in the smart network card.
  • the request analysis module 101 includes: a request decoder 101a and a request encoder 101b.
  • the request decoder 101a is mainly used to obtain the data operation request sent by the client from the network card module for analysis and identification;
  • The request encoder 101b is mainly used to encapsulate the data operation result passed by the execution engine module 102 into a response to the data operation request, and to pass the response to the network card module, through which it is fed back to the client.
  • the execution engine module 102 is mainly used to perform corresponding actions (such as write/read/delete, etc. data operations) according to the data operation type indicated by the data operation request sent by the client, and transfer the data operation results to the request analysis module 101.
  • the execution engine module 102 can interact with the memory of the server to implement operations on key-value pair data.
  • the request decoder 101a and the request encoder 101b in the request analysis module 101 can share the same data structure, that is, the data operation request and the data operation result can share the same data structure.
  • Figure 2 exemplarily shows a schematic diagram of a data structure shared by data operation requests/responses.
  • A data operation request/response includes: transaction identification information; the total number of requests/responses that make up the transaction; the sequence number of this request among all requests/responses; the data operation type; the total key length of the transaction; the total value length of the transaction; the length of the key contained in this request/response; the length of the value contained in this request/response; a field used to hold all or part of the transaction's key data; one or more fields used to hold all or part of the transaction's value data; verification information; and so on.
  • The present disclosure does not limit the data size of the data operation request/response; for example, it can be 32 bytes, 64 bytes, 128 bytes, etc. In one embodiment, the data operation request/response can be 64 bytes in size.
  • For example, a 64-byte request/response can contain: a 4-byte TxID field (a globally unique transaction ID, which can be specified by the user); a 2-byte Num field (the total number of requests/responses that make up this transaction, which allows handling variable-length keys and values); a 2-byte Seq field (the order of this request/response among all requests/responses of the transaction, used for data splicing after all requests or responses are received); a 2-byte Opcode field (the type of data operation, such as read/write/delete); a 2-byte Tkey_len field (the total key length of the transaction, which can span multiple requests); a 2-byte Tvalue_len field (the total value length of the transaction, which can span multiple requests/responses); a 1-byte Key_len field (the key length included in this request); a 1-byte Value_len field (the value length included in this request/response); a 16-byte Key field (accommodating all or part of the transaction's key data); a 24-byte Value field (accommodating all or part of the transaction's value data); and a Checksum field (verification information).
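  • As an illustration, the 64-byte layout above could be declared as a packed struct. The following is a minimal sketch: the field order and the exact Value/Checksum sizes (24 and 8 bytes, chosen so that the fields sum to 64 bytes, with the 24-byte Value matching the example given later) are assumptions for illustration, not a normative wire format.

```c
#include <stdint.h>

/* Sketch of the 64-byte request/response layout described above. */
#pragma pack(push, 1)
typedef struct {
    uint32_t tx_id;      /* TxID: globally unique transaction ID */
    uint16_t num;        /* Num: total requests/responses in this transaction */
    uint16_t seq;        /* Seq: order of this request within the transaction */
    uint16_t opcode;     /* Opcode: read / write / delete, etc. */
    uint16_t tkey_len;   /* Tkey_len: total key length of the transaction */
    uint16_t tvalue_len; /* Tvalue_len: total value length of the transaction */
    uint8_t  key_len;    /* Key_len: key bytes carried in this request */
    uint8_t  value_len;  /* Value_len: value bytes carried in this request */
    uint8_t  key[16];    /* Key: all or part of the transaction's key */
    uint8_t  value[24];  /* Value: all or part of the value (size assumed) */
    uint64_t checksum;   /* Checksum: verification information (size assumed) */
} kv_request_t;
#pragma pack(pop)

_Static_assert(sizeof(kv_request_t) == 64, "request/response must be 64 bytes");
```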
  • Figure 3 is a schematic flowchart of a data processing method provided by an embodiment of the present disclosure. Referring to Figure 3, the method provided by this embodiment includes:
  • S301. Receive, through the network card module, multiple data operation requests with the same transaction identifier sent by the client.
  • S302. Call the request analysis module in the smart network card to analyze the data operation request to obtain the data to be processed and the data operation type information, and input the data to be processed and the data operation type information into the execution engine module in the smart network card.
  • Specifically, the network card module of the smart network card can receive the data operation request sent by the client and transmit it to the request decoder. The request decoder parses the data operation request to obtain the data operation type, the key-value pair data, and other information in the corresponding data structure.
  • the client can send a separate data operation request to the data processing system.
  • the request decoder of the data processing system can obtain the information in the separate data operation request by parsing it, and perform the corresponding action.
  • the client can send multiple data operation requests to the data processing system, so that the data processing system aggregates the multiple data operation requests to perform corresponding actions on the super-long key-value pair data.
  • In this case, the request decoder of the data processing system can parse and identify the multiple data operation requests, obtaining the TxID field, Seq field, and Checksum field of each; it then determines whether they belong to the same transaction based on their TxIDs, restores the order of the data operation requests through the Seq fields, and uses the Checksum fields to check data consistency, thereby realizing data operations on super-long key-value pair data.
  • For example, the request decoder can parse multiple data operation requests with the same transaction identifier separately to obtain multiple Key fields, and then splice the Key fields in the order indicated by the Seq fields to obtain the data to be processed for the transaction; the data operation type indicated by data operation requests with the same transaction identifier is consistent. It should be noted that if the Value field is required to execute the data operation request, the Value fields obtained by parsing can be spliced in the same manner.
  • That is, calling the request analysis module in the smart network card to parse the data operation requests to obtain the data to be processed and the data operation type information includes: calling the request analysis module to parse multiple data operation requests with the same transaction identifier separately to obtain multiple to-be-processed data fields and multiple identical pieces of data operation type information; and calling the request analysis module to splice the multiple to-be-processed data fields, according to the sequence indication information contained in each of the multiple data operation requests, to obtain the data to be processed.
  • the data structures of the multiple data operation requests with the same transaction identifier are consistent.
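  • A minimal sketch of this reassembly step is shown below, assuming the kv_request_t sketch given earlier; checksum verification and Value-field splicing are omitted for brevity.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Sort requests that share one TxID by their Seq field. */
static int compare_seq(const void *a, const void *b) {
    const kv_request_t *ra = a, *rb = b;
    return (int)ra->seq - (int)rb->seq;
}

/* Splice the Key fragments of `n` same-transaction requests, in Seq order,
 * into `out`; the result length should equal Tkey_len once all `num`
 * requests of the transaction have arrived. */
size_t splice_key(kv_request_t *reqs, size_t n, uint8_t *out) {
    size_t off = 0;
    qsort(reqs, n, sizeof(kv_request_t), compare_seq);
    for (size_t i = 0; i < n; i++) {
        memcpy(out + off, reqs[i].key, reqs[i].key_len);
        off += reqs[i].key_len;
    }
    return off;
}
```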
  • S303. Call the execution engine module in the smart network card to perform the data operation indicated by the data operation type information based on the data to be processed to obtain the data operation result.
  • the data operation request can be a data read request, a data write request, or a data deletion request.
  • the execution engine module of the data processing system can perform a read operation, a write operation, or a delete operation based on the data operation type indicated by the data operation request, and obtain the corresponding data operation. result.
  • If the data operation request is a data read request, the data operation result obtained by the execution engine module can be the target data (such as value data) indicated by the data to be processed; if the data operation request is a data write request, the data operation result can be write success/failure information; and if the data operation request is a data deletion request, the data operation result can be deletion success/failure information.
  • In a possible implementation, calling the execution engine module to perform the data operation indicated by the data operation type information based on the data to be processed to obtain the data operation result includes: calling the execution engine module to determine, based on the data to be processed, the target index slot corresponding to the data to be processed in the index structure stored in the memory of the smart network card; and calling the execution engine module to perform the data operation indicated by the data operation type information on the target index slot to obtain the data operation result.
  • S304 includes: calling the request analysis module to update the target field in the data structure corresponding to the data operation request based on the data operation result to obtain a response to the data operation request.
  • the request encoder may fill the data operation result into the specified field of the data structure corresponding to the data operation request to obtain a response.
  • the request encoder can fill the read value data (ie, the data operation result) into the Value field in the data structure corresponding to the data operation request to obtain a response to the data operation request.
  • When the data operation request is a data read request, the amount of value data read may be so large that the corresponding field of a single data operation request cannot hold it; in that case, multiple response structures can be created according to the amount of value data read. For example, suppose the Value field is 24 bytes: if the value data read is no larger than 24 bytes, it can be written into the Value field of one response; if it is larger than 24 bytes, the required number of response structures can be created and the value data written into their Value fields.
  • If the data operation request is a data write request or a data deletion request, the request encoder can set the field used to indicate the data operation type in the data structure shown in Figure 2 (the Opcode field) to ack/null, representing operation success/failure: ack can indicate that the write/delete operation succeeded, and null can indicate that it failed.
  • the request encoder passes the encapsulated response to the network card module of the smart network card, and the response is transmitted to the client through the network card module.
  • When the client parses the response, it can first match the TxID field, then check the Opcode field to confirm the data operation type, and then read the Value field if necessary. For example, when the data operation request is a data read request, the client can obtain the data in the Value field; if the data operation request is a data write request or a data deletion request, the client can check the Opcode field to determine whether the write/deletion succeeded.
  • In summary, the data processing method includes: receiving, through the network card module in the smart network card, the data operation request sent by the client; calling the request analysis module in the smart network card to parse the data operation request to obtain the data to be processed and the data operation type information, and inputting them into the execution engine module in the smart network card; calling the execution engine module to perform, based on the data to be processed, the data operation indicated by the data operation type information to obtain the data operation result; and calling the request analysis module to encapsulate the data operation result into a response to the data operation request, which is sent to the client through the network card module.
  • The method provided by this embodiment migrates some of the server CPU's workload to the smart network card: the smart network card processes the data operation requests sent by the client and presents the database access abstraction to the client node, without requiring the server CPU to participate in the processing of data operation requests, which reduces the processing pressure on the server's CPU.
  • In addition, since the request and the response to a data operation request share the same data structure, this data structure reuse mechanism can reduce the server's space allocation and data copy overhead.
  • Moreover, the data structure that supports aggregating multiple data operation requests is cache-aligned, so the network transmission overhead caused by boundary misalignment can be reduced during data transmission, and the read/write amplification problem is avoided as much as possible.
  • In practice, a key-value database can include two parts, the index structure and the key-value pair data, where the key-value pair data is the target object on which users perform data operations, and the index structure is the retrieval data structure used to find the storage location of the requested key-value pair data.
  • Figure 4 is a schematic architectural diagram of a data processing system provided by an embodiment of the present disclosure.
  • the data processing system provided in this embodiment based on the embodiment shown in FIG. 1 , further includes: an index module 103 .
  • the index module 103 can be set in the memory of the smart network card.
  • The index module 103 contains the index information corresponding to the key-value pair data, and the key-value pair data pointed to by the index information can be stored in the memory of the server (i.e., host memory, hereinafter referred to as the server's memory).
  • When the execution engine module 102 is called to process the data operation request sent by the client, it may need to interact with the index module 103.
  • the index structure can use a hash index structure, such as: chain hash index structure, cuckoo hash index structure, hopscotch hash index structure, etc.
  • the index structure can also use binary trees, radix tree, B-tree, B+ tree, red-black tree and other structures.
  • In some embodiments, to improve the memory utilization of the smart network card, the index structure can be implemented using sub-index structure organization and a multi-way index mechanism.
  • the sub-index structure organization means that the index structure is composed of multiple sub-index structures. Each sub-index structure can include multiple index slots, and each index slot can be used to store information related to key-value pair data.
  • the multi-way index mechanism means that data operation requests are mapped to multiple sub-index structures using preset multiple mapping methods.
  • the index structure can include multiple hash buckets, and multi-way hashing is used for mapping when processing data operation requests.
  • FIG. 5 illustrates a schematic diagram of the hash index mechanism using a hash bucket and a two-way hash method as an example.
  • Each hash bucket may include: multiple index slots, a field for indicating whether each index slot in the bucket is a free slot, and a field for indicating whether the index slot in the bucket is occupied by a thread.
  • For example, each hash bucket can contain 4 bytes of metadata: a 1-byte Bitmap field (each bit indicates whether the corresponding index slot in the bucket is a free slot: 0 means free, 1 means occupied), a 1-byte Lockmap field (each bit indicates whether the corresponding index slot in the bucket is occupied by a thread: 0 means idle, 1 means occupied), and a 2-byte Padding field (meaningless bits used only to align to 4 bytes).
  • Each hash bucket can contain four 15-byte index slots, and each index slot can be used to store index information of a key-value pair.
  • this disclosure does not limit the number of index slots included in each hash bucket.
  • the number of index slots included in each hash bucket may be the same or different.
  • the byte size of the metadata can be adjusted to ensure that the metadata can completely represent the status of all index slots.
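  • Under the example sizes above (4 bytes of metadata plus four 15-byte slots), one bucket occupies exactly 64 bytes. A minimal sketch of such a bucket follows, with the slot count and sizes taken from this example only:

```c
#include <stdint.h>

/* Sketch of one hash bucket: 4 bytes of metadata plus four 15-byte index
 * slots, which together occupy 64 bytes (one common cache-line size). */
#pragma pack(push, 1)
typedef struct {
    uint8_t  bitmap;       /* bit i = 1: slot i is occupied */
    uint8_t  lockmap;      /* bit i = 1: slot i is held by a thread */
    uint16_t padding;      /* alignment only */
    uint8_t  slots[4][15]; /* four 15-byte index slots */
} hash_bucket_t;
#pragma pack(pop)

_Static_assert(sizeof(hash_bucket_t) == 64, "bucket should be 64 bytes");

/* Find the first free slot in a bucket, or -1 if the bucket is full. */
int find_free_slot(const hash_bucket_t *b) {
    for (int i = 0; i < 4; i++)
        if (!(b->bitmap & (1u << i)))
            return i;
    return -1;
}
```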
  • this disclosure does not limit the number and implementation of mapping methods.
  • In some embodiments, the index structure can use an inline storage mechanism, that is, key-value pair data that meets preset conditions is stored inline in the index slot; a client that needs to access such inline-stored key-value pair data can do so simply by accessing the index structure, eliminating accesses to the server's memory and to the PCIe data channel between the smart network card and the server, and thus reducing the access and storage pressure on the server.
  • Whether key-value pair data meets the requirements of the inline storage mechanism can be determined based on the attribute information of the key-value pair data.
  • the attribute information of the key-value pair data mentioned here may include but is not limited to: the data type of a specific field (such as int8, int16, int32, int64, float32, float64, string, etc.), the data size, etc.
  • FIG6 is a schematic diagram of the data structure of an index slot in an index structure exemplarily shown in the present disclosure.
  • an index slot may include a field for indicating the data type of key-value pair data, a field for indicating the storage type of key and value, a field for storing key-related information, and a field for storing value-related information.
  • the present disclosure does not limit the byte size of each field.
  • the embodiment shown in FIG. 6 takes an index slot of 15 bytes as an example for illustration.
  • The index slot includes four fields: a 6-bit field used to indicate the data type (also called the type field), a 2-bit field used to indicate how the key- and value-related information is stored (also called the Flag field), an 8-byte field used to store key-related information (also called the key-info field), and a 6-byte field used to store value-related information (also called the value-info field).
  • the Flag field can have three values: 01, 10, and 11. Among them, when the value of the Flag field is 01, it means that the keys and values in the key-value pair data can be stored inline in the index slot. When the value of the Flag field is 10, it means that the keys in the key-value pair data can be stored inline in the index slot, but the values in the key-value pair data cannot be stored inline in the index slot. When the value of the Flag field is 11, it means that neither the key nor the value in the key-value pair data can be stored inline in the index slot.
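  • A sketch of this 15-byte slot layout follows, assuming C bit-fields for the 6-bit type field and the 2-bit Flag field (actual bit ordering is implementation-defined in C and would be fixed in hardware):

```c
#include <stdint.h>

/* Sketch of the 15-byte index slot described above. */
#pragma pack(push, 1)
typedef struct {
    uint8_t type : 6;      /* data type of the key-value pair data */
    uint8_t flag : 2;      /* 01: key and value inline; 10: key inline only;
                              11: neither key nor value inline */
    uint8_t key_info[8];   /* inline key, or a fingerprint digest of the key */
    uint8_t value_info[6]; /* inline value, or a pointer into server memory */
} index_slot_t;
#pragma pack(pop)

_Static_assert(sizeof(index_slot_t) == 15, "index slot must be 15 bytes");
```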
  • For example, suppose index slot 1 stores data of type Int32 and index slot 2 stores data of type string. In index slot 1, the byte sizes of the key and the value both meet the byte-size limits of the key-info field and the value-info field, so the key and the value are stored inline in the index slot. In index slot 2, the key meets the byte-size limit of the key-info field, but the value is larger than 6 bytes and exceeds the byte-size limit of the value-info field; therefore the key can be stored inline, and the value-info field can be filled with the pointer information corresponding to the key-value pair data, that is, the key-value pair data is stored in the server's memory at the location the pointer information indicates.
  • When the key cannot be stored inline, the key-info field can be used to store fingerprint summary information of the key in the key-value pair data; the fingerprint summary information can be obtained by mapping the key to data that meets the byte-size limit of the key-info field, for example by hashing the key, although other mapping methods can of course also be used.
  • the value-info field can then be filled with the pointer information corresponding to the key-value pair data, that is, the key-value pair data is stored in the memory of the server pointed to by the pointer information.
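  • As a toy illustration of one such mapping (the disclosure does not prescribe a specific function), an FNV-1a-style hash can reduce an arbitrary-length key to the 8 bytes that fit the key-info field:

```c
#include <stddef.h>
#include <stdint.h>

/* Toy key fingerprint: FNV-1a reduced to 8 bytes. The choice of FNV-1a
 * here is an illustrative assumption, not taken from the disclosure. */
uint64_t key_fingerprint(const uint8_t *key, size_t len) {
    uint64_t h = 1469598103934665603ull;  /* FNV-1a offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= key[i];
        h *= 1099511628211ull;            /* FNV-1a prime */
    }
    return h;
}
```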
  • The following describes how the data processing system handles the data read requests, data write requests, and data deletion requests sent by the client, assuming that the index module is set up in the memory of the smart network card and that the index structure adopts the previously mentioned hash buckets, multi-way hashing, and inline storage mechanism.
  • Figure 7 is a flowchart of a data processing method provided by an embodiment of the present disclosure. Referring to Figure 7, the method provided by this embodiment includes:
  • S701. Receive, through the network card module in the smart network card, the data operation request sent by the client.
  • S702. Call the request analysis module in the smart network card to parse the data operation request to obtain the data to be processed and the data operation type information, and input the data to be processed and the data operation type information into the execution engine module in the smart network card.
  • Steps S701 and S702 are respectively similar to steps S301 and S302 in the embodiment shown in FIG. 3. Reference may be made to the detailed description of the embodiment shown in FIG. 3. For the sake of simplicity, they will not be described again here.
  • In this embodiment, the index structure is implemented in the form of hash buckets, and each hash bucket includes multiple index slots; determining the target index slot can be achieved by, but is not limited to, the following steps:
  • Step a: call the execution engine module to perform hash calculation on the data to be processed to obtain a hash value, and perform matching in the index structure based on the hash value to obtain the successfully matched hash buckets.
  • the data processing system can use one or more hash algorithms to match hash buckets.
  • When the data to be processed is key-value pair data, multiple hash algorithms can be used to calculate the key in the key-value pair data to be processed to obtain multiple hash values, and multiple hash buckets are matched based on the multiple hash values.
  • That is, calling the execution engine module to perform hash calculation on the data to be processed to obtain hash values and to match in the index structure based on the hash values includes: calling the execution engine module to perform hash calculations on the data to be processed with multiple preset hash algorithms to obtain multiple hash values; and calling the execution engine module to match the multiple hash values against the identifiers of the hash buckets included in the index structure to obtain multiple successfully matched hash buckets.
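  • A minimal sketch of the two-way variant follows; the seeded hash functions and the power-of-two bucket count are illustrative assumptions, not values from the disclosure.

```c
#include <stddef.h>
#include <stdint.h>

/* Two seeded variants of one simple hash stand in for the two preset
 * hash algorithms. */
static uint64_t seeded_hash(const uint8_t *key, size_t len, uint64_t seed) {
    uint64_t h = seed;
    for (size_t i = 0; i < len; i++) {
        h ^= key[i];
        h *= 1099511628211ull;
    }
    return h;
}

/* Map one key to its two candidate buckets (num_buckets assumed to be a
 * power of two). A lookup probes the index slots of both buckets; an
 * insert may pick a free slot in either, improving occupancy. */
void candidate_buckets(const uint8_t *key, size_t len, size_t num_buckets,
                       size_t *b1, size_t *b2) {
    *b1 = seeded_hash(key, len, 1469598103934665603ull) & (num_buckets - 1);
    *b2 = seeded_hash(key, len, 0x9e3779b97f4a7c15ull) & (num_buckets - 1);
}
```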
  • Step b: call the execution engine module to match the data to be processed against the index slots included in the successfully matched hash buckets to obtain a matching result, and determine the target index slot based on the matching result.
  • When the data operation request is a data read request or a data deletion request, matching can be performed in the multiple successfully matched hash buckets based on the key, or the fingerprint summary information of the key, in the key-value pair data to be processed; the index slot that matches successfully is the target index slot.
  • When the data operation request is a data write request, an idle index slot can be allocated as the target index slot for the data to be processed according to the occupancy of the index slots in the multiple successfully matched hash buckets or other factors; the occupancy of the index slots can be determined from whether each bit in the Bitmap field of the hash bucket is 0 or 1. If the write modifies existing data, the index slot corresponding to the data to be modified is the target index slot.
  • That is, calling the execution engine module to perform matching in the index slots included in the successfully matched hash buckets based on the data to be processed to obtain the matching result includes: calling the execution engine module to perform matching in the successfully matched hash buckets based on the data to be processed; or calling the execution engine module to calculate the fingerprint summary information corresponding to the data to be processed and perform matching in the successfully matched hash buckets based on the fingerprint summary information.
  • In the former case, the execution engine module is called to match the key in the key-value pair data to be processed against each index slot included in the hash bucket, and the index slot whose stored key-related information matches the key of the key-value pair to be processed is determined to be the target index slot. For example, with the data structure of the embodiment shown in Figure 6, the index slot whose key-info field holds a key consistent with the key in the key-value pair data to be processed is the target index slot.
  • In the latter case, the execution engine module is called to match the fingerprint summary information of the key in the key-value pair data to be processed against the index slots included in the hash bucket, and the index slot whose stored key-related information matches that fingerprint summary information is determined to be the target index slot. For example, with the data structure of the embodiment shown in Figure 6, the index slot whose key-info field holds fingerprint summary information consistent with that of the key in the key-value pair data to be processed is the target index slot.
  • In a possible implementation, when the data operation request is a data read request, step S704 includes: when it is determined that both the data to be processed and the target data corresponding to the data to be processed are stored inline, reading the target data indicated by the data to be processed from the target index slot; and when it is determined that the data to be processed is stored inline but the target data corresponding to it is stored non-inline, or that the data to be processed is stored non-inline, obtaining the pointer information from the target index slot and reading the target data corresponding to the data to be processed from the server memory indicated by the pointer information.
  • In a possible implementation, when the data operation request is a data deletion request, step S704 includes: when it is determined that both the data to be processed and the target data indicated by the data to be processed are stored inline, releasing the target index slot; and when it is determined that the data to be processed is stored inline but the target data indicated by it is stored non-inline, or that the data to be processed is stored non-inline, obtaining the pointer information from the target index slot, deleting the data in the server memory indicated by the pointer information, and releasing the target index slot.
  • Deleting the data in the server memory indicated by the pointer information includes: controlling, through the execution engine module, the memory management module of the smart network card to release the server memory indicated by the pointer information, where the memory management module is used to manage the server's memory.
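  • A minimal sketch of the read-path dispatch on the Flag field follows, assuming the index_slot_t sketch above; read_host_memory() is a hypothetical stand-in for a DMA read of server (host) memory over PCIe, not an API from the disclosure.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { FLAG_KV_INLINE = 1, FLAG_KEY_INLINE = 2, FLAG_NONE_INLINE = 3 };

/* Hypothetical stand-in for a PCIe/DMA read from server memory. */
static bool read_host_memory(uint64_t ptr, uint8_t *out, size_t cap) {
    (void)ptr; (void)out; (void)cap;
    return false; /* a real implementation would issue the DMA here */
}

/* GET: the value is either inline in the slot or behind a pointer. */
bool kv_get(const index_slot_t *slot, uint8_t *out, size_t cap) {
    if (slot->flag == FLAG_KV_INLINE) {
        /* Value lives in the slot itself: no host-memory access needed.
         * The actual value length would be derived from the type field. */
        size_t n = cap < sizeof(slot->value_info) ? cap : sizeof(slot->value_info);
        for (size_t i = 0; i < n; i++) out[i] = slot->value_info[i];
        return true;
    }
    /* Flag 10 or 11: value_info holds a 6-byte pointer into server memory. */
    uint64_t ptr = 0;
    for (int i = 0; i < 6; i++) ptr |= (uint64_t)slot->value_info[i] << (8 * i);
    return read_host_memory(ptr, out, cap);
}
```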
  • Scenario 1: the data operation request is a data read request, and the Flag field in the target index slot is 01.
  • the Flag field in the target index slot is 01, which indicates that the key-value pair data to be read is stored inline in the target index slot.
  • In this case, the execution engine module can be called to read the value data from the value-info field in the target index slot, and the execution engine module can determine the data type of the value data based on the type field in the target index slot.
  • Scenario 2: the data operation request is a data read request, and the Flag field in the target index slot is 10 or 11.
  • Since the Flag field in the target index slot is 10 or 11, the key-value pair data to be read is stored in the server's memory.
  • In this case, the execution engine module can be called to read the pointer information from the value-info field in the target index slot and to read the value data from the server's memory based on the pointer information; the execution engine module can determine the data type of the value data based on the type field in the target index slot.
  • Scenario 3: the data operation request is a data deletion request, and the Flag field in the target index slot is 01.
  • Since the Flag field in the target index slot is 01, the key-value pair data to be deleted is stored inline in the target index slot; calling the execution engine module to release the target index slot completes the data deletion.
  • Scenario 4: the data operation request is a data deletion request, and the Flag field in the target index slot is 10 or 11.
  • Since the Flag field in the target index slot is 10 or 11, the key-value pair data to be deleted is stored in the server's memory.
  • In this case, the execution engine module can be called to read the pointer information from the value-info field in the target index slot, delete the key-value pair data at the location the pointer information indicates in the server's memory, and release the server memory occupied by the key-value pair data; the execution engine module is then called to release the target index slot, thereby completing the data deletion.
  • Scenario 5: the data operation request is a data write request, and the key and value in the key-value pair data to be processed can both be stored inline.
  • Scenario 6: the data operation request is a data write request; the key in the key-value pair data to be processed can be stored inline, but the value must be stored non-inline.
  • Scenario 7: the data operation request is a data write request, and the key of the key-value pair to be processed is stored non-inline.
  • In Scenario 7, the value in the key-value pair data to be processed is also stored non-inline, and what needs to be stored in the target index slot is the fingerprint summary information of the key and the pointer information. Therefore, the execution engine module can be called to: fill the Flag field in the target index slot with 11; fill the type field in the target index slot with the data type of the key-value pair data to be processed; compute the fingerprint summary information of the key in the key-value pair data to be processed and fill it into the key-info field; allocate server memory for the key-value pair data to be processed, generate pointer information from the address of the allocated server memory, and fill the pointer information into the value-info field; and pass the key-value pair data to be processed to the server to be stored in the allocated server memory.
  • the server's memory can support the following two methods of data storage:
  • Method 1: if the key in the key-value pair data is stored inline and the data length can be determined from the data type of the key-value pair data, then only the value in the key-value pair data to be processed needs to be stored in the server's memory. For example, when key-value pair data in the server's memory is stored with method 1, the data structure can be as shown as method 1 in Figure 8.
  • Method 2: if the key in the key-value pair data is stored non-inline, or the key is stored inline but the data length cannot be determined from the data type of the key-value pair data, then the server's memory needs to store not only the key and value of the key-value pair data but also the data length of the key and the data length of the value.
  • When key-value pair data in the server's memory is stored with method 2, the data structure can be as shown as method 2 in Figure 8.
  • The data structure can also take other forms; for example, the key-value pair data can be stored first, followed by the data length information of the key and the data length information of the value, or the data can be stored in the order of the key, the key's length information, the value, and the value's length information; the present disclosure does not limit this.
  • Using method 1 to store key-value pair data in the server's memory eliminates unnecessary memory space occupied by the keys and related data of the key-value pairs, thereby improving the server's memory utilization.
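  • A minimal sketch of the two layouts follows; the exact ordering of fields within method 2 is only one of the orderings the disclosure permits.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Method 1: the key is inline in the index slot and its length is implied
 * by the data type, so host memory holds only the value bytes. */
size_t store_method1(uint8_t *mem, const uint8_t *value, size_t value_len) {
    memcpy(mem, value, value_len);
    return value_len;
}

/* Method 2: host memory holds key length, value length, key, and value
 * (one permitted ordering). */
size_t store_method2(uint8_t *mem, const uint8_t *key, uint16_t key_len,
                     const uint8_t *value, uint16_t value_len) {
    size_t off = 0;
    memcpy(mem + off, &key_len, sizeof key_len);     off += sizeof key_len;
    memcpy(mem + off, &value_len, sizeof value_len); off += sizeof value_len;
    memcpy(mem + off, key, key_len);                 off += key_len;
    memcpy(mem + off, value, value_len);             off += value_len;
    return off;
}
```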
  • In either storage method, the execution engine of the data processing system can correctly handle the access boundaries of the data, so that the data operation requests sent by the client are guaranteed to be processed correctly without errors.
  • For example, when deleting the value in key-value pair data, if the data to be deleted is stored with method 2, both the key-value pair data and its data length information are deleted from the server's memory. In the scenarios shown as Scenarios 6 and 7, if the requirements of method 1 are met, only the value in the key-value pair data needs to be stored in the server's memory; if the requirements of method 2 are met, both the key-value pair data and its data length information are stored in the server's memory.
  • S705. Call the request analysis module to encapsulate the data operation result to obtain a response to the data operation request.
  • S705 and S706 in this embodiment are respectively similar to S304 and S305 in the embodiment shown in FIG. 3. Reference may be made to the detailed description in the embodiment shown in FIG. 3. For the sake of simplicity, they will not be described again here.
  • In the method provided by this embodiment, the data processing system uses sub-index structure organization and multi-way indexing; the spatial locality of the sub-index structure cache greatly reduces the access overhead caused by accessing the same sub-index structure, and the memory utilization of the smart network card can be improved.
  • the index structure is implemented using an inline storage mechanism, which can optimize data access and storage overhead for small key values, thereby reducing the access pressure on the server's memory and improving data processing efficiency.
  • In some scenarios, the data to be processed needs to be written into the memory of the server, and the smart network card needs to interact with the server to request that the server allocate memory space for the data write. If each data write operation required a separate request to the server, the smart network card would need to interact frequently with the server's CPU, which would become a performance bottleneck for key-value data operations.
  • For this reason, the present disclosure sets up a memory allocator in the data processing system.
  • the memory allocator can use a combination of pre-application and slab management to achieve host memory management near the network card.
  • Figure 9 is a schematic structural diagram of a data processing system provided by an embodiment of the present disclosure.
  • the data processing system provided in this embodiment based on the embodiment shown in FIG. 4 , further includes: a memory allocator 104 .
  • the memory allocator 104 is set in the memory of the smart network card and is mainly responsible for applying for free storage space from the host memory of the server and managing it.
  • the execution engine module 102 may also need to interact with the memory allocator 104 when processing the data operation request sent by the client.
  • When storage resources are insufficient, the memory allocator 104 interacts with the server CPU to apply for a memory space of a preset size. The present disclosure does not limit the preset size; for example, a large memory segment of hundreds of megabytes can be applied for at a time.
  • The memory allocator 104 can use a slab management mechanism to manage the server's memory locally. Specifically, the memory allocator 104 can use different orders to manage free memory of different sizes, where each order represents a linked list of memory blocks of a fixed size.
  • For example, the memory allocator 104 can include 11 orders, the 0th through the 10th, which manage linked lists of memory blocks of 8B, 16B, 32B, 64B, 128B, 256B, 512B, 1KB, 2KB, 4KB, and 8KB respectively. In this case, allocations of up to 8KB can be supported (excluding some metadata, up to 4KB of key-value pair data storage can be supported).
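  • A minimal sketch of choosing the order for an allocation request under this 11-order scheme:

```c
#include <stddef.h>

/* Pick the smallest slab order whose block size fits `size`: order 0
 * holds 8-byte blocks and each successive order doubles the block size,
 * up to 8KB at order 10. Returns -1 if the request exceeds 8KB. */
int slab_order(size_t size) {
    size_t block = 8; /* order 0 */
    for (int order = 0; order <= 10; order++) {
        if (size <= block)
            return order;
        block <<= 1; /* next order doubles the block size */
    }
    return -1;
}
```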
  • Based on the memory allocator's local management of the server's memory, when the data processing system writes or deletes key-value data it can complete the allocation and release of the server's memory resources by operating the memory allocator, without having to interact with the server CPU, significantly reducing the overall performance overhead of the data processing system, lowering operation latency, and improving throughput.
  • the execution engine module 102 is a core functional module connected to other component modules in the data processing device 100 .
  • In addition, a data relay station can be set up in the smart network card to reduce the repeated accesses to the server's memory caused by operations on hot key-value pair data, thereby optimizing the overall performance of the data processing system.
  • Figure 11 is a schematic structural diagram of a data processing system provided by another embodiment of the present disclosure. Referring to Figure 11, the data processing device 100 further includes: a data relay station 105 set in the memory of the smart network card.
  • the data relay station 105 is mainly used to cache key-value pair data that meet preset requirements.
  • When the execution engine module 102 executes a data operation request, it can first access the data relay station for matching.
  • the data relay station can cache data that meets preset requirements (such as key-value pair data with a length of both the key and the value within 8 bytes).
  • the data relay station 105 can be implemented using a structure based on a chained hash index. Using this structure, operations on the same key can be mapped to a fixed linked list for matching each time.
  • Figure 12 exemplarily shows a schematic structural diagram of the data relay station.
  • the data relay station 105 includes multiple linked list structures. Each linked list structure corresponds to an identification (hash ID as shown in Figure 12).
  • When a data operation request is processed, the key in the key-value pair data is processed with a hash algorithm to obtain a hash value; the hash value is then matched against the hash ID corresponding to each linked list, mapping the operation to the corresponding target linked list structure, in which the operation is performed.
  • The execution engine module 102 can first perform a match in the data relay station 105 to determine whether a matching key exists in the data relay station 105; if the match succeeds, the latest version of the value corresponding to the key is returned; if the match fails, null may be returned to indicate that the corresponding key does not exist in the data relay station 105.
  • the execution engine module 102 When the execution engine module 102 processes the data write request and the data deletion request, the execution engine module 102 will atomically add a data item to the hashed target linked list structure.
  • the data item includes: a field used to indicate the data operation type, Fields that hold keys and fields that hold values; in addition, the execution engine module 102 also deletes data items with the same key in the target linked list structure.
  • FIG. 13 is a flowchart of a data processing method provided by an embodiment of the present disclosure.
  • In this embodiment, the data operation request sent by the client is a data read request.
  • When the data processing system receives the data read request sent by the client, the request decoder parses it to obtain the key-value pair data to be processed.
  • The parsed data and the data operation type information are passed to the execution engine module.
  • The execution engine module first needs to determine whether the key in the data read request meets the requirements of inline storage. For example, this can be judged from the size of the key: if it is within 8 bytes, it meets the inline storage requirement; if it is larger than 8 bytes, it does not.
  • If the key meets the requirement, the execution engine module is called to access the data relay station and check whether it holds the latest version of the value for the corresponding key. If the corresponding data item is found, the latest-version value is returned.
  • Otherwise, the execution engine module is called to access the index module and query the index structure, mapping the key to two hash buckets through two hash algorithms and determining the target index slot by matching the key in the two hashed buckets. Next, the length of the data to be read and where to read it from are determined from the Flag field and the type field in the target index slot.
  • If the value of the Flag field is 01, the type information is read from the target index slot to determine the data length of the value, the value is read from the target index slot, and the target index slot is then unlocked; if the value of the Flag field is 10, the pointer information is read from the target index slot, and the key-value pair data is read from the server's memory according to the pointer information.
  • If the key does not meet the inline storage requirement, the fingerprint digest information of the key needs to be calculated and matched in the hash buckets to determine the target index slot. It should be noted that during matching, the fingerprint digest information of the key may match multiple index slots, that is, there may be multiple target index slots.
  • The key-value pair data is then read from the server's memory according to the pointer information in the target index slot. If the key of the key-value pair data read from the server's memory also matches exactly, the value read from the server's memory is returned; if it does not match, the next target index slot is matched and the above process is repeated.
  • FIG. 14 is a flowchart of a data processing method provided by an embodiment of the present disclosure.
  • In this embodiment, the data operation request sent by the client is a data write request.
  • When the data processing system receives the data write request sent by the client, the request decoder parses it to obtain the key-value pair data to be processed and the data operation type information, which are passed to the execution engine module.
  • The execution engine module first needs to determine whether the key in the data write request meets the requirements of inline storage. For example, this can be judged from the size of the key: if it is within 8 bytes, it meets the inline storage requirement; if it is larger than 8 bytes, it does not.
  • If the key meets the inline storage requirement, it must be further determined whether the key-value pair data to be written is small key-value pair data. If it is, it needs to be written to the data relay station; if it is not, the data needs to be written to the target index slot or to the server's memory.
  • If the size of the value meets the preset requirement (for example, 8 bytes), the key is hashed to the target linked list structure in the data relay station and a PUT data item is added at the head of the target linked list structure; the target linked list structure is then traversed to delete all data items containing the key, and the process returns once this is complete. The subsequent steps can be executed asynchronously.
  • The steps to be executed asynchronously shown in FIG. 14 are the same as the steps the execution engine module executes synchronously when the value does not meet the preset requirement: the key is hashed into two hash buckets by two hash algorithms (the embodiment shown in FIG. 14 takes two hash buckets as an example) and matched in the two buckets according to the key. If the match succeeds, the server memory space occupied by the old key-value pair data is released. Then the execution engine module determines whether the value meets the requirements of inline storage.
  • For example, whether the value meets the inline storage requirement can be determined by judging whether the value is smaller than 6 bytes. If it is determined that the value can be stored inline, the target index slot is filled according to the key-value pair data and the process returns; if the value cannot be stored inline, whether memory resources are sufficient must be determined through the memory allocator.
  • If memory resources are sufficient, memory space is allocated directly for the key-value pair data, the key-value pair data is filled into the allocated memory space, and the target index slot is then filled and the process returns;
  • otherwise, the server CPU is requested to pre-allocate a region of memory (the size of the requested memory can be set flexibly), order-based management is established for that region, memory space is then allocated for the key-value pair data to be written, the key-value pair is filled into the allocated memory space, and the index slot is then filled and the process returns.
  • If the key does not meet the inline storage requirement, the execution engine module can process the key with a hash algorithm to hash it into two hash buckets (the embodiment shown in FIG. 14 takes two hash buckets as an example), then calculate the fingerprint digest information of the key and match index slots in the buckets according to it; if a match succeeds, the execution engine module reads the corresponding key-value pair data from the server's memory and checks whether the key is consistent.
  • If it is consistent, the execution engine module releases the memory space occupied by the old key-value pair data and then executes the key-value pair write flow; if no match succeeds in either hash bucket based on the key's fingerprint digest information, a free index slot is allocated and locked for the key-value pair data currently to be written, and the key-value pair write flow is then executed.
  • The key-value pair write flow can refer to the description above, in the asynchronously executed steps, of writing according to whether the size of the value meets the inline storage requirement; for brevity, it is not repeated here.
  • FIG. 15 is a flowchart of a data processing method provided by an embodiment of the present disclosure.
  • In this embodiment, the data operation request sent by the client is a data deletion request.
  • When the data processing system receives the data deletion request sent by the client, the request decoder parses it to obtain the key-value pair data to be processed and the data operation type information, which are passed to the execution engine module.
  • The execution engine module first needs to determine whether the key in the data deletion request meets the requirements of inline storage. For example, this can be judged from the size of the key: if it is within 8 bytes, it meets the inline storage requirement; if it is larger than 8 bytes, it does not.
  • If the key meets the inline storage requirement, it must be further determined whether the key-value pair data to be deleted is small key-value pair data. If it is, the data relay station is searched first; if it is not, the search is performed in the server's host memory.
  • If the size of the value meets the preset requirement (for example, 8 bytes), the key is hashed to the target linked list structure in the data relay station, a DEL data item is added at the head of the target linked list structure, and the key-value pair data is written into the DEL data item. After the data items with the same key in the target linked list structure are deleted, the process can return; the remaining steps can be executed asynchronously.
  • The logic of the asynchronously executed steps is similar to the write operation: the key is hashed into two hash buckets with a hash algorithm (the embodiment shown in FIG. 15 takes two hash buckets as an example) and matched in the two buckets according to the key.
  • If the match succeeds, whether the data is stored in the target index slot or in the server's memory is further determined from the value of the Flag field in the successfully matched target index slot. If the Flag field is 01, the target index slot is unlocked and the memory occupied by the key-value pair data is released; if the Flag field is 10, the key-value pair data is read from the server's memory according to the pointer information in the target index slot, the memory occupied by the key-value pair data is released, the target index slot is unlocked, and the process then returns.
  • If the key does not meet the inline storage requirement, the key is hashed into two hash buckets with a hash algorithm, and the fingerprint digest information of the key is then calculated and matched in the buckets.
  • The number of index slots in the two hash buckets that match the key's fingerprint digest information may be more than one, that is, there may be multiple target index slots. The pointer information is read from a successfully matched target index slot, and the key-value pair data is read from the server's memory according to it.
  • If the key is determined to be consistent with the key of the read key-value pair data, the memory occupied by the key-value pair data is released, the target index slot is unlocked, and the process then returns; if it is inconsistent, the pointer information in the next target index slot is read for matching, and the above process is repeated.
  • The present disclosure migrates some workloads of the server's CPU to the smart network card; the smart network card processes the data operation requests sent by the client and presents the KVS access abstraction to the client node, without requiring the server CPU to participate in the processing of data operation requests. Moreover, through sub-index structure organization with a multi-way index mechanism, the inline storage mechanism, the data relay station acceleration structure, and server memory pre-application with local management via the memory allocator, the data processing system greatly reduces the server's performance bottleneck and the access overhead of the in-memory key-value database. For the client, the low latency brings a better experience.
  • a data processing device including: a network card module, a request analysis module, and an execution engine module.
  • the network card module is used to receive data operation requests sent by the client.
  • the request analysis module is used to parse the received data operation request to obtain the data to be processed and the data operation type information, and input the data to be processed and the data operation type information to the execution engine module.
  • the execution engine module is used to execute the data operation indicated by the data operation type information based on the data to be processed to obtain the data operation result.
  • the request analysis module is also used to encapsulate the data operation results to obtain responses to the data operation requests.
  • the network card module is also used to send responses to data operation requests to the client.
  • the request analysis module is configured to update the target field in the data structure corresponding to the data operation request based on the data operation result to obtain a response to the data operation request.
  • the network card module is configured to receive multiple data operation requests with the same transaction identifier sent by the client.
  • The request analysis module is configured to parse multiple data operation requests with the same transaction ID separately to obtain multiple data fields and multiple pieces of identical data operation type information, and to concatenate the multiple data fields to be processed according to the sequence indication information included in each of the multiple data operation requests to obtain the data to be processed.
  • the data structures of multiple data operation requests with the same transaction ID are consistent.
  • The execution engine module is configured to determine, based on the data to be processed, the target index slot corresponding to the data to be processed from the index structure stored in the memory included in the smart network card, and to execute, for the target index slot, the data operation indicated by the data operation type information to obtain the data operation result.
  • the index structure is implemented in the form of a hash bucket, and the hash bucket includes multiple index slots.
  • The execution engine module is used to perform hash calculation on the data to be processed to obtain a hash value, match in the index structure based on the hash value to obtain the successfully matched hash buckets, match in the index slots included in the successfully matched hash buckets based on the data to be processed to obtain a matching result, and determine the target index slot based on the matching result.
  • The execution engine module is used to perform hash calculations on the data to be processed with multiple preset hash algorithms to obtain multiple hash values, and to match the multiple hash values against the identifiers of the hash buckets included in the index structure to obtain multiple successfully matched hash buckets.
  • The execution engine module is used to match in the successfully matched hash buckets based on the data to be processed; or, the execution engine module is used to calculate the fingerprint digest information corresponding to the data to be processed and match in the successfully matched hash buckets based on the fingerprint digest information.
  • For a data read request, the execution engine module is configured to: when it is determined that both the data to be processed and the target data corresponding to the data to be processed are stored inline, read the target data indicated by the data to be processed from the target index slot; and when it is determined that the data to be processed is stored inline and the corresponding target data is stored non-inline, or that the data to be processed is stored non-inline, obtain the pointer information from the target index slot and read the target data corresponding to the data to be processed from the server's memory indicated by the pointer information.
  • For a data deletion request, the execution engine module is configured to: when it is determined that both the data to be processed and the target data indicated by the data to be processed are stored inline, delete the target index slot; and when it is determined that the data to be processed is stored inline and the indicated target data is stored non-inline, or that the data to be processed is stored non-inline, obtain the pointer information from the target index slot, delete the data in the server's memory indicated by the pointer information, and release the target index slot.
  • the execution engine module is used to control the memory management module of the smart network card to release the memory of the server indicated by the pointer information, and the memory management module is used to manage the memory of the server.
  • embodiments of the present disclosure also provide an electronic device, including: a memory and a processor.
  • the present disclosure does not limit the types of the memory and the processor, and the memory and the processor may be connected through a data bus.
  • the memory is configured to store computer program instructions
  • the processor is configured to execute the computer program instructions, so that the electronic device implements the data processing method shown in any of the above method embodiments.
  • embodiments of the present disclosure also provide a computer-readable storage medium, including: computer program instructions.
  • When the computer program instructions are executed by at least one processor of an electronic device, the electronic device implements the data processing method shown in any of the above method embodiments.
  • embodiments of the present disclosure also provide a computer program product.
  • When the computer program product is executed by an electronic device, the electronic device implements the data processing method shown in any of the above method embodiments.
  • an embodiment of the present disclosure also provides a computer program, including: instructions, which when executed by a processor cause the processor to perform the data processing method shown in any of the above method embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure relates to a data processing method and device. The method includes: receiving, through a network card module in a smart network card, a data operation request sent by a client; calling a request analysis module in the smart network card to parse the data operation request to obtain data to be processed and data operation type information, and inputting the data to be processed and the data operation type information to an execution engine module in the smart network card; calling the execution engine module to execute, based on the data to be processed, the data operation indicated by the data operation type information to obtain a data operation result; and calling the request analysis module to encapsulate the data operation result to obtain a response to the data operation request, and sending the response to the data operation request to the client through the network card module.

Description

Data processing method and device
Cross-reference to related applications
This application is based on, and claims priority to, Chinese Application No. 202211160563.8 filed on September 22, 2022, the disclosure of which is incorporated into this application by reference in its entirety.
Technical Field
The present disclosure relates to the field of data processing technology, and in particular, to a data processing method and device.
Background
A key-value database (KV-store, KVS for short) is a new type of non-relational database (NoSQL database) that stores data mainly in the form of key-value pairs. In terms of data storage, key-value databases are more flexible than the relational databases of the related art; in terms of access interfaces, key-value databases can satisfy a large number of business needs with simple data access interfaces such as write (PUT) and read (GET). Key-value databases are therefore widely used in various fields.
Summary
In a first aspect, the present disclosure provides a data processing method, including: receiving, through a network card module in a smart network card, a data operation request sent by a client; calling a request analysis module in the smart network card to parse the data operation request to obtain data to be processed and data operation type information, and inputting the data to be processed and the data operation type information to an execution engine module in the smart network card; calling the execution engine module to execute, based on the data to be processed, the data operation indicated by the data operation type information to obtain a data operation result; and calling the request analysis module to encapsulate the data operation result to obtain a response to the data operation request, and sending the response to the data operation request to the client through the network card module.
In a second aspect, the present disclosure provides a data processing device, including: a network card module configured to receive a data operation request sent by a client; a request analysis module configured to parse the received data operation request to obtain data to be processed and data operation type information, and to input the data to be processed and the data operation type information to an execution engine module; and an execution engine module configured to execute, based on the data to be processed, the data operation indicated by the data operation type information to obtain a data operation result; the request analysis module being further configured to encapsulate the data operation result to obtain a response to the data operation request; and the network card module being further configured to send the response to the data operation request to the client.
In a third aspect, the present disclosure provides an electronic device, including: a memory and a processor; the memory being configured to store computer program instructions; and the processor being configured to execute the computer program instructions, so that the electronic device implements the data processing method of the first aspect.
In a fourth aspect, the present disclosure provides a readable storage medium, including: computer program instructions which, when executed by at least one processor of an electronic device, cause the electronic device to implement the data processing method of the first aspect.
In a fifth aspect, the present disclosure provides a computer program product which, when executed by an electronic device, causes the electronic device to implement the data processing method of the first aspect.
In a sixth aspect, the present disclosure provides a computer program, including: instructions which, when executed by a processor, cause the processor to perform the data processing method of the first aspect.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the present disclosure.
In order to explain the technical solutions in the embodiments of the present disclosure or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic diagram of the framework of a data processing system provided by an embodiment of the present disclosure;
FIG. 2 is a diagram of the data structure of a data operation request/response provided by an embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of a data processing method provided by an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of the framework of a data processing system provided by another embodiment of the present disclosure;
FIG. 5 is a schematic diagram of the framework of a hash index mechanism exemplarily shown in the present disclosure;
FIG. 6 is a schematic diagram of the data structure of an index slot in an index structure exemplarily shown in the present disclosure;
FIG. 7 is a schematic flowchart of a data processing method provided by another embodiment of the present disclosure;
FIG. 8 is a schematic diagram of a data storage manner provided by an embodiment of the present disclosure;
FIG. 9 is a schematic structural diagram of a data processing system provided by an embodiment of the present disclosure;
FIG. 10 is a schematic structural diagram of the memory management mechanism adopted by a memory allocator provided by an embodiment of the present disclosure;
FIG. 11 is a schematic structural diagram of a data processing system provided by another embodiment of the present disclosure;
FIG. 12 is a schematic structural diagram of a data relay station provided by an embodiment of the present disclosure;
FIG. 13 is a schematic flowchart of a data processing method provided by another embodiment of the present disclosure;
FIG. 14 is a schematic flowchart of a data processing method provided by another embodiment of the present disclosure;
FIG. 15 is a schematic flowchart of a data processing method provided by another embodiment of the present disclosure.
Detailed Description
In order that the above objects, features and advantages of the present disclosure may be more clearly understood, the solutions of the present disclosure are further described below. It should be noted that, in the absence of conflict, the embodiments of the present disclosure and the features in the embodiments may be combined with each other.
Many specific details are set forth in the following description to facilitate a full understanding of the present disclosure, but the present disclosure may also be implemented in ways other than those described here; obviously, the embodiments in the specification are only a part of the embodiments of the present disclosure, not all of them.
With its unique advantages, the KVS has been widely applied in many fields. At present, in order to support higher-performance data storage and access, deploying services and applications on in-memory key-value databases has become a new solution.
With the rapid development of distributed system design, distributed in-memory key-value databases have become a new research hotspot. A distributed in-memory key-value database transfers data over the network and allows data to be stored on multiple nodes, which not only provides more storage space for key-value data but also offers more flexible scalability (i.e., storage service nodes can be added or removed dynamically).
Remote Direct Memory Access (RDMA) technology features high bandwidth and low latency, which coincides with the KVS goals of high throughput and low latency. In addition to supporting two-sided message semantics similar to the socket-based network transmission mechanisms of the related art, RDMA also supports one-sided memory semantics.
The main interaction process of data processing using two-sided message semantics is as follows: the client node sends an operation request to the server node; the server node locally executes the corresponding data operation, such as PUT/GET, and returns the operation result to the client node. The operation completes once the client node receives the result fed back by the server node.
One-sided memory semantics means that the client can read and write the server's memory space directly in a server-bypass manner; through memory semantics, the one-sided mode provides a more convenient way for distributed systems to build shared memory and load/store-like application programming interfaces (APIs). The main interaction process of data processing using one-sided memory semantics is as follows: for a GET operation, an RDMA READ action is used to read the key-value pair data; for a PUT operation, RDMA ATOMIC, RDMA WRITE and RDMA READ actions need to be reasonably combined to support consistent data writes. In the one-sided memory semantics scenario, apart from participating in data storage and initial communication establishment, the server node basically does not need to respond on the subsequent critical data path.
In-memory KVSs based on two-sided message semantics and on one-sided memory semantics each have their own advantages. Two-sided message semantics can support richer and more flexible interface definitions (because the data operation process can be hidden from the user side), while one-sided memory semantics can achieve faster and more efficient data access within a single network round trip (RTT).
However, the inventors of the present disclosure found that the RDMA-based in-memory key-value databases of the related art have at least the following problems:
1. The CPU processing bottleneck of the server node.
An in-memory key-value database using an RPC mechanism based on two-sided message semantics needs to send operation requests to the server node; the CPU of the server node is responsible for executing the specific data storage logic and returning the data operation result to the client. Under high-concurrency scenarios (e.g., hundreds or even thousands of clients accessing simultaneously), this processing mechanism makes the server node's CPU a performance bottleneck on the critical path, which in turn may produce high tail latency. This is not only due to the multi-core frequency limit of a single server, but is also related to the interaction mode between the CPU and the network card. The server node's CPU not only has to prepare a receive work request (RECV WR) in advance for each client node's data operation request and poll the completion queue (CQ), but also has to process the operation requests, copy data, prepare send work requests (SEND WR), and so on, which produces a great deal of additional access overhead. Moreover, since the server node's CPU is the core device executing KVS accesses, more expensive and efficient CPU components and matching motherboards have to be deployed on the server. This deviates from the compute-storage disaggregation idea advocated in today's cloud computing field (i.e., decoupling computing resources from storage resources so that storage nodes can focus on data storage), making it difficult to achieve low total cost of ownership (TCO) for storage nodes.
2. Complex KVS access protocol implementations and the overhead of multiple network round trips.
An in-memory key-value database based on one-sided memory semantics allows the client to perform index lookups and locate the memory address, on the server node, where the corresponding key-value pair data resides, completing GET/PUT operations through actions such as READ and WRITE. However, this not only requires the client node to cache the index structure of the server node's key-value pair data, but also poses greater challenges to the consistency of concurrent operations: the server-node-centralized consistency guarantee mechanism of the related art has to be upgraded to a distributed consistency guarantee, which is more complex and whose correctness is harder to ensure. Moreover, even supporting consistent data writes by combining multiple WRITE, ATOMIC and READ operations incurs the overhead of multiple network round trips. Lacking transaction support, the memory-level abstraction provided by RDMA is not suitable for building an efficient KVS.
3. Low utilization of hardware bandwidth, leading to a throughput bottleneck.
Hardware bandwidth, such as network card line rate, PCIe (Peripheral Component Interconnect Express) bandwidth and memory bandwidth, determines the upper limit of in-memory key-value database access. As mentioned above, in-memory key-value database systems based on either two-sided message semantics or one-sided memory semantics have performance bottlenecks (e.g., CPU processing bottlenecks, multiple network round trips). This prevents them from utilizing hardware bandwidth resources efficiently. For example, when the CPU becomes the processing bottleneck, memory bandwidth and network bandwidth are wasted; when network round trips become the bottleneck, PCIe bandwidth and network bandwidth are wasted. Naturally, wasted bandwidth produces a throughput bottleneck and cannot support the construction of an ideal, efficient in-memory key-value database.
With the continuous development of Internet technology, network cards supporting RDMA (NICs (Network Interface Controllers), such as RoCE (RDMA over Converged Ethernet) and InfiniBand HCAs (Host Channel Adapters)) are gradually becoming widespread in network deployments. At the same time, another evolutionary trend regarding hardware acceleration is emerging in data centers: more and more data center servers are equipped with smart network cards (SmartNICs, also called programmable network cards). The core component of a smart network card is a field-programmable gate array (FPGA), which has an embedded NIC chip to connect to the network and a PCIe connector to connect to the server (host).
The inventors of the present disclosure found that, with the rapid development of the Internet, network usage has grown explosively and the access pressure on databases keeps increasing; high-concurrency database access scenarios, such as hundreds or thousands of users accessing simultaneously, lead to degraded database performance.
In view of this, embodiments of the present disclosure provide a data processing method to improve the performance of key-value databases.
In the data processing method provided by the present disclosure, some workloads of the server's CPU are migrated to the smart network card; the FPGA in the smart network card acts as the processor chip and processes the data operation requests sent by the client. The smart network card presents the KVS access abstraction to the client node without requiring the server's CPU to participate in the processing of data operation requests, effectively reducing the processing pressure on the server's CPU. In addition, both PUT and GET operations are supported with single-network-round-trip operation latency, maximizing the use of network bandwidth to increase access throughput. It should be noted that the data processing method provided by the present disclosure is equally applicable to other databases with similar problems.
Next, the data processing method provided by the present disclosure is introduced in detail through some embodiments, in combination with the drawings and scenarios. The data processing method may be executed by the data processing device provided by the present disclosure, and the device may be implemented by any software and/or hardware, for example, as a software system. In the following embodiments, the description takes as an example the case where the data processing device is a data processing system and the database deployed in the server is an in-memory key-value database.
FIG. 1 is a schematic diagram of the architecture of a data processing system provided by an embodiment of the present disclosure. Referring to FIG. 1, the data processing system 100 is deployed in a smart network card, and the data processing system 100 can access and operate on the database in the server's memory. The network card module in the smart network card is mainly used to receive the data operation requests sent by the client and deliver them to the corresponding modules of the data processing system 100, and to deliver the responses generated by the data processing system 100 to the client.
For example, the data processing system 100 includes: a request analysis module 101 and an execution engine module 102. The request analysis module 101 is mainly used to parse, identify, encapsulate and forward the data operation requests transmitted by the network card module in the smart network card. For example, the request analysis module 101 includes: a request decoder 101a and a request encoder 101b. The request decoder 101a is mainly used to obtain from the network card module the data operation requests sent by the client and to parse and identify them; the request encoder 101b is mainly used to encapsulate the data operation result delivered by the execution engine module 102 into the response to the data operation request and deliver it to the network card module, through which the response is fed back to the client.
The execution engine module 102 is mainly used to execute the corresponding action (e.g., data operations such as write/read/delete) according to the data operation type indicated by the data operation request sent by the client, and to deliver the data operation result to the request analysis module 101. The execution engine module 102 can interact with the server's memory to implement operations on key-value pair data.
As a possible implementation, the request decoder 101a and the request encoder 101b in the request analysis module 101 can share the same data structure; that is, the data operation request and the data operation result can share the same data structure. Referring to FIG. 2, FIG. 2 exemplarily shows a schematic diagram of the data structure shared by data operation requests/responses.
For example, a data operation request/response includes one or more of: transaction identification information, information on the total number of requests/responses making up this transaction, the sequence information of this request among all requests/responses, the data operation type, the total length of the key corresponding to this transaction, the total length of the value of this transaction, the length of the key contained in this request/response, the length of the value contained in this request/response, a field for holding all or part of the key data of this transaction, a field for holding all or part of the value data of this transaction, checksum information, and so on.
In the embodiments of the present disclosure, the data size of the data operation request/response is not limited; for example, it may be 32 bytes, 64 bytes, 128 bytes, and so on. Illustratively, the embodiment shown in FIG. 2 takes a 64-byte data operation request/response as an example. Specifically, it may include a 4-byte TxID field (a globally unique transaction ID, which may be specified by the user), a 2-byte Num field (indicating the total number of requests/responses making up this transaction, so that variable-length keys and values can be handled), a 2-byte Seq field (indicating the order of this request/response among all requests/responses of this transaction, used for data concatenation after all requests or responses are received), a 2-byte Opcode field (specifying the data operation type, such as read/write/delete), a 2-byte Tkey_len field (indicating the total key length of this transaction, which may span multiple requests), a 2-byte Tvalue_len field (indicating the total value length of this transaction, which may span multiple requests/responses), a 1-byte Key_len field (the key length contained in this request), a 1-byte Value_len field (the value length contained in this request/response), a 16-byte Key field (holding all or part of the key data of this transaction), a 24-byte Value field (holding all or part of the value data of this transaction), and an 8-byte Checksum field (checksum information, which may be the checksum of a single request, used to verify the integrity and accuracy of the request/response data after network transmission).
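The field widths above add up exactly to 64 bytes (4+2+2+2+2+2+1+1+16+24+8). The following Python sketch shows one way such a request/response could be packed and unpacked; the field order follows the paragraph above, while the little-endian byte order and the checksum choice (a CRC32 over the first 56 bytes, zero-padded to 8 bytes) are assumptions made for illustration, not details taken from the disclosure.

    import struct
    import zlib

    # Assumed field order per the description: TxID, Num, Seq, Opcode,
    # Tkey_len, Tvalue_len, Key_len, Value_len, Key, Value, Checksum.
    FMT = "<IHHHHHBB16s24s8s"          # '<' = little-endian, no padding
    assert struct.calcsize(FMT) == 64  # matches the 64-byte layout

    def pack_request(txid, num, seq, opcode, tkey_len, tvalue_len, key, value):
        body = struct.pack("<IHHHHHBB16s24s", txid, num, seq, opcode,
                           tkey_len, tvalue_len, len(key), len(value),
                           key, value)          # struct zero-pads Key/Value
        return body + struct.pack("<I4x", zlib.crc32(body))  # assumed checksum

    def unpack_request(buf):
        (txid, num, seq, opcode, tkey_len, tvalue_len,
         key_len, value_len, key, value, checksum) = struct.unpack(FMT, buf)
        assert checksum == struct.pack("<I4x", zlib.crc32(buf[:56]))
        return txid, num, seq, opcode, key[:key_len], value[:value_len]

    # A PUT of key b"user42" / value b"hello" in a single-request transaction
    # (opcode 2 standing for PUT is an assumption made for this sketch):
    req = pack_request(txid=1, num=1, seq=0, opcode=2,
                       tkey_len=6, tvalue_len=5, key=b"user42", value=b"hello")

A multi-request transaction would reuse the same TxID while incrementing Seq, which is how the request decoder described below reassembles extra-long keys and values.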
FIG. 3 is a schematic flowchart of a data processing method provided by an embodiment of the present disclosure. Referring to FIG. 3, the method provided by this embodiment includes:
S301. Receive, through the network card module in the smart network card, a data operation request sent by the client.
In some embodiments, step S301 includes: receiving, through the network card module, multiple data operation requests with the same transaction identifier sent by the client.
S302. Call the request analysis module in the smart network card to parse the data operation request to obtain the data to be processed and the data operation type information, and input the data to be processed and the data operation type information to the execution engine module in the smart network card.
As shown in FIG. 1, the network card module of the smart network card can receive the data operation request sent by the client and transmit it to the request decoder, which parses the data operation request to obtain information such as the data operation type and the key-value pair data in the data structure corresponding to the data operation request.
In some embodiments, the client can send an individual data operation request to the data processing system; correspondingly, the request decoder of the data processing system can parse that individual data operation request, obtain the information in it, and perform the corresponding action.
In other embodiments, the client can send multiple data operation requests to the data processing system, so that the data processing system aggregates the multiple data operation requests and performs the corresponding action on extra-long key-value pair data. Illustratively, in combination with the data structure shown in FIG. 2, the request decoder of the data processing system can parse and identify multiple data operation requests, obtain the TxID field, Seq field, Checksum field, etc. of each request, identify whether they belong to the same transaction based on the TxIDs, restore the order of the requests through the Seq field, and check data consistency using the Checksum field, thereby realizing data operations on extra-long key-value pair data. The request decoder can parse multiple data operation requests with the same transaction identifier separately to obtain multiple Key fields, and then concatenate the Key fields in the order indicated by the Seq field to obtain the data to be processed corresponding to the transaction. The data operation types indicated in data operation requests with the same transaction identifier are consistent. It should be noted that if executing the data operation request requires the Value field, the parsed Value fields can also be concatenated in the above manner.
In some embodiments, calling the request analysis module in the smart network card to parse the data operation request to obtain the data to be processed and the data operation type information includes: calling the request analysis module to parse the multiple data operation requests with the same transaction identifier separately to obtain multiple data fields and multiple pieces of identical data operation type information; and calling the request analysis module to concatenate the multiple data fields to be processed according to the sequence indication information included in each of the multiple data operation requests to obtain the data to be processed.
In some embodiments, the data structures of the multiple data operation requests with the same transaction identifier are consistent.
S303. Call the execution engine module in the smart network card to execute, based on the data to be processed, the data operation indicated by the data operation type information to obtain a data operation result.
The data operation request may be a data read request, a data write request, or a data deletion request; the execution engine module of the data processing system can execute a read, write or delete operation based on the data operation type indicated by the data operation request and obtain the corresponding data operation result.
If the data operation request is a data read request, the data operation result obtained by the execution engine module may be the read target data indicated by the data to be processed (e.g., value data); if the data operation request is a data write operation, the data operation result may be write success/failure information; if the data operation request is a data deletion request, the data operation result may be deletion success/failure information.
In some embodiments, calling the execution engine module to execute, based on the data to be processed, the data operation indicated by the data operation type information to obtain a data operation result includes: calling the execution engine module to determine, based on the data to be processed, the target index slot corresponding to the data to be processed from the index structure stored in the memory included in the smart network card; and calling the execution engine module to execute, for the target index slot, the data operation indicated by the data operation type information to obtain the data operation result.
S304. Call the request analysis module to encapsulate the data operation result to obtain the response to the data operation request.
In some embodiments, S304 includes: calling the request analysis module to update, based on the data operation result, the target field in the data structure corresponding to the data operation request to obtain the response to the data operation request.
The request encoder can fill the data operation result into the specified field of the data structure corresponding to the data operation request to obtain the response.
Taking the data structure of the embodiment shown in FIG. 2 as an example:
Illustratively, if the data operation request is a data read request, the request encoder can fill the read value data (i.e., the data operation result) into the Value field in the data structure corresponding to the data operation request to obtain the response to the data operation request.
As a possible implementation, if the data read request is an operation on extra-long key-value pair data, the amount of value data read may be large. Since the size of the corresponding field in the data operation request cannot accommodate it, multiple response structures can be created according to the amount of value data read, so as to accommodate the read value data.
For example, in combination with the data structure shown in FIG. 2, the Value field is 24 bytes. If the read value data is smaller than 24 bytes, it is simply written into the Value field; if the read value data is larger than 24 bytes, the required number of response structures can be created, and the read value data is then written into the Value fields of the multiple response structures.
If the data operation request is a data write request or a data deletion request, the request encoder can modify the field indicating the data operation type in the data structure shown in FIG. 2 (the Opcode field) to ack/null, representing success/failure of the write/delete operation, where ack can indicate that the write/delete operation succeeded and null can indicate that it failed.
It should be noted that the data structure reuse implementation shown here is only an example; the request/response data structure and the reuse manner can also be implemented in other ways, which the present disclosure does not limit.
S305. Send the response to the data operation request to the client through the network card module.
The request encoder delivers the encapsulated response to the network card module of the smart network card, through which the response is transmitted to the client. After receiving the response, the client parses it: it can first match the TxID field, then check the Opcode field to confirm the data operation type, and then fetch the data from the Value field if needed. For example, when the data operation request is a data read request, the data in the Value field can be obtained; if the data operation request is a data write request or data deletion request, the client can check the Opcode field to determine whether the data write/deletion succeeded.
The above provides data processing methods according to some embodiments of the present disclosure. The data processing method includes: receiving, through the network card module in the smart network card, the data operation request sent by the client; calling the request analysis module in the smart network card to parse the data operation request to obtain the data to be processed and the data operation type information, and inputting them to the execution engine module in the smart network card; calling the execution engine module to execute, based on the data to be processed, the data operation indicated by the data operation type information to obtain the data operation result; and calling the request analysis module to encapsulate the data operation result into the response to the data operation request, and sending the response to the client through the network card module. By migrating some workloads of the server's CPU to the smart network card, having the smart network card process the data operation requests sent by the client, and presenting the database access abstraction to the client node through the smart network card without requiring the server CPU to participate in the processing of data operation requests, the method of this embodiment can reduce the processing pressure on the server's CPU.
In addition, the response reuses the data structure of the data operation request; this data structure reuse mechanism can reduce the server's space allocation and data copy overhead.
In addition, the data structure supporting aggregated multiple data operation requests is cache-aligned; therefore, during data transmission it can reduce the network transmission overhead caused by unaligned boundaries and avoid read/write amplification as much as possible.
From a storage perspective, a key-value database may include two parts: the index structure and the key-value pair data, where the key-value pair data is the target object on which the user performs data operations, and the index structure is the retrieval data structure used to find the storage location of the requested key-value pair data.
FIG. 4 is a schematic diagram of the architecture of a data processing system provided by an embodiment of the present disclosure. Referring to FIG. 4, on the basis of the embodiment shown in FIG. 1, the data processing system provided by this embodiment further includes: an index module 103. The index module 103 may be set in the memory of the smart network card; the index module 103 includes the index information corresponding to the key-value pair data, and the key-value pair data pointed to by the index information may be stored in the server's memory (i.e., host memory, hereinafter referred to as the server's memory).
When the index module 103 is set in the memory of the smart network card to store the index structure, the execution engine module 102 may also need to interact with the index module 103 when it is called to process the data operation requests sent by the client.
The present disclosure does not limit the specific implementation of the index structure. As a possible implementation, the index structure may adopt a hash index structure, such as a chained hash index structure, a cuckoo hash index structure, or a hopscotch hash index structure; of course, the index structure may also adopt structures such as a binary tree, radix tree, B-tree, B+ tree, or red-black tree.
In this embodiment, to improve the memory utilization of the smart network card, the index structure may be implemented with sub-index structure organization and a multi-way index mechanism. Sub-index structure organization means that the index structure consists of multiple sub-index structures, each of which may include multiple index slots, and each index slot may be used to store information related to one key-value pair. The multi-way index mechanism means that multiple preset mapping methods are used to map data operation requests to multiple sub-index structures. Through sub-index structure organization and the multi-way index mechanism, load balancing of accesses to the index structure can be achieved, avoiding the high access pressure caused by heavy access to the same index structure.
When the index structure is implemented as a hash index structure, it may include multiple hash buckets, and multi-way hashing is used for mapping when processing data operation requests. FIG. 5 exemplarily shows a schematic diagram of the hash index mechanism, taking hash buckets and two-way hashing as an example.
Referring to the two hash buckets shown in FIG. 5, hash bucket X and hash bucket Y, the hash buckets X and Y may be cache-aligned, so that the spatial locality of the cache can be used to greatly reduce the overhead of accessing the same hash bucket. Each hash bucket may include: multiple index slots, a field indicating whether each index slot in the bucket is a free slot, and a field indicating whether an index slot in the bucket is occupied by a thread.
Assuming a hash bucket is 64 bytes and cache-aligned, each hash bucket may contain 4 bytes of metadata: a 1-byte Bitmap field (each bit represents whether the corresponding index slot in the bucket is free; 0 means free, 1 means occupied), a 1-byte Lockmap field (each bit represents whether the corresponding index slot in the bucket is occupied by a thread; 0 means free, 1 means occupied), and a 2-byte Padding field (meaningless bits used only to align to 4 bytes). Each hash bucket may contain four 15-byte index slots, and each index slot may be used to store the index information of one key-value pair.
It should be noted that the present disclosure does not limit the number of index slots included in each hash bucket; the numbers of index slots in the hash buckets may be the same or different. When the numbers differ, the byte size of the metadata can be adjusted so that the metadata can completely represent the states of all index slots. In addition, with the multi-way hash index mechanism, the present disclosure does not limit the number of mapping methods or their implementation.
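As a rough illustration of the 64-byte bucket layout described above, the following Python sketch models the Bitmap and Lockmap metadata with plain integers. The helper names are hypothetical and the code is single-threaded; a real SmartNIC implementation would need atomic bit operations for the Lockmap.

    SLOTS_PER_BUCKET = 4  # four 15-byte index slots per 64-byte bucket

    class HashBucket:
        def __init__(self):
            self.bitmap = 0    # bit i = 1: slot i occupied (Bitmap field)
            self.lockmap = 0   # bit i = 1: slot i held by a thread (Lockmap)
            self.slots = [None] * SLOTS_PER_BUCKET

        def find_free_slot(self):
            for i in range(SLOTS_PER_BUCKET):
                if not (self.bitmap >> i) & 1:
                    return i
            return None  # bucket full

        def occupy(self, i):
            self.bitmap |= (1 << i)

        def lock(self, i):
            assert not (self.lockmap >> i) & 1, "slot already locked"
            self.lockmap |= (1 << i)

        def unlock(self, i):
            self.lockmap &= ~(1 << i)

        def release(self, i):
            self.slots[i] = None
            self.bitmap &= ~(1 << i)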
To reduce the access and storage overhead of key-value pair data, the index structure may be implemented with an inline storage mechanism: key-value pair data meeting preset conditions is stored inline in the index slots. When the client needs to access such inline-stored key-value pair data, it can do so by accessing the index structure, without accessing the server's memory and without going through the PCIe data channel between the smart network card and the server, thereby reducing the access and storage pressure on the server.
As a possible implementation, whether key-value pair data meets the requirements of the inline storage mechanism can be determined from attribute information of the key-value pair data. The attribute information mentioned here may include, but is not limited to: the data type of a specific field (e.g., int8, int16, int32, int64, float32, float64, string, etc.), the data size, and so on.
Illustratively, FIG. 6 is a schematic diagram of the data structure of an index slot in the index structure exemplarily shown in the present disclosure. Referring to FIG. 6, when the inline storage mechanism is used, an index slot may include a field indicating the data type of the key-value pair, a field indicating the storage type of the key and value, a field storing information related to the key, and a field storing information related to the value. The present disclosure does not limit the byte size of each field.
In combination with the aforementioned embodiment shown in FIG. 5, the embodiment shown in FIG. 6 takes a 15-byte index slot as an example. The index slot includes four fields: a 6-bit field indicating the data type (also called the type field), a 2-bit field indicating how the information related to the key and value is stored (also called the Flag field), an 8-byte field storing information related to the key (also called the key-info field), and a 6-byte field storing information related to the value (also called the value-info field).
The Flag field can take three values: 01, 10 and 11. When the Flag field is 01, both the key and the value of the key-value pair can be stored inline in the index slot. When the Flag field is 10, the key of the key-value pair can be stored inline in the index slot, but the value cannot. When the Flag field is 11, neither the key nor the value of the key-value pair can be stored inline in the index slot.
Referring to the four index slots shown in FIG. 6: for the Int32-type data stored in index slot 1 and the string-type data stored in index slot 2, the byte sizes of the key and value both meet the byte-size limits of the key-info and value-info fields, so the key and value are stored inline in the index slot.
For the int64-type data stored in index slot 3, the key meets the byte-size limit of the key-info field, but the value is larger than 6 bytes and cannot meet the byte-size limit of the value-info field; therefore, the key can be stored inline in the key-info field of the index slot, and the value-info field can be filled with the pointer information corresponding to the key-value pair, i.e., the key-value pair is stored in the server's memory pointed to by the pointer information.
For the string-type data stored in index slot 4, the byte size of the key cannot meet the byte-size limit of the key-info field, so the key-value pair must be stored non-inline. As shown in FIG. 6, the key-info field can be used to store the fingerprint digest information of the key of the key-value pair, where the fingerprint digest information is data obtained by mapping the key so as to meet the byte-size limit of the key-info field, for example by hashing the key; of course, other mapping methods can also be used to obtain the fingerprint digest information. The value-info field can then be filled with the pointer information corresponding to the key-value pair, i.e., the key-value pair is stored in the server's memory pointed to by the pointer information.
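The three Flag cases above can be summarized compactly. The sketch below encodes them in Python; the 8-byte key and 6-byte value limits come from the text, while the choice of BLAKE2b for the fingerprint digest and the IndexSlot field names are assumptions made for this sketch.

    import hashlib
    from dataclasses import dataclass

    KEY_INLINE_MAX, VALUE_INLINE_MAX = 8, 6   # key-info / value-info byte limits

    def fingerprint(key):
        # Hypothetical 8-byte digest fitting the key-info field.
        return hashlib.blake2b(key, digest_size=8).digest()

    @dataclass
    class IndexSlot:
        type_code: int      # 6-bit data-type (type) field
        flag: int           # 2-bit Flag field: 0b01, 0b10 or 0b11
        key_info: bytes     # inline key, or fingerprint digest (<= 8 bytes)
        value_info: object  # inline value (<= 6 bytes) or host-memory pointer

    def build_slot(type_code, key, value, ptr=None):
        if len(key) <= KEY_INLINE_MAX and len(value) <= VALUE_INLINE_MAX:
            return IndexSlot(type_code, 0b01, key, value)   # both inline
        if len(key) <= KEY_INLINE_MAX:
            return IndexSlot(type_code, 0b10, key, ptr)     # value via pointer
        return IndexSlot(type_code, 0b11, fingerprint(key), ptr)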
Next, the embodiments shown in FIG. 7 to FIG. 9 describe in detail how the data processing system processes the data read requests, data write requests and data deletion requests sent by the client when the index module is set in the memory of the smart network card and the index structure in the index module is implemented with the aforementioned hash buckets, multi-way hashing and inline storage mechanism.
FIG. 7 is a flowchart of a data processing method provided by an embodiment of the present disclosure. Referring to FIG. 7, the method provided by this embodiment includes:
S701. Receive, through the network card module in the smart network card, a data operation request sent by the client.
S702. Call the request analysis module in the smart network card to parse the data operation request to obtain the data to be processed and the data operation type information, and input the data to be processed and the data operation type information to the execution engine module in the smart network card.
Steps S701 and S702 are respectively similar to steps S301 and S302 in the embodiment shown in FIG. 3; reference may be made to the detailed description of the embodiment shown in FIG. 3, which, for brevity, is not repeated here.
S703. Call the execution engine module to determine, based on the data to be processed, the target index slot corresponding to the data to be processed from the index structure stored in the memory included in the smart network card.
In some embodiments, the index structure is implemented with hash buckets, and a hash bucket includes multiple index slots.
The target index slot may be determined, but is not limited to being determined, in the following manner:
Step a. Call the execution engine module to perform hash calculation on the data to be processed to obtain a hash value, and match in the index structure based on the hash value to obtain the successfully matched hash buckets.
The data processing system may use one or more hash algorithms to match hash buckets. When the data to be processed is key-value pair data, the key of the key-value pair data to be processed may be computed with multiple hash algorithms to obtain multiple hash values, and multiple hash buckets are matched based on the multiple hash values.
In some embodiments, calling the execution engine module to perform hash calculation on the data to be processed to obtain a hash value and matching in the index structure based on the hash value to obtain the successfully matched hash buckets includes: calling the execution engine module to perform hash calculations on the data to be processed with multiple preset hash algorithms respectively to obtain multiple hash values; and calling the execution engine module to match the multiple hash values against the identifiers of the hash buckets included in the index structure to obtain multiple successfully matched hash buckets.
Step b. Call the execution engine module to match, based on the data to be processed, in the index slots included in the successfully matched hash buckets to obtain a matching result, and determine the target index slot based on the matching result.
Illustratively, when the data operation request is a data read request or a data deletion request, matching may be performed in the multiple successfully matched hash buckets according to the key of the key-value pair to be processed or the fingerprint digest information of the key, and the successfully matched index slot is the target index slot.
Illustratively, when the data operation request is a data write request, a free index slot may be allocated as the target index slot for the data to be processed according to the occupancy of the index slots in the multiple successfully matched hash buckets or other factors. In combination with the embodiment shown in FIG. 5, the occupancy of the index slots can be obtained from whether each bit in the Bitmap field of the hash bucket is 0 or 1. In some cases, when the data operation request is a data write request that modifies data, the index slot corresponding to the data to be modified is the target index slot.
When matching in the hash buckets, whether to match using the data to be processed itself or its fingerprint digest information can be decided based on the attribute information of the data to be processed.
In some embodiments, calling the execution engine module to match, based on the data to be processed, in the index slots included in the successfully matched hash buckets to obtain a matching result includes: calling the execution engine module to match in the successfully matched hash buckets based on the data to be processed; or calling the execution engine module to calculate the fingerprint digest information corresponding to the data to be processed and to match in the successfully matched hash buckets based on the fingerprint digest information.
As a possible implementation, if the key of the key-value pair to be processed meets the requirements of inline storage, the execution engine module is called to match, according to the key, in each index slot included in the hash buckets, and the index slot whose stored key-related information matches the key of the key-value pair to be processed is determined as the target index slot. For example, in combination with the data structure of the embodiment shown in FIG. 6, the index slot whose key-info field holds a key consistent with the key of the key-value pair to be processed is the target index slot.
As another possible implementation, if the key of the key-value pair to be processed does not meet the requirements of inline storage, the execution engine module is called to match, according to the fingerprint digest information of the key, in each index slot included in the hash buckets, and the index slot whose stored key-related information matches the fingerprint digest information of the key is determined as the target index slot. For example, in combination with the data structure of the embodiment shown in FIG. 6, the index slot whose key-info field holds fingerprint digest information consistent with that of the key of the key-value pair to be processed is the target index slot.
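Putting steps a and b together, a minimal Python sketch of the two-way lookup is shown below, reusing the hypothetical HashBucket, IndexSlot, fingerprint and KEY_INLINE_MAX helpers from the sketches above. MD5 and SHA-1 stand in here for the two preset hash algorithms; the disclosure does not name specific algorithms.

    import hashlib

    def bucket_ids(key, num_buckets):
        # Two preset hash algorithms (stand-ins) -> two candidate buckets.
        h1 = int.from_bytes(hashlib.md5(key).digest()[:4], "little")
        h2 = int.from_bytes(hashlib.sha1(key).digest()[:4], "little")
        return h1 % num_buckets, h2 % num_buckets

    def find_target_slots(index, key):
        """Yield (bucket, slot_no) candidates for `key` across both buckets."""
        inline = len(key) <= KEY_INLINE_MAX
        probe = key if inline else fingerprint(key)
        for bid in bucket_ids(key, len(index)):
            bucket = index[bid]
            for i, slot in enumerate(bucket.slots):
                if slot is not None and slot.key_info == probe:
                    yield bucket, i   # fingerprint hits may be false positives

Because a fingerprint digest can collide, callers on the read and delete paths must verify the full key against host memory before trusting a match, exactly as the flows below describe.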
S704. Call the execution engine module to execute, for the target index slot, the data operation indicated by the data operation type information to obtain the data operation result.
In some embodiments, when the data operation request is a data read request, step S704 includes: when it is determined that both the data to be processed and the target data corresponding to the data to be processed are stored inline, reading the target data indicated by the data to be processed from the target index slot; and when it is determined that the data to be processed is stored inline and the corresponding target data is stored non-inline, or that the data to be processed is stored non-inline, obtaining the pointer information from the target index slot and reading the target data corresponding to the data to be processed from the server's memory indicated by the pointer information.
In other embodiments, when the data operation request is a data deletion request, step S704 includes: when it is determined that both the data to be processed and the target data indicated by the data to be processed are stored inline, deleting the target index slot; and when it is determined that the data to be processed is stored inline and the indicated target data is stored non-inline, or that the data to be processed is stored non-inline, obtaining the pointer information from the target index slot, deleting the data in the server's memory indicated by the pointer information, and releasing the target index slot.
In some embodiments, deleting the data in the server's memory indicated by the pointer information includes: controlling, through the execution engine module, the memory management module of the smart network card to release the server's memory indicated by the pointer information, the memory management module being used to manage the server's memory.
According to the different data operation requests, and combining inline and non-inline index storage, several cases illustrate how the execution engine module of the data processing system executes a data operation request.
Case 1: The data operation request is a data read request, and the Flag field in the target index slot is 01.
A Flag field of 01 in the target index slot indicates that the key-value pair to be read is stored inline in the target index slot; the execution engine module is called to read the value data from the value-info field in the target index slot, and the execution engine module can determine the data type of the value data from the type field in the target index slot.
Case 2: The data operation request is a data read request, and the Flag field in the target index slot is 10 or 11.
A Flag field of 10 or 11 in the target index slot indicates that the key-value pair to be read is stored in the server's memory; the execution engine module is called to read the pointer information from the value-info field in the target index slot and to read the value data from the server's memory according to the pointer information, and the execution engine module can determine the data type of the value data from the type field in the target index slot.
Case 3: The data operation request is a data deletion request, and the Flag field in the target index slot is 01.
A Flag field of 01 in the target index slot indicates that the key-value pair to be deleted is stored inline in the target index slot; the execution engine module is called to release the target index slot, completing the data deletion.
Case 4: The data operation request is a data deletion request, and the Flag field in the target index slot is 10 or 11.
A Flag field of 10 or 11 in the target index slot indicates that the key-value pair to be deleted is stored in the server's memory; the execution engine module is called to read the pointer information from the value-info field in the target index slot, delete the key-value pair in the memory pointed to by the pointer information, and release the server's memory occupied by the key-value pair. The execution engine module is also called to release the target index slot, completing the data deletion.
Case 5: The data operation request is a data write request, and both the key and the value of the key-value pair to be processed can be stored inline.
The execution engine module is called to fill the Flag field in the target index slot with 01, fill the type field with the data type of the key-value pair to be processed, fill the key of the key-value pair into the key-info field, and fill the value into the value-info field.
Case 6: The data operation request is a data write request, the key of the key-value pair to be processed can be stored inline, and the value is stored non-inline.
The execution engine module is called to fill the Flag field in the target index slot with 10, fill the type field with the data type of the key-value pair to be processed, fill the key into the key-info field, allocate server memory for the key-value pair, generate pointer information from the address of the allocated server memory, and fill the pointer information into the value-info field. The execution engine module is called to deliver the key-value pair to the server for storage in the allocated server memory.
Case 7: The data operation request is a data write request, and the key of the key-value pair to be processed is stored non-inline.
When the key of the key-value pair to be processed is stored non-inline, the value is also stored non-inline, and the target index slot needs to store the fingerprint digest information of the key and the pointer information. Therefore, the execution engine module can be called to fill the Flag field in the target index slot with 11, fill the type field with the data type of the key-value pair, fill the fingerprint digest information of the key into the key-info field, allocate server memory for the key-value pair, generate pointer information from the address of the allocated server memory, and fill the pointer information into the value-info field. The execution engine module is called to deliver the key-value pair to the server for storage in the allocated server memory.
As a possible implementation, the server's memory can support the following two ways of data storage:
Way 1: If the key of the key-value pair is stored inline and the data length can be determined from the data type of the key-value pair, only the value of the key-value pair to be processed may be stored in the server's memory. Illustratively, when key-value pair data is stored in the server's memory in Way 1, the data structure can be as shown for Way 1 in FIG. 8.
Way 2: If the key of the key-value pair is stored non-inline, or the key is stored inline but the data length cannot be determined from the data type, then not only the key and value of the key-value pair but also the data length of the key and the data length of the value need to be stored in the server's memory. Illustratively, when key-value pair data is stored in the server's memory in Way 2, the data structure can be as shown for Way 2 in FIG. 8. It should be noted that when storing key-value pair data in Way 2, other layouts can also be used, for example storing the key-value pair data first and then the data length information of the key and of the value, or storing in the order of the key, the key's data length information, the value, and the value's length information; the present disclosure does not limit this.
Storing key-value pair data in the server's memory in Way 1 avoids unnecessary memory space overhead for the key and its related data, thereby improving the utilization of the server's memory. In addition, for Way 2, by storing the data length information in the server's memory, the execution engine of the data processing system can correctly handle the access boundaries of the data, so that the data operation requests sent by the client are guaranteed to be processed correctly, without errors.
In the scenario shown in Case 2 above, if the data to be read is stored in Way 1, the value of the key-value pair is read from the server's memory; if it is stored in Way 2, the execution engine reads the key-value pair data and its data length information from the server's memory. In the scenario shown in Case 4, if the data to be deleted is stored in Way 1, the stored value of the key-value pair is deleted from the server's memory; if it is stored in Way 2, the key-value pair data and its data length information are deleted. In the scenarios shown in Cases 6 and 7, if the requirements of Way 1 are met, the value of the key-value pair can be stored in the server's memory; if the requirements of Way 2 are met, both the key-value pair data and its data length information are stored in the server's memory.
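A Python sketch of the two host-memory record layouts just described: Way 1 stores the bare value, while Way 2 prefixes the key and the value with their lengths so the engine can recover the access boundaries. The 2-byte length width is an assumption for illustration; the disclosure does not fix it.

    import struct

    def encode_way1(value):
        # Way 1: the key is inline in the slot and the length is implied
        # by the data type, so only the value is stored in host memory.
        return value

    def encode_way2(key, value):
        # Way 2: lengths are stored alongside the data (key_len, key,
        # value_len, value), one of the orderings permitted by the text.
        return (struct.pack("<H", len(key)) + key +
                struct.pack("<H", len(value)) + value)

    def decode_way2(buf):
        klen = struct.unpack_from("<H", buf, 0)[0]
        key = buf[2:2 + klen]
        vlen = struct.unpack_from("<H", buf, 2 + klen)[0]
        value = buf[4 + klen:4 + klen + vlen]
        return key, value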
S705. Call the request analysis module to encapsulate the data operation result to obtain the response to the data operation request.
S706. Send the response to the data operation request to the client through the network card module.
S705 and S706 in this embodiment are respectively similar to S304 and S305 in the embodiment shown in FIG. 3; reference may be made to the detailed description of the embodiment shown in FIG. 3, which, for brevity, is not repeated here.
In this embodiment, through sub-index structure organization and multi-way indexing, the data processing system uses the spatial locality of sub-index structure caching to greatly reduce the access overhead of accessing the same sub-index structure, which can improve the memory utilization of the smart network card. In addition, the index structure is implemented with the inline storage mechanism, which can optimize the access and storage overhead of small key-value pair data, thereby reducing the access pressure on the server's memory and improving data processing efficiency.
In combination with the description of Cases 5 to 7 under step S704 of the embodiment shown in FIG. 7, when the data to be processed needs to be written into the server's memory, the smart network card needs to interact with the server to request the allocation of memory space for the data write. If the server were requested separately for every write operation, the smart network card would need to interact frequently with the server's CPU, which would become a performance bottleneck for key-value data operations. To solve this problem, the present disclosure sets a memory allocator in the data processing system; the memory allocator can implement near-NIC host memory management by combining pre-application with slab management.
FIG. 9 is a schematic structural diagram of a data processing system provided by an embodiment of the present disclosure. Referring to FIG. 9, on the basis of the embodiment shown in FIG. 4, the data processing system provided by this embodiment further includes: a memory allocator 104.
The memory allocator 104 is set in the memory of the smart network card and is mainly responsible for requesting free storage space from the server's host memory and managing it. When the memory allocator 104 is set in the data processing system, the execution engine module 102 may also need to interact with the memory allocator 104 when processing the data operation requests sent by the client.
As a possible implementation, when storage resources are insufficient, the memory allocator 104 interacts with the server CPU to request memory space of a preset size. The preset size is not limited; for example, a large region of several hundred megabytes can be requested at a time.
The memory allocator 104 can use a slab management mechanism for local management of the server's memory. Specifically, the memory allocator 104 can use different orders to manage free memory of different sizes, each order representing a linked list of fixed-size memory blocks.
For example, referring to the schematic structural diagram of the management mechanism shown in FIG. 10, the memory allocator 104 includes 11 orders, the 0th through the 10th, which in turn represent linked lists of memory blocks of 8B, 16B, 32B, 64B, 128B, 256B, 512B, 1KB, 2KB, 4KB and 8KB. With these 11 orders, memory allocation of up to 8KB can be supported (excluding some metadata overhead, key-value pair data of up to 4KB can be stored).
It should be noted that when the slab management mechanism is used, the number of orders and the memory block size corresponding to each order can be set according to actual needs; the embodiment shown in FIG. 10 is only an example and does not limit how the memory allocator 104 manages host memory.
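A minimal Python sketch of the order-based free-list scheme of FIG. 10: a pre-applied region is carved into fixed-size blocks from 8 B (order 0) to 8 KB (order 10). The even split of the region across orders and the class interface are policy choices made up for this sketch, not details from the disclosure.

    ORDERS = [8 << i for i in range(11)]   # 8B ... 8KB, orders 0..10

    def order_for(size):
        # Smallest order whose fixed block size can hold `size` bytes.
        for o, blk in enumerate(ORDERS):
            if size <= blk:
                return o
        raise ValueError("larger than the 8KB maximum order")

    class SlabAllocator:
        def __init__(self, region_base, region_size):
            # One free list of host-memory offsets per order; the whole
            # pre-applied region is split evenly across the orders here.
            self.free = {o: [] for o in range(len(ORDERS))}
            off, share = region_base, region_size // len(ORDERS)
            for o, blk in enumerate(ORDERS):
                for _ in range(share // blk):
                    self.free[o].append(off)
                    off += blk

        def alloc(self, size):
            for o in range(order_for(size), len(ORDERS)):
                if self.free[o]:
                    return self.free[o].pop(), o   # no server-CPU round trip
            raise MemoryError("pre-apply another region from the server CPU")

        def free_block(self, addr, order):
            self.free[order].append(addr)

    allocator = SlabAllocator(region_base=0x0, region_size=256 << 20)  # 256 MB
    addr, order = allocator.alloc(100)   # served from the 128 B order (order 4)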
Through the memory allocator's local management mechanism for the server's memory, when the data processing system deletes or writes key-value data it can complete the allocation and release of the server's memory resources by operating the memory allocator, without interacting with the server CPU, significantly reducing the overall performance overhead of the data processing system, thereby lowering operation latency and improving throughput.
As can be seen from the foregoing embodiments, the execution engine module 102 is the core functional module in the data processing device 100 connecting the other component modules. To further improve the efficiency with which the data processing system executes data operation requests, in the data processing device 100 provided by the present disclosure a data relay station can be set in the smart network card to reduce the repeated accesses to the server's memory caused by operations on hot key-value data, thereby optimizing the overall performance of the data processing system. FIG. 11 is a schematic structural diagram of a data processing system provided by an embodiment of the present disclosure. Referring to FIG. 11, the data processing device 100 further includes: a data relay station 105 set in the memory of the smart network card.
The data relay station 105 is mainly used to cache key-value pair data that meets preset requirements. When the execution engine module 102 executes a data operation request, it can first access the data relay station for matching. For example, the data relay station can cache data whose size meets preset requirements (e.g., key-value pairs whose key and value are each within 8 bytes).
As a possible implementation, the data relay station 105 can be implemented with a structure based on a chained hash index; with this structure, operations on the same key are mapped to a fixed linked list for matching each time.
FIG. 12 exemplarily shows a schematic structural diagram of the data relay station. The data relay station 105 includes multiple linked list structures, each corresponding to an identifier (the hash ID shown in FIG. 12). The key of the key-value pair data to be processed is processed with a hash algorithm to obtain a hash value, which is looked up against the hash ID of each linked list to map to the corresponding target linked list structure, in which the operation is performed.
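A sketch of the chained-hash relay station just described: the key hashes to one fixed list, reads return the newest version, and writes/deletes prepend a typed item and purge older items with the same key. The list count is an assumption, and the atomicity of the prepend is assumed to be provided by the hardware/firmware rather than modeled here.

    NUM_LISTS = 1024  # number of hash-ID linked lists (assumed)

    class DataRelayStation:
        def __init__(self):
            self.lists = [[] for _ in range(NUM_LISTS)]  # one list per hash ID

        def _target(self, key):
            return self.lists[hash(key) % NUM_LISTS]

        def get(self, key):
            for op, k, v in self._target(key):   # newest items sit at the head
                if k == key:
                    return v if op == "PUT" else None  # DEL: latest is deleted
            return None                           # key not cached here

        def put(self, key, value):
            lst = self._target(key)
            lst.insert(0, ("PUT", key, value))    # atomic prepend in hardware
            lst[1:] = [item for item in lst[1:] if item[1] != key]

        def delete(self, key):
            lst = self._target(key)
            lst.insert(0, ("DEL", key, None))
            lst[1:] = [item for item in lst[1:] if item[1] != key]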
When processing a data read request, the execution engine module 102 can first match in the data relay station 105 to determine whether a matching key exists there; if the match succeeds, the latest-version value corresponding to the key is returned; if the match fails, null can be returned to indicate that the corresponding key does not exist in the data relay station 105.
When the execution engine module 102 processes a data write request or a data deletion request, it atomically adds a data item to the hashed target linked list structure; the data item includes a field indicating the data operation type, a field holding the key, and a field holding the value. In addition, the execution engine module 102 deletes data items with the same key in the target linked list structure.
Next, the embodiments shown in FIG. 13 to FIG. 15 describe in detail how the data processing system of the embodiment shown in FIG. 11 implements data reading, data deletion and data writing.
FIG. 13 is a flowchart of a data processing method provided by an embodiment of the present disclosure. Referring to FIG. 13, in this embodiment the data operation request sent by the client is a data read request. When the data processing system receives the data read request sent by the client, the request decoder parses it to obtain the key-value pair data to be processed and the data operation type information, which are passed to the execution engine module. The execution engine module first needs to determine whether the key in the data read request meets the requirements of inline storage; for example, this can be judged from the size of the key: if it is within 8 bytes, it meets the inline storage requirement; if it is larger than 8 bytes, it does not.
If the key meets the inline storage requirement, the execution engine module is called to access the data relay station and check whether it holds the latest version of the value for the corresponding key; if the corresponding data item is found, the latest-version value is returned.
If the key meets the inline storage requirement but the latest-version value of the corresponding key is not found in the data relay station, the execution engine module is called to access the index module and query the index structure: the key is mapped to two hash buckets through two hash algorithms, and the target index slot is determined by matching the key in the two hashed buckets. Next, the length of the data to be read and where to read it from are determined from the Flag field and the type field in the target index slot. If the Flag field is 01, the type information is read from the target index slot to determine the data length of the value, the value is read from the target index slot, and the target index slot is then unlocked; if the Flag field is 10, the pointer information is read from the target index slot, and the key-value pair data is read from the server's memory according to the pointer information.
If the key does not meet the inline storage requirement, the key is first mapped to two hash buckets through two hash algorithms (this embodiment takes two hash buckets as an example), the fingerprint digest information of the key is then calculated and matched in the hashed buckets to determine the target index slot. It should be noted that during matching, the fingerprint digest information of the key may match multiple index slots, i.e., there may be multiple target index slots. Next, the key-value pair data is read from the server's memory according to the pointer information in the target index slot; if the key of the key-value pair data read from the server's memory also matches exactly, the value read from the server's memory is returned; if it does not match, the next target index slot is matched and the above process is repeated.
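The whole FIG. 13 read path reduces to a few branches. The Python sketch below reuses the hypothetical helpers from the earlier sketches (DataRelayStation, find_target_slots, decode_way2, KEY_INLINE_MAX); host_memory stands for an assumed PCIe/DMA interface exposing read(addr), Way-2 records are assumed for pointer-referenced data, and per-slot locking is elided.

    def handle_get(key, relay, index, host_memory):
        if len(key) <= KEY_INLINE_MAX:
            cached = relay.get(key)            # 1) relay-station fast path
            if cached is not None:
                return cached
        for bucket, i in find_target_slots(index, key):   # 2) two-way index
            slot = bucket.slots[i]
            if slot.flag == 0b01:              # value inline in the slot
                return slot.value_info
            record = host_memory.read(slot.value_info)    # follow the pointer
            k, value = decode_way2(record)     # Way-2 layout assumed
            if k == key:                       # reject fingerprint aliases
                return value
        return None                            # key absent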
FIG. 14 is a flowchart of a data processing method provided by an embodiment of the present disclosure. Referring to FIG. 14, in this embodiment the data operation request sent by the client is a data write request. When the data processing system receives the data write request sent by the client, the request decoder parses it to obtain the key-value pair data to be processed and the data operation type information, which are passed to the execution engine module. The execution engine module first needs to determine whether the key in the data write request meets the requirements of inline storage; for example, this can be judged from the size of the key: if it is within 8 bytes, it meets the inline storage requirement; if it is larger than 8 bytes, it does not.
If the key meets the inline storage requirement, it must be further determined whether the key-value pair data to be written is small key-value pair data. If it is, it needs to be written into the data relay station; if it is not, the data needs to be written into the target index slot or the server's memory.
Specifically, if the size of the value meets the preset requirement (for example, 8 bytes), the key is hashed to the target linked list structure in the data relay station and a PUT data item is added at the head of the target linked list structure; the target linked list structure is then traversed to delete all data items containing the key, after which the process can return, and the subsequent steps can be executed asynchronously.
The steps to be executed asynchronously shown in FIG. 14 are the same as the steps the execution engine module executes synchronously when the value does not meet the preset requirement: the key is hashed into two hash buckets by two hash algorithms (the embodiment shown in FIG. 14 takes two hash buckets as an example) and matched in the two buckets according to the key; if the match succeeds, the server memory space occupied by the old key-value pair data is released. Then the execution engine module determines whether the value meets the requirements of inline storage; for example, on the basis of the index slot data structure of the embodiment shown in FIG. 6, whether the value meets the inline storage requirement can be determined by judging whether the value is smaller than 6 bytes. If it is determined that the value can be stored inline, the target index slot is filled according to the key-value pair data and the process returns; if the value cannot be stored inline, whether memory resources are sufficient must be determined through the memory allocator. If memory resources are sufficient, memory space is allocated directly for the key-value pair data, the key-value pair data is filled into the allocated memory space, and the target index slot is then filled and the process returns; if memory resources are insufficient, the server CPU is requested to pre-allocate a region of memory (the size of the requested memory can be set flexibly), order-based management is established for that region, memory space is then allocated for the key-value pair data to be written, the key-value pair is filled into the allocated memory space, and the index slot is then filled and the process returns.
If the key does not meet the inline storage requirement, the execution engine module can process the key with a hash algorithm to hash it into two hash buckets (the embodiment shown in FIG. 14 takes two hash buckets as an example), then calculate the fingerprint digest information of the key and match index slots in the buckets according to it; if a match succeeds, the execution engine module reads the corresponding key-value pair data from the server's memory and checks whether the key is consistent. If it is consistent, the execution engine module releases the memory space occupied by the old key-value pair data and then executes the key-value pair write flow; if no match succeeds in either hash bucket based on the key's fingerprint digest information, a free index slot is allocated and locked for the key-value pair data currently to be written, and the key-value pair write flow is then executed.
The key-value pair write flow can refer to the description above, in the asynchronously executed steps, of writing according to whether the size of the value meets the inline storage requirement; for brevity, it is not repeated here.
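A condensed Python sketch of the FIG. 14 decision tree, again over the hypothetical helpers from the earlier sketches. install_slot is a stand-in for the match-or-allocate-and-fill steps (including reclaiming any old pointer and Lockmap locking), and host_memory.write(addr, bytes) is an assumed interface; the relay fast path returns to the client first, with the index update then running asynchronously.

    def install_slot(index, key, slot):
        # Minimal stand-in: take the first free slot in either candidate
        # bucket (a real engine would first try to match an existing slot).
        for bid in bucket_ids(key, len(index)):
            i = index[bid].find_free_slot()
            if i is not None:
                index[bid].slots[i] = slot
                index[bid].occupy(i)
                return
        raise RuntimeError("both candidate buckets are full")

    def handle_put(key, value, relay, index, allocator, host_memory):
        if len(key) <= 8 and len(value) <= 8:   # small pair: relay fast path,
            relay.put(key, value)               # reply here; the steps below
                                                # then run asynchronously
        if len(key) <= KEY_INLINE_MAX and len(value) <= VALUE_INLINE_MAX:
            slot = build_slot(0, key, value)    # fully inline (Flag 01);
                                                # type code 0 is made up here
        else:                                   # write through to host memory
            addr, _order = allocator.alloc(4 + len(key) + len(value))
            host_memory.write(addr, encode_way2(key, value))
            slot = build_slot(0, key, value, ptr=addr)
        install_slot(index, key, slot)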
FIG. 15 is a flowchart of a data processing method provided by an embodiment of the present disclosure. Referring to FIG. 15, in this embodiment the data operation request sent by the client is a data deletion request. When the data processing system receives the data deletion request sent by the client, the request decoder parses it to obtain the key-value pair data to be processed and the data operation type information, which are passed to the execution engine module. The execution engine module first needs to determine whether the key in the data deletion request meets the requirements of inline storage; for example, this can be judged from the size of the key: if it is within 8 bytes, it meets the inline storage requirement; if it is larger than 8 bytes, it does not.
If the key meets the inline storage requirement, it must be further determined whether the key-value pair data to be deleted is small key-value pair data. If it is, the data relay station is searched first; if it is not, the search is performed in the server's host memory.
Specifically, if the size of the value meets the preset requirement (for example, 8 bytes), the key is hashed to the target linked list structure in the data relay station, a DEL data item is added at the head of the target linked list structure, and the key-value pair data is written into the DEL data item; after the data items with the same key in the target linked list structure are deleted, the process can return, and the remaining steps can be executed asynchronously. The logic of the asynchronously executed steps is similar to the write operation: the key is hashed into two hash buckets with a hash algorithm (the embodiment shown in FIG. 15 takes two hash buckets as an example) and matched in the two buckets according to the key. If the match succeeds, whether the data is stored in the target index slot or in the server's memory is further determined from the Flag field in the successfully matched target index slot; if the Flag field is 01, the target index slot is unlocked and the memory occupied by the key-value pair data is released; if the Flag field is 10, the key-value pair data is read from the server's memory according to the pointer information in the target index slot, the memory occupied by the key-value pair data is released, the target index slot is unlocked, and the process then returns.
If the key does not meet the inline storage requirement, the key is hashed into two hash buckets with a hash algorithm, the fingerprint digest information of the key is then calculated and matched in the buckets. It should be noted that the number of index slots in the two hash buckets matching the key's fingerprint digest information may be more than one, i.e., there may be multiple target index slots. The pointer information is read from a successfully matched target index slot, and the key-value pair data is read from the server's memory according to it. When the key is determined to be consistent with the key of the read key-value pair data, the memory occupied by the key-value pair data is released, the target index slot is unlocked, and the process then returns; if the key is inconsistent, the pointer information in the next target index slot is read for matching, and the above process is repeated.
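The FIG. 15 delete path, sketched over the same hypothetical helpers (find_target_slots, decode_way2, order_for, the relay station and the bucket/allocator interfaces); Way-2 records and elided locking are again assumptions of this sketch.

    def handle_delete(key, relay, index, allocator, host_memory):
        if len(key) <= 8:                      # small-pair fast path:
            relay.delete(key)                  # return here; rest runs async
        for bucket, i in find_target_slots(index, key):
            slot = bucket.slots[i]
            if slot.flag == 0b01:              # fully inline: free the slot
                bucket.release(i)
                return True
            record = host_memory.read(slot.value_info)
            k, _value = decode_way2(record)
            if k == key:                       # reject fingerprint aliases
                allocator.free_block(slot.value_info, order_for(len(record)))
                bucket.release(i)
                return True
        return False                           # key was not present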
In combination with the foregoing embodiments, the present disclosure migrates some workloads of the server's CPU to the smart network card; the smart network card processes the data operation requests sent by the client and presents the KVS access abstraction to the client node, without requiring the server CPU to participate in the processing of data operation requests. Moreover, through sub-index structure organization with the multi-way index mechanism, the inline storage mechanism, the data relay station acceleration structure, and server memory pre-application with local management via the memory allocator, the data processing system greatly reduces the server's performance bottleneck and the access overhead of the in-memory key-value database; for the client, the low latency brings a better experience.
In the embodiments shown in FIG. 13 to FIG. 15, if the key-value pair data stored in the data relay station has a size requirement on the key but not on the value, then when determining whether to enter the data relay station for a query it is sufficient to judge whether the size of the key meets the preset condition, without judging the size of the value.
In some embodiments of the present disclosure, a data processing device is also provided, including: a network card module, a request analysis module and an execution engine module.
The network card module is used to receive the data operation request sent by the client.
The request analysis module is used to parse the received data operation request to obtain the data to be processed and the data operation type information, and to input the data to be processed and the data operation type information to the execution engine module.
The execution engine module is used to execute, based on the data to be processed, the data operation indicated by the data operation type information to obtain a data operation result.
The request analysis module is also used to encapsulate the data operation result to obtain the response to the data operation request.
The network card module is also used to send the response to the data operation request to the client.
In some embodiments, the request analysis module is used to update, based on the data operation result, the target field in the data structure corresponding to the data operation request to obtain the response to the data operation request.
In some embodiments, the network card module is used to receive multiple data operation requests with the same transaction identifier sent by the client.
In some embodiments, the request analysis module is used to parse the multiple data operation requests with the same transaction identifier separately to obtain multiple data fields and multiple pieces of identical data operation type information, and to concatenate the multiple data fields to be processed according to the sequence indication information included in each of the multiple data operation requests to obtain the data to be processed.
In some embodiments, the data structures of the multiple data operation requests with the same transaction identifier are consistent.
In some embodiments, the execution engine module is used to determine, based on the data to be processed, the target index slot corresponding to the data to be processed from the index structure stored in the memory included in the smart network card, and to execute, for the target index slot, the data operation indicated by the data operation type information to obtain the data operation result.
In some embodiments, the index structure is implemented with hash buckets, and a hash bucket includes multiple index slots.
In some embodiments, the execution engine module is used to perform hash calculation on the data to be processed to obtain a hash value, match in the index structure based on the hash value to obtain the successfully matched hash buckets, match in the index slots included in the successfully matched hash buckets based on the data to be processed to obtain a matching result, and determine the target index slot based on the matching result.
In some embodiments, the execution engine module is used to perform hash calculations on the data to be processed with multiple preset hash algorithms respectively to obtain multiple hash values, and to match the multiple hash values against the identifiers of the hash buckets included in the index structure to obtain multiple successfully matched hash buckets.
In some embodiments, the execution engine module is used to match in the successfully matched hash buckets based on the data to be processed; or, the execution engine module is used to calculate the fingerprint digest information corresponding to the data to be processed and to match in the successfully matched hash buckets based on the fingerprint digest information.
In some embodiments, when the data operation request is a data read request, the execution engine module is used to: when it is determined that both the data to be processed and the target data corresponding to the data to be processed are stored inline, read the target data indicated by the data to be processed from the target index slot; and when it is determined that the data to be processed is stored inline and the corresponding target data is stored non-inline, or that the data to be processed is stored non-inline, obtain the pointer information from the target index slot and read the target data corresponding to the data to be processed from the server's memory indicated by the pointer information.
In some embodiments, when the data operation request is a data deletion request, the execution engine module is used to: when it is determined that both the data to be processed and the target data indicated by the data to be processed are stored inline, delete the target index slot; and when it is determined that the data to be processed is stored inline and the indicated target data is stored non-inline, or that the data to be processed is stored non-inline, obtain the pointer information from the target index slot, delete the data in the server's memory indicated by the pointer information, and release the target index slot.
In some embodiments, the execution engine module is used to control the memory management module of the smart network card to release the server's memory indicated by the pointer information, the memory management module being used to manage the server's memory.
Illustratively, embodiments of the present disclosure also provide an electronic device, including: a memory and a processor. The present disclosure does not limit the types of the memory and the processor; the memory and the processor may be connected through a data bus. The memory is configured to store computer program instructions, and the processor is configured to execute the computer program instructions, so that the electronic device implements the data processing method shown in any of the above method embodiments.
Illustratively, embodiments of the present disclosure also provide a computer-readable storage medium, including: computer program instructions which, when executed by at least one processor of an electronic device, cause the electronic device to implement the data processing method shown in any of the above method embodiments.
Illustratively, embodiments of the present disclosure also provide a computer program product which, when executed by an electronic device, causes the electronic device to implement the data processing method shown in any of the above method embodiments.
Illustratively, embodiments of the present disclosure also provide a computer program, including: instructions which, when executed by a processor, cause the processor to perform the data processing method shown in any of the above method embodiments.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include" or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device including a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device including that element.
The above are only specific embodiments of the present disclosure, enabling those skilled in the art to understand or implement the present disclosure. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present disclosure. Therefore, the present disclosure is not to be limited to the embodiments described herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (17)

  1. A data processing method, comprising:
    receiving, through a network card module in a smart network card, a data operation request sent by a client;
    calling a request analysis module in the smart network card to parse the data operation request to obtain data to be processed and data operation type information, and inputting the data to be processed and the data operation type information to an execution engine module in the smart network card;
    calling the execution engine module to execute, based on the data to be processed, the data operation indicated by the data operation type information to obtain a data operation result; and
    calling the request analysis module to encapsulate the data operation result to obtain a response to the data operation request, and sending the response to the data operation request to the client through the network card module.
  2. The data processing method according to claim 1, wherein the calling the request analysis module to encapsulate the data operation result to obtain the response to the data operation request comprises:
    calling the request analysis module to update, based on the data operation result, a target field in a data structure corresponding to the data operation request to obtain the response to the data operation request.
  3. The data processing method according to claim 1 or 2, wherein the receiving, through the network card module in the smart network card, the data operation request sent by the client comprises:
    receiving, through the network card module, multiple data operation requests with the same transaction identifier sent by the client.
  4. The data processing method according to claim 3, wherein the calling the request analysis module in the smart network card to parse the data operation request to obtain the data to be processed and the data operation type information comprises:
    calling the request analysis module to parse the multiple data operation requests with the same transaction identifier separately to obtain multiple data fields and multiple pieces of identical data operation type information; and
    calling the request analysis module to concatenate the multiple data fields to be processed according to sequence indication information included in each of the multiple data operation requests to obtain the data to be processed.
  5. The data processing method according to claim 3 or 4, wherein the data structures of the multiple data operation requests with the same transaction identifier are consistent.
  6. The data processing method according to any one of claims 1 to 5, wherein the calling the execution engine module to execute, based on the data to be processed, the data operation indicated by the data operation type information to obtain the data operation result comprises:
    calling the execution engine module to determine, based on the data to be processed, a target index slot corresponding to the data to be processed from an index structure stored in a memory included in the smart network card; and
    calling the execution engine module to execute, for the target index slot, the data operation indicated by the data operation type information to obtain the data operation result.
  7. The data processing method according to claim 6, wherein:
    the index structure is implemented with hash buckets, a hash bucket comprising multiple index slots;
    the determining, through the execution engine module and based on the data to be processed, the target index slot corresponding to the data to be processed from the index structure stored in the memory included in the smart network card comprises:
    calling the execution engine module to perform hash calculation on the data to be processed to obtain a hash value, and matching in the index structure based on the hash value to obtain successfully matched hash buckets; and
    calling the execution engine module to match, based on the data to be processed, in the index slots included in the successfully matched hash buckets to obtain a matching result, and determining the target index slot based on the matching result.
  8. The data processing method according to claim 7, wherein the calling the execution engine module to perform hash calculation on the data to be processed to obtain the hash value and matching in the index structure based on the hash value to obtain the successfully matched hash buckets comprises:
    calling the execution engine module to perform hash calculations on the data to be processed with multiple preset hash algorithms respectively to obtain multiple hash values; and
    calling the execution engine module to match the multiple hash values against identifiers of the hash buckets included in the index structure to obtain multiple successfully matched hash buckets.
  9. The data processing method according to claim 7 or 8, wherein the calling the execution engine module to match, based on the data to be processed, in the index slots included in the successfully matched hash buckets to obtain the matching result comprises:
    calling the execution engine module to match in the successfully matched hash buckets based on the data to be processed; or, calling the execution engine module to calculate fingerprint digest information corresponding to the data to be processed and matching in the successfully matched hash buckets based on the fingerprint digest information.
  10. The data processing method according to claim 6, wherein, when the data operation request is a data read request, the calling the execution engine module to execute, for the target index slot, the data operation indicated by the data operation type information to obtain the data operation result comprises:
    when it is determined that both the data to be processed and target data corresponding to the data to be processed are stored inline, reading the target data indicated by the data to be processed from the target index slot; and
    when it is determined that the data to be processed is stored inline and the target data corresponding to the data to be processed is stored non-inline, or when it is determined that the data to be processed is stored non-inline, obtaining pointer information from the target index slot, and reading the target data corresponding to the data to be processed from the server's memory indicated by the pointer information.
  11. The data processing method according to claim 6, wherein, when the data operation request is a data deletion request, the calling the execution engine module to execute, for the target index slot, the data operation indicated by the data operation type information to obtain the data operation result comprises:
    when it is determined that both the data to be processed and target data indicated by the data to be processed are stored inline, deleting the target index slot; and
    when it is determined that the data to be processed is stored inline and the target data indicated by the data to be processed is stored non-inline, or when it is determined that the data to be processed is stored non-inline, obtaining pointer information from the target index slot, deleting the data in the server's memory indicated by the pointer information, and releasing the target index slot.
  12. The data processing method according to claim 11, wherein the deleting the data in the server's memory indicated by the pointer information comprises:
    controlling, through the execution engine module, a memory management module of the smart network card to release the server's memory indicated by the pointer information, the memory management module being used to manage the server's memory.
  13. A data processing device, comprising:
    a network card module configured to receive a data operation request sent by a client;
    a request analysis module configured to parse the received data operation request to obtain data to be processed and data operation type information, and to input the data to be processed and the data operation type information to an execution engine module; and
    an execution engine module configured to execute, based on the data to be processed, the data operation indicated by the data operation type information to obtain a data operation result;
    the request analysis module being further configured to encapsulate the data operation result to obtain a response to the data operation request;
    the network card module being further configured to send the response to the data operation request to the client.
  14. An electronic device, comprising: a memory and a processor;
    the memory being configured to store computer program instructions;
    the processor being configured to execute the computer program instructions, so that the electronic device implements the data processing method according to any one of claims 1 to 12.
  15. A computer program product which, when executed by an electronic device, causes the electronic device to implement the data processing method according to any one of claims 1 to 12.
  16. A readable storage medium, comprising: computer program instructions which, when executed by at least one processor of an electronic device, cause the electronic device to implement the data processing method according to any one of claims 1 to 12.
  17. A computer program, comprising:
    instructions which, when executed by a processor, cause the processor to perform the data processing method according to any one of claims 1 to 12.
PCT/CN2023/115226 2022-09-22 2023-08-28 Data processing method and device WO2024060934A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211160563.8 2022-09-22
CN202211160563.8A CN117785997A (zh) 2022-09-22 Data processing method and device

Publications (1)

Publication Number Publication Date
WO2024060934A1 true WO2024060934A1 (zh) 2024-03-28

Family

ID=90398787

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/115226 WO2024060934A1 (zh) 2022-09-22 2023-08-28 数据处理方法及装置

Country Status (2)

Country Link
CN (1) CN117785997A (zh)
WO (1) WO2024060934A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150347243A1 (en) * 2014-05-27 2015-12-03 International Business Machines Corporation Multi-way, zero-copy, passive transaction log collection in distributed transaction systems
CN109491602A (zh) * 2018-10-31 2019-03-19 钟祥博谦信息科技有限公司 一种用于Key-Value数据存储的Hash计算方法及系统
CN114285676A (zh) * 2021-11-24 2022-04-05 中科驭数(北京)科技有限公司 智能网卡、智能网卡的网络存储方法和介质
WO2022156650A1 (zh) * 2021-01-21 2022-07-28 华为技术有限公司 访问数据的方法及装置
WO2022206170A1 (zh) * 2021-03-29 2022-10-06 华为技术有限公司 一种数据处理方法、服务端及系统

Also Published As

Publication number Publication date
CN117785997A (zh) 2024-03-29

Similar Documents

Publication Publication Date Title
CN112204513B (zh) 多租户存储系统中的基于组的数据复制
US9727590B2 (en) Data management and indexing across a distributed database
CN106657365B (zh) 一种基于rdma的高并发数据传输方法
CN111277616B (zh) 一种基于rdma的数据传输方法和分布式共享内存系统
CN106933500B (zh) 访问存储在存储系统中的数据对象的方法和系统
US20040117375A1 (en) Using direct memory access for performing database operations between two or more machines
CN111966446B (zh) 一种容器环境下rdma虚拟化方法
CN111078607B (zh) 面向rdma与非易失性内存的网络访问编程框架部署方法及系统
EP1237086A2 (en) Method and apparatus to migrate data using concurrent archive and restore
US9405484B2 (en) System of managing remote resources
Cassell et al. Nessie: A decoupled, client-driven key-value store using RDMA
EP4318251A1 (en) Data access system and method, and device and network card
CN114201421B (zh) 一种数据流处理方法、存储控制节点及可读存储介质
US10708379B1 (en) Dynamic proxy for databases
CN110119304B (zh) 一种中断处理方法、装置及服务器
CN108139927B (zh) 联机事务处理系统中事务的基于动作的路由
CN102981857A (zh) 数据库集群的并行压缩海量数据装载方法
US6725218B1 (en) Computerized database system and method
Luo et al. {SMART}: A {High-Performance} Adaptive Radix Tree for Disaggregated Memory
WO2023246843A1 (zh) 数据处理方法、装置及系统
WO2024060934A1 (zh) 数据处理方法及装置
US9237057B1 (en) Reassignment of a virtual connection from a busiest virtual connection or locality domain to a least busy virtual connection or locality domain
CN112241398A (zh) 一种数据迁移方法和系统
Du et al. Leader confirmation replication for millisecond consensus in private chains
Dalessandro et al. iSER storage target for object-based storage devices

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23867225

Country of ref document: EP

Kind code of ref document: A1