CN115729915A - Data processing method and device, electronic equipment and storage medium - Google Patents


Info

Publication number: CN115729915A
Authority: CN (China)
Prior art keywords: processing, call request, data, layer, target data
Prior art date
Legal status (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed): Pending
Application number: CN202211502934.6A
Other languages: Chinese (zh)
Inventors: 柴云鹏, 赵博瑄, 胡浦云, 查寒天
Current Assignee (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list): Renmin University of China
Original Assignee: Renmin University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Renmin University of China
Priority to CN202211502934.6A
Publication of CN115729915A
Legal status: Pending

Abstract

The application provides a data processing method and apparatus, an electronic device, and a storage medium. The method comprises: in response to an interface call operation initiated by a client, the interface layer obtains the call request generated by the client for that operation and sends it to the parallel layer; the parallel layer parses the call request, determines the target data partition corresponding to the request from among the candidate data partitions in the processing layer, and forwards the request to that partition; and the processing layer processes the request using the processing mode corresponding to the target data partition to obtain a processing result. The method optimizes the storage system's parallel architecture, access granularity, write path, data layout, and garbage collection, and, using a modular design, realizes an asynchronous, high-performance native database storage engine on non-volatile memory.

Description

Data processing method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a data processing method and apparatus, an electronic device, and a storage medium.
Background
Key-value database: i.e., a Key-Value Database, a type of NoSQL database. Key-value stores are generally regarded as the simplest to implement and the fastest-performing kind of database. Their basic storage unit is the key-value pair, consisting of two parts, a key and a value, where each value in the database is uniquely identified by its key. A key-value store imposes no fixed data types: a value can hold almost anything — structured data, integers, strings, or unstructured data such as nested JSON documents, HTML, or base64-encoded images. Keys and values are independent of each other, with no interdependency. A key-value store also has no default query language; operations on key-value pairs are provided through each database's API (application programming interface), with common interfaces including Put, Get, Delete, and Scan.
In conventional key-value databases, a key-value storage system designed for block devices cannot fully exploit the hardware advantages of non-volatile memory: it does not match current hardware devices, its throughput is low, and it cannot meet current workload requirements.
Disclosure of Invention
In view of the above, an object of the present application is to provide a data processing method and apparatus, an electronic device, and a storage medium, so as to overcome the problems in the prior art.
In a first aspect, an embodiment of the present application provides a data processing method, which is applied to a storage system, where the storage system includes an interface layer, a parallel layer, and a processing layer, and the method includes:
in response to an interface call operation initiated by a client, the interface layer obtains the call request generated by the client for that operation and sends the call request to the parallel layer;
the parallel layer parses the call request, determines the target data partition corresponding to the call request from among the candidate data partitions included in the processing layer, and sends the call request to the target data partition;
and the processing layer processes the call request using the processing mode corresponding to the target data partition to obtain a processing result.
In some technical solutions of the present application, the steps executed by the interface layer are executed by a client thread, the steps executed by the parallel layer and the steps executed by the processing layer are executed by a worker thread, and the client thread and the worker thread are different threads.
In some embodiments of the present application, the method further includes:
creating a promise and future pair in the client thread, wherein the future is returned to the client and is used for building an asynchronous callback chain or blocking wait, and the promise object is saved so that, when the worker thread returns, it can deliver the result to the future object.
In some technical solutions of the present application, the processing layer processes the call request by using a processing manner corresponding to the target data partition, including:
queuing the call requests in order in the worker thread corresponding to the target data partition, wherein the call requests are processed sequentially, in their queued order, using the processing mode corresponding to the target data partition.
In some technical solutions of the present application, the call request includes a key-value pair, and the parallel layer parses the call request and determines the target data partition corresponding to the call request from among the candidate data partitions included in the processing layer by:
hashing the key of the call request to obtain the hash value corresponding to the key, and determining the target data partition corresponding to the call request according to that hash value.
In some technical solutions of the present application, the processing mode corresponding to the target data partition includes data storage, and the method processes the call request as follows:
each stored record is verified using a checksum to determine whether the record was completely written.
In some technical solutions of the present application, the processing mode corresponding to the target data partition includes data updating, and the method processes the call request as follows:
the method updates data in an append-write manner.
In a second aspect, an embodiment of the present application provides an apparatus for data processing, where the apparatus resides in a storage system, and the apparatus includes:
the response module is used for responding to an interface calling operation initiated by a client, acquiring a calling request which is generated by the client and corresponds to the interface calling operation, and sending the calling request to the analysis module;
the analysis module is used for parsing the call request, determining the target data partition corresponding to the call request from among the candidate data partitions, and sending the call request to the target data partition;
and the processing module is used for processing the calling request by using a processing mode corresponding to the target data partition to obtain a processing result.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the data processing method when executing the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the above-mentioned data processing method.
The technical solutions provided by the embodiments of the present application can have the following beneficial effects: in response to an interface call operation initiated by a client, the interface layer obtains the call request generated by the client for that operation and sends it to the parallel layer; the parallel layer parses the call request, determines the target data partition corresponding to the request from among the candidate data partitions in the processing layer, and forwards the request to that partition; and the processing layer processes the request using the processing mode corresponding to the target data partition to obtain a processing result. The method optimizes the storage system's parallel architecture, access granularity, write path, data layout, and garbage collection, and, using a modular design, realizes an asynchronous, high-performance native database storage engine on non-volatile memory.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and should therefore not be considered limiting of its scope; for those skilled in the art, other related drawings can be obtained from these drawings without inventive effort.
Fig. 1 is a schematic flowchart illustrating a method for data processing according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a storage system provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a recording structure provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of a process provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of a data processing apparatus provided in an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
To make the purpose, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the drawings in the embodiments. It should be understood that the drawings in the present application are for illustrative and descriptive purposes only and are not used to limit the scope of protection of the present application; additionally, the schematic drawings are not necessarily drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments of the present application. It should be understood that the operations of the flowcharts may be performed out of order, and steps without logical dependency may be performed in reverse order or simultaneously. One skilled in the art, under the guidance of this application, may add one or more other operations to a flowchart or remove one or more operations from it.
In addition, the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, as presented in the figures, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the term "comprising" will be used in the embodiments of the present application to indicate the presence of the features stated hereinafter, but does not exclude the addition of further features.
Key-value store: i.e., Key-Value Database, a NoSQL database; see the description in the Background section above.
Non-volatile memory: Non-Volatile Memory (NVM) is a type of computer memory that retains stored information even after power is removed. The most widely used form of main memory today is volatile random-access memory, meaning that anything held in it is lost when the computer is powered off. Most forms of non-volatile memory have limitations that make them unsuitable for use as main memory.
Non-volatile main memory: i.e., Non-Volatile Main Memory. Non-volatile media are generally unsuitable as main memory because of low performance, limited lifetime, coarse read-write granularity, or high cost. The recent generation of Intel Optane storage uses 3D XPoint technology; it offers high read-write speed and byte addressability, meeting the basic requirements for serving as main memory. Optane packages in DIMM form have therefore emerged, serving as non-volatile random-access memory (NVRAM). Like traditional DRAM, an Optane DIMM exchanges data with the integrated memory controller over the memory bus; in use, the device is inserted into a server DIMM slot and connected to the processor.
Asynchronous programming: i.e., Asynchronous Programming, a programming paradigm that lets a program initiate a potentially long-running task and remain responsive to other events while the task runs, rather than waiting until the task is completed. In asynchronous programming, a thread processes multiple tasks without waiting for their results; when a result is ready, the thread collects it and delivers it.
HiKV: to accommodate the fact that DRAM writes are fast while NVM writes are slow, Xia et al. proposed the HiKV system. Structurally, HiKV adopts a parallel model combining data partitioning with a single worker thread per partition, so that the data in each partition is processed by its own thread. For data layout, HiKV builds an NVM-based hash index in each data partition — each partition has a separate hash index for fast lookup of its data — while the system shares a single B+-tree index in DRAM.
FlatStore: to match the 256-byte hardware write granularity of NVM and avoid the mismatch between access granularity and NVM hardware caused by small-granularity writes, FlatStore proposes a dual storage structure combining a log structure with segment storage. Each write operation is appended to the tail of the log structure, and log contents are distinguished by key-value pair size: small key-value pairs are recorded directly in the log; large key-value pairs are stored in a span of space allocated from the segment storage structure, with a persistent pointer to that space stored in the log entry. FlatStore also designs a multi-threaded batching mode: while processing write requests, the threads elect a leader thread, which collects the write operations of all threads over a period of time and persists them together, reducing the frequency of persistence operations.
Viper: since the sequential-write latency of NVM is close to that of DRAM, and NVM hardware can coalesce multiple sequential writes and flush them to the persistent medium together, Viper organizes data into 4 KB data segments and writes data directly, without DRAM buffering, to minimize NVM write latency. For multi-threaded parallelism, a Viper write operation executes in parallel across multiple threads, each corresponding to a different memory channel. To avoid NVM random-write latency, Viper adopts a key-value separation strategy: persistent storage of the key-value data is handled by the NVM, while the index is placed in DRAM and stores pointers to the key-value data in NVM, improving performance.
A key-value storage system built for block devices cannot fully exploit the hardware advantages of non-volatile memory. In terms of throughput, the hardware of non-volatile memory can sustain 9 GB/s, yet a traditional key-value storage system reaches at most 588 MB/s on the new hardware; and while hardware throughput improves 3.3x from solid-state disk to non-volatile memory, the traditional key-value storage system improves at most 2.37x — that is, design choices within the system compromise the hardware's potential. It is therefore necessary to analyze why traditional key-value storage systems perform poorly on non-volatile memory, and to optimize accordingly to achieve optimal performance on the new hardware.
Based on this, embodiments of the present application provide a method and an apparatus for data processing, an electronic device, and a storage medium, which are described below by way of embodiments.
Fig. 1 shows a schematic flowchart of a method for data processing provided in an embodiment of the present application, which is applied to a storage system, where the storage system includes an interface layer, a parallel layer, and a processing layer, where the method includes steps S101-S103; specifically, the method comprises the following steps:
S101, in response to an interface call operation initiated by a client, the interface layer obtains the call request generated by the client for that operation and sends the call request to the parallel layer;
S102, the parallel layer parses the call request, determines the target data partition corresponding to the call request from among the candidate data partitions included in the processing layer, and sends the call request to the target data partition;
S103, the processing layer processes the call request using the processing mode corresponding to the target data partition to obtain a processing result.
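The three steps above can be sketched as a minimal flow. The layer classes, method names, and the modulo-based partition choice here are illustrative assumptions, not the patent's actual implementation (which uses the seastar framework and the first four bits of the key's hash):

```python
class ProcessingLayer:
    """S103: each candidate data partition processes requests routed to it."""
    def __init__(self, num_partitions):
        self.partitions = [dict() for _ in range(num_partitions)]

    def handle(self, partition, request):
        data = self.partitions[partition]
        if request["op"] == "put":
            data[request["key"]] = request["value"]
            return "ok"
        return data.get(request["key"])

class ParallelLayer:
    """S102: parse the request and pick the target data partition by key hash."""
    def __init__(self, processing):
        self.processing = processing

    def dispatch(self, request):
        partition = hash(request["key"]) % len(self.processing.partitions)
        return self.processing.handle(partition, request)

class InterfaceLayer:
    """S101: receive the client's call request and forward it to the parallel layer."""
    def __init__(self, parallel):
        self.parallel = parallel

    def on_call(self, request):
        return self.parallel.dispatch(request)
```

A Put routed through the three layers lands in one partition, and a later Get for the same key is routed to that same partition because the hash of the key is stable.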
The method optimizes the storage system's parallel architecture, access granularity, write path, data layout, and garbage collection, and, using a modular design, realizes an asynchronous, high-performance native database storage engine on non-volatile memory.
Some embodiments of the present application are described in detail below. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
The method of the present application is applied to a storage system. As shown in fig. 2, the storage system of the present application comprises an interface layer, a parallel layer, and a processing layer, where the processing layer includes an index and a storage layer. Interface layer: the user calls an interface to initiate a call request to the storage system; the call request may be initiated locally or remotely via RPC. Depending on the specific type of interface called, the client may block waiting for the result or continue performing other tasks. Parallel layer: forwards the user's request to the corresponding data partition, implemented with the seastar framework. Processing layer: each data partition processes the user's requests, looks up the key in the HotTree index, and reads data from storage or stores the user's data. The processing layer includes multiple data partitions, each comprising a CPU core and its data. Each data partition also has corresponding DRAM and PMEM resources: the index, the data-block cache, and the persistent files. Each persistent file contains a file-header data block and multiple file data blocks.
S101, in response to an interface call operation initiated by a client, the interface layer obtains the call request generated by the client for that operation and sends the call request to the parallel layer.
When processing data, the user performs an interface call operation through the client; the client generates a call request corresponding to the interface call operation and sends it to the storage system. Clients include mobile phones, computers, tablets, and the like. The call operation performed at the client may be entering a corresponding operation command or selecting an interface — any operation that invokes the function of the calling interface. In the present application, the interface layer receives the call request, which includes a key-value pair, and sends the call request to the parallel layer.
S102, the parallel layer parses the call request, determines the target data partition corresponding to the call request from among the candidate data partitions included in the processing layer, and sends the call request to the target data partition.
Because the processing layer of the present application includes multiple data partitions, after receiving the call request sent by the interface layer, the parallel layer must parse the request to determine its target data partition. Parsing the call request means: hashing the key of the call request to obtain the hash value corresponding to the key, and determining the target data partition according to that hash value. Specifically, the key of the call request is hashed, and the first four bits of the hash value are taken as the key's partition assignment. That is, all query, insert, update, and delete operations on the key are completed in that target data partition.
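A minimal sketch of this partition-selection step. SHA-1 is an assumption — the patent does not name the hash function — and taking the top four bits of the first hash byte yields one of 16 partitions:

```python
import hashlib

NUM_PARTITIONS = 16  # the first four bits of the hash select among 16 partitions

def partition_of(key: bytes) -> int:
    """Map a key to its target data partition using the first four bits of its hash.
    The hash function (SHA-1 here) is an assumption for illustration."""
    digest = hashlib.sha1(key).digest()
    return digest[0] >> 4  # top 4 bits of the first hash byte

```

Because the mapping depends only on the key, every query, insert, update, and delete for a given key is routed to the same partition.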
After determining the target data partition for the call request, the parallel layer sends the call request to the target data partition of the processing layer.
S103, the processing layer processes the call request by using the processing mode corresponding to the target data partition to obtain a processing result.
The target data partition processes the call request, which includes data storage and data updating. For data storage, writes are organized in a log structure, so multiple versions of a data item may coexist in the storage system at the same time. Updating data by append writing ensures that persistent memory is accessed with sequential writes, increasing overall write throughput; however, for hot data that is frequently updated within a short period, continuous appending leaves a large amount of invalid data in the data pages, wasting persistent-memory space. To optimize this space overhead, the present application uses a Slab data structure to cache frequently updated hot data outside the log structure. The Slab data structure is commonly used in in-memory key-value storage systems: it divides data into size classes and stores all data falling within a given size range in an array-like structure whose slot size is the maximum of that range. Each array independently maintains a free list, so elements of similar size are grouped together; if an element is deleted, its free slot can be reused by subsequent elements, avoiding the effects of data fragmentation.
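The size-class idea above can be illustrated with a tiny sketch that maps a record to the smallest fitting Slab class, using the thresholds given later in the implementation section (64 B, 256 B, 1024 B, 4096 B); the function name is an assumption:

```python
# Size-class thresholds from the implementation section: 64 B, 256 B, 1024 B, 4096 B.
SLAB_CLASSES = [64, 256, 1024, 4096]

def slab_class(record_size: int) -> int:
    """Return the slot size of the smallest Slab class that fits the record,
    or -1 if the record exceeds every class and stays in the log structure."""
    for slot_size in SLAB_CLASSES:
        if record_size <= slot_size:
            return slot_size
    return -1
```

Records of similar size share an array of fixed-size slots, so a deleted record's slot can be reused by the next record of that class.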
As shown in fig. 3, the storage system uses a checksum so that, during restart, it can detect whether each record stored in non-volatile memory was completely written. A record consists of a data header and a data item; the complete data class is called DataEntry. The data header is the data structure that guarantees the persistent consistency of the record, and the data item records the metadata and the key-value pair data. The header is 8 bytes — a 4-byte checksum and a 4-byte suffix length — exactly the atomic write granularity of non-volatile memory. The suffix length is the length of the data record immediately following the header. Before persistence, check values are computed for the metadata, the key, and the value respectively, and the three results are summed to obtain the checksum of the whole data record, which is stored in the header. Because the header is the first part written during record persistence, once writing begins the header is guaranteed to be written completely. This design ensures data consistency after the storage system restarts: reading each record's data header yields the record's length and the stored checksum c1; the checksum c2 over the suffix-length bytes following the header is recomputed; if c1 = c2, the record was completely persisted and is accepted by the restarted storage system; if c1 != c2, the record was not completely persisted, so the storage system discards the incomplete record and clears that data area. Because the storage system never informs the upper-layer application that a write has completed before the data is fully persisted, the upper layer need not worry about data inconsistency.
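A hedged sketch of this DataEntry layout: an 8-byte header holding a 4-byte checksum and a 4-byte suffix length, with the checksum formed by summing per-field check values. CRC32 as the check function and the explicit meta_len/key_len arguments are assumptions for illustration — in the patent, the field lengths would come from the record's metadata:

```python
import struct
import zlib

HEADER = struct.Struct("<II")  # 4-byte checksum + 4-byte suffix length = 8 bytes

def make_record(meta: bytes, key: bytes, value: bytes) -> bytes:
    """Build a DataEntry-style record: 8-byte header followed by the payload.
    The whole-record checksum is the sum of per-field check values."""
    payload = meta + key + value
    checksum = (zlib.crc32(meta) + zlib.crc32(key) + zlib.crc32(value)) & 0xFFFFFFFF
    return HEADER.pack(checksum, len(payload)) + payload

def record_complete(buf: bytes, meta_len: int, key_len: int) -> bool:
    """Restart-time check: recompute c2 over the suffix and compare with c1."""
    c1, suffix_len = HEADER.unpack_from(buf)
    suffix = buf[HEADER.size:HEADER.size + suffix_len]
    if len(suffix) < suffix_len:
        return False  # torn write: suffix shorter than the header claims
    meta, rest = suffix[:meta_len], suffix[meta_len:]
    key, value = rest[:key_len], rest[key_len:]
    c2 = (zlib.crc32(meta) + zlib.crc32(key) + zlib.crc32(value)) & 0xFFFFFFFF
    return c1 == c2
```

A truncated or corrupted suffix makes c2 differ from c1, so the restarted system discards the record, mirroring the c1 != c2 case described above.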
Persistent Slab: as noted above, append-write updates keep persistent-memory access sequential and increase overall write throughput, but hot data that is frequently updated within a short period leaves a large amount of invalid data in the data pages, wasting persistent-memory space. To address this problem, the storage system uses a Persistent Slab data structure to hold frequently updated data, grouping records into size classes with per-class free lists as described above.
In the implementation, the Slab size-class thresholds are set to 64 B, 256 B, 1024 B, and 4096 B, and each Slab group holds 1000 data records. Each Slab group has one bitmap for free-slot management, one LRU cache that controls data eviction within the Slab, and a counter with a TTL function; these three modules live in DRAM and are unrelated to persistence — the only persistent part is the Slab group's array. When the storage layer performs a write, the data is written directly into a log-structured data page. When the storage layer's update interface is called, the LRU cache is first checked for a data-slot hit: on a hit with no snapshot reference, the slot is updated in place in the array; on a hit with a snapshot reference, the data is written to a new slot, the referenced slot is added to a reclamation queue, and the slot is reclaimed when the snapshot is released. On a miss, the TTL counter is updated; when the count exceeds 3, the slot of an evicted record in the Slab is replaced, the evicted record is persisted to a log-structured data page together with the access count its TTL counter recorded over roughly the last minute, and the TTL is refreshed on every access.
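The update path just described can be sketched as follows. This is a simplified illustration — eviction, the persistent bitmap, and the per-minute TTL window are omitted, and all names are assumptions:

```python
class SlabGroup:
    """Simplified sketch of one Slab group's update path (names are assumptions)."""
    def __init__(self, slots=1000):
        self.slots = [None] * slots      # stands in for the persistent array
        self.lru = {}                    # DRAM: key -> slot index (LRU cache stand-in)
        self.snapshot_refs = set()       # slot indices pinned by a snapshot
        self.ttl_count = {}              # DRAM: TTL counters per key
        self.reclaim_queue = []          # slots to reclaim when snapshots release
        self.free = list(range(slots))

    def update(self, key, value, log_append):
        idx = self.lru.get(key)
        if idx is not None and idx not in self.snapshot_refs:
            self.slots[idx] = value              # cache hit, no snapshot: in-place update
        elif idx is not None:
            self.reclaim_queue.append(idx)       # snapshot-referenced: reclaim later
            new_idx = self.free.pop()
            self.slots[new_idx] = value          # write into a fresh slot
            self.lru[key] = new_idx
        else:
            n = self.ttl_count.get(key, 0) + 1   # cache miss: bump the TTL counter
            self.ttl_count[key] = n
            if n > 3:                            # hot enough: admit into the Slab
                new_idx = self.free.pop()
                self.slots[new_idx] = value
                self.lru[key] = new_idx
            else:
                log_append(key, value)           # cold: stays in the log structure
```

The first few updates of a key go to the log; once the TTL counter exceeds 3 the key is admitted to the Slab, and later updates hit its slot in place.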
In this embodiment, optionally, the steps executed by the interface layer are executed by a client thread, the steps executed by the parallel layer and by the processing layer are executed by a worker thread, and the client thread and the worker thread are different threads.
A promise and future pair is created in the client thread; the future is returned to the client and used to build an asynchronous callback chain or to block waiting, while the promise object is saved so that, when the worker thread returns, it can deliver the result to the future object.
The call requests are queued in order in the worker thread corresponding to the target data partition and are processed sequentially, in their queued order, using the processing mode corresponding to the target data partition.
When the above steps are implemented, the threads are separated. After the client initiates a request, the client-thread part of the database receives the request. To notify the client of the result, the system creates a promise/future pair. The future half is returned to the client for building an asynchronous callback chain, a blocking wait, and so on. The promise object is retained so that, after the worker thread returns, it can deliver the result to the future object. The system then hashes the request's key and selects the first four bits of the hash value as the key's partition attribution; that is, all query, insert, update, and delete operations on the key are completed within that partition. The system then submits the request's key-value pair to the corresponding worker thread through the seastar framework and obtains a cross-thread future object for retrieving the call result. To obtain the call result, the system binds a callback function to the future object and captures the previously created promise object in the callback's closure. When the worker thread returns the result, the callback function uses the promise object to notify the client of the result.
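The promise/future hand-off between the client thread and the worker threads can be sketched with Python's `concurrent.futures.Future` standing in for seastar's promise/future pair; all names here are illustrative assumptions, not the patented code:

```python
import hashlib
import queue
import threading
from concurrent.futures import Future

NUM_PARTITIONS = 16  # first four bits of the hash -> 16 partitions
work_queues = [queue.Queue() for _ in range(NUM_PARTITIONS)]
stores = [dict() for _ in range(NUM_PARTITIONS)]

def partition_of(key: bytes) -> int:
    """Hash the key and take the top four bits as the partition attribution."""
    return hashlib.sha256(key).digest()[0] >> 4

def worker_loop(pid: int) -> None:
    """Worker-thread event loop: pop requests and fulfil their promises."""
    while True:
        op, key, value, fut = work_queues[pid].get()
        if op == "stop":
            break
        if op == "put":
            stores[pid][key] = value
            fut.set_result(True)          # promise side: deliver the result
        elif op == "get":
            fut.set_result(stores[pid].get(key))

def submit(op: str, key: bytes, value=None) -> Future:
    """Client-thread side: create the future, enqueue the request, return it."""
    fut = Future()                        # plays the role of the promise/future pair
    work_queues[partition_of(key)].put((op, key, value, fut))
    return fut

workers = [threading.Thread(target=worker_loop, args=(i,), daemon=True)
           for i in range(NUM_PARTITIONS)]
for t in workers:
    t.start()
```

A caller can then block on `submit("get", key).result()` or attach a callback with `add_done_callback`, mirroring the blocking-wait and callback-chain options described above.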
In the present application, the data in the data partitions is mutually independent, and the storage system assigns a dedicated background thread to process each partition's data. Processing a read-write call request of the storage system involves 2 threads: a client thread that calls the system interface and a worker thread bound to a CPU core. In the data flow for processing a read-write request, the call request is mapped to the corresponding data partition in the client thread, which then forwards the request to the worker thread's queue; the worker thread fetches the call request from the queue while executing its event loop. Read-write processing is performed in the data partition according to the data in the call request, and after it completes, the client thread is notified of the processing result through a callback function. This parallel model incurs one data forwarding and one thread switch between the call interface and the data processing.
Specifically, as shown in fig. 4, the client-thread steps executed in the present application include: the caller calls the system interface, the parameters are encoded according to the interface, the hash value of the key is computed, a (data) partition is selected according to the hash value, the (call) request is inserted into the queue, the future object is obtained, the call returns from the (storage) system interface, the future object completes, and the callback function executes. The worker-thread steps include: batch-fetching requests from the queue and judging the request type; if the request type is put, the storage layer writes and the index layer writes; if the request type is get, the index layer looks up and the storage layer is accessed; if the request type is delete, the index layer looks up the entry to delete and deletes it.
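The worker-thread dispatch of fig. 4 can be sketched as follows; the append-only list standing in for the storage layer and the dict standing in for the index layer are simplifying assumptions, not the patented structures:

```python
class Partition:
    """Toy partition: an index layer over an append-only storage layer."""

    def __init__(self):
        self.storage = []   # storage layer: append-only log of values
        self.index = {}     # index layer: key -> position in the log

    def handle(self, op, key, value=None):
        if op == "put":
            self.storage.append(value)                # storage-layer write
            self.index[key] = len(self.storage) - 1   # index-layer write
        elif op == "get":
            pos = self.index.get(key)                 # index-layer lookup
            return None if pos is None else self.storage[pos]  # storage access
        elif op == "delete":
            self.index.pop(key, None)                 # index-layer delete
```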
Fig. 5 is a schematic structural diagram of a data processing apparatus provided in an embodiment of the present application, where the apparatus resides in a storage system, and the apparatus includes:
the response module is used for responding to an interface calling operation initiated by a client, acquiring a calling request which is generated by the client and corresponds to the interface calling operation, and sending the calling request to the analysis module;
the analysis module is used for analyzing the call request, determining a target data partition corresponding to the call request from the data partitions to be selected, and sending the call request to the target data partition;
and the processing module is used for processing the calling request by using a processing mode corresponding to the target data partition to obtain a processing result.
The steps performed by the response module are performed by a client thread, the steps performed by the analysis module and the steps performed by the processing module are performed by a worker thread, and the client thread and the worker thread are different threads.
A creation module for creating a promise and future pair in the client thread; wherein the future is returned to the client and is used for building an asynchronous callback chain or for blocking waits; the promise object is retained so that, after the worker thread returns, it can deliver the result to the future object.
The processing module processes the call request by using a processing mode corresponding to the target data partition, and includes:
and sequentially arranging the call requests in the working threads corresponding to the target data partitions, wherein the call requests are sequentially processed according to the arrangement sequence of the call requests in the working threads by the processing mode corresponding to the target data partitions.
The call request comprises a key-value pair; the analysis module analyzes the call request and determines a target data partition corresponding to the call request from the candidate data partitions included in the processing module, which comprises the following steps:
and carrying out hash operation on the key of the call request to obtain a hash value corresponding to the key, and determining a target data partition corresponding to the call request according to the hash value corresponding to the key.
The processing mode corresponding to the target data partition comprises data storage, and the method processes the call request in the following modes:
each stored record is checked by means of a checksum to determine whether the record is written.
The processing mode corresponding to the target data partition comprises data updating, and the method processes the call request in the following mode:
the method updates data in an additional writing mode.
As shown in fig. 6, an embodiment of the present application provides an electronic device for executing the data processing method of the present application, where the device includes a memory, a processor, a bus, and a computer program stored in the memory and executable on the processor, and the processor implements the steps of the data processing method when executing the computer program.
The memory and the processor may be a general-purpose memory and processor, which are not specifically limited here; the data processing method can be executed when the processor runs the computer program stored in the memory.
Corresponding to the method for data processing in the present application, the present application also provides a computer readable storage medium, on which a computer program is stored, and the computer program is executed by a processor to perform the steps of the method for data processing.
In particular, the storage medium can be a general-purpose storage medium, such as a removable disk or a hard disk, and when the computer program on the storage medium is run, the above data processing method can be executed.
In the embodiments provided in the present application, it should be understood that the disclosed system and method may be implemented in other ways. The above-described system embodiments are merely illustrative, and for example, the division of the units is only one type of logical functional division, and other divisions may be realized in practice, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of systems or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments provided in the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solutions of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined or explained in subsequent figures, and moreover, the terms "first," "second," "third," etc. are used merely to distinguish one description from another, and are not to be construed as indicating or implying relative importance.
Finally, it should be noted that the above-mentioned embodiments are only specific embodiments of the present application, used to illustrate the technical solutions of the present application rather than to limit them, and the protection scope of the present application is not limited thereto. Although the present application is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person skilled in the art may still modify the technical solutions described in the foregoing embodiments, or readily conceive of changes, or make equivalent substitutions for some of the technical features, within the technical scope disclosed in the present application; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present application and are intended to be covered by the protection scope of this application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method of data processing, applied to a storage system, the storage system comprising an interface layer, a parallel layer, and a processing layer, the method comprising:
responding to an interface calling operation initiated by a client, acquiring a calling request which is generated by the client and corresponds to the interface calling operation by the interface layer, and sending the calling request to the parallel layer;
the parallel layer analyzes the call request, determines a target data partition corresponding to the call request from the candidate data partitions included in the processing layer, and sends the call request to the target data partition;
and the processing layer processes the call request by using a processing mode corresponding to the target data partition to obtain a processing result.
2. The method of claim 1, wherein the steps performed by the interface layer are performed by a client thread, the steps performed by the parallel layer and the steps performed by the processing layer are performed by a worker thread, and the client thread and the worker thread are different threads.
3. The method of claim 2, further comprising:
creating a promise and future pair in the client thread; wherein the future is returned to the client and is used for building an asynchronous callback chain or for blocking waits; the promise object is retained so that, after the worker thread returns, it can deliver the result to the future object.
4. The method of claim 1, wherein the processing layer processes the call request using a processing manner corresponding to the target data partition, including:
and sequentially arranging the call requests in the working threads corresponding to the target data partitions, wherein the call requests are sequentially processed according to the arrangement sequence of the call requests in the working threads by the processing mode corresponding to the target data partitions.
5. The method of claim 1, wherein the call request includes a key-value pair, and the parallel layer analyzing the call request and determining a target data partition corresponding to the call request from the candidate data partitions included in the processing layer includes:
and carrying out hash operation on the key of the call request to obtain a hash value corresponding to the key, and determining a target data partition corresponding to the call request according to the hash value corresponding to the key.
6. The method of claim 1, wherein the processing mode corresponding to the target data partition comprises data storage, and the method processes the call request by:
each stored record is checked using a checksum to determine whether the record was written.
7. The method of claim 1, wherein the processing mode corresponding to the target data partition comprises data updating, and the method processes the call request by:
the method updates data in an additional writing mode.
8. An apparatus for data processing, the apparatus residing in a storage system, the apparatus comprising:
the response module is used for responding to an interface calling operation initiated by a client, acquiring a calling request which is generated by the client and corresponds to the interface calling operation, and sending the calling request to the analysis module;
the analysis module is used for analyzing the call request, determining a target data partition corresponding to the call request from the data partitions to be selected, and sending the call request to the target data partition;
and the processing module is used for processing the call request by using the processing mode corresponding to the target data partition to obtain a processing result.
9. An electronic device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when the electronic device is operating, the machine-readable instructions, when executed by the processor, performing the steps of the method of data processing according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, performs the steps of the method of data processing according to one of the claims 1 to 7.
CN202211502934.6A 2022-11-28 2022-11-28 Data processing method and device, electronic equipment and storage medium Pending CN115729915A (en)

Publications (1)

CN115729915A, published 2023-03-03


Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination