WO2023207132A1

WO2023207132A1 - Data storage method and apparatus, and device and medium

Info

Publication number: WO2023207132A1
Application number: PCT/CN2022/138693
Authority: WO
Inventors: 张子奇
Original assignee: 苏州元脑智能科技有限公司
Priority date: 2022-04-28
Filing date: 2022-12-13
Publication date: 2023-11-02
Also published as: CN114579061B; CN114579061A

Abstract

The present application relates to the field of distributed storage. Provided are a data storage method and apparatus, and a device and a medium. The method comprises: determining data to be stored, and determining a data ID of said data by using a preset ID determination method, wherein the preset ID determination method is a determination method based on a storage scenario or a determination method based on a hash algorithm; by means of determining the data ID as a key and determining said data as a value, constructing key-value information corresponding to said data; and by means of a key-value storage interface in a solid state drive, storing, in the solid state drive, the key-value information corresponding to said data.

Description

A data storage method, device, equipment and medium

Cross-references to related applications

This application claims priority to the Chinese patent application filed with the China Patent Office on April 28, 2022, with application number 202210462119.5 and titled "A data storage method, device, equipment and medium", the entire content of which is incorporated into this application by reference.

Technical field

The present invention relates to the field of distributed storage, and in particular, to a data storage method, device, equipment and medium.

Background

Distributed storage systems store data dispersedly on multiple independent devices. They generally adopt a scalable system structure, use multiple storage servers to share the storage load, and use location servers to locate storage information. This not only improves the reliability, availability, and access efficiency of the system, but also makes it easy to expand.

In the local storage systems of current mainstream distributed storage, disks and SSDs (Solid State Drives) are generally used as storage media. When data is stored in the media, some of the data that records data information, that is, metadata, generally needs to be organized through key-value database software (key-value DB, such as RocksDB). Current mainstream solid-state storage media generally use iSCSI (Internet Small Computer System Interface) and NVMe (Non-Volatile Memory Express) interfaces. However, the inventor realized that, in storage software systems, key-value storage is widely used as a storage backend because of its simple and versatile interface. As a result, a current storage system that needs key-value storage must go through conversion across multiple software layers, which makes the system complex and brings huge resource overhead.

It can be seen from the above that in the process of using key-value storage systems, how to avoid the high system complexity, high system overhead, and multiple software layers caused by traditional data storage methods is a problem to be solved in this field.

Summary of the invention

A first aspect of this application provides a data storage method applied to a distributed storage system, including:

Determine the data to be stored, and use a preset name determination method to determine the data name of the data to be stored; the preset name determination method is a determination method based on storage scenarios or a determination method based on a hash algorithm;

Construct key-value information corresponding to the data to be stored by determining the data name as the key and the data to be stored as the value; and

The key-value information corresponding to the data to be stored is stored in the solid-state drive through the key-value storage interface in the solid-state drive.

In some embodiments, before determining the data name of the data to be stored using a preset name determination method, the method further includes:

Obtain the segmentation length through the preset segmentation length acquisition interface;

Segment the data to be stored using the segmentation length to obtain each data block corresponding to the data to be stored, and determine the segmentation offset corresponding to each data block; and

Take the remainder of the segmentation offset with respect to the segmentation length to obtain the remainder result corresponding to each data block, and determine the remainder result corresponding to each data block as the data block sequence number of that data block.

In some embodiments, a preset name determination method is used to determine the data name of the data to be stored, including:

Determine the current storage scenario; and

Use the data block serial number and the current storage scenario to determine the data block name of each data block;

Correspondingly, constructing the key-value information corresponding to the data to be stored by determining the data name as a key and the data to be stored as a value, and storing the key-value information corresponding to the data to be stored in the solid-state drive through the key-value storage interface in the solid-state drive, includes:

Construct the key value information corresponding to the data block by determining the data block name as the key and the data block data in the data block as the value; and

The key-value information corresponding to the data block is stored in the solid-state drive through the key-value storage interface in the solid-state drive.

In some embodiments, the data block serial number and the current storage scenario are used to determine the data name of the data to be stored, including:

In response to the current storage scenario being file storage or object storage, determine the file number corresponding to the data to be stored, and determine the data block name of the data block using the file number and the data block serial number; or, in response to the current storage scenario being block storage, determine the logical unit number corresponding to the data to be stored, and determine the data block name of the data block using the logical unit number and the data block serial number.

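In some embodiments, a preset name determination method is used to determine the data name of the data to be stored, including:

Calculate the hash value corresponding to the data block using a preset hash algorithm; and

Use the hash value as the data block name of the data block.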

In some embodiments, the above data storage method also includes:

Bind the data block serial number corresponding to any data block with the data block name and form a mapping relationship; and

Record the mapping relationship between the data block serial number and the data block name to form a mapping relationship list between the data block serial number and the data block name.

In some of the embodiments, before determining the data to be stored, the method further includes:

Obtain the data to be stored and determine the target object storage device corresponding to the data to be stored; and

Write the data to be stored into the target object storage device;

Accordingly, determine the data to be stored, including:

Extract the data to be stored from the target object storage device.

The second aspect of this application provides a data storage device applied to a distributed storage system, including:

The data name determination module is used to determine the data to be stored, and determine the data name of the data to be stored using a preset name determination method; the preset name determination method is a determination method based on storage scenarios or a determination method based on a hash algorithm;

A key-value information building module, used to construct key-value information corresponding to the data to be stored by determining the data name as a key and the data to be stored as a value; and

The information storage module is used to store the key value information corresponding to the data to be stored in the solid state drive through the key value storage interface in the solid state drive.

A third aspect of this application provides an electronic device, including:

One or more memories for holding computer-readable instructions; and

One or more processors, used to execute computer-readable instructions to implement the aforementioned data storage method.

A fourth aspect of the present application provides a non-volatile computer-readable storage medium for storing computer-readable instructions; wherein the computer-readable instructions, when executed by one or more processors, implement the steps of the aforementioned data storage method.

Description of the drawings

In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only embodiments of the present invention. For those of ordinary skill in the art, other drawings can be obtained based on the provided drawings without creative effort.

Figure 1 is a flow chart of a data storage method provided in one or more embodiments of the present application;

Figure 2 is an overall architecture diagram of a data storage provided in one or more embodiments of the present application;

Figure 3 is an overall architecture diagram of a traditional distributed data storage;

Figure 4 is a flow chart of a specific data storage method provided in one or more embodiments of the present application;

Figure 5 is a flow chart of a specific data storage method provided in one or more embodiments of the present application;

Figure 6 is a schematic structural diagram of a data storage device provided in one or more embodiments of the present application;

Figure 7 is a structural diagram of an electronic device provided in one or more embodiments of the present application.

Detailed description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, rather than all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative efforts fall within the scope of protection of the present invention.

In the existing technology, when a key-value storage system needs to be used, database software must run on the storage node, and data storage is completed through conversion across multiple software layers. The entire process involves many software layers, a complex system, and huge resource overhead. In the method proposed in this application, software such as database software is no longer used to store key-value information. Instead, the data name is used as the key and the data is used as the value, and a solid-state drive with a key-value storage interface is used directly for storage, which can reduce system complexity, the number of software layers, and system overhead.

An embodiment of the present invention discloses a data storage method, which is applied to a distributed storage system. See Figure 1. The method includes:

Step S11: Determine the data to be stored, and determine the data name of the data to be stored using a preset name determination method; the preset name determination method is a determination method based on storage scenarios or a determination method based on a hash algorithm.

In this embodiment, before determining the data to be stored, the method may also include: obtaining the data to be stored and determining a target object storage device corresponding to the data to be stored; and writing the data to be stored into the target object storage device. Correspondingly, determining the data to be stored includes: extracting the data to be stored from the target object storage device. In this embodiment, the data to be stored is data pre-written in the object storage device (i.e., Object Storage Device, OSD). In a specific implementation, the data to be stored may be data taken out from a preset data pool and stored in the corresponding target object storage device.

Step S12: Construct key value information corresponding to the data to be stored by determining the data name as a key and the data to be stored as a value.

In this embodiment, the method avoids the approach of traditional distributed storage, in which hardware resources of the storage node such as CPU and memory are consumed and database software is used to record the storage location of data. Instead, the data name (i.e., data ID) of the data to be stored is determined directly, the data name is determined as the key, and the data to be stored is determined as the value. This can greatly shorten the software stack of distributed storage and reduce IO (Input/Output) latency, thereby improving system performance.

Step S13: Store the key value information corresponding to the data to be stored in the solid state drive through the key value storage interface in the solid state drive.

It can be understood that the solid state drive in this embodiment is a solid state drive with a key-value storage interface, which is hereinafter referred to as KV-SSD. In a specific implementation manner, the solid state drive in this embodiment may also be SSD hardware that can directly provide a key-value interface to the outside world as proposed in the NVMe2.0 protocol.
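Steps S12 and S13 can be illustrated with a minimal sketch. The class `InMemoryKvSsd` and its `put`/`get` methods are hypothetical stand-ins introduced only for illustration; they are not the interface of any real KV-SSD or of the NVMe key-value command set, and the dictionary merely imitates the key-to-location bookkeeping that the drive is described as handling internally.

```python
from typing import Dict, Optional


class InMemoryKvSsd:
    """Hypothetical stand-in for a solid-state drive with a key-value interface."""

    def __init__(self) -> None:
        self._store: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        # In the described scheme the drive itself manages where the value
        # lives; here a dictionary plays that role.
        self._store[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        return self._store.get(key)


def store_data(ssd: InMemoryKvSsd, data_id: str, data: bytes) -> None:
    """Build the key-value information (S12) and store it (S13)."""
    key = data_id.encode("utf-8")  # data name (data ID) is used as the key
    value = data                   # data to be stored is used as the value
    ssd.put(key, value)


ssd = InMemoryKvSsd()
store_data(ssd, "example-data-id", b"payload bytes")
restored = ssd.get(b"example-data-id")
```

In this sketch the storage node keeps no database of its own; it only hands the key and the value to the drive object.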

Figure 2 is an overall architecture diagram of the data storage proposed by this application. The figure shows that, after data interaction with the outside, data can be stored in the data pool and written to the corresponding object storage device. Finally, the data is written through the storage layer into a preset solid-state drive with a key-value storage interface. In this process, the data name (i.e., data ID) can be used directly as the key and the data directly as the value, which completes the construction of the key-value information.

Figure 3 is the overall architecture diagram of traditional distributed data storage. A comparison of Figure 2 and Figure 3 shows that traditional distributed data storage uses database software (i.e., RocksDB in the figure) to process metadata. This application does not use such database software; instead, a KV-SSD is used directly to store key-value information. Metadata management is completed inside the KV-SSD, that is, inside the hardware, which effectively reduces the resource consumption of storage nodes and the complexity of the software layers.

It should be pointed out that this embodiment no longer uses the database software of the traditional distributed storage method to store key-value information. The method of the present invention is applicable as long as the keys and values corresponding to the data to be stored meet the specifications of the preset key-value storage interface.

In this embodiment, the data to be stored is determined first, and the data name of the data to be stored is determined using a preset name determination method, which is a determination method based on storage scenarios or a determination method based on a hash algorithm. The key-value information corresponding to the data to be stored is constructed by determining the data name as a key and the data to be stored as a value, and is then stored in the solid-state drive through the key-value storage interface in the solid-state drive. In this way, the solution avoids consuming hardware resources of the storage node, such as CPU and memory, as in traditional distributed storage, and does not need database software to complete data storage. Instead, the data name is used as the key and the data as the value, and both are stored directly in a solid-state drive with a key-value storage interface. The storage node does not need to manage metadata; metadata management is completed by the main control CPU of the KV-SSD. This reduces the number of software layers, the resource consumption of storage nodes, the system complexity, and the system overhead, thereby improving system performance.

Figure 4 is a flow chart of a specific data storage method provided by an embodiment of the present application. As shown in Figure 4, the method includes:

Step S21: Determine the data to be stored, and obtain the segmentation length through the preset segmentation length acquisition interface.

In a specific implementation, after the data to be stored is determined, it can be segmented, and the segmented data can then be processed. In this case, the segmentation length must first be obtained through the preset segmentation length acquisition interface, and data segmentation is then completed based on that length. In a specific implementation, the segmentation length may be 2MB or 4MB, etc.

Step S22: Segment the data to be stored using the segmentation length to obtain each data block corresponding to the data to be stored, and determine the segmentation offset corresponding to each data block.

In this step, the segmentation length obtained through the preset segmentation length acquisition interface is used to segment the data to be stored, and each segmented data block is obtained. It can be understood that during the segmentation process, the segmentation offset (ie, offset) of each data block will also be determined.
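The segmentation in step S22 can be sketched as follows. The function name `split_into_blocks` is hypothetical, and the sketch assumes that the segmentation offset of a data block is the byte position at which the block starts within the data to be stored; the text does not define the offset more precisely.

```python
from typing import List, Tuple

MIB = 1024 * 1024


def split_into_blocks(data: bytes, seg_len: int) -> List[Tuple[int, bytes]]:
    """Split data into blocks of at most seg_len bytes, paired with offsets."""
    if seg_len <= 0:
        raise ValueError("segmentation length must be positive")
    return [
        (offset, data[offset:offset + seg_len])
        for offset in range(0, len(data), seg_len)
    ]


# 10 MiB of zero bytes split with a 4 MiB segmentation length.
blocks = split_into_blocks(bytes(10 * MIB), 4 * MIB)
offsets = [offset for offset, _ in blocks]
sizes = [len(block) for _, block in blocks]
```

With these inputs the last block is shorter than the segmentation length, because 10 MiB is not a multiple of 4 MiB.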

Step S23: Take the remainder of the segmentation offset with respect to the segmentation length to obtain the remainder result corresponding to each data block, and determine the remainder result corresponding to each data block as the data block sequence number of that data block.

This step determines the data block sequence number corresponding to each data block. The data block sequence number can also be recorded as the data index. Specifically, the remainder of the segmentation offset with respect to the segmentation length is taken; that is, when the segmentation length is 4MB, the data block sequence number is data index = offset % 4MB.
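The formula in step S23 can be written out as a short sketch. The function name `data_index` and the sample offsets are hypothetical; the sketch reproduces the remainder formula literally and makes no assumption about how the offset is measured. Under this formula, any offset that is an exact multiple of the segmentation length yields a remainder of 0.

```python
SEG_LEN = 4 * 1024 * 1024  # 4MB segmentation length, as in the example above


def data_index(offset: int, seg_len: int = SEG_LEN) -> int:
    """Return offset % seg_len, the data block sequence number formula."""
    if seg_len <= 0:
        raise ValueError("segmentation length must be positive")
    return offset % seg_len


# Illustrative offsets only; they are not taken from the source text.
example_indexes = [data_index(o) for o in (0, 4096, SEG_LEN, SEG_LEN + 4096)]
```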

Step S24: Determine the current storage scenario, and use the data block serial number and the current storage scenario to determine the data block name of each data block.

In this embodiment, using the data block serial number and the current storage scenario to determine the data block name may include: if the current storage scenario is file storage or object storage, determining the file number corresponding to the data to be stored, and using the file number and the data block serial number to determine the data block name of the data block; if the current storage scenario is block storage, determining the logical unit number corresponding to the data to be stored, and using the logical unit number and the data block serial number to determine the data block name of the data block. It can be understood that different data block name determination methods can be used in different storage scenarios.

If the current storage scenario is file storage or object storage, the file number (i.e., inode number) and the data block serial number can be used to determine the data block name. In a specific implementation, the data block name (data ID) can be recorded as data ID = inode number + offset % 4MB. Since the file number uniquely corresponds to the data to be stored, the data ID obtained in this way can be used to distinguish different files on the KV-SSD.

If the current storage scenario is block storage, the logical unit number (i.e., LUN ID) and the data block serial number can be used to determine the data block name. In a specific implementation, the data block name (data ID) can be recorded as data ID = LUN ID + offset % 4MB. Since the logical unit number uniquely corresponds to the data to be stored, the data ID obtained in this way can be used to distinguish the data of different volumes on the KV-SSD.
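The scenario-based naming can be sketched as below. Several details are assumptions made for illustration: the "+" in the formulas is read as concatenation of two fields rather than arithmetic addition, the fields are joined with a ":" separator, and the `inode`/`lun` prefixes, the `StorageScenario` enum, and the `block_name` function are hypothetical names that do not come from the source.

```python
from enum import Enum

SEG_LEN = 4 * 1024 * 1024  # 4MB segmentation length, as in the formulas above


class StorageScenario(Enum):
    FILE = "file"
    OBJECT = "object"
    BLOCK = "block"


def block_name(scenario: StorageScenario, owner_id: int, offset: int,
               seg_len: int = SEG_LEN) -> str:
    """Build a data ID from the owner number and offset % seg_len."""
    index = offset % seg_len  # data block serial number, as in step S23
    if scenario in (StorageScenario.FILE, StorageScenario.OBJECT):
        prefix = "inode"  # owner_id is treated as the file number
    elif scenario is StorageScenario.BLOCK:
        prefix = "lun"    # owner_id is treated as the logical unit number
    else:
        raise ValueError(f"unsupported storage scenario: {scenario}")
    # Assumed concatenation format; the source only writes "A + B".
    return f"{prefix}:{owner_id}:{index}"


file_block = block_name(StorageScenario.FILE, 42, 4096)
volume_block = block_name(StorageScenario.BLOCK, 7, SEG_LEN + 512)
```

Keeping the owner number as a separate field means that blocks of different files or volumes receive different names even when their serial numbers coincide, which is the distinguishing property the text relies on.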

Step S25: Construct the key value information corresponding to the data block by determining the data block name as the key and the data block data in the data block as the value.

For other more specific processing procedures of step S25, reference may be made to the corresponding content disclosed in the foregoing embodiments, and will not be described again here.

Step S26: Store the key value information corresponding to the data block in the solid state drive through the key value storage interface in the solid state drive.

For other more specific processing procedures of step S26, reference may be made to the corresponding content disclosed in the foregoing embodiments, and will not be described again here.

In this embodiment, after the data to be stored is obtained, it is segmented into data blocks, and a determination method based on the storage scenario is then used to determine the data block name of each data block. The key-value information is constructed by determining the data block name as a key and the data block data as a value, and is stored in a solid-state drive with a key-value storage interface. The determination method based on the storage scenario proposed in this embodiment is simple and requires little computation, which reduces system operation overhead and system complexity.

Figure 5 is a flow chart of a specific data storage method provided by an embodiment of the present application. As shown in Figure 5, the method includes:

Step S31: Determine the data to be stored, and obtain the segmentation length through the preset segmentation length acquisition interface.

For other more specific processing procedures of step S31, reference may be made to the corresponding content disclosed in the foregoing embodiments, and will not be described again here.

Step S32: Segment the data to be stored using the segmentation length to obtain each data block corresponding to the data to be stored.

For other more specific processing procedures of step S32, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and will not be described again here.

Step S33: Calculate the hash value corresponding to the data block using a preset hash algorithm.

In this embodiment, the hash value corresponding to the data block can be calculated through a determination method based on a hash algorithm. Hash algorithms include but are not limited to MD5 (Message Digest Algorithm 5) and SHA (Secure Hash Algorithm).

Step S34: Use the hash value as the data block name of the data block.

In this embodiment, the data storage method may also include: binding the data block serial number corresponding to any data block with the data block name to form a mapping relationship; and recording the mapping relationship between the data block serial number and the data block name to form a mapping relationship list between data block serial numbers and data block names. It should be pointed out that this method additionally needs to maintain the correspondence between the data block serial number (data index) and the data block name (data ID). Compared with the above determination method based on storage scenarios, the amount of calculation increases, but the advantage of this method is that data blocks with the same content produce the same data ID. Since only one copy of the same data is stored on the disk, the method provides deduplication.
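The hash-based naming, the mapping relationship list, and the resulting deduplication can be sketched together. SHA-256 is chosen here only as one member of the SHA family named above. The function names are hypothetical, a plain dictionary stands in for the KV-SSD, and the serial numbers 0, 1, 2 are illustrative values rather than the result of the remainder formula.

```python
import hashlib
from typing import Dict, List, Tuple


def hash_block_name(block: bytes) -> str:
    """Use the hash of the block content as the data block name."""
    return hashlib.sha256(block).hexdigest()


def store_blocks(blocks: List[Tuple[int, bytes]],
                 kv_store: Dict[str, bytes]) -> Dict[int, str]:
    """Store each block under its hash and record serial number -> name."""
    mapping: Dict[int, str] = {}
    for serial_number, block in blocks:
        name = hash_block_name(block)
        kv_store[name] = block         # identical content overwrites itself
        mapping[serial_number] = name  # mapping relationship list entry
    return mapping


kv_store: Dict[str, bytes] = {}
mapping = store_blocks([(0, b"aaaa"), (1, b"bbbb"), (2, b"aaaa")], kv_store)
```

Blocks 0 and 2 have the same content, so they map to the same name and only two values end up in the store. The mapping list is what allows a block to be looked up again by its serial number, since the name alone no longer encodes the position.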

Step S35: Construct the key value information corresponding to the data block by determining the data block name as the key and the data block data in the data block as the value.

For other more specific processing procedures of step S35, reference may be made to the corresponding content disclosed in the foregoing embodiments, and will not be described again here.

Step S36: Store the key value information corresponding to the data block in the solid state drive through the key value storage interface in the solid state drive.

For other more specific processing procedures of step S36, reference may be made to the corresponding content disclosed in the foregoing embodiments, and will not be described again here.

In this embodiment, after the data to be stored is obtained, it is segmented into data blocks, and a determination method based on a hash algorithm is then used to determine the data block name of each data block. The key-value information is constructed by determining the data block name as a key and the data block data as a value, and is stored in a solid-state drive with a key-value storage interface. Compared with the above determination method based on storage scenarios, the determination method based on the hash algorithm proposed in this embodiment requires more calculation, but its advantage is that data blocks with the same content produce the same data ID. Since only one copy of the same data is stored on the disk, the method provides deduplication.

Referring to Figure 6, an embodiment of the present application discloses a data storage device, which may specifically include:

The data name determination module 11 is used to determine the data to be stored, and determine the data name of the data to be stored using a preset name determination method; the preset name determination method is a determination method based on storage scenarios or a determination method based on a hash algorithm;

The key value information construction module 12 is used to construct the key value information corresponding to the data to be stored by determining the data name as a key and the data to be stored as a value; and

The information storage module 13 is used to store the key value information corresponding to the data to be stored in the solid state drive through the key value storage interface in the solid state drive.

This application first determines the data to be stored, and uses a preset name determination method to determine the data name of the data to be stored. The preset name determination method is a determination method based on storage scenarios or a determination method based on a hash algorithm. The key-value information corresponding to the data to be stored is constructed by determining the data name as a key and the data to be stored as a value, and is stored in the solid-state drive through the key-value storage interface in the solid-state drive. In this way, the solution avoids consuming hardware resources of the storage node, such as CPU and memory, as in traditional distributed storage, and does not need database software to complete data storage. Instead, the data name is used as the key and the data as the value, and both are stored directly in a solid-state drive with a key-value storage interface. The storage node does not need to manage metadata, which reduces the number of software layers, the resource consumption of the storage node, the system complexity, and the system overhead, thereby improving system performance.
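The division of the device into modules 11, 12 and 13 can be illustrated with a small sketch. All class and method names below are hypothetical; a dictionary stands in for the key-value storage interface of the solid-state drive, and the naming function is passed in so that either determination method could be plugged in.

```python
import hashlib
from typing import Callable, Dict, Tuple


class DataNameDeterminationModule:
    """Module 11: determines the data name with a preset method."""

    def __init__(self, name_fn: Callable[[bytes], str]) -> None:
        self._name_fn = name_fn

    def determine(self, data: bytes) -> str:
        return self._name_fn(data)


class KeyValueInfoBuildingModule:
    """Module 12: data name becomes the key, data becomes the value."""

    def build(self, name: str, data: bytes) -> Tuple[str, bytes]:
        return (name, data)


class InformationStorageModule:
    """Module 13: stores key-value information through the KV interface."""

    def __init__(self, kv_interface: Dict[str, bytes]) -> None:
        self._kv = kv_interface  # dictionary standing in for the KV-SSD

    def store(self, kv_info: Tuple[str, bytes]) -> None:
        key, value = kv_info
        self._kv[key] = value


drive: Dict[str, bytes] = {}
namer = DataNameDeterminationModule(lambda d: hashlib.sha256(d).hexdigest())
builder = KeyValueInfoBuildingModule()
storer = InformationStorageModule(drive)

name = namer.determine(b"hello")
storer.store(builder.build(name, b"hello"))
```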

In some specific embodiments, the data storage device also includes:

The segmentation length acquisition module is used to obtain the segmentation length through the preset segmentation length acquisition interface;

The data segmentation module is used to segment the data to be stored using the segmentation length to obtain each data block corresponding to the data to be stored, and determine the segmentation offset corresponding to each data block; and

The data block sequence number determination module is used to take the remainder of the segmentation offset with respect to the segmentation length to obtain the remainder result corresponding to each data block, and determine the remainder result corresponding to each data block as the data block sequence number of that data block.

In some specific embodiments, the data name determination module 11 includes:

The scene determination unit is used to determine the current storage scene;

The first data block name determination unit is used to determine the data block name of each data block using the data block serial number and the current storage scenario;

Correspondingly, the key value information building module 12 and the information storage module 13 include:

The first key value information building unit is used to construct the key value information corresponding to the data block by determining the name of the data block as the key and the data block data in the data block as the value; and

The first information storage unit is used to store the key value information corresponding to the data block into the solid state drive through the key value storage interface in the solid state drive.

In some specific embodiments, the first data block name determination unit includes:

The first scene naming unit is used to determine the file number corresponding to the data to be stored if the current storage scene is file storage or object storage, and use the file number and the data block serial number to determine the data block name of the data block; and

The second scene naming unit is used to determine the logical unit number corresponding to the data to be stored if the current storage scenario is block storage, and use the logical unit number and the data block serial number to determine the data block name of the data block.

In some specific embodiments, the data name determination module 11 includes:

The hash value determination unit is used to calculate the hash value corresponding to the data block using a preset hash algorithm;

The second data block name determination unit is used to use the hash value as the data block name of the data block;

In some specific embodiments, the data storage device also includes:

The mapping relationship determination unit is used to bind the data block sequence number corresponding to any data block and the data block name, and form a mapping relationship; and

The mapping list determination unit is used to record the mapping relationship between the data block serial number and the data block name, so as to form a mapping relationship list between the data block serial number and the data block name.

In some specific embodiments, the data storage device also includes:

An object storage device determination unit is used to obtain the data to be stored and determine the target object storage device corresponding to the data to be stored;

A data writing unit is used to write the data to be stored into the target object storage device;

Correspondingly, the data name determination module 11 includes:

A data extraction unit is used to extract data to be stored from the target object storage device.

Furthermore, embodiments of the present application also disclose an electronic device. Figure 7 is a structural diagram of an electronic device 20 according to an exemplary embodiment. The content in the figure shall not be regarded as any limitation on the scope of use of the present application.

Figure 7 is a schematic structural diagram of an electronic device 20 provided by an embodiment of the present application. The electronic device 20 may specifically include: one or more processors 21, one or more memories 22, a power supply 23, a display screen 24, an input and output interface 25, a communication interface 26 and a communication bus 27. The memory 22 is used to store computer-readable instructions, which are loaded and executed by the one or more processors 21 to implement the relevant steps in the data storage method disclosed in any of the foregoing embodiments. In addition, the electronic device 20 in this embodiment may specifically be an electronic computer.

In this embodiment, the power supply 23 is used to provide working voltage for each hardware device on the electronic device 20. The communication interface 26 can create a data transmission channel between the electronic device 20 and external devices; the communication protocol it follows may be any communication protocol applicable to the technical solution of this application and is not specifically limited here. The input and output interface 25 is used to obtain external input data or to output data to the outside; its specific interface type can be selected according to specific application needs and is not specifically limited here.

In addition, the memory 22, as a carrier for resource storage, can be a read-only memory, a random access memory, a magnetic disk, an optical disk, etc. The resources stored thereon can include an operating system 221, computer-readable instructions 222, etc., and the storage method can be short-term storage or permanent storage.

The operating system 221, which may be Windows, Unix, Linux, etc., is used to manage and control each hardware device on the electronic device 20 as well as the computer-readable instructions 222. In addition to the computer-readable instructions that can be used to complete the data storage method executed by the electronic device 20 disclosed in any of the foregoing embodiments, the computer-readable instructions 222 may further include computer-readable instructions that can be used to complete other specific tasks.

Furthermore, this application also discloses a non-volatile computer-readable storage medium. The non-volatile computer-readable storage medium mentioned here includes random access memory (Random Access Memory, RAM), internal memory, read-only memory (Read-Only Memory, ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disks, magnetic disks, optical disks, or any other form of storage medium known in the technical field. The storage medium stores computer-readable instructions which, when executed by one or more processors, implement the data storage method disclosed above. Regarding the specific steps of this method, reference may be made to the corresponding content disclosed in the foregoing embodiments, which will not be described again here.

Each embodiment in this specification is described in a progressive manner. Each embodiment focuses on its differences from other embodiments, and the same or similar parts of the various embodiments can be referred to each other. As for the device disclosed in the embodiments, since it corresponds to the method disclosed in the embodiments, the description is relatively simple. For relevant details, please refer to the description in the method section.

Those skilled in the art may further realize that the units and algorithm steps of each example described in connection with the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of both. In order to clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been generally described according to their functions in the above description. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality using different methods for each specific application, but such implementations should not be considered beyond the scope of this application.

The steps of the methods or algorithms described in connection with the embodiments disclosed herein may be implemented directly in hardware, in software modules executed by one or more processors, or in a combination of both. Software modules may be located in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disks, removable disks, CD-ROMs, or any other form of storage medium known in the technical field.

Finally, it should be noted that in this document, relational terms such as first and second are only used to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any such actual relationship or sequence between these entities or operations. Furthermore, the terms "comprises," "includes," or any other variations thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or device that includes a list of elements includes not only those elements, but also other elements not expressly listed, or elements inherent to the process, method, article, or device. Without further limitation, an element qualified by the statement "comprises a..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.

The data storage method, apparatus, device, and storage medium provided by the present invention have been introduced in detail above. Specific examples are used herein to illustrate the principles and implementation of the present invention, and the description of the above embodiments is only intended to help understand the method of the present invention and its core idea. At the same time, for those of ordinary skill in the art, there may be changes in the specific implementation and scope of application based on the idea of the present invention. In summary, the contents of this specification should not be understood as limiting the present invention.

Claims

1. A data storage method, characterized in that it is applied to a distributed storage system and includes:

Determine the data to be stored, and determine the data name of the data to be stored using a preset name determination method; the preset name determination method is a determination method based on storage scenarios or a determination method based on a hash algorithm;

Construct key value information corresponding to the data to be stored by determining the data name as a key and the data to be stored as a value; and

The key value information corresponding to the data to be stored is stored in the solid state drive through a key value storage interface in the solid state drive.
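
As an illustration only, the three steps of the method above can be sketched in Python as follows. The class `InMemoryKVDevice` and the functions `hash_name` and `store` are hypothetical names introduced for this sketch. A dictionary stands in for the key value storage interface of the solid state drive, whose actual commands are not described here, and SHA-256 is an arbitrary choice of hash algorithm.

```python
import hashlib
from typing import Callable, Dict


class InMemoryKVDevice:
    """Hypothetical stand-in for a key-value solid state drive."""

    def __init__(self) -> None:
        self._store: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        # A real drive would receive the pair through its own interface.
        self._store[key] = value

    def get(self, key: bytes) -> bytes:
        return self._store[key]


def hash_name(data: bytes) -> bytes:
    """Hash-based name determination (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


def store(data: bytes,
          name_of: Callable[[bytes], bytes],
          device: InMemoryKVDevice) -> bytes:
    key = name_of(data)      # step 1: determine the data name
    pair = (key, data)       # step 2: name as key, data as value
    device.put(*pair)        # step 3: store through the key-value interface
    return key
```

Passing the name determination function as a parameter keeps the storage step identical whether the name comes from the storage scenario or from a hash algorithm.
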
2. The data storage method according to claim 1, characterized in that before determining the data to be stored, the method further includes:

Obtain the data to be stored and determine the target object storage device corresponding to the data to be stored; and

Write the data to be stored into the target object storage device.
3. The data storage method according to claim 2, characterized in that, after writing the data to be stored into the target object storage device, the method further includes:

Extract the data to be stored from the target object storage device.
4. The data storage method according to claim 1, characterized in that before determining the data name of the data to be stored using a preset name determination method, the method further includes:

Obtain the segmentation length through a preset segmentation length acquisition interface;

Segment the data to be stored using the segmentation length to obtain the data blocks corresponding to the data to be stored, and determine the segmentation offset corresponding to each data block; and

Take the remainder of each segmentation offset with respect to the segmentation length to obtain the remainder result corresponding to each data block, and determine the remainder result corresponding to each data block as the data block serial number of that data block.
5. The data storage method according to claim 4, characterized in that taking the remainder of each segmentation offset with respect to the segmentation length to obtain the remainder result corresponding to each data block, and determining the remainder result corresponding to each data block as the data block serial number of that data block, includes:

Calculate the data block serial number of each data block from the segmentation offset and the segmentation length based on the following formula:

data index = offset % D;

where % denotes the remainder operation, data index is the data block serial number, offset is the segmentation offset corresponding to the data block, and D is the segmentation length.
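
A minimal Python sketch of the segmentation and remainder steps is given below for illustration. The function names `split_blocks` and `block_index` are hypothetical, and the sketch assumes that the data is a byte string and that the segmentation length is counted in bytes. The unit and origin of the segmentation offset are not specified here, so the sample values are arbitrary.

```python
from typing import List, Tuple


def split_blocks(data: bytes, d: int) -> List[Tuple[int, bytes]]:
    """Cut data into blocks of at most d bytes; return (offset, block) pairs."""
    if d <= 0:
        raise ValueError("segmentation length must be positive")
    return [(off, data[off:off + d]) for off in range(0, len(data), d)]


def block_index(offset: int, d: int) -> int:
    """Remainder of the segmentation offset with respect to the segmentation length."""
    return offset % d


blocks = split_blocks(b"abcdefghij", 4)
# blocks == [(0, b"abcd"), (4, b"efgh"), (8, b"ij")]
index = block_index(10, 4)
# index == 2, since 10 = 2 * 4 + 2
```

Note that when the offsets are byte offsets that are exact multiples of D, as produced by `split_blocks`, the remainder is 0 for every block, so the result depends on how an implementation defines the offset.
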
6. The data storage method according to claim 4, characterized in that using a preset name determination method to determine the data name of the data to be stored includes:

Determine the current storage scenario; and

Determine the data block name of each data block using the data block serial number and the current storage scenario;

Correspondingly, constructing the key value information corresponding to the data to be stored by determining the data name as a key and the data to be stored as a value, and storing the key value information corresponding to the data to be stored in the solid state drive through the key value storage interface in the solid state drive, includes:

Construct the key value information corresponding to the data block by determining the data block name as a key and the data block data in the data block as a value; and

The key value information corresponding to the data block is stored in the solid state drive through a key value storage interface in the solid state drive.
7. The data storage method according to claim 6, characterized in that using the data block serial number and the current storage scenario to determine the data block name of each data block includes:

In response to the current storage scenario being file storage or object storage, determine the file number corresponding to the data to be stored, and use the file number and the data block serial number to determine the data block name of the data block; and

In response to the current storage scenario being block storage, the logical unit number corresponding to the data to be stored is determined, and the data block name of the data block is determined using the logical unit number and the data block serial number.
8. The data storage method according to claim 6, characterized in that, in response to the current storage scenario being file storage or object storage, determining the file number corresponding to the data to be stored, and using the file number and the data block serial number to determine the data block name of the data block, includes:

Determine the file number corresponding to the data to be stored, and use the file number and the data block serial number to calculate the data block name of each data block based on the following formula:

data ID = inode number + offset % D;

where % denotes the remainder operation, data ID is the data block name, inode number is the file number, offset is the segmentation offset corresponding to the data block, and D is the segmentation length.
9. The data storage method according to claim 6, characterized in that, in response to the current storage scenario being block storage, determining the logical unit number corresponding to the data to be stored, and using the logical unit number and the data block serial number to determine the data block name of the data block, includes:

data ID = LUN ID + offset % D;

where % denotes the remainder operation, data ID is the data block name, offset is the segmentation offset corresponding to the data block, D is the segmentation length, and LUN ID is the logical unit number.
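
The scenario-based naming described above can be illustrated with the following Python sketch. The names `Scenario` and `block_name` are hypothetical, and the sketch treats the "+" in the formulas as integer addition, which is an assumption; the formulas could equally denote concatenation of the two parts.

```python
from enum import Enum
from typing import Optional


class Scenario(Enum):
    FILE = "file"
    OBJECT = "object"
    BLOCK = "block"


def block_name(scenario: Scenario, offset: int, d: int,
               inode_number: Optional[int] = None,
               lun_id: Optional[int] = None) -> int:
    """Combine a per-scenario identifier with the data block serial number."""
    index = offset % d  # data block serial number (remainder)
    if scenario in (Scenario.FILE, Scenario.OBJECT):
        if inode_number is None:
            raise ValueError("file or object storage needs a file number")
        return inode_number + index  # assumption: '+' is integer addition
    if scenario is Scenario.BLOCK:
        if lun_id is None:
            raise ValueError("block storage needs a logical unit number")
        return lun_id + index
    raise ValueError("unknown storage scenario")


file_key = block_name(Scenario.FILE, offset=10, d=4, inode_number=1000)
lun_key = block_name(Scenario.BLOCK, offset=7, d=4, lun_id=50)
```

With these arbitrary sample values, `file_key` is 1000 + 10 % 4 and `lun_key` is 50 + 7 % 4.
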
10. The data storage method according to claim 4, characterized in that using a preset name determination method to determine the data name of the data to be stored includes:

Calculate a hash value corresponding to the data block using a preset hash algorithm; and

Use the hash value as the data block name of the data block;

Correspondingly, constructing the key value information corresponding to the data to be stored by determining the data name as a key and the data to be stored as a value, and storing the key value information corresponding to the data to be stored in the solid state drive through the key value storage interface in the solid state drive, includes:

Construct the key value information corresponding to the data block by determining the data block name as a key and the data block data in the data block as a value; and

The key value information corresponding to the data block is stored in the solid state drive through a key value storage interface in the solid state drive.
11. The data storage method according to claim 10, wherein the preset hash algorithm includes a message digest algorithm and a secure hash algorithm.
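
The hash-based naming can be sketched in Python as shown below, for illustration only. The function names `hash_block_name` and `store_blocks` are hypothetical, a dictionary stands in for the key value storage interface of the solid state drive, and the default algorithm `sha256` is an assumption. Any algorithm name accepted by `hashlib.new` on the local Python build (for example `md5` or `sha1`, where available) can be passed instead.

```python
import hashlib
from typing import Dict, List


def hash_block_name(block: bytes, algorithm: str = "sha256") -> str:
    """Use the hex digest of the block content as the data block name."""
    return hashlib.new(algorithm, block).hexdigest()


def store_blocks(blocks: List[bytes], kv: Dict[str, bytes],
                 algorithm: str = "sha256") -> List[str]:
    """Store each block under its hash-based name; return the names in order."""
    names = []
    for block in blocks:
        name = hash_block_name(block, algorithm)
        kv[name] = block  # the dict plays the role of the drive's put operation
        names.append(name)
    return names


kv: Dict[str, bytes] = {}
names = store_blocks([b"alpha", b"beta"], kv)
```

Because the name depends only on the block content, two blocks with identical content receive the same key.
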
12. The data storage method according to claim 10, characterized in that using the hash value as the data block name of the data block includes:

Bind the data block serial number corresponding to any of the data blocks and the data block name to obtain the target mapping relationship; and

Record the target mapping relationship to obtain a mapping relationship list between the data block serial number and the data block name.
13. The data storage method according to claim 12, characterized in that recording the target mapping relationship to obtain a mapping relationship list between the data block serial number and the data block name includes:

Record the target mapping relationship;

Calculate the target correspondence between the data block serial number and the data block name according to the target mapping relationship; and

Obtain a mapping relationship list between the data block serial number and the data block name according to the target corresponding relationship.
14. The data storage method according to claim 12, characterized in that, after recording the target mapping relationship to obtain a mapping relationship list between the data block serial number and the data block name, the method further includes:

In response to the occurrence of duplicate data, redundant duplicate data is removed.
15. The data storage method according to claim 12, characterized in that, after recording the target mapping relationship to obtain a mapping relationship list between the data block serial number and the data block name, the method further includes:

In response to the occurrence of duplicate data, one piece of target data is retained for storage in the solid state drive.
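
One way to picture the mapping relationship list and the removal of duplicate data is the Python sketch below. It is an illustration under stated assumptions: the function names `build_mapping` and `read_back` are hypothetical, SHA-256 is an arbitrary hash algorithm, a dictionary stands in for the solid state drive, and the position of each block in the input list is used as its data block serial number.

```python
import hashlib
from typing import Dict, List, Tuple


def build_mapping(blocks: List[bytes]) -> Tuple[List[Tuple[int, str]], Dict[str, bytes]]:
    """Return the (serial number, block name) list and the deduplicated store."""
    mapping: List[Tuple[int, str]] = []
    unique: Dict[str, bytes] = {}
    for serial, block in enumerate(blocks):  # assumption: serial = position
        name = hashlib.sha256(block).hexdigest()
        mapping.append((serial, name))       # bind serial number to name
        unique.setdefault(name, block)       # keep one copy of duplicate data
    return mapping, unique


def read_back(mapping: List[Tuple[int, str]], unique: Dict[str, bytes]) -> bytes:
    """Reassemble the original data by following the mapping list in order."""
    return b"".join(unique[name] for _, name in sorted(mapping))


mapping, unique = build_mapping([b"aa", b"bb", b"aa"])
restored = read_back(mapping, unique)
```

In this sketch the first and third blocks share one stored value, while the mapping list still records all three serial numbers, so the original data can be reassembled.
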
16. The data storage method according to claim 10, further comprising:

Bind the data block serial number corresponding to any data block with the data block name and form a mapping relationship; and

The mapping relationship between the data block serial number and the data block name is recorded to form a mapping relationship list between the data block serial number and the data block name.
17. The data storage method according to any one of claims 1 to 16, characterized in that before determining the data to be stored, the method further includes:

Obtain the data to be stored and determine the target object storage device corresponding to the data to be stored; and

Write the data to be stored into the target object storage device;

Correspondingly, the determination of data to be stored includes:

Extract the data to be stored from the target object storage device.
18. A data storage device, characterized in that it is applied to a distributed storage system and includes:

A data name determination module, used to determine the data to be stored, and determine the data name of the data to be stored using a preset name determination method; the preset name determination method is a determination method based on storage scenarios or a determination method based on a hash algorithm;

A key value information construction module, configured to construct key value information corresponding to the data to be stored by determining the data name as a key and the data to be stored as a value; and

An information storage module is configured to store the key value information corresponding to the data to be stored in the solid state drive through a key value storage interface in the solid state drive.
19. An electronic device, characterized in that it includes one or more processors and one or more memories; wherein the one or more processors, when executing computer-readable instructions stored in the one or more memories, implement the data storage method according to any one of claims 1 to 17.
20. A non-volatile computer-readable storage medium, characterized in that it is used to store computer-readable instructions; wherein the computer-readable instructions, when executed by one or more processors, implement the data storage method according to any one of claims 1 to 17.