CN113176858A - Data processing method, storage system and storage device - Google Patents

Data processing method, storage system and storage device

Info

Publication number: CN113176858A (application CN202110495606.7A; granted as CN113176858B)
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: storage, data, blocks, data blocks, stripes
Legal status: Granted; Active (the legal status is an assumption and is not a legal conclusion)
Inventor: 彭飞
Original and current assignee: Ruijie Networks Co Ltd
Application filed by Ruijie Networks Co Ltd

Classifications

    • G06F3/061: Improving I/O performance (under G06F3/06, digital input from, or digital output to, record carriers, e.g. RAID; G06F3/0601, interfaces specially adapted for storage systems)
    • G06F16/2365: Ensuring data consistency and integrity (under G06F16/23, updating)
    • G06F16/284: Relational databases (under G06F16/28, databases characterised by their database models)
    • G06F3/064: Management of blocks (under G06F3/0638, organizing or formatting or addressing of data)
    • G06F3/0644: Management of space entities, e.g. partitions, extents, pools
    • G06F3/0689: Disk arrays, e.g. RAID, JBOD (under G06F3/0683, plurality of storage devices)

Abstract

The embodiments of the present application provide a data processing method, a storage system, and a storage device. The method includes the following steps: acquiring a plurality of data blocks; when the number of data blocks reaches a set value, determining a free storage segment, where the storage segment includes a plurality of stripes, each stripe is determined from the division result of one storage block, and different stripes correspond to different storage blocks, the set value representing the number of data blocks required to fill the storage segment; filling corresponding content into some of the stripes based on the plurality of data blocks; determining check data from the plurality of data blocks; filling the check data into the remaining stripes; and storing the filled storage segment in a distributed manner according to the correspondence between stripes and storage blocks, so that the content of each stripe in the storage segment is stored into its corresponding storage block. When data is stored with the technical solution of the present application, the write penalty can be effectively reduced and the update-inconsistency problem can be solved.

Description

Data processing method, storage system and storage device
Technical Field
The present application relates to the field of storage technologies, and in particular, to a data processing method, a storage system, and a storage device.
Background
With the development of network technology and information processing technology, personal and enterprise data is growing explosively, which has made distributed storage systems a common choice for data storage. In a distributed storage system, however, node failure is the normal state; to keep data highly available when a node fails, existing distributed storage systems usually store data redundantly. The two main redundancy schemes at present are the multi-copy scheme and the erasure code scheme. The erasure code scheme offers high storage efficiency and low storage space overhead, but existing erasure code implementations suffer from write penalty and update-inconsistency problems.
Disclosure of Invention
In view of the above, the present application provides a data processing method, a storage system, and a storage device that solve, or at least partially solve, the above problems.
In one embodiment of the present application, a data processing method is provided. The method includes the following steps:
acquiring a plurality of data blocks;
when the number of data blocks reaches a set value, determining a free storage segment, where the storage segment includes a plurality of stripes, each stripe is determined from the division result of one storage block, and different stripes correspond to different storage blocks; the set value represents the number of data blocks required to fill the storage segment;
filling corresponding content into some of the stripes based on the plurality of data blocks;
determining check data from the data blocks;
filling the check data into the remaining stripes; and
storing the filled storage segment in a distributed manner according to the correspondence between stripes and storage blocks, so that the content of each stripe in the storage segment is stored into its corresponding storage block.
In one embodiment of the present application, a storage system is provided. The system includes:
a storage pool including a plurality of storage disks, each storage disk having a plurality of storage blocks; and
a distributed storage module configured to: acquire a plurality of data blocks; when the number of data blocks reaches a set value, determine a free storage segment, where the storage segment includes a plurality of stripes, each stripe is determined from the division result of one storage block, and different stripes correspond to different storage blocks, the set value representing the number of data blocks required to fill the storage segment; fill corresponding content into some of the stripes based on the plurality of data blocks; determine check data from the data blocks; fill the check data into the remaining stripes; and store the filled storage segment in a distributed manner according to the correspondence between stripes and storage blocks, so that the content of each stripe in the storage segment is stored into its corresponding storage block.
In one embodiment of the present application, a storage device is provided. The storage device includes a memory and a processor, where the memory is configured to store one or more computer instructions which, when executed by the processor, implement the steps of the data processing method described above.
According to the technical solution provided by the embodiments of the present application, a free storage segment is determined only after the number of acquired data blocks reaches a set value; the storage segment includes a plurality of stripes, each stripe is determined from the division result of one storage block, different stripes correspond to different storage blocks, and the set value represents the number of data blocks required to fill the storage segment. Then, based on the plurality of data blocks, corresponding content is filled into some of the stripes; check data is determined from the data blocks and filled into the remaining stripes; and the filled storage segment is stored in a distributed manner according to the correspondence between stripes and storage blocks, so that each stripe in the storage segment is stored into its corresponding storage block. Because the data blocks and the check data are written to the corresponding storage blocks as one full storage segment, no read of existing data blocks or check data is involved, which effectively reduces the write penalty; and because stripes are filled only once the number of acquired data blocks reaches the required set value, the random-write performance degradation caused by writing small pieces of data directly to disk is also avoided.
In addition, the solution of the present application updates data in an append-write manner, so the update-inconsistency problem can also be effectively solved; for details of how the append-write update of the present application solves the update-inconsistency problem, reference may be made to the relevant content below.
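As a rough illustration of why append-style updating avoids the inconsistency window, consider the following sketch. It is a simplified model, not the patent's implementation, and all names are invented: each new version of a block is appended rather than overwritten in place, and the logical-to-physical index is repointed only after the new version is fully written, so a crash mid-write still leaves the previous consistent version reachable.

```python
# Simplified append-write model (illustrative only): old versions are never
# overwritten, so a crash before the index is repointed still exposes a
# consistent old version of the data.
class AppendStore:
    def __init__(self):
        self.log = []    # append-only physical storage
        self.index = {}  # logical address -> position in the log

    def write(self, addr, value):
        self.log.append(value)                # step 1: append the new version
        self.index[addr] = len(self.log) - 1  # step 2: repoint the mapping

    def read(self, addr):
        return self.log[self.index[addr]]

store = AppendStore()
store.write("blk0", b"v1")
store.write("blk0", b"v2")           # an update is appended, not overwritten
assert store.read("blk0") == b"v2"
assert store.log == [b"v1", b"v2"]   # the old version remains intact
```

The key design point is that step 2 is a single pointer switch, whereas in-place EC updates must change a data block and its check blocks together.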
Drawings
To illustrate the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a distributed erasure code based storage system according to an embodiment of the present application;
fig. 2 is a schematic flowchart of a data processing method according to an embodiment of the present application;
fig. 3a is a schematic diagram of a specific data format corresponding to the plurality of stripes included in a storage segment according to an embodiment of the present application;
FIG. 3b is a schematic diagram of the data format corresponding to the header area of a first-type stripe among the plurality of stripes included in a storage segment according to an embodiment of the present application;
fig. 3c is a schematic diagram of the data format corresponding to the segment header included in one of the at least one first-type stripe included in a storage segment according to an embodiment of the present application;
fig. 4 is a schematic diagram illustrating the principle of writing data in an append manner according to an embodiment of the present application;
FIG. 5 is a block diagram of a memory system according to another embodiment of the present application;
fig. 6 is a block diagram of a data processing apparatus according to an embodiment of the present application;
fig. 7 is a block diagram of a storage device according to an embodiment of the present application.
Detailed Description
Before explaining the schemes provided by the embodiments of the present application, related terms referred to in the present application will be briefly described.
Multi-copy storage technology: a piece of original data is fully replicated and stored as multiple copies; in a distributed storage system, this means the original data is stored in full on a plurality of storage server nodes (storage nodes for short). For example, N-copy storage stores a piece of user data in full on N storage nodes, so that when a storage node fails the user data can be recovered from the other storage nodes, ensuring data reliability; this technology tolerates at most N-1 storage node failures, and its storage space utilization is 1/N.
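For illustration only (node names are made up), the N-copy placement and its redundancy arithmetic can be sketched as:

```python
# Hypothetical sketch of N-copy replication: the data is stored in full on
# each of n distinct storage nodes.
def replicate(data: bytes, nodes: list, n: int) -> dict:
    assert n <= len(nodes), "need at least n storage nodes"
    return {node: data for node in nodes[:n]}

copies = replicate(b"user-data", ["node-A", "node-B", "node-C"], n=3)
assert len(copies) == 3                                  # one full copy per node
assert all(v == b"user-data" for v in copies.values())
tolerated_failures = 3 - 1    # N-1 = 2 node failures tolerated
space_utilization = 1 / 3     # only 1/N of raw capacity holds unique data
```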
erasure Coding storage technology (EC): the K + M erasure code storage technology (EC (K + M) for short) is to divide original data into K original data blocks, encode the K original data blocks to generate M parity blocks, and store the K + M data blocks (including the original data blocks and the parity blocks) distributed on different storage nodes of a storage system to form a stripe with consistency, where no more than M arbitrary data blocks are lost or damaged, and the original K original data blocks can be recovered by the remaining other data blocks which are not lost or damaged, that is, the maximum number of data blocks which are tolerated by the storage system to be lost or damaged is M, and the storage space utilization rate is M
Figure BDA0003054286610000041
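The redundancy/utilization trade-off of EC(K+M) can be checked with a small helper (illustrative bookkeeping only, not an encoder):

```python
# For EC(K+M): up to M lost blocks are tolerated, and K of every K+M stored
# blocks are user data, so the storage space utilization is K/(K+M).
def ec_profile(k: int, m: int):
    """Return (tolerated block losses, storage space utilization) for EC(K+M)."""
    return m, k / (k + m)

assert ec_profile(2, 1) == (1, 2 / 3)   # EC(2+1)
assert ec_profile(4, 2) == (2, 4 / 6)   # EC(4+2): tolerates 2 losses at 2/3 utilization
```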
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
Some of the flows described in the specification, claims, and drawings of the present application include operations that occur in a particular order, but those operations may be performed out of the order in which they appear herein, or in parallel. Sequence numbers such as 101 and 102 are used merely to distinguish different operations and do not by themselves represent any order of execution; the flows may also include more or fewer operations, which may be performed sequentially or in parallel. It should be noted that the terms "first", "second", etc. herein are used to distinguish different messages, devices, modules, etc.; they do not represent an order, nor do they require that the "first" and "second" items be of different types. The term "and/or" herein describes an association between objects and indicates that three relationships may exist; for example, "A and/or B" means that A may exist alone, A and B may exist simultaneously, or B may exist alone. The character "/" herein generally indicates an "or" relationship between the associated objects. In addition, the embodiments described below are only some, not all, of the embodiments of the present application; all other embodiments derived by those skilled in the art from these embodiments without creative effort fall within the protection scope of the present application.
With the development of network technology and information processing technology, personal and enterprise data is growing explosively, and a traditional centralized storage server cannot meet the demands of large-scale data storage, so the prior art mostly uses distributed storage systems to store data. A distributed storage system spreads the storage load across multiple storage servers, which improves the reliability, availability, and storage efficiency of the system and makes it easy to scale. However, most existing distributed storage systems achieve reliability, availability, and scalability through multi-copy storage, which wastes a great deal of storage space and increases storage cost. Taking three copies as an example, user data must be stored in full on three storage nodes, so the amount of data the whole storage system can hold is only 1/3 of its raw capacity; that is, disk storage utilization is 1/3, with at most 2 disk failures tolerated.
Compared with multi-copy storage, erasure coding (EC) storage technology has higher storage efficiency and occupies less storage space. For example, with EC(4+2), user data is divided into 4 original data blocks, the 4 original data blocks are encoded to generate 2 check blocks, and the 4+2 data blocks (original data blocks plus check blocks) are stored on different storage nodes to form a consistent stripe; the maximum number of damaged data blocks the system tolerates is 2, and the storage space utilization is 2/3. It can thus be seen that, under the same fault-redundancy requirements, EC achieves considerably higher storage space utilization. However, although the redundant check blocks ensure data reliability, the check blocks must change whenever the original data blocks change, which gives existing EC technology a significant limitation: writing new data incurs extra overhead, known as the write penalty; that is, one logical data write requires multiple physical reads and writes. The write penalty is especially prominent when data must be modified: for a single write operation, the old original data block to be modified and the old check block must both be read out, a new check block must be computed from them together with the new original data block to be written, and finally the new original data block and new check block must overwrite the old ones.
Specifically, when EC(2+1) is used to store data, one logical data write requires 2 reads (reading the old original data block to be modified and reading the old check block) and 2 writes (writing the new original data block and writing the new check block), so the write penalty is 4. As another example, when EC(4+2) is used, because there are two check blocks, 2 check blocks must be read and 2 written compared with EC(2+1), so the write penalty for EC(4+2) is 6. The existence of the write penalty inevitably affects the storage performance of the storage system. It should be noted that the new check block may be computed as: new check block = old original data block XOR new original data block XOR old check block, where XOR denotes the exclusive-or operation.
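The XOR update formula above can be verified with a short sketch for the single-parity case (the data values are arbitrary illustrations): updating one data block via old-data XOR new-data XOR old-parity gives the same parity as recomputing it from scratch.

```python
# Single XOR parity: parity = d0 XOR d1. The incremental update
# new_parity = old_d0 XOR new_d0 XOR old_parity must match a full recompute.
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

d0, d1 = b"\x0f\xf0", b"\x33\xcc"
parity = xor(d0, d1)                         # initial full-stripe parity

new_d0 = b"\xaa\x55"
new_parity = xor(xor(d0, new_d0), parity)    # needs 2 reads: old d0, old parity
assert new_parity == xor(new_d0, d1)         # identical to recomputing from d1
# Total I/O for this one logical write: 2 reads + 2 writes, i.e. a write penalty of 4.
```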
In addition, existing EC technology suffers from the update-inconsistency problem. In EC(K+M), the K+M data blocks (original data blocks plus check blocks) distributed on different storage nodes form a consistent stripe; when data blocks in the stripe are lost, the original K data blocks can be recovered by a corresponding reconstruction algorithm as long as no fewer than K of the K original data blocks and M check blocks remain. When one or more data blocks in a stripe need to be modified, the system typically reads the corresponding check blocks, recomputes them based on the new data blocks, and finally writes the new data blocks and check blocks at the same time. If a cluster fault (such as a system crash or power failure) occurs during this simultaneous write, some original data blocks or check blocks may have been modified while others have not, or a single original data block or check block may have been only partially modified; the data blocks and check blocks in the stripe then no longer agree, and this inconsistency is the update-inconsistency problem. To solve the above problems, the embodiments of the present application propose a data processing method applicable to a storage system (such as the storage system shown in fig. 1) or a storage device that uses stripes (Stripe) as its management unit, for example a storage array composed of Solid State Disks (SSDs), an SSD itself, or a storage array composed of Shingled Magnetic Recording (SMR) disks, without limitation here.
A Stripe is a way of merging multiple disks (a single disk, i.e. multiple storage media; or the storage blocks of a disk, i.e. the multiple block units that make up a storage block; and so on) into one volume. A stripe can be understood as a set of location-related blocks spread over two or more partitions of a disk array (or of a single disk, or of a storage block of a disk); such a block may also be called a Stripe Unit (SU), i.e. a stripe is composed of multiple SUs. A stripe is the concrete result of striped management of storage space, as is familiar to those skilled in the art. In the embodiments of the present application, a stripe is composed of the multiple block units (hereinafter referred to as data areas) of one storage block, as will be described in detail below and not repeated here.
A storage system to which the data processing method provided in the embodiments of the present application is applicable is described below. Referring to fig. 1, a schematic structural diagram of a distributed erasure-code-based storage system according to an embodiment of the present application is shown. As shown in fig. 1, the storage system includes a plurality of storage nodes, such as storage node A, storage node B, and storage node C, which may be, but are not limited to, servers, workstations, etc.; the storage nodes may communicate with each other via InfiniBand, Ethernet, and the like. Each storage node may contain a plurality of storage disks (i.e., the disks shown in the figure); for example, storage node A contains storage disk A1, storage disk A2, and storage disk A3. The storage disks include, but are not limited to: mechanical Hard Disk Drives (HDD), Solid State Drives (SSD), Storage Class Memory (SCM), and Shingled Magnetic Recording (SMR) disks. In practical applications, the number of storage nodes in the storage system and the number of storage disks under each storage node can be increased according to actual requirements, which is not limited in this embodiment. The storage system manages the storage resources in the system (such as storage nodes and storage disks) centrally, and when it receives data from a writer it allocates corresponding storage resources for the data. In a specific implementation, the storage resources in the storage system are partitioned, and the partitioned storage resources are organized according to the erasure code type, so as to manage and allocate them. The specific process is as follows:
step 1, establishing a storage pool aiming at storage resources in a storage system;
in particular, a storage pool may be formed by selecting portions of storage disks from a plurality of storage disks included under each storage node. The number of storage disks under the same storage node contained in the storage pool can be flexibly set according to actual requirements, for example, the number of storage disks under the same storage node contained in the storage pool can be 3, 5, and the like; alternatively, all storage disks contained under all storage nodes in the system may be grouped into one storage pool.
For example, as shown in fig. 1, the storage system includes 3 storage nodes: storage node A, storage node B, and storage node C. Each storage node contains 3 storage disks: storage node A contains storage disks A1, A2, and A3; storage node B contains storage disks B1, B2, and B3; and storage node C contains storage disks C1, C2, and C3. All the storage disks under storage nodes A, B, and C can therefore be combined into one storage pool 100.
Step 2, segmenting each storage disk in the storage pool according to the designated capacity to virtualize the storage disk into a plurality of storage blocks (Chunk, CK) with the same size, wherein the designated capacity can be flexibly set according to actual requirements and is not limited herein;
Illustratively, assuming the total capacity of storage disk A1 in storage pool 100 is 1 TB and the specified capacity is 256 MB, storage disk A1 can be virtualized into 1 TB / 256 MB = 4 x 1024 = 4096 storage blocks of the same size, each with a corresponding identifier that uniquely identifies it. In particular, the identifier may be, but is not limited to, a number or an address, which is allocated when the disk is virtualized into the storage blocks of the same size and is unique. For example, if the identifier is a number, when storage disk A1 is virtualized into 4096 storage blocks, the 4096 storage blocks may be numbered sequentially as a1-1, a1-2, ..., a1-4096. Similarly, the other storage disks in the storage pool can each be divided by the specified capacity (such as 256 MB) into a plurality of equally sized storage blocks.
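The arithmetic in this example can be reproduced directly (the sizes and identifier scheme follow the illustration above):

```python
# Dividing a 1 TB disk into 256 MB chunks (CKs), numbered a1-1 ... a1-4096.
CHUNK_SIZE = 256 * 2**20        # 256 MB
DISK_CAPACITY = 2**40           # 1 TB
num_chunks = DISK_CAPACITY // CHUNK_SIZE
assert num_chunks == 4 * 1024 == 4096

chunk_ids = [f"a1-{i}" for i in range(1, num_chunks + 1)]
assert chunk_ids[0] == "a1-1" and chunk_ids[-1] == "a1-4096"
assert len(set(chunk_ids)) == num_chunks    # each identifier is unique
```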
Step 3, selecting one storage block each from different storage disks under a plurality of different storage nodes according to the erasure code type, to form a storage block group (Chunk Group, abbreviated as CKG);
With reference to fig. 1, suppose the EC(2+1) erasure code type is used, i.e. a storage block must be selected from storage disks under 3 storage nodes to form a CKG, so that data can be stored distributed across different storage disks under different storage nodes, ensuring data recoverability on a single-point failure. Assume that storage block a1-1 is selected from storage disk A1 under storage node A, storage block b1-2 is selected from storage disk B1 under storage node B, and storage block c1-1 is selected from storage disk C1 under storage node C, forming storage block group 0 (i.e. CKG0 shown in the figure); by analogy, more CKGs can be organized for use by the upper layers. Only 2 organized CKGs (CKG0 and CKG1) are shown schematically in the figure, which does not represent the actual number of CKGs. Here the CKG is the smallest unit of storage resource allocated by the storage system. It should be noted that when storage blocks are selected from different storage disks under different storage nodes to form a storage block group, they may be selected randomly or according to the load of each storage disk; this is not specifically limited here, and the specific selection process may refer to the prior art.
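A minimal sketch of forming a CKG for EC(2+1) follows. The selection policy used here (first free chunk of the first nodes in sorted order) is an assumption made only for determinism; as noted above, the actual selection may be random or load-based.

```python
def form_ckg(free_chunks_by_node: dict, width: int) -> list:
    """Pick one free chunk from each of `width` distinct storage nodes."""
    nodes = sorted(free_chunks_by_node)[:width]
    assert len(nodes) == width, "not enough storage nodes for this EC width"
    return [(node, free_chunks_by_node[node].pop(0)) for node in nodes]

# Free chunks per node, matching the fig. 1 example identifiers.
pool = {"A": ["a1-1", "a1-2"], "B": ["b1-2"], "C": ["c1-1"]}
ckg0 = form_ckg(pool, width=3)   # EC(2+1) needs chunks on 3 distinct nodes
assert ckg0 == [("A", "a1-1"), ("B", "b1-2"), ("C", "c1-1")]
assert len({node for node, _ in ckg0}) == 3   # survives a single-node failure
```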
Step 4, further dividing each storage block in each CKG into a plurality of data areas with finer granularity, forming a stripe with the plurality of data areas belonging to the same storage block, namely, after further refining and dividing the plurality of storage blocks in one CKG, respectively, forming a stripe corresponding to each of the plurality of storage blocks, and after performing finer granularity dividing on each storage block, forming a stripe, organizing the stripes corresponding to each of the plurality of storage blocks together to form a storage segment, and accordingly: the storage section comprises a plurality of strips, and the plurality of strips contained in the storage section have one-to-one correspondence with the plurality of storage blocks contained in the storage block group. For example, drawingsThe storage section 0 shown in 1 comprises a strip 1, a strip 2 and a strip 3, which are respectively connected with the storage blocks a in the storage block group 011Storage block b12Storage block c11One-to-one correspondence, specifically, stripe 1 is formed by pairing memory blocks a11A plurality of data areas obtained by dividing the data area are formed, and the strip 2 is formed by storing the storage block b12A plurality of data areas obtained by dividing the data area, the stripe 3 is formed by storing the block c11And a plurality of data areas obtained by cutting. The storage segment externally represents a logical disk (LUN) (or may also be referred to as a logical space unit) accessed by the host, the LUN is a storage unit that can be directly mapped to the host operating system to implement reading and writing, and when a reading and writing request of a user is processed and data migration is performed, the LUN applies for space, releases space, and migrates data to the storage system all in units of a data area of a stripe in the storage segment.
Here, it should be noted that the storage blocks in one storage block group have the same size; accordingly, the stripes in one storage segment also have the same length.
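The block-to-stripe-to-segment relationship of step 4 can be made concrete with a small sketch (the function, sizes and identifiers below are illustrative assumptions): each storage block is sliced into equal data areas, each block's areas form one stripe, and the stripes of one CKG form the storage segment:

```python
def build_segment(ckg_blocks, block_size, area_size):
    # One stripe per storage block: the stripe is the ordered list of that
    # block's data areas, identified here by (block_id, offset) pairs.
    segment = []
    for block_id in ckg_blocks:
        stripe = [(block_id, off) for off in range(0, block_size, area_size)]
        segment.append(stripe)
    return segment

# storage segment 0 built from CKG0 = {a11, b12, c11}
segment0 = build_segment(["a11", "b12", "c11"], block_size=8, area_size=2)
```

Because all blocks in one CKG have the same size, all stripes come out with the same number of data areas, matching the note above.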
Fig. 2 is a schematic flowchart illustrating a data processing method according to an embodiment of the present application. The method can be applied to the storage system based on distributed erasure codes shown in fig. 1, and as shown in fig. 2, the method comprises the following steps:
101. acquiring a plurality of data blocks;
102. when the number of the data blocks reaches a set value, determining an idle storage section; the storage section comprises a plurality of strips, one strip is determined according to the segmentation result of one storage block, and different strips and different storage blocks have corresponding relations; the set value represents the number of data blocks required for filling the storage section;
103. filling corresponding contents in partial stripes of the plurality of stripes based on the plurality of data blocks;
104. determining check data according to the data blocks;
105. filling the check data into the rest stripes of the plurality of stripes;
106. and according to the corresponding relation between the strips and the storage blocks, performing distributed storage on the filled storage sections so as to respectively store the contents in the strips in the storage sections into the corresponding storage blocks.
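Steps 101 and 102 above amount to buffering incoming data blocks and only allocating a segment once the set value is reached. The following sketch illustrates that threshold behavior; the class and callback names are assumptions for illustration, not the patent's code:

```python
class WriteBuffer:
    """Accumulates data blocks; flushes only when a full segment's worth arrives."""

    def __init__(self, set_value, alloc_segment):
        self.set_value = set_value          # blocks needed to fill one segment
        self.alloc_segment = alloc_segment  # callback returning an idle segment
        self.pending = []

    def put(self, block):
        self.pending.append(block)
        if len(self.pending) < self.set_value:
            return None                     # keep caching: avoids small random writes
        seg = self.alloc_segment()          # step 102: determine an idle segment
        blocks = self.pending[:self.set_value]
        self.pending = self.pending[self.set_value:]
        return seg, blocks                  # ready for steps 103-106
```

Blocks beyond the set value simply stay buffered for the next segment.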
In the above 101, the data blocks come from the same piece of data, and the specific form of the data may be, but is not limited to, one or more of symbols, characters, numbers, voice, images, video, and the like. The data may be sent by a writer through an interactive means provided by a corresponding device (such as a keyboard, mouse, or touch screen), or may be acquired by the storage system of this embodiment from another server or storage system, which is not specifically limited here; the device corresponding to the writer may be any terminal device such as a mobile phone, tablet computer, desktop computer, notebook computer, or intelligent wearable device, which is likewise not limited in this embodiment. In the process of storing the data into the storage system of this embodiment, the data is first cached in a first storage medium in the system, and the first storage medium divides the data into a plurality of data blocks of the same specified size, so that the data can be distributed across different storage disks under different storage nodes using erasure coding, ensuring recoverability upon a single-point failure. In a specific implementation, the size of the data block may be determined according to the actual situation; for example, 24MB of data may be divided into 12 data blocks of 2MB each.
After the data is divided into a plurality of data blocks of the same size, consider that the divided data blocks may be relatively small; if such small data blocks were directly combined into a stripe and written to the storage disk, write randomness would increase and the write performance of the storage disk would degrade. Accordingly, to avoid the random-write performance penalty caused by writing small data blocks directly to disk, in this embodiment the data blocks corresponding to the data in the first storage medium are issued to a second storage medium for caching, so that when the number of data blocks cached in the second storage medium reaches a set value, the data blocks are combined into a larger stripe and then written to the corresponding storage disks; for the related process of how the data blocks are stored into the corresponding storage disks, refer to the following content, which is not detailed here. It should be noted that, referring to fig. 1, the first storage medium may be the storage volume 11 of the storage system, and the second storage medium may be a Solid State Disk (SSD) 12 used for caching data in the storage system; the first and second storage media may also take other forms, which is not limited here. Based on this, one possible implementation of step 101 "obtaining multiple data blocks" is: receiving data sent by a writer, and dividing the received data into data blocks of the same size.
In 102, the plurality of stripes included in the storage segment correspond to a plurality of storage blocks belonging to different storage nodes and different storage disks, respectively. Specifically, a plurality of stripes included in the storage segment have a one-to-one correspondence with a plurality of storage blocks included in the storage block group; a stripe is determined based on the result of the slicing process for one memory block, and specifically, a stripe is understood to be composed of a plurality of data areas obtained by slicing one memory block. The storage block group is obtained by organizing storage blocks belonging to different storage nodes and different storage disks in the storage pool according to erasure code classes, and specifically, how to obtain the storage block group and how to obtain a corresponding storage segment according to the storage block group may refer to the above related contents, which are not described in detail herein. In addition, based on the content in step 101, in this embodiment, in order to avoid the problem of the random write performance degradation caused by the direct disk writing of small data, after dividing the received data into a plurality of data blocks with the same size, the plurality of data blocks are firstly cached to the second storage medium, and when the number of the data blocks cached to the second storage medium reaches a set value, a free storage segment is determined for the plurality of data blocks to store the plurality of data blocks. Accordingly, after determining that the number of the acquired data blocks reaches a set value, "determining an empty storage segment" in the step 102 may be implemented by specifically adopting the following steps:
1021. allocating a free memory block group;
1022. and determining the storage section according to the storage block group.
In a specific implementation, when the number of acquired data blocks reaches the set value, an idle storage block group may be allocated to the data blocks in a random allocation manner; of course, other manners may also be adopted, which is not limited here. The storage segment corresponding to the storage block group is then further determined according to the storage block group, as described in the related content above.
In addition, based on the principle of erasure code storage technology, among the plurality of stripes contained in a storage segment, part of the stripes are required to store the data blocks, and another part of the stripes are used to store the check data obtained by encoding the data blocks, so that when the content of some stripes fails or cannot be read, the data can be recovered from the content of the other stripes according to the corresponding EC algorithm, ensuring high availability of the data in the system. On this basis, the stripes contained in the storage segment fall into two types, a first type of stripe and a second type of stripe: the first type of stripe stores the data blocks, and the second type of stripe stores the check data, which is obtained by encoding the data blocks with a corresponding erasure code encoding mode (such as RS encoding). When all data areas of the first-type stripes contained in a storage segment are filled, the storage segment is considered full. It follows that the set value in step 102 represents the number of data blocks required to fill the storage segment, that is, the maximum number of data blocks needed to fill all data areas of the first-type stripes contained in the storage segment; specific descriptions of the first-type stripes and of how the set value is determined can be found in the following content and are not repeated here. Illustratively, referring to storage segment 0 shown in fig. 3a, storage segment 0 is determined according to storage block group 0 (i.e., CKG0) shown in fig. 1. Since storage block group 0 is obtained by the storage system organizing storage blocks belonging to different storage nodes and different storage disks according to the EC (2+1) mode, storage segment 0 has 2 first-type stripes, namely stripe 1 (i.e., D1) and stripe 2 (i.e., D2), and 1 second-type stripe, namely stripe 3 (i.e., P); accordingly, the set value is the maximum number of data blocks required to fill the data areas of stripe 1 and stripe 2.
In an implementable technical method, the storage section contains at least one of the first type of stripe, the first type of stripe comprising a stripe header area and a data area; accordingly, the step 103 "fill corresponding content in a partial stripe of the plurality of stripes based on the plurality of data blocks" may specifically include:
1031. sequentially filling the plurality of data blocks into a corresponding data area of at least one first type stripe;
1032. and determining the strip head information filled in the strip head area of the strip of the first type based on the data blocks in the data area of the strip of the first type.
In the above 1031, when filling the plurality of data blocks into the corresponding data areas of the at least one first-type stripe, based on the positional relationship among the first-type stripes, part of the data blocks are first sequentially filled into the corresponding data area of the front-most first-type stripe, and when that data area is full, filling proceeds to the corresponding data area of the next first-type stripe. For example, with continued reference to fig. 3a, stripe D1 and stripe D2 are of the first type. Assume the data blocks are d1, d2, d3, …, dm, dm+1, …, dn-1, dn. Based on the positional relationship between stripe D1 and stripe D2, the data blocks d1, d2, d3, …, dm may first be sequentially filled into data area 1 of stripe D1; when data block dm is filled in, data area 1 reaches a full state, and the data blocks dm+1, …, dn-1, dn following data block dm are then filled into data area 2 of stripe D2. Of course, the data blocks may also be sequentially filled into the data areas of the first-type stripes in other manners, such as according to a priority order among the first-type stripes, which is not specifically limited here.
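The spill-over filling just described can be sketched in a few lines (a simplification that treats each data area as holding exactly one data block; names are assumptions):

```python
def fill_first_type_stripes(data_blocks, areas_per_stripe, num_stripes):
    # Fill the front-most first-type stripe's data areas first; spill over
    # to the next first-type stripe only when the previous one is full.
    stripes = [[] for _ in range(num_stripes)]
    for i, block in enumerate(data_blocks):
        stripes[i // areas_per_stripe].append(block)
    return stripes
```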
In the above 1032, the data format of the stripe header of the first-type stripe is as shown in fig. 3b; accordingly, the stripe header information filled into the stripe header area includes a magic number and a checksum. The magic number is an internally defined constant value, such as 0xf981ef0d; the checksum is a value determined based on the data blocks filled into the data area of the first-type stripe, and may specifically be the checksum of the filled data blocks calculated according to a set check algorithm (such as a CRC check algorithm) when the data blocks are filled into the corresponding data area of the first-type stripe. Continuing the example in step 1031 above, where the data blocks filled into data area 1 of stripe D1 are d1, d2, d3, …, dm, the checksum in the stripe header information of stripe D1 is the checksum calculated over data blocks d1, d2, d3, …, dm using the CRC check algorithm; how to calculate a checksum with the CRC check algorithm is the same as in the prior art. Similarly, the magic number and checksum contained in the stripe header information of stripe D2 may be determined.
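Constructing such a stripe header can be sketched as follows; CRC-32 stands in for the unspecified check algorithm, and the little-endian 4-byte field layout is an assumption (fig. 3b is not reproduced here):

```python
import struct
import zlib

STRIPE_MAGIC = 0xF981EF0D  # the internally defined constant value from the text

def make_stripe_head(filled_blocks):
    # The checksum is computed over the data blocks filled into this
    # stripe's data area, at fill time.
    checksum = zlib.crc32(b"".join(filled_blocks))
    return struct.pack("<II", STRIPE_MAGIC, checksum)
```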
Further, one first-type stripe among the at least one first-type stripe further comprises a segment header area. For example, two first-type stripes are shown in fig. 3a, namely stripe D1 and stripe D2; stripe D1, in addition to the above-mentioned stripe header area and data area, further comprises a segment header area. Accordingly, step 103 "filling corresponding contents in a partial stripe of the plurality of stripes based on the plurality of data blocks" further includes:
1033. determining segment header information based on the plurality of data blocks;
1034. and filling the segment header information into the segment header area.
In specific implementation, the data format of the segment header area is as shown in fig. 3c; accordingly, the segment header information filled into the segment header area may specifically include: a magic number, a checksum, a version identifier, the number of data blocks, and description information of each data block. The magic number is an internally defined constant value; the checksum is obtained by calculating over the plurality of data blocks according to a set check algorithm (such as a CRC check algorithm). Continuing the example in step 1031, where the data blocks are d1, d2, d3, …, dm, dm+1, …, dn-1, dn, the checksum filled into the segment header area is the checksum calculated over data blocks d1, d2, d3, …, dn using the CRC check algorithm. The description information of each data block may include, but is not limited to, the length, fill-in time, offset address, and the like of the data block. The version identifier is used for version upgrades; for example, when the data format of the storage segment needs to be changed, the storage segments before and after the change can be distinguished through the version identifier, where the change operation may specifically be, for example, updating data in the storage segment.
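A segment header along these lines might be packed as below. The field order, field widths, and the magic constant are all assumptions (fig. 3c is not reproduced here), and only (length, offset) are kept as per-block description info:

```python
import struct
import zlib

SEG_MAGIC = 0x5E60ABCD  # hypothetical constant; the real value is not given
VERSION = 1

def make_segment_head(data_blocks):
    # Assumed layout: magic, checksum, version, block count, then one
    # (length, offset) description record per data block.
    checksum = zlib.crc32(b"".join(data_blocks))
    head = struct.pack("<IIII", SEG_MAGIC, checksum, VERSION, len(data_blocks))
    offset = 0
    for block in data_blocks:
        head += struct.pack("<II", len(block), offset)  # description info
        offset += len(block)
    return head
```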
Based on the above and referring to fig. 3a to 3c, the stripes in the storage segment shown in fig. 3a have the same length, which is known in advance; here the stripe length is denoted stripe_len. Since the set value in step 102 above is the maximum number of data blocks required to fill the corresponding data areas of all first-type stripes in the storage segment (that is, for storage segment 0 shown in fig. 3a, the maximum number of data blocks able to fill the data areas of stripe 1 and stripe 2), the set value T corresponding to storage segment 0 can be determined by the following formula:
data_len*T=(stripe_len-stripe_head_len-segment_head_len-T*data_describe_info_len)+(stripe_len-stripe_head_len); (1)
Rearranging formula (1) yields the set value T:

T=(2*(stripe_len-stripe_head_len)-segment_head_len)/(data_len+data_describe_info_len); (2)
in formula (1) and formula (2), stripe_head_len represents the stripe header area length, segment_head_len represents the segment header area length, data_describe_info_len represents the length of a data block's description information, and data_len represents the length of a data block.
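Formula (2) can be checked numerically. The sketch below computes T for a segment with two first-type stripes, using illustrative lengths and integer division (since T counts whole data blocks):

```python
def set_value_T(stripe_len, stripe_head_len, segment_head_len,
                data_len, data_describe_info_len, first_type_stripes=2):
    # Rearranged formula (1): every first-type stripe contributes
    # (stripe_len - stripe_head_len) of space, while the segment header
    # and the T per-block description records also consume space.
    usable = first_type_stripes * (stripe_len - stripe_head_len) - segment_head_len
    return usable // (data_len + data_describe_info_len)
```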
The "determining the check data according to the plurality of data blocks" in 104 includes:
1041. encoding the plurality of data blocks;
1042. and determining the check data according to the encoding processing result.
In specific implementation, the plurality of data blocks may be encoded according to the encoding calculation of the erasure code type EC_TYPE (such as RS encoding), so as to determine the check data from the encoding result; how to calculate check data according to an EC_TYPE encoding calculation is the same as in the prior art and is not described here again.
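As a stand-in for the RS encoding named above: for an EC (n+1) layout with a single check stripe, as in the EC (2+1) example, the check data reduces to a bytewise XOR of the data stripes. This is not full RS coding (a real system would use an EC library), but it shows the encode/recover relationship:

```python
def encode_parity(data_stripes):
    # For EC (n+1), the single check stripe is the bytewise XOR of the
    # n equal-length data stripes.
    parity = bytearray(len(data_stripes[0]))
    for stripe in data_stripes:
        for i, byte in enumerate(stripe):
            parity[i] ^= byte
    return bytes(parity)
```

XOR's self-inverse property is what makes single-stripe recovery possible: XORing the parity with the surviving data stripes reproduces the lost one.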
In the above 105, the determined check data may be filled into the second-type stripe in a sequential filling manner; for details, refer to the related content on filling data blocks into the first-type stripes.
In 106, after the check data is filled into the second type of stripe, distributed storage may be performed on the data content filled in each stripe in the storage segment according to a corresponding relationship between the stripe and the storage block, so as to respectively store the contents in the plurality of stripes in the storage segment into the respective corresponding storage blocks, that is, respectively store the contents in the plurality of stripes in the storage segment into the respective corresponding storage disks.
In view of the above, it should be added that the magic numbers included in the stripe header information of the at least one first-type stripe in the storage segment, and the magic number included in the segment header information, may be the same or different; this embodiment is not specifically limited. In addition, when a filled storage segment is stored in a distributed manner, the magic number and checksum included in the stripe header information of a first-type stripe are written into the corresponding storage disk together with the data blocks filled into that stripe's data area, so that when data is read from the storage disk, its reliability can be checked through the magic number and checksum. For example, referring to fig. 3a, fig. 3b and fig. 4, assume that user data (i.e. data 1) is stored in a distributed manner into storage disk A1, storage disk B1 and storage disk C1 after filling storage segment 0, and that the defined magic number is 0xf981ef0d. After reading the storage segment 0 data (i.e. data 1) from the storage disks (i.e. storage disk A1, storage disk B1 and storage disk C1), the magic number can be taken from stripe D1 in storage segment 0 and compared with 0xf981ef0d; if the comparison is inconsistent, the data in stripe D1 is determined to be unreliable (i.e. the data stored in storage disk A1 is unreliable). In addition, when data is read, the checksum may be recalculated according to the check algorithm used when the data was filled into storage segment 0 and compared with the checksum stored in the storage disk, so as to determine the reliability of the data from the comparison result; for example, the checksum corresponding to stripe D1 may be recalculated and compared with the checksum stored in storage disk A1, and if the comparison is inconsistent, the data stored in storage disk A1 is determined to be unreliable. Based on the above, when data is read, whether the data in a stripe has a problem can be quickly determined using the magic number and checksum, so that data in a stripe that fails the check can be recovered from the data in the remaining stripes, improving data reliability. Setting the magic number in particular allows problems to be detected quickly, reducing the number of check calculations and improving system performance.
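The read-side check just described might look like the following sketch (record layout and names are assumptions; CRC-32 again stands in for the check algorithm). Checking the cheap magic number first, and only recomputing the CRC when the magic matches, is what saves check calculations:

```python
import zlib

MAGIC = 0xF981EF0D

def unreliable_stripes(stripe_records):
    """stripe_records: list of (magic, checksum, payload) tuples read back
    from the storage disks; returns indices of stripes needing recovery."""
    bad = []
    for idx, (magic, checksum, payload) in enumerate(stripe_records):
        # `or` short-circuits: the CRC is only recomputed if the magic matches
        if magic != MAGIC or checksum != zlib.crc32(payload):
            bad.append(idx)
    return bad
```

Stripes flagged here would then be rebuilt from the remaining stripes via the EC algorithm.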
In the above, when reading data from a storage disk, the magic number and checksum in the stripe header area of a stripe in the storage segment are used to check whether the data in that stripe is reliable; similarly, the magic number and checksum in the segment header area of the storage segment are used to verify whether the data in the whole storage segment is reliable.
In summary, in this embodiment, the data blocks and the check data are written into the corresponding storage disks together, a full storage segment at a time; that is, each write-to-disk operation only needs to distribute the contents of the stripes of an entire full storage segment (the data blocks and the check blocks) to the corresponding storage disks together, without first reading existing data blocks and check data back from the storage disks. This effectively reduces the write penalty of the erasure code; even for an EC (2+1) erasure code, the write performance can be doubled compared with ordinary erasure-code writing.
Further, the method provided by this embodiment further includes:
107a, acquiring logical addresses of the plurality of data blocks;
107b, determining a physical address based on the storage section filled with the plurality of data blocks;
107c, establishing a mapping relation between the logical address and the physical address, and storing the mapping relation into a database.
In a specific implementation, the logical address is generally issued and transferred by the upper-layer operating system or application corresponding to the storage system; for example, when an upper-layer operating system intends to perform read/write operations on its storage system, it informs the storage system of the logical address to be read or written. Based on this, the logical addresses of the data blocks may be issued and transferred by the upper-layer operating system or application corresponding to the storage system provided in this embodiment, where the upper-layer operating system or application is integrated on the device corresponding to the writer. Specifically, referring to fig. 1, when a writer performs a write operation on the storage system shown in the figure through a corresponding device, that device issues the data to be written and the corresponding logical address to the storage volume 11 in the storage system; the storage volume 11 then divides the received data into a plurality of data blocks and sends them to the SSD for caching, so that the storage system performs subsequent processing based on the data blocks cached in the SSD (for the specific subsequent processing, refer to the related content above). It should be noted that the logical address is sent to the SSD by the storage volume 11 along with the data blocks; that is, besides the cached data blocks, the SSD also records their logical addresses. Accordingly, the logical addresses of the data blocks may be obtained from the SSD shown in fig. 1, so that, based on the obtained logical addresses and the physical addresses determined from the storage segment filled with the data blocks, a mapping relationship between logical and physical addresses can be established and maintained in the database, to facilitate subsequent data read operations performed according to the mapping.
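Steps 107a to 107c reduce to building a logical-to-physical table at flush time. A minimal sketch, under the simplifying assumption that each data block occupies one data area and that a physical address is a (segment id, offset) pair (a dict stands in for the database):

```python
def build_mappings(logical_addresses, segment_id, area_size):
    # Each buffered data block's logical address maps to the physical
    # location it was filled at inside the storage segment.
    return {lba: (segment_id, i * area_size)
            for i, lba in enumerate(logical_addresses)}
```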
The above mainly introduces, from the viewpoint of reducing the erasure code write penalty, the process of storing data into the storage disks. For the update-inconsistency problem that exists when erasure-coded data is updated, this embodiment stores updated data into the storage disks in the form of an append write; that is, data that a user overwrites at the same logical position is kept at different positions of the storage disks. In other words, when existing data is modified, a new storage segment is allocated to the modified data, so that the modified data is stored at a new position on the storage disks and does not occupy the disk positions used to store the data before modification. Therefore, if a sudden condition such as a system crash or power failure occurs while the data stored on the disks is being updated, the data that was previously stored successfully can still be recovered, ensuring that EC consistency of the on-disk data is always satisfied. A specific data update process is as follows:
referring to fig. 4 in combination with fig. 1, in the storage system provided in this embodiment, assume that when a user writes data 1 to logical address LBA1 for the first time, the storage system allocates an idle storage segment 0 to data 1 (different storage segments correspond to different physical addresses) and stores data 1 in a distributed manner in storage disk A1, storage disk B1 and storage disk C1 through storage segment 0; for how data 1 is distributed to these disks, refer to the related content above. When the user needs to update data 1 and write the updated data 1 (i.e., data 2) to logical address LBA1, the storage system allocates a new storage segment to data 2, for example storage segment 1 (whose stripes have a one-to-one correspondence with storage block group 1 shown in fig. 1). If a system crash, power failure, or the like occurs during the distributed storage of data 2 through storage segment 1, then, because the data of different stripes in a storage segment is distributed on different storage disks, it may happen that only some of the disks write successfully, so that the data in the whole of storage segment 1 is unusable. Even in this case, however, since a complete copy of data 1 remains in storage segment 0, the user can still read the complete data 1 previously stored on disk after the failure is recovered.
By contrast, if the conventional overwrite manner were adopted, that is, the same logical address LBA were always written into the same storage segment, then after a system crash or power failure the data in the storage segment might be partially overwritten by the new data, so that no intact copy of the data could be read. In summary, by adopting the append-write manner, two writes to the same LBA correspond to two different physical addresses; therefore, if the system fails (e.g., crashes or loses power) while data is being written to disk, the data that was successfully written last time is unaffected, which solves the update-inconsistency problem of erasure-coded data updates and improves the reliability of data in the system.
It should be noted that, with continued reference to fig. 4, if the storage system successfully completes the update of data 1 in the append-write manner, that is, the updated data 1 (i.e., data 2) is successfully stored in a distributed manner in storage disk A2, storage disk B3 and storage disk C3, then the distributed copies of the pre-update data (i.e., data 1) become garbage data, that is, invalid data; for example, the data related to data 1 stored in storage disk A1, storage disk B1 and storage disk C1 is marked as garbage data. In addition, after the update of data 1 is completed, the previously established mapping relationship between logical and physical addresses for data 1 is updated according to the position at which the updated data 1 (i.e., data 2) is filled in storage segment 1, and the old mapping relationship is cleared.
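The append-write update path (sometimes called redirect-on-write) can be summarized in a short sketch; the class and its bookkeeping are illustrative assumptions, but the ordering matters: the new copy is written first, the mapping is switched only afterwards, and only then is the old copy retired as garbage, so a failure mid-write never touches the last successfully written data:

```python
class AppendWriteStore:
    """Redirect-on-write sketch: every write to an LBA lands in a freshly
    allocated segment; the old copy is retired only after the remap."""

    def __init__(self):
        self.next_segment = 0
        self.mapping = {}     # logical address -> segment id
        self.segments = {}    # segment id -> data
        self.garbage = set()  # segments holding invalidated old copies

    def write(self, lba, data):
        seg = self.next_segment
        self.next_segment += 1
        self.segments[seg] = data   # new copy written to a new segment first
        old = self.mapping.get(lba)
        self.mapping[lba] = seg     # then the mapping is switched
        if old is not None:
            self.garbage.add(old)   # old copy becomes garbage data

    def read(self, lba):
        return self.segments[self.mapping[lba]]
```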
According to the technical scheme provided by this embodiment, after a plurality of data blocks are obtained, an idle storage segment is determined; the storage segment comprises a plurality of stripes, and different stripes correspond to different storage blocks. Corresponding contents are filled into part of the stripes based on the data blocks; check data is determined from the data blocks and filled into the remaining stripes; and then, according to the correspondence between stripes and storage blocks, the filled storage segment is stored in a distributed manner, so that the stripes in the storage segment are stored into their corresponding storage blocks. This scheme reduces the write penalty and avoids the random-write performance degradation caused by writing small data directly to disk; in addition, because this embodiment updates data in the append-write manner, the update-inconsistency problem in the erasure code data update process is effectively solved.
Referring to fig. 5, a schematic structural diagram of a memory system according to another embodiment of the present application is shown. As shown in fig. 5, the storage system specifically includes:
a storage pool 20 comprising a plurality of storage disks having a plurality of storage blocks;
a distributed storage module 21 for acquiring a plurality of data blocks; when the number of the data blocks reaches a set value, determining an idle storage section; the storage section comprises a plurality of strips, one strip is determined according to the segmentation result of one storage block, and different strips and different storage blocks have corresponding relations; the set value represents the number of data blocks required for filling the storage section; filling corresponding contents in partial stripes of the plurality of stripes based on the plurality of data blocks; determining check data according to the data blocks; filling the check data into the rest stripes of the plurality of stripes; and according to the corresponding relation between the strips and the storage blocks, performing distributed storage on the filled storage sections so as to store the strips in the storage sections into the corresponding storage blocks respectively.
In specific implementation, the specific form of the storage disk may be, but is not limited to: storage media such as a Hard Disk Drive (HDD), a Solid State Drive (SSD), a Storage Class Memory (SCM), and a Shingled Magnetic Recording (SMR); the storage disk in the storage pool is divided into a plurality of storage blocks according to the designated capacity.
Further, the storage system provided in this embodiment further includes:
the storage pool management module 22 is configured to organize storage blocks belonging to different storage nodes and different storage disks according to erasure code types to obtain storage block groups;
the distributed storage module 21 is configured to allocate an idle storage block group when the number of the acquired data blocks reaches a set value; and determining the storage section according to the storage block group.
Further, the storage system provided in this embodiment further includes:
the storage volume 23, configured to receive data sent by a writer together with the logical address corresponding to the data, and to divide the received data into data blocks of the same size;
the cache module 24, configured to cache the data blocks divided by the storage volume so that the distributed storage module can acquire the plurality of data blocks; to record the logical addresses corresponding to the data blocks so that an address mapping module can acquire them; and to determine the logical address corresponding to each data block from the logical address corresponding to the data.
In a specific implementation, the cache module may be a Solid State Disk (SSD) or take another form, which is not limited here.
Further, the storage system provided in this embodiment further includes:
an address mapping module 25, configured to acquire the logical addresses corresponding to the plurality of data blocks from the cache module 24, acquire the physical addresses determined for the plurality of data blocks by the distributed storage module, and establish and store the mapping relationship between the logical addresses and the physical addresses.
It should be noted here that, in addition to the modules above (the storage pool, the distributed storage module, the storage pool management module, the storage volume, the cache module, and the address mapping module), the storage system provided in this embodiment may further include a data log module contained in the cache module (not shown in the figure). The data log module receives the divided data blocks sent by the storage volume and forwards them to the cache module (for example, a Solid State Disk (SSD)) for caching; that is, after the storage volume divides the data received from the writer into data blocks of the same size, it may first send the divided data blocks to the data log module, which then hands them to the cache module for safekeeping. Arranging a data log module here both avoids the reduced random-write performance caused by writing small data blocks directly to disk and prevents data loss when the system goes down (for example, crashes or loses power) before the data has been written to the storage disk.
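The role of the data log module can be sketched as an acknowledged, replayable staging area; the class and method names here are invented for the illustration:

```python
# Small writes are appended to a log in the cache and acknowledged at once;
# an entry is cleared only after the "write success" message arrives, so
# entries still present after a crash can simply be replayed.

class DataLog:
    def __init__(self):
        self._entries = {}                  # seq -> (lba, data block)

    def append(self, seq, lba, block):
        self._entries[seq] = (lba, block)   # persisted in the cache, then acked

    def clear(self, seq):
        self._entries.pop(seq, None)        # on the write-success message

    def replay(self):
        """Entries that never reached the storage disks, oldest first."""
        return sorted(self._entries.items())

log = DataLog()
log.append(1, lba=0x100, block=b"aaaa")
log.append(2, lba=0x200, block=b"bbbb")
log.clear(1)   # block 1 reached the disks; only block 2 would be replayed
```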
It should be further noted that, for the content of each step in the storage system provided in this embodiment that is not described in detail, reference may be made to the corresponding content in the foregoing embodiments, which is not repeated here. In addition, the storage system provided in this embodiment may further include some or all of the other steps in the foregoing embodiments; for details, reference may likewise be made to the corresponding content above.
In summary, the data processing method provided by the embodiments of the present application can be summarized as the following process. Namely:
(1) A plurality of data blocks written to a logical address LBA are sent to the cache module via the storage volume; the plurality of data blocks belong to the same data.
(2) The cache module sends the plurality of data blocks to the space allocation unit. If the space allocation unit currently has no storage section pending a disk write, it first allocates a free storage section, applies in memory for a storage section space to hold the data blocks awaiting the disk write, and fills the plurality of data blocks into the in-memory data space of the storage section; if such a pending storage section already exists, the data blocks are filled directly into its in-memory data space; and once the pending storage section is full, the filled storage section is stored to the corresponding disks.
(3) The space allocation unit derives the physical address addr of each data block from its filling position and returns the physical addresses to the address mapping module.
(4) The address mapping module establishes the mapping relationship between the physical address addr and the logical address LBA, stores the mapping relationship in the database, and returns a data-write-success message to the data log module.
(5) When the data log module receives the data-write-success message, it clears the corresponding data stored in the cache module.
(6) If the user continues to write a new data block at the same logical address LBA (for example, when updating data that has already been successfully written to disk, the updated data block is again written at the same LBA), the processes in steps (1) to (5) are repeated. When step (4) is executed, however, the address mapping module updates the mapping between the physical address addr and the logical address LBA, stores the updated mapping in the database, and at the same time clears the mapping between the old physical address addr and the LBA.
As the steps above show, two writes to the same LBA correspond to two different physical addresses. If the system crashes while data is being written to the storage disk, data that was previously written successfully is therefore unaffected, which ensures the consistency of the EC stripes and improves data reliability.
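The redirect-on-write property described in this step list can be sketched as follows; the allocation counter and the class below are stand-ins for the space allocation unit and the address mapping module, not their actual implementations:

```python
import itertools

# Every write of an LBA lands at a fresh physical address; the mapping is
# swapped only after the new data is durable, so the old copy stays intact
# until the swap, and a crash mid-write never disturbs committed data.

_addrs = itertools.count(1)

def alloc_addr():
    """Stand-in for the fill position chosen by the space allocation unit."""
    return next(_addrs)

class AddressMap:
    def __init__(self):
        self._map = {}                # LBA -> physical address

    def commit(self, lba, new_addr):
        old = self._map.get(lba)      # old mapping, if this is an overwrite
        self._map[lba] = new_addr     # after this point the old address is garbage
        return old

    def lookup(self, lba):
        return self._map[lba]

amap = AddressMap()
first = alloc_addr()
amap.commit(0x100, first)             # initial write of LBA 0x100
second = alloc_addr()
old = amap.commit(0x100, second)      # overwrite: fresh address, old one freed
```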
Here, it should be noted that: the space allocation unit may correspond to the distributed storage module shown in fig. 5, and the data log module is included in the cache module shown in fig. 5. For the content that is not described in detail in the above steps, reference may be made to the corresponding content in the above embodiments, and details are not described here.
Here, it should be further explained that: the technical solution provided by the embodiment of the present application is applicable to any adaptive storage system, and the embodiment of the present application does not limit a specific storage system.
Fig. 6 shows a block diagram of a data processing apparatus according to an embodiment of the present application. As shown in fig. 6, the data processing apparatus specifically includes:
a first obtaining module 501, configured to obtain a plurality of data blocks;
a first determining module 502, configured to determine a free storage section when the number of the data blocks reaches a set value; the storage section comprises a plurality of stripes, one stripe is determined from the division result of one storage block, and different stripes correspond to different storage blocks; the set value represents the number of data blocks required to fill the storage section;
a first filling module 503, configured to fill corresponding contents into some of the plurality of stripes based on the plurality of data blocks;
a second determining module 504, configured to determine check data from the plurality of data blocks;
a second filling module 505, configured to fill the check data into the remaining stripes of the plurality of stripes;
and a storing module 506, configured to perform distributed storage on the filled storage section according to the correspondence between the stripes and the storage blocks, so that the contents in the plurality of stripes in the storage section are each stored into their corresponding storage blocks.
According to the technical solution provided by this embodiment, after the number of acquired data blocks reaches a set value, a free storage section is determined, where the storage section comprises a plurality of stripes, one stripe is determined from the division result of one storage block, different stripes correspond to different storage blocks, and the set value represents the number of data blocks required to fill the storage section. Corresponding contents are then filled into some of the stripes based on the plurality of data blocks; check data is determined from the data blocks and filled into the remaining stripes; and the filled storage section is stored in a distributed manner according to the correspondence between stripes and storage blocks, so that the stripes in the storage section are each stored into their corresponding storage blocks. This solution reduces write penalty and solves the problem of inconsistent data updates, and also avoids the reduced random-write performance caused by writing small data directly to disk.
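Under the stated scheme, filling a storage section can be sketched with a single XOR parity stripe standing in for the erasure code (a real deployment would use, for example, Reed-Solomon coding with m check stripes; the block sizes and contents here are illustrative):

```python
# Fill the k data stripes of a free storage section, derive one check stripe
# by XOR (a stand-in for a real erasure code), and return all stripes; the
# i-th stripe would then be written to the i-th block of the block group.

def fill_segment(data_blocks):
    """data_blocks: list of equal-sized bytes objects, one per data stripe."""
    size = len(data_blocks[0])
    assert all(len(b) == size for b in data_blocks)
    parity = bytearray(size)
    for block in data_blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte               # XOR across all data stripes
    return data_blocks + [bytes(parity)]    # k data stripes + 1 check stripe

stripes = fill_segment([b"\x01\x02", b"\x04\x08", b"\x10\x20"])
# Any single lost stripe can be rebuilt by XOR-ing the remaining ones.
```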
Further, the apparatus provided in this embodiment further includes:
the storage pool module, which comprises a plurality of storage nodes, where one storage node comprises a plurality of storage disks and one storage disk comprises a plurality of storage blocks;
the organizing module, configured to organize storage blocks belonging to different storage nodes and different storage disks according to the erasure code type, to obtain a storage block group;
wherein the plurality of stripes contained in the storage section correspond one to one with the plurality of storage blocks contained in the storage block group; and a storage block is divided to obtain a plurality of data areas, the plurality of data areas forming one stripe.
Further, the first determining module 502, when determining a free storage section, is specifically configured to: allocate a free storage block group when the number of acquired data blocks reaches the set value, and determine the storage section from the storage block group.
Furthermore, the plurality of stripes contained in the storage section include stripes of a first type and stripes of a second type; the first-type stripes store the plurality of data blocks, and the second-type stripes store the check data.
Still further, the storage section contains at least one first-type stripe, and the first-type stripe comprises a stripe header area and a data area. Accordingly,
the first filling module 503, when filling corresponding contents into some of the plurality of stripes based on the plurality of data blocks, is specifically configured to: sequentially fill the plurality of data blocks into corresponding data areas of the at least one first-type stripe; and determine, based on the data blocks in the data area of a first-type stripe, the stripe header information to be filled into the stripe header area of that stripe.
Still further, the stripe header information includes: a magic number and a checksum.
Still further, one stripe of the at least one first-type stripe further comprises a segment header area. Accordingly,
the first filling module 503, when filling corresponding contents into some of the plurality of stripes based on the plurality of data blocks, is further specifically configured to: determine segment header information based on the plurality of data blocks, and fill the segment header information into the segment header area.
Still further, the segment header information includes: a magic number, a checksum, a version identification, the number of data blocks, and description information of each data block.
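One possible on-disk encoding of these segment header fields is sketched below; the magic value, field widths, and byte order are assumptions, since the embodiment does not fix a layout, and the per-block description information is omitted for brevity:

```python
import struct
import zlib

MAGIC = 0x5345474D   # hypothetical magic number for a segment header

def pack_segment_header(version: int, blocks) -> bytes:
    """Pack checksum, magic number, version identification, and block count."""
    body = struct.pack("<IHI", MAGIC, version, len(blocks))
    checksum = zlib.crc32(body + b"".join(blocks))   # covers header + data
    return struct.pack("<I", checksum) + body

header = pack_segment_header(1, [b"abc", b"def"])
checksum, magic, version, count = struct.unpack("<IIHI", header)
```

On read-back, the stored checksum is recomputed over the same bytes; a mismatch or a wrong magic number marks the segment as invalid.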
Further, the second determining module 504, when determining the check data from the plurality of data blocks, is specifically configured to: encode the plurality of data blocks, and determine the check data from the result of the encoding.
Further, the apparatus provided in this embodiment further includes:
a second obtaining module, configured to obtain logical addresses of the multiple data blocks;
a third determining module, configured to determine physical addresses based on the storage section filled with the plurality of data blocks;
and the establishing module is used for establishing the mapping relation between the logical address and the physical address and storing the mapping relation into a database.
Further, the first obtaining module 501, when acquiring a plurality of data blocks, is specifically configured to: receive data sent by a writer, and divide the received data into data blocks of the same size.
Here, it should be noted that: the data processing apparatus provided in this embodiment may implement the technical solution described in the data processing method embodiment shown in fig. 2, and the specific implementation principle of each module or unit may refer to the corresponding content in the data processing method embodiment shown in fig. 2, which is not described herein again.
FIG. 7 is a schematic structural diagram of a storage device according to an embodiment of the present application. As shown in fig. 7, the storage device includes a memory 601 and a processor 602. The memory 601 may be configured to store various other data to support operations on the storage device; examples of such data include instructions for any application or method operating on the storage device. The memory 601 may be implemented by any type of volatile or non-volatile memory device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk.
The memory 601 for storing one or more computer instructions;
the processor 602, coupled to the memory 601, is configured to execute the one or more computer instructions stored in the memory 601 to implement the data processing method provided by the foregoing method embodiments.
Further, as shown in fig. 7, the storage device may also include other components such as a communication component 603, a display 604, a power component 605, and an audio component 606. Only some of the components are shown schematically in fig. 7, which does not mean that the storage device includes only the components shown.
Accordingly, the present application further provides a computer-readable storage medium storing a computer program, where the computer program can implement the steps or functions of the data processing method provided in the foregoing embodiments when executed by a computer.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (13)

1. A data processing method, comprising:
acquiring a plurality of data blocks;
when the number of the data blocks reaches a set value, determining a free storage section; the storage section comprises a plurality of stripes, one stripe is determined from the division result of one storage block, and different stripes correspond to different storage blocks; the set value represents the number of data blocks required to fill the storage section;
filling corresponding contents into some of the plurality of stripes based on the plurality of data blocks;
determining check data according to the plurality of data blocks;
filling the check data into the remaining stripes of the plurality of stripes;
and performing distributed storage on the filled storage section according to the correspondence between the stripes and the storage blocks, so that the contents in the plurality of stripes in the storage section are respectively stored into the corresponding storage blocks.
2. The method of claim 1,
the storage pool comprises a plurality of storage nodes, one storage node comprises a plurality of storage disks, and one storage disk comprises a plurality of storage blocks;
organizing storage blocks belonging to different storage nodes and different storage disks according to erasure code types to obtain storage block groups;
the storage blocks are arranged in the storage section, wherein a plurality of strips contained in the storage section correspond to a plurality of storage blocks contained in the storage block group one by one; and segmenting the storage block to obtain a plurality of data areas, wherein the plurality of data areas form a strip.
3. The method of claim 2, wherein determining a free deposit segment comprises:
allocating a free memory block group;
and determining the storage section according to the storage block group.
4. The method of claim 3, wherein the plurality of stripes contained in the storage section comprise a first type of stripe and a second type of stripe; wherein,
the first type stripe is used for storing the plurality of data blocks;
the second type stripe is used for storing the check data.
5. The method of claim 4, wherein the storage section contains at least one of the first type of stripe, the first type of stripe comprising a stripe header region and a data region; and
based on the plurality of data blocks, filling corresponding contents in partial stripes of the plurality of stripes, including:
sequentially filling the plurality of data blocks into corresponding data areas of the at least one first type stripe;
determining, based on the data blocks in the data area of the first type stripe, the stripe header information filled into the stripe header area of the first type stripe; wherein the stripe header information comprises: a magic number and a checksum.
6. The method of claim 5, wherein one stripe of the at least one first type of stripe further comprises a segment header region; and
based on the plurality of data blocks, filling corresponding contents in partial stripes of the plurality of stripes, further comprising:
determining segment header information based on the plurality of data blocks;
filling the segment header information into the segment header area;
wherein the segment header information includes: magic number, checksum, version identification, data block number and description information of each data block.
7. The method of any one of claims 1 to 6, further comprising:
acquiring logical addresses of the plurality of data blocks;
determining a physical address based on the storage section filled with the plurality of data blocks;
and establishing a mapping relation between the logical address and the physical address, and storing the mapping relation into a database.
8. The method of any one of claims 1 to 6, wherein obtaining a plurality of data blocks comprises:
receiving data sent by a writer;
the received data is divided into data blocks of the same size.
9. A storage system, comprising:
a storage pool comprising a plurality of storage disks, a storage disk having a plurality of storage blocks;
the distributed storage module is used for acquiring a plurality of data blocks; when the number of the data blocks reaches a set value, determining a free storage section; the storage section comprises a plurality of stripes, one stripe is determined from the division result of one storage block, and different stripes correspond to different storage blocks; the set value represents the number of data blocks required to fill the storage section; filling corresponding contents into some of the plurality of stripes based on the plurality of data blocks; determining check data from the plurality of data blocks; filling the check data into the remaining stripes of the plurality of stripes; and performing distributed storage on the filled storage section according to the correspondence between the stripes and the storage blocks, so that the contents in the plurality of stripes in the storage section are respectively stored into the corresponding storage blocks.
10. The system of claim 9, further comprising:
the storage pool management module is used for organizing storage blocks which belong to different storage nodes and different storage disks according to the erasure code type to obtain a storage block group;
the distributed storage module is used for allocating an idle storage block group when the number of the acquired data blocks reaches a set value; and determining the storage section according to the storage block group.
11. The system of claim 9 or 10, further comprising:
the storage volume is used for receiving data sent by a writer and a logic address corresponding to the data; dividing received data into data blocks with the same size;
the cache module is used for caching the data blocks divided by the storage volume so that the distributed storage module can acquire the plurality of data blocks; recording the logical addresses corresponding to the data blocks so that an address mapping module can acquire the logical addresses corresponding to the data blocks; and determining the logical address corresponding to each data block according to the logical address corresponding to the data.
12. The system of claim 11, further comprising:
the address mapping module is used for acquiring the logic addresses corresponding to the data blocks from the cache module; acquiring the physical addresses determined by the distributed storage module for the plurality of data blocks; and establishing and storing the mapping relation between the logical address and the physical address.
13. A storage device, comprising: a memory and a processor; the memory is used for storing one or more computer instructions which, when executed by the processor, are capable of implementing the steps of the data processing method of any of the preceding claims 1-8.
CN202110495606.7A 2021-05-07 2021-05-07 Data processing method, storage system and storage device Active CN113176858B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110495606.7A CN113176858B (en) 2021-05-07 2021-05-07 Data processing method, storage system and storage device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110495606.7A CN113176858B (en) 2021-05-07 2021-05-07 Data processing method, storage system and storage device

Publications (2)

Publication Number Publication Date
CN113176858A true CN113176858A (en) 2021-07-27
CN113176858B CN113176858B (en) 2022-12-13

Family

ID=76928450

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110495606.7A Active CN113176858B (en) 2021-05-07 2021-05-07 Data processing method, storage system and storage device

Country Status (1)

Country Link
CN (1) CN113176858B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114301575A (en) * 2021-12-21 2022-04-08 阿里巴巴(中国)有限公司 Data processing method, system, device and medium
CN114995770A (en) * 2022-08-02 2022-09-02 苏州浪潮智能科技有限公司 Data processing method, device, equipment, system and readable storage medium
CN115391093A (en) * 2022-08-18 2022-11-25 江苏安超云软件有限公司 Data processing method and system
CN115599315A (en) * 2022-12-14 2023-01-13 阿里巴巴(中国)有限公司(Cn) Data processing method, device, system, equipment and medium
WO2023029624A1 (en) * 2021-09-03 2023-03-09 华为技术有限公司 Storage block collection method and related apparatus
CN117149094A (en) * 2023-10-30 2023-12-01 苏州元脑智能科技有限公司 Method and device for determining data area state, disk array and storage system

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020161972A1 (en) * 2001-04-30 2002-10-31 Talagala Nisha D. Data storage array employing block checksums and dynamic striping
JP2003296038A (en) * 2002-03-21 2003-10-17 Network Appliance Inc Method for writing continuous arrays of stripes in raid storage system
CN102722340A (en) * 2012-04-27 2012-10-10 华为技术有限公司 Data processing method, apparatus and system
CN105677249A (en) * 2016-01-04 2016-06-15 浙江宇视科技有限公司 Data block partitioning method, device and system
CN105930500A (en) * 2016-05-06 2016-09-07 华为技术有限公司 Transaction recovery method in database system, and database management system
CN110399310A (en) * 2018-04-18 2019-11-01 杭州宏杉科技股份有限公司 A kind of recovery method and device of memory space
CN112019788A (en) * 2020-08-27 2020-12-01 杭州海康威视系统技术有限公司 Data storage method, device, system and storage medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020161972A1 (en) * 2001-04-30 2002-10-31 Talagala Nisha D. Data storage array employing block checksums and dynamic striping
JP2003296038A (en) * 2002-03-21 2003-10-17 Network Appliance Inc Method for writing continuous arrays of stripes in raid storage system
CN102722340A (en) * 2012-04-27 2012-10-10 华为技术有限公司 Data processing method, apparatus and system
CN105677249A (en) * 2016-01-04 2016-06-15 浙江宇视科技有限公司 Data block partitioning method, device and system
CN105930500A (en) * 2016-05-06 2016-09-07 华为技术有限公司 Transaction recovery method in database system, and database management system
CN110399310A (en) * 2018-04-18 2019-11-01 杭州宏杉科技股份有限公司 A kind of recovery method and device of memory space
CN112019788A (en) * 2020-08-27 2020-12-01 杭州海康威视系统技术有限公司 Data storage method, device, system and storage medium

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023029624A1 (en) * 2021-09-03 2023-03-09 华为技术有限公司 Storage block collection method and related apparatus
CN114301575A (en) * 2021-12-21 2022-04-08 阿里巴巴(中国)有限公司 Data processing method, system, device and medium
WO2023116141A1 (en) * 2021-12-21 2023-06-29 阿里巴巴(中国)有限公司 Data processing method, system and device, and medium
CN114301575B (en) * 2021-12-21 2024-03-29 阿里巴巴(中国)有限公司 Data processing method, system, equipment and medium
CN114995770A (en) * 2022-08-02 2022-09-02 苏州浪潮智能科技有限公司 Data processing method, device, equipment, system and readable storage medium
CN114995770B (en) * 2022-08-02 2022-12-27 苏州浪潮智能科技有限公司 Data processing method, device, equipment, system and readable storage medium
CN115391093A (en) * 2022-08-18 2022-11-25 江苏安超云软件有限公司 Data processing method and system
CN115391093B (en) * 2022-08-18 2024-01-02 江苏安超云软件有限公司 Data processing method and system
CN115599315A (en) * 2022-12-14 2023-01-13 阿里巴巴(中国)有限公司(Cn) Data processing method, device, system, equipment and medium
CN115599315B (en) * 2022-12-14 2023-04-07 阿里巴巴(中国)有限公司 Data processing method, device, system, equipment and medium
CN117149094A (en) * 2023-10-30 2023-12-01 苏州元脑智能科技有限公司 Method and device for determining data area state, disk array and storage system
CN117149094B (en) * 2023-10-30 2024-02-09 苏州元脑智能科技有限公司 Method and device for determining data area state, disk array and storage system

Also Published As

Publication number Publication date
CN113176858B (en) 2022-12-13

Similar Documents

Publication Publication Date Title
CN113176858B (en) Data processing method, storage system and storage device
US11487619B2 (en) Distributed storage system
US10977124B2 (en) Distributed storage system, data storage method, and software program
US8972779B2 (en) Method of calculating parity in asymetric clustering file system
US20230013281A1 (en) Storage space optimization in a system with varying data redundancy schemes
US11074129B2 (en) Erasure coded data shards containing multiple data objects
CN112889034A (en) Erase coding of content driven distribution of data blocks
US10996894B2 (en) Application storage segmentation reallocation
US11301137B2 (en) Storage system and data arrangement method of storage system
US20190243553A1 (en) Storage system, computer-readable recording medium, and control method for system
US11449402B2 (en) Handling of offline storage disk
US20190347165A1 (en) Apparatus and method for recovering distributed file system
CN112749039A (en) Method, apparatus and program product for data writing and data recovery
US11507278B2 (en) Proactive copy in a storage environment
CN112115001A (en) Data backup method and device, computer storage medium and electronic equipment
US11481275B2 (en) Managing reconstruction of a malfunctioning disk slice
CN111124746A (en) Method, apparatus and computer readable medium for managing redundant arrays of independent disks
US11544005B2 (en) Storage system and processing method
US20230236932A1 (en) Storage system
JP6605762B2 (en) Device for restoring data lost due to storage drive failure
CN117806528A (en) Data storage method and device
CN115391093A (en) Data processing method and system
CN117311600A (en) Storage system including a plurality of solid state drives and management method thereof
CN116414294A (en) Method, device and equipment for generating block group
CN114115735A (en) Method and device for writing data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant