CN114579062B - Disk optimization method and device based on distributed storage system

Info

Publication number
CN114579062B
Authority
CN
China
Prior art keywords
physical block
target area
main process
standby
thick
Legal status (assumed; not a legal conclusion)
Active
Application number
CN202210465590.XA
Other languages
Chinese (zh)
Other versions
CN114579062A (en)
Inventor
周磊
文刘飞
陈坚
Current Assignee (the listed assignees may be inaccurate)
Shenzhen Sandstone Data Technology Co ltd
Original Assignee
Shenzhen Sandstone Data Technology Co ltd
Application filed by Shenzhen Sandstone Data Technology Co ltd filed Critical Shenzhen Sandstone Data Technology Co ltd
Priority to CN202210465590.XA
Publication of CN114579062A
Application granted
Publication of CN114579062B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G06F3/0625 Power saving in storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G06F3/0631 Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G06F3/0644 Management of space entities, e.g. partitions, extents, pools
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention relates to a disk optimization method and device based on a distributed storage system. When a WRITE_SAME request command is received and the back-end management main process and back-end management standby process have been determined, only the WRITE_SAME command and the template parameters are transmitted to the back-end management standby process, which avoids the bandwidth occupation caused by expanding the template before transmission and reduces network loss. Meanwhile, a bitmap structure is used as metadata to record the data-written state of each disk fragment, so that when thick provisioning performs a zero-fill operation, only the bitmap is modified as a placeholder, avoiding the consumption of writing zero data to the disk and achieving lightweight persistence.

Description

Disk optimization method and device based on distributed storage system
Technical Field
The embodiment of the invention relates to the technical field of distributed storage systems, in particular to a disk optimization method and device based on a distributed storage system.
Background
With the rise of big data and cloud computing technologies, the demand for storage capacity is growing rapidly, and the storage capacity of organizations such as enterprises is progressing from PB toward ZB, where 1 PB = 1024 TB, 1 EB = 1024 PB, and 1 ZB = 1024 EB. Traditional storage is gradually unable to meet the needs of the new era because it is difficult to expand and expensive per unit of capacity. Therefore, the market share of distributed storage systems is gradually increasing. Virtualized storage software such as VMware, which integrates computing, network and storage virtualization technologies together with automation and management functions, supports enterprises in innovating their infrastructure, delivering and managing automated IT services, and running new cloud-native and microservice-based applications; it gives the data center the agility and economy of a cloud service provider and extends to flexible hybrid cloud environments.
However, when VMware creates a virtual-machine-specific storage disk with the thick provision eager zeroed attribute set, a large amount of zero data is flushed to disk, which increases write latency; there is thus room for optimization.
Disclosure of Invention
The embodiments of the present application aim to provide a disk optimization method and device based on a distributed storage system, so as to improve the configuration efficiency of thick-provisioned disks.
In a first aspect, an embodiment of the present application provides a disk optimization method based on a distributed storage system, where the method includes:
determining a back-end management main process and a back-end management standby process in the distributed storage system according to the received WRITE_SAME request command; the WRITE_SAME request command comprises a WRITE_SAME command, a WRITE_SAME template and template parameters;
performing main process target area division on a main process disk space, and performing standby process target area division on a standby process disk space; the main process disk space corresponds to the back-end management main process, and the standby process disk space corresponds to the back-end management standby process;
the WRITE_SAME template is expanded through a main process disk management layer of the back-end management main process, and the WRITE_SAME command and the template parameters are sent to the back-end management standby process through the back-end management main process, so that a standby process disk management layer of the back-end management standby process can expand the WRITE_SAME template according to the template parameters;
allocating a main process physical block based on the main process target area through the main process disk management layer, modifying the bitmap bit of the thick provisioning interval identification table of the main process physical block to 1, and returning after the persistence of the thick provisioning interval identification table of the main process physical block is completed; and allocating a standby process physical block based on the standby process target area through the standby process disk management layer, modifying the bitmap bit of the thick provisioning interval identification table of the standby process physical block to 1, and returning after the persistence of the thick provisioning interval identification table of the standby process physical block is completed.
In some embodiments, the allocating, by the main process disk management layer, a main process physical block based on the main process target area comprises:
judging whether an object of the main process target area exists or not;
if the object of the main process target area exists, detecting whether a main process physical block of the main process target area exists;
and if the main process physical block of the main process target area does not exist, allocating the main process physical block.
In some embodiments, after the judging whether the object of the main process target area exists, the method further comprises:
and if the object of the main process target area does not exist, allocating the object of the main process target area.
In some embodiments, the method further comprises:
when a data writing operation instruction is received, clearing the thick provisioning interval identification table of the physical block to be written corresponding to the target area to be written, updating the bitmap bit of the data identification table of the physical block to be written to 1, and returning after the data is written into the target area to be written; the data writing operation instruction corresponds to the target area to be written.
In some embodiments, the clearing of the thick provisioning interval identification table of the physical block to be written corresponding to the target area to be written includes:
judging whether the object of the physical block to be written exists;
if the object of the physical block to be written exists, judging whether the target area to be written has been allocated the physical block to be written;
and if the target area to be written has been allocated the physical block to be written, modifying the bitmap bit of the thick provisioning interval identification table of the physical block to be written to 0.
In some embodiments, the modifying of the bitmap bit of the thick provisioning interval identification table of the physical block to be written to 0 includes:
and if the bitmap bit of the thick provisioning interval identification table of the physical block to be written is 1, modifying the bitmap bit of the thick provisioning interval identification table of the physical block to be written to 0.
In some embodiments, the method further comprises:
and when a data reading operation instruction is received, constructing a data structure according to the record of the thick provisioning interval identification table of the physical block to be read and returning it, wherein the physical block to be read corresponds to the data reading operation instruction.
In some embodiments, the constructing of a data structure according to the record of the thick provisioning interval identification table of the physical block to be read and returning includes:
acquiring the physical block to be read corresponding to the target area to be read according to the data reading operation instruction;
judging whether the target area to be read has been allocated the physical block to be read;
if the target area to be read has been allocated the physical block to be read, checking the bitmap bit of the thick provisioning interval identification table of the physical block to be read;
if the bitmap bit of the thick provisioning interval identification table of the physical block to be read is 1, constructing null data corresponding to the length of the physical block to be read;
if the bitmap bit of the thick provisioning interval identification table of the physical block to be read is 0, retrieving the bitmap bit of the data identification table of the physical block to be read;
if the bitmap bit of the data identification table of the physical block to be read is 0, constructing null data corresponding to the length of the physical block to be read;
and if the bitmap bit of the data identification table of the physical block to be read is 1, reading the data from the target area to be read.
In some embodiments, the method further comprises:
when a space recovery instruction is received, judging whether the object of the target area to be recovered corresponding to the space recovery instruction exists;
if the object of the target area to be recovered exists, judging whether the physical blocks to be recovered of the target area to be recovered have been allocated;
if the physical blocks to be recovered of the target area to be recovered have been allocated, checking whether the target area to be recovered is completely covered by the physical blocks to be recovered;
if the target area to be recovered is not completely covered by the physical blocks to be recovered, determining the partially covered physical blocks in the target area to be recovered;
judging whether the bitmap bits of the thick provisioning interval identification table and the bitmap bits of the data identification table of the partially covered physical blocks are all 0;
and if they are all 0, deleting the partially covered physical blocks whose thick provisioning interval identification table bitmap bits and data identification table bitmap bits are all 0, so as to reclaim the physical space of these physical blocks.
In a second aspect, an embodiment of the present application further provides a disk optimization apparatus based on a distributed storage system, where the apparatus includes:
a determining module, configured to determine, according to the received WRITE_SAME request command, the back-end management main process and the back-end management standby process in the distributed storage system; the WRITE_SAME request command comprises a WRITE_SAME command, a WRITE_SAME template and template parameters;
a dividing module, configured to perform main process target area division on the main process disk space and standby process target area division on the standby process disk space; the main process disk space corresponds to the back-end management main process, and the standby process disk space corresponds to the back-end management standby process;
a template expansion module, configured to expand the WRITE_SAME template through the main process disk management layer of the back-end management main process, and send the WRITE_SAME command and the template parameters to the back-end management standby process through the back-end management main process, so that the standby process disk management layer of the back-end management standby process expands the WRITE_SAME template according to the template parameters;
a persistence module, configured to allocate a main process physical block based on the main process target area through the main process disk management layer, modify the bitmap bit of the thick provisioning interval identification table of the main process physical block to 1, and return after the persistence of the thick provisioning interval identification table of the main process physical block is completed; and to allocate a standby process physical block based on the standby process target area through the standby process disk management layer, modify the bitmap bit of the thick provisioning interval identification table of the standby process physical block to 1, and return after the persistence of the thick provisioning interval identification table of the standby process physical block is completed.
In a third aspect, an embodiment of the present application further provides a server device, where the server device includes at least one processor, and a memory, where the memory is communicatively connected to the at least one processor, and the memory stores instructions executable by the at least one processor, where the instructions are executed by the at least one processor to enable the at least one processor to perform the method described above.
In a fourth aspect, embodiments of the present application further provide a non-transitory computer-readable storage medium storing computer-executable instructions that, when executed by a server device, cause the server device to perform the method as described above.
In a fifth aspect, the present application also provides a computer program product, which includes a computer program stored on a non-volatile computer-readable storage medium, the computer program including program instructions, which, when executed by a server device, cause the server device to perform the above-mentioned method.
Compared with the prior art, the present application has at least the following beneficial effects: according to the disk optimization method and device based on the distributed storage system, when a WRITE_SAME request command is received and the back-end management main process and the back-end management standby process have been determined, only the WRITE_SAME command and the template parameters are transmitted to the back-end management standby process, which avoids the bandwidth occupation caused by expanding the template before transmission, reduces network loss, and achieves lightweight persistence.
Drawings
One or more embodiments are illustrated by way of example in the accompanying drawings, in which like reference numerals refer to similar elements; the figures are not drawn to scale unless otherwise specified.
FIG. 1 is an architecture diagram of one embodiment of a distributed storage system of the present application;
FIG. 2 is a schematic flow chart diagram illustrating an embodiment of a disk optimization method based on a distributed storage system according to the present application;
FIG. 3 is a diagram of the logical relationship between objects and physical blocks in the present application;
FIG. 4 is a schematic flow chart diagram illustrating a disk optimization method based on a distributed storage system according to another embodiment of the present application;
FIG. 5 is a schematic structural diagram of an embodiment of a disk optimization apparatus based on a distributed storage system according to the present application;
fig. 6 is a schematic hardware structure diagram of a controller according to an embodiment of the server apparatus of the present application.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but do not limit the invention in any way. It should be noted that variations and modifications can be made by persons skilled in the art without departing from the spirit of the invention, and all such variations fall within the scope of the present invention.
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It should be noted that, where no conflict arises, the various features of the embodiments of the invention may be combined with each other within the protection scope of the present application. Additionally, although functional modules are divided in the device schematics and logical sequences are shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the module division or the order in the flowcharts. Further, the terms "first", "second", "third", and the like used herein do not limit the data or the execution order, but merely distinguish identical or similar items having substantially the same functions and effects.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
In addition, the technical features involved in the respective embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
A distributed storage system uses a plurality of independent servers to form a cluster for storing data. Each server in the cluster acts as a node, and multiple processes run on each node to manage multiple physical disks on the server. To ensure high reliability of the data, several identical copies of the data are made, each stored on a different node, so that data loss caused by a single node failure is avoided. To ensure that data can be quickly recovered in the event of a node failure, the data of the distributed storage system is distributed uniformly, for example across racks, so the continuous data seen by a client may be dispersed throughout the distributed storage system.
As shown in fig. 1, fig. 1 is a business model architecture diagram of a distributed storage system. The distributed storage system integrates storage resources, combining multiple storage servers into a unified managed resource pool; it comprises a front-end service interface process module, a distributed management process module and a back-end storage process module, uses the core algorithm of distributed data consistency to calculate the back-end storage node on which service data is to be stored, and stores the data.
Referring to fig. 2, a schematic flowchart of an embodiment of a disk optimization method based on a distributed storage system, which may be executed by a controller of the distributed storage system, is shown, where the method includes steps S101 to S104.
S101: determining a back-end management main process and a back-end management standby process in the distributed storage system according to the received WRITE _ SAME request command; the WRITE _ SAME request command includes a WRITE _ SAME command, a WRITE _ SAME template, and template parameters.
In the present application, the disk space of the storage is allocated and managed in units of fixed-length blocks; a fixed-length block comprises a plurality of physical blocks, each physical block is provided with a thick provisioning interval identification table, and whether a physical block within a fixed-length block holds data is represented by the bit value 1 or 0 of a bitmap.
Specifically, in a server cluster of a distributed storage system, each server acts as a node, and multiple processes run on each node to manage multiple physical disks on the server. The disk space of each physical disk is allocated and managed in units of fixed-length blocks, as shown in fig. 3, where a fixed-length block includes a plurality of physical blocks.
The object refers to the basic logical unit of data persistence in the distributed storage system; at the same time, the object serves as the basic unit of operations (creation, deletion and modification).
The persistence of the distributed storage system refers to the operation of writing data to a disk by a distributed back-end service process.
The physical block (chunk) is configured as the fixed-length block of disk management, a continuous disk space of a certain length, which may be 256K, 512K, 1024K, or the like; one physical block (chunk) is composed of a plurality of disk fragment blocks.
The disk fragment block is the disk logic management unit of an object. A group of a certain number of disk fragments forms a physical block, and since the physical block is set as a fixed-length block, a fixed number of disk fragment blocks form a physical block (chunk) of fixed size. A disk fragment block may be 4K, 8K, 16K and so on, where 1K = 1024 bytes, the byte being the unit for measuring storage capacity. The data state of each disk fragment block is represented by a bitmap structure.
Each bit of the bitmap structure has two states, 1 and 0; one bit corresponds to one disk fragment block, 1 indicates that the disk fragment block holds data, and 0 indicates that the disk fragment block holds no data.
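To make the layout concrete, the following is a minimal C sketch of this metadata, assuming a 256K chunk and 4K fragments; all names (chunk_meta, thick_bm, data_bm, the bm_* helpers) are illustrative and not taken from the patent:

    #include <stdint.h>

    /* Illustrative sizes; the text above allows 256K/512K/1024K chunks and 4K/8K/16K fragments. */
    #define CHUNK_SIZE      (256 * 1024)              /* one physical block (chunk) */
    #define FRAG_SIZE       (4 * 1024)                /* one disk fragment block    */
    #define FRAGS_PER_CHUNK (CHUNK_SIZE / FRAG_SIZE)  /* 64 fragments per chunk     */

    /* Per-chunk metadata: one bit per fragment in each table. */
    struct chunk_meta {
        uint8_t thick_bm[FRAGS_PER_CHUNK / 8];  /* thick provisioning interval identification table */
        uint8_t data_bm[FRAGS_PER_CHUNK / 8];   /* data identification table                        */
    };

    static inline void bm_set(uint8_t *bm, int frag)        { bm[frag / 8] |= (uint8_t)(1u << (frag % 8)); }
    static inline void bm_clear(uint8_t *bm, int frag)      { bm[frag / 8] &= (uint8_t)~(1u << (frag % 8)); }
    static inline int  bm_test(const uint8_t *bm, int frag) { return (bm[frag / 8] >> (frag % 8)) & 1; }

With 64 fragments per chunk, each table is only 8 bytes, which is why persisting a bitmap is far cheaper than writing the zero data itself.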
When the distributed storage system performs persistence, a WRITE_SAME request command is issued by the VMware virtual machine; the WRITE_SAME request command comprises a WRITE_SAME command, a WRITE_SAME template and template parameters. The WRITE_SAME request command is used, when the VMware virtual machine issues a thick provisioning operation, to transfer a template and designate a target storage area, so that the specified target area is filled with the same data as the template of the VMware virtual machine.
It can be understood that in the WRITE_SAME request command, the template parameters are (off <start point>, len <total length>, template_data <template data>), where the template data is a 512-byte string; when it is processed in the background, the template data is copied and expanded into a string of length len.
For example, if off = 0, len = 4K, and template_data is a 512-byte string of "B", then 4K / 512 = 8 is taken as the total number of copies during expansion, and the template data template_data is appended to the destination buffer once per cycle until the loop completes, thereby realizing the expansion of the WRITE_SAME template, as in the sketch below.
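A minimal sketch of this expansion, assuming len is a multiple of the 512-byte template (the function name and buffer handling are illustrative, not the patent's implementation):

    #include <stdlib.h>
    #include <string.h>

    #define TEMPLATE_LEN 512  /* the WRITE_SAME template is a 512-byte string */

    /* Expand a WRITE_SAME template into a freshly allocated buffer of length len.
     * For off = 0, len = 4K: 4K / 512 = 8 copies of the template. */
    static char *expand_write_same(const char template_data[TEMPLATE_LEN], size_t len)
    {
        char *dst = malloc(len);
        if (dst == NULL)
            return NULL;
        size_t copies = len / TEMPLATE_LEN;  /* total number of copies in the loop */
        for (size_t i = 0; i < copies; i++)
            memcpy(dst + i * TEMPLATE_LEN, template_data, TEMPLATE_LEN);
        return dst;
    }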
After receiving the WRITE_SAME request command, the distributed back-end service process determines the back-end management main process and the back-end management standby process through the consistency algorithm of the distributed system.
S102: performing main process target area division on a main process disk space, and performing standby process target area division on a standby process disk space; the main process disk space corresponds to the back-end management main process, and the standby process disk space corresponds to the back-end management standby process.
The front-end service process of the distributed storage system receives the WRITE_SAME request command, and the distributed management process module performs main process target area division on the stored main process disk space according to the received WRITE_SAME request command and performs standby process target area division on the standby process disk space, wherein the main process disk space corresponds to the back-end management main process, and the standby process disk space corresponds to the back-end management standby process.
It can be understood that the main process disk space and the standby process disk space are similar to the disk space described above, being allocated and managed in units of fixed-length blocks.
S103: the WRITE_SAME template is expanded through the main process disk management layer of the back-end management main process, and the WRITE_SAME command and the template parameters are sent to the back-end management standby process through the back-end management main process, so that the standby process disk management layer of the back-end management standby process expands the WRITE_SAME template according to the template parameters.
S104: allocating a main process physical block based on the main process target area through the main process disk management layer, modifying the bitmap bit of the thick provisioning interval identification table of the main process physical block to 1, and returning after the persistence of the thick provisioning interval identification table of the main process physical block is completed; and allocating a standby process physical block based on the standby process target area through the standby process disk management layer, modifying the bitmap bit of the thick provisioning interval identification table of the standby process physical block to 1, and returning after the persistence of the thick provisioning interval identification table of the standby process physical block is completed.
VMware Workstation is the most common personal-edition virtualization software; based on features such as Windows-environment installation, a convenient settings page and the flexible VMware Tools utility, it offers many operational conveniences for engineering. Its virtual machine function provides several options when configuring a disk, including three attributes: thin provision, thick provision lazy zeroed, and thick provision eager zeroed.
Thin provisioning means that the space occupied by the disk is calculated according to the actual usage amount at creation time; that is, only as much space as is actually used is taken, space is not allocated in advance, the space reserved for the disk is not zeroed, and the disk does not exceed its configured maximum size.
Thick provision lazy zeroed is the default creation format; when the disk is created, its space is allocated from the disk in full, but the data remaining in the reserved space is not zeroed at creation time, so a zeroing operation has to be performed when an I/O operation first touches a region.
Thick provision eager zeroed creates a disk that supports clustering features; when the disk is created, its space is allocated from the disk in full and the reserved space is zeroed immediately, so I/O operations can execute directly without waiting for zeroing.
Thick provision lazy zeroed and thick provision eager zeroed are both thick provisioning attributes, which may also simply be called thick provisioning; every piece of space that thick provisioning allocates to the disk is usable space, which guarantees that all of the space can be fully written.
As shown in fig. 4, fig. 4 is the processing flow chart after a WRITE_SAME request command is received. After the back-end management main process and the back-end management standby process are determined, because the WRITE_SAME request command defaults to the thick provisioning attribute, the WRITE_SAME template is expanded by the main process disk management layer of the back-end management main process, and the WRITE_SAME command and the template parameters are sent by the back-end management main process to the back-end management standby process, so that the standby process disk management layer of the back-end management standby process expands the WRITE_SAME template according to the template parameters.
Further, when the WRITE_SAME request command is issued, the back-end management main process passes the WRITE_SAME request command to its service processing layer; the service processing layer constructs a transaction based on the WRITE_SAME request command and sends it to the main process disk management layer corresponding to the back-end management main process and to the back-end management standby process.
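As a rough illustration of why transmitting only the command and template parameters saves bandwidth, the message crossing the network need only carry something like the following; this wire format is an assumption for illustration, not the patent's actual protocol:

    #include <stdint.h>

    /* Hypothetical wire format: the expanded buffer (len bytes) never crosses the network. */
    struct write_same_msg {
        uint32_t cmd;                 /* WRITE_SAME command code              */
        uint64_t off;                 /* start point of the target area       */
        uint64_t len;                 /* total length to fill after expansion */
        uint8_t  template_data[512];  /* 512-byte template                    */
    };
    /* For len = 1 MiB this message stays at roughly 532 bytes instead of 1 MiB. */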
After receiving the WRITE_SAME request command, the main process disk management layer expands the WRITE_SAME template in the WRITE_SAME request command, allocates a main process physical block based on the main process target area, modifies the bitmap bit of the thick provisioning interval identification table thick_bm of the main process physical block to 1, and returns after the thick provisioning interval identification table thick_bm of the main process physical block has been persisted.
In some embodiments, allocating, by the main process disk management layer, a main process physical block based on the main process target area may include:
judging whether an object of the main process target area exists or not;
if the object of the main process target area exists, detecting whether a main process physical block of the main process target area exists;
and if the main process physical block of the main process target area does not exist, allocating the main process physical block.
Specifically, after the main process disk management layer expands the WRITE_SAME template, it judges whether the object of the main process target area exists. If the object of the main process target area exists, the object does not need to be allocated; at this time, it is detected whether a main process physical block of the main process target area exists. If the main process physical block exists, the thick provisioning interval identification table thick_bm of the main process physical block is updated; during the update, the bitmap bit of the thick provisioning interval identification table thick_bm of the main process physical block is modified to 1 to indicate that the main process physical block holds data, thereby avoiding the disk consumption caused by actually writing zero data.
On the contrary, if the object of the main process target area does not exist, an object is first allocated for the main process target area.
If the main process physical block of the main process target area does not exist, a main process physical block is allocated for the main process target area, and after the main process physical block is allocated, the thick provisioning interval identification table thick_bm of the main process physical block is updated.
After the thick provisioning interval identification table thick_bm of the main process physical block has been persisted, the disk persistence operation of the back-end management main process is complete; once the persistence result of the back-end management standby process has been returned, a unified response is returned to the client. Since the write consumption of zero data on the disk is avoided, lightweight persistence of the disk of the back-end management main process is achieved.
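The zero-fill fast path described above can be sketched as follows, reusing the hypothetical chunk_meta and bm_set helpers from the earlier sketch; persist_meta() is a stand-in for whatever persistence mechanism the disk management layer uses, and the target area is assumed to fall within one chunk:

    extern int persist_meta(struct chunk_meta *meta);  /* hypothetical persistence hook */

    /* Thick provisioning fast path: instead of writing len zero bytes,
     * only the thick_bm bits covering the target area are set and persisted. */
    int thick_provision_area(struct chunk_meta *meta, uint64_t off, uint64_t len)
    {
        int first = (int)(off / FRAG_SIZE);
        int last  = (int)((off + len - 1) / FRAG_SIZE);
        for (int frag = first; frag <= last; frag++)
            bm_set(meta->thick_bm, frag);  /* mark fragment as thick-provisioned */
        return persist_meta(meta);         /* persist only the small bitmap      */
    }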
Correspondingly, when the service processing layer sends the WRITE_SAME request command to the back-end management standby process, only the WRITE_SAME command and the template parameters in the WRITE_SAME request command are sent. When the back-end management standby process receives the WRITE_SAME command and the template parameters, the service processing layer of the back-end management standby process constructs a transaction without expanding the WRITE_SAME template and sends the WRITE_SAME command and the template parameters to the standby process disk management layer corresponding to the back-end management standby process; the standby process disk management layer then expands the WRITE_SAME template according to the template parameters.
The standby process disk management layer allocates a standby process physical block based on the standby process target area, modifies the bitmap bit of the thick provisioning interval identification table thick_bm of the standby process physical block to 1, and returns after the thick provisioning interval identification table thick_bm of the standby process physical block has been persisted.
In some embodiments, allocating, by the standby process disk management layer, a standby process physical block based on the standby process target area may include:
judging whether an object of the standby process target area exists or not;
if the object of the standby process target area exists, detecting whether a standby process physical block of the standby process target area exists;
and if the standby process physical block of the standby process target area does not exist, allocating the standby process physical block.
Specifically, after the standby process disk management layer expands the WRITE_SAME template, it judges whether the object of the standby process target area exists. If the object of the standby process target area exists, the object does not need to be allocated; at this time, it is detected whether a standby process physical block of the standby process target area exists. If the standby process physical block exists, the thick provisioning interval identification table thick_bm of the standby process physical block is updated; during the update, the bitmap bit of the thick provisioning interval identification table thick_bm of the standby process physical block is modified to 1 to indicate that the standby process physical block holds data, thereby avoiding the disk consumption caused by actually writing zero data.
On the contrary, if the object of the standby process target area does not exist, an object is first allocated for the standby process target area.
If the standby process physical block of the standby process target area does not exist, a standby process physical block is allocated for the standby process target area, and after the standby process physical block is allocated, the thick provisioning interval identification table thick_bm of the standby process physical block is updated.
After the thick provisioning interval identification table thick_bm of the standby process physical block has been persisted, the disk persistence operation of the back-end management standby process is complete; success is then returned to the back-end management main process, and the back-end management main process returns a unified response to the client. Since the write consumption of zero data on the disk is avoided, lightweight persistence of the disk of the back-end management standby process is achieved.
In other words, when the distributed storage system performs persistence, only the bitmap bit of the thick provisioning interval identification table of the physical block is modified as a placeholder instead of actually writing zero data into the physical block, which avoids the corresponding write consumption of the disk.
Moreover, before transmitting to the back-end management standby process, the back-end management main process does not need to expand the WRITE_SAME template and then transmit the expanded data; only the WRITE_SAME command and the template parameters are transmitted, and template expansion is performed only once the command reaches the main process disk management layer of the back-end management main process or the standby process disk management layer of the back-end management standby process, so the bandwidth occupied by network transmission can be effectively reduced and the network loss is reduced.
According to the embodiment of the present application, when a WRITE_SAME request command is received and the back-end management main process and back-end management standby process are determined, only the WRITE_SAME command and the template parameters are transmitted to the back-end management standby process, which avoids the bandwidth occupation caused by expanding the template before transmission, reduces network loss, and achieves lightweight persistence.
In some of these embodiments, the method further comprises:
when a data writing operation instruction is received, clearing the thick provisioning interval identification table of the physical block to be written corresponding to the target area to be written, updating the bitmap bit of the data identification table of the physical block to be written to 1, and returning after the data is written into the target area to be written; the data writing operation instruction corresponds to the target area to be written.
Specifically, in order to ensure the data consistency of each disk space of the distributed storage system, the client may perform a write operation on the distributed storage system through the VMware virtual machine. When receiving a data writing operation instruction, the distributed storage system first confirms the physical block to be written that is allocated to the target area to be written corresponding to the data writing operation instruction, then clears the thick provisioning interval identification table thick_bm of the physical block to be written, updates the bitmap bit of the data identification table data_bm of the physical block to be written to 1, then writes the data carried by the data writing operation instruction into the target area to be written, and returns.
In some embodiments, the clearing of the thick provisioning interval identification table of the physical block to be written corresponding to the target area to be written may include:
judging whether the object of the physical block to be written exists;
if the object of the physical block to be written exists, judging whether the target area to be written has been allocated the physical block to be written;
and if the target area to be written has been allocated the physical block to be written, modifying the bitmap bit of the thick provisioning interval identification table of the physical block to be written to 0.
When the thick provisioning interval identification table thick_bm of the physical block to be written corresponding to the target area to be written is cleared, it is first judged whether the object of the physical block to be written exists; if not, the object is created. If it exists, it is judged whether the physical block to be written has been allocated in the target area to be written. If the physical block to be written has been allocated, the bitmap bit of the thick provisioning interval identification table of the physical block to be written is changed to 0, so that the thick provisioning interval identification table of the physical block to be written corresponding to the target area to be written is cleared; in addition, an update mark of the thick provisioning interval identification table thick_bm is recorded on the physical block to be written. Otherwise, no record is made.
It should be noted that if the bitmap bit of the thick provisioning interval identification table of the physical block to be written is 1, it is modified to 0; if it is already 0, no modification is required.
Correspondingly, if the target area to be written has not been allocated the physical block to be written, the physical block to be written is allocated in the target area to be written; after the physical block to be written is allocated, all the bitmap bits of the thick provisioning interval identification table of the physical block to be written are changed from 1 to 0, so that the thick provisioning interval identification table of the physical block to be written corresponding to the target area to be written is cleared.
When the bitmap bit of the data identification table data_bm of the physical block to be written is updated to 1, the physical block is identified as holding written data. Specifically: check whether the bitmap bit of the data identification table data_bm of the physical block to be written is 1; if so, no update is needed; if not, the bitmap bit is updated to 1 and an update mark of the data identification table data_bm is recorded.
Writing data into the target area to be written means actually writing the data carried by the data writing operation instruction into the target area to be written, completing the data writing operation.
After the writing is completed, it may be determined whether the thick provisioning interval identification table thick_bm of the physical block to be written carries an update mark; if yes, the thick provisioning interval identification table thick_bm of the physical block to be written is persisted.
Correspondingly, it may be judged whether the data identification table data_bm of the physical block to be written carries an update mark; if yes, the data identification table data_bm of the physical block to be written is persisted.
After the thick provisioning interval identification table thick_bm and the data identification table data_bm of the physical block to be written have been persisted, the write operation is complete and a response is returned to the back-end management main process.
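The write path above can be sketched as follows, again with the hypothetical helpers from the earlier sketches; write_fragments() stands in for the actual disk write:

    extern int write_fragments(uint64_t off, const void *buf, uint64_t len);  /* hypothetical raw write */

    /* Write path: clear thick_bm over the target area, set data_bm, write the
     * payload, and persist the tables only if they were actually updated. */
    int write_area(struct chunk_meta *meta, uint64_t off, const void *buf, uint64_t len)
    {
        int thick_updated = 0, data_updated = 0;
        int first = (int)(off / FRAG_SIZE);
        int last  = (int)((off + len - 1) / FRAG_SIZE);

        for (int frag = first; frag <= last; frag++) {
            if (bm_test(meta->thick_bm, frag)) {   /* clear the thick mark if set      */
                bm_clear(meta->thick_bm, frag);
                thick_updated = 1;                 /* record the update mark           */
            }
            if (!bm_test(meta->data_bm, frag)) {   /* mark the fragment as having data */
                bm_set(meta->data_bm, frag);
                data_updated = 1;
            }
        }
        if (write_fragments(off, buf, len) != 0)   /* the actual data write            */
            return -1;
        if ((thick_updated || data_updated) && persist_meta(meta) != 0)
            return -1;                             /* persist only the updated tables  */
        return 0;
    }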
In some of these embodiments, the method further comprises:
and when a data reading operation instruction is received, constructing a data structure according to the record of the thick provisioning interval identification table of the physical block to be read and returning it, wherein the physical block to be read corresponds to the data reading operation instruction.
In order to ensure the data consistency of the distributed storage system, the client may read from the distributed storage system through the VMware virtual machine; when the distributed storage system receives a data reading operation instruction, it constructs a data structure according to the record of the thick provisioning interval identification table of the physical block to be read corresponding to the data reading operation instruction, and returns it.
In some embodiments, the constructing of a data structure according to the record of the thick provisioning interval identification table of the physical block to be read and returning may include:
acquiring the physical block to be read corresponding to the target area to be read according to the data reading operation instruction;
judging whether the target area to be read has been allocated the physical block to be read;
if the target area to be read has been allocated the physical block to be read, checking the bitmap bit of the thick provisioning interval identification table of the physical block to be read;
if the bitmap bit of the thick provisioning interval identification table of the physical block to be read is 1, constructing null data corresponding to the length of the physical block to be read;
if the bitmap bit of the thick provisioning interval identification table of the physical block to be read is 0, retrieving the bitmap bit of the data identification table of the physical block to be read;
and if the bitmap bit of the data identification table of the physical block to be read is 0, constructing null data corresponding to the length of the physical block to be read.
Specifically, when receiving a data reading operation instruction, the disk management module determines the target area to be read corresponding to the data reading operation instruction, then loads the physical blocks to be read corresponding to the object of the target area to be read, traverses the physical blocks to be read distributed across the disks of the distributed storage system, and judges whether the target area to be read has been allocated the physical block to be read. If the target area to be read has been allocated the physical block to be read, the bitmap bit of the thick provisioning interval identification table thick_bm of the physical block to be read is checked. If the bitmap bit of the thick provisioning interval identification table thick_bm of the physical block to be read is 1, null data corresponding to the length of the physical block to be read is constructed. If the bitmap bit of the thick provisioning interval identification table thick_bm of the physical block to be read is 0, indicating that the physical block to be read may store data, the bitmap bit of the data identification table data_bm of the physical block to be read is retrieved. If the bitmap bit of the data identification table data_bm of the physical block to be read is 0, null data corresponding to the length of the physical block to be read is constructed; if it is 1, indicating that data exists on the physical block to be read corresponding to the target area to be read, the data is read from the target area to be read.
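A sketch of this read decision per fragment, with read_fragment() standing in for the real disk read; as above, the helper names are illustrative:

    #include <string.h>

    extern int read_fragment(int frag, void *buf);  /* hypothetical raw read of one fragment */

    /* Read path: a fragment whose thick_bm bit is 1, or whose data_bm bit is 0,
     * reads back as zeros without touching the disk. */
    int read_one_fragment(const struct chunk_meta *meta, int frag, void *buf)
    {
        if (bm_test(meta->thick_bm, frag) || !bm_test(meta->data_bm, frag)) {
            memset(buf, 0, FRAG_SIZE);    /* construct null data of fragment length */
            return 0;
        }
        return read_fragment(frag, buf);  /* real data exists: read from disk */
    }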
In some embodiments, in order to ensure the data consistency of the distributed storage system, after persistence, data writing or data reading has been performed in the distributed storage system, there may be space that is no longer needed; in this case a space reclamation process is required, and the method further includes:
when a space recovery instruction is received, judging whether the object of the target area to be recovered corresponding to the space recovery instruction exists;
if the object of the target area to be recovered exists, judging whether the physical blocks to be recovered of the target area to be recovered have been allocated;
if the physical blocks to be recovered of the target area to be recovered have been allocated, checking whether the target area to be recovered is completely covered by the physical blocks to be recovered;
if the target area to be recovered is not completely covered by the physical blocks to be recovered, determining the partially covered physical blocks;
judging whether the bitmap bits of the thick provisioning interval identification table and the bitmap bits of the data identification table of the partially covered physical blocks are all 0;
and if they are all 0, deleting the partially covered physical blocks whose thick provisioning interval identification table bitmap bits and data identification table bitmap bits are all 0, so as to reclaim the physical space of these physical blocks.
Specifically, when a space recovery instruction is received, it is judged whether the object of the target area to be recovered corresponding to the space recovery instruction exists; if the object of the target area to be recovered exists, it is judged whether the physical blocks to be recovered of the target area to be recovered have been allocated; correspondingly, if the object of the target area to be recovered does not exist, the process returns.
If the physical blocks to be recovered of the target area to be recovered have been allocated, it is checked whether the target area to be recovered is completely covered by the physical blocks to be recovered; correspondingly, if the physical blocks to be recovered of the target area to be recovered have not been allocated, the process returns.
If the target area to be recovered is not completely covered by the physical blocks to be recovered, the partially covered physical blocks in the target area to be recovered are determined; the partially covered physical blocks are the physical blocks in the target area to be recovered other than the completely covered physical blocks to be recovered. Then the bitmap bits of the thick provisioning interval identification table thick_bm and of the data identification table data_bm of each partially covered physical block are obtained. If a bitmap bit of the thick provisioning interval identification table thick_bm is not 0, it is set to 0 and the update mark of the thick provisioning interval identification table thick_bm is recorded on the partially covered physical block; if all bitmap bits of the thick provisioning interval identification table thick_bm are found to be 0, the first mark thick_bm_is_zero is recorded on the partially covered physical block.
Correspondingly, if a bitmap bit of the data identification table data_bm is not 0, it is set to 0 and the update mark of the data identification table data_bm is recorded on the partially covered physical block; if all bitmap bits of the data identification table data_bm are found to be 0, the second mark data_bm_is_zero is recorded on the partially covered physical block.
When a partially covered physical block carries both the first mark thick_bm_is_zero and the second mark data_bm_is_zero, all the bitmap bits of its thick provisioning interval identification table thick_bm and of its data identification table data_bm are 0, and such a physical block is deleted, so that the physical space of the partially covered physical block is reclaimed.
Correspondingly, if the target area to be recovered is completely covered by the physical blocks to be recovered, all the completely covered physical blocks to be recovered are deleted, and their physical space is reclaimed.
After the space recovery is finished, a response is returned to the back-end management main process, as sketched below.
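The reclamation of a partially covered physical block can be sketched as follows; bm_all_zero() and free_chunk() are stand-ins for the checks and deletion described above:

    extern int free_chunk(struct chunk_meta *meta);  /* hypothetical chunk deletion */

    static int bm_all_zero(const uint8_t *bm, int nbits)
    {
        for (int i = 0; i < nbits / 8; i++)
            if (bm[i] != 0)
                return 0;
        return 1;
    }

    /* Reclaim sketch: clear the bits the reclaim area covers; if both tables end
     * up all zero (thick_bm_is_zero and data_bm_is_zero), delete the chunk. */
    int reclaim_partial(struct chunk_meta *meta, int first_frag, int last_frag)
    {
        for (int frag = first_frag; frag <= last_frag; frag++) {
            bm_clear(meta->thick_bm, frag);
            bm_clear(meta->data_bm, frag);
        }
        if (bm_all_zero(meta->thick_bm, FRAGS_PER_CHUNK) &&
            bm_all_zero(meta->data_bm, FRAGS_PER_CHUNK))
            return free_chunk(meta);  /* physical space of the block is reclaimed */
        return persist_meta(meta);
    }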
Referring to fig. 5, it shows a structure of a disk optimization apparatus based on a distributed storage system according to an embodiment of the present application, where the disk optimization apparatus 500 based on a distributed storage system includes:
a determining module 501, configured to determine, according to the received WRITE_SAME request command, the back-end management main process and the back-end management standby process in the distributed storage system; the WRITE_SAME request command comprises a WRITE_SAME command, a WRITE_SAME template and template parameters;
a dividing module 502, configured to perform main process target area division on the main process disk space and standby process target area division on the standby process disk space; the main process disk space corresponds to the back-end management main process, and the standby process disk space corresponds to the back-end management standby process;
a template expansion module 503, configured to expand the WRITE_SAME template through the main process disk management layer of the back-end management main process, and send the WRITE_SAME command and the template parameters to the back-end management standby process through the back-end management main process, so that the standby process disk management layer of the back-end management standby process expands the WRITE_SAME template according to the template parameters;
a persistence module 504, configured to allocate a main process physical block based on the main process target area through the main process disk management layer, modify the bitmap bit of the thick provisioning interval identification table of the main process physical block to 1, and return after the persistence of the thick provisioning interval identification table of the main process physical block is completed; and to allocate a standby process physical block based on the standby process target area through the standby process disk management layer, modify the bitmap bit of the thick provisioning interval identification table of the standby process physical block to 1, and return after the persistence of the thick provisioning interval identification table of the standby process physical block is completed.
According to this embodiment of the application, when the WRITE_SAME request command is received and the back-end management main process and the back-end management standby process have been determined, only the WRITE_SAME command and the template parameters are transmitted to the back-end management standby process; the bandwidth occupation that expanding the template before transmission would cause is thus avoided, network overhead is reduced, and lightweight persistence is achieved.
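As a rough illustration of this lightweight flow, the Python sketch below assumes the WRITE_SAME pattern can be rebuilt from the template parameters alone; pattern_byte, block_size and num_blocks are invented parameter names, send_to_standby and persist stand in for the real transport and persistence layers, and block is assumed to expose a thick_bm bit list as in the reclamation sketch above.

```python
# Sketch of lightweight WRITE_SAME handling: the main process marks the
# thick standby bitmap, persists only that bitmap, and forwards just the
# command and parameters; the expanded template never crosses the network.

def expand_write_same(params: dict) -> bytes:
    # Either side can rebuild the repeated pattern locally from the parameters.
    return bytes([params["pattern_byte"]]) * (params["block_size"] * params["num_blocks"])

def handle_write_same(block, idx_range, params, send_to_standby, persist):
    for i in idx_range:
        block.thick_bm[i] = 1                    # mark thick standby interval; no data written
    persist(block.thick_bm)                      # lightweight persistence of the bitmap only
    send_to_standby(("WRITE_SAME", params))      # command and parameters only

# On receiving ("WRITE_SAME", params), the standby process runs the same
# expansion and bitmap-marking logic against its own disk space.
```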
In some embodiments, the persistence module 504 is further configured to:
determining whether an object of the main process target area exists;
if the object of the main process target area exists, detecting whether a main process physical block of the main process target area exists;
and if no main process physical block of the main process target area exists, allocating the main process physical block.
In some embodiments, the persistence module 504 is further configured to:
and if the object of the main process target area does not exist, allocating the object of the main process target area.
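As a compact illustration of these allocate-on-demand checks, the sketch below models objects and blocks as hypothetical dictionaries keyed by target area, with allocate_object and allocate_block standing in for the real allocators.

```python
def ensure_allocated(area, objects, blocks, allocate_object, allocate_block):
    """Allocate the object of the main process target area if it is missing,
    then the main process physical block if that is missing too."""
    if area not in objects:
        objects[area] = allocate_object(area)   # object absent: allocate it first
    if area not in blocks:
        blocks[area] = allocate_block(area)     # physical block absent: allocate it
    return blocks[area]
```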
In some embodiments, the disk optimization apparatus 500 further includes a writing module 505, configured to:
when a data write operation instruction is received, clearing the thick standby interval identification table of the physical block to be written corresponding to the target area to be written, updating the bitmap bit of the data identification table of the physical block to be written to 1, and returning after the data is written to the target area to be written; the data write operation instruction corresponds to the target area to be written.
In some embodiments, the write module 505 is further configured to:
determining whether the object of the physical block to be written exists;
if the object of the physical block to be written exists, determining whether a physical block to be written has been allocated to the target area to be written;
and if a physical block to be written has been allocated to the target area to be written, modifying the bitmap bit of the thick standby interval identification table of the physical block to be written to 0.
In some embodiments, the write module 505 is further configured to:
and if the bitmap bit of the thick standby interval identification table of the physical block to be written is 1, modifying the bitmap bit of the thick standby interval identification table of the physical block to be written to 0.
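A short sketch of this write path is given below; it reuses the PhysicalBlock shape from the reclamation sketch above, and target_area is a hypothetical stand-in for the on-disk region.

```python
def write_data(block, idx, payload: bytes, target_area: dict):
    """Write-path sketch: clean a set thick standby bit, record real data in
    the data identification table, then write the payload and return."""
    if block.thick_bm[idx] == 1:
        block.thick_bm[idx] = 0    # clean the thick standby interval mark
    block.data_bm[idx] = 1         # bitmap bit of data_bm updated to 1
    target_area[idx] = payload     # data written to the target area
```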
In some embodiments, the disk optimization apparatus 500 further includes a reading module 506, configured to:
when a data read operation instruction is received, constructing a data structure according to the recorded state of the thick standby interval identification table of the physical block to be read and then returning; the physical block to be read corresponds to the data read operation instruction.
In some embodiments, the reading module 506 is further configured to:
acquiring the physical block to be read corresponding to the target area to be read according to the data read operation instruction;
determining whether a physical block to be read has been allocated to the target area to be read;
if a physical block to be read has been allocated to the target area to be read, checking the bitmap bit of the thick standby interval identification table of the physical block to be read;
if the bitmap bit of the thick standby interval identification table of the physical block to be read is 1, constructing null data corresponding to the length of the physical block to be read;
if the bitmap bit of the thick standby interval identification table of the physical block to be read is 0, retrieving the bitmap bit of the data identification table of the physical block to be read;
if the bitmap bit of the data identification table of the physical block to be read is 0, constructing null data corresponding to the length of the physical block to be read;
and if the bitmap bit of the data identification table of the physical block to be read is 1, reading the data from the target area to be read.
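These read-path decisions reduce to a small amount of branching, sketched below with the same hypothetical PhysicalBlock shape; block_len and target_area are stand-ins, and returning null data for an unallocated block is an assumption, since that branch is not spelled out above.

```python
def read_data(block, idx, block_len: int, target_area: dict) -> bytes:
    """Read-path sketch: a set thick_bm bit or an unset data_bm bit yields
    null data of the block length without touching the disk."""
    if block is None:                 # no physical block allocated (assumed branch)
        return b"\x00" * block_len
    if block.thick_bm[idx] == 1:      # thick standby interval: never written
        return b"\x00" * block_len
    if block.data_bm[idx] == 0:       # no data recorded for this bit
        return b"\x00" * block_len
    return target_area[idx]           # real data: read from the target area
```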
In some embodiments, the disk optimization apparatus 500 based on a distributed storage system further includes a space reclamation module 507, configured to:
when a space reclamation instruction is received, determining whether an object of the target area to be reclaimed corresponding to the space reclamation instruction exists;
if the object of the target area to be reclaimed exists, determining whether physical blocks to be reclaimed have been allocated to the target area to be reclaimed;
if physical blocks to be reclaimed have been allocated to the target area to be reclaimed, checking whether the target area to be reclaimed is completely covered by the physical blocks to be reclaimed;
if the target area to be reclaimed is not completely covered by the physical blocks to be reclaimed, determining the covered physical blocks in the target area to be reclaimed that are not completely covered;
determining whether the bitmap bits of the thick standby interval identification table and of the data identification table of the covered physical block are all 0;
and if they are all 0, deleting the covered physical block whose bitmap bits of the thick standby interval identification table and of the data identification table are all 0, so as to reclaim the physical space of the covered physical block.
It should be noted that the above-mentioned apparatus can execute the method provided by the embodiments of the present application, and has the corresponding functional modules and beneficial effects for executing the method. For technical details not described in detail in the apparatus embodiments, reference may be made to the methods provided in the embodiments of the present application.
Fig. 6 is a schematic diagram of the hardware structure of a controller in an embodiment of a server device. As shown in Fig. 6, the controller includes:
one or more processors 111 and a memory 112; Fig. 6 takes one processor 111 and one memory 112 as an example.
The processor 111 and the memory 112 may be connected by a bus or in another manner; Fig. 6 takes connection by a bus as an example.
The memory 112, as a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the program instructions/modules corresponding to the disk optimization method based on the distributed storage system in the embodiments of the present application (for example, the determining module 501, the dividing module 502, the template expansion module 503, the persistence module 504, the writing module 505, the reading module 506, and the space reclamation module 507 shown in Fig. 5). By running the non-volatile software programs, instructions, and modules stored in the memory 112, the processor 111 executes the various functional applications and data processing of the controller, that is, implements the disk optimization method based on the distributed storage system of the above method embodiments.
The memory 112 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to the use of the disk optimization apparatus based on the distributed storage system, and the like. Further, the memory 112 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the memory 112 optionally includes memory located remotely from the processor 111; such remote memory may be connected to the apparatus through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The one or more modules are stored in the memory 112 and, when executed by the one or more processors 111, perform the disk optimization method based on the distributed storage system in any of the above method embodiments, for example, perform the above-described method steps S101 to S104 in Fig. 2 and implement the functions of modules 501 to 507 in Fig. 5.
This product can execute the method provided by the embodiments of the present application, and has the corresponding functional modules and beneficial effects for executing the method. For technical details not described in detail in this embodiment, reference may be made to the methods provided in the embodiments of the present application.
The present application provides a non-transitory computer-readable storage medium storing computer-executable instructions which, when executed by one or more processors (for example, one processor 111 in Fig. 6), enable the one or more processors to perform the disk optimization method based on the distributed storage system in any of the above method embodiments, for example, to perform the above-described method steps S101 to S104 in Fig. 2 and implement the functions of modules 501 to 507 in Fig. 5.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
Through the above description of the embodiments, those skilled in the art will clearly understand that the embodiments may be implemented by software plus a general hardware platform, or by hardware. Those skilled in the art will also understand that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing related hardware; the program can be stored in a computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it. Within the idea of the invention, technical features of the above embodiments or of different embodiments may be combined, steps may be implemented in any order, and many other variations of the different aspects of the invention exist that are not described in detail for the sake of brevity. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced, without making the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (12)

1. A disk optimization method based on a distributed storage system is characterized by comprising the following steps:
determining a back-end management main process and a back-end management standby process in the distributed storage system according to the received WRITE_SAME request command; the WRITE_SAME request command comprises a WRITE_SAME command, a WRITE_SAME template and template parameters;
performing main process target area division on a main process disk space, and performing standby process target area division on a standby process disk space; the main process disk space corresponds to the back-end management main process, and the standby process disk space corresponds to the back-end management standby process;
expanding the WRITE_SAME template through a main process disk management layer of the back-end management main process, and sending the WRITE_SAME command and the template parameters to the back-end management standby process through the back-end management main process, so that a standby process disk management layer of the back-end management standby process expands the WRITE_SAME template according to the template parameters;
allocating a main process physical block based on the main process target area through the main process disk management layer, modifying the bitmap bit of the thick standby interval identification table of the main process physical block to 1, and returning after the persistence of the thick standby interval identification table of the main process physical block is completed; and allocating a standby process physical block based on the standby process target area through the standby process disk management layer, modifying the bitmap bit of the thick standby interval identification table of the standby process physical block to 1, and returning after the persistence of the thick standby interval identification table of the standby process physical block is completed.
2. The disk optimization method based on a distributed storage system according to claim 1, wherein the allocating, through the main process disk management layer, of the main process physical block based on the main process target area comprises:
determining whether an object of the main process target area exists;
if the object of the main process target area exists, detecting whether a main process physical block of the main process target area exists;
and if no main process physical block of the main process target area exists, allocating the main process physical block.
3. The disk optimization method based on a distributed storage system according to claim 2, wherein after the determining whether the object of the main process target area exists, the method further comprises:
and if the object of the main process target area does not exist, allocating the object of the main process target area.
4. The disk optimization method based on a distributed storage system according to claim 1, wherein the method further comprises:
when a data write operation instruction is received, clearing the thick standby interval identification table of the physical block to be written corresponding to the target area to be written, updating the bitmap bit of the data identification table of the physical block to be written to 1, and returning after the data is written to the target area to be written; the data write operation instruction corresponds to the target area to be written.
5. The disk optimization method based on a distributed storage system according to claim 4, wherein the clearing of the thick standby interval identification table of the physical block to be written corresponding to the target area to be written comprises:
determining whether the object of the physical block to be written exists;
if the object of the physical block to be written exists, determining whether a physical block to be written has been allocated to the target area to be written;
and if a physical block to be written has been allocated to the target area to be written, modifying the bitmap bit of the thick standby interval identification table of the physical block to be written to 0.
6. The disk optimization method based on a distributed storage system according to claim 5, wherein the modifying of the bitmap bit of the thick standby interval identification table of the physical block to be written to 0 comprises:
and if the bitmap bit of the thick standby interval identification table of the physical block to be written is 1, modifying the bitmap bit of the thick standby interval identification table of the physical block to be written to 0.
7. The disk optimization method based on a distributed storage system according to claim 1, wherein the method further comprises:
when a data read operation instruction is received, constructing a data structure according to the recorded state of the thick standby interval identification table of the physical block to be read and then returning; the physical block to be read corresponds to the data read operation instruction.
8. The disk optimization method based on a distributed storage system according to claim 7, wherein the constructing of the data structure according to the recorded state of the thick standby interval identification table of the physical block to be read and then returning comprises:
acquiring the physical block to be read corresponding to the target area to be read according to the data read operation instruction;
determining whether a physical block to be read has been allocated to the target area to be read;
if a physical block to be read has been allocated to the target area to be read, checking the bitmap bit of the thick standby interval identification table of the physical block to be read;
if the bitmap bit of the thick standby interval identification table of the physical block to be read is 1, constructing null data corresponding to the length of the physical block to be read;
if the bitmap bit of the thick standby interval identification table of the physical block to be read is 0, retrieving the bitmap bit of the data identification table of the physical block to be read;
if the bitmap bit of the data identification table of the physical block to be read is 0, constructing null data corresponding to the length of the physical block to be read;
and if the bitmap bit of the data identification table of the physical block to be read is 1, reading the data from the target area to be read.
9. The disk optimization method based on a distributed storage system according to claim 1, wherein the method further comprises:
when a space reclamation instruction is received, determining whether an object of the target area to be reclaimed corresponding to the space reclamation instruction exists;
if the object of the target area to be reclaimed exists, determining whether physical blocks to be reclaimed have been allocated to the target area to be reclaimed;
if physical blocks to be reclaimed have been allocated to the target area to be reclaimed, checking whether the target area to be reclaimed is completely covered by the physical blocks to be reclaimed;
if the target area to be reclaimed is not completely covered by the physical blocks to be reclaimed, determining the covered physical blocks in the target area to be reclaimed that are not completely covered;
determining whether the bitmap bits of the thick standby interval identification table and of the data identification table of the covered physical block are all 0;
and if they are all 0, deleting the covered physical block whose bitmap bits of the thick standby interval identification table and of the data identification table are all 0, so as to reclaim the physical space of the covered physical block.
10. A disk optimization apparatus based on a distributed storage system, the apparatus comprising:
a determining module, configured to determine, according to the received WRITE_SAME request command, a back-end management main process and a back-end management standby process in the distributed storage system; the WRITE_SAME request command comprises a WRITE_SAME command, a WRITE_SAME template and template parameters;
a dividing module, configured to perform main process target area division on the main process disk space and standby process target area division on the standby process disk space; the main process disk space corresponds to the back-end management main process, and the standby process disk space corresponds to the back-end management standby process;
a template expansion module, configured to expand the WRITE_SAME template through a main process disk management layer of the back-end management main process, and to send the WRITE_SAME command and the template parameters to the back-end management standby process through the back-end management main process, so that a standby process disk management layer of the back-end management standby process expands the WRITE_SAME template according to the template parameters;
and a persistence module, configured to allocate a main process physical block based on the main process target area through the main process disk management layer, modify the bitmap bit of the thick standby interval identification table of the main process physical block to 1, and return after the persistence of the thick standby interval identification table of the main process physical block is completed; and to allocate a standby process physical block based on the standby process target area through the standby process disk management layer, modify the bitmap bit of the thick standby interval identification table of the standby process physical block to 1, and return after the persistence of the thick standby interval identification table of the standby process physical block is completed.
11. A server device, comprising at least one processor, and a memory communicatively coupled to the at least one processor, the memory storing instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-9.
12. A non-transitory computer-readable storage medium storing computer-executable instructions that, when executed by a server device, cause the server device to perform the method of any one of claims 1-9.
CN202210465590.XA 2022-04-29 2022-04-29 Disk optimization method and device based on distributed storage system Active CN114579062B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210465590.XA CN114579062B (en) 2022-04-29 2022-04-29 Disk optimization method and device based on distributed storage system

Publications (2)

Publication Number Publication Date
CN114579062A CN114579062A (en) 2022-06-03
CN114579062B (en) 2022-08-05

Family

ID=81784587

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210465590.XA Active CN114579062B (en) 2022-04-29 2022-04-29 Disk optimization method and device based on distributed storage system

Country Status (1)

Country Link
CN (1) CN114579062B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
USRE38410E1 (en) * 1994-01-31 2004-01-27 Axs Technologies, Inc. Method and apparatus for a parallel data storage and processing server
CN103076627A (en) * 2011-10-26 2013-05-01 中国石油化工股份有限公司 Smoothing optimization method of velocity model
CN107179929A (en) * 2017-05-15 2017-09-19 郑州云海信息技术有限公司 A kind of optimization implementation method and device of WRITE_SAME functions
CN111190537A (en) * 2019-12-10 2020-05-22 优刻得科技股份有限公司 Method and system for managing sequential storage disks in write-addition scene
CN111414256A (en) * 2020-03-27 2020-07-14 中国人民解放军国防科技大学 Application program process derivation method, system and medium based on kylin mobile operating system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9430383B2 (en) * 2013-09-20 2016-08-30 Oracle International Corporation Fast data initialization

Also Published As

Publication number Publication date
CN114579062A (en) 2022-06-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant