WO2021088586A1

WO2021088586A1 - Method and apparatus for managing metadata in storage system

Info

Publication number: WO2021088586A1
Application number: PCT/CN2020/119929
Authority: WO
Inventors: 王晨
Original assignee: 华为技术有限公司
Priority date: 2019-11-05
Filing date: 2020-10-09
Publication date: 2021-05-14
Also published as: CN112783698A

Abstract

Provided are a method and apparatus for managing metadata in a storage system; said storage system comprises a plurality of storage units, and each storage unit is mapped to a physical storage space corresponding to at least two storage devices comprised by the storage system, that is to say, the storage unit is a logical storage unit; in the method, after metadata corresponding to data to be written is generated in the storage system, the storage unit used for storing the metadata is determined from among the plurality of storage units comprised by the storage system, thus the metadata is stored in at least two storage devices corresponding to the determined storage unit. Each storage unit is mapped to a physical storage space corresponding to at least two storage devices; thus if one of the plurality of storage devices corresponding to one storage unit fails, the metadata can also be recovered from the remaining storage device corresponding to the storage unit, thereby achieving metadata redundancy protection.

Description

Method and device for managing metadata in storage system

This application claims priority to Chinese Patent Application No. 201911072812.6, filed with the Chinese Patent Office on November 5, 2019 and entitled "a kind of hard disk", and to Chinese Patent Application No. 202010021351.6, filed with the Chinese Patent Office on January 9, 2020 and entitled "A method and device for managing metadata in a storage system", the entire contents of which are incorporated into this application by reference.

Technical field

This application relates to the field of storage technology, and in particular to a method and device for managing metadata in a storage system.

Background technique

In a storage system, in order to ensure the reliability of the stored data and metadata, it is usually necessary to perform redundancy protection on the data and metadata.

Take the redundancy protection of metadata as an example. One approach is to create multiple metadata instances in the physical storage space and save a copy of the metadata in each instance, so that when one of the metadata instances fails, the metadata of the data stored in the physical storage space can still be obtained through the other metadata instances, thereby implementing redundancy protection of that metadata. Here, a metadata instance can be understood as program code used to implement a value-added service based on metadata, such as a service for snapshotting metadata or a service for cloning metadata.

In the above technical solution, since multiple metadata instances need to be created to realize the redundancy protection of metadata, the implementation method is complicated.

Summary of the invention

The present application provides a method and device for managing metadata in a storage system, which are used to simplify the steps of performing redundancy protection on metadata.

In a first aspect, a method for managing metadata in a storage system is provided. The storage system includes a plurality of storage units, and each storage unit is mapped to a physical storage space corresponding to at least two storage devices included in the storage system; in other words, the storage unit is a logical storage unit. In this method, after the metadata corresponding to the data to be written is generated in the storage system, a storage unit for storing the metadata is determined from among the plurality of storage units included in the storage system, and the metadata is thereby stored in the at least two storage devices corresponding to the determined storage unit.

In the above technical solution, each storage unit is mapped to the physical storage space corresponding to at least two storage devices. When one of the storage devices corresponding to a storage unit fails, the metadata can be recovered from the remaining storage device corresponding to that storage unit, so redundancy protection of the metadata is achieved. Therefore, in the embodiments of the present application, there is no need to create multiple metadata instances that store the same metadata, and a simpler method for redundancy protection of metadata is provided.
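The redundancy described above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory `Device` class and a `StorageUnit` that simply mirrors each write to every device it maps to; these names and the mirroring layout are illustrative assumptions, not the layout defined by this application, which describes RAID and multi-copy layouts later.

```python
class Device:
    """Hypothetical stand-in for a physical storage device."""
    def __init__(self, name):
        self.name = name
        self.blocks = {}          # offset -> bytes
        self.failed = False

    def write(self, offset, payload):
        if self.failed:
            raise IOError(f"{self.name} has failed")
        self.blocks[offset] = payload

    def read(self, offset):
        if self.failed:
            raise IOError(f"{self.name} has failed")
        return self.blocks[offset]


class StorageUnit:
    """Logical storage unit mapped to the space of at least two devices."""
    def __init__(self, devices):
        if len(devices) < 2:
            raise ValueError("a storage unit must map to at least two devices")
        self.devices = devices

    def store_metadata(self, offset, metadata):
        # Mirror the metadata to every device behind the unit (assumed layout).
        for device in self.devices:
            device.write(offset, metadata)

    def load_metadata(self, offset):
        # Return the first copy that can still be read.
        for device in self.devices:
            try:
                return device.read(offset)
            except IOError:
                continue
        raise IOError("all devices behind the storage unit have failed")


dev_a, dev_b = Device("A"), Device("B")
unit = StorageUnit([dev_a, dev_b])
unit.store_metadata(0, b"lba=7->pba=42")
dev_a.failed = True                      # one device fails
recovered = unit.load_metadata(0)        # still readable from device B
```

No second metadata instance is involved here: the redundancy comes only from the storage unit spanning two devices.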

In a possible design, the storage unit may store the metadata in an append-write mode.

Append writing can improve the efficiency of writing metadata. In addition, when new data is added to the storage system, the old data (that is, the previously stored data) may be determined to be invalid, so that multiple consecutive pieces of previously stored data are all invalid. The multiple consecutive storage units corresponding to that invalid data are then all storage units that need to be garbage collected, which can reduce the overhead of garbage collection.
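A minimal sketch of this effect follows. The `AppendOnlyUnitLog` class, its fields, and the rule "a unit is reclaimable when every entry in it is invalid" are assumptions made for illustration; the application does not prescribe this bookkeeping.

```python
class AppendOnlyUnitLog:
    """Hypothetical append-only metadata log spread over fixed-size storage units."""
    def __init__(self, entries_per_unit):
        self.entries_per_unit = entries_per_unit
        self.entries = []          # (key, value) in write order
        self.latest = {}           # key -> index of the newest entry
        self.valid = []            # valid[i] is False once entry i is superseded

    def append(self, key, value):
        # Never overwrite in place: the old entry is only marked invalid.
        if key in self.latest:
            self.valid[self.latest[key]] = False
        self.latest[key] = len(self.entries)
        self.entries.append((key, value))
        self.valid.append(True)

    def reclaimable_units(self):
        # A full unit can be reclaimed without copying when all its entries are invalid.
        units = []
        for start in range(0, len(self.entries), self.entries_per_unit):
            chunk = self.valid[start:start + self.entries_per_unit]
            if len(chunk) == self.entries_per_unit and not any(chunk):
                units.append(start // self.entries_per_unit)
        return units


log = AppendOnlyUnitLog(entries_per_unit=2)
for key in ("a", "b", "c", "d"):
    log.append(key, "v1")          # fills units 0 and 1
for key in ("a", "b", "c", "d"):
    log.append(key, "v2")          # supersedes everything in units 0 and 1
```

Because updates land at the tail, the superseded entries cluster in the oldest units, so `log.reclaimable_units()` returns the consecutive units 0 and 1, and neither contains valid entries that would have to be moved first.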

In a possible design, before the storage unit for storing the metadata is determined, a data write request for writing the data to be written into the storage system may be received, and a record item corresponding to the metadata may be generated according to the data write request and the metadata. The record item includes the data write operation corresponding to the data write request and the metadata updated after the data write operation is executed.

In this way, when the storage unit for storing metadata fails, the metadata before the failure can be recovered through the content in the record, which can increase the stability of the storage system.
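As a hedged illustration of how such record items could be used for recovery, the sketch below replays them in order to rebuild a metadata table. The `RecordItem` fields, the string keys, and the `replay` helper are hypothetical; the application only states that a record item holds the write operation and the metadata updated by it.

```python
from dataclasses import dataclass


@dataclass
class RecordItem:
    """Hypothetical record item: the write operation plus the metadata it produced."""
    operation: str            # description of the data write operation
    key: str                  # metadata key touched by the operation
    updated_metadata: dict    # metadata after the operation was executed


def replay(records):
    """Rebuild the metadata table by applying record items in write order."""
    table = {}
    for record in records:
        table[record.key] = record.updated_metadata
    return table


journal = [
    RecordItem("write LU0 offset 0", "LU0:0", {"unit": 0, "blocks": [0, 1, 2, 3]}),
    RecordItem("write LU0 offset 32768", "LU0:32768", {"unit": 0, "blocks": [4, 5, 6, 7]}),
    RecordItem("write LU0 offset 0", "LU0:0", {"unit": 1, "blocks": [0, 1, 2, 3]}),
]
rebuilt = replay(journal)
```

Later record items win over earlier ones for the same key, so the rebuilt table reflects the state just before the failure.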

In a possible design, the metadata includes:

The correspondence between the logical address and the physical address of each fragment of the data to be written, and the correspondence between the logical address of the data to be written and the logical address of each fragment contained in the data to be written, where the logical address of each fragment is the logical address corresponding to the storage unit occupied by the fragment; or,

The metadata includes:

The correspondence between the logical address and the physical address of each copy of the data to be written, and the correspondence between the logical address of the data to be written and the logical address of each copy contained in the data to be written, where the logical address of each copy is the logical address corresponding to the storage unit occupied by the copy;

The set of logical addresses of the fragments included in the data to be written, or the logical address of each copy included in the data to be written, is the logical address of the data to be written.

In the above technical solution, metadata can record a variety of different contents according to actual usage requirements, which can increase the flexibility and applicability of the storage system.

In a possible design, the storage system may also create a first metadata instance for performing business operations on metadata in a preset storage unit.

In the above technical solution, the metadata instance no longer performs business operations on the metadata in a preset physical storage space, but operates on the metadata in the preset storage unit, which provides a new way of creating a metadata instance.

In a possible design, after the first metadata instance fails, a second metadata instance may be created, and the second metadata instance can access the metadata stored in the preset storage unit.

In the above technical solution, when a new metadata instance is created, it can directly use the metadata in the shared storage unit. This removes the process of copying and transmitting metadata to the new metadata instance, reduces the delay of creating a new metadata instance, and improves efficiency. Furthermore, since there is no need to transmit metadata between multiple metadata instances, transmission resources can be saved.
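The failover behavior can be sketched as follows. `SharedStorageUnit` and `MetadataInstance` are hypothetical names, and the in-memory dictionary stands in for metadata persisted in the preset storage unit; this is an illustration of the idea, not the application's implementation.

```python
class SharedStorageUnit:
    """Hypothetical preset storage unit that outlives any metadata instance."""
    def __init__(self):
        self.metadata = {}


class MetadataInstance:
    """Hypothetical service that operates on metadata held in a storage unit."""
    def __init__(self, name, unit):
        self.name = name
        self.unit = unit          # reference only; no metadata is copied
        self.alive = True

    def put(self, key, value):
        self.unit.metadata[key] = value

    def get(self, key):
        return self.unit.metadata[key]


shared_unit = SharedStorageUnit()
first = MetadataInstance("first", shared_unit)
first.put("LU0:0", {"unit": 0, "block": 4})

first.alive = False                                   # the first instance fails
second = MetadataInstance("second", shared_unit)      # replacement attaches to the same unit
```

The second instance is created with a reference to the same storage unit, so it can serve `second.get("LU0:0")` immediately, without any metadata being transferred between instances.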

In a second aspect, a management device for metadata in a storage system is provided. The management device may be a management node or a management server, or a device in a management node or a management server. The management device includes a processor for implementing the method described in the first aspect. The management device may also include a memory for storing program instructions and data. The memory is coupled with the processor, and the processor can call and execute the program instructions stored in the memory to implement any one of the methods described in the first aspect.

In a possible design, the processor of the metadata management device executes the program instructions in the memory to realize the following functions:

Generate metadata corresponding to the data to be written;

Determining a storage unit for storing the metadata, where the storage system includes a plurality of storage units, and each storage unit is mapped to a physical storage space corresponding to at least two storage devices included in the storage system;

The metadata is stored in at least two storage devices corresponding to the storage unit.

In a possible design, the storage unit stores the metadata in an append-write mode.

In a possible design, the processor executes the program instructions stored in the memory to realize the following functions:

Receiving a data write request, where the data write request is used to write the data to be written into the storage system;

According to the data write request and the metadata, a record item corresponding to the metadata is generated; the record item includes the data write operation corresponding to the data write request and the metadata updated after the data write operation is executed.

In a possible design, the description of the metadata is similar to the corresponding content in the first aspect, and will not be repeated here.

Create a first metadata instance, where the first metadata instance is used to perform business operations on metadata in a preset storage unit.

After the first metadata instance fails, a second metadata instance is created, and the second metadata instance can access the metadata stored in the preset storage unit.

In a third aspect, a management device for metadata in a storage system is provided. The management device may be a management node or a management server, or a device in a management node or a management server. The management device may include a generating unit, a determining unit, and an executing unit, and these units may execute the corresponding function executed in any of the design examples of the first aspect, specifically:

The generating unit is used to generate metadata corresponding to the data to be written;

A determining unit, configured to determine a storage unit for storing the metadata, the storage system includes a plurality of storage units, and each storage unit is mapped to a physical storage space corresponding to at least two storage devices included in the storage system;

The execution unit is configured to store the metadata in at least two storage devices corresponding to the storage unit.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium that stores a computer program, and the computer program includes program instructions that, when executed by a computer, cause the computer to execute the method described in any one of the first aspect.

In a fifth aspect, an embodiment of the present application provides a computer program product, the computer program product stores a computer program, the computer program includes program instructions, and when executed by a computer, the program instructions cause the computer to execute the method of any one of the first aspect.

In a sixth aspect, the present application provides a chip system. The chip system includes a processor and may also include a memory for implementing the method described in the first aspect. The chip system can be composed of chips, or it can include chips and other discrete devices.

In a seventh aspect, an embodiment of the present application provides a storage system that includes the metadata management device of the storage system described in the second aspect and any one of the designs of the second aspect, or the storage system includes the metadata management device of the storage system described in the third aspect and any one of the designs of the third aspect.

For the beneficial effects of the foregoing second to seventh aspects and their implementation manners, reference may be made to the description of the beneficial effects of the method and implementation manners of the first aspect.

Description of the drawings

FIG. 1 is a schematic diagram of an example of an application scenario of an embodiment of the application;

FIG. 2 is a schematic structural diagram of an example of a storage unit provided by this embodiment;

FIG. 3 is a flowchart of the data storage process in an embodiment of the application;

FIG. 4 is a schematic diagram of an example of multiple stripes included in a storage unit in an embodiment of the application;

FIG. 5 is a schematic diagram of an example of a mapping relationship between a storage unit and a storage device in an embodiment of the application;

FIG. 6 is a flowchart of the metadata storage process in an embodiment of the application;

FIG. 7 is a schematic diagram of an example of writing metadata to a storage unit in an embodiment of the application;

FIG. 8 is a schematic diagram of an example of a metadata structure in an embodiment of the application;

FIG. 9 is a flowchart of the garbage collection process of metadata in an embodiment of the application;

FIG. 10 is a flowchart of the management process of metadata instances in an embodiment of the application;

FIG. 11 is a schematic structural diagram of an example of a metadata management device of a storage system provided in an embodiment of the application;

FIG. 12 is a schematic structural diagram of another example of a management device for metadata of a storage system provided in an embodiment of the application.

Detailed description

In order to make the objectives, technical solutions, and advantages of the embodiments of the present application clearer, the embodiments of the present application will be described in further detail below with reference to the accompanying drawings.

In the embodiments of the present application, "multiple" refers to two or more than two. In view of this, "multiple" may also be understood as "at least two" in the embodiments of the present application. "At least one" can be understood as one or more, for example, one, two or more. For example, including at least one refers to including one, two or more, and does not limit which ones are included. For example, including at least one of A, B, and C, then the included can be A, B, C, A and B, A and C, B and C, or A and B and C. "And/or" describes the association relationship of the associated objects, indicating that there can be three types of relationships, for example, A and/or B, which can mean: A alone exists, A and B exist at the same time, and B exists alone. In addition, the character "/", unless otherwise specified, generally indicates that the associated objects before and after are in an "or" relationship. In the embodiments of the present application, "node" and "node" can be used interchangeably.

Unless otherwise stated, ordinal numbers such as “first” and “second” mentioned in the embodiments of the present application are used to distinguish multiple objects, and are not used to limit the order, timing, priority, or importance of multiple objects.

The metadata management method provided in the embodiments of this application can be applied to various storage systems, for example, a centralized storage system, a distributed storage system, or a cloud storage system such as a public cloud or a private cloud, which is not limited here. For convenience of description, the application of this metadata management method in a distributed storage system is taken as an example below.

Please refer to FIG. 1, which is a schematic diagram of an example of an application scenario provided by an embodiment of this application. FIG. 1 includes a client server 100 and a storage system 110, and the client server 100 communicates with the storage system 110. The storage system 110 includes a management module 111 and at least one storage node 112 (in FIG. 1, three storage nodes 112, namely storage node 1 to storage node 3, are taken as an example). The management module 111 is used to write data to each storage node 112 and to read data from at least one storage node 112.

The storage node 112 in FIG. 1 may be an independent server, or it may be a storage array including at least one storage device. The storage device may be a hard disk drive (HDD) disk device, a solid state drive (SSD) disk device, a serial advanced technology attachment (SATA) disk device, a small computer system interface (SCSI) disk device, a serial attached SCSI (SAS) disk device, a Fibre Channel (FC) disk device, or the like.

The management module 111 and at least one storage node 112 in FIG. 1 may be independent devices. For example, the management module 111 is an independent server; or, the management module 111 may also be a software module, which is deployed on a certain storage node 112. For example, the management module 111 and a certain storage node 112 run on the same server, and the specific forms of the management module 111 and the storage node 112 are not limited here.

In this embodiment, each storage node includes at least one storage unit. The storage unit is a segment of logical space obtained by mapping the physical space of the storage devices included in the storage nodes; that is, the actual physical space still comes from multiple storage nodes.

Please refer to FIG. 2, which is a schematic structural diagram of an example of the storage unit provided in this embodiment. In FIG. 2, the storage unit is a collection of multiple logical blocks. A logical block is a logical space concept obtained by dividing the space of a storage device. The size of a logical block can be 4KB, 8KB, or the like, and is not limited here. Each logical block corresponds to a physical storage space of the same size in the storage device. It should be noted that the multiple logical blocks included in a storage unit come from multiple storage devices, and these storage devices may come from different storage nodes or from the same storage node, which is not limited here.

Take the case in which the multiple logical blocks included in a storage unit come from storage devices included in the same storage node 112 in the storage system. As an example, the storage node 112 may, according to a set redundant array of independent disks (RAID) type, map logical blocks in the logical block set included in the storage unit to data storage units for storing data fragments, generate check data fragments based on the data fragments stored in those logical blocks, and store the check data fragments in check storage units; the data storage units and the check storage units form a stripe. A storage unit contains one or more stripes. The data storage units include at least two logical blocks, and the check storage units include at least one logical block. For example, the storage node 112 takes one logical block from each of four storage devices, such as storage device A to storage device D, and these four logical blocks form the data storage units of a stripe; one logical block is then taken from each of two other storage devices to form the check storage units. In this way, when any two logical blocks in the stripe fail, whether they correspond to two data storage units, two check storage units, or one data storage unit and one check storage unit, the data in the failed logical blocks can be reconstructed from the data in the remaining logical blocks.
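The application does not name the check code used for the two check storage units. As one possible illustration, the sketch below assumes a RAID-6 style code with a P block (plain XOR) and a Q block (XOR weighted by powers of 2 in GF(2^8) with polynomial 0x11D), and rebuilds two lost data blocks. All function names are hypothetical, and the cases where a check block is lost are omitted for brevity.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) with the polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a):
    # a^254 is the inverse of a nonzero a, because a^255 == 1 in GF(2^8).
    return gf_pow(a, 254)


def encode(data_blocks):
    """Return the P (XOR) and Q (weighted XOR) check blocks for one stripe."""
    size = len(data_blocks[0])
    p, q = bytearray(size), bytearray(size)
    for index, block in enumerate(data_blocks):
        coeff = gf_pow(2, index)
        for i, byte in enumerate(block):
            p[i] ^= byte
            q[i] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def rebuild_two_data(blocks, p, q, x, y):
    """Rebuild lost data blocks x and y from the surviving blocks plus P and Q."""
    size = len(p)
    # Treat the lost blocks as zeros so they drop out of the partial check values.
    survivors = [b if i not in (x, y) else bytes(size) for i, b in enumerate(blocks)]
    p_rest, q_rest = encode(survivors)
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    denom_inv = gf_inv(gx ^ gy)
    dx, dy = bytearray(size), bytearray(size)
    for i in range(size):
        p_delta = p[i] ^ p_rest[i]          # equals Dx ^ Dy
        q_delta = q[i] ^ q_rest[i]          # equals gx*Dx ^ gy*Dy
        dx[i] = gf_mul(q_delta ^ gf_mul(gy, p_delta), denom_inv)
        dy[i] = p_delta ^ dx[i]
    return bytes(dx), bytes(dy)


stripe = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]    # four data blocks
p_block, q_block = encode(stripe)
damaged = [stripe[0], None, stripe[2], None]      # blocks 1 and 3 are lost
restored = rebuild_two_data(damaged, p_block, q_block, 1, 3)
```

With four data blocks and two check blocks, `restored` holds the original contents of blocks 1 and 3, which matches the two-failure tolerance described in the paragraph above.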

As another example, the storage node 112 may also divide the logical blocks in the logical block set included in the storage unit into copy units according to a set multi-copy type. Each copy unit includes at least one logical block that stores data, and the data stored in each copy unit is the same. For example, if a copy unit includes two logical blocks, the storage node 112 takes one logical block from each of two storage devices to form a copy unit. Assuming that the multi-copy type is 3 copies, that is, each piece of data needs to be stored in three copies, the storage node 112 can take one logical block from each of four other storage devices and group every two logical blocks into a copy unit to obtain another two copy units; the same data is stored in the three copy units. In this way, when any copy unit fails, the data can be obtained from the other two copy units.

In the following, the application scenario shown in FIG. 1 is taken as an example to describe the metadata management method provided by the embodiment of the present application. For ease of understanding, the technical solutions of the embodiments of the present application will be introduced in the following four aspects. In the following introduction, the steps executed by the storage system 110 may all be executed by the management module 111 of the storage system 110.

The first aspect is the data storage process.

Please refer to FIG. 3, which is a flowchart of the data storage process in an embodiment of this application. The flowchart is described as follows:

S31. The client server 100 sends a data write request to the storage system 110.

The data write request includes the data to be written and the virtual storage address of the data to be written. The virtual storage address refers to the identifier and offset of the logical unit (LU) to which the data to be written is to be written, and the virtual storage address is an address visible to the client server 100. The data write request may be obtained by the client server 100 according to a user's operation, or may be generated according to system requirements during operation.

S32. The storage system 110 determines a storage unit for storing the data to be written.

After the management module 111 of the storage system 110 receives the data write request, it determines the storage unit of the data to be written according to the usage of the storage unit in the storage system 110 and the size of the data to be written carried in the data write request.

As an example, assuming that the size of the data to be written is 1 MB and the size of each storage unit is 1 MB, the storage system 110 determines that the data to be written requires 1 storage unit. The storage system 110 determines that no data is stored before receiving the data write request, and then determines that the storage unit occupied by the data to be written is storage unit 0. In this example, the initial storage unit is the storage unit 0 as an example. In other embodiments, the initial storage unit may also be the storage unit 1, which is not limited here.

As another example, a storage unit may include multiple stripes; that is, the data storage units of one stripe include only some of the logical blocks in the logical block set corresponding to the storage unit. Referring to FIG. 4, a storage unit contains 3 stripes. If the size of the data stored in a stripe is 32KB, the size of a storage unit is 96KB. If the size of the data to be written is smaller than the size of a storage unit, it can be determined to store the data to be written in some of the logical blocks included in a storage unit, for example, in the logical blocks corresponding to at least one stripe. For example, a storage unit includes 12 logical blocks, and every 4 logical blocks correspond to one stripe, that is, every 4 logical blocks can store 32KB of data. If the size of the data to be written is 32KB and the storage system 110 determines that, before the data write request was received, data had already been stored in the first 4 logical blocks of storage unit 0 (that is, logical block 0 to logical block 3), then it can be determined to store the data to be written in logical block 4 to logical block 7 of storage unit 0.
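The block selection in this example can be sketched as below. The constants follow the numbers in the paragraph (3 stripes, 4 data blocks per stripe, 32KB per stripe, hence 8KB per block); the `allocate_stripe` helper and its first-free-stripe policy are assumptions for illustration.

```python
BLOCK_SIZE = 8 * 1024          # 32KB per stripe / 4 data blocks per stripe
BLOCKS_PER_STRIPE = 4
STRIPES_PER_UNIT = 3


def allocate_stripe(used_blocks, data_size):
    """Return the logical blocks of the first free stripe that can hold the data."""
    stripe_capacity = BLOCK_SIZE * BLOCKS_PER_STRIPE
    if data_size > stripe_capacity:
        raise ValueError("this sketch only places data that fits in one stripe")
    for stripe in range(STRIPES_PER_UNIT):
        first = stripe * BLOCKS_PER_STRIPE
        blocks = list(range(first, first + BLOCKS_PER_STRIPE))
        if not any(block in used_blocks for block in blocks):
            return blocks
    return None                 # the storage unit is full


# Logical blocks 0 to 3 of storage unit 0 already hold data.
chosen = allocate_stripe(used_blocks={0, 1, 2, 3}, data_size=32 * 1024)
```

Here `chosen` is logical block 4 to logical block 7, as in the example above.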

It should be noted that in actual use, a storage unit may correspond to more than 3 stripes, for example, dozens or hundreds of stripes. The number of stripes shown in FIG. 4 is only an example and should not be understood as a restriction on the storage unit.

In other embodiments, each storage device may provide a segment of logical address instead of providing it to the storage unit in the form of a logical block. In this case, the storage unit is a collection of multiple logical address segments.

S33. The storage system 110 stores the data to be written according to the determined storage unit for storing the data to be written.

The management module 111 of the storage system 110 pre-stores the mapping relationship between each storage unit and the storage devices of the storage nodes. After the storage unit used to store the data to be written is determined, the data to be written is written to the corresponding storage node according to the mapping relationship.

As an example, the management module 111 of the storage system 110 stores the data written to the storage unit according to a preset RAID type. Continuing to refer to FIG. 4, storage unit 0 includes 12 logical blocks, every 4 logical blocks correspond to one stripe, and these 4 logical blocks are used to store data fragments. For example, logical block 0 to logical block 3 are the logical blocks used to store data fragments in the first stripe, logical block 4 to logical block 7 are the logical blocks used to store data fragments in the second stripe, and logical block 8 to logical block 11 are the logical blocks used to store data fragments in the third stripe. Each stripe also includes logical blocks used to store check data fragments: the first stripe also includes logical block P0 and logical block Q0, the second stripe also includes logical block P1 and logical block Q1, and the third stripe also includes logical block P2 and logical block Q2.

The storage system 110 presets a mapping relationship between the logical blocks included in each stripe and the storage devices of the storage nodes. For example, the mapping relationship is: the 4 logical blocks used to store data fragments in each stripe correspond in turn to storage device A in storage node 1 to storage node 4, and the logical blocks used to store check data fragments in each stripe correspond in turn to storage device A in storage node 5 and storage node 6. In FIG. 4, among the multiple stripes corresponding to a storage unit, logical blocks at the same position come from the same storage node. For example, the storage unit shown in FIG. 4 includes 3 stripes. The first stripe includes logical block 0 to logical block 3, logical block P0, and logical block Q0, and the second stripe includes logical block 4 to logical block 7, logical block P1, and logical block Q1. Logical block 0 and logical block 4 are located at the same position, logical block 1 and logical block 5 are located at the same position, and so on.

After the management module 111 receives the data to be written, it can divide the data to be written into multiple data fragments according to the preset RAID type, calculate the check data fragments, and store the data fragments and check data fragments in the storage devices corresponding to each logical block. For example, the size of the data to be written is 32KB, and it is determined that the data to be written is stored in logical block 4 to logical block 7. The management module 111 then divides the data to be written into 4 data fragments of 8KB each and calculates 2 check data fragments from the 4 data fragments, each of which is also 8KB. Then, the management module 111 sends each data fragment and check data fragment to the corresponding storage node for persistent storage. With the mapping relationship described above, the management module 111 sends the 4 data fragments to storage node 1 to storage node 4 respectively, and sends the 2 check data fragments to storage node 5 and storage node 6 respectively. Each storage node stores the corresponding data in a preset storage device.
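The split-and-send step can be sketched as follows, using the node mapping from the example (data fragments to storage node 1 to storage node 4, check fragments to storage node 5 and storage node 6). The `split` and `place` helpers are hypothetical, and the check fragments are passed in as placeholders rather than computed, since the check code itself is not specified here.

```python
DATA_NODES = [1, 2, 3, 4]      # position k in a stripe -> storage node k + 1
CHECK_NODES = [5, 6]           # nodes for the two check data fragments


def split(data, fragment_count):
    """Cut the data into equal-sized fragments."""
    if len(data) % fragment_count:
        raise ValueError("data length must be a multiple of the fragment count")
    size = len(data) // fragment_count
    return [data[i * size:(i + 1) * size] for i in range(fragment_count)]


def place(data, check_fragments):
    """Return {storage node: fragment} for one stripe."""
    fragments = split(data, len(DATA_NODES))
    placement = dict(zip(DATA_NODES, fragments))
    placement.update(zip(CHECK_NODES, check_fragments))
    return placement


payload = bytes(range(256)) * 128                 # 32KB of sample data
checks = [b"\x00" * 8192, b"\x00" * 8192]         # placeholders, not real check data
layout = place(payload, checks)
```

Each of the six storage nodes receives one 8KB fragment, and concatenating the fragments held by storage node 1 to storage node 4 yields the original 32KB payload.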

As another example, the management module 111 of the storage system 110 stores the data written to the storage unit according to a preset multi-copy type. Referring to FIG. 5, storage unit 0 includes 12 logical blocks, and each logical block is used to store data. The storage system 110 presets the mapping relationship between each logical block and the storage devices of the storage nodes. For example, if the multi-copy type is 2 copies, each logical block can correspond to two different storage devices on a storage node, and the mapping relationship is: logical block 0 to logical block 3 correspond in turn to storage device A and storage device B on storage node 1 to storage node 4. The mapping relationship between the other logical blocks and storage devices may be similar to that of logical block 0 to logical block 3, and is not repeated here.

After the management module 111 receives the data to be written, it can copy the data to be written into multiple copies according to the preset multi-copy type, and store the data to be written and the copied data in the storage devices corresponding to each logical block. For example, the size of the data to be written is 32KB, and the size of each logical block is 4KB. If it is determined to write the data to be written into logical blocks 0 to 4, the management module 111 divides the data to be written into 4 pieces of 8KB each, and then copies the 4 pieces to obtain 8 pieces of data. Then, the management module 111 sends the 8 pieces of data to the corresponding storage nodes for persistent storage. With the mapping relationship described above, the management module 111 sends the two identical pieces among the 8 pieces of data to each of storage node 1 to storage node 4, and each storage node stores the corresponding data in the preset storage devices.

From a logical point of view, the data to be written is written into the storage unit of the storage system 110. From a physical point of view, the data is ultimately still stored in multiple storage nodes. For each fragment, the identifier of the storage unit where it is located and its location inside the storage unit are the logical address of the fragment, and the actual address of the fragment in the storage node is the physical address of the fragment.

The second aspect is the storage process of metadata.

After the data to be written is stored in the storage device, the storage system 110 also needs to store the description information of the data, namely its metadata, to facilitate subsequent searching or reading of the data. When a storage node receives a data read request, it usually finds the metadata of the data to be read based on the information carried in the data read request (for example, a data name or virtual address), and then obtains the data to be read according to the metadata. Metadata includes, but is not limited to: the correspondence between the logical address and the physical address of each fragment, the correspondence between the logical address of the data and the logical address of each fragment contained in the data, the correspondence between the logical address and the physical address of each copy, and the correspondence between the logical address of the data and the logical address of each copy of the data. The set of logical addresses of the fragments contained in the data, or the logical address of each copy, is the logical address of the data.
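A minimal sketch of how these two correspondences could resolve a read follows. The dictionary layout, the tuple shapes, and the device offsets are illustrative assumptions; only the two-level idea (data to fragment logical addresses, then fragment logical address to physical address) comes from the text.

```python
# Virtual address (LU identifier, offset) -> logical address of the data, expressed
# as the set of fragment logical addresses (storage unit, logical block).
data_index = {
    ("LU0", 0): [(0, 4), (0, 5), (0, 6), (0, 7)],
}

# Fragment logical address -> physical address (storage node, device, offset).
# The offsets are made-up sample values.
fragment_index = {
    (0, 4): (1, "A", 8192),
    (0, 5): (2, "A", 8192),
    (0, 6): (3, "A", 8192),
    (0, 7): (4, "A", 8192),
}


def locate(virtual_address):
    """Resolve a virtual address to the physical addresses of its fragments."""
    return [fragment_index[logical] for logical in data_index[virtual_address]]


physical_addresses = locate(("LU0", 0))
```

A read of `("LU0", 0)` is resolved to four physical locations, one per fragment, which the storage system can then read and reassemble.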

Please refer to FIG. 6, which is a flowchart of the metadata storage process in an embodiment of this application. The flowchart is described as follows:

S61. The storage system 110 generates metadata.

After the data to be written is stored in the storage system 110, the management module 111 of the storage system 110 generates metadata of the data to be written. For example, in the embodiment shown in FIG. 3, the management module 111 stores the data to be written in logical block 0 to logical block 4 of the storage unit, and then generates the metadata of the data to be written according to the size of the data to be written, the storage address, and other information. The content of the metadata is not limited here.

S62. The storage system 110 determines a storage unit for storing the metadata.

In the embodiment of the present application, the physical storage space used by the storage system 110 for storing data is separated from the physical storage space used for storing metadata. Normally, compared with the data itself, the metadata of the data occupies a smaller storage space. Therefore, for example, if each storage node includes 4 storage devices, storage device A to storage device C in each storage node in the storage system 110 can be set to store data, and storage device D in each storage node can be set to store metadata; or, if the storage system 110 includes 4 storage nodes, all storage devices in storage node 1 to storage node 3 can be set to store data, and all storage devices in storage node 4 can be set to store metadata. In the embodiments of the present application, the storage unit used to store data and the storage unit used to store metadata are essentially the same, except that the content stored in them is different. In other words, the storage unit used to store data and the storage unit used to store metadata come from different storage devices.

As an example, after the management module 111 generates the metadata, it can determine the storage unit used to store the metadata according to the usage of the storage units used to store metadata in the storage system 110. For example, please refer to Figure 7. A storage unit for storing metadata includes 6 logical blocks, and every 2 logical blocks correspond to a stripe. If the management module 111 determines that, before the metadata is generated, metadata has already been stored in the first two logical blocks (that is, logical block 0 and logical block 1) of storage unit 0 used for storing metadata, the management module 111 can determine to store the generated metadata in logical block 2 and logical block 3 of storage unit 0. This method can be understood as storing metadata in the storage unit in an additional write (append-only) manner.
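The additional write allocation described for Figure 7 can be sketched as follows. The class `MetadataStorageUnit` and its method are hypothetical names; the sketch assumes only what the example states, namely 6 logical blocks per storage unit and 2 logical blocks per stripe.

```python
from typing import List


class MetadataStorageUnit:
    """Hands out logical blocks strictly in append order."""

    def __init__(self, unit_id: int, block_count: int = 6, blocks_per_stripe: int = 2):
        self.unit_id = unit_id
        self.block_count = block_count
        self.blocks_per_stripe = blocks_per_stripe
        self.next_free = 0  # index of the first logical block not yet written

    def allocate_stripe(self) -> List[int]:
        """Return the logical blocks of the next unused stripe."""
        end = self.next_free + self.blocks_per_stripe
        if end > self.block_count:
            raise RuntimeError(f"storage unit {self.unit_id} is full")
        blocks = list(range(self.next_free, end))
        self.next_free = end
        return blocks


unit0 = MetadataStorageUnit(unit_id=0)
first = unit0.allocate_stripe()   # earlier metadata: logical blocks 0 and 1
second = unit0.allocate_stripe()  # newly generated metadata: logical blocks 2 and 3
```

Because blocks are never reused in place, older metadata remains readable until garbage collection reclaims the whole storage unit.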

In other embodiments, step S63 may be performed before step S62.

S63. The storage system 110 generates a record item corresponding to the metadata.

After the management module 111 generates the metadata, it can obtain the write-ahead log (WAL) record item corresponding to the metadata according to the metadata and the operation corresponding to the metadata. After the WAL record item is stored in the corresponding storage space, a WAL log is formed.

The operation corresponding to the metadata is illustrated by an example. If the metadata is generated according to a data write request sent by the client server 100, the operation corresponding to the metadata is a data write operation. The management module 111 then saves the record item in the memory, which can be understood as the memory of the node or server where the management module 111 is located. When the WAL record items recorded in the memory meet a preset condition, for example, when the number of WAL record items recorded in the memory reaches a threshold, it is determined that the metadata in the multiple WAL record items recorded in the memory is to be written to the storage unit, and step S62 is executed to determine the storage unit corresponding to the metadata in each WAL record item. The method of determining the storage unit corresponding to the metadata in each WAL record item can be similar to step S62, that is, the storage unit used to store the metadata in each WAL record item is determined in turn according to the usage of the storage units used to store metadata, which is not repeated here.

Since the metadata and the corresponding operation are recorded in the WAL record item, when the storage unit used to store the metadata fails, the content of the WAL record item can be used to recover the metadata as it was before the failure, which can increase the stability of the storage system 110.
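The buffering and recovery behavior of step S63 can be sketched as follows. The names `WalRecord` and `WalBuffer` are hypothetical, the in-memory list `log` merely stands in for the persisted WAL log, and the callback `write_to_unit` stands in for steps S62 and S64; none of these are defined by the application.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class WalRecord:
    operation: str  # for example "write"
    metadata: str   # metadata produced by the operation


class WalBuffer:
    def __init__(self, threshold: int, write_to_unit: Callable[[str], None]):
        self.threshold = threshold
        self.write_to_unit = write_to_unit
        self.pending: List[WalRecord] = []  # records not yet written to a storage unit
        self.log: List[WalRecord] = []      # stand-in for the persisted WAL log

    def append(self, record: WalRecord) -> bool:
        """Record the item; flush the batch once the threshold is reached."""
        self.log.append(record)
        self.pending.append(record)
        if len(self.pending) < self.threshold:
            return False
        batch, self.pending = self.pending, []
        for item in batch:
            self.write_to_unit(item.metadata)
        return True

    def replay(self) -> List[str]:
        """Rebuild the metadata from the log after a storage unit failure."""
        return [item.metadata for item in self.log]


stored: List[str] = []
wal = WalBuffer(threshold=2, write_to_unit=stored.append)
flushed_first = wal.append(WalRecord("write", "metadata 1"))
flushed_second = wal.append(WalRecord("write", "metadata 2"))
```

In this sketch the first append only buffers the record, and the second append reaches the threshold and writes both pieces of metadata out in order.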

S64. The storage system 110 writes the metadata into the determined storage unit.

Step S64 is similar to step S33, and a specific example is used for description below.

Continuing to refer to Figure 7, storage unit 0 for storing metadata includes 6 logical blocks, and every 2 logical blocks correspond to a stripe, that is, logical block 0 and logical block 1 correspond to the first stripe, logical block 2 and logical block 3 correspond to the second stripe, and logical block 4 and logical block 5 correspond to the third stripe. These are the logical blocks used to store metadata fragments in each stripe. Each stripe also includes a logical block for storing check metadata. For example, the first stripe includes logical block P0, the second stripe includes logical block P1, and the third stripe includes logical block P2.

When the management module 111 determines to store the generated metadata in logical block 2 and logical block 3 of storage unit 0, the metadata can be divided into multiple metadata fragments according to the preset RAID type, check fragments are obtained by calculation, and the metadata fragments and the check fragments are stored in the storage devices corresponding to the respective logical blocks.
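The fragment-and-check calculation can be sketched as follows for one stripe. The application does not specify how the check fragment is calculated; this sketch assumes a single XOR parity fragment, as in RAID 5, with zero padding so that the fragments have equal length. The function names are hypothetical.

```python
from typing import List, Optional, Tuple


def xor_bytes(left: bytes, right: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(left, right))


def split_with_parity(metadata: bytes, fragment_count: int) -> Tuple[List[bytes], bytes]:
    """Split metadata into equal fragments and compute one XOR check fragment."""
    size = -(-len(metadata) // fragment_count)  # ceiling division
    padded = metadata.ljust(size * fragment_count, b"\x00")
    fragments = [padded[i * size:(i + 1) * size] for i in range(fragment_count)]
    parity = bytes(size)
    for fragment in fragments:
        parity = xor_bytes(parity, fragment)
    return fragments, parity


def rebuild_missing(fragments: List[Optional[bytes]], parity: bytes) -> bytes:
    """Recover the single missing fragment (marked None) from the survivors."""
    rebuilt = parity
    for fragment in fragments:
        if fragment is not None:
            rebuilt = xor_bytes(rebuilt, fragment)
    return rebuilt


fragments, parity = split_with_parity(b"metadata-z!", fragment_count=2)
recovered = rebuild_missing([None, fragments[1]], parity)
```

If the storage device holding one metadata fragment fails, XOR-ing the check fragment with the surviving fragment reproduces the lost one, which is the redundancy protection the stripe provides.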

Alternatively, the management module 111 copies each metadata fragment according to a preset multi-copy type, and then stores each metadata fragment and its copies in the respective storage devices. This is similar to step S33 and is not repeated here.

It can be seen from the above description that after the management module 111 generates the metadata, it can perform steps S62 and S64, or perform steps S62 to S64, to store the metadata in the corresponding storage device. That is, the management module 111 has two ways to store metadata, and it can select which of the two ways to use according to a preset judgment condition. As an example, the preset judgment condition may be whether the metadata is metadata for new data or metadata for updating old data. Metadata for new data can be understood as metadata that does not need to be updated in place, and steps S62 and S64 can be performed for it. Metadata for updating old data can be understood as metadata that needs to be updated in place, and steps S62 to S64 can be performed for it. The preset judgment condition can also be other content, which is not limited here.

S65. The storage system 110 updates the metadata structure.

After the management module 111 writes the metadata into the corresponding storage device, the management module 111 also needs to update the metadata structure of the storage system 110. In the embodiment of this application, the metadata structure may be a B-tree (Btree) or a log-structured merge-tree (LSM tree), and of course it may also be another type of metadata structure that can be stored in an additional write manner. The metadata structure is not limited here.

For example, please refer to Figure 8(a), which is the Btree corresponding to the metadata that has been saved in the storage system 110. After the management module 111 stores the metadata in the corresponding storage device, it can update the Btree based on the metadata of the data to be written. For example, Figure 8(a) includes metadata h, metadata e, metadata s, metadata a, metadata f, and metadata q. If the name of the metadata corresponding to the data to be written is metadata z, and metadata z includes metadata s, then metadata z is taken as a child node of metadata s, and the Btree shown in Figure 8(b) is obtained.

For another example, if the name of the metadata corresponding to the data to be written is metadata h', and metadata h' includes metadata e and metadata s, then metadata h' is taken as a parent node of metadata e and metadata s, and the Btree shown in Figure 8(c) is obtained.
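The two updates described for Figure 8 can be sketched with a simple parent-to-children map. The figure itself is not reproduced here, so the initial shape used below (h above e and s, e above a and f, s above q) is an assumption, as are the class and method names; a real Btree would also handle key ordering and node splitting, which this sketch omits.

```python
from typing import Dict, List


class MetadataTree:
    """Parent-to-children map that is only ever appended to."""

    def __init__(self) -> None:
        self.children: Dict[str, List[str]] = {}

    def add_child(self, parent: str, child: str) -> None:
        self.children.setdefault(parent, []).append(child)
        self.children.setdefault(child, [])

    def add_parent(self, new_parent: str, kids: List[str]) -> None:
        # The older parent is not modified in place; it stays until garbage collection.
        self.children[new_parent] = list(kids)

    def parents_of(self, name: str) -> List[str]:
        return [parent for parent, kids in self.children.items() if name in kids]


tree = MetadataTree()
for parent, child in [("h", "e"), ("h", "s"), ("e", "a"), ("e", "f"), ("s", "q")]:
    tree.add_child(parent, child)  # assumed shape of Figure 8(a)

tree.add_child("s", "z")           # Figure 8(b): z becomes a child of s
tree.add_parent("h'", ["e", "s"])  # Figure 8(c): h' becomes a parent of e and s
```

After the second update both h and h' are parents of e and s, which is the situation the garbage collection process later uses to identify metadata h as garbage.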

Step S65 is an optional step, which is represented by a dotted line in FIG. 6.

The third aspect is the garbage collection process of metadata.

In order to make reasonable use of the storage space in the metadata partition, garbage collection can be started when there is too much garbage metadata in the storage system 110. Please refer to FIG. 9, which is a flowchart of the garbage collection process of metadata in an embodiment of this application. The flowchart is described as follows:

S91. The storage system 110 determines a storage unit used for garbage collection.

In this embodiment, garbage collection is performed in units of storage units. The storage unit used for garbage collection may be a storage unit in which the garbage metadata reaches a first set threshold, the storage unit that contains the most garbage metadata among the multiple storage units, a storage unit in which the valid metadata is lower than a second set threshold, or the storage unit that contains the least valid metadata among the multiple storage units. For example, in the Btree shown in Figure 8(c), both metadata h and metadata h' are parent nodes of metadata e and metadata s, and metadata h' was stored after metadata h, so the management module 111 can determine that metadata h is garbage metadata. The logical blocks occupied by metadata h are logical block 1 and logical block 2 of storage unit 0, so it is determined that storage unit 0 includes 2 garbage logical blocks. When the number of garbage logical blocks in a storage unit reaches a preset threshold, which may be, for example, 3, the storage unit is determined to be a storage unit used for garbage collection. For convenience of description, the following takes storage unit 0 as the storage unit used for garbage collection.
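The selection in step S91 can be sketched as follows. The text lists several alternative criteria; this sketch combines two of them as an assumption, picking the storage unit with the most garbage logical blocks among those that reach the threshold, and breaking ties by the lowest unit number. The function name and the tie-breaking rule are hypothetical.

```python
from typing import Dict, Optional


def pick_gc_unit(garbage_blocks: Dict[int, int], threshold: int = 3) -> Optional[int]:
    """Return the storage unit to collect, or None if no unit reaches the threshold.

    garbage_blocks maps a storage unit id to its number of garbage logical blocks.
    """
    candidates = {unit: count for unit, count in garbage_blocks.items() if count >= threshold}
    if not candidates:
        return None
    # Most garbage first; on a tie, prefer the lowest unit id.
    return max(candidates, key=lambda unit: (candidates[unit], -unit))


chosen = pick_gc_unit({0: 4, 1: 2, 2: 3})
```

With the threshold of 3 from the example, a storage unit holding only 2 garbage logical blocks is not yet selected.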

S92. The storage system 110 migrates the valid metadata in the storage unit used for garbage collection to other storage units.

When it is determined that storage unit 0 is the storage unit used for garbage collection, the valid metadata in storage unit 0 is migrated to other storage units. For example, continuing to refer to FIG. 7, if garbage metadata is stored in logical block 1 to logical block 4 of storage unit 0 and valid metadata is stored in logical block 5 and logical block 6, the management module 111 migrates the valid metadata stored in logical block 5 and logical block 6 to a new storage unit, for example, storage unit 2.

S93. The storage system 110 releases the storage space occupied by the storage unit used for garbage collection.

Specifically, the management module 111 may send a deletion instruction to the storage nodes corresponding to storage unit 0 to delete the metadata fragments or check metadata fragments corresponding to storage unit 0.
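Steps S92 and S93 can be sketched together as follows. The in-memory dictionary stands in for the storage units, `None` marks a garbage logical block, and clearing the dictionary entry stands in for sending the deletion instruction to the storage nodes; the function name `collect` and this representation are assumptions. The target storage unit is assumed to be filled contiguously from logical block 0.

```python
from typing import Dict, Optional

# units[unit_id][block] holds valid metadata bytes, or None for garbage.
Units = Dict[int, Dict[int, Optional[bytes]]]


def collect(units: Units, victim: int, target: int) -> int:
    """Move valid metadata out of `victim`, then release it. Returns blocks moved."""
    moved = 0
    for _, content in sorted(units[victim].items()):
        if content is None:
            continue  # garbage metadata is simply dropped
        next_block = len(units[target])  # append at the end of the target unit
        units[target][next_block] = content
        moved += 1
    units[victim] = {}  # stands in for the deletion instruction of step S93
    return moved


units: Units = {
    0: {1: None, 2: None, 3: None, 4: None, 5: b"valid-5", 6: b"valid-6"},
    2: {},
}
moved = collect(units, victim=0, target=2)
```

Only the two valid blocks are rewritten, so the cost of reclaiming a storage unit shrinks as the share of garbage in it grows.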

The fourth aspect is the management process of metadata instances.

The storage system 110 can implement various value-added services by creating different metadata instances, such as a service for snapshotting metadata or a service for cloning metadata. Metadata instances can be understood as program codes used to implement a certain value-added service. Please refer to FIG. 10, which is a flowchart of the metadata instance management process in an embodiment of this application. The flowchart is described as follows:

S101. The storage system 110 creates a first metadata instance.

The first metadata instance is used to perform business operations on the metadata stored in the preset storage unit. As an example, the business operation is a snapshot operation, that is, the first metadata instance is an instance for taking a snapshot of the metadata in the preset storage unit. The preset storage unit may be some or all of the storage units used to store metadata in the storage system 110. For example, if the storage units used to store metadata in the storage system 110 include storage units 0 to 4, the preset storage units may be storage unit 0 and storage unit 1, which can be set according to actual usage. After the management module 111 runs the program code corresponding to the first metadata instance, the first metadata instance is created.

It should be noted that in related technologies, redundant protection of metadata is achieved by creating multiple metadata instances. For example, when the metadata of the data stored in the storage space corresponding to physical address 1 to physical address 20 in the storage system 110 needs to be protected, the management module 111 of the storage system 110 creates at least two metadata instances for that storage space. The at least two metadata instances may include metadata instance 1 and metadata instance 2. The management module 111 allocates storage space for storing metadata to each metadata instance. For example, the storage space configured for metadata instance 1 is the storage space corresponding to physical address 50 to physical address 55, and the storage space configured for metadata instance 2 is the storage space corresponding to physical address 60 to physical address 65. When data is stored in physical address 1 to physical address 20, metadata instance 1 stores the metadata of the data in its configured storage space. For example, if the metadata of the data is metadata 1, metadata instance 1 stores metadata 1 in a storage space starting at physical address 50. Then, the management module 111 of the storage system 110 copies the metadata stored by metadata instance 1 and stores the copied metadata in the storage space configured for metadata instance 2. For example, the management module 111 copies metadata 1 and stores the copy in another storage space whose starting address is physical address 60. It can be seen that in related technologies, multiple metadata instances need to be created, which is more complicated.

In the embodiment of the present application, since the metadata in the storage system 110 has already been stored in the storage devices using a preset RAID type or multi-copy type, the metadata is already redundantly protected. Therefore, in the embodiments of the application, there is no need to create multiple metadata instances storing the same metadata, and a simpler method for redundant protection of metadata is provided.

In addition, when the preset RAID type is used to store metadata, since there is no need to store multiple copies of the same metadata, the storage space occupied by the metadata can be reduced, and the storage space utilization can be improved.
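The space saving mentioned above can be made concrete with a small calculation. The stripe shape used here, 2 metadata fragments plus 1 check fragment, follows the Figure 7 example; the 2-copy comparison and the function names are assumptions for illustration.

```python
def raw_bytes_multi_copy(metadata_bytes: int, copies: int) -> float:
    """Raw space consumed when every byte of metadata is stored `copies` times."""
    return float(metadata_bytes * copies)


def raw_bytes_raid(metadata_bytes: int, data_fragments: int, check_fragments: int) -> float:
    """Raw space consumed when each stripe adds check fragments to its data fragments."""
    return metadata_bytes * (data_fragments + check_fragments) / data_fragments


two_copies = raw_bytes_multi_copy(4096, copies=2)
raid_2_plus_1 = raw_bytes_raid(4096, data_fragments=2, check_fragments=1)
```

For 4096 bytes of metadata, two full copies consume 8192 bytes of raw space, while a stripe of 2 metadata fragments and 1 check fragment consumes 6144 bytes.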

S102. The storage system 110 determines that the first metadata instance is faulty, and then creates a second metadata instance.

When the management module 111 determines that the first metadata instance is faulty, it can create a second metadata instance for taking a snapshot of the metadata, and set the storage units that can be accessed by the second metadata instance to be the same as those of the first metadata instance. For example, if the storage units that can be accessed by the first metadata instance are storage unit 0 and storage unit 1, the storage units that can be accessed by the second metadata instance are also storage unit 0 and storage unit 1. In this way, the accessible storage units are shared among multiple metadata instances, so that when a new metadata instance is created, it can directly use the metadata in the shared storage units. This removes the process of copying and transferring metadata to the new metadata instance, which can reduce the delay of creating a new metadata instance and improve efficiency. Furthermore, since there is no need to transmit metadata between multiple metadata instances, transmission resources can be saved.
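The replacement of a faulty metadata instance can be sketched as follows. The class `MetadataInstance`, its `healthy` flag, and the function `replace_if_faulty` are hypothetical; how a fault is actually detected is not described in this section.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class MetadataInstance:
    name: str
    service: str               # for example "snapshot"
    unit_ids: Tuple[int, ...]  # storage units this instance may access
    healthy: bool = True


def replace_if_faulty(instance: MetadataInstance, new_name: str) -> MetadataInstance:
    """Return the same instance if healthy, else a new one sharing its storage units."""
    if instance.healthy:
        return instance
    # No metadata is copied: the new instance simply points at the same storage units.
    return MetadataInstance(name=new_name, service=instance.service, unit_ids=instance.unit_ids)


first_instance = MetadataInstance(name="first", service="snapshot", unit_ids=(0, 1))
first_instance.healthy = False
second_instance = replace_if_faulty(first_instance, new_name="second")
```

Creating the second instance involves only a reference to storage unit 0 and storage unit 1, which is why no metadata transfer is needed.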

It should be noted that the above description of metadata instance management takes the metadata in a preset storage unit as an example. Of course, the creation and management of metadata instances are not limited to this.

In the above-mentioned embodiments of the present application, in order to realize each function in the method provided in the above-mentioned embodiments, the storage system may include a hardware structure and/or a software module, and realize the above functions in the form of a hardware structure, a software module, or a hardware structure plus a software module. Whether a certain function of the above-mentioned functions is executed by a hardware structure, a software module, or a hardware structure plus a software module depends on the specific application and design constraints of the technical solution.

FIG. 11 shows a schematic structural diagram of an apparatus 1100 for managing metadata of a storage system. The apparatus 1100 for managing metadata of the storage system may be the device where the management module 111 in the embodiment shown in FIG. 3, FIG. 6, FIG. 9, or FIG. 10 is located, or it may be located in the device where the management module 111 is located, and is used to realize the functions of the management module 111. The apparatus 1100 for managing metadata of the storage system may be a hardware structure or a hardware structure plus a software module.

The device 1100 for managing metadata of the storage system includes at least one memory for storing program instructions and/or data. The apparatus 1100 for managing metadata of the storage system further includes at least one processor, the at least one processor is coupled to the memory, and the at least one processor can execute the program instructions stored in the memory.

The apparatus 1100 for managing metadata of a storage system may include a generating unit 1101, a determining unit 1102, and an executing unit 1103.

The generating unit 1101 may call the processor to execute the program instructions stored in the memory to execute step S61 in the embodiment shown in FIG. 6 and/or other processes for supporting the technology described herein.

The determining unit 1102 may call the processor to execute the program instructions stored in the memory to execute step S32 in the embodiment shown in FIG. 3, step S62 in the embodiment shown in FIG. 6, or step S91 in the embodiment shown in FIG. 9, and/or other processes used to support the technology described herein.

The execution unit 1103 may call the processor to execute the program instructions stored in the memory to execute step S33 in the embodiment shown in FIG. 3, steps S63 to S65 in the embodiment shown in FIG. 6, steps S92 to S93 in the embodiment shown in FIG. 9, or steps S101 to S102 in the embodiment shown in FIG. 10, and/or other processes used to support the technology described herein.

In a possible design, the apparatus 1100 for managing metadata of the storage system may further include a receiving unit 1104, which may call the processor to execute the program instructions stored in the memory to execute step S31 in the embodiment shown in FIG. 3, and/or other processes used to support the techniques described herein. The receiving unit 1104 is used by the apparatus 1100 for managing metadata of the storage system to communicate with other modules, and it can be a circuit, a device, an interface, a bus, a software module, a transceiver, or any other device that can implement communication. The receiving unit 1104 is optional and is therefore represented by a dotted line in FIG. 11.

Among them, all relevant content of the steps involved in the above method embodiments can be cited in the functional description of the corresponding functional module, which will not be repeated here.

The division of modules in the embodiment shown in FIG. 11 is illustrative and is only a logical function division. In actual implementation, there may be other division methods. In addition, the functional modules in each embodiment of the present application may be integrated in one processor, may exist alone physically, or two or more modules may be integrated into one module. The above-mentioned integrated modules can be implemented in the form of hardware or software function modules.

FIG. 12 shows an apparatus 1200 for managing metadata of a storage system provided by an embodiment of the present application. The apparatus 1200 for managing metadata of a storage system may be the device where the management module 111 in the embodiment shown in FIG. 3, FIG. 6, FIG. 9, or FIG. 10 is located, or may be located in the device where the management module 111 is located, and can be used to implement the functions of the management module 111.

The apparatus 1200 for managing metadata of a storage system includes at least one processor 1220, and the apparatus 1200 for managing metadata of a storage system is used to implement or support the function of the management module 111 in the method provided in the embodiment of the present application. Exemplarily, the processor 1220 may determine a storage unit for storing metadata. For details, refer to the detailed description in the method example, which is not repeated here.

The apparatus 1200 for managing metadata of the storage system may further include at least one memory 1230 for storing program instructions and/or data. The memory 1230 and the processor 1220 are coupled. The coupling in the embodiments of the present application is an indirect coupling or communication connection between devices, units or modules, and may be in electrical, mechanical or other forms, and is used for information exchange between devices, units or modules. The processor 1220 may operate in cooperation with the memory 1230. The processor 1220 may execute program instructions stored in the memory 1230. At least one of the at least one memory may be included in the processor.

The apparatus 1200 for managing metadata of the storage system may further include a communication interface 1210 for communicating with other devices through a transmission medium, so that the apparatus 1200 for managing metadata of the storage system may communicate with other devices. Exemplarily, the other device may be a client or a storage device. The processor 1220 may use the communication interface 1210 to send and receive data.

The embodiment of the present application does not limit the specific connection medium between the aforementioned communication interface 1210, the processor 1220, and the memory 1230. In the embodiment of the present application, in FIG. 12, the memory 1230, the processor 1220, and the communication interface 1210 are connected by a bus 1250. The bus is represented by a thick line in FIG. 12, which is not limiting. The bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in FIG. 12, but this does not mean that there is only one bus or one type of bus.

In the embodiment of the present application, the processor 1220 may be a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware processor, or executed and completed by a combination of hardware and software modules in the processor.

In the embodiment of the present application, the memory 1230 may be a non-volatile memory, such as a hard disk drive (HDD) or a solid-state drive (SSD), or a volatile memory, such as a random-access memory (RAM). The memory may also be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory in the embodiments of the present application may also be a circuit or any other device capable of realizing a storage function, and is used for storing program instructions and/or data.

The embodiment of the present application also provides a computer-readable storage medium, including instructions which, when run on a computer, cause the computer to execute the method executed by the management module 111 in the embodiment shown in FIG. 3, FIG. 6, FIG. 9, or FIG. 10.

The embodiments of the present application also provide a computer program product, including instructions which, when run on a computer, cause the computer to execute the method executed by the management module 111 in the embodiment shown in FIG. 3, FIG. 6, FIG. 9, or FIG. 10.

The embodiment of the present application provides a storage system, and the storage system includes the management module 111 in the embodiment shown in FIG. 3 or FIG. 6 or FIG. 9 or FIG. 10.

The methods provided in the embodiments of the present application may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, they can be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, network equipment, user equipment, or other programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (such as infrared, radio, or microwave). A computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, an SSD).

Claims

1. A method for managing metadata in a storage system, characterized in that it comprises:

generating metadata corresponding to data to be written;

determining a storage unit for storing the metadata, where the storage system includes a plurality of storage units, and each storage unit is mapped to a physical storage space corresponding to at least two storage devices included in the storage system; and

storing the metadata in the at least two storage devices corresponding to the storage unit.
2. The method according to claim 1, wherein the storage unit stores the metadata in an additional write manner.
3. The method according to claim 1 or 2, characterized in that, before determining the storage unit for storing the metadata, the method further comprises:

receiving a data write request, where the data write request is used to write the data to be written into the storage system; and

generating, according to the data write request and the metadata, a record item corresponding to the metadata, where the record item includes the data write operation corresponding to the data write request and the metadata updated after the data write operation is executed.
4. The method according to any one of claims 1-3, characterized in that,

the metadata includes:

the correspondence between the logical address and the physical address of each fragment of the data to be written, and the correspondence between the logical address of the data to be written and the logical address of each fragment contained in the data to be written, where the logical address of each fragment is the logical address corresponding to the storage unit occupied by the fragment; or,

the metadata includes:

the correspondence between the logical address and the physical address of each copy of the data to be written, and the correspondence between the logical address of the data to be written and the logical address of each copy contained in the data to be written, where the logical address of each copy is the logical address corresponding to the storage unit occupied by the copy;

where the set of logical addresses of the fragments included in the data to be written, or the logical address of each copy included in the data to be written, is the logical address of the data to be written.
5. The method according to any one of claims 1-4, wherein the method further comprises:

creating a first metadata instance, where the first metadata instance is used to perform business operations on metadata in a preset storage unit.
6. The method according to claim 5, wherein the method further comprises:

creating a second metadata instance after the first metadata instance fails, where the second metadata instance can access the metadata stored in the preset storage unit.
7. A management device for metadata in a storage system, characterized in that it comprises:

a generating unit, configured to generate metadata corresponding to data to be written;

a determining unit, configured to determine a storage unit for storing the metadata, where the storage system includes a plurality of storage units, and each storage unit is mapped to a physical storage space corresponding to at least two storage devices included in the storage system; and

an execution unit, configured to store the metadata in the at least two storage devices corresponding to the storage unit.
8. The device according to claim 7, wherein the storage unit stores the metadata in an additional write manner.
9. The device according to claim 7 or 8, wherein the device further comprises:

a receiving unit, configured to receive a data write request, where the data write request is used to write the data to be written into the storage system;

where the generating unit is further configured to generate a record item corresponding to the metadata according to the data write request and the metadata, and the record item includes the data write operation corresponding to the data write request and the metadata updated after the data write operation is executed.
10. The device according to any one of claims 7-9, characterized in that:

the metadata includes:

the correspondence between the logical address and the physical address of each fragment of the data to be written, and the correspondence between the logical address of the data to be written and the logical address of each fragment contained in the data to be written, where the logical address of each fragment is the logical address corresponding to the storage unit occupied by the fragment; or,

the metadata includes:

the correspondence between the logical address and the physical address of each copy of the data to be written, and the correspondence between the logical address of the data to be written and the logical address of each copy contained in the data to be written, where the logical address of each copy is the logical address corresponding to the storage unit occupied by the copy;

where the set of logical addresses of the fragments included in the data to be written, or the logical address of each copy included in the data to be written, is the logical address of the data to be written.
11. The device according to any one of claims 7-10, wherein the execution unit is further configured to:

create a first metadata instance, where the first metadata instance is used to perform business operations on metadata in a preset storage unit.
12. The device according to claim 11, wherein the execution unit is further configured to:

create a second metadata instance after the first metadata instance fails, where the second metadata instance can access the metadata stored in the preset storage unit.
13. A device for managing metadata in a storage system, characterized in that it comprises a processor and a memory, where the memory stores computer-executable instructions, and the computer-executable instructions, when called by the processor, cause the processor to execute the method according to any one of claims 1-6.
14. A computer storage medium, wherein the computer storage medium stores instructions, and when the instructions are run on a computer, the computer executes the method according to any one of claims 1-6.
15. A computer program product, characterized in that the computer program product stores instructions which, when run on a computer, cause the computer to execute the method according to any one of claims 1-6.