WO2021088586A1 - Method and apparatus for managing metadata in a storage system


Info

Publication number
WO2021088586A1
WO2021088586A1 PCT/CN2020/119929 CN2020119929W WO2021088586A1 WO 2021088586 A1 WO2021088586 A1 WO 2021088586A1 CN 2020119929 W CN2020119929 W CN 2020119929W WO 2021088586 A1 WO2021088586 A1 WO 2021088586A1
Authority
WO
WIPO (PCT)
Prior art keywords
metadata
data
storage
storage unit
written
Prior art date
Application number
PCT/CN2020/119929
Other languages
English (en)
Chinese (zh)
Inventor
王晨
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2021088586A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2043 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant where the redundant components share a common memory address space
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2053 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
    • G06F11/2094 Redundant storage or storage space
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation

Definitions

  • This application relates to the field of storage technology, and in particular to a method and device for managing metadata in a storage system.
  • the metadata instance can be understood as a program code used to implement a value-added service based on metadata, such as a service for snapshotting metadata or a service for cloning metadata.
  • the present application provides a method and device for managing metadata in a storage system, which are used to simplify the steps of performing redundancy protection on metadata.
  • a method for managing metadata in a storage system is provided. The storage system includes a plurality of storage units, and each storage unit is mapped to a physical storage space corresponding to at least two storage devices included in the storage system; in other words, the storage unit is a logical storage unit.
  • after metadata corresponding to the data to be written is generated, a storage unit for storing the metadata is determined from the plurality of storage units, and the metadata is stored in the at least two storage devices corresponding to the determined storage unit.
  • because each storage unit is mapped to the physical storage space corresponding to at least two storage devices, when one of the storage devices corresponding to a storage unit fails, the metadata can be recovered from the remaining storage device corresponding to the storage unit, so that redundancy protection of the metadata is realized. Therefore, in the embodiments of the present application, there is no need to create multiple metadata instances that store the same metadata, and a simpler method for redundancy protection of metadata is provided.
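  • The redundancy idea above can be illustrated with a minimal sketch. It assumes the simplest possible mapping, in which the same metadata is mirrored on every device behind a storage unit; the class names `StorageDevice` and `StorageUnit` and the in-memory dictionaries are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass, field


@dataclass
class StorageDevice:
    """Hypothetical in-memory stand-in for one physical storage device."""
    name: str
    blocks: dict = field(default_factory=dict)
    failed: bool = False


@dataclass
class StorageUnit:
    """Logical storage unit mapped to at least two storage devices."""
    unit_id: int
    devices: list

    def write(self, offset: int, metadata: bytes) -> None:
        # Assumption: mirror the same metadata on every device behind the unit.
        for dev in self.devices:
            dev.blocks[(self.unit_id, offset)] = metadata

    def read(self, offset: int) -> bytes:
        # Serve the read from any device that has not failed.
        for dev in self.devices:
            if not dev.failed:
                return dev.blocks[(self.unit_id, offset)]
        raise IOError("all devices behind the storage unit have failed")


dev_a, dev_b = StorageDevice("A"), StorageDevice("B")
unit = StorageUnit(unit_id=0, devices=[dev_a, dev_b])
unit.write(0, b"lba=42 -> unit0/block2")
dev_a.failed = True          # simulate the failure of one device
recovered = unit.read(0)     # still readable from device B
```

    Only one copy of the metadata logic exists here; the protection comes from the storage unit, not from running several metadata instances.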
  • the storage unit may store the metadata in an additional write (append) mode.
  • in this way, the efficiency of writing metadata can be improved. In addition, when new data is added to the storage system, the old data (that is, the previously stored data) may be determined to be invalid data, so multiple consecutive pieces of previously stored old data are all invalid data. The multiple consecutive storage units corresponding to these pieces of invalid data are then all storage units that need to be garbage collected, which can reduce the overhead of garbage collection.
  • a data write request for writing the data to be written into the storage system may be received, and a record item corresponding to the metadata is generated according to the data write request and the metadata. The record item includes the data write operation corresponding to the data write request and the metadata updated after the data write operation is executed.
  • when the storage unit for storing metadata fails, the metadata before the failure can be recovered from the content of the record item, which can increase the stability of the storage system.
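  • As a rough illustration of recovery from record items, the sketch below replays a list of records in order to rebuild a metadata table. The `RecordItem` fields, the key format, and the `replay` helper are assumptions made for the example; the application only states that a record item holds the write operation and the updated metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordItem:
    """Hypothetical record item: the write operation plus the metadata it produced."""
    operation: str          # the data write operation
    key: str                # which metadata entry was updated (assumed format)
    updated_metadata: str   # the metadata after the operation was executed


def replay(records):
    """Rebuild the metadata table by applying record items in order."""
    table = {}
    for rec in records:
        table[rec.key] = rec.updated_metadata
    return table


log = [
    RecordItem("write LU1 offset 0", "LU1:0", "unit0/blocks0-3"),
    RecordItem("write LU1 offset 32768", "LU1:32768", "unit0/blocks4-7"),
    RecordItem("write LU1 offset 0", "LU1:0", "unit0/blocks8-11"),
]
rebuilt = replay(log)
```

    Later records for the same key overwrite earlier ones, so the rebuilt table reflects the state just before the failure.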
  • the metadata includes:
  • the logical address of each fragment is the logical address corresponding to the storage unit occupied by the fragment;
  • the metadata includes:
  • the logical address of each copy is the logical address corresponding to the storage unit occupied by the copy;
  • the set of logical addresses of each segment included in the data to be written or the logical address of each copy included in the data to be written is the logical address of the data to be written.
  • metadata can record a variety of different contents according to actual usage requirements, which can increase the flexibility and applicability of the storage system.
  • the storage system may also create a first metadata instance for performing business operations on metadata in a preset storage unit.
  • the metadata instance no longer performs business operations on the metadata in a preset physical storage space, but instead operates on the metadata in the preset storage unit, which provides a new way of creating a metadata instance.
  • a second metadata instance may be created, and the second metadata instance can access the metadata stored in the preset storage unit.
  • when a new metadata instance is created, the new metadata instance can directly use the metadata in the shared storage unit, which removes the process of copying and transmitting metadata to the new metadata instance, reduces the time delay of creating a new metadata instance, and improves efficiency. Furthermore, since there is no need to transmit metadata between multiple metadata instances, transmission resources can be saved.
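  • The sharing behaviour can be sketched as two objects holding a reference to the same storage unit. `SharedStorageUnit` and `MetadataInstance` are hypothetical names, and a Python dictionary stands in for the preset storage unit; this is an illustration of the idea, not the application's implementation.

```python
class SharedStorageUnit:
    """Hypothetical preset storage unit holding metadata for all instances."""

    def __init__(self):
        self.entries = {}


class MetadataInstance:
    """Hypothetical metadata instance that operates on a shared storage unit."""

    def __init__(self, name, unit):
        self.name = name
        self.unit = unit          # a reference to the unit, not a copy of it

    def put(self, key, value):
        self.unit.entries[key] = value

    def get(self, key):
        return self.unit.entries[key]


preset_unit = SharedStorageUnit()
first = MetadataInstance("first", preset_unit)
first.put("LU1:0", "unit0/blocks0-3")

# Creating the second instance transfers no metadata; it only attaches to the unit.
second = MetadataInstance("second", preset_unit)
```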
  • a management device for metadata in a storage system may be a management node or a management server, or a device in a management node or a management server.
  • the management device includes a processor for implementing the method described in the first aspect.
  • the management device may also include a memory for storing program instructions and data. The memory is coupled with the processor, and the processor can call and execute the program instructions stored in the memory to implement any one of the methods described in the first aspect.
  • the processor of the metadata management device executes the program instructions in the memory to realize the following functions:
  • Determining a storage unit for storing the metadata where the storage system includes a plurality of storage units, and each storage unit is mapped to a physical storage space corresponding to at least two storage devices included in the storage system;
  • the metadata is stored in at least two storage devices corresponding to the storage unit.
  • the storage unit stores the metadata in an additional write mode.
  • the processor executes the program instructions stored in the memory to realize the following functions:
  • a record item corresponding to the metadata is generated; the record item includes the data write operation corresponding to the data write request and the metadata updated after the data write operation is executed.
  • the description of the metadata is similar to the corresponding content in the first aspect, and will not be repeated here.
  • the processor executes the program instructions stored in the memory to realize the following functions:
  • the processor executes the program instructions stored in the memory to realize the following functions:
  • a second metadata instance is created, and the second metadata instance can access the metadata stored in the preset storage unit.
  • a management device for metadata in a storage system may be a management node or a management server, or a device in a management node or a management server.
  • the management device may include a generating unit, a determining unit, and an executing unit, and these units may execute the corresponding function executed in any of the design examples of the first aspect, specifically:
  • the generating unit is used to generate metadata corresponding to the data to be written
  • a determining unit configured to determine a storage unit for storing the metadata, the storage system includes a plurality of storage units, and each storage unit is mapped to a physical storage space corresponding to at least two storage devices included in the storage system;
  • the execution unit is configured to store the metadata in at least two storage devices corresponding to the storage unit.
  • an embodiment of the present application provides a computer-readable storage medium that stores a computer program, and the computer program includes program instructions that, when executed by a computer, cause the computer to execute the method described in any one of the first aspect.
  • an embodiment of the present application provides a computer program product, the computer program product stores a computer program, the computer program includes program instructions, and when executed by a computer, the program instructions cause the computer to execute the method described in any one of the first aspect.
  • the present application provides a chip system.
  • the chip system includes a processor and may also include a memory for implementing the method described in the first aspect.
  • the chip system can be composed of chips, or it can include chips and other discrete devices.
  • an embodiment of the present application provides a storage system that includes the metadata management device of the storage system described in the second aspect and any one of the designs of the second aspect, or the storage system includes the metadata management device of the storage system described in the third aspect and any one of the designs of the third aspect.
  • FIG. 1 is a schematic diagram of an example of an application scenario of an embodiment of the application
  • FIG. 2 is a schematic structural diagram of an example of a storage unit provided by this embodiment
  • FIG. 3 is a flowchart of the data storage process in an embodiment of the application.
  • FIG. 4 is a schematic diagram of an example of multiple strips included in a storage unit in an embodiment of the application.
  • FIG. 5 is a schematic diagram of an example of a mapping relationship between a storage unit and a storage device in an embodiment of the application
  • FIG. 6 is a flowchart of the metadata storage process in an embodiment of the application.
  • FIG. 7 is a schematic diagram of an example of writing metadata to a storage unit in an embodiment of the application.
  • FIG. 8 is a schematic diagram of an example of a metadata structure in an embodiment of the application.
  • FIG. 9 is a flowchart of the garbage collection process of metadata in an embodiment of the application.
  • FIG. 10 is a flowchart of the management process of metadata instances in an embodiment of the application.
  • FIG. 11 is a schematic structural diagram of an example of a metadata management device of a storage system provided in an embodiment of the application.
  • FIG. 12 is a schematic structural diagram of another example of a management device for metadata of a storage system provided in an embodiment of the application.
  • “multiple” refers to two or more than two. In view of this, “multiple” may also be understood as “at least two” in the embodiments of the present application. “At least one” can be understood as one or more, for example, one, two or more. For example, including at least one refers to including one, two or more, and does not limit which ones are included. For example, including at least one of A, B, and C, then the included can be A, B, C, A and B, A and C, B and C, or A and B and C.
  • ordinal numbers such as “first” and “second” mentioned in the embodiments of the present application are used to distinguish multiple objects, and are not used to limit the order, timing, priority, or importance of multiple objects.
  • the metadata management method provided in the embodiments of this application can be applied to various storage systems, for example, a centralized storage system, a distributed storage system, or a cloud storage system such as a public cloud or a private cloud, and so on; there is no restriction here.
  • the application of this metadata management method in a distributed storage system is taken as an example below.
  • FIG. 1 is a schematic diagram of an example of an application scenario provided by an embodiment of this application.
  • a client server 100 and a storage system 110 are included, and the client server 100 communicates with the storage system 110.
  • the storage system 110 includes a management module 111 and at least one storage node 112 (in FIG. 1, three storage nodes 112, namely storage node 1 to storage node 3, are taken as an example), and the management module 111 is used to write data to each storage node 112 and to read data from at least one storage node 112.
  • the storage node 112 in FIG. 1 may be an independent server, or it may also be a storage array including at least one storage device.
  • the storage device may be a hard disk drive (HDD) disk device, a solid state drive (SSD) disk device, a serial advanced technology attachment (SATA) disk device, a small computer system interface (SCSI) disk device, a serial attached SCSI (SAS) disk device, or a Fibre Channel (FC) disk device, etc.
  • the management module 111 and at least one storage node 112 in FIG. 1 may be independent devices.
  • the management module 111 is an independent server; or, the management module 111 may also be a software module, which is deployed on a certain storage node 112.
  • the management module 111 and a certain storage node 112 run on the same server, and the specific forms of the management module 111 and the storage node 112 are not limited here.
  • each storage node includes at least one storage unit.
  • the storage unit is a segment of logical space.
  • the logical space is obtained by mapping the physical space of the storage devices included in the storage nodes; that is, the actual physical space still comes from multiple storage nodes.
  • FIG. 2 is a schematic structural diagram of an example of the storage unit provided in this embodiment.
  • the storage unit is a collection of multiple logical blocks.
  • the logical block is a logical space concept, which is obtained by the space division of the storage device.
  • the size of a logical block can be 4KB or 8KB, etc.
  • the size of the logical block is not limited here.
  • Each logical block corresponds to a physical storage space of the same size as the logical block in the storage device. It should be noted that the multiple logical blocks included in a storage unit come from multiple storage devices, and the multiple storage devices may come from different storage nodes, or may also come from the same storage node, which is not limited here.
  • the storage node 112 may, based on a set redundant array of independent disks (RAID) type, map the logical blocks in the logical block set included in the storage unit to data storage units for storing data fragments, and generate a checksum based on the data fragments stored in each logical block
  • a storage unit contains one or more strips.
  • the data storage unit includes at least two logic blocks
  • the verification storage unit includes at least one logic block.
  • for example, the storage node 112 takes out one logical block from each of four storage devices, such as storage device A to storage device D, and the four logical blocks form the data storage unit of a strip; it then takes out one logical block from each of two other storage devices to form the check storage unit of the strip.
  • when any two logical blocks in the strip fail (these may be the logical blocks corresponding to any two data storage units, to any two check storage units, or to one data storage unit and one check storage unit), the data in the failed logical blocks can be reconstructed according to the data in the remaining logical blocks.
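  • Reconstruction from the remaining blocks can be illustrated with a deliberately simplified sketch. It swaps in a single XOR check block, which tolerates only one failed block; the two-check-block layout described above can survive two failures but needs a second, independent check code that is not shown. The fragment contents and the `xor_blocks` helper are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four data fragments, one per storage device A to D (contents are made up).
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)            # single check block for the strip

# Simulate losing the logical block on device C and rebuild it from the rest.
survivors = [data[0], data[1], data[3], parity]
rebuilt = xor_blocks(survivors)
```

    Because XOR is its own inverse, combining the three surviving data fragments with the check block yields the missing fragment.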
  • the storage node 112 may also divide multiple logical blocks in the logical block set included in the storage unit into copy units according to the set multiple copy type.
  • each copy unit includes at least one logic block, the at least one logic block stores data, and the data stored in each copy unit is the same. For example, if a copy unit includes two logical blocks, the storage node 112 will take out one logical block from each of the two storage devices to form a copy unit. Assume that the multiple copy type is copy type 3, that is, one data needs to be stored in three copies.
  • the storage node 112 can each take out one logical block from the other four storage devices, and compose every two logical blocks into a copy unit to obtain another two copy units, and the same data is stored in the three copy units. In this way, when any copy unit fails, data can be obtained from the other two copy units.
  • the application scenario shown in FIG. 1 is taken as an example to describe the metadata management method provided by the embodiment of the present application.
  • the technical solutions of the embodiments of the present application will be introduced in the following four aspects.
  • the steps executed by the storage system 110 may all be executed by the management module 111 of the storage system 110.
  • the first aspect is the data storage process.
  • FIG. 3 is a flowchart of the data storage process in an embodiment of this application. The flowchart is described as follows:
  • the client server 100 sends a data write request to the storage system 110.
  • the data write request includes the data to be written and the virtual storage address of the data to be written.
  • the virtual storage address refers to the identifier and offset of the logical unit (LU) to which the data to be written is to be written, and the virtual storage address is an address visible to the client server 100.
  • the data write request may be obtained by the client server 100 according to a user's operation, or may be generated according to system requirements during operation.
  • the storage system 110 determines a storage unit for storing the data to be written.
  • After the management module 111 of the storage system 110 receives the data write request, it determines the storage unit of the data to be written according to the usage of the storage units in the storage system 110 and the size of the data to be written carried in the data write request.
  • the storage system 110 determines that the data to be written requires 1 storage unit. The storage system 110 determines that no data is stored before receiving the data write request, and then determines that the storage unit occupied by the data to be written is storage unit 0.
  • the initial storage unit is the storage unit 0 as an example. In other embodiments, the initial storage unit may also be the storage unit 1, which is not limited here.
  • a storage unit may include multiple strips; that is, the data storage unit of one strip includes some of the logical blocks in the logical block set corresponding to the storage unit.
  • a storage unit contains 3 strips. If the size of the data stored in a strip is 32KB, the size of a storage unit is 96KB. If the size of the data to be written is smaller than the size of a storage unit, it can be determined to store the data to be written in some of the logical blocks included in a certain storage unit, for example, in the logical blocks corresponding to at least one strip.
  • For example, a storage unit includes 12 logical blocks, and every 4 logical blocks correspond to a strip, that is, every 4 logical blocks can store 32KB of data.
  • if the storage system 110 determines that, before the data write request is received, data has already been stored in the first 4 logical blocks of storage unit 0 (that is, logical block 0 to logical block 3), it can determine to store the data to be written in logical block 4 to logical block 7 of storage unit 0.
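  • The sizing arithmetic and block selection in this example can be written out as a short sketch. The 8KB block size is derived from the example (32KB per strip spread over 4 logical blocks); the function names and the choice to allocate whole strips are assumptions for illustration.

```python
BLOCK_SIZE_KB = 8          # 32KB per strip / 4 logical blocks, per the example
BLOCKS_PER_STRIP = 4
STRIPS_PER_UNIT = 3


def unit_size_kb():
    """Data capacity of one storage unit in KB."""
    return BLOCK_SIZE_KB * BLOCKS_PER_STRIP * STRIPS_PER_UNIT


def next_blocks(used_blocks, data_kb):
    """Return the logical block numbers for the next write in a storage unit.

    used_blocks: how many data blocks of the unit already hold data.
    Assumption: whole strips are allocated, so the size is rounded up to a strip.
    """
    strip_kb = BLOCK_SIZE_KB * BLOCKS_PER_STRIP
    strips_needed = -(-data_kb // strip_kb)          # ceiling division
    count = strips_needed * BLOCKS_PER_STRIP
    total = BLOCKS_PER_STRIP * STRIPS_PER_UNIT
    if used_blocks + count > total:
        raise ValueError("not enough free blocks in this storage unit")
    return list(range(used_blocks, used_blocks + count))
```

    With the first 4 logical blocks already used, `next_blocks(4, 32)` selects logical block 4 to logical block 7, matching the example.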
  • a storage unit may correspond to more than 3 strips. For example, it can correspond to dozens or hundreds of strips.
  • the number of strips shown in Figure 4 is only an example. It should not be understood as a restriction on the storage unit.
  • each storage device may provide a segment of logical address instead of providing it to the storage unit in the form of a logical block.
  • the storage unit is a collection of multiple logical address segments.
  • the storage system 110 stores the data to be written according to the determined storage unit for storing the data to be written.
  • the management module 111 of the storage system 110 pre-stores the mapping relationship between each storage unit and the storage devices of the storage nodes. When the storage unit used to store the data to be written is determined, the data to be written is written to the corresponding storage node according to the mapping relationship.
  • the management module 111 of the storage system 110 stores the data written to the storage unit according to a preset RAID type.
  • the storage unit 0 includes 12 logical blocks, every 4 logical blocks correspond to a strip, and the 4 logical blocks are used to store data fragments.
  • logical block 0 to logical block 3 are the logical blocks used to store data fragments in the first strip
  • logical block 4 to logical block 7 are the logical blocks used to store data fragments in the second strip.
  • logical block 8 to logical block 11 are the logical blocks used to store data fragments in the third strip, and each strip also includes logical blocks used to store check data fragments; for example, the first strip also includes a logical block P0 and a logical block Q0.
  • the second strip also includes a logical block P1 and a logical block Q1
  • the third strip also includes a logical block P2 and a logical block Q2.
  • the storage system 110 presets a mapping relationship between the logical blocks included in each strip and the storage devices of the storage nodes.
  • the mapping relationship is: the 4 logical blocks used to store data fragments in each strip correspond in turn to storage device A in storage node 1 to storage node 4, and the logical blocks used to store check data fragments in each strip correspond in turn to storage device A in storage node 5 and storage node 6.
  • in the multiple strips corresponding to a storage unit, logical blocks at the same position come from the same storage node.
  • the storage unit shown in FIG. 4 includes 3 strips.
  • the first strip includes logical block 0 to logical block 3, logical block P0, and logical block Q0
  • the second strip includes logical block 4 to logical block 7, logical block P1, and logical block Q1.
  • logic block 0 and logic block 4 are located in the same position
  • logic block 1 and logic block 5 are located in the same position, and so on.
  • After the management module 111 receives the data to be written, it can divide the data to be written into multiple data fragments according to the preset RAID type, calculate the parity fragments, and store the data fragments and parity fragments in the storage devices corresponding to the respective logical blocks. For example, the size of the data to be written is 32KB, and it is determined that the data to be written is stored in logical block 4 to logical block 7; the management module 111 then divides the data to be written into 4 data fragments, each 8KB in size, and calculates 2 parity fragments from the 4 data fragments, each parity fragment also being 8KB in size.
  • the management module 111 sends each data fragment and the verification data fragment to the corresponding storage node for persistent storage.
  • the management module 111 sends 4 data fragments to storage node 1 to storage node 4 respectively, and sends 2 parity data fragments to storage node 5 and storage node 6 respectively.
  • Each storage node stores corresponding data in a preset storage device.
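  • The split-and-dispatch step can be sketched as follows. The helper names `split_fragments` and `route` are hypothetical, the node labels follow the example mapping (data fragments to storage node 1 to storage node 4, parity fragments to storage node 5 and storage node 6), and the parity calculation itself is left out because it depends on the RAID type.

```python
def split_fragments(data: bytes, count: int):
    """Split data into `count` equally sized fragments."""
    if len(data) % count:
        raise ValueError("data length must be a multiple of the fragment count")
    size = len(data) // count
    return [data[i * size:(i + 1) * size] for i in range(count)]


def route(data_fragments, parity_fragments):
    """Map fragments to storage nodes: data fragments first, then parity fragments."""
    placement = {}
    for i, frag in enumerate(data_fragments):
        placement[f"storage node {i + 1}"] = frag
    for j, frag in enumerate(parity_fragments):
        placement[f"storage node {len(data_fragments) + j + 1}"] = frag
    return placement


payload = bytes(32 * 1024)                       # 32KB of data to be written
fragments = split_fragments(payload, 4)          # four 8KB data fragments
# Parity calculation depends on the RAID type and is not shown; placeholders only.
parity = [bytes(8 * 1024), bytes(8 * 1024)]
placement = route(fragments, parity)
```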
  • the management module 111 of the storage system 110 stores the data written to the storage unit according to a preset multiple copy type.
  • the storage unit 0 includes 12 logic blocks, and each logic block is used to store data.
  • the storage system 110 presets the mapping relationship between each logical block and the storage devices of the storage nodes. For example, if the multiple copy type is 2 copies, each logical block can correspond to two different storage devices on a storage node, and the mapping relationship is: logical block 0 to logical block 3 correspond in turn to storage device A and storage device B on storage node 1 to storage node 4. The mapping relationship between other logical blocks and storage devices may be similar to that of logical block 0 to logical block 3, and will not be repeated here.
  • After the management module 111 receives the data to be written, it can copy the data to be written into multiple pieces of data according to the preset multiple copy type, and store the data to be written and the copied data in the storage devices corresponding to the respective logical blocks. For example, the size of the data to be written is 32KB, and the size of each logical block is 4KB. If it is determined to write the data to be written into logical blocks 0 to 4, the management module 111 divides the data to be written into 4 pieces of data, each 8KB in size. The 4 pieces of data are then copied to obtain 8 pieces of data, and the management module 111 sends the 8 pieces of data to the corresponding storage nodes for persistent storage. With the mapping relationship described above, the management module 111 sends each pair of identical pieces among the 8 pieces of data to storage node 1 to storage node 4 respectively, and each storage node stores the corresponding data in a preset storage device.
  • the data to be written is written into the storage unit of the storage system 110. From a physical point of view, the data is ultimately still stored in multiple storage nodes. For each fragment, the identification of the storage unit where it is located and its location inside the storage unit form the logical address of the fragment, and the actual address of the fragment in the storage node is the physical address of the fragment.
  • the second aspect is the storage process of metadata.
  • After the data to be written is stored in the storage device, in order to facilitate subsequent searching or reading of the data, the storage system 110 also needs to store the description information of the data.
  • when the storage node receives a data read request, it usually finds the metadata of the data to be read based on the information carried in the data read request, and then obtains the data to be read according to the metadata.
  • Metadata includes, but is not limited to: the correspondence between the logical address and physical address of each fragment, the correspondence between the logical address of the data and the logical address of each fragment contained in the data, the correspondence between the logical address and the physical address of each copy, and the correspondence between the logical address of the data and the logical address of each copy of the data.
  • the set of logical addresses of each fragment contained in the data or the logical address of each copy is the logical address of the data.
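  • One way to picture these correspondences is as two lookup tables, sketched below for the fragment case. The class names, the use of a (logical unit, offset) pair as the key for the data, and all address values are assumptions chosen for the example rather than details from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalAddress:
    unit_id: int      # identification of the storage unit
    offset: int       # location inside the storage unit


@dataclass(frozen=True)
class PhysicalAddress:
    node: str
    device: str
    offset: int


# Metadata kept as two lookup tables (all values are made-up examples).
data_to_fragments = {
    ("LU1", 0): [LogicalAddress(0, 4), LogicalAddress(0, 5),
                 LogicalAddress(0, 6), LogicalAddress(0, 7)],
}
fragment_to_physical = {
    LogicalAddress(0, 4): PhysicalAddress("storage node 1", "A", 8192),
    LogicalAddress(0, 5): PhysicalAddress("storage node 2", "A", 8192),
    LogicalAddress(0, 6): PhysicalAddress("storage node 3", "A", 8192),
    LogicalAddress(0, 7): PhysicalAddress("storage node 4", "A", 8192),
}


def locate(lu: str, offset: int):
    """Resolve a virtual storage address to the physical fragment addresses."""
    return [fragment_to_physical[la] for la in data_to_fragments[(lu, offset)]]


nodes = [p.node for p in locate("LU1", 0)]
```

    A read request would follow the same two hops: from the address visible to the client to the fragment logical addresses, and from there to the physical locations.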
  • FIG. 6 is a flowchart of the metadata storage process in an embodiment of this application. The flowchart is described as follows:
  • the storage system 110 generates metadata.
  • After the data to be written is stored in the storage system 110, the management module 111 of the storage system 110 generates metadata of the data to be written. For example, in the embodiment shown in FIG. 3, the management module 111 stores the data to be written in logical block 0 to logical block 4 of the storage unit, and the management module 111 then generates the metadata of the data to be written according to information such as the size of the data to be written and its storage address.
  • the content of metadata is not limited here.
  • the storage system 110 determines a storage unit for storing the metadata.
  • the physical storage space used by the storage system 110 for storing data and the physical storage space used for storing metadata are separated.
  • for example, each storage node includes 4 storage devices. Normally, compared with the data itself, the metadata of the data occupies a smaller storage space. Therefore, storage device A to storage device C in each storage node in the storage system 110 can be set to store data, and storage device D in each storage node is used to store metadata; or, if the storage system 110 includes 4 storage nodes, it is also possible to set all storage devices in storage node 1 to storage node 3 to store data, and all storage devices in storage node 4 to store metadata.
  • the storage unit used to store data and the storage unit used to store metadata are essentially the same, except that the content stored in the storage unit is different.
  • The storage unit used to store data and the storage unit used to store metadata may also differ in that they come from different storage devices.
  • the management module 111 can determine the storage unit used to store the metadata according to the usage of the storage unit used to store the metadata in the storage system 110.
  • a storage unit for storing metadata includes 6 logical blocks, and every 2 logical blocks corresponds to a stripe.
  • If the management module 111 determines that, before the metadata is generated, content has already been stored in the first two logic blocks (i.e., logic block 0 and logic block 1) of metadata storage unit 0, the management module 111 can determine to store the generated metadata in logic block 2 and logic block 3 of storage unit 0. This method can be understood as storing metadata in the storage unit in an append-write manner.
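  • The append-write allocation described above can be sketched as follows. This is a minimal illustration that assumes a storage unit with 6 logic blocks and 2 logic blocks per stripe, as in the example; the class and method names are hypothetical.

```python
class MetadataStorageUnit:
    """Hypothetical append-only allocator for the logic blocks of one storage unit."""

    def __init__(self, unit_id, num_blocks=6, blocks_per_stripe=2):
        self.unit_id = unit_id
        self.num_blocks = num_blocks
        self.blocks_per_stripe = blocks_per_stripe
        self.next_free = 0  # index of the first unused logic block

    def allocate_stripe(self):
        """Return the logic blocks of the next unused stripe, or None if the unit is full."""
        if self.next_free + self.blocks_per_stripe > self.num_blocks:
            return None
        start = self.next_free
        self.next_free += self.blocks_per_stripe
        return list(range(start, start + self.blocks_per_stripe))


unit0 = MetadataStorageUnit(0)
first = unit0.allocate_stripe()   # logic block 0 and logic block 1 are used first
second = unit0.allocate_stripe()  # newly generated metadata goes to logic blocks 2 and 3
```

  • Because blocks are only ever handed out in increasing order, previously written metadata is never overwritten in place.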
  • step S63 may be performed before step S62.
  • the storage system 110 generates a record item corresponding to the metadata.
  • After the management module 111 generates the metadata, it can obtain the write-ahead log (WAL) record item corresponding to the metadata according to the metadata and the operation corresponding to the metadata.
  • the operation corresponding to metadata is illustrated by an example.
  • the management module 111 saves the record item in the memory, and the memory can be understood as the memory of the node or server where the management module 111 is located.
  • The preset condition may be that the number of WAL record items recorded in the memory reaches a threshold. When the condition is met, the management module 111 determines to write the metadata in the multiple WAL record items recorded in the memory to the storage unit, and therefore executes step S62 to determine the storage unit corresponding to the metadata in each WAL record item.
  • The method of determining the storage unit corresponding to the metadata in each WAL record item is similar to step S62: according to the usage of the storage units used to store metadata, the storage unit used to store the metadata in each WAL record item is determined in turn, which will not be repeated here.
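  • A minimal sketch of buffering WAL record items in memory and writing them out once a count threshold is reached. The class `WalBuffer`, the callback `flush_fn`, and the operation names are hypothetical and are used only for illustration.

```python
class WalBuffer:
    """Hypothetical in-memory buffer of WAL record items."""

    def __init__(self, threshold, flush_fn):
        self.threshold = threshold  # preset number of record items
        self.flush_fn = flush_fn    # writes one record's metadata to a storage unit
        self.records = []

    def append(self, operation, metadata):
        self.records.append((operation, metadata))
        if len(self.records) >= self.threshold:
            self.flush()

    def flush(self):
        # Swap out the buffer first so a new batch can start accumulating.
        batch, self.records = self.records, []
        for operation, metadata in batch:
            self.flush_fn(operation, metadata)


written = []
wal = WalBuffer(threshold=3, flush_fn=lambda op, md: written.append((op, md)))
wal.append("update", "metadata h'")
wal.append("insert", "metadata z")
written_before_threshold = len(written)
wal.append("update", "metadata e'")  # third record item reaches the threshold
```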
  • the storage system 110 writes the metadata into the determined storage unit.
  • Step S64 is similar to step S33, and a specific example is used for description below.
  • For example, storage unit 0 for storing metadata includes 6 logic blocks, and every 2 logic blocks correspond to a stripe: logic block 0 and logic block 1 correspond to the first stripe, logic block 2 and logic block 3 correspond to the second stripe, and logic block 4 and logic block 5 correspond to the third stripe. These are the logic blocks used to store metadata fragments in each stripe. Each stripe also includes a logic block for storing the check fragment: the first stripe includes logic block P0, the second stripe includes logic block P1, and the third stripe includes logic block P2.
  • After the management module 111 determines to store the generated metadata in logic block 2 and logic block 3 of storage unit 0, the metadata to be written can be divided into multiple metadata fragments according to the preset RAID type, the check fragment is obtained by calculation, and the metadata fragments and the check fragment are stored in the storage devices corresponding to the respective logic blocks.
  • the management module 111 copies each metadata segment according to a preset multiple copy type, and then stores each metadata segment and the copied metadata segment in each storage device. It is similar to step S33 and will not be repeated here.
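  • The fragment-plus-check layout can be illustrated with single XOR parity. This is only a sketch under the assumption of a 2+1 layout with XOR as the check calculation; the application does not fix the RAID type, and the function names are hypothetical.

```python
def split_with_parity(metadata: bytes, num_fragments: int = 2):
    """Split metadata into equal-size fragments and compute one XOR check fragment."""
    size = -(-len(metadata) // num_fragments)  # ceiling division
    padded = metadata.ljust(size * num_fragments, b"\x00")
    fragments = [padded[i * size:(i + 1) * size] for i in range(num_fragments)]
    parity = bytearray(size)
    for fragment in fragments:
        for i, byte in enumerate(fragment):
            parity[i] ^= byte
    return fragments, bytes(parity)


def recover(surviving_fragments, parity: bytes) -> bytes:
    """Rebuild the single lost fragment from the survivors and the check fragment."""
    lost = bytearray(parity)
    for fragment in surviving_fragments:
        for i, byte in enumerate(fragment):
            lost[i] ^= byte
    return bytes(lost)


fragments, parity = split_with_parity(b"metadata-z")
# Simulate losing the device that holds fragments[0].
rebuilt = recover([fragments[1]], parity)
```

  • With this layout, any one of the three storage devices holding a stripe can fail and the metadata can still be reconstructed.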
  • The management module 111 can perform steps S62 and S64, or perform steps S62 to S64, to store the metadata in the corresponding storage device; that is, the management module 111 has two ways to store metadata. The management module 111 can select which of the two ways to use according to a preset judgment condition.
  • The preset judgment condition may be whether the metadata is metadata for new data or metadata for updating old data. Metadata for new data can be understood as metadata that does not need to be updated in situ, and steps S62 and S64 can be performed for it. Metadata for updating old data can be understood as metadata that needs to be updated in situ, and steps S62 to S64 can be performed for it.
  • the preset judgment condition can also be other content, which is not limited here.
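  • The choice between the two storage paths can be written as a small decision function. This sketch only encodes the example condition given above; the function name and its boolean parameter are hypothetical.

```python
def storage_steps(is_update_of_old_data: bool):
    """Return the steps performed to store a piece of metadata."""
    if is_update_of_old_data:
        # Metadata that updates old data goes through the WAL record item (S63).
        return ["S62", "S63", "S64"]
    # Metadata for new data is written directly.
    return ["S62", "S64"]


steps_for_new = storage_steps(False)
steps_for_update = storage_steps(True)
```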
  • the storage system 110 updates the metadata structure.
  • the management module 111 After the management module 111 writes the metadata into the corresponding storage device, the management module 111 also needs to update the metadata structure of the storage system 110.
  • The metadata structure may be a binary tree (Btree) or a log-structured merge-tree (LSM tree); of course, it may also be another type of structure that can be stored in an append-write manner.
  • FIG. 8(a) is the Btree corresponding to the metadata that has been saved in the storage system 110.
  • After the management module 111 stores the metadata in the corresponding storage device, it can update the Btree based on the metadata of the data to be written. For example, the Btree in FIG. 8(a) includes metadata h, metadata e, metadata s, metadata a, metadata f, and metadata q.
  • If the name of the metadata corresponding to the data to be written is metadata z, and metadata z includes metadata s, metadata z is taken as a child node of metadata s, and the Btree shown in FIG. 8(b) is obtained.
  • If the name of the metadata corresponding to the data to be written is metadata h', the Btree shown in FIG. 8(c) is obtained.
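  • A minimal sketch of the update that adds metadata z. It assumes, which the text does not state, that nodes are ordered alphabetically by metadata name as in a binary search tree; under that assumption metadata z ends up as a child of metadata s. The `Node` class and `insert` function are hypothetical.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the tree; return (root, key of the new node's parent)."""
    if root is None:
        return Node(key), None
    node = root
    while True:
        branch = "left" if key < node.key else "right"
        child = getattr(node, branch)
        if child is None:
            setattr(node, branch, Node(key))
            return root, node.key
        node = child


root = None
for name in ["h", "e", "s", "a", "f", "q"]:  # metadata already saved
    root, _ = insert(root, name)
root, parent_of_z = insert(root, "z")        # metadata of the data to be written
```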
  • Step S65 is an optional step, which is represented by a dotted line in FIG. 6.
  • the third aspect is the garbage collection process of metadata.
  • FIG. 9 is a flowchart of the garbage collection process of metadata in an embodiment of this application. The flowchart is described as follows:
  • the storage system 110 determines a storage unit used for garbage collection.
  • garbage collection is performed in units of storage units.
  • The storage unit used for garbage collection may be a storage unit whose garbage metadata reaches the first set threshold, the storage unit that contains the most garbage metadata among the multiple storage units, a storage unit whose valid metadata is lower than the second set threshold, or the storage unit that contains the least valid metadata among the multiple storage units.
  • For example, both metadata h and metadata h' are parent nodes of metadata e and metadata s, and metadata h' is stored after metadata h, so the management module 111 can determine that metadata h is garbage metadata. If the logic blocks occupied by metadata h are logic block 1 and logic block 2 of storage unit 0, storage unit 0 includes 2 garbage logic blocks, and this number is compared against a preset threshold, which may be 3.
  • In the following, storage unit 0 is taken as an example of the storage unit used for garbage collection.
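  • One of the selection rules above (garbage count reaches a threshold, preferring the unit with the most garbage) can be sketched as follows. The function name and the per-unit counts are hypothetical illustration values.

```python
def pick_gc_unit(garbage_blocks_per_unit, threshold):
    """Return the id of the storage unit to collect, or None if none qualifies."""
    candidates = {
        unit_id: count
        for unit_id, count in garbage_blocks_per_unit.items()
        if count >= threshold
    }
    if not candidates:
        return None
    # Among qualifying units, prefer the one holding the most garbage.
    return max(candidates, key=candidates.get)


victim = pick_gc_unit({0: 4, 1: 1, 2: 3}, threshold=3)
nothing_to_collect = pick_gc_unit({0: 2, 1: 1}, threshold=3)
```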
  • the storage system 110 migrates the effective metadata in the storage unit used for garbage collection to other storage units.
  • When storage unit 0 is the storage unit used for garbage collection, the valid metadata in storage unit 0 is migrated to other storage units.
  • For example, if the garbage metadata is stored in logic block 1 to logic block 4 of storage unit 0 and the valid metadata is stored in logic block 5 and logic block 6, the management module 111 migrates the valid metadata stored in logic block 5 and logic block 6 to a new storage unit, for example, storage unit 2.
  • the storage system 110 releases the storage space occupied by the storage unit used for garbage collection.
  • The management module 111 may send a deletion instruction to the storage node corresponding to storage unit 0 to delete the metadata fragments or check fragments corresponding to storage unit 0.
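  • The migrate-then-release sequence can be sketched in a few lines. The in-memory representation (a dictionary from storage unit id to a list of `(content, is_valid)` entries) and the function name are hypothetical; a real system would issue deletion instructions to storage nodes instead of clearing a list.

```python
def collect(units, victim_id, target_id):
    """Migrate valid entries out of the victim unit, then release the victim."""
    valid = [entry for entry in units[victim_id] if entry[1]]
    units[target_id].extend(valid)  # migrate valid metadata to another unit
    units[victim_id] = []           # release the space of the collected unit
    return len(valid)


units = {
    0: [("h", False), ("h", False), ("x", False), ("x", False),
        ("e", True), ("s", True)],  # four garbage blocks, two valid blocks
    2: [],
}
moved = collect(units, victim_id=0, target_id=2)
```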
  • the fourth aspect is the management process of metadata instances.
  • the storage system 110 can implement various value-added services by creating different metadata instances, such as a service for snapshotting metadata or a service for cloning metadata.
  • Metadata instances can be understood as program codes used to implement a certain value-added service.
  • FIG. 10 is a flowchart of the metadata instance management process in an embodiment of this application. The flowchart is described as follows:
  • the storage system 110 creates a first metadata instance.
  • the first metadata instance is used to perform business operations on the metadata stored in the preset storage unit.
  • the business operation is a snapshot operation, that is, the first metadata instance is an instance of snapshotting metadata in a preset storage unit.
  • the preset storage unit may be part or all of the storage units used to store metadata in the storage system 110.
  • For example, the storage units used to store metadata in the storage system 110 include storage units 0 to 4, and the preset storage units may be storage unit 0 and storage unit 1, which can be set according to actual usage.
  • For example, data needs to be stored in the storage space corresponding to physical address 1 to physical address 20 in the storage system 110.
  • the management module 111 of the storage system 110 will create at least two metadata instances for the storage space.
  • the at least two metadata instances may include metadata instance 1 and metadata instance 2.
  • the management module 111 allocates storage space for storing metadata for each metadata instance.
  • For example, the storage space configured for metadata instance 1 is the storage space corresponding to physical address 50 to physical address 55, and the storage space configured for metadata instance 2 is the storage space corresponding to physical address 60 to physical address 65.
  • Metadata instance 1 stores the metadata of the data in its configured storage space. For example, if the metadata of the data is metadata 1, metadata instance 1 stores metadata 1 in the storage space starting at physical address 50. The management module 111 of the storage system 110 then copies the metadata stored by metadata instance 1 and stores the copy in the storage space configured for metadata instance 2; for example, the management module 111 copies metadata 1 and stores the copied metadata 1 in another storage space whose starting address is physical address 60. It can be seen that, in related technologies, multiple metadata instances need to be created, which is more complicated. In the embodiments of the present application, the metadata in the storage system 110 has already been stored in the storage devices using a preset RAID type or multi-copy type, so the metadata is already redundantly protected. Therefore, there is no need to create multiple metadata instances storing the same metadata, and a simpler method for redundant protection of metadata is provided.
  • When the preset RAID type is used to store metadata, there is no need to store multiple copies of the same metadata, so the storage space occupied by the metadata can be reduced and the storage space utilization can be improved.
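  • The space saving can be made concrete with a short calculation. The numbers below are illustrative assumptions only: a stripe with 2 metadata fragments and 1 check fragment, compared with keeping 2 full copies of the same metadata.

```python
def raid_raw_size(metadata_size, data_fragments, check_fragments):
    """Raw space used when metadata is striped with check fragments."""
    return metadata_size * (data_fragments + check_fragments) / data_fragments


def replica_raw_size(metadata_size, copies):
    """Raw space used when every copy of the metadata is stored in full."""
    return metadata_size * copies


raid_space = raid_raw_size(100, data_fragments=2, check_fragments=1)
copy_space = replica_raw_size(100, copies=2)
```

  • Under these assumptions, 100 units of metadata occupy 150 units of raw space with the striped layout and 200 units with two full copies, while both tolerate the loss of one storage device.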
  • the storage system 110 determines that the first metadata instance is faulty, and then creates a second metadata instance.
  • The management module 111 can create a second metadata instance for taking a snapshot of the metadata, and set the storage units that can be accessed by the second metadata instance to be the same as those that can be accessed by the first metadata instance.
  • For example, if the storage units that can be accessed by the first metadata instance are storage unit 0 and storage unit 1, the storage units that can be accessed by the second metadata instance are also storage unit 0 and storage unit 1. Multiple metadata instances thereby share their accessible storage units, so that a newly created metadata instance can directly use the metadata in the shared storage units. This avoids copying and transferring metadata to the new metadata instance, which reduces the delay of creating a new metadata instance and improves efficiency. Furthermore, since there is no need to transmit metadata between multiple metadata instances, transmission resources can be saved.
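  • The replacement of a faulty metadata instance without copying metadata can be sketched as follows. The class `MetadataInstance`, the function `replace_if_faulty`, and the instance names are hypothetical; the point of the sketch is that the new instance receives a reference to the same storage units rather than a copy of the metadata.

```python
class MetadataInstance:
    """Hypothetical metadata instance operating on a set of storage units."""

    def __init__(self, name, storage_units):
        self.name = name
        self.storage_units = storage_units  # shared reference, not a copy
        self.faulty = False


def replace_if_faulty(instance, new_name):
    """Create a replacement instance that can access the same storage units."""
    if not instance.faulty:
        return instance
    return MetadataInstance(new_name, instance.storage_units)


shared_units = {0: ["metadata h'"], 1: ["metadata z"]}
first = MetadataInstance("snapshot-1", shared_units)
first.faulty = True
second = replace_if_faulty(first, "snapshot-2")
```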
  • The above description takes a metadata instance that manages the metadata in the preset storage units as an example; the creation and management of metadata instances are not limited to this.
  • The storage system may include a hardware structure and/or a software module, and implement the above functions in the form of a hardware structure, a software module, or a hardware structure plus a software module. Whether a certain function is executed by a hardware structure, a software module, or a hardware structure plus a software module depends on the specific application and design constraints of the technical solution.
  • FIG. 11 shows a schematic structural diagram of an apparatus 1100 for managing metadata of a storage system.
  • The apparatus 1100 for managing metadata of the storage system may be the device where the management module 111 in the embodiment shown in FIG. 3, FIG. 6, FIG. 9, or FIG. 10 is located, or it may be located in that device, to realize the functions of the management module 111.
  • the apparatus 1100 for managing metadata of the storage system may be a hardware structure or a hardware structure plus a software module.
  • the device 1100 for managing metadata of the storage system includes at least one memory for storing program instructions and/or data.
  • the apparatus 1100 for managing metadata of the storage system further includes at least one processor, the at least one processor is coupled to the memory, and the at least one processor can execute the program instructions stored in the memory.
  • the apparatus 1100 for managing metadata of a storage system may include a generating unit 1101, a determining unit 1102, and an executing unit 1103.
  • the generating unit 1101 may call the processor to execute the program instructions stored in the memory to execute step S61 in the embodiment shown in FIG. 6 and/or other processes for supporting the technology described herein.
  • The determining unit 1102 may call the processor to execute the program instructions stored in the memory to execute step S32 in the embodiment shown in FIG. 3, step S62 in the embodiment shown in FIG. 6, or step S91 in the embodiment shown in FIG. 9, and/or other processes used to support the technology described herein.
  • The execution unit 1103 may call the processor to execute the program instructions stored in the memory to execute step S33 in the embodiment shown in FIG. 3, steps S63 to S65 in the embodiment shown in FIG. 6, steps S92 to S93 in the embodiment shown in FIG. 9, or steps S101 to S102 in the embodiment shown in FIG. 10, and/or other processes used to support the technology described herein.
  • The apparatus 1100 for managing metadata of the storage system may further include a receiving unit 1104, which may call the processor to execute the program instructions stored in the memory to execute step S31 in the embodiment shown in FIG. 3, and/or other processes used to support the techniques described herein.
  • the receiving unit 1104 is used for the storage system metadata management device 1100 to communicate with other modules, and it can be a circuit, a device, an interface, a bus, a software module, a transceiver, or any other device that can implement communication.
  • the receiving unit 1104 is not necessary. In FIG. 11, the receiving unit 1104 is represented by a dotted line.
  • the division of modules in the embodiment shown in FIG. 11 is illustrative, and is only a logical function division. In actual implementation, there may be other division methods.
  • The functional modules in each embodiment of the present application may be integrated in one processor, may exist alone physically, or two or more modules may be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or software function modules.
  • FIG. 12 shows an apparatus 1200 for managing metadata of a storage system provided by an embodiment of the present application.
  • The apparatus 1200 for managing metadata of a storage system may be the device where the management module 111 in the embodiment shown in FIG. 3, FIG. 6, FIG. 9, or FIG. 10 is located, or it may be located in that device, and can be used to implement the functions of the management module 111.
  • the apparatus 1200 for managing metadata of a storage system includes at least one processor 1220, and the apparatus 1200 for managing metadata of a storage system is used to implement or support the function of the management module 111 in the method provided in the embodiment of the present application.
  • the processor 1220 may determine a storage unit for storing metadata. For details, refer to the detailed description in the method example, which is not repeated here.
  • the apparatus 1200 for managing metadata of the storage system may further include at least one memory 1230 for storing program instructions and/or data.
  • the memory 1230 and the processor 1220 are coupled.
  • the coupling in the embodiments of the present application is an indirect coupling or communication connection between devices, units or modules, and may be in electrical, mechanical or other forms, and is used for information exchange between devices, units or modules.
  • the processor 1220 may operate in cooperation with the memory 1230.
  • the processor 1220 may execute program instructions stored in the memory 1230. At least one of the at least one memory may be included in the processor.
  • the apparatus 1200 for managing metadata of the storage system may further include a communication interface 1210 for communicating with other devices through a transmission medium, so that the apparatus 1200 for managing metadata of the storage system may communicate with other devices.
  • the other device may be a client or a storage device.
  • the processor 1220 may use the communication interface 1210 to send and receive data.
  • the embodiment of the present application does not limit the specific connection medium between the aforementioned communication interface 1210, the processor 1220, and the memory 1230.
  • the memory 1230, the processor 1220, and the communication interface 1210 are connected by a bus 1250.
  • The bus is represented by a thick line in FIG. 12, which is not limiting.
  • the bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in FIG. 12 to represent it, but it does not mean that there is only one bus or one type of bus.
  • The processor 1220 may be a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application.
  • the general-purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware processor, or executed and completed by a combination of hardware and software modules in the processor.
  • The memory 1230 may be a non-volatile memory, such as a hard disk drive (HDD) or a solid-state drive (SSD), or a volatile memory, for example, random-access memory (RAM).
  • The memory may also be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to this.
  • the memory in the embodiments of the present application may also be a circuit or any other device capable of realizing a storage function for storing program instructions and/or data.
  • The embodiments of the present application also provide a computer-readable storage medium, including instructions which, when run on a computer, cause the computer to execute the method executed by the management module 111 in the embodiment shown in FIG. 3, FIG. 6, FIG. 9, or FIG. 10.
  • The embodiments of the present application also provide a computer program product, including instructions which, when run on a computer, cause the computer to execute the method executed by the management module 111 in the embodiment shown in FIG. 3, FIG. 6, FIG. 9, or FIG. 10.
  • the embodiment of the present application provides a storage system, and the storage system includes the management module 111 in the embodiment shown in FIG. 3 or FIG. 6 or FIG. 9 or FIG. 10.
  • the methods provided in the embodiments of the present application may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • When implemented by software, the methods can be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, network equipment, user equipment, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from a website, computer, server, or data center.
  • a computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or data center integrated with one or more available media.
  • The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, an SSD).

Abstract

The present invention relates to a method and an apparatus for managing metadata in a storage system. The storage system includes a plurality of storage units, and each storage unit is mapped to physical storage space of at least two storage devices included in the storage system; that is, the storage unit is a logical storage unit. In the method, after metadata corresponding to data to be written is generated in the storage system, the storage unit used to store the metadata is determined from the plurality of storage units included in the storage system. The metadata is thus stored in the at least two storage devices corresponding to the determined storage unit. Because each storage unit is mapped to physical storage space of at least two storage devices, if one of the storage devices corresponding to a storage unit fails, the metadata can be recovered from the remaining storage device corresponding to the storage unit, which provides redundancy protection for the metadata.
PCT/CN2020/119929 2019-11-05 2020-10-09 Method and apparatus for managing metadata in a storage system WO2021088586A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201911072812.6 2019-11-05
CN201911072812 2019-11-05
CN202010021351.6A CN112783698A (zh) 2019-11-05 2020-01-09 Method and apparatus for managing metadata in a storage system
CN202010021351.6 2020-01-09

Publications (1)

Publication Number Publication Date
WO2021088586A1 true WO2021088586A1 (fr) 2021-05-14

Family

ID=75749970

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/119929 WO2021088586A1 (fr) 2019-11-05 2020-10-09 Method and apparatus for managing metadata in a storage system

Country Status (2)

Country Link
CN (1) CN112783698A (fr)
WO (1) WO2021088586A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113342751B (zh) * 2021-07-30 2021-11-09 联想凌拓科技有限公司 Metadata processing method, apparatus, device, and readable storage medium
CN113867642B (zh) * 2021-09-29 2023-08-04 杭州海康存储科技有限公司 Data processing method, apparatus, and storage device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
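CN1776675A (zh) * 2004-11-17 2006-05-24 国际商业机器公司 Method and system for storing and using metadata in multiple storage locations
CN107622019A (zh) * 2016-07-14 2018-01-23 爱思开海力士有限公司 Memory system and operating method thereof
CN108108308A (zh) * 2016-11-24 2018-06-01 爱思开海力士有限公司 Memory system and operating method thereof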
US20190079859A1 (en) * 2017-09-13 2019-03-14 Intel Corporation Apparatus, computer program product, system, and method for managing multiple regions of a memory device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8819208B2 (en) * 2010-03-05 2014-08-26 Solidfire, Inc. Data deletion in a distributed data storage system

Also Published As

Publication number Publication date
CN112783698A (zh) 2021-05-11

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20884466

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20884466

Country of ref document: EP

Kind code of ref document: A1