CN110928493B - Metadata module and metadata module processing method - Google Patents

Metadata module and metadata module processing method

Info

Publication number
CN110928493B
Authority
CN
China
Prior art keywords
module
metadata
request
transaction
offline
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911049648.7A
Other languages
Chinese (zh)
Other versions
CN110928493A (en)
Inventor
王新忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN201911049648.7A
Publication of CN110928493A
Application granted
Publication of CN110928493B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G06F3/0607 Improving or facilitating administration, e.g. storage management by facilitating the process of upgrading existing storage systems, e.g. for improving compatibility between host and storage device
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656 Data buffering arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices
    • G06F3/0688 Non-volatile semiconductor memory arrays

Abstract

The invention discloses a metadata module comprising: a metadata object module, for managing metadata objects, including LUN information and the rootNode of the B+ tree, and for initializing, updating, and recovering the ROOT area data structure; a transaction module, for ensuring the atomicity of requests; a write cache module, for caching I/O service requests in memory; a B+ tree module, for implementing the B+ tree operation algorithms of the metadata module; a read cache module, for improving the read performance of the metadata module; and a query module, for performing query operations. A metadata module processing method for handling the lower-layer offline condition of a storage system is also disclosed. The method guarantees the atomicity of metadata requests while the lower layer is offline, thereby guaranteeing data integrity and consistency and improving the reliability of metadata module development.

Description

Metadata module and metadata module processing method
Technical Field
The invention relates to the field of storage systems, in particular to a metadata module and a metadata module processing method.
Background
In the cluster architecture of an all-flash storage system, metadata is the most important component. The metadata module uses a read cache and a write cache to improve performance, which makes its processing very complex. For general I/O service, the mapping from logical addresses to physical addresses must be managed; for garbage collection, the mapping from physical addresses to logical addresses must be managed; and for deduplication, the mapping from the fingerprint value of an I/O to its physical address must be managed. A single I/O request therefore involves several mapping-management operations, and the atomicity of the transaction must be guaranteed while it is processed.
When the storage system encounters a lower-layer offline condition, such as a hard disk failure, disk removal, or a full storage pool, the hard disk must be replaced or the capacity expanded. The metadata module then has to perform more complex operations, and if these are handled improperly, data reliability and consistency are affected.
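The three mappings named in the Background can be pictured with a minimal Python sketch. The names `MappingTables` and `record_write` are hypothetical, and plain dictionaries stand in for whatever on-disk structure the real module uses; the point is only that one write touches all three tables, which is why the updates need to succeed or fail together.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class MappingTables:
    """Hypothetical in-memory stand-in for the three metadata mappings."""
    lba_to_pba: Dict[int, int] = field(default_factory=dict)          # general I/O
    pba_to_lba: Dict[int, Set[int]] = field(default_factory=dict)     # garbage collection
    fingerprint_to_pba: Dict[str, int] = field(default_factory=dict)  # deduplication


def record_write(tables: MappingTables, lba: int, pba: int, fingerprint: str) -> None:
    """Apply the three mapping updates that a single write requires."""
    tables.lba_to_pba[lba] = pba
    # With deduplication, several logical addresses may share one physical block.
    tables.pba_to_lba.setdefault(pba, set()).add(lba)
    tables.fingerprint_to_pba[fingerprint] = pba


tables = MappingTables()
record_write(tables, lba=100, pba=7, fingerprint="ab12")
record_write(tables, lba=200, pba=7, fingerprint="ab12")  # duplicate data, same block
```

If the process stopped between any two of the assignments in `record_write`, the tables would disagree with each other. That is the inconsistency the transaction mechanism is meant to prevent.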
Disclosure of Invention
To solve the above technical problems, the invention provides a metadata module and a metadata module processing method that ensure data reliability and consistency when the lower layer is offline.
To achieve this purpose, the invention adopts the following technical solution:
A metadata module, comprising:
a metadata object module: for managing metadata objects, including LUN information and the rootNode of the B+ tree, and for initializing, updating, and recovering the data structure of the ROOT area;
a transaction module: for ensuring the atomicity of requests;
a write cache module: for caching I/O service requests in memory;
a B+ tree module: for implementing the B+ tree operation algorithms of the metadata module;
a read cache module: for improving the read performance of the metadata module;
a query module: for performing query operations.
Further, ensuring the atomicity of a request includes:
if all of the sub-requests in a request are completed, the request is completed;
if any sub-request is not completed, the request is rolled back and the sub-requests that had already completed are cancelled.
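The all-or-nothing rule above can be sketched in a few lines of Python. This is an illustration under assumptions, not the patented implementation: `SubRequest` and `run_transaction` are hypothetical names, and each sub-request is assumed to carry its own undo action.

```python
from typing import Callable, List, NamedTuple


class SubRequest(NamedTuple):
    """Hypothetical sub-request: an action plus the action that cancels it."""
    apply: Callable[[], None]
    undo: Callable[[], None]


def run_transaction(sub_requests: List[SubRequest]) -> bool:
    """Return True if every sub-request completed; otherwise roll back and return False."""
    completed: List[SubRequest] = []
    for sub in sub_requests:
        try:
            sub.apply()
        except Exception:
            # One sub-request failed: cancel the completed ones in reverse order.
            for done in reversed(completed):
                done.undo()
            return False
        completed.append(sub)
    return True


state = {}


def failing_update() -> None:
    raise RuntimeError("lower layer rejected the update")


ok = run_transaction([
    SubRequest(lambda: state.update(l2p=7), lambda: state.pop("l2p")),
    SubRequest(failing_update, lambda: None),
])
```

After this run `ok` is `False` and `state` is empty again, because the first sub-request was undone when the second one failed.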
Further, the write cache module supports a WRITE_BACK mode and a WRITE_THROUGH mode, wherein:
WRITE_BACK mode: the write cache module is allocated a preset memory space, which caches the requests sent by the transaction module and flushes them when a set condition is met;
WRITE_THROUGH mode: the requests sent by the transaction module are flushed directly.
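The difference between the two modes can be sketched as follows. The text does not say what the "set condition" for flushing is, so the sketch assumes a simple threshold on the number of buffered requests; `WriteCache`, `Mode`, `capacity`, and `flush_fn` are hypothetical names.

```python
from enum import Enum
from typing import Callable, List


class Mode(Enum):
    WRITE_BACK = "WRITE_BACK"
    WRITE_THROUGH = "WRITE_THROUGH"


class WriteCache:
    """Hypothetical write cache; flush_fn stands in for the lower-layer write."""

    def __init__(self, mode: Mode, flush_fn: Callable[[List[str]], None], capacity: int = 4):
        self.mode = mode
        self.flush_fn = flush_fn
        self.capacity = capacity      # assumed flush condition: buffer is full
        self.pending: List[str] = []  # the preset memory space

    def submit(self, request: str) -> None:
        if self.mode is Mode.WRITE_THROUGH:
            self.flush_fn([request])  # flushed directly, nothing is buffered
            return
        self.pending.append(request)
        if len(self.pending) >= self.capacity:
            self.flush()

    def flush(self) -> None:
        """Flush buffered requests; a timed background task could also call this."""
        if self.pending:
            batch, self.pending = self.pending, []
            self.flush_fn(batch)


flushed: List[List[str]] = []
cache = WriteCache(Mode.WRITE_BACK, flushed.append, capacity=2)
cache.submit("req-1")  # buffered only
cache.submit("req-2")  # threshold reached, both requests are flushed as one batch
```

Exposing `flush` separately from `submit` leaves room for the timed write cache flushing task that the document lists among the background tasks.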
The invention also provides a metadata module processing method for handling the lower-layer offline condition of a storage system. Based on the metadata module described above, the method comprises the following steps:
Offline processing: stop the background tasks, cancel the metadata requests, and build a transaction redo linked list;
Re-online processing: start the write cache flush task, redo the requests in the transaction redo linked list, and restart the background tasks.
Further, the background tasks include volume deletion and timed write cache flushing.
Further, cancelling the metadata requests includes:
if the memory space request of the metadata module has not yet been processed, actively cancelling the metadata request and returning FAILED_OFFLINE to the upper layer;
if the metadata module has already obtained the memory space, checking at the module entry point whether the lower layer is currently offline; if it is not offline, processing the request normally; if it is offline, cancelling the request and returning FAILED_OFFLINE to the upper layer.
Further, the metadata request is processed by the transaction module, the write cache module, and the query module, and the requests cancelled by the transaction module are written to the transaction redo linked list.
Further, redoing the requests in the transaction redo linked list and restarting the background tasks includes:
after the transactions have been redone, notifying the upper layer that processing is complete and restarting the metadata background tasks.
The invention has the beneficial effects that:
The invention reduces the complexity of metadata module operations when the lower layer of an all-flash storage system goes offline. The transaction module guarantees the atomicity of metadata requests and thereby the integrity and consistency of data. This helps improve the reliability of metadata module development, contributes to the high-availability development of all-flash metadata modules, and effectively shortens the development cycle.
Drawings
FIG. 1 is a schematic diagram of a metadata module according to an embodiment of the present invention;
FIG. 2 is a schematic view of an off-line processing flow of a metadata module processing method according to the present invention;
FIG. 3 is a schematic diagram of a re-online processing flow of the metadata module processing method according to the present invention.
Detailed Description
In order to clearly explain the technical features of the present invention, the following detailed description of the present invention is provided with reference to the accompanying drawings. The following disclosure provides many different embodiments, or examples, for implementing different features of the invention. To simplify the disclosure of the present invention, specific example components and arrangements are described below. Furthermore, the present invention may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed. It should be noted that the components illustrated in the figures are not necessarily drawn to scale. Descriptions of well-known components and processing techniques and processes are omitted so as to not unnecessarily limit the invention.
An embodiment of the invention provides a metadata module. As shown in FIG. 1, the metadata module is internally divided into the following sub-modules:
Metadata object module: manages metadata objects, including LUN information and the rootNode of the B+ tree, and initializes, updates, and recovers the data structure of the ROOT area.
Transaction module: ensures the atomicity of requests. Because one request can be divided into multiple sub-requests, a transaction mechanism is needed to guarantee atomicity: the request is complete only if all of its sub-requests complete. If any sub-request does not complete, the request fails and must be rolled back for redo, and the sub-requests that had already completed must be cancelled.
Write cache module: caches I/O service requests in memory. It operates in either WRITE_BACK or WRITE_THROUGH mode, depending on service requirements. In WRITE_BACK mode, the write cache is allocated a preset memory space that caches the requests sent by the transaction module, and it flushes them only when a set condition is reached. In WRITE_THROUGH mode, the requests sent by the transaction module are flushed directly.
B+ tree module: implements the B+ tree operation algorithms of the metadata module. The B+ tree module is the core through which the whole metadata module interacts.
Read cache module: improves the read performance of the metadata module.
Query module: performs query operations.
An embodiment of the invention also provides a metadata module processing method for handling the lower-layer offline condition of the storage system, which includes hard disk failure, disk removal, a full storage pool, and the like. The method comprises the following steps:
Offline processing: stop the background tasks, cancel the metadata requests, and build a transaction redo linked list;
Re-online processing: start the write cache flush task, redo the requests in the transaction redo linked list, and restart the background tasks.
The background tasks include volume deletion, timed write cache flushing, and the like.
When the lower-layer RAID goes offline, the storage pool built on that RAID cannot serve reads or writes. If I/O operations arrive at this time, the metadata module must perform the corresponding offline processing. The basic principle is as follows: metadata requests that are still waiting are actively cancelled; requests that cannot be actively cancelled are cancelled passively, through a state check at each position on the I/O path; in both cases FAILED_OFFLINE is returned to the upper layer.
As shown in FIG. 2, the offline processing flow of an embodiment of the invention is as follows:
11) Stop the background tasks. Background tasks such as volume deletion and timed write cache flushing involve a large number of internal metadata I/O operations, that is, they generate a large number of read and write requests. Because these requests cannot be processed while the RAID is offline, the background tasks must be stopped first.
12) Actively cancel metadata requests. If the memory space requests of the transaction module, the write cache module, the query module, and other modules in the metadata module have not yet been processed, the metadata requests are actively cancelled, and FAILED_OFFLINE is returned to the upper layer instead of issuing the requests to the lower-layer module. For the transaction module, these requests are written to the transaction redo linked list and are taken out and redone once the lower layer is back online.
13) Passively cancel metadata requests. If the metadata module has already obtained the memory space and the stages of each module have begun processing the request, each module checks at its entry point whether the lower-layer RAID is currently offline. If it is not offline, the request is processed normally; if it is offline, the request is cancelled and FAILED_OFFLINE is returned to the upper layer. For the transaction module, these requests are written to the transaction redo linked list and are taken out and redone once the lower layer is back online.
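Steps 11) to 13) can be sketched as a small Python state machine. Everything beyond the names FAILED_OFFLINE and "redo linked list" is an assumption made for illustration: the class `MetadataModule`, the `waiting` queue of requests that have not yet obtained memory space, and the dictionary layout of a request are all hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List

FAILED_OFFLINE = "FAILED_OFFLINE"
OK = "OK"  # hypothetical success status


class MetadataModule:
    """Hypothetical model of the offline processing flow."""

    def __init__(self) -> None:
        self.offline = False
        self.background_running = True
        self.waiting: List[Dict] = []         # requests still waiting for memory space
        self.redo_list: Deque[Dict] = deque()  # transaction redo linked list

    def _cancel(self, request: Dict) -> str:
        # Only requests cancelled by the transaction module are kept for redo.
        if request["module"] == "transaction":
            self.redo_list.append(request)
        return FAILED_OFFLINE

    def handle_offline(self) -> List[str]:
        self.offline = True
        self.background_running = False                    # step 11
        results = [self._cancel(r) for r in self.waiting]  # step 12: active cancel
        self.waiting.clear()
        return results

    def enter_stage(self, request: Dict) -> str:
        # Step 13: passive cancel, checked at each module entry point.
        if self.offline:
            return self._cancel(request)
        return OK


module = MetadataModule()
module.waiting.append({"module": "transaction", "id": 1})
module.waiting.append({"module": "query", "id": 2})
statuses = module.handle_offline()
in_flight = module.enter_stage({"module": "transaction", "id": 3})
```

In this run both waiting requests and the in-flight request receive FAILED_OFFLINE, and the redo list holds the two transaction requests (ids 1 and 3) but not the query request.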
After the lower-layer RAID comes back online, the upper layer must not issue new I/O until every module has finished its processing. At this point, the metadata module therefore needs to handle the transaction sub-requests that may have failed while the RAID was offline, and to restart the background tasks.
As shown in FIG. 3, the re-online processing flow is as follows:
21) Start the write cache flush task. The write cache flush task that was stopped while the RAID was offline is started again, so that transaction requests can be issued normally.
22) Redo the transactions. During the offline process, the transaction module wrote the cancelled requests to the transaction redo linked list. After the RAID is back online, the requests in the redo linked list must be reissued first, and each module begins processing requests normally.
23) Restart the metadata background tasks. After the transactions have been redone, the metadata module can accept and issue new I/O. At this point, the upper layer is notified that processing has finished, the metadata background tasks are restarted, and the module is ready to process new I/O.
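The ordering of steps 21) to 23) is the important part of the re-online flow, and it can be sketched as a single Python function. The function name `handle_online` and its callback parameters are hypothetical; they stand in for whatever interfaces the real modules expose.

```python
from collections import deque
from typing import Callable, Deque, List


def handle_online(
    redo_list: Deque[str],
    start_flush_task: Callable[[], None],
    reissue: Callable[[str], None],
    notify_upper_layer: Callable[[], None],
    start_background_tasks: Callable[[], None],
) -> None:
    """Run the re-online steps in the order given in the text."""
    start_flush_task()                # step 21: write cache flushing resumes first
    while redo_list:                  # step 22: reissue cancelled transactions, oldest first
        reissue(redo_list.popleft())
    notify_upper_layer()              # step 23: the upper layer may now issue new I/O
    start_background_tasks()


events: List[str] = []
pending = deque(["txn-1", "txn-2"])
handle_online(
    pending,
    start_flush_task=lambda: events.append("flush task started"),
    reissue=lambda txn: events.append("redo " + txn),
    notify_upper_layer=lambda: events.append("upper layer notified"),
    start_background_tasks=lambda: events.append("background tasks started"),
)
```

Notifying the upper layer only after the redo list is empty reflects the requirement that no new I/O is issued before every module has finished its processing.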
Although the embodiments of the present invention have been described with reference to the accompanying drawings, the scope of the present invention is not limited thereto. Various other modifications and variations of the foregoing description may be apparent to those skilled in the art. It is neither necessary nor possible to enumerate all embodiments here. Any modifications or changes that a person skilled in the art can make on the basis of the technical solution of the invention without creative effort remain within the protection scope of the invention.

Claims (6)

1. A metadata module, comprising:
a metadata object module: for managing metadata objects, including LUN information and the rootNode of the B+ tree, and for initializing, updating, and recovering the ROOT area data structure;
a transaction module: for ensuring the atomicity of requests;
wherein ensuring the atomicity of a request comprises:
if all of the sub-requests in a request are completed, the request is completed;
if any sub-request is not completed, the request is rolled back and the completed sub-requests are cancelled;
a write cache module: for caching I/O service requests in memory;
wherein the write cache module supports a WRITE_BACK mode and a WRITE_THROUGH mode, in which:
WRITE_BACK mode: the write cache module is allocated a preset memory space, which caches the requests sent by the transaction module and flushes them when a set condition is reached;
WRITE_THROUGH mode: the requests sent by the transaction module are flushed directly;
a B+ tree module: for implementing the B+ tree operation algorithms of the metadata module;
a read cache module: for improving the read performance of the metadata module;
a query module: for performing query operations.
2. A metadata module processing method for handling a lower-layer offline condition of a storage system, based on the metadata module of claim 1 and comprising the following steps:
offline processing: stopping the background tasks, cancelling the metadata requests, and building a transaction redo linked list;
re-online processing: starting the write cache flush task, redoing the requests in the transaction redo linked list, and restarting the background tasks.
3. The metadata module processing method according to claim 2, wherein the background tasks include volume deletion and timed write cache flushing.
4. The metadata module processing method according to claim 2, wherein cancelling the metadata requests comprises:
if the memory space request of the metadata module has not yet been processed, actively cancelling the metadata request and returning FAILED_OFFLINE to the upper layer;
if the metadata module has already obtained the memory space, checking at the module entry point whether the lower layer is currently offline; if it is not offline, processing the request normally; if it is offline, cancelling the request and returning FAILED_OFFLINE to the upper layer.
5. The metadata module processing method according to claim 4, wherein the metadata request is processed by the transaction module, the write cache module, and the query module, and the requests cancelled by the transaction module are written to the transaction redo linked list.
6. The metadata module processing method according to claim 2, wherein redoing the requests in the transaction redo linked list and restarting the background tasks comprises:
after the transactions have been redone, notifying the upper layer that processing is complete and restarting the metadata background tasks.
CN201911049648.7A 2019-10-31 2019-10-31 Metadata module and metadata module processing method Active CN110928493B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911049648.7A CN110928493B (en) 2019-10-31 2019-10-31 Metadata module and metadata module processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911049648.7A CN110928493B (en) 2019-10-31 2019-10-31 Metadata module and metadata module processing method

Publications (2)

Publication Number Publication Date
CN110928493A (en) 2020-03-27
CN110928493B (en) 2022-07-22

Family

ID=69849989

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911049648.7A Active CN110928493B (en) 2019-10-31 2019-10-31 Metadata module and metadata module processing method

Country Status (1)

Country Link
CN (1) CN110928493B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113836051B (en) * 2021-11-29 2022-03-22 苏州浪潮智能科技有限公司 Metadata space recovery method, device, equipment and storage medium
CN116662019B (en) * 2023-07-31 2023-11-03 苏州浪潮智能科技有限公司 Request distribution method and device, storage medium and electronic device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102609335A (en) * 2012-01-12 2012-07-25 浪潮(北京)电子信息产业有限公司 Device and method for protecting metadata by copy-on-write
CN109522243A (en) * 2018-10-22 2019-03-26 郑州云海信息技术有限公司 Metadata cache management method, device and storage medium in a kind of full flash memory storage

Also Published As

Publication number Publication date
CN110928493A (en) 2020-03-27

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant