CN111143110B

CN111143110B - Metadata-based raid data recovery method in logical volume management

Info

Publication number: CN111143110B
Application number: CN201911334599.1A
Authority: CN
Inventors: 梁效宁; 许超明; 刘波
Original assignee: Xly Salvationdata Technology Inc
Current assignee: Xly Salvationdata Technology Inc
Priority date: 2019-12-23
Filing date: 2019-12-23
Publication date: 2023-08-01
Anticipated expiration: 2039-12-23
Also published as: CN111143110A

Abstract

The invention discloses a metadata-based raid data recovery method in logical volume management, comprising the following steps. S100: load a disk and search for the superblock, which comprises: S101: address the first sector of the disk; S102: read the content of the current sector and determine whether its first 8 bytes equal the first keyword; if so, execute step S103, otherwise execute step S104; S103: compare the content of the current sector with the metadata structure and determine whether they match; if so, the content of the current sector is metadata and the current block is a superblock, so execute step S200; otherwise execute step S104; S104: address the next sector; S105: determine whether the current addressing range exceeds a threshold; if so, end the flow, otherwise execute step S102. S200: parse the metadata of the superblock, obtain the value of each field in the metadata, and generate the raid parameters. S300: recover the raid data according to the raid structure.

Description

Metadata-based raid data recovery method in logical volume management

Technical Field

The invention belongs to the field of electronic data recovery and forensics. It relates to a raid data recovery method, and in particular to a metadata-based raid data recovery method in logical volume management.

Background

lvm (Logical Volume Manager, i.e. logical volume management) is a mechanism for managing disk partitions in a Linux environment. It allows the size of a file system to be adjusted freely without any downtime and allows a file system to span different disks and partitions, so lvm device management is widely used in mass storage systems.

lvm raid is a volume type in lvm. It combines the advantages of lvm and raid, supports features such as bad-disk repair and online capacity expansion, and is widely used in mass storage systems.

In general, data recovery and data extraction under lvm rely on the operating system on which the lvm depends. In the offline case, files are typically restored by analyzing the lvm configuration file. If the lvm configuration file is destroyed, data recovery and data extraction under lvm cannot be carried out.

In addition, because the prior art relies on that operating system, data under lvm cannot be recovered or extracted if the operating system on which the lvm is mounted is destroyed.

Disclosure of Invention

To address these shortcomings of the prior art, the invention provides a method that recovers lvm raid data by analyzing the information contained in the lvm raid metadata (meta data), without depending on the operating system on which the lvm relies or on the lvm raid configuration file.

For ease of description, the invention may include the following terms:

pe: physical extent

pv: physical volume

pvs: physical volumes

vg: volume group

vgs: volume groups

lv: logical volume

lvs: logical volumes

segment: segment

strip: stripe

The invention comprises the following steps:

S100: load a disk and search for the superblock, which comprises the following steps:

S101: address the first sector of the disk;

S102: read the content of the current sector and determine whether its first 8 bytes equal the first keyword; if so, execute step S103, otherwise execute step S104;

S103: compare the content of the current sector with the metadata structure and determine whether they match; if so, the content of the current sector is metadata and the current block is a superblock, so execute step S200; otherwise execute step S104. The metadata structure is shown in Table 1;

Table 1: metadata structure

S104: address the next sector;

S105: determine whether the current addressing range exceeds a threshold; if so, end the flow, otherwise execute step S102.

S200: parse the metadata of the superblock, obtain the value of each field in the metadata, and generate the raid parameters;

S300: recover the raid data according to the raid structure.

Preferably, the first keyword is the character string DmRd; it corresponds to the Signature field of the metadata and has the value 0x64526d44;

the threshold is 2048, in units of sectors.
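
The search loop of steps S101 to S105 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the 512-byte sector size, the function name `find_superblock_sector`, and the in-memory `image` argument are all hypothetical. The structural comparison of S103 is reduced to a caller-supplied predicate because Table 1 is not reproduced in this text, and the sketch compares only the four bytes that spell DmRd, leaving any check of the remaining bytes mentioned in S102 to that predicate.

```python
import struct

SECTOR_SIZE = 512          # assumption: 512-byte sectors
SIGNATURE = 0x64526D44     # "DmRd" read as a little-endian 32-bit value
THRESHOLD_SECTORS = 2048   # search range given in the text


def find_superblock_sector(image, matches_structure=lambda sector: True):
    """Return the index of the first sector that looks like a superblock, or None."""
    sector_index = 0                                    # S101: first sector
    # S105: the threshold is treated here as the number of sectors examined.
    while sector_index < THRESHOLD_SECTORS:
        start = sector_index * SECTOR_SIZE
        sector = image[start:start + SECTOR_SIZE]       # S102: read current sector
        if len(sector) < SECTOR_SIZE:
            return None                                 # ran off the end of the image
        (magic,) = struct.unpack_from("<I", sector, 0)
        # S102 (keyword check) followed by S103 (structure check, hypothetical hook)
        if magic == SIGNATURE and matches_structure(sector):
            return sector_index
        sector_index += 1                               # S104: next sector
    return None
```

A caller would pass the raw bytes of a disk image and, once the layout of Table 1 is known, a `matches_structure` predicate that validates the remaining fields.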

Preferably, the fields of the metadata obtained in step S200 include:

signature: the metadata identifier, a fixed field whose character string is DmRd and whose value is 0x64526d44

device count: number of disks used by the raid

device index: position of the current disk in the raid disk order

raid level: level of the raid

stripe size: in units of sectors

unown 1: reserved field 1; its value is identical to reserved field 1 of the other disks in the same raid group

unown 2: reserved field 2; its value is all zeros

unown 3: reserved field 3; its value is all FF

All fields are stored in little-endian format.
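
Since the fields are stored in little-endian byte order, they can be decoded with fixed-width unpacking. The sketch below is illustrative only: Table 1 is not reproduced in this text, so the byte offsets and the 32-bit width in `HYPOTHETICAL_LAYOUT` are placeholders invented for the example and would have to be replaced with the real offsets from Table 1. The function name `parse_fields` is likewise an assumption.

```python
import struct

# Placeholder offsets (bytes from the start of the sector); NOT taken from Table 1.
HYPOTHETICAL_LAYOUT = {
    "signature": 0,
    "device_count": 8,
    "device_index": 12,
    "raid_level": 16,
    "stripe_size": 20,
}


def parse_fields(sector, layout=HYPOTHETICAL_LAYOUT):
    """Decode each named field as an unsigned little-endian 32-bit integer."""
    return {name: struct.unpack_from("<I", sector, offset)[0]
            for name, offset in layout.items()}
```

The one layout-independent fact the example relies on is that the four bytes `DmRd` decode to 0x64526d44 when read as a little-endian 32-bit integer.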

Preferably, the raid parameters generated in step S200 are shown in Table 2.

Table 2: raid parameters

Preferably, the step S300 includes the steps of:

S301: determine and address the start address of the data block;

S302: extract the content of the data block, taking the start address of the data block as the first address and the end address of the physical volume as the last address;

S303: recover the raid data according to the raid parameters and the raid structure.
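
Steps S301 and S302 amount to seeking to the start of the data block and reading through to the end of the physical volume. The following is a minimal sketch, assuming the physical volume is available as a seekable binary file object; the name `extract_data_block` and the chunk size are hypothetical, and the reassembly of S303 is not shown.

```python
import io


def extract_data_block(pv, data_start, chunk_size=1 << 20):
    """Yield the bytes from data_start to the end of the physical volume in chunks."""
    pv.seek(data_start)                # S301: address the start of the data block
    while True:
        chunk = pv.read(chunk_size)    # S302: read on until the end of the pv
        if not chunk:
            break
        yield chunk
```

Reading in chunks keeps memory use bounded for large disk images; `io.BytesIO` can stand in for a real image file when experimenting.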

The invention has the following beneficial effects: it solves the technical problem that the prior art lacks a metadata-based raid data recovery method in logical volume management, and it analyzes the metadata structure and the raid parameters used in logical volume management.

Drawings

FIG. 1 is a general flow chart of the method provided by the present invention.

Detailed Description

For ease of description, the invention may include the following terms:

pe: physical extent

pv: physical volume

pvs: physical volumes

vg: volume group

vgs: volume groups

lv: logical volume

lvs: logical volumes

segment: segment

A pv contains multiple pes; one or more pvs constitute a vg; a vg contains one or more lvs; an lv is allocated space from the vg. Unless a data format is specifically stated (e.g., ASCII codes or ordinary character strings), data such as offset addresses are stored in little-endian format.

In addition, the patent application entitled "metadata-based raid data recovery method", application number 2019108135847, filed on August 30, 2019, is incorporated herein by reference in its entirety.

Fig. 1 shows a general flow chart of the method provided by the invention. As shown in fig. 1, the present invention includes the steps of:

S101: address the first sector of the disk;

S102: read the content of the current sector and determine whether its first 8 bytes equal the metadata identifier, namely the character string DmRd (value 0x64526d44); if so, execute step S103, otherwise execute step S104;

S103: compare the content of the current sector with the metadata structure and determine whether they match; if so, the content of the current sector is metadata, in other words the current block is a superblock, so execute step S200; otherwise execute step S104. The metadata structure is shown in Table 1;

Table 1: metadata structure

The fields of the obtained metadata include:

device count: number of disks used by the raid

device index: position of the current disk in the raid disk order

raid level: level of the raid

stripe size: in units of sectors

unown 2: reserved field 2; its value is all zeros

unown 3: reserved field 3; its value is all FF

All fields are stored in little-endian format.

S104: address the next sector;

S105: determine whether the current addressing range exceeds the threshold (namely 2048 sectors); if so, end the flow, otherwise execute step S102.

S200: parse the metadata of the superblock, obtain the value of each field in the metadata, and generate the raid parameters.

The generated raid parameters are shown in Table 2.

Table 2: raid parameters

Table 2 relates the lvm raid level, rotation mode, and type to one another; with it, a person skilled in the art can recover raid data based on the metadata in logical volume management.

Note that in this table the level value of raid4 in lvm is 5 rather than 4; raid4 is uniquely determined by its level together with its layout.

S300: recover the raid data according to the raid structure.

S301: determine and address the start address of the data block. The superblock in lvm raid defaults to 4 MB.

In lvm, after a disk is formatted as a pv, the system uses the whole disk for lvm raid: the front part is the lvm raid superblock and the rear part is the lvm raid data block. The data block is no larger than the pv size minus the superblock size, i.e. data block size <= pv size - superblock size. Therefore, once the lvm raid superblock has been found and its size determined, the remainder of the disk is the lvm data area.

Furthermore, the size of the superblock may vary. In that case it must be calculated, using the formula:

size of superblock = extent_size × extent_count × number of bytes per sector

The extent_size is obtained from the configuration file of the lvm pv and is in units of sectors, for example:

flags = []

extent_size = 8192

max_lv = 0

max_pv = 0

metadata_copies = 0

The extent_count is obtained from the rmeta lv and is in units of pe, for example:

segment1{

start_extent = 0

extent_count = 1
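
As a worked check of the size calculation, the sample values above can be plugged into the product of extent size, extent count, and bytes per sector. The sketch assumes 512-byte sectors, which the text does not state, and the function name `superblock_size_bytes` is hypothetical.

```python
BYTES_PER_SECTOR = 512  # assumption: 512-byte sectors


def superblock_size_bytes(extent_size, extent_count,
                          bytes_per_sector=BYTES_PER_SECTOR):
    """Superblock size = extent_size (sectors) x extent_count x bytes per sector."""
    return extent_size * extent_count * bytes_per_sector


# Sample values from the configuration fragments: extent_size = 8192, extent_count = 1
size = superblock_size_bytes(8192, 1)
```

Under that assumption the result is 4,194,304 bytes, which agrees with the 4 MB default superblock size mentioned for lvm raid.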

In addition, within one lvm raid group the start position of the data block is at the same offset relative to the superblock on every disk, so once it has been found for one disk it is known for all disks in the group.

In addition, because data block size <= pv size - superblock size, the data from the start of the data block to the end of the pv can be extracted as the lvm raid data block, so that no data is left out.

Specifically, the recovery itself uses a metadata-based raid data recovery method; see the patent application entitled "a metadata-based raid data recovery method", application number 2019108135847, filed on August 30, 2019.

The method provided by the invention solves the technical problem that a raid data recovery method based on metadata in logical volume management does not exist in the prior art.

It is to be understood that the invention is not limited to the examples described above, and that modifications and variations may be effected in light of the above teachings by those skilled in the art, all of which are intended to be within the scope of the invention as defined in the appended claims.

Claims

1. A raid data recovery method based on metadata in logical volume management is characterized by comprising the following steps:

S101: address the first sector of the disk;

S102: read the content of the current sector and determine whether its first 8 bytes equal a first keyword; if so, execute step S103, otherwise execute step S104, wherein the first keyword is the character string DmRd, which corresponds to the Signature field of the metadata and has the value 0x64526d44;

table 1: metadata structure

S104: addressing a next sector;

S105: determine whether the current addressing range exceeds a threshold; if so, end the flow, otherwise execute step S102, wherein the threshold is 2048, in units of sectors;

S200: parse the metadata of the superblock, obtain the value of each field in the metadata, and generate the raid parameters, wherein the fields of the obtained metadata comprise:

device count: number of disks used by the raid

device index: position of the current disk in the raid disk order

raid level: level of the raid

stripe size: in units of sectors

unown 2: reserved field 2; its value is all zeros

unown 3: reserved field 3; its value is all FF

each field is stored in little-endian format;

S300: recover the raid data according to the raid structure.

2. The method for recovering raid data based on metadata in logical volume management according to claim 1, wherein the raid parameters generated in step S200 are shown in table 2:

table 2: raid parameter

3. The method for recovering raid data based on metadata in logical volume management according to claim 1, wherein said step S300 comprises the steps of:

S301: determine and address the start address of the data block;