CN111124311B

CN111124311B - Method for recovering raid data based on configuration information under logical volume management

Info

Publication number: CN111124311B
Application number: CN201911334627.XA
Authority: CN
Inventors: 许超明; 梁效宁; 刘波
Original assignee: Xly Salvationdata Technology Inc
Current assignee: Xly Salvationdata Technology Inc
Priority date: 2019-12-23
Filing date: 2019-12-23
Publication date: 2023-06-23
Anticipated expiration: 2039-12-23
Also published as: CN111124311A

Abstract

The invention discloses a method for recovering raid data based on configuration information under logical volume management, comprising the following steps: S100: loading each disk, wherein a cluster under logical volume management consists of one or more disks, each logical volume is distributed over one or more disks, and the configuration information comprises a configuration area, its offset address and its maximum byte length; S200: acquiring each physical volume, and acquiring the UUID of each physical volume, the offset address of the configuration area of each physical volume and the maximum byte length of the configuration area of each physical volume; S300: addressing and parsing the logical volume management configuration area of each physical volume to obtain a volume group, wherein the configuration area comprises description information of the configuration area and data of the configuration area; S400: recovering the raid data managed by the logical volume according to the logical volume name, the logical volume type, the raid parameters, the number of disks and the stripe size.

Description

Method for recovering raid data based on configuration information under logical volume management

Technical Field

The invention belongs to the field of electronic data recovery and evidence obtaining, relates to a method for recovering raid data, and particularly relates to a method for recovering raid data based on configuration information under logical volume management.

Background

The lvm (Logical Volume Manager, logical volume management) is a mechanism for managing disk partitions in a Linux environment. It allows the size of a file system to be adjusted freely without shutting the system down, and lets a file system span different disks and partitions, so the lvm device management technology is widely used in mass storage systems.

lvm supports not only the JBOD (Just a Bunch Of Disks, disk cluster) organization but also raid (Redundant Arrays of Independent Drives).

In the prior art, both the recovery and the extraction of data under lvm depend on the operating system that hosts it; if the operating system carrying the lvm is destroyed, the cluster data under logical volume management (lvm) cannot be recovered.

In addition, when recovering data under logical volume management in the prior art, it must be determined whether different physical volumes belong to the same volume group. Usually the UUID of the volume group is parsed first, and the physical volumes are then judged to belong to the same volume group if the UUIDs are identical. This approach has to parse first and judge afterwards, which is cumbersome.

Disclosure of Invention

In view of the shortcomings of the prior art, the invention obtains the raid parameters by parsing the data of the lvm configuration area, and then recovers the raid data managed by the logical volume using a metadata-based raid data recovery method, thereby achieving the purpose of recovering the raid data.

For ease of description, the invention may include the following terms:

pe: physical extent (physical block)

pv: physical volume

pvs: physical volumes

vg: volume group

vgs: volume groups

lv: logical volume

lvs: logical volumes

segment: segment

stripe: stripe

JBOD: disk cluster (Just a Bunch Of Disks)

The application of the invention comprises the following steps:

S100: loading each disk, wherein a cluster under logical volume management consists of one or more disks, each logical volume is distributed over one or more disks, and the configuration information comprises a configuration area, its offset address and its maximum byte length;

S200: acquiring each physical volume, and acquiring the UUID of each physical volume, the offset address of the configuration area of each physical volume and the maximum byte length of the configuration area of each physical volume;

S300: addressing and parsing the logical volume management configuration area of each physical volume to obtain a volume group, wherein the configuration area comprises description information of the configuration area and data of the configuration area, and step S300 comprises the following steps:

S301: addressing and parsing the description information of the configuration area of each physical volume according to the offset address of the configuration area of that physical volume, and acquiring the offset address of the data of the configuration area and the byte length of the data of the configuration area;

S302: addressing the offset address of the data of the configuration area, and acquiring the data of the configuration area according to the byte length of the data of the configuration area;

S303: parsing the data of the configuration area of each volume group, obtaining the basic information of the volume group, parsing and saving each physical volume, and parsing and saving each logical volume, wherein the information saved for a logical volume comprises the logical volume name, the logical volume type, the logical volume size, the stripe size and the number of disks;

S400: recovering the raid data managed by the logical volume according to the logical volume name, the logical volume type, the raid reference table, the number of disks and the stripe size.

Preferably, the step S200 includes the steps of:

S201: initializing the number of addressed sectors to 0;

S202: judging whether the number of addressed sectors is less than or equal to 8; if so, executing step S203, otherwise ending the flow;

S203: reading the currently addressed sector, and setting the number of addressed sectors = the number of addressed sectors + 1;

S204: judging whether the values of the current sector match the structure of the physical volume; if so, executing step S205, otherwise executing step S202, wherein the metadata information indicates that the current sector is raid;

S205: recording the UUID of the current physical volume.
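The loop of steps S201 to S205 can be sketched as follows. This is only an illustration under stated assumptions: the 512-byte sector size, the function name `find_pv_label`, and the position of the UUID at byte 0x20 of the label sector are not given in the text above, and only the signature is checked rather than every field of the structure.

```python
import io

SECTOR_SIZE = 512            # assumed sector size in bytes
LABEL_SIGNATURE = b"LABELONE"
UUID_OFFSET = 0x20           # assumed position of the UUID inside the label sector
UUID_LENGTH = 32             # assumed length of the ASCII UUID


def find_pv_label(disk, max_sector=8):
    """Probe sectors 0..max_sector and return (sector index, UUID) or None."""
    addressed = 0                                    # S201
    while addressed <= max_sector:                   # S202
        disk.seek(addressed * SECTOR_SIZE)
        sector = disk.read(SECTOR_SIZE)              # S203
        index = addressed
        addressed += 1
        if len(sector) < SECTOR_SIZE:
            break                                    # ran past the end of the image
        if sector[:8] == LABEL_SIGNATURE:            # S204 (signature check only)
            uuid = sector[UUID_OFFSET:UUID_OFFSET + UUID_LENGTH]
            return index, uuid.decode("ascii")       # S205
    return None


# Usage on a synthetic four-sector image whose label sits in sector 1.
image = bytearray(SECTOR_SIZE * 4)
image[512:520] = LABEL_SIGNATURE
image[512 + UUID_OFFSET:512 + UUID_OFFSET + UUID_LENGTH] = b"il2KJ13NkIGJmWAuo0YE3WCvxaYu5c4M"
print(find_pv_label(io.BytesIO(bytes(image))))
```

A disk image for which the function returns None would, per the description, not be treated as a disk used by lvm.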

Preferably, the structure of the physical volume has the data structure shown in table one below:

Table one: data structure of the physical volume structure

Wherein, signature: the signature of a physical volume under logical volume management, fixed as the character string LABELONE, whose corresponding value is 0x454E4F4C4542414C; it serves as the identifier for judging whether the current sector is a physical volume;

sector number: offset position of the sector holding the structure of the current physical volume;

checksum: CRC32 checksum computed from byte 0x14 to the end address of the current identification sector;

type indicator: name and version information of the logical volume management;

UUID: unique identifier of the physical volume, represented as an ASCII string;

physical volume size: byte length of the physical volume, in bytes;

lvm config area offset: offset address of the logical volume management configuration area, measured from the start address of the disk, in bytes;

lvm config area size: byte length of the logical volume management configuration area, in bytes.
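Since table one itself is not reproduced here, the following sketch decodes the listed fields using byte offsets taken from the publicly documented LVM2 on-disk label format. Those offsets, the terminated area lists, and the function name `parse_pv_label` are assumptions for illustration and are not stated in this document; the checksum is read but not verified.

```python
import struct


def parse_pv_label(sector: bytes) -> dict:
    """Decode one label sector; every offset used here is an assumption."""
    # signature, sector number, checksum, header offset, type indicator
    signature, sector_no, checksum, header_off, type_ind = struct.unpack_from(
        "<8sQII8s", sector, 0)
    if signature != b"LABELONE":
        raise ValueError("sector does not carry a physical volume signature")
    # UUID (32 ASCII bytes) followed by the physical volume size in bytes
    uuid, pv_size = struct.unpack_from("<32sQ", sector, header_off)

    def read_area_list(pos):
        # Each entry is (offset, size); an entry whose offset is 0 ends the list.
        areas = []
        while True:
            offset, size = struct.unpack_from("<QQ", sector, pos)
            pos += 16
            if offset == 0:
                return areas, pos
            areas.append((offset, size))

    _data_areas, pos = read_area_list(header_off + 40)
    config_areas, _ = read_area_list(pos)
    return {
        "sector_number": sector_no,
        "checksum": checksum,
        "type_indicator": type_ind.decode("ascii"),
        "uuid": uuid.decode("ascii"),
        "pv_size": pv_size,
        "config_areas": config_areas,  # [(lvm config area offset, size), ...]
    }
```

The first entry of `config_areas` would correspond to the lvm config area offset and lvm config area size fields described above.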

Preferably, the description information of the configuration area has the data structure shown in table two below:

Table two: data structure of the description information of the configuration area

Wherein, checksum: CRC32 checksum of the configuration area, updated whenever the data of the configuration area is updated;

signature: the fixed character string identifying the configuration area: "\x20LVM2\x20x[5A%r0N*>";

version information;

lvm config area size: byte length of the logical volume management configuration area, in bytes;

current lvm config offset: offset address of the data of the current logical volume management configuration area, measured from the start address of the logical volume management configuration area, in bytes;

current lvm config size: byte length of the data of the current logical volume management configuration area, in bytes.
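A minimal sketch of reading the current configuration data through this description information is given below. Table two is not reproduced here, so the field offsets and the 16-byte signature follow the publicly documented LVM2 metadata area header and should be read as assumptions, as should the function name `read_current_config`. The sketch does not verify the checksum and assumes the data does not wrap around the end of the configuration area.

```python
import struct

MDA_MAGIC = b" LVM2 x[5A%r0N*>"  # 16-byte signature (assumed from the LVM2 format)


def read_current_config(disk, area_offset):
    """Return the ASCII configuration data referenced by the area header."""
    disk.seek(area_offset)
    header = disk.read(512)
    # checksum, signature, version, area start, area size
    checksum, magic, version, start, area_size = struct.unpack_from(
        "<I16sIQQ", header, 0)
    if magic != MDA_MAGIC:
        raise ValueError("configuration area signature not found")
    # current lvm config offset / size, then a per-record checksum and flags
    cfg_offset, cfg_size, _cfg_crc, _flags = struct.unpack_from("<QQII", header, 40)
    # The offset is relative to the start of the configuration area.
    disk.seek(area_offset + cfg_offset)
    return disk.read(cfg_size).rstrip(b"\x00").decode("ascii")
```

Here `area_offset` plays the role of lvm config area offset from the physical volume structure, and the two values unpacked at byte 40 play the roles of current lvm config offset and current lvm config size.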

Preferably, step S400 includes the following steps:

S401: judging whether the current logical volume is one used by the raid; if so, ending the flow, otherwise executing step S402;

S402: judging whether the type of the current logical volume is raid; if so, executing step S403, otherwise ending the flow;

S403: acquiring the raid type, the stripe size and the number of disks, and consulting the raid reference table: the raid reference table is queried according to the raid type to obtain the rotation direction of the raid and the organization mode of the raid, wherein the raid reference table is shown in table three below:

Table three: raid reference table

S404: parsing and recovering the disk cluster data of the raid data blocks managed by the logical volume;

S405: acquiring the raids field in the data of the configuration area, taking the order of the disks used by the raids field as the order of the disks of the raid, and recovering the raid data managed by the logical volume according to the raid type, the stripe size, the number of disks, the rotation direction of the raid and the organization mode of the raid obtained in step S403.

Preferably, step S404 includes the following steps:

S4041: judging whether the number of segments recovered so far equals the acquired number of segments; if so, ending the flow, otherwise executing step S4042;

S4042: parsing the current segment, comprising the following steps:

S40421: obtaining the size of the current segment, in physical blocks, according to the segment number;

S40422: reading the stripes field to obtain the physical volume to which the current segment belongs, and obtaining the disk to which the current segment belongs according to the UUID;

S40423: reading the stripes field to obtain the offset address of the current segment in the physical volume, and calculating the offset address of the current segment in the disk with the following formula:

offset address of the current segment in the disk = offset address of the current segment in the physical volume × physical block size + physical block start address;

S4043: reading the data of each segment according to its offset address in the disk and its size, and concatenating the data in order to obtain the cluster data managed by the logical volume, thereby completing the recovery of the current cluster data.
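The address calculation of step S40423 and the concatenation of step S4043 can be illustrated with the short sketch below. It assumes that the physical block size and the physical block start address are both expressed in 512-byte sectors, as in the embodiment's extent_size and pe_start values; the function names and the tuple layout of `segments` are hypothetical.

```python
SECTOR_SIZE = 512  # assumed bytes per sector


def segment_start_sector(pv_off, extent_size, pe_start):
    """pv_off is in physical blocks; extent_size and pe_start are in sectors."""
    return pv_off * extent_size + pe_start


def read_segments(segments, disks, extent_size):
    """Concatenate segment data in order.

    segments: iterable of (pv_uuid, pv_off, extent_count, pe_start)
    disks:    mapping from pv UUID to an open binary file object
    """
    data = bytearray()
    for pv_uuid, pv_off, extent_count, pe_start in segments:
        disk = disks[pv_uuid]
        disk.seek(segment_start_sector(pv_off, extent_size, pe_start) * SECTOR_SIZE)
        data += disk.read(extent_count * extent_size * SECTOR_SIZE)
    return bytes(data)


# A segment at pv_off = 1 with extent_size = 8192 and pe_start = 2048
print(segment_start_sector(1, 8192, 2048))  # 10240 sectors, i.e. byte 5242880
```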

Preferably, in the step S405, a metadata-based raid data recovery method is used to recover raid data managed by a logical volume.

The invention has the following beneficial effect: it solves the technical problem that the prior art offers no method for recovering raid data based on configuration information under logical volume management.

Drawings

FIG. 1 is a general flow chart of the method provided by the present invention;

FIG. 2 is a flowchart of obtaining UUID, offset address of configuration area, and maximum byte length of configuration area according to one embodiment of the present invention;

FIG. 3 is a flowchart for recovering raid data managed by a logical volume according to the present invention.

Detailed Description

For ease of description, the invention may include the following terms:

pe: physical extent (physical block)

pv: physical volume

pvs: physical volumes

vg: volume group

vgs: volume groups

lv: logical volume

lvs: logical volumes

segment: segment

Wherein, one pv contains a plurality of pes; one or more pvs constitute a vg; one or more lvs exist in a vg, and each lv allocates its space from the vg. Unless a data format is specifically stated (e.g., ASCII codes, regular strings), data such as offset addresses are stored in little-endian format.
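As a small worked example of the little-endian convention, the eight ASCII bytes of the signature LABELONE, read as one little-endian 64-bit integer, give the value quoted for the signature field. The variable names are illustrative only.

```python
import struct

signature = b"LABELONE"

# The first byte on disk is the least significant byte of the integer.
value = int.from_bytes(signature, "little")
print(hex(value))  # 0x454e4f4c4542414c

# struct performs the same conversion; "<" selects little-endian byte order.
(value_from_struct,) = struct.unpack("<Q", signature)
```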

In addition, the patent application entitled "metadata-based raid data recovery method", application number 2019108135847, filed on August 30, 2019, is incorporated herein by reference in its entirety.

Fig. 1 shows a general flow chart of the method provided by the invention. As shown in fig. 1, the present invention includes the steps of:

S100: loading each disk, wherein a cluster under logical volume management consists of one or more disks, each logical volume is distributed over one or more disks, and the configuration information comprises a configuration area, an offset address of the configuration area and a maximum byte length of the configuration area;

specifically, the pvs are acquired, and the UUID of each pv is acquired. A disk used by lvm is first formatted as a pv; if a disk is not a pv, it is not a disk used by lvm. The UUID is a unique ID generated by lvm for each pv and is an important basis for associating one pv with another, so it needs to be saved once obtained. Step S200 includes the following steps:

S201: initializing the number of addressed sectors to 0;

S204: judging whether the values of the current sector match the structure of the physical volume; if so, executing step S205, otherwise executing step S202, wherein the metadata information indicates that the current sector is raid; the structure of the physical volume is parsed according to the data structure shown in table one below:

Table one: data structure of the physical volume structure

sector number: offset position of the sector holding the structure of the current physical volume;

type indicator: name and version information of the logical volume management;

UUID: unique identifier of the physical volume, represented as an ASCII string;

physical volume size: byte length of the physical volume, in bytes;

S205: the UUID of the current physical volume is recorded.

S300: addressing and parsing the logical volume management configuration area of each physical volume to obtain a volume group, wherein the configuration area comprises the description information of the configuration area and the data of the configuration area, and the description information of the configuration area is parsed according to the data structure shown in table two below:

Table two: data structure of the description information of the configuration area

version information;

Step S300 includes the steps of:

S301: according to the offset address of the configuration area of each physical volume, namely lvm config area offset, addressing and parsing the description information of the configuration area of each physical volume, and acquiring the offset address of the data of the configuration area and the byte length of the data of the configuration area, namely current lvm config offset and current lvm config size;

specifically, the data of the lvm configuration area is recorded and stored as ASCII characters; the offset address of the data of the current logical volume management configuration area (i.e., current lvm config offset) is measured from the start address of the logical volume management configuration area, and its byte length is current lvm config size.

The following is the data of a configuration area in one embodiment of the present invention. Because it is stored in ASCII, it can be opened directly as text. Furthermore, pvs that belong to the same vg carry identical copies of this data.

vg0 {
    id = "NINMzM-2lws-EcuP-AW3P-8k7o-p4dy-efCUaZ"
    seqno = 6
    format = "lvm2"
    status = ["RESIZEABLE", "READ", "WRITE"]
    flags = []
    extent_size = 8192
    max_lv = 0
    max_pv = 0
    metadata_copies = 0
    physical_volumes {
        pv0 {
            id = "il2KJ1-3NkI-GJmW-Auo0-YE3W-Cvxa-Yu5c4M"
            device = "/dev/sdb1"
            status = ["ALLOCATABLE"]
            flags = []
            dev_size = 2097153
            pe_start = 2048
            pe_count = 255
        }
        pv1 {
            id = "zosUjH-HQw3-ajsC-L9dg-CnW7-FbuY-ilwcTz"
            device = "/dev/sdb2"
            status = ["ALLOCATABLE"]
            flags = []
            dev_size = 2097153
            pe_start = 2048
            pe_count = 255
        }
        pv2 {
            id = "O9tY9N-a4KT-XNfj-ne7u-3efs-5ebx-sf8gQd"
            device = "/dev/sdc1"
            status = ["ALLOCATABLE"]
            flags = []
            dev_size = 2097153
            pe_start = 2048
            pe_count = 255
        }
        pv3 {
            id = "MT14Rt-pidM-QAgs-6VQ2-3Pnf-fSlA-eyiuCr"
            device = "/dev/sdc2"
            status = ["ALLOCATABLE"]
            flags = []
            dev_size = 2097153
            pe_start = 2048
            pe_count = 255
        }
    }
    logical_volumes {
        lv0 {
            id = "x5HV35-Jrwx-EgLk-OuDx-G0ht-INje-WSJ8If"
            status = ["READ", "WRITE", "VISIBLE"]
            flags = []
            creation_host = "ubuntu"
            creation_time = 1551780866
            segment_count = 1
            segment1 {
                start_extent = 0
                extent_count = 256
                type = "raid5"
                device_count = 3
                region_size = 1024
                stripe_size = 128
                raids = [
                    "lv0_rmeta_0", "lv0_rimage_0",
                    "lv0_rmeta_1", "lv0_rimage_1",
                    "lv0_rmeta_2", "lv0_rimage_2"
                ]
            }
        }
        lv0_rimage_0 {
            id = "VFH1Nm-c211-bF0t-fLyd-8D0J-dU8G-SLQUi7"
            status = ["READ", "WRITE"]
            flags = []
            creation_host = "ubuntu"
            creation_time = 1551780866
            segment_count = 1
            segment1 {
                start_extent = 0
                extent_count = 128
                type = "striped"
                stripe_count = 1
                stripes = [
                    "pv0", 1
                ]
            }
        }
        lv0_rmeta_0 {
            id = "XxVbb1-ycXU-1XZE-NTcc-JmoQ-gB8b-yQApEZ"
            status = ["READ", "WRITE"]
            flags = []
            creation_host = "ubuntu"
            creation_time = 1551780866
            segment_count = 1
            segment1 {
                start_extent = 0
                extent_count = 1
                type = "striped"
                stripe_count = 1
                stripes = [
                    "pv0", 0
                ]
            }
        }
        lv0_rimage_1 {
            id = "0eedim-z9Xa-hEsd-s0ah-Igl2-quzn-yjxRmA"
            status = ["READ", "WRITE"]
            flags = []
            creation_host = "ubuntu"
            creation_time = 1551780866
            segment_count = 1
            segment1 {
                start_extent = 0
                extent_count = 128
                type = "striped"
                stripe_count = 1
                stripes = [
                    "pv1", 1
                ]
            }
        }
        lv0_rmeta_1 {
            id = "1H1I3c-lFtN-QaX1-w77d-shr5-wb0N-aMeC5J"
            status = ["READ", "WRITE"]
            flags = []
            creation_host = "ubuntu"
            creation_time = 1551780866
            segment_count = 1
            segment1 {
                start_extent = 0
                extent_count = 1
                type = "striped"
                stripe_count = 1
                stripes = [
                    "pv1", 0
                ]
            }
        }
        lv0_rimage_2 {
            id = "wOAtat-KFlc-lOZ9-ap0A-4exn-u8lL-MnWcYW"
            status = ["READ", "WRITE"]
            flags = []
            creation_host = "ubuntu"
            creation_time = 1551780866
            segment_count = 1
            segment1 {
                start_extent = 0
                extent_count = 128
                type = "striped"
                stripe_count = 1
                stripes = [
                    "pv2", 1
                ]
            }
        }
        lv0_rmeta_2 {
            id = "yZd2xe-vvnb-6YyX-wSVK-0h6o-Tffv-rxXzRP"
            status = ["READ", "WRITE"]
            flags = []
            creation_host = "ubuntu"
            creation_time = 1551780866
            segment_count = 1
            segment1 {
                start_extent = 0
                extent_count = 1
                type = "striped"
                stripe_count = 1
                stripes = [
                    "pv2", 0
                ]
            }
        }
    }
}
# Generated by LVM2 version 2.02.133(2) (2015-10-30): Tue Mar 5 02:14:26 2019
contents = "Text Format Volume Group"
version = 1
description = ""
creation_host = "ubuntu" # Linux ubuntu 4.4.0-21-generic #37-Ubuntu SMP Mon Apr 18 18:33:37 UTC 2016 x86_64
creation_time = 1551780866 # Tue Mar 5 02:14:26 2019
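The configuration data above is nested "name { ... }" sections containing "name = value" entries, where a value is a quoted string, an integer, or a bracketed list. A minimal sketch of a parser for that shape is shown below. The function names are hypothetical, values other than strings, integers and lists are not handled, and the full-width equals sign seen in some copies of the data is normalized as a convenience.

```python
import re

# Quoted string | comment | integer | identifier | punctuation
TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|#[^\n]*|-?\d+|[A-Za-z_][\w.+-]*|[{}\[\]=,]')


def tokenize(text):
    text = text.replace("\uff1d", "=")  # normalize full-width '=' if present
    return [tok for tok in TOKEN.findall(text) if not tok.startswith("#")]


def parse_value(tokens, i):
    """Parse a string, an integer, or a [list]; return (value, next index)."""
    tok = tokens[i]
    if tok == "[":
        items = []
        i += 1
        while tokens[i] != "]":
            if tokens[i] == ",":
                i += 1
            else:
                item, i = parse_value(tokens, i)
                items.append(item)
        return items, i + 1
    if tok.startswith('"'):
        return tok[1:-1], i + 1
    return int(tok), i + 1


def parse_section(tokens, i=0):
    """Parse entries until a closing brace or the end of input."""
    section = {}
    while i < len(tokens) and tokens[i] != "}":
        name = tokens[i]
        if tokens[i + 1] == "{":
            section[name], i = parse_section(tokens, i + 2)
            i += 1  # step over the closing brace of the nested section
        else:       # name = value
            section[name], i = parse_value(tokens, i + 2)
    return section, i


def parse_config(text):
    return parse_section(tokenize(text))[0]
```

With such a dictionary, a lookup like `cfg["vg0"]["physical_volumes"]["pv0"]["pe_start"]` would give the values the later steps rely on.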

S303: parsing the data of the configuration area of each volume group, obtaining the basic information of the volume group, parsing and saving each physical volume, and parsing and saving each logical volume, wherein the information saved for a logical volume comprises the logical volume name, the logical volume type, the logical volume size, the stripe size and the number of disks;

specifically, in the above embodiment,

A. Obtaining the basic information of the volume group: the name, the UUID and the extent_size of the current vg can be obtained from the data of the configuration area in the above embodiment, where "physical_volumes { }" contains the information of all pvs used by the current volume group (vg), and "logical_volumes { }" contains all lvs created by the user as well as the lvs used by the raid.

id (NINMzM-2lws-EcuP-AW3P-8k7o-p4dy-efCUaZ) is the UUID of the vg, and the size of a pe is extent_size, with a value of 8192 sectors;

the name, id and extent_size of the current vg are then saved.

B. Parsing and saving each physical volume: all pv information resides in the physical volume block ("physical_volumes { }"). Taking the data of the configuration area in the above embodiment as an example, it is first judged whether parsing has reached the end of the physical volume block, i.e., the closing brace of "physical_volumes {"; if so, all pvs have been parsed; otherwise, the next pv is parsed and saved. For each pv, the UUID, pe_start and pe_count are saved. In the above embodiment, the starting position of pv0 is sector 2048, pe_count is 255 physical blocks (pes), and the UUID of pv0 is il2KJ1-3NkI-GJmW-Auo0-YE3W-Cvxa-Yu5c4M.

C. Parsing and saving each logical volume (i.e., "logical_volumes { }"):

"logical_volumes { }" contains all lvs created by the user as well as the lvs used by the raid. Taking the data of the configuration area in the above embodiment as an example:

lv0 is the name of the logical volume (lv);

x5HV35-Jrwx-EgLk-OuDx-G0ht-INje-WSJ8If is the UUID of the logical volume (lv);

creation_time is the creation time of the logical volume (lv), as a unix timestamp;

extent_count is the size of the logical volume (lv), in physical blocks;

type is the logical volume type of the logical volume (lv), raid5 in this embodiment, and device_count is the number of disks used by the logical volume (lv);

stripe_size is the stripe size of the raid, in sectors; the raids field (i.e., raids = [ ]) lists the logical volumes that make up the raid, where a name of the form X_rmeta_No. is the logical volume name of a super block of the raid, in which X represents the name of the logical volume and No. is an index, e.g., lv0_rmeta_0; a name of the form X_rimage_No. is the logical volume name of a data block of the raid, in which X represents the name of the logical volume and No. is an index, e.g., lv0_rimage_0;

further, when type is striped, the second entry of the stripes field (e.g., 0) gives the offset of the current segment within the pv, denoted pv_off, in units of pe, i.e., physical blocks;

since all lv information resides in the logical volume block ("logical_volumes { }"), it is first judged whether parsing has reached the end of the logical volume block, i.e., the closing brace of "logical_volumes {"; if so, all lvs have been parsed; otherwise, the next lv is parsed and saved.

Wherein, in the raid of logical volume management (lvm), the names of such lvs take the following form:

"lv0_rmeta_0"

"lv0_rimage_0"

and the corresponding regular expressions are, respectively:

"\w_rmeta_\d$"

"\w_rimage_\d$"

These logical volumes (lv) are not created by the user but are allocated for use by the raid of logical volume management (lvm). When parsing a logical volume (lv) in a volume group (vg), it is necessary to judge whether the name of the lv matches one of the above regular expressions; if so, the lv is allocated for use by the raid of logical volume management (lvm); otherwise, it is a logical volume (lv) created by the user. Note that it is the logical volumes (lv) created by the user that need to be recovered.
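The name test described above can be sketched with Python's `re` module. The patterns below allow one or more word characters before the suffix and one or more digits in the index, which is an assumption that generalizes the expressions quoted in the text; the function name `classify_lv` and the returned labels are hypothetical.

```python
import re

# Assumed generalization: "+" allows names and indexes longer than one character.
RMETA = re.compile(r"^\w+_rmeta_\d+$")    # raid super block volumes
RIMAGE = re.compile(r"^\w+_rimage_\d+$")  # raid data block volumes


def classify_lv(name):
    """Label an lv name as raid-internal or user-created."""
    if RMETA.match(name):
        return "raid super block"
    if RIMAGE.match(name):
        return "raid data block"
    return "user"


for lv_name in ("lv0", "lv0_rmeta_0", "lv0_rimage_2"):
    print(lv_name, "->", classify_lv(lv_name))
```

Only names classified as "user" would go on to the recovery steps; the other two kinds are consumed indirectly through the raids field of the user volume.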

Wherein "\w_rmeta_\d$" denotes a super block of the raid of logical volume management (lvm);

"\w_rimage_\d$" denotes a data block of the raid of logical volume management (lvm);

the super block and the data block are stored one after the other in the same pv, and the data of the configuration area in the above embodiment can be summarized as the following data allocation table:

Data allocation table of lv0 in the pvs

a. If type = "striped", the logical volume name, the segment type and the segments are saved.

b. If type = raidXXX, where XXX represents a combination of digits and/or letters and/or underscores, the following are saved by name: the names of the logical volumes (lv) used by the raid data blocks of logical volume management (lvm) in the logical volume list, the extent_count, the stripe_size, the device_count, the segment type and the raids field. Taking the data of the configuration area in the above embodiment as an example:

vg0 contains one logical volume (lv), named "lv0", whose size is (511+1) physical blocks and whose type is raid5; it occupies three pvs and uses the 6 logical volumes (lv) shown in the data allocation table, namely:

"lv0_rmeta_0","lv0_rimage_0",

"lv0_rmeta_1","lv0_rimage_1",

"lv0_rmeta_2","lv0_rimage_2"

FIG. 3 is a flowchart of the specific process for recovering raid data managed by a logical volume according to the present invention. As shown in FIG. 3, step S400 includes the following steps:

S401: judging whether the current logical volume is one used by the raid; if so, ending the flow, otherwise executing step S402; specifically, it is judged whether the name of the logical volume (lv) matches the regular expression "\w_rmeta_\d$" and/or "\w_rimage_\d$"; if so, the current logical volume (lv) is used by the raid, and the flow ends directly without further processing; otherwise, step S402 is executed;

S402: judging whether the type of the current logical volume is raid; if so, executing step S403, otherwise ending the flow; specifically, the type field is examined to determine whether it is a raid type; if so, step S403 is executed, otherwise the flow ends;

Table three: raid reference table

Specifically, according to the segment type and the table, the level of the raid, the rotation direction of the raid, the organization mode of the raid, the raid stripe size and the number of disks (device_count) are obtained, as follows:

level of raid: the number following "raid";

raid rotation direction: for example, "left-hand sync" in the table above; if no rotation direction is listed, the raid of that level has no such parameter;

organization mode of raid: for example, "rotate parity N, data continue" in the table above; if no organization mode is listed, the raid of that level has no such parameter;

raid stripe size: i.e., stripe_size.
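Extracting the raid level as "the number following raid" can be sketched as below. The optional suffix after an underscore (for example "ls" in a type such as "raid5_ls") is an assumption based on LVM's segment type naming and is not described in the text above; the function name `split_raid_type` is hypothetical.

```python
import re


def split_raid_type(segment_type):
    """Return (level, layout suffix or None), or None for non-raid types."""
    match = re.fullmatch(r"raid(\d+)(?:_(\w+))?", segment_type)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


print(split_raid_type("raid5"))     # (5, None)
print(split_raid_type("raid5_ls"))  # (5, 'ls')
print(split_raid_type("striped"))   # None
```

The level, and the suffix if one is present, could then serve as the key for looking up the rotation direction and organization mode in the reference table.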

S404: parsing and recovering the disk cluster data of the raid data blocks managed by the logical volume, including the following steps:

S4042: parsing the current segment, comprising the following steps:

S405: acquiring the raids field in the data of the configuration area, taking the order of the disks used by the raids field as the order of the disks of the raid, and recovering the raid data managed by the logical volume according to the raid type, the stripe size, the number of disks, the rotation direction of the raid and the organization mode of the raid obtained in step S403. Specifically, the raid data managed by the logical volume is recovered using a metadata-based raid data recovery method; see the application entitled "a raid data recovery method based on metadata", application number 2019108135847, filed on August 30, 2019.
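The recovery method itself is defined in the referenced application and is not reproduced here. Purely as an illustration of how a rotation direction and organization mode drive the reassembly, the sketch below de-stripes a raid5 array under the assumption of a left-symmetric layout (parity starts on the last disk and moves one disk to the left per row, with data continuing after the parity disk). The function name and the in-memory `images` argument are hypothetical.

```python
def destripe_raid5_left_symmetric(images, stripe_bytes):
    """Reassemble logical data from raid5 member images.

    images:       list of bytes objects, one per disk, in raids-field order
    stripe_bytes: stripe size in bytes (e.g. stripe_size sectors * 512)
    """
    n = len(images)
    rows = min(len(image) for image in images) // stripe_bytes
    out = bytearray()
    for row in range(rows):
        parity_disk = (n - 1) - (row % n)      # parity rotates from disk N-1 down to 0
        start = row * stripe_bytes
        for k in range(n - 1):                 # n - 1 data stripes per row
            disk = (parity_disk + 1 + k) % n   # data continues after the parity disk
            out += images[disk][start:start + stripe_bytes]
    return bytes(out)


# Three disks, 2-byte stripes; "??" marks parity stripes, which are skipped.
disk0, disk1, disk2 = b"aadd??", b"bb??ee", b"??ccff"
print(destripe_raid5_left_symmetric([disk0, disk1, disk2], 2))  # b'aabbccddeeff'
```

A real recovery would read the member data from the offsets computed in step S404 rather than from in-memory byte strings, and would use parity to rebuild a missing member, which this sketch does not attempt.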

The method provided by the invention solves the technical problem that a method for recovering raid data based on configuration information under logical volume management does not exist in the prior art.

It is to be understood that the invention is not limited to the examples described above, and that modifications and variations may be effected in light of the above teachings by those skilled in the art, all of which are intended to be within the scope of the invention as defined in the appended claims.

Claims

1. A method for recovering raid data based on configuration information under logical volume management, characterized by comprising the following steps:

S400: recovering the raid data managed by the logical volume according to the logical volume name, the logical volume type, the raid reference table, the number of disks and the stripe size, wherein step S400 includes the following steps:

Table three: raid reference table

2. The method for recovering raid data based on configuration information under logical volume management according to claim 1, wherein step S200 comprises the following steps:

S201: initializing the number of addressed sectors to 0;

S205: recording the UUID of the current physical volume.

3. The method for recovering raid data based on configuration information under logical volume management according to claim 2, wherein the structure of the physical volume has the data structure shown in table one below:

Table one: data structure of the physical volume structure

sector number: offset position of the sector holding the structure of the current physical volume;

type indicator: name and version information of the logical volume management;

UUID: unique identifier of the physical volume, represented as an ASCII string;

physical volume size: byte length of the physical volume, in bytes;

4. The method for recovering raid data based on configuration information under logical volume management according to claim 1, wherein the description information of the configuration area has the data structure shown in table two below:

Table two: data structure of the description information of the configuration area

version information;

5. The method for recovering raid data based on configuration information under logical volume management according to claim 1, wherein step S404 comprises the following steps:

S4042: parsing the current segment, comprising the following steps:

6. The method for recovering raid data based on configuration information under logical volume management according to claim 1, wherein in step S405 the raid data managed by the logical volume is recovered using a metadata-based raid data recovery method.