CN111124311A

CN111124311A - Configuration information-based raid data recovery method under logical volume management

Info

Publication number: CN111124311A
Application number: CN201911334627.XA
Authority: CN
Inventors: 许超明; 梁效宁; 刘波
Original assignee: Xly Salvationdata Technology Inc
Current assignee: Xly Salvationdata Technology Inc
Priority date: 2019-12-23
Filing date: 2019-12-23
Publication date: 2020-05-08
Anticipated expiration: 2039-12-23
Also published as: CN111124311B

Abstract

The invention discloses a method for recovering raid data based on configuration information under logical volume management, comprising the following steps. S100: loading each disk, wherein a disk cluster managed by logical volume management is composed of one or more disks, each logical volume is allocated on one or more disks, and the configuration information comprises a configuration area, the offset address of the configuration area and the maximum byte length of the configuration area. S200: acquiring each physical volume, together with its UUID, the offset address of its configuration area and the maximum byte length of its configuration area. S300: addressing and parsing the logical volume management configuration area of each physical volume to obtain a volume group, wherein the configuration area comprises description information of the configuration area and data of the configuration area. S400: recovering the raid data managed by the logical volume according to the logical volume name, the logical volume type, the raid parameters, the number of disks and the stripe size.

Description

Configuration information-based raid data recovery method under logical volume management

Technical Field

The invention belongs to the field of electronic data recovery and forensics, relates to a method for recovering raid data, and particularly relates to a method for recovering raid data based on configuration information under logical volume management.

Background

lvm is short for Logical Volume Manager, a mechanism for managing disk partitions in a Linux environment. It allows the size of a file system to be adjusted freely without shutting the system down, and allows a file system to span different disks and partitions, so lvm device management is widely used in large-capacity storage systems.

lvm supports not only the JBOD (Just a Bunch Of Disks) organization but also the raid (Redundant Arrays Of Independent Drives) organization.

In the prior art, data recovery and extraction under lvm depend on the operating system that hosts the lvm; if that operating system is damaged, the data of the disk cluster managed by lvm cannot be recovered.

In addition, when recovering data under logical volume management in the prior art, it is necessary to determine whether different physical volumes belong to the same volume group. Usually the UUID of the volume group is parsed first, and physical volumes whose volume group UUIDs are identical are judged to belong to the same volume group.

Disclosure of Invention

In view of the defects of the prior art, the method provided by the invention obtains the raid parameters by parsing the data of the lvm configuration area, and then recovers the raid data managed by the logical volume using a metadata-based raid data recovery method.

For ease of description, the present invention may include the following terms:

pe: physical extent (physical block)

pv: physical volume

pvs: physical volumes

vg: volume group

vgs: volume groups

lv: logical volume

lvs: logical volumes

segment: segment

stripe: stripe

JBOD: disk cluster (Just a Bunch Of Disks)

The invention comprises the following steps:

S100: loading each disk, wherein a disk cluster managed by logical volume management is composed of one or more disks, each logical volume is allocated on one or more disks, and the configuration information comprises a configuration area, the offset address of the configuration area and the maximum byte length of the configuration area;

S200: acquiring each physical volume, together with its UUID, the offset address of its configuration area and the maximum byte length of its configuration area;

S300: addressing and parsing the logical volume management configuration area of each physical volume to obtain a volume group, wherein the configuration area includes description information of the configuration area and data of the configuration area, and step S300 includes the following steps:

S301: addressing and parsing the description information of the configuration area of each physical volume according to the offset address of the configuration area of that physical volume, and acquiring the offset address and the byte length of the data of the configuration area;

S302: addressing the offset address of the data of the configuration area, and reading the data of the configuration area according to its byte length;

S303: parsing the data of the configuration area of each volume group, acquiring the basic information of the volume group, parsing and storing each physical volume, and parsing and storing each logical volume, wherein the information of a logical volume comprises the logical volume name, the logical volume type, the logical volume size, the stripe size and the number of disks;

S400: recovering the raid data managed by the logical volume according to the logical volume name, the logical volume type, the raid reference table, the number of disks and the stripe size.

Preferably, the step S200 includes the steps of:

S201: setting the number of addressed sectors to an initial value of 0;

S202: judging whether the number of addressed sectors is less than or equal to 8; if so, executing step S203, otherwise ending the flow;

S203: reading the currently addressed sector and incrementing the number of addressed sectors by 1;

S204: judging whether the values of the current sector match the structure of the physical volume; if so, the current sector holds the metadata information of raid and step S205 is executed, otherwise step S202 is executed;

S205: recording the UUID of the current physical volume.

Preferably, the structure of the physical volume has the data structure shown in Table 1 below:

Table 1: data structure of the physical volume

Wherein, signature: the signature of a physical volume under logical volume management, fixed to the character string "LABELONE"; read as a little-endian 64-bit integer its value is 0x454E4F4C4542414C, and it is used as the identifier for judging whether the current sector is a physical volume;

sector number: the sector offset at which the structure of the current physical volume is located;

checksum: a CRC32 checksum computed from byte offset 0x14 to the end of the currently identified sector;

type indicator: name and version information of logical volume management;

UUID: the unique identifier of the physical volume, represented as an ASCII string;

physical volume size: the byte length of the physical volume, in bytes;

lvm config area offset: the offset address of the logical volume management configuration area, measured from the start address of the disk, in bytes;

lvm config area size: the byte length of the logical volume management configuration area, in bytes.
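The scan of steps S201 to S205 can be sketched as follows. This is only an illustration under stated assumptions: Table 1 is not reproduced in this text, so the 512-byte sector size and the position of the 32-character UUID (byte 0x20 of the matching sector, as in the common LVM2 on-disk layout) are assumptions, and the names `find_pv_label`, `SECTOR_SIZE` and `UUID_OFFSET` are hypothetical.

```python
import struct

SECTOR_SIZE = 512   # assumption: 512-byte sectors
UUID_OFFSET = 0x20  # assumption: UUID position inside the label sector
UUID_LENGTH = 32    # assumption: 32 ASCII characters, stored without hyphens

# The signature value from the description, written out in little-endian
# byte order, is the ASCII string "LABELONE".
LABEL_SIGNATURE = struct.pack("<Q", 0x454E4F4C4542414C)


def find_pv_label(image, max_sectors=8):
    """Scan sectors 0..max_sectors of a disk image for the pv signature.

    Returns (sector_index, uuid) for the first matching sector, or None
    when no sector in the scanned range carries the signature.
    """
    for sector in range(max_sectors + 1):          # S202: count <= 8
        start = sector * SECTOR_SIZE
        block = image[start:start + SECTOR_SIZE]   # S203: read the sector
        if len(block) < SECTOR_SIZE:
            break                                  # ran past the image
        if block[:8] != LABEL_SIGNATURE:           # S204: no match, go on
            continue
        raw_uuid = block[UUID_OFFSET:UUID_OFFSET + UUID_LENGTH]
        return sector, raw_uuid.decode("ascii")    # S205: record the UUID
    return None
```

A caller would pass the raw bytes of a disk (or of its first few sectors) and store the returned UUID for the later matching of segments to disks.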

Preferably, the description information of the configuration area has the data structure shown in Table 2 below:

Table 2: data structure of the description information of the configuration area

Wherein, checksum: a CRC32 checksum of the configuration area, which is updated whenever the data of the configuration area is updated;

signature: the fixed character string that identifies the configuration area: "\x20LVM2\x20x[5A%r0N]"

version: version information

lvm config area size: the byte length of the logical volume management configuration area, in bytes;

current lvm config offset: the offset address of the data of the current logical volume management configuration area, measured from the start address of the logical volume management configuration area, in bytes;

current lvm config size: the byte length of the data of the current logical volume management configuration area, in bytes.

Preferably, the step S400 includes the steps of:

S401: acquiring the current logical volume and judging whether it is used by raid; if so, ending the process, otherwise executing step S402;

S402: acquiring the type of the current logical volume and judging whether it is raid; if so, executing step S403, otherwise ending the process;

S403: acquiring the raid type, the stripe size and the number of disks, and looking up the raid reference table: according to the raid type, the raid reference table is queried to obtain the rotation direction of the raid and the organization mode of the raid, wherein the raid reference table is shown in Table 3 below:

Table 3: raid reference table

S404: parsing and recovering the disk cluster data of the raid data blocks managed by the logical volume;

S405: acquiring the raids field in the data of the configuration area, taking the order of the disks used by the raids field as the order of the raid disks, and recovering the raid data managed by the logical volume according to the raid type, the stripe size, the number of disks, the raid rotation direction and the raid organization mode obtained in step S403.

Preferably, the step S404 includes the steps of:

S4041: judging whether the number of segments recovered so far is equal to the number of segments acquired; if so, ending the process, otherwise executing step S4042;

S4042: parsing the current segment, comprising the following steps:

S40421: acquiring the size of the current segment according to the acquired number of segments, in units of physical blocks;

S40422: reading the stripes field, acquiring the physical volume to which the current segment belongs, and acquiring the disk to which the current segment belongs according to the UUID;

S40423: reading the stripes field, obtaining the offset address of the current segment in the physical volume, and calculating the offset address of the current segment in the disk by the following formula:

offset address of the current segment in the disk = offset address of the current segment in the physical volume × physical block size + physical block start address;

S4043: reading the data of each segment according to the offset address of each segment in the disk and the size of each segment, and concatenating the data in order to obtain the disk cluster data managed by the logical volume, thereby completing the recovery of the current disk cluster data.
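The address calculation of step S40423 and the concatenation of step S4043 can be sketched as below. The sketch assumes 512-byte sectors, an offset in the physical volume expressed in physical blocks (pe), and a physical block size and start address expressed in sectors, as in the configuration data of the embodiment. The function names and the dictionary keys `pv_uuid`, `pv_off`, `pe_start` and `extent_count` are hypothetical.

```python
SECTOR_SIZE = 512  # assumption: 512-byte sectors


def segment_disk_offset(pv_off, extent_size, pe_start):
    """Offset of a segment in its disk, in sectors (formula of S40423).

    pv_off      -- offset of the segment in the pv, in physical blocks
    extent_size -- size of one physical block, in sectors
    pe_start    -- start address of the first physical block, in sectors
    """
    return pv_off * extent_size + pe_start


def read_segments(disks, segments, extent_size):
    """Read each segment from its disk and concatenate them in order.

    disks    -- maps a pv UUID to the raw bytes of the matching disk
    segments -- list of dicts describing the segments, in order
    """
    data = bytearray()
    for seg in segments:
        disk = disks[seg["pv_uuid"]]
        start = segment_disk_offset(
            seg["pv_off"], extent_size, seg["pe_start"]) * SECTOR_SIZE
        length = seg["extent_count"] * extent_size * SECTOR_SIZE
        data += disk[start:start + length]
    return bytes(data)
```

With the values of the embodiment (pv_off 1, extent_size 8192 sectors, pe_start 2048 sectors) the first function gives sector 10240 as the start of a data sub-volume.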

Preferably, in step S405, a raid data recovery method based on metadata is used to recover raid data managed by the logical volume.

The invention has the following beneficial effects: the method solves the technical problem that no method for recovering raid data based on configuration information under logical volume management exists in the prior art.

Drawings

FIG. 1 is a general flow diagram of a method provided by the present invention;

FIG. 2 is a detailed flowchart of obtaining the UUID, the offset address of the configuration area and the maximum byte length of the configuration area of each physical volume according to an embodiment of the present invention;

FIG. 3 is a specific flowchart of recovering raid data managed by logical volumes in the present invention.

Detailed Description

For ease of description, the present invention may include the following terms:

pe: physical extent (physical block)

pv: physical volume

pvs: physical volumes

vg: volume group

vgs: volume groups

lv: logical volume

lvs: logical volumes

segment: segment

Wherein, one pv contains a plurality of pes; one or more pvs make up a vg; a vg contains one or more lvs; an lv allocates its space from the vg. Except for data in a specific format (such as ASCII codes and fixed character strings), data such as offset addresses are stored in little-endian format.

In addition, the present application incorporates by reference the entire content of the invention application entitled "a method for recovering raid data based on metadata", application No. 2019108135847, filed on August 30, 2019.
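As a small illustration of the little-endian storage mentioned above, the following sketch decodes an 8-byte offset field. The helper name `decode_u64_le` is hypothetical, and the 8-byte field width is an assumption used only for the example.

```python
import struct


def decode_u64_le(raw):
    """Decode an 8-byte little-endian unsigned integer."""
    return struct.unpack("<Q", raw)[0]


# The least significant byte comes first, so these bytes encode 0x1000.
offset = decode_u64_le(bytes.fromhex("0010000000000000"))
```

Reading the same bytes as big-endian would give a very different value, which is why the byte order has to be fixed before any offset is used for addressing.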

Fig. 1 shows a general flow chart of the method provided by the present invention. As shown in fig. 1, the present invention comprises the steps of:

S100: loading each disk, wherein a disk cluster managed by logical volume management is composed of one or more disks, each logical volume is allocated on one or more disks, and the configuration information comprises a configuration area, the offset address of the configuration area and the maximum byte length of the configuration area;

Specifically, the pvs are obtained, and the UUID of each pv is obtained. A disk used by lvm is first formatted as a pv; a disk that has not been formatted as a pv is not a disk used by lvm. The UUID is a unique ID generated by lvm for each pv and is the key link between an lv and its pvs, so it needs to be stored after it is acquired. Step S200 includes the following steps:

S201: setting the number of addressed sectors to an initial value of 0;

S204: judging whether the values of the current sector match the structure of the physical volume; if so, the current sector holds the metadata information of raid and step S205 is executed, otherwise step S202 is executed; after parsing, the structure of the physical volume has the data structure shown in Table 1 below:

Table 1: data structure of the physical volume

type indicator: name and version information of logical volume management;

physical volume size: the byte length of the physical volume, in bytes;

S205: the UUID of the current physical volume is recorded.

S300: addressing and parsing the logical volume management configuration area of each physical volume to obtain volume groups, wherein the configuration area comprises the description information of the configuration area and the data of the configuration area, and after parsing the description information of the configuration area has the data structure shown in Table 2 below:

Table 2: data structure of the description information of the configuration area

version information

Step S300 includes the steps of:

S301: addressing and parsing the description information of the configuration area of each physical volume according to the offset address of the configuration area of that physical volume (lvm config area offset), and acquiring the offset address and the byte length of the data of the configuration area (current lvm config offset and current lvm config size);

Specifically, the data in the lvm configuration area is stored as ASCII characters. The offset address of the data of the current logical volume management configuration area (current lvm config offset) is measured from the start address of the logical volume management configuration area, and its byte length is current lvm config size.

The following is the data of the configuration area in one embodiment of the present invention. It is stored as ASCII and can be opened directly as text. Furthermore, pvs that belong to the same vg hold identical copies of this data.
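The addressing of steps S301 and S302 can be sketched as follows, assuming the three values (lvm config area offset, current lvm config offset, current lvm config size) have already been read from the structures in Tables 1 and 2. The function name `read_config_data` is hypothetical, and the sketch assumes the data lies contiguously inside the configuration area, without wrapping around its end.

```python
def read_config_data(image, area_offset, data_offset, data_size):
    """Return the ASCII configuration data of one physical volume.

    image       -- raw bytes of the disk
    area_offset -- lvm config area offset, measured from the disk start
    data_offset -- current lvm config offset, measured from the start
                   of the configuration area
    data_size   -- current lvm config size, in bytes
    """
    start = area_offset + data_offset
    raw = image[start:start + data_size]
    if len(raw) != data_size:
        raise ValueError("configuration data extends past the image")
    return raw.decode("ascii")
```

The returned text is what the later parsing steps operate on.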

vg0 {
    id = "NINMzM-2lws-EcuP-AW3P-8k7o-p4dy-efCUaZ"
    seqno = 6
    format = "lvm2"
    status = ["RESIZEABLE", "READ", "WRITE"]
    flags = []
    extent_size = 8192
    max_lv = 0
    max_pv = 0
    metadata_copies = 0

    physical_volumes {
        pv0 {
            id = "il2KJ1-3NkI-GJmW-Auo0-YE3W-Cvxa-Yu5c4M"
            device = "/dev/sdb1"
            status = ["ALLOCATABLE"]
            flags = []
            dev_size = 2097153
            pe_start = 2048
            pe_count = 255
        }
        pv1 {
            id = "zosUjH-HQw3-ajsC-L9dg-CnW7-FbuY-ilwcTz"
            device = "/dev/sdb2"
            status = ["ALLOCATABLE"]
            flags = []
            dev_size = 2097153
            pe_start = 2048
            pe_count = 255
        }
        pv2 {
            id = "O9tY9N-a4KT-XNfj-ne7u-3efs-5ebx-sf8gQd"
            device = "/dev/sdc1"
            status = ["ALLOCATABLE"]
            flags = []
            dev_size = 2097153
            pe_start = 2048
            pe_count = 255
        }
        pv3 {
            id = "MT14Rt-pidM-QAgs-6VQ2-3Pnf-fSlA-eyiuCr"
            device = "/dev/sdc2"
            status = ["ALLOCATABLE"]
            flags = []
            dev_size = 2097153
            pe_start = 2048
            pe_count = 255
        }
    }

    logical_volumes {
        lv0 {
            id = "x5HV35-Jrwx-EgLk-OuDx-G0ht-INje-WSJ8If"
            status = ["READ", "WRITE", "VISIBLE"]
            flags = []
            creation_host = "ubuntu"
            creation_time = 1551780866
            segment_count = 1
            segment1 {
                start_extent = 0
                extent_count = 256
                type = "raid5"
                device_count = 3
                region_size = 1024
                stripe_size = 128
                raids = [
                    "lv0_rmeta_0", "lv0_rimage_0",
                    "lv0_rmeta_1", "lv0_rimage_1",
                    "lv0_rmeta_2", "lv0_rimage_2"
                ]
            }
        }
        lv0_rimage_0 {
            id = "VFH1Nm-c211-bF0t-fLyd-8D0J-dU8G-SLQUi7"
            status = ["READ", "WRITE"]
            flags = []
            creation_host = "ubuntu"
            creation_time = 1551780866
            segment_count = 1
            segment1 {
                start_extent = 0
                extent_count = 128
                type = "striped"
                stripe_count = 1
                stripes = [
                    "pv0", 1
                ]
            }
        }
        lv0_rmeta_0 {
            id = "XxVbb1-ycXU-1XZE-NTcc-JmoQ-gB8b-yQApEZ"
            status = ["READ", "WRITE"]
            flags = []
            creation_host = "ubuntu"
            creation_time = 1551780866
            segment_count = 1
            segment1 {
                start_extent = 0
                extent_count = 1
                type = "striped"
                stripe_count = 1
                stripes = [
                    "pv0", 0
                ]
            }
        }
        lv0_rimage_1 {
            id = "0eedim-z9Xa-hEsd-s0ah-Igl2-quzn-yjxRmA"
            status = ["READ", "WRITE"]
            flags = []
            creation_host = "ubuntu"
            creation_time = 1551780866
            segment_count = 1
            segment1 {
                start_extent = 0
                extent_count = 128
                type = "striped"
                stripe_count = 1
                stripes = [
                    "pv1", 1
                ]
            }
        }
        lv0_rmeta_1 {
            id = "1H1I3c-lFtN-QaX1-w77d-shr5-wb0N-aMeC5J"
            status = ["READ", "WRITE"]
            flags = []
            creation_host = "ubuntu"
            creation_time = 1551780866
            segment_count = 1
            segment1 {
                start_extent = 0
                extent_count = 1
                type = "striped"
                stripe_count = 1
                stripes = [
                    "pv1", 0
                ]
            }
        }
        lv0_rimage_2 {
            id = "wOAtat-KFlc-lOZ9-ap0A-4exn-u8lL-MnWcYW"
            status = ["READ", "WRITE"]
            flags = []
            creation_host = "ubuntu"
            creation_time = 1551780866
            segment_count = 1
            segment1 {
                start_extent = 0
                extent_count = 128
                type = "striped"
                stripe_count = 1
                stripes = [
                    "pv2", 1
                ]
            }
        }
        lv0_rmeta_2 {
            id = "yZd2xe-vvnb-6YyX-wSVK-0h6o-Tffv-rxXzRP"
            status = ["READ", "WRITE"]
            flags = []
            creation_host = "ubuntu"
            creation_time = 1551780866
            segment_count = 1
            segment1 {
                start_extent = 0
                extent_count = 1
                type = "striped"
                stripe_count = 1
                stripes = [
                    "pv2", 0
                ]
            }
        }
    }
}

# Generated by LVM2 version 2.02.133(2) (2015-10-30): Tue Mar 5 02:14:26 2019

contents = "Text Format Volume Group"
version = 1
description = ""
creation_host = "ubuntu" # Linux ubuntu 4.4.0-21-generic #37-Ubuntu SMP Mon Apr 18 18:33:37 UTC 2016 x86_64
creation_time = 1551780866 # Tue Mar 5 02:14:26 2019

S303: parsing the data of the configuration area of each volume group, acquiring the basic information of the volume group, parsing and storing each physical volume, and parsing and storing each logical volume, wherein the information of a logical volume comprises the logical volume name, the logical volume type, the logical volume size, the stripe size and the number of disks;

Specifically, taking the above embodiment as an example:

A. Acquiring the basic information of the volume group: from the data of the configuration area of the current vg, the name of the current vg, its UUID and its extent_size are obtained. In the above embodiment the name of the vg is "vg0"; "physical_volumes { }" contains the information of all pvs used by the current volume group (vg), and "logical_volumes { }" contains all lvs created by the user and the lvs used by the raid.

The id (NINMzM-2lws-EcuP-AW3P-8k7o-p4dy-efCUaZ) is the UUID of the vg; extent_size is the size of a pe, here 8192 sectors;

The name, id and extent_size of the current vg are saved.

B. Parsing and storing each physical volume: all pv information is stored in the physical volumes block ("physical_volumes { }"). Taking the data of the configuration area in the above embodiment as an example, it is first determined whether the end of the physical volumes block has been reached, that is, the "}" that closes "physical_volumes {". If so, all pvs have been parsed; otherwise the next pv is parsed and saved. For each pv, the UUID, pe_start and pe_count are saved. In the above embodiment, the pe_start of pv0 is 2048 sectors, its pe_count is 255 physical blocks (pe), and its UUID is il2KJ1-3NkI-GJmW-Auo0-YE3W-Cvxa-Yu5c4M.

C. Parsing and saving each logical volume (i.e., "logical_volumes { }"):

"logical_volumes { }" includes all lvs created by the user and the lvs used by the raid. Taking the data of the configuration area in the above embodiment as an example:

lv0 is the name of the logical volume (lv);

x5HV35-Jrwx-EgLk-OuDx-G0ht-INje-WSJ8If is the UUID of the logical volume (lv);

creation_time is the creation time of the logical volume (lv), in unix timestamp format;

extent_count is the size of the logical volume (lv), in units of physical blocks;

type is the logical volume type of the logical volume (lv), which is raid5 in this embodiment, and device_count is the number of disks used by the logical volume (lv);

stripe_size is the size of a stripe of the raid, in sectors; the raids field (i.e., raids [ ]) lists the logical volumes that make up the raid. A name of the form lv_rmeta_No is the logical volume holding a superblock of the raid, where lv is the name of the logical volume and No is the index, e.g., lv0_rmeta_0; a name of the form lv_rimage_No is the logical volume holding a data block of the raid, where lv is the name of the logical volume and No is the index, e.g., lv0_rimage_0;

in addition, when type is striped, the second value of the stripes field (e.g., 0) is the offset of the current segment within the pv, denoted pv_off, in units of pe, i.e., physical blocks;

Because all lv information is stored in the logical volumes block ("logical_volumes { }"), it is first determined whether the end of the logical volumes block has been reached, that is, the "}" that closes "logical_volumes {". If so, all lvs have been parsed; otherwise the next lv is parsed and stored.
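The block-by-block parsing described in A to C can be sketched with a small recursive parser that turns the ASCII configuration data into nested dictionaries. This is a minimal sketch, not the parser of the invention: it assumes values are quoted strings, integers or flat lists, and the names `tokenize`, `parse_block` and `parse_config` are hypothetical.

```python
import re

# Quoted strings, comments, punctuation, then bare words and numbers.
TOKEN = re.compile(r'"[^"]*"|#[^\n]*|[{}\[\],=]|[^\s{}\[\],=#"]+')


def tokenize(text):
    # Tolerate full-width equals signs that can appear in copied listings.
    text = text.replace("\uff1d", "=")
    return [t for t in TOKEN.findall(text) if not t.startswith("#")]


def scalar(token):
    """Convert a token to str (quoted) or int (bare)."""
    return token[1:-1] if token.startswith('"') else int(token)


def parse_block(tokens, pos):
    """Parse entries until the closing '}' of the current block."""
    block = {}
    while pos < len(tokens) and tokens[pos] != "}":
        name = tokens[pos]
        if tokens[pos + 1] == "{":                 # nested block
            block[name], pos = parse_block(tokens, pos + 2)
            pos += 1                               # skip the closing '}'
        else:                                      # name = value
            pos += 2                               # skip name and '='
            if tokens[pos] == "[":
                items = []
                pos += 1
                while tokens[pos] != "]":
                    if tokens[pos] != ",":
                        items.append(scalar(tokens[pos]))
                    pos += 1
                block[name] = items
            else:
                block[name] = scalar(tokens[pos])
            pos += 1                               # skip value or ']'
    return block, pos


def parse_config(text):
    """Parse configuration-area text into nested dictionaries."""
    tree, _ = parse_block(tokenize(text), 0)
    return tree
```

With the resulting dictionary, the name of the vg is the top-level key that maps to a block, and its "physical_volumes" and "logical_volumes" entries hold one sub-dictionary per pv and per lv.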

Wherein, in the raid of logical volume management (lvm), if the name of lv is as follows:

“lv0_rmeta_0”

“lv0_rimage_0”

the regular expressions are respectively:

“\w_rmeta_\d$”

“\w_rimage_\d$”

These logical volumes (lv) are not created by the user but are allocated by logical volume management (lvm) for use by the raid. When parsing the logical volumes (lv) in the volume group (vg), it is necessary to determine whether the name of each lv matches one of the above regular expressions. If it does, the lv is allocated to the raid within logical volume management (lvm); otherwise it is a logical volume (lv) created by the user. Note that it is the logical volumes (lv) created by the user that need to be recovered.
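The name test can be sketched as follows. The patterns used here are a slightly generalized form of the regular expressions above (anchored at both ends and allowing multi-digit indexes), which is an assumption of this sketch; the function name `classify_lv` and the returned labels are hypothetical.

```python
import re

# Generalized forms of "\w_rmeta_\d$" and "\w_rimage_\d$".
RMETA = re.compile(r"^\w+_rmeta_\d+$")
RIMAGE = re.compile(r"^\w+_rimage_\d+$")


def classify_lv(name):
    """Classify a logical volume by its name.

    Returns "raid_superblock" or "raid_data" for the sub-volumes that
    lvm allocates to the raid, and "user" for anything else.
    """
    if RMETA.match(name):
        return "raid_superblock"
    if RIMAGE.match(name):
        return "raid_data"
    return "user"
```

Only the logical volumes classified as "user" are candidates for recovery; the other two kinds are read as building blocks of the raid.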

Wherein "\w_rmeta_\d$" matches a superblock of the raid of logical volume management (lvm);

"\w_rimage_\d$" matches a data block of the raid of logical volume management (lvm);

the super block and the data block are stored consecutively in the same pv, and the data of the configuration area in the above embodiment can be summarized as the following data allocation table:

table of data allocation of lv0 in pv

D. If type = "striped", the logical volume name, the segment type and the segments are saved.

b. If type = "raidXXX", where XXX represents a combination of digits, letters and/or underscores, the extent_count, stripe_size, device_count and segment type of the logical volume are saved, together with the names, taken from the raids list, of the logical volumes (lv) used by the raid data blocks of logical volume management (lvm). Taking the data of the configuration area in the above embodiment as an example:

vg0 contains one logical volume (lv), named "lv0", with a size of (511+1) physical blocks and of type raid5. It occupies three pvs, as shown in the data allocation table, and uses 6 logical volumes (lv), namely:

"lv0_rmeta_0","lv0_rimage_0",

"lv0_rmeta_1","lv0_rimage_1",

"lv0_rmeta_2","lv0_rimage_2"

Fig. 3 shows a specific flowchart of recovering raid data managed by a logical volume in the present invention, and as shown in fig. 3, step S400 includes the following steps:

S401: acquiring the current logical volume and judging whether it is used by raid; if so, ending the process, otherwise executing step S402. Specifically, it is judged whether the name of the logical volume (lv) matches the regular expression "\w_rmeta_\d$" or "\w_rimage_\d$". If it does, the current logical volume (lv) is used by raid and the flow ends directly without processing; otherwise, go to step S402;

S402: acquiring the type of the current logical volume and judging whether it is raid; if so, executing step S403, otherwise ending the process. Specifically, it is judged whether the type is of the form raidXXX; if so, step S403 is executed, otherwise the flow ends;

Table 3: raid reference table

Specifically, according to the segment type and the reference table, the raid level, the raid rotation direction, the raid organization mode, the raid stripe size and the number of disks (device_count) are obtained, as follows:

raid level: the number following "raid" in the type.

raid rotation direction: for example, "left-hand sync" in the table above; if no rotation direction is listed, this parameter does not exist for that raid level.

raid organization mode: for example, "rotating parity N, data continue" in the table above; if no organization mode is listed, this parameter does not exist for that raid level.

raid stripe size: i.e., stripe_size.
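Collecting the parameters that come directly from the segment can be sketched as follows. The rotation direction and organization mode come from the raid reference table, which is not reproduced in this text, so they are deliberately left out. The function names are hypothetical, and the segment is assumed to be a dictionary with the keys used in the configuration data ("type", "stripe_size", "device_count").

```python
import re


def raid_level(segment_type):
    """Return the number following "raid" in the type, or None."""
    match = re.match(r"raid(\d+)", segment_type)
    return int(match.group(1)) if match else None


def raid_parameters(segment):
    """Collect level, stripe size and disk count of a raid segment.

    Returns None for segments whose type is not of the form raidXXX.
    """
    level = raid_level(segment["type"])
    if level is None:
        return None
    return {
        "level": level,
        "stripe_size": segment["stripe_size"],    # in sectors
        "device_count": segment["device_count"],
    }
```

For the lv0 segment of the embodiment this yields level 5, a stripe size of 128 sectors and 3 disks.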

S404: analyzing and recovering disk cluster data of a raid data block managed by a logical volume, comprising the following steps:

S4042: parsing the current segment, comprising the following steps:

S405: acquiring the raids field in the data of the configuration area, taking the order of the disks used by the raids field as the order of the raid disks, and recovering the raid data managed by the logical volume according to the raid type, the stripe size, the number of disks, the raid rotation direction and the raid organization mode obtained in step S403. Specifically, a metadata-based raid data recovery method is used to recover the raid data managed by the logical volume; please refer to the invention application entitled "a raid data recovery method based on metadata", application No. 2019108135847, filed on August 30, 2019.
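Taking the disk order from the raids field can be sketched as follows. The sketch assumes that the data sub-volumes are the entries whose names contain "_rimage_", and that a mapping from sub-volume name to pv name has already been built from the stripes field of each sub-volume. The function name `raid_disk_order` and the mapping `lv_to_pv` are hypothetical.

```python
def raid_disk_order(raids, lv_to_pv):
    """Return the pvs of the raid data sub-volumes in raids-field order.

    raids    -- list of sub-volume names as stored in the raids field
    lv_to_pv -- maps each data sub-volume name to the pv it lives on
    """
    return [lv_to_pv[name] for name in raids if "_rimage_" in name]


raids = [
    "lv0_rmeta_0", "lv0_rimage_0",
    "lv0_rmeta_1", "lv0_rimage_1",
    "lv0_rmeta_2", "lv0_rimage_2",
]
lv_to_pv = {
    "lv0_rimage_0": "pv0",
    "lv0_rimage_1": "pv1",
    "lv0_rimage_2": "pv2",
}
order = raid_disk_order(raids, lv_to_pv)
```

The resulting order is then handed, together with the raid parameters, to the metadata-based recovery method referenced above.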

The method solves the technical problem that no method for recovering raid data based on configuration information under logical volume management exists in the prior art.

It is to be understood that the invention is not limited to the examples described above, but that modifications and variations are possible to those skilled in the art in light of the above teachings, and that all such modifications and variations are intended to be included within the scope of the invention as defined in the appended claims.

Claims

1. A method for recovering raid data based on configuration information under logical volume management is characterized by comprising the following steps:

2. The method for recovering raid data based on configuration information under logical volume management according to claim 1, wherein said step S200 includes the steps of:

S201: setting the number of addressed sectors to an initial value of 0;

S205: recording the UUID of the current physical volume.

3. The method for recovering raid data based on configuration information under logical volume management according to claim 2, wherein the structure of the physical volume has the data structure shown in Table 1 below:

Table 1: data structure of the physical volume

type indicator: name and version information of logical volume management;

physical volume size: the byte length of the physical volume, in bytes;

4. The method for recovering raid data based on configuration information under logical volume management according to claim 1, wherein the description information of the configuration area has a data structure as shown in the following table two:

Table 2: data structure of the description information of the configuration area

version information

5. The method for recovering raid data based on configuration information under logical volume management according to claim 1, wherein said step S400 comprises the steps of:

Table 3: raid reference table

6. The method for recovering raid data based on configuration information under logical volume management according to claim 5, wherein the step S404 comprises the steps of:

S4042: parsing the current segment, comprising the following steps:

7. The method for recovering raid data based on configuration information under logical volume management according to claim 5, wherein in said step S405, a raid data recovery method based on metadata is adopted to recover raid data of logical volume management.