CN109885551B

CN109885551B - Electronic device, metadata processing method, and computer-readable storage medium

Info

Publication number: CN109885551B
Application number: CN201910007216.3A
Authority: CN
Inventors: 宋小兵; 姜文峰
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2019-01-04
Filing date: 2019-01-04
Publication date: 2024-03-12
Anticipated expiration: 2039-01-04
Also published as: WO2020140623A1; CN109885551A

Abstract

The invention relates to metadata management technology and discloses an electronic device, a metadata processing method, and a computer-readable storage medium. The method obtains performance data of clone volumes to be processed and, based on the performance data of each clone volume to be processed, queries for the clone volumes that meet a preset merging condition. When such clone volumes are found, they are all taken as volumes to be merged, and the metadata and the data to be merged of each volume to be merged are obtained. A merging operation is then performed on the metadata and the data to be merged of each volume to be merged according to a preset merging rule, and the merged metadata is stored in a preset storage space. Compared with the prior art, the invention improves the data query speed of clone volumes.

Description

Electronic device, metadata processing method, and computer-readable storage medium

Technical Field

The present invention relates to the field of distributed storage technologies, and in particular, to an electronic device, a metadata processing method, and a computer readable storage medium.

Background

The CEPH distributed file system is a distributed storage system with large capacity, high performance, and high reliability. To protect against data loss, CEPH provides a snapshot function. A snapshot is a fully usable copy of a given data set that contains an image of the corresponding data at a certain point in time (i.e., the point in time at which the copy began). A snapshot can be either a duplicate or a replica of the data it represents. When a CEPH storage device fails, the snapshot function can be used to promptly restore the data to its state at the time the snapshot was generated. A snapshot, however, is read-only and cannot be written. For this reason, CEPH also provides a clone function: a clone volume formed by performing a clone operation on a snapshot can be both read and written.

The snapshot and clone operation flow of CEPH is as follows:

An image A in CEPH (an image is a volume in a CEPH cluster and is the external representation of a CEPH block device resource) includes a plurality of data slices, including data slice 1 and data slice 2. A snapshot operation is performed on the original volume of image A (i.e., the base image, a volume on which no snapshot operation has yet been performed) to create snapshot volume 1 (i.e., snap 1). If the data of data slice 1 and data slice 2 is then to be updated, the data must first be read from data slices 1 and 2 of the original volume and copied into snapshot volume 1, and the storage location information of data slices 1 and 2 is saved in the metadata of snapshot volume 1. Data slices 1 and 2 in the original volume are then updated to obtain new data slices 1-1 and 2-1. Next, a second snapshot operation is performed on image A to create snapshot volume 2 (i.e., snap 2). When the data of data slice 1-1 is subsequently updated, the data is read from data slice 1-1 of image A and copied into snapshot volume 2, the storage location information of data slice 1-1 is saved in the metadata of snapshot volume 2, and data slice 1-1 in image A is updated to obtain data slice 1-2. After multiple snapshot operations have been performed on image A, a snapshot chain corresponding to image A (i.e., snapshot volumes arranged in chronological order) is formed.

Cloning (i.e., clone) is performed on a snapshot volume in the snapshot chain to form a clone volume.
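The copy-on-write snapshot flow described above can be illustrated with a minimal Python sketch. It models only the flow as described in this background section, not the actual CEPH implementation. The class names `Image` and `Snapshot`, the `preserved` mapping, and the slice contents are hypothetical, and each preserved entry stands in for both the copied data and its storage location information.

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    name: str
    # slice id -> content preserved before the slice was overwritten
    preserved: dict = field(default_factory=dict)


@dataclass
class Image:
    slices: dict                                    # slice id -> current content
    snapshots: list = field(default_factory=list)   # chronological snapshot chain

    def snapshot(self, name):
        """Create a new, initially empty snapshot at the end of the chain."""
        snap = Snapshot(name)
        self.snapshots.append(snap)
        return snap

    def write(self, slice_id, new_content):
        """Copy the old content into the latest snapshot, then update the slice."""
        if self.snapshots and slice_id in self.slices:
            latest = self.snapshots[-1]
            # Preserve only the first pre-update version seen by this snapshot.
            latest.preserved.setdefault(slice_id, self.slices[slice_id])
        self.slices[slice_id] = new_content


image_a = Image(slices={1: "slice 1", 2: "slice 2"})
snap1 = image_a.snapshot("snap1")
image_a.write(1, "slice 1-1")
image_a.write(2, "slice 2-1")
snap2 = image_a.snapshot("snap2")
image_a.write(1, "slice 1-2")
```

In this toy run, snap 1 ends up holding the original slices 1 and 2, snap 2 holds slice 1-1, and image A holds slices 1-2 and 2-1, which mirrors the sequence in the text.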

To look up a data slice in a clone volume, the storage location information corresponding to that data slice is first searched for in the metadata of the clone volume. Because the metadata of the clone volume stores only the storage location information of data slices newly written after the clone volume was generated, the storage location information of all other data slices must be looked up in the metadata of each snapshot volume in the snapshot chain corresponding to the clone volume and in the metadata of image A. Because the metadata of each snapshot volume in the snapshot chain and the metadata of image A are scattered across multiple storage nodes of the distributed storage system, a single query may need to interact with multiple storage nodes, so the data query speed of the clone volume is low.
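The cost of this lookup can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not CEPH code: metadata is modelled as plain dictionaries mapping a slice identifier to a location string, the function name `locate_slice` is hypothetical, and each dictionary in `chain_metas` is assumed to live on a different storage node, so every dictionary consulted counts as one remote interaction.

```python
def locate_slice(slice_id, clone_meta, chain_metas):
    """Return (location, remote_lookups) for one data slice.

    clone_meta  : metadata of the clone volume (slices written after cloning)
    chain_metas : metadata of the snapshot volumes and of the image, in the
                  order they are searched (the order is an assumption here)
    """
    if slice_id in clone_meta:
        return clone_meta[slice_id], 0       # answered locally
    remote_lookups = 0
    for meta in chain_metas:
        remote_lookups += 1                  # one interaction per metadata store
        if slice_id in meta:
            return meta[slice_id], remote_lookups
    raise KeyError(slice_id)


clone_meta = {3: "node-c/obj-3"}
chain_metas = [
    {1: "node-b/obj-1-1"},                       # a snapshot volume
    {1: "node-a/obj-1", 2: "node-a/obj-2"},      # the image
]
local_hit = locate_slice(3, clone_meta, chain_metas)
chain_hit = locate_slice(2, clone_meta, chain_metas)
```

Here slice 3 is resolved from the clone's own metadata with no remote lookup, while slice 2 requires two, which is the overhead the invention sets out to remove.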

It can be seen that how to increase the data query speed of clone volumes is a challenge.

Disclosure of Invention

The main object of the present invention is to provide an electronic device, a metadata processing method and a computer-readable storage medium, aiming at improving the data query speed of clone volumes.

In order to achieve the above object, the present invention provides an electronic device including a memory and a processor, wherein a metadata processing program is stored in the memory, and the metadata processing program when executed by the processor implements the following steps:

a first acquisition step: acquiring performance data of the clone volumes to be processed in real time, at regular intervals, or upon receiving a merging instruction;

a querying step: querying, according to the performance data of each clone volume to be processed, for the clone volumes to be processed that meet a preset merging condition;

a second acquisition step: when such clone volumes are found, taking all the clone volumes to be processed that meet the preset merging condition as volumes to be merged, and acquiring the metadata and the data to be merged of each volume to be merged;

a first merging step: performing a merging operation on the metadata and the data to be merged of each volume to be merged according to a preset merging rule, and storing the merged metadata in a preset storage space.

Preferably, the performance data of the clone volume to be processed comprises loading time, read-write operation times per second, read-write operation delay and expected accessed frequency of the clone volume to be processed.

Preferably, the querying step includes:

obtaining calculation parameters of each clone volume to be processed, wherein the calculation parameters comprise a loading time threshold, a loading time adjustment coefficient, a read-write operation frequency threshold per second, a read-write operation frequency adjustment coefficient per second, a read-write operation delay threshold and a read-write operation delay adjustment coefficient;

substituting the performance data and the calculation parameters of each clone volume to be processed into a preset formula to calculate to obtain the scoring value of each clone volume to be processed, wherein the preset formula comprises:

S = [(A - A') × w1 + (B' - B) × w2 + (C - C') × w3] × F

wherein S represents the scoring value, A represents the loading time, A' represents the loading time threshold, w1 represents the loading time adjustment coefficient, B represents the number of read-write operations per second, B' represents the threshold for the number of read-write operations per second, w2 represents the adjustment coefficient for the number of read-write operations per second, C represents the read-write operation delay, C' represents the read-write operation delay threshold, w3 represents the read-write operation delay adjustment coefficient, and F represents the expected accessed frequency;

querying for all the clone volumes to be processed whose scoring values meet a preset scoring condition, and, when such clone volumes are found, determining that they meet the preset merging condition.

Preferably, the preset merging rule includes:

a calculation step: calculating the merged data increment and the merged data increment ratio of each volume to be merged according to the metadata and the data to be merged of each volume to be merged;

a screening step: selecting, by screening, the volumes to be merged whose merged data increment ratio is smaller than a preset ratio;

a sorting step: sorting the volumes to be merged obtained through screening to obtain a queue of volumes to be merged;

a selection step: selecting the volumes to be merged one by one in the order of the queue of volumes to be merged, acquiring the remaining capacity of the preset storage space after each selection, and calculating the difference between the remaining capacity and the merged data increment of the selected volume to be merged;

a second merging step: when the difference is greater than or equal to a preset threshold, merging the metadata of the selected volume to be merged with its data to be merged, and taking the difference as the new remaining capacity of the preset storage space;

a judging step: judging whether any unselected volume to be merged remains in the queue of volumes to be merged, returning to the selection step if so, and ending the flow if not.

Preferably, the sorting step includes:

selecting a parameter from the performance data as a first ranking index;

calculating a second sorting index for each screened volume to be merged according to its first sorting index and a preset index threshold of the first sorting index;

and sorting all the screened volumes to be merged by the magnitude of their second sorting indexes to obtain the queue of volumes to be merged.
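The sorting step can be sketched in a few lines of Python. This is only one possible reading: it assumes the loading time is chosen as the first sorting index, that the second sorting index is the loading time minus its threshold, and that the queue is ordered from the largest second index to the smallest. The function name `build_merge_queue` and the sample values are hypothetical.

```python
def build_merge_queue(load_times, load_time_threshold):
    """Order volume names by (load time - threshold), largest difference first."""
    # Second sorting index: how far each volume's first index exceeds the threshold.
    second_index = {
        name: load_time - load_time_threshold
        for name, load_time in load_times.items()
    }
    return sorted(second_index, key=second_index.get, reverse=True)


queue = build_merge_queue({"vol-a": 12, "vol-b": 30, "vol-c": 9}, load_time_threshold=10)
```

With these sample values the slowest-loading volume, `vol-b`, is placed at the head of the queue.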

In addition, to achieve the above object, the present invention also provides a metadata processing method, which includes the steps of:

Preferably, the performance data of the clone volume to be processed includes loading time, number of read/write operations per second, read/write operation delay, expected accessed frequency of the clone volume to be processed, and the querying step includes:

S = [(A - A') × w1 + (B' - B) × w2 + (C - C') × w3] × F

Preferably, the preset combining rule includes:

Preferably, the sorting step includes:

selecting a parameter from the performance data as a first ranking index;

Furthermore, to achieve the above object, the present invention also proposes a computer-readable storage medium storing a metadata processing program executable by at least one processor to cause the at least one processor to perform the steps of the metadata processing method as set forth in any one of the above.

The invention acquires performance data of the clone volumes to be processed in real time, at regular intervals, or upon receiving a merging instruction; queries, according to the performance data of each clone volume to be processed, for the clone volumes that meet a preset merging condition; when such clone volumes are found, takes all of them as volumes to be merged and acquires the metadata and the data to be merged of each volume to be merged; and performs a merging operation on the metadata and the data to be merged of each volume to be merged according to a preset merging rule, storing the merged metadata in a preset storage space. Compared with the prior art, the invention merges the metadata of the clone volumes to be processed that meet the preset merging condition and stores the merged metadata in the preset storage space. When a data slice is queried in a clone volume whose metadata has been merged, the storage location information of the data slice only needs to be looked up in the merged metadata corresponding to that clone volume, without interacting with multiple storage nodes, so the data query speed of the clone volume is improved.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to the structures shown in these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic diagram of the operating environment of an embodiment of the metadata processing program according to the present invention;

FIG. 2 is a block diagram illustrating an embodiment of a metadata processing program according to the present invention;

FIG. 3 is a flowchart illustrating an embodiment of a metadata processing method according to the present invention.

The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.

Detailed Description

The principles and features of the present invention are described below with reference to the drawings, the examples are illustrated for the purpose of illustrating the invention and are not to be construed as limiting the scope of the invention.

The invention provides a metadata processing program.

Referring to FIG. 1, a diagram of an operating environment of a metadata processing program 10 according to an embodiment of the present invention is shown.

In the present embodiment, the metadata processing program 10 is installed and run in the electronic apparatus 1. The electronic device 1 may be a computing device such as a desktop computer, a notebook computer, a palm computer, a server, or the like. The electronic device 1 may include, but is not limited to, a memory 11 and a processor 12 in communication with each other via a program bus. Fig. 1 shows only an electronic device 1 with components 11, 12, but it is understood that not all shown components are required to be implemented, and that more or fewer components may alternatively be implemented.

The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, such as a hard disk or a memory of the electronic device 1. The memory 11 may in other embodiments also be an external storage device of the electronic apparatus 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic apparatus 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic apparatus 1. The memory 11 is used for storing application software installed in the electronic device 1 and various data, such as program codes of the metadata processing program 10. The memory 11 may also be used to temporarily store data that has been output or is to be output.

The processor 12 may in some embodiments be a central processing unit (Central Processing Unit, CPU), microprocessor or other data processing chip for executing program code or processing data stored in the memory 11, such as executing the metadata processing program 10 or the like.

Referring to FIG. 2, a block diagram of a metadata processing program 10 according to an embodiment of the present invention is shown. In this embodiment, the metadata processing program 10 may be divided into one or more modules, and the one or more modules are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to carry out the present invention. For example, in FIG. 2, the metadata processing program 10 may be divided into a first acquisition module 101, a query module 102, a second acquisition module 103, and a merging module 104. A module, as referred to in the present invention, is a series of computer program instruction segments capable of performing a specific function, and is better suited than the program itself for describing the execution of the metadata processing program 10 in the electronic device 1, wherein:

The first acquisition module 101 is configured to acquire performance data of the clone volumes to be processed in real time, at regular intervals, or upon receiving a merging instruction.

The performance data of a clone volume to be processed includes its loading time, number of read-write operations per second (Input/Output Operations Per Second, IOPS), read-write operation delay, and expected accessed frequency. The expected accessed frequency can be set by the user in advance; for example, the number of times the clone volume to be processed will be accessed is predicted in advance, and the predicted value is taken as the expected accessed frequency.

The query module 102 is configured to query, according to the performance data of each clone volume to be processed, for the clone volumes to be processed that meet the preset merging condition.

First, the calculation parameters of each clone volume to be processed are acquired. The calculation parameters of different clone volumes to be processed can be the same or different.

The calculation parameters comprise a loading time threshold, a loading time adjustment coefficient, a read-write operation frequency threshold per second, a read-write operation frequency adjustment coefficient per second, a read-write operation delay threshold and a read-write operation delay adjustment coefficient.

The performance data and the calculation parameters of each clone volume to be processed are then substituted into a preset formula to calculate the scoring value of each clone volume to be processed.

The preset formula may be set as required; for example, the preset formula includes:

S = [(A - A') × w1 + (B' - B) × w2 + (C - C') × w3] × F

wherein S represents the scoring value, A represents the loading time, A' represents the loading time threshold, w1 represents the loading time adjustment coefficient, B represents the number of read-write operations per second, B' represents the threshold for the number of read-write operations per second, w2 represents the adjustment coefficient for the number of read-write operations per second, C represents the read-write operation delay, C' represents the read-write operation delay threshold, w3 represents the read-write operation delay adjustment coefficient, and F represents the expected accessed frequency.

Finally, all the clone volumes to be processed whose scoring values meet a preset scoring condition are queried for, and, when such clone volumes are found, they are determined to meet the preset merging condition.
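As a worked illustration of the scoring formula, the following Python sketch computes S for two sample volumes and selects those that qualify. The class and field names are hypothetical, the numeric values are invented for the example, and the preset scoring condition, which the text does not specify, is assumed here to be "score greater than or equal to a minimum score".

```python
from dataclasses import dataclass


@dataclass
class Performance:
    load_time: float              # A
    iops: float                   # B
    latency: float                # C
    expected_access_freq: float   # F


@dataclass
class Parameters:
    load_time_threshold: float    # A'
    w1: float
    iops_threshold: float         # B'
    w2: float
    latency_threshold: float      # C'
    w3: float


def score(p, q):
    """S = [(A - A') * w1 + (B' - B) * w2 + (C - C') * w3] * F"""
    return (
        (p.load_time - q.load_time_threshold) * q.w1
        + (q.iops_threshold - p.iops) * q.w2
        + (p.latency - q.latency_threshold) * q.w3
    ) * p.expected_access_freq


def volumes_meeting_condition(perf, params, min_score):
    """Assumed scoring condition: keep volumes whose score is >= min_score."""
    return [name for name in perf if score(perf[name], params[name]) >= min_score]


shared = Parameters(10.0, 1.0, 1000.0, 0.5, 20.0, 2.0)
perf = {
    "vol-a": Performance(load_time=12.0, iops=800.0, latency=30.0, expected_access_freq=2.0),
    "vol-b": Performance(load_time=5.0, iops=1200.0, latency=10.0, expected_access_freq=3.0),
}
params = {"vol-a": shared, "vol-b": shared}
selected = volumes_meeting_condition(perf, params, min_score=0.0)
```

For `vol-a` the bracketed sum is 2 + 100 + 20 = 122, so S = 244; for `vol-b` every term is negative and S = -375. A volume that loads slowly, achieves few operations per second, and has high delay therefore scores higher, and the score is scaled by how often the volume is expected to be accessed.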

The second acquisition module 103 is configured to, when such clone volumes are found, take all the clone volumes to be processed that meet the preset merging condition as volumes to be merged, and acquire the metadata and the data to be merged of each volume to be merged.

The step of acquiring the data to be merged of each volume to be merged includes:

First, the data slice identification information of all data slices contained in a volume to be merged is acquired, for example from the data slice identification information list of the volume to be merged.

Then, for each piece of data slice identification information, it is checked whether it exists in the metadata of the volume to be merged; the data slice identification information that is not stored in the metadata of the volume to be merged is taken as identification information to be processed.

Next, once all the identification information to be processed has been determined, the storage location information corresponding to each piece of identification information to be processed is looked up in the metadata of each snapshot volume in the snapshot chain on which the volume to be merged depends and in the metadata of the original volume corresponding to the volume to be merged.

Finally, the storage location information found for all the identification information to be processed of the volume to be merged is taken as the data to be merged of that volume.
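The four steps above can be sketched in Python as follows. This is a simplified illustration, not the patented implementation: metadata is modelled as dictionaries from a data slice identifier to a location string, the function name `collect_data_to_merge` is hypothetical, and the order in which the snapshot and original-volume metadata are searched is an assumption, since the text does not fix it.

```python
def collect_data_to_merge(slice_ids, volume_meta, chain_metas):
    """Return {slice id: storage location} for slices missing from volume_meta.

    slice_ids   : identifiers of all data slices in the volume to be merged
    volume_meta : metadata of the volume to be merged
    chain_metas : metadata of the snapshot volumes it depends on, followed by
                  the metadata of the original volume (assumed search order)
    """
    # Identifiers not present in the volume's own metadata are "to be processed".
    pending = [sid for sid in slice_ids if sid not in volume_meta]

    data_to_merge = {}
    for sid in pending:
        for meta in chain_metas:
            if sid in meta:
                data_to_merge[sid] = meta[sid]
                break   # stop at the first metadata store that knows this slice
    return data_to_merge


volume_meta = {3: "loc-clone-3"}
chain_metas = [
    {1: "loc-snap2-1"},                       # a snapshot volume
    {1: "loc-base-1", 2: "loc-base-2"},       # the original volume
]
data_to_merge = collect_data_to_merge([1, 2, 3], volume_meta, chain_metas)
```

Slice 3 is already covered by the volume's own metadata, so only the locations of slices 1 and 2 are gathered as data to be merged.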

The merging module 104 is configured to perform a merging operation on the metadata and the data to be merged of each volume to be merged according to a preset merging rule, and to store the merged metadata in a preset storage space.

The preset merging rule includes steps S41 to S46 (not shown in the figure), wherein:

Step S41: calculating the merged data increment and the merged data increment ratio of each volume to be merged according to the metadata and the data to be merged of each volume to be merged.

The data amount of the data to be merged and the data amount of the metadata of each volume to be merged are counted separately. The merged data increment of a volume to be merged is the data amount of its data to be merged, and the merged data increment ratio is the ratio of the merged data increment to the data amount of its metadata.

Step S42: selecting, by screening, the volumes to be merged whose merged data increment ratio is smaller than a preset ratio.

Step S43: sorting the volumes to be merged obtained through screening to obtain a queue of volumes to be merged.

Wherein, step S43 includes:

A parameter is selected from the performance data as a first sorting index; for example, the loading time is selected as the first sorting index. A second sorting index is then calculated for each screened volume to be merged according to its first sorting index and a preset index threshold of the first sorting index. For example, the time difference between the loading time of the volume to be merged and the loading time threshold is calculated and used as the second sorting index of the volume to be merged. Finally, all the screened volumes to be merged are sorted by the magnitude of their second sorting indexes (for example, in descending order of the difference between the loading time and the loading time threshold, so that slow-loading volumes to be merged are merged first) to obtain the queue of volumes to be merged.

Step S44: selecting the volumes to be merged one by one in the order of the queue of volumes to be merged, acquiring the remaining capacity of the preset storage space after each selection, and calculating the difference between the remaining capacity and the merged data increment of the selected volume to be merged.

Step S45: when the difference is greater than or equal to a preset threshold, merging the metadata of the selected volume to be merged with its data to be merged, and taking the difference as the new remaining capacity of the preset storage space.

Step S46: determining whether any unselected volume to be merged remains in the queue of volumes to be merged; if so, returning to step S44, and if not, ending the flow.
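Steps S41 to S46 can be put together in a short Python sketch. Several simplifications are assumed and are not taken from the patent: data amounts are measured as the number of dictionary entries, a volume with empty metadata is given an infinite ratio so that it is screened away, the queue is sorted in descending order of a precomputed second sorting index, and the names `apply_merge_rule`, `max_ratio`, and `min_difference` are hypothetical.

```python
def apply_merge_rule(volumes, remaining, max_ratio, min_difference, second_index):
    """volumes maps a name to (metadata dict, data-to-merge dict).

    Returns (merged metadata per volume, remaining capacity afterwards).
    """
    # S41: merged data increment and its ratio to the metadata size.
    increment = {name: len(extra) for name, (meta, extra) in volumes.items()}
    ratio = {
        name: (increment[name] / len(meta)) if meta else float("inf")
        for name, (meta, extra) in volumes.items()
    }
    # S42: keep only volumes whose ratio is below the preset ratio.
    screened = [name for name in volumes if ratio[name] < max_ratio]
    # S43: build the queue, largest second sorting index first.
    queue = sorted(screened, key=lambda name: second_index[name], reverse=True)

    merged = {}
    for name in queue:                              # S44: select one by one
        difference = remaining - increment[name]
        if difference >= min_difference:            # S45: enough room, merge
            meta, extra = volumes[name]
            merged[name] = {**extra, **meta}        # the volume's own entries win
            remaining = difference
        # S46: the loop ends once no unselected volume remains in the queue
    return merged, remaining


volumes = {
    "vol-a": ({1: "x", 2: "y"}, {3: "z"}),                       # ratio 0.5
    "vol-b": ({1: "x"}, {2: "p", 3: "q", 4: "r"}),               # ratio 3.0
    "vol-c": ({1: "x", 2: "y", 3: "z", 4: "w"}, {5: "u", 6: "v"}),  # ratio 0.5
}
merged, left = apply_merge_rule(
    volumes, remaining=2, max_ratio=1.0, min_difference=0,
    second_index={"vol-a": 2, "vol-b": 5, "vol-c": 20},
)
```

In this run `vol-b` is screened away by its ratio, `vol-c` is merged first and uses up the remaining capacity, and `vol-a` is skipped because the difference would fall below the threshold.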

In this embodiment, performance data of the clone volumes to be processed is obtained; the clone volumes to be processed that meet a preset merging condition are searched for according to the performance data of each clone volume to be processed; when such clone volumes are found, all of them are taken as volumes to be merged, the metadata and the data to be merged of each volume to be merged are acquired and merged, and the merged metadata is stored in a preset storage space. Compared with the prior art, this embodiment merges the metadata of the clone volumes to be processed that meet the preset merging condition and stores the merged metadata in the preset storage space. When a data slice is queried in a clone volume whose metadata has been merged, the storage location information of the data slice only needs to be looked up in the merged metadata corresponding to that clone volume, without interacting with multiple storage nodes, so the data query speed of the clone volume is improved.

Further, in the present embodiment, the metadata processing program 10 further includes a monitoring module (not shown in the figure) for:

monitoring the residual capacity of the preset storage space in real time or at fixed time, and sending out prompt information when the residual capacity of the preset storage space is smaller than or equal to a preset capacity threshold value.

In this embodiment, the remaining capacity of the preset storage space is monitored in real time or at regular time, so that the failure of the merging operation caused by insufficient remaining capacity is effectively prevented.
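A timed version of this monitoring can be sketched as follows. The function names, the polling interval, and the callback-based design are assumptions made for illustration; `read_remaining` and `notify` are hypothetical callables supplied by the caller.

```python
import time


def capacity_alert(remaining, threshold):
    """Return a prompt message when remaining capacity is at or below the threshold."""
    if remaining <= threshold:
        return f"remaining capacity {remaining} <= threshold {threshold}"
    return None


def monitor(read_remaining, threshold, notify, interval_s=60.0, rounds=None):
    """Poll the remaining capacity; rounds=None keeps polling indefinitely."""
    done = 0
    while rounds is None or done < rounds:
        message = capacity_alert(read_remaining(), threshold)
        if message is not None:
            notify(message)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_s)   # wait before the next timed check


readings = iter([100, 40, 10])
alerts = []
monitor(lambda: next(readings), threshold=40, notify=alerts.append,
        interval_s=0, rounds=3)
```

The sample run polls three readings and raises a prompt for the two that are at or below the threshold of 40.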

In addition, the invention provides a metadata processing method.

Fig. 3 is a flowchart of a metadata processing method according to an embodiment of the invention.

In this embodiment, the method includes:

Step S10: acquiring performance data of the clone volumes to be processed in real time, at regular intervals, or upon receiving a merging instruction.

Step S20: querying, according to the performance data of each clone volume to be processed, for the clone volumes to be processed that meet a preset merging condition.

S = [(A - A') × w1 + (B' - B) × w2 + (C - C') × w3] × F

Step S30: when such clone volumes are found, taking all the clone volumes to be processed that meet the preset merging condition as volumes to be merged, and acquiring the metadata and the data to be merged of each volume to be merged.

Step S40: performing a merging operation on the metadata and the data to be merged of each volume to be merged according to a preset merging rule, and storing the merged metadata in a preset storage space.

Wherein, step S43 includes:

Further, in this embodiment, the method further includes:

Further, the present invention also proposes a computer-readable storage medium storing a metadata processing program executable by at least one processor to cause the at least one processor to execute the metadata processing method in any of the above embodiments.

The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the invention, and all equivalent structural changes made by the description of the present invention and the accompanying drawings or direct/indirect application in other related technical fields are included in the scope of the invention.

Claims

1. An electronic device comprising a memory and a processor, wherein the memory has a metadata processing program stored thereon, which when executed by the processor performs the steps of:

a first merging step: according to a preset merging rule, merging operation is respectively carried out on the metadata of each volume to be merged and the data to be merged, and the merged metadata is stored in a preset storage space;

wherein, the inquiring step comprises the following steps: obtaining calculation parameters of each clone volume to be processed, wherein the calculation parameters comprise a loading time threshold, a loading time adjustment coefficient, a read-write operation frequency threshold per second, a read-write operation frequency adjustment coefficient per second, a read-write operation delay threshold and a read-write operation delay adjustment coefficient; substituting the performance data and the calculation parameters of each clone volume to be processed into a preset formula to calculate so as to obtain the grading value of each clone volume to be processed; inquiring all the clone volumes to be processed, the scoring values of which meet preset scoring conditions, and determining that the inquired clone volumes to be processed meet preset merging conditions when inquiring;

the preset formula comprises:

the preset merging rule comprises the following steps: a calculation step: calculating the merged data increment and the merged data increment ratio of each volume to be merged according to the metadata and the data to be merged of each volume to be merged, wherein the merged data increment of each volume to be merged is the data amount of the data to be merged of the volume to be merged, and the merged data increment ratio of each volume to be merged is the ratio between the merged data increment of the volume to be merged and the data amount of the metadata; a screening step: selecting, by screening, the volumes to be merged whose merged data increment ratio is smaller than a preset ratio; a sorting step: sorting the volumes to be merged obtained through screening to obtain a queue of volumes to be merged; a selection step: selecting the volumes to be merged one by one in the order of the queue of volumes to be merged, acquiring the remaining capacity of the preset storage space after each selection, and calculating the difference between the remaining capacity and the merged data increment of the selected volume to be merged; a second merging step: when the difference is greater than or equal to a preset threshold, merging the metadata of the selected volume to be merged with the data to be merged of the selected volume to be merged, and taking the difference as the new remaining capacity of the preset storage space; a judging step: judging whether any unselected volume to be merged remains in the queue of volumes to be merged, returning to the selection step if so, and ending the flow if not.

2. The electronic device of claim 1, wherein the performance data of the clone volume to be processed includes a loading time of the clone volume to be processed, a number of read/write operations per second, a read/write operation latency, and an expected frequency of access.

3. The electronic device of claim 1, wherein the sorting step comprises:

selecting a parameter from the performance data as a first ranking index;

4. A method of metadata processing, the method comprising the steps of:

the preset formula comprises:

5. The method of metadata processing according to claim 4, wherein the performance data of the clone volume to be processed includes a loading time of the clone volume to be processed, a number of read/write operations per second, a read/write operation delay time, and an expected accessed frequency.

6. The metadata processing method according to claim 4, wherein the sorting step includes:

selecting a parameter from the performance data as a first ranking index;

7. A computer-readable storage medium storing a metadata processing program executable by at least one processor to cause the at least one processor to perform the steps of the metadata processing method according to any one of claims 4-6.