CN111007990A - Positioning method for quickly positioning data block reference in snapshot system

Info

Publication number: CN111007990A
Application number: CN201911345693.7A
Authority: CN
Inventors: 胡国玉; 沈海嘉; 杨浩; 袁清波; 郭照斌
Original assignee: Dawning Information Industry Beijing Co Ltd
Current assignee: Zhongke Sugon Information Industry Chengdu Co ltd; Dawning Information Industry Beijing Co Ltd
Priority date: 2019-12-24
Filing date: 2019-12-24
Publication date: 2020-04-14
Anticipated expiration: 2039-12-24
Also published as: CN111007990B

Abstract

The application discloses a positioning method for quickly positioning data block references in a snapshot system, which comprises the following steps: step 1, creating a volume attribute for a main volume, creating a snapshot system according to the main volume, and assigning values to the volume attribute of each snapshot and the data block attribute of each data block; step 2, according to the data block attribute of a data block, determining the corresponding snapshots in the snapshot system and generating a snapshot set; step 3, loading the metadata of the snapshots in the snapshot set according to the logical block address of the data block, and determining the physical block address to which each snapshot in the snapshot set maps that logical block address; and step 4, when the physical block address is judged to be equal to the source physical block address of the data block to be migrated, updating the mapping relation between the corresponding snapshot and the migrated data block according to the target physical block address of the data block to be migrated. With this technical scheme, the data block references in the snapshot system are located quickly, and the efficiency of underlying data migration of data blocks is improved.

Description

Positioning method for quickly positioning data block reference in snapshot system

Technical Field

The application relates to the technical field of data storage, in particular to a positioning method for quickly positioning data block references in a snapshot system.

Background

The snapshot technology is applied to the fields of data backup and data mining. A storage system providing block device service to the outside world treats each block device as a Logical Unit (LUN), and a user can create a file system on the LUN through a client. LUNs are also known as volumes. The volume may be a primary volume, or a primary volume-based snapshot, or a snapshot-based snapshot. A clone is a special snapshot and is not distinguished from a normal snapshot in this application.

In the snapshot system, a primary volume together with all of its related snapshots and clones forms a volume set (LUN set). Each volume set contains a plurality of version sets, where a version set represents a snapshot source (primary volume, snapshot or clone) together with all snapshots and clones generated based on that snapshot source. The snapshot system further includes a copy identifier (replica id) to identify a specific LUN copy.

As shown in fig. 1, each snapshot includes a plurality of Data Blocks (DB), a higher-level Address of each Data Block is referred to as a Logical Block Address (LBA), a lower-level Address of each Data Block is referred to as a Physical Block Address (PBA), and the Data blocks are stored in corresponding locations of the disk according to the PBA. Therefore, a mapping relationship exists between the logical block address LBA and the physical block address PBA, and the snapshot can implement sharing of the physical blocks through the mapping relationship.

Volume B is generated at a certain time based on volume A (the primary volume), and volume C is then generated at a certain time based on volume B. The LBAs of each volume are mapped to PBAs on the disk as shown in FIG. 1(a), where some physical blocks (200, 201, 202) are shared by more than one volume because of the snapshot relationship, and the physical block (500) is referenced by only one volume.

Assume that the physical block PBA202 is accessed at high frequency, that disk 0 is a relatively slow mechanical hard disk, and that another disk, disk 1, is a high-speed solid state disk (the physical blocks of disk 1 and disk 0 share one address space, and the PBAs of disk 1 start from 1000). To give applications better performance, data migration based on disk hot spots automatically migrates the physical block PBA202 to disk 1, here assumed to be the physical block PBA1000. Besides copying the data in PBA202 to PBA1000, the migration must remap every logical block in volumes A, B, and C that references PBA202 to PBA1000, so that after the migration PBA1000 is referenced in exactly the same way as PBA202 was, as shown in FIG. 1(b).
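The remapping requirement can be illustrated with a minimal sketch. It is not taken from the application: the LBA keys and the exact sharing pattern are hypothetical (FIG. 1 is not reproduced here), and `remap_all` is an illustrative name. It shows the brute-force approach in which the full mapping of every volume is scanned.

```python
# Hypothetical LBA -> PBA tables for volumes A, B and C. Only the PBA values
# (200, 201, 202, 500) come from the text; the LBA keys are made up.
volumes = {
    "A": {0: 200, 1: 201, 2: 202},
    "B": {0: 200, 1: 500, 2: 202},
    "C": {0: 200, 1: 201, 2: 202},
}


def remap_all(volumes, src_pba, dst_pba):
    """Scan every mapping of every volume and repoint src_pba to dst_pba.

    Returns the list of (volume, lba) pairs that were updated.
    """
    updated = []
    for name, lba_to_pba in volumes.items():
        for lba, pba in lba_to_pba.items():
            if pba == src_pba:
                lba_to_pba[lba] = dst_pba  # value update only, no resize
                updated.append((name, lba))
    return updated


# Migrate PBA 202 (slow disk 0) to PBA 1000 (fast disk 1).
updated = remap_all(volumes, src_pba=202, dst_pba=1000)
```

The cost of this scan grows with the total number of mappings in the volume set, which is why the rest of the document concentrates on narrowing down which volumes have to be inspected at all.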

In addition to "changing storage media according to data heat", there are other scenarios such as defragmentation of disks.

A key step affecting the data migration efficiency of the snapshot system is how to quickly locate all the logical blocks referencing the physical block to be migrated, and then complete the modification of the mapping from all those LBAs to the PBA. In a snapshot system, some attributes are usually recorded for each allocated physical block, such as its corresponding LBA and the copy identifier (replica id) and version set identifier (fsid) of the corresponding volume at the time the physical block was first allocated. The copy identifier (replica id) identifies a specific LUN copy, and the version set identifier identifies a specific version set.

However, in the prior art, when updating the mapping from LBA to PBA, there are two disadvantages in locating a data block:

On the one hand, the copy identifier does not distinguish between a volume acting as a snapshot source and the same volume acting as a snapshot. When locating the references to a data block, the copy identifiers of all snapshot sources and snapshots must therefore be traversed, and those whose copy identifier is greater than or equal to the copy identifier of the data block are selected. When the snapshot system has many levels, this positioning method is inefficient.

On the other hand, all references to the same data block can be concentrated in one location by introducing a new level of mapping. As shown in fig. 2, m1 to m4 are metadata record addresses that store the contents of this new mapping level; to migrate the physical block PBA202 to the physical block PBA1000, only the reference in the metadata record address m3 needs to be modified. Although this improves the efficiency of locating the LBA mapping, the extra mapping layer seriously degrades the performance of normal data paths such as reads and writes. Data migration is usually a background process, so sacrificing the performance of the normal data path to speed up data migration is not worthwhile.
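The trade-off of the extra mapping level can be sketched as follows. This is an illustration under stated assumptions, not the prior-art implementation: the record names m1 to m4 and the fact that m3 holds PBA 202 follow the text, while the other record contents, the LBA keys and the function name `read_pba` are hypothetical.

```python
# Intermediate records m1..m4 (names from FIG. 2). That m3 holds PBA 202
# follows from the text; the other contents and all LBA keys are hypothetical.
records = {"m1": 200, "m2": 201, "m3": 202, "m4": 500}

# Each volume now maps LBA -> record instead of LBA -> PBA.
volumes = {
    "A": {0: "m1", 1: "m2", 2: "m3"},
    "B": {0: "m1", 1: "m4", 2: "m3"},
    "C": {0: "m1", 1: "m2", 2: "m3"},
}


def read_pba(volume, lba):
    """Resolve an LBA; every normal read or write pays for two lookups."""
    record = volumes[volume][lba]  # first lookup: LBA -> record
    return records[record]         # second lookup: record -> PBA


# Migration touches a single record, however many volumes share the block.
records["m3"] = 1000
resolved = [read_pba(v, 2) for v in ("A", "B", "C")]
```

Migration becomes a single update, but `read_pba` performs the second lookup on every access, which is the data-path cost the text objects to.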

Disclosure of Invention

The purpose of this application is to provide a positioning method for quickly positioning data block references in a snapshot system, so that underlying data migration, including cold and hot data migration, defragmentation and the like, can be completed more efficiently.

The technical scheme of the first aspect of the application is as follows: a positioning method for quickly positioning data block references in a snapshot system is provided, the positioning method is suitable for updating the mapping relation between a snapshot and a data block, and the method comprises the following steps: step 1, creating a volume attribute for a main volume, creating a snapshot system according to the main volume, and assigning values to the volume attribute of each snapshot and the data block attribute of each data block according to the order in which the snapshots and data blocks are generated in the snapshot system and the volume attribute of the main volume; step 2, according to the data block attribute of a data block, determining the corresponding snapshots in the snapshot system and generating a snapshot set; step 3, according to the data block migration instruction, determining the source physical block address and the target physical block address of the data block to be migrated, loading the metadata of the snapshots in the snapshot set, and determining the physical block address to which each snapshot in the snapshot set maps the logical block address of the data block; and step 4, when the physical block address mapped by a snapshot is judged to be equal to the source physical block address of the data block to be migrated, updating the mapping relation between the corresponding snapshot and the migrated data block according to the target physical block address and the logical block address of the data block to be migrated, after the data migration of the data block is completed.

In any of the foregoing technical solutions, further, the volume attribute of the snapshot includes a source copy identifier, a source version set identifier, a target copy identifier, and a target version set identifier, and the data block attribute includes a copy identifier, a version set identifier, and a data block logical block address.

In any one of the above technical solutions, further, in step 1, assigning the data block attribute of a data block specifically includes: when it is judged that a new data block is generated, assigning the copy identifier of the newly generated data block the value of the source copy identifier of the corresponding main volume or snapshot, and assigning the version set identifier of the newly generated data block the value of the source version set identifier of the corresponding main volume or snapshot.

In any one of the above technical solutions, further, in step 1, assigning a value to a volume attribute of the snapshot specifically includes: step 11, when judging that a new snapshot is generated, respectively assigning a source copy identifier and a source version set identifier of a snapshot source of the newly generated snapshot to a target copy identifier and a target version set identifier of the newly generated snapshot; step 12, sequentially selecting two elements in the copy identification set according to the sizes of the elements, and respectively assigning the selected two elements to the source copy identification of the snapshot source and the source copy identification of the newly generated snapshot; and step 13, sequentially selecting an element in the version set identifier set, and assigning the selected element to the source version set identifier of the newly generated snapshot.

In any one of the above technical solutions, further, step 2 specifically includes: step 21, determining the version sets of the snapshot system according to the values of the source version set identifier and the target version set identifier of the main volume in the snapshot system and the values of the source version set identifier and the target version set identifier of each snapshot, wherein each version set comprises at least one snapshot; step 22, extracting the value of the copy identifier and the value of the version set identifier of the data block, selecting the version set in the snapshot system whose identifier equals the version set identifier of the data block, selecting in that version set the snapshot whose source version set identifier equals the identifier of the version set, recording the selected snapshot as the source volume of the version set, and recording the remaining snapshots as target volumes; and step 23, when the value of the source copy identifier of the source volume is judged to be equal to the value of the copy identifier of the data block, marking the source volume as the only element of the snapshot set, and generating the snapshot set.

In any one of the above technical solutions, further, step 2 specifically further includes: step 24, when the value of the source copy identifier of the source volume is judged not to be equal to the value of the copy identifier of the data block, marking the source volume as an element of the snapshot set, selecting among the target volumes all those whose target copy identifier is greater than or equal to the copy identifier of the data block, and marking the selected target volumes as elements of the snapshot set; and step 25, generating the snapshot set according to the marked elements and the snapshots that take those elements as snapshot sources.

The technical scheme of the second aspect of the application is as follows: there is provided a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the method for locating a reference to a data block in a snapshot system according to any one of the technical solutions in the first aspect.

The beneficial effect of this application is:

According to the technical scheme, a volume attribute is created for the main volume, values are assigned to the volume attributes of the snapshots and the data block attributes of the data blocks according to the order in which the snapshots and data blocks are generated in the snapshot system, and the snapshot set of a data block is generated according to the volume attributes, so that each data block corresponds to one snapshot set and the efficiency of finding snapshots from a data block is improved. Then, according to the data block migration instruction, the physical block address mapped by each snapshot is determined by loading metadata, and after the migration is completed the mapping relation between the snapshots and the migrated data block is updated. This solves the problem of low migration efficiency for underlying data in the snapshot system, and the high performance of the conventional data path is maintained while underlying data migration is completed efficiently. The scheme is suitable for any migration scenario involving the underlying data of a snapshot system, such as dynamic migration of cold and hot data, defragmentation and the like.

According to the technical scheme, the searching range of the snapshot can be limited on the specific snapshot subtree, so that the data volume of metadata needing to be loaded is greatly reduced, and the efficiency of positioning data block full reference of a snapshot system is improved.

Drawings

The advantages of the above and/or additional aspects of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

FIG. 1 is a schematic diagram of a physical block migration process in the prior art;

FIG. 2 is a diagram illustrating a physical block migration process after a first level of mapping is introduced in the prior art;

FIG. 3 is a schematic flow chart diagram of a location method for quickly locating data block references in a snapshot system in accordance with one embodiment of the present application;

FIG. 4 is a schematic diagram of attributes of a primary volume and data blocks according to one embodiment of the present application;

FIG. 5 is a schematic diagram of a snapshot volume attribute assignment process according to one embodiment of the present application;

FIG. 6 is a schematic diagram of a version set according to one embodiment of the present application.

Detailed Description

In order that the above objects, features and advantages of the present application can be more clearly understood, the present application will be described in further detail with reference to the accompanying drawings and detailed description. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, however, the present application may be practiced in other ways than those described herein, and therefore the scope of the present application is not limited by the specific embodiments disclosed below.

As shown in fig. 3, the present embodiment provides a positioning method for quickly positioning a reference to a data block in a snapshot system, including:

Step 1, creating a volume attribute for a primary volume, creating a snapshot system according to the primary volume, and assigning values to the volume attribute of each snapshot and the data block attribute of each data block according to the order in which the snapshots and data blocks are generated in the snapshot system, wherein the volume attribute comprises a source copy identifier src replica id, a source version set identifier src version set id, a target copy identifier dest replica id and a target version set identifier dest version set id, and the data block attribute comprises a copy identifier replica id, a version set identifier version set id and a logical block address LBA.

Preferably, when it is determined that a new data block is generated, the copy identifier replica id of the data block is assigned as the source copy identifier src replica id of its corresponding volume (primary volume or snapshot), and the version set identifier version set id is assigned as the source version set identifier src version set id of its corresponding volume.

Specifically, when a user creates a primary volume in the storage system, a volume attribute is created for the primary volume, and the volume attribute is assigned, wherein the volume attribute and the corresponding primary volume are stored separately. When a user creates a snapshot system according to actual requirements, the created content comprises a snapshot and a data block, so that the volume attribute of the snapshot and the data block attribute of the data block can be assigned according to the generation sequence of the snapshot and the data block.

For any snapshot in the snapshot system, the snapshot may be a snapshot generated according to a certain snapshot source (primary volume), or may be a snapshot source that generates a certain snapshot. The snapshot system refers to data in a snapshot source in a sharing mode to generate a new snapshot. Both the primary volume and the snapshot are volumes that reference data blocks through a LBA to PBA mapping.

In this embodiment, two groups of identifiers are set. The first group identifies a volume in its role as a snapshot source and includes the source copy identifier src replica id and the source version set identifier src version set id. The second group identifies a volume in its role as a snapshot and includes the target copy identifier dest replica id and the target version set identifier dest version set id.
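The two attribute records described so far can be sketched as plain data structures. This is only an illustration: the class and function names (`VolumeAttr`, `BlockAttr`, `stamp_new_block`) are hypothetical, identifiers are modelled as integers, and `None` is used here for a volume that has no identity as a snapshot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VolumeAttr:
    """Volume attribute of a primary volume or snapshot."""
    src_replica_id: int                         # identity as a snapshot source
    src_version_set_id: int
    dest_replica_id: Optional[int] = None       # identity as a snapshot
    dest_version_set_id: Optional[int] = None   # None: no snapshot identity


@dataclass(frozen=True)
class BlockAttr:
    """Data block attribute, fixed when the block is first generated."""
    replica_id: int
    version_set_id: int
    lba: int


def stamp_new_block(volume: VolumeAttr, lba: int) -> BlockAttr:
    """Give a newly written block the current source identifiers of its volume."""
    return BlockAttr(volume.src_replica_id, volume.src_version_set_id, lba)


primary = VolumeAttr(src_replica_id=1, src_version_set_id=1)
db = stamp_new_block(primary, lba=0)  # the LBA value is arbitrary here
```

Making `BlockAttr` immutable reflects that a block keeps the identifiers of the moment it was written, whereas the source copy identifier of a volume changes each time a snapshot is taken from it.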

When a new snapshot is generated, two new identifiers are selected in sequence from the copy identifier set; one updates the source copy identifier src replica id of the snapshot source, and the other becomes the source copy identifier src replica id of the generated snapshot. One new identifier is selected in sequence from the version set identifier set and becomes the source version set identifier src version set id of the new snapshot. That is, when a snapshot is generated, the source version set identifier and the target version set identifier of the snapshot source do not change. The target copy identifier dest replica id and the target version set identifier dest version set id of the new snapshot inherit, respectively, the src replica id and the src version set id that the corresponding snapshot source held before the update. The copy identifier set and the version set identifier set are generated randomly.

The embodiment shows a method for assigning a volume attribute of a snapshot, which specifically includes:

step 11, when judging that a new snapshot is generated, respectively assigning the source copy identifier src replica id and the source version set identifier src version set id of the snapshot source to the target copy identifier dest replica id and the target version set identifier dest version set id of the snapshot;

step 12, sequentially selecting two elements in the copy identification set according to the sizes of the elements, and respectively assigning the selected two elements to a source copy identification src replica id of the snapshot source and a source copy identification src replica id of the snapshot;

and step 13, sequentially selecting an element from the version set identifier set, and assigning the selected element to the source version set identifier src version set id of the snapshot.

Preferably, the elements in the replica identification set and the version set identification set are both monotonically increasing arrays. The two sets can be two independent and non-interfering sets, or a common set, and the elements are selected sequentially in sequence.

Specifically, as shown in fig. 4, in this embodiment the copy identifier set is [A1, A2, …, An] and the version set identifier set is [B1, B2, …, Bn], where A1 < A2 < … < An and B1 < B2 < … < Bn. The snapshot system initially contains only a primary volume; a volume attribute is established for the primary volume and assigned initial values. It should be noted that, because the primary volume has no identity as a snapshot, its second group of identifiers may be assigned an invalid value X.

The first elements A1 and B1 of the copy identifier set and the version set identifier set are selected as the initial values of the source copy identifier src replica id and the source version set identifier src version set id of the primary volume, that is, src replica id = A1 and src version set id = B1. When a data block DB1-1 is newly generated under the primary volume, its copy identifier replica id takes the current value of the primary volume's source copy identifier src replica id, that is, replica id = A1, and its version set identifier version set id takes the value of the primary volume's source version set identifier src version set id, that is, version set id = B1.

As shown in fig. 5, when snapshot 1 is created with the primary volume as the snapshot source, the snapshot source's source copy identifier src replica id = A1 and source version set identifier src version set id = B1 are assigned to the target copy identifier dest replica id and the target version set identifier dest version set id of snapshot 1, respectively.

Two elements, A2 and A3, are then selected from the copy identifier set. One of them, A2 (or A3), is assigned to the source copy identifier src replica id of the snapshot source, and the other, A3 (or A2), is assigned to the source copy identifier src replica id of snapshot 1. An element B2 is selected from the version set identifier set and assigned to the source version set identifier src version set id of snapshot 1.

It should be noted that, once the elements A2 and A3 are selected, they may be assigned to the source copy identifiers src replica id of the snapshot source and of the snapshot in either order.

The replica identification set can be [1,2,3, …, N ], and the version set identification set can also be [1,2,3, …, N ].
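Steps 11 to 13 can be sketched with the integer identifier sets [1, 2, 3, …] mentioned above. This is a minimal illustration, not the application's implementation: the names `Volume`, `create_primary` and `create_snapshot` are hypothetical, `None` stands in for the invalid value X, and the sketch fixes one of the two allowed orders in step 12 (the smaller identifier goes to the snapshot source).

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass
class Volume:
    name: str
    src_replica_id: int
    src_version_set_id: int
    dest_replica_id: Optional[int] = None       # None stands in for X
    dest_version_set_id: Optional[int] = None


replica_ids = count(1)       # A1, A2, ... written as 1, 2, ...
version_set_ids = count(1)   # B1, B2, ... written as 1, 2, ...


def create_primary(name: str) -> Volume:
    """A primary volume takes the first replica id and version set id."""
    return Volume(name, next(replica_ids), next(version_set_ids))


def create_snapshot(source: Volume, name: str) -> Volume:
    # Step 11: the snapshot's target identifiers take the source's
    # current source identifiers.
    dest_replica = source.src_replica_id
    dest_version_set = source.src_version_set_id
    # Step 12: two fresh replica ids; here the smaller one goes to the
    # source, although either order is permitted.
    source.src_replica_id = next(replica_ids)
    snap_replica = next(replica_ids)
    # Step 13: one fresh version set id for the snapshot; the source's
    # version set identifiers are left unchanged.
    return Volume(name, snap_replica, next(version_set_ids),
                  dest_replica, dest_version_set)


primary = create_primary("primary")
snap1 = create_snapshot(primary, "snapshot 1")
```

After these two calls the primary volume holds src replica id 2 and src version set id 1, while snapshot 1 holds src replica id 3, src version set id 2, dest replica id 1 and dest version set id 1, which matches the FIG. 5 walk-through with A2 given to the snapshot source.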

Step 2, determining the corresponding snapshots in the snapshot system according to the data block attribute of the data block, and generating a snapshot set V.

Specifically, the data block attribute of a data block includes a copy identifier replica id and a version set identifier version set id. The copy identifier replica id is assigned the value of the source copy identifier src replica id of the corresponding snapshot, and the version set identifier version set id is assigned the value of the source version set identifier src version set id of the corresponding snapshot, so the snapshots referencing the data block can be determined from the data block attribute. A new data block is always generated by writing to some logical block address of some primary volume or snapshot, and the primary volume or snapshot that was written is the corresponding snapshot.

This embodiment shows a method for generating a snapshot set V in step 2, where the method specifically includes:

step 21, determining a version set of the snapshot system according to the assignment of the source version set identifiers src version set id and the assignment of the target version set identifiers dest version set id of the primary volume and the snapshot in the snapshot system;

Specifically, as shown in fig. 6, after snapshot 1 is created from the primary volume, snapshot 2 is created from the primary volume; then snapshots 1-1 and 1-2 are created in sequence with snapshot 1 as the snapshot source; finally snapshot 1-2-1 is created with snapshot 1-2 as the snapshot source. The snapshot volume attributes are assigned as in steps 11 to 13, which is not repeated here. Data blocks are created as required while the snapshot system is being built, and the arrows in fig. 6 indicate the reference relationships between data blocks and snapshots.

Therefore, the snapshot system can be divided into 6 version sets according to the source version set identifier src version set id and the target version set identifier dest version set id, which sequentially includes:

version set B1: contains the primary volume, snapshot 1 and snapshot 2, where the target version set identifier dest version set id of snapshot 1 and snapshot 2 is B1, equal to the source version set identifier src version set id of the primary volume;

version set B2: contains snapshot 1, snapshot 1-1 and snapshot 1-2, where the target version set identifier dest version set id of snapshot 1-1 and snapshot 1-2 is B2, equal to the source version set identifier src version set id of snapshot 1;

version set B3: contains only snapshot 2, because no snapshot in the snapshot system has a target version set identifier dest version set id equal to the source version set identifier src version set id of snapshot 2;

version set B4: snapshot 1-1;

version set B5: snapshot 1-2 and snapshot 1-2-1;

version set B6: snapshot 1-2-1.
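The grouping of step 21 can be sketched as follows for the FIG. 6 example. The function name `build_version_sets` and the tuple layout are hypothetical; B1 to B6 are written as 1 to 6, and `None` stands in for the primary volume's invalid target identifier.

```python
from collections import defaultdict

# (name, src_version_set_id, dest_version_set_id) for the FIG. 6 example.
volumes = [
    ("primary", 1, None),
    ("snapshot 1", 2, 1),
    ("snapshot 2", 3, 1),
    ("snapshot 1-1", 4, 2),
    ("snapshot 1-2", 5, 2),
    ("snapshot 1-2-1", 6, 5),
]


def build_version_sets(volumes):
    """Group volumes by version set id: source volume first, then targets."""
    sets = defaultdict(list)
    for name, src_vs, _ in volumes:
        sets[src_vs].append(name)        # a volume is the source of its own set
    for name, _, dest_vs in volumes:
        if dest_vs is not None:
            sets[dest_vs].append(name)   # and a target in its source's set
    return dict(sets)


version_sets = build_version_sets(volumes)
```

Every volume therefore appears once as the source volume of its own version set and, unless it is the primary volume, once more as a target volume in the version set of its snapshot source.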

Step 22, extracting the values of the copy identifier replica id and the version set identifier version set id of the data block, selecting the version set whose identifier equals the data block's version set id, selecting in that version set the snapshot whose source version set identifier src version set id equals the identifier of the version set and recording it as the source volume of the version set, and recording the remaining snapshots as target volumes;

and step 23, when the assignment of the source copy identification src replica id of the source volume is determined to be equal to the assignment of the copy identification replica id of the data block, marking the source volume as the unique element of the snapshot set V, and generating the snapshot set.

Further, the method for generating the snapshot set in step 2 specifically includes:

step 24, when it is determined that the value of the source copy identifier src replica id of the source volume is not equal to the value of the copy identifier replica id of the data block, marking the source volume as an element of the snapshot set V, selecting among the target volumes all those whose target copy identifier dest replica id is greater than or equal to the copy identifier replica id of the data block, marking the selected target volumes as elements of the snapshot set V, and executing step 25;

and step 25, generating the snapshot set from the elements marked in step 24 together with every snapshot generated, directly or indirectly, with one of the marked target volumes as snapshot source. In terms of a tree, all other snapshots in the subtree rooted at each marked target volume are also added to the snapshot set V; the subtree rooted at a target volume contains the target volume and the subtrees rooted at each of its snapshots.

In particular, assuming that the underlying physical block migration program determines that data block DB2-8 and data block DB2-3 need to be migrated, all volumes in the volume set that may reference both data blocks may be found in accordance with the above-described process.

For DB 2-8:

1) First, the copy identifier replica id and the version set identifier version set id, A8 and B2 respectively, are extracted from the data block attribute.

2) Based on the version set identification B2, the corresponding version set, version set 2, is found, which contains snapshot 1 and its two snapshots (snapshot 1-1, snapshot 1-2).

3) Browsing each volume in version set 2, finding that snapshot 1 is the source volume of current version set 2, and its source copy identification src replica id is equal to copy identification A8 of the data block, so snapshot 1 is the only volume that can reference DB2-8, adding snapshot 1 to set V, going to step 3 to further compare PBA.

For DB 2-3:

1) First, the copy identifier replica id and the version set identifier version set id, A3 and B2 respectively, are extracted from the data block attribute.

2) Based on the version set identifier B2, the corresponding version set, version set 2, is found, as for DB2-8.

3) Browse each volume in version set 2:

a) Snapshot 1 is the source volume of the current version set 2, so snapshot 1 may reference DB2-3 and is added to set V. However, the source copy identifier src replica id of the source volume, A8, is not equal to the copy identifier replica id of the data block, A3, so the analysis continues with the other volumes in version set 2, that is, all target volumes, namely snapshot 1-1 and snapshot 1-2.

b) Snapshot 1-1 and snapshot 1-2 are target volumes in version set 2, and their target copy identifiers, A3 and A6, are both greater than or equal to the data block's copy identifier A3, so both snapshots may also reference DB2-3 and both are added to set V.

4) For each volume in set V that is not the source volume of version set 2 (snapshots 1-1 and 1-2 meet this condition), add the subtree rooted at that volume to set V:

a) The version set whose source is snapshot 1-1 is version set 4, which contains only snapshot 1-1 itself.

b) The version set whose source is snapshot 1-2 is version set 5, which also contains snapshot 1-2-1, a snapshot of snapshot 1-2; version set 6, whose source is snapshot 1-2-1, contains only snapshot 1-2-1. Adding the subtree rooted at snapshot 1-2 to set V therefore amounts to adding snapshot 1-2-1.

For data block DB2-3, snapshot set V thus contains a total of 4 snapshots, namely snapshot 1, snapshot 1-1, snapshot 1-2, and snapshot 1-2-1. All 4 snapshots may reference data block DB2-3; proceed to step 3 to compare PBAs.
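Steps 22 to 25 and the two walk-throughs can be reproduced with a short sketch. The names `Volume` and `candidate_set` are hypothetical and identifiers A1, A2, … and B1, B2, … are written as integers. The values A8, A3 and A6 come from the text; the remaining identifier values are an assumption, obtained by applying steps 11 to 13 with sequential identifiers and giving the smaller of each pair to the snapshot source.

```python
from typing import NamedTuple, Optional


class Volume(NamedTuple):
    name: str
    src_replica: int
    src_vs: int
    dest_replica: Optional[int]
    dest_vs: Optional[int]


VOLUMES = [
    Volume("primary", 4, 1, None, None),
    Volume("snapshot 1", 8, 2, 1, 1),
    Volume("snapshot 2", 5, 3, 2, 1),
    Volume("snapshot 1-1", 7, 4, 3, 2),
    Volume("snapshot 1-2", 10, 5, 6, 2),
    Volume("snapshot 1-2-1", 11, 6, 9, 5),
]


def candidate_set(volumes, block_replica, block_vs):
    """Return the names of all volumes that may reference the block."""
    # Step 22: the source volume of the block's version set.
    source = next(v for v in volumes if v.src_vs == block_vs)
    result = [source.name]
    # Step 23: equal replica ids mean no snapshot was taken since the write.
    if source.src_replica == block_replica:
        return result
    # Step 24: target volumes created at or after the time of the write.
    pending = [v for v in volumes
               if v.dest_vs == block_vs and v.dest_replica >= block_replica]
    # Step 25: add each selected target volume and its whole subtree.
    while pending:
        vol = pending.pop(0)
        result.append(vol.name)
        pending.extend(c for c in volumes if c.dest_vs == vol.src_vs)
    return result


v_db2_8 = candidate_set(VOLUMES, block_replica=8, block_vs=2)
v_db2_3 = candidate_set(VOLUMES, block_replica=3, block_vs=2)
```

Under these assumptions `v_db2_8` contains only snapshot 1 and `v_db2_3` contains snapshot 1, snapshot 1-1, snapshot 1-2 and snapshot 1-2-1, so the primary volume and snapshot 2 never have their metadata loaded for either block.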

Step 3, according to the data block migration instruction, determining a source physical block address of a data block to be migrated and a target physical block address of the data block to be migrated, loading metadata of a snapshot in a snapshot set V according to a logical block address LBA of the data block, and determining a physical block address mapped to each snapshot in the snapshot set at the logical block address;

Step 4, when the physical block address mapped by the snapshot is judged to be equal to the source physical block address of the data block to be migrated, after the data migration of the migrated data block is completed, updating the mapping relation between the corresponding snapshot and the migrated data block according to the target physical block address of the data block to be migrated and the snapshot logical block address.

Specifically, suppose the data block DB2-3 is to be migrated from the source physical block address PBA202 to the target physical block address PBA1000, and its logical block address LBA is 300. All snapshots that may reference the migrated data block DB2-3 have been placed into snapshot set V in step 2 (4 snapshots: snapshot 1, snapshot 1-1, snapshot 1-2, and snapshot 1-2-1).

Metadata is loaded for each snapshot in snapshot set V to find the physical block address PBA mapped at the snapshot logical block address LBA300. If that PBA equals the source physical block address of the data block to be migrated, the snapshot really references the data block. In that case, after the data block has been migrated from its source physical block address to its target physical block address, the metadata of the snapshot is updated so that the recorded mapping from the data block logical block address LBA to the source physical block address becomes a mapping from the LBA to the target physical block address.

Assume that the physical block addresses PBA mapped at LBA300 of the snapshots in snapshot set V are PBA205, PBA202, PBA202, and PBA400, respectively. Only the PBAs mapped at LBA300 of snapshots 1-1 and 1-2 equal the source physical block address PBA202 of the migrated data block DB2-3, so only snapshots 1-1 and 1-2 reference DB2-3. Snapshots 1 and 1-2-1 have rewritten the data block at the snapshot logical block address LBA300, so their mapped PBAs have changed and point to other data blocks.

After the data block DB2-3 is migrated from the source physical block address PBA202 to the target physical block address PBA1000, the metadata of both snapshot 1-1 and snapshot 1-2 must be modified: the mapping relationship between each snapshot and the data block is updated so that LBA300 maps to PBA1000.
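The comparison and update in steps 3 and 4 can be sketched as follows. This is an illustrative Python sketch only: the function name `remap_after_migration` and the in-memory dictionaries standing in for on-disk snapshot metadata are assumptions, and assigning PBA205 to snapshot 1 and PBA400 to snapshot 1-2-1 is one reading of the example consistent with the text.

```python
from typing import Dict, List


def remap_after_migration(snapshot_maps: Dict[str, Dict[int, int]],
                          candidates: List[str],
                          lba: int, src_pba: int, dst_pba: int) -> List[str]:
    """Update the LBA -> PBA mapping of every candidate that really references the block.

    snapshot_maps[name] stands in for loading the metadata of one snapshot.
    Returns the names of the snapshots whose mapping was changed.
    """
    updated = []
    for name in candidates:
        lba_to_pba = snapshot_maps[name]      # hypothetical metadata load
        if lba_to_pba.get(lba) == src_pba:    # the snapshot references the migrated block
            lba_to_pba[lba] = dst_pba         # point it at the new location
            updated.append(name)
    return updated


# Mappings at LBA300 for the four snapshots in set V (see the assumption above).
snapshot_maps = {
    "snapshot 1": {300: 205},
    "snapshot 1-1": {300: 202},
    "snapshot 1-2": {300: 202},
    "snapshot 1-2-1": {300: 400},
}

updated = remap_after_migration(snapshot_maps, list(snapshot_maps),
                                lba=300, src_pba=202, dst_pba=1000)
print(updated)
```

Only the snapshots whose mapping at LBA300 equals PBA202 are rewritten; the mappings of snapshot 1 and snapshot 1-2-1 are left unchanged.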

It should be noted that snapshot metadata is generally very large, so loading the LBA -> PBA map from the snapshot metadata involves relatively time-consuming disk operations. By contrast, the information operated on in step 2 of the positioning method in this embodiment, such as the snapshot attributes, is very small in total and can usually reside in memory. This is why step 2 is performed first to obtain a relatively small snapshot set V.

The technical solution of the present application is described in detail above with reference to the accompanying drawings. The present application provides a positioning method for quickly positioning a reference of a data block in a snapshot system, where the positioning method is suitable for updating a mapping relationship between a snapshot and the data block, and the method includes: step 1, creating a volume attribute for a main volume, creating a snapshot system according to the main volume, and respectively assigning values to the volume attribute of the snapshot and the data block attribute of the data block according to the generation sequence of the snapshot and the data block in the snapshot system; step 2, according to the data block attribute of the data block, determining a corresponding snapshot in the snapshot system, and generating a snapshot set; step 3, according to the data block migration instruction, determining a source physical block address of a data block to be migrated and a target physical block address of the data block to be migrated, loading metadata of snapshots in the snapshot set, and determining physical block addresses mapped to the data block logical block addresses of the snapshots in the snapshot set; and step 4, when the physical block address mapped by the snapshot is judged to be equal to the source physical block address of the data block to be migrated, after the data migration of the migrated data block is completed, updating the mapping relation between the corresponding snapshot and the migrated data block according to the target physical block address of the data block to be migrated and the logical block address. According to the technical scheme, the data block references in the snapshot system are quickly positioned, and the efficiency of migrating the underlying data of data blocks, such as cold and hot data migration and defragmentation, is improved.

The steps in the present application may be reordered, combined, and deleted according to actual requirements.

The units in the device can be merged, divided and deleted according to actual requirements.

Although the present application has been disclosed in detail with reference to the accompanying drawings, it is to be understood that such description is merely illustrative and not restrictive of the application of the present application. The scope of the present application is defined by the appended claims and may include various modifications, adaptations, and equivalents of the invention without departing from the scope and spirit of the application.

Claims

1. A positioning method for quickly positioning data block reference in a snapshot system is characterized in that the positioning method is suitable for updating mapping relation between a snapshot and a data block, and comprises the following steps:

step 1, creating a volume attribute for a main volume, creating a snapshot system according to the main volume, and respectively assigning values to the volume attribute of the snapshot and the data block attribute of the data block according to the generation sequence of the snapshot and the data block in the snapshot system and the volume attribute of the main volume;

step 2, according to the data block attribute of the data block, determining a corresponding snapshot in the snapshot system, and generating a snapshot set;

step 3, according to a data block migration instruction, determining a source physical block address of a data block to be migrated and a target physical block address of the data block to be migrated, loading metadata of snapshots in the snapshot set, and determining physical block addresses mapped to the logical block addresses of the data blocks of the snapshots in the snapshot set;

and step 4, when the physical block address mapped to the snapshot is judged to be equal to the source physical block address of the data block to be migrated, updating the mapping relation between the corresponding snapshot and the migrated data block according to the target physical block address of the data block to be migrated and the logical block address of the data block.

2. A method for quickly locating a reference to a data block in a snapshot system as recited in claim 1, wherein said snapshot volume attributes include a source copy identification, a source version set identification, a target copy identification, and a target version set identification, and wherein said data block attributes include a copy identification, a version set identification, and a data block logical block address.

3. The method for quickly locating a reference to a data block in a snapshot system as claimed in claim 2, wherein in step 1, assigning a value to a data block attribute of the data block specifically includes:

and when judging that a new data block is generated, assigning the copy identifier of the newly generated data block as the source copy identifier of the corresponding main volume or snapshot, and assigning the version set identifier of the newly generated data block as the source version set identifier of the corresponding main volume or snapshot.

4. The method for quickly locating a reference to a data block in a snapshot system according to claim 2, wherein in step 1, assigning a value to a volume attribute of the snapshot specifically includes:

step 11, when it is determined that a new snapshot is generated, assigning a source copy identifier and a source version set identifier of a snapshot source of the newly generated snapshot to a target copy identifier and a target version set identifier of the newly generated snapshot respectively;

step 12, sequentially selecting two elements in the copy identification set according to the sizes of the elements, and respectively assigning the selected two elements to the source copy identification of the snapshot source and the source copy identification of the newly generated snapshot;

and step 13, sequentially selecting an element from the version set identifier set, and assigning the selected element to the source version set identifier of the newly generated snapshot.

5. The method for quickly locating a reference to a data block in a snapshot system as claimed in claim 2, wherein the step 2 specifically includes:

step 21, determining a version set of the snapshot system according to the assignment of the source version set identifier and the assignment of the target version set identifier of the main volume in the snapshot system, and the assignment of the source version set identifier and the assignment of the target version set identifier of the snapshot, wherein the version set at least comprises one snapshot;

step 22, respectively extracting the assignment of the copy identifier and the assignment of the version set identifier of the data block in the snapshot system, selecting the version set in the snapshot system whose assignment equals the assignment of the version set identifier of the data block, selecting, in the selected version set, the snapshot whose assignment of the source version set identifier equals the assignment of the version set, recording the selected snapshot as a source volume of the version set, and recording the rest snapshots as target volumes;

and step 23, when the assignment of the source copy identifier of the source volume is determined to be equal to the assignment of the copy identifier of the data block, marking the source volume as a unique element of the snapshot set, and generating the snapshot set.

6. The method for quickly locating a reference to a data block in a snapshot system as claimed in claim 5, wherein in step 2, the method specifically includes:

step 24, when it is determined that the assignment of the source copy identifier of the source volume is not equal to the assignment of the copy identifier of the data block, marking the source volume as an element of the snapshot set, selecting, from the target volumes, all target volumes whose assignment of the target copy identifier is greater than or equal to the assignment of the copy identifier of the data block, and marking the selected target volumes as elements of the snapshot set;

and step 25, generating the snapshot set according to the marked elements and the snapshot taking the elements as snapshot sources.

7. A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements a method for quickly locating a reference to a data block in a snapshot system as claimed in any one of claims 1 to 6.