WO2020038186A1 - Data migration method and apparatus, and storage device

Info

Publication number: WO2020038186A1
Application number: PCT/CN2019/098061
Authority: WO
Inventors: 贾胜迁
Original assignee: 华为技术有限公司
Priority date: 2018-08-24
Filing date: 2019-07-27
Publication date: 2020-02-27
Also published as: CN109324758A

Abstract

Provided are a data migration method and apparatus and a storage device. First data is stored in a first storage address, a first virtual address is recorded in metadata of the first data, and there is a first mapping relationship between the first storage address and the first virtual address. The method comprises: the storage device migrating the first data from the first storage address to a second storage address; and the storage device adjusting the first mapping relationship to a second mapping relationship, wherein the second mapping relationship is a correlation between the second storage address and the first virtual address. According to the method, after the first data is migrated, a storage address in the metadata of the first data does not need to be modified, thereby greatly improving the metadata management efficiency.

Description

Data migration method, device and storage device

Technical field

The embodiments of the present application relate to the field of communications technologies, and in particular, to a data migration method, device, and storage device.

Background technique

Some electronic devices include storage media that must be erased before they are written, and such storage media include multiple storage areas. Data is written into a storage area in an append-write manner. After the storage addresses in the storage area have been filled in order, a garbage collection (GC) process may need to be performed on the storage area. The GC process addresses the following situation: after the storage addresses in the storage area have been filled in order, if part of the data in the storage area is deleted, the storage space freed by the deletion cannot be reused because the remaining data still occupies the storage area. The remaining data therefore needs to be migrated out of the storage area, and the storage area needs to be reset.

In the prior art, the storage address information of data is recorded through metadata, and the metadata is organized in a tree structure. After the above GC process is performed, it is necessary to start from the leaf node metadata corresponding to the migrated data and modify the metadata layer by layer.

However, the prior art methods lead to inefficient metadata management.

Summary of the Invention

The embodiments of the present application provide a data migration method, device, and storage device, which are used to solve the problem of low data management efficiency in the prior art.

A first aspect of the embodiments of the present application provides a data migration method. The method is applied to a storage device. The storage device includes a controller and at least one hard disk. The controller is configured to manage data stored in the hard disk. The first data is stored in a first storage address, and a first virtual address is recorded in the metadata of the first data. There is a first mapping relationship between the first storage address and the first virtual address. The method includes:

The storage device migrates the first data from the first storage address to a second storage address, and the storage device then adjusts the first mapping relationship to a second mapping relationship, where the second mapping relationship is the correspondence between the second storage address and the first virtual address.

In the method, the first virtual address is recorded in the metadata of the first data, and before the first data is migrated, the first mapping relationship between the first virtual address and the first storage address storing the first data is recorded in the storage device. After the first data is migrated from the first storage address to the second storage address, only the first mapping relationship needs to be modified to the second mapping relationship between the first virtual address and the second storage address. The metadata of the first data therefore does not need to be modified, and the first data can still be accessed correctly after the migration. This greatly reduces the number of metadata modifications while ensuring that the data is accessed correctly, which greatly improves metadata management efficiency and reduces the load on the storage system.

In a possible design, the storage address is determined by a storage area ID and a storage offset in the storage area.

In a possible design, the first virtual address is jointly identified by the identifier of the first virtual area and the first virtual offset.

In a possible design, the first mapping relationship includes a mapping relationship between the identifier of the first virtual area and the first storage area, and a mapping relationship between the first virtual offset and the first storage offset.

The second mapping relationship includes a mapping relationship between the identifier of the first virtual area and the second storage area, and a mapping relationship between the first virtual offset and the second storage offset.

In a possible design, after the storage device adjusts the first mapping relationship to the second mapping relationship, the following process may also be performed:

After the storage device migrates all the data in the storage area corresponding to the first storage address to the storage area corresponding to the second storage address, the storage device updates a migration identifier, which is used to identify the number of migrations of the first virtual area.

In a possible design, the above method further includes:

Assigning an identifier of a second virtual area to the second data to be stored and assigning a third storage area corresponding to a third storage address for storing the second data;

Assigning a third storage offset in a third storage area to the second data, and the third storage area and the third storage offset jointly determine the third storage address;

Determining a second virtual offset corresponding to the second data according to the third storage offset and the migration identifier;

Writing the second data into the third storage address, and recording the identifier of the second virtual area and the second virtual offset in metadata of the second data;

Establishing a third mapping relationship, where the third mapping relationship includes: a mapping relationship between the identifier of the second virtual area and the third storage area, and a mapping relationship between the second virtual offset and the third storage offset.

In this method, the virtual offset of the data to be stored is determined in combination with the migration identifier, so that after data is migrated, the virtual offset of newly written data does not duplicate the virtual offset of data that has already been written, which avoids errors when reading and writing data.
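The write flow above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the dictionaries, the `write` function, and the 1024-byte zone size are hypothetical, and the virtual offset is computed with the formula voffset = offset + gctimes * sizeof(zone) from the detailed description.

```python
ZONE_SIZE = 1024   # assumed size of every storage area, in bytes

zone_of = {}       # virtual area ID -> storage area
offset_of = {}     # (virtual area ID, virtual offset) -> storage offset
gctimes = {}       # migration identifier of each virtual area
disk = {}          # (storage area, storage offset) -> data (stand-in for the hard disk)


def write(data, vzone, zone, offset):
    """Write data at zone + offset and return the metadata to record for it."""
    count = gctimes.setdefault(vzone, 0)
    voffset = offset + count * ZONE_SIZE        # formula (1) of the description
    disk[(zone, offset)] = data
    zone_of[vzone] = zone                       # mapping: virtual area -> storage area
    offset_of[(vzone, voffset)] = offset        # mapping: virtual offset -> storage offset
    return {"vzone": vzone, "voffset": voffset}


# Data written before any migration: storage offset 512 gives virtual offset 512.
meta_old = write(b"old", "vzone2", "zone3", 512)

# Simulate a migration of the whole virtual area to zone4, where the old data
# happens to land at storage offset 0. Only the mappings change.
disk[("zone4", 0)] = disk.pop(("zone3", 512))
zone_of["vzone2"] = "zone4"
offset_of[("vzone2", 512)] = 0
gctimes["vzone2"] += 1

# New data written at storage offset 512 of zone4 gets virtual offset 1536,
# so it does not collide with the old data's virtual offset 512.
meta_new = write(b"new", "vzone2", "zone4", 512)
```

Without the migration identifier, both pieces of data would have been recorded with virtual offset 512, which is the duplication the method is designed to avoid.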

In a possible design, the method further includes:

Determining the first storage address storing the first data according to the first mapping relationship and a first virtual address recorded in metadata of the first data;

Reading the first data from the first storage address.

In a possible design, the determining of the first storage address storing the first data according to the first mapping relationship and the first virtual address recorded in the metadata of the first data includes:

Obtaining a first storage area ID corresponding to the first storage address according to a mapping relationship between the identifier of the first virtual area and the first storage area;

Obtaining a first storage offset corresponding to the first storage address according to the mapping relationship between the first virtual offset and the first storage offset;

The mapping relationship between the first virtual offset and the first storage offset is identified by a mapping table of the first virtual offset and the first storage offset, or is determined by a preset formula.
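As a rough illustration of the read steps above, the following Python sketch uses the mapping-table variant for both lookups. The table layouts, the key format, and the names `zone_table`, `offset_table`, `resolve`, and `read` are assumptions made for the example, not structures defined by the application.

```python
# Hypothetical mapping tables forming the first mapping relationship.
zone_table = {"vzone1": "zone1"}              # virtual area ID -> storage area
offset_table = {("vzone1", 4096): 4096}       # (virtual area ID, virtual offset) -> storage offset

# Stand-in for the hard disk: (storage area, storage offset) -> data.
disk = {("zone1", 4096): b"first data"}

# Metadata of the first data records only the virtual address.
metadata = {"vzone": "vzone1", "voffset": 4096}


def resolve(meta):
    """Translate the virtual address in the metadata into a storage address."""
    zone = zone_table[meta["vzone"]]
    offset = offset_table[(meta["vzone"], meta["voffset"])]
    return zone, offset


def read(meta):
    """Read the data stored at the resolved storage address."""
    return disk[resolve(meta)]


first_data = read(metadata)
```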

A second aspect of the embodiments of the present application provides a data migration device, which is applied to a storage device, and the device may be a controller in the storage device or a part of the controller. The first data is stored in a first storage address, a first virtual address is recorded in the metadata of the first data, and there is a first mapping relationship between the first storage address and the first virtual address.

This device can implement the functions in the first aspect described above. These functions can be realized by hardware, and can also be implemented by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functions described above.

In a possible design, the device may include a processing module, which may perform the corresponding functions in the above method. For example, the processing module is configured to migrate the first data from the first storage address to the second storage address, and to adjust the first mapping relationship to a second mapping relationship, where the second mapping relationship is a correspondence between the second storage address and the first virtual address.

A third aspect of the embodiments of the present application provides a storage device. The storage device includes a controller and at least one hard disk. The controller is configured to manage data stored in the hard disk. First data is stored at a first storage address, a first virtual address is recorded in the metadata of the first data, and there is a first mapping relationship between the first storage address and the first virtual address. The controller is further configured to execute the method according to the first aspect.

A fourth aspect of the embodiments of the present application provides a computer program product, where the computer program product includes computer program code, and when the computer program code is executed by a computer, causes the computer to execute the method described in the first aspect.

A fifth aspect of the embodiments of the present application provides a computer-readable storage medium. The computer storage medium stores computer instructions. When the computer instructions are executed by a computer, the computer is caused to execute the method described in the first aspect.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows the storage structure of the file BB.doc in the tree structure of the file system;

FIG. 2 is a schematic flowchart of an embodiment of a data migration method according to an embodiment of the present application;

Figure 3 shows the storage structure of the file BB.doc in the tree structure of the file system;

FIG. 4 is a schematic flowchart of writing second data in a storage device;

FIG. 5 is a schematic flowchart of reading the first data from a storage device;

FIG. 6 is a schematic block diagram of a data migration device 600 according to an embodiment of the present application;

FIG. 7 is a physical block diagram of a storage device 700 according to an embodiment of the present application.

Detailed description

The storage device can organize files through a file system, and the file system can be regarded as a method and a data structure of the storage device's operating system for specifying the files on the storage device. In the file system, data can be divided into data files and metadata, where the data file is a file stored on a storage device. Metadata can be regarded as data describing data. The metadata can hold information such as the owner, size, permissions, timestamp, link, and storage address of the data on the disk of the storage device. The file system can be managed in a tree structure.

Assume that a file BB.doc is stored in the file system and that the storage location of the file BB.doc is "/AA/BB.doc". Figure 1 then shows the storage structure of the file BB.doc in the tree structure of the file system. As shown in Figure 1, the first layer of the tree structure, the root, is the storage address of the root directory "/". The second layer of the tree structure is the content stored at the storage address of the root directory, which includes the metadata of "AA" and information about other files or directories included in the root directory. According to the metadata of "AA" recorded in the second layer, the storage address of "AA" can be obtained. The third layer of the tree structure is the content stored at the storage address of "AA", which includes the metadata of "BB.doc" and information about other files or directories contained in the "AA" directory. According to the metadata of "BB.doc" recorded in the third layer, the storage address of "BB.doc" can be obtained. The fourth layer of the tree structure is the content stored at the storage address of "BB.doc", which is the data corresponding to the file "BB.doc".

It can be known from the above storage structure that the storage address of a file in the file system is recorded by metadata, and is recorded layer by layer according to a tree structure.
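The layer-by-layer lookup can be illustrated with a small Python sketch. The address strings such as "addr_root" and the `resolve` helper are hypothetical; the sketch only models a directory's content as a table from names to the storage addresses recorded in their metadata.

```python
# Hypothetical model of the tree for "/AA/BB.doc". Each storage address maps
# to the content stored there; a directory's content maps each entry name to
# the storage address recorded in that entry's metadata.
storage = {
    "addr_root": {"AA": "addr_AA"},      # layer 2: content of the root directory "/"
    "addr_AA": {"BB.doc": "addr_BB"},    # layer 3: content of the directory "AA"
    "addr_BB": b"file contents",         # layer 4: data of the file "BB.doc"
}


def resolve(path, root_addr="addr_root"):
    """Walk the tree layer by layer and return the storage address of path."""
    addr = root_addr                     # layer 1: storage address of "/"
    for name in path.strip("/").split("/"):
        addr = storage[addr][name]       # follow the address recorded in the metadata
    return addr


bb_addr = resolve("/AA/BB.doc")
```

Because each layer stores the address of the next one, a change to the address of "BB.doc" has to be written into the content of "AA", which is what makes migration costly in the prior art.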

For different types of hard disks, the storage address may be expressed differently. In the following, shingled magnetic recording (SMR) and solid-state drive (SSD) are used as examples. SMR is sometimes also called SMR disk or SMR hard disk.

In the SMR hard disk, the storage address of the data is represented by "zone + offset". When necessary (such as when reading data), you can also use the length parameter to describe the length of the data.

The SMR hard disk is composed of multiple zones, and each zone can be regarded as a storage area in the SMR hard disk. The offset describes a specific position within the zone; for example, (zone3, 48KB) describes a position 48KB away from the starting position of zone3. In theory, the offset can be any value greater than 0, as long as it does not exceed the length of the zone. In actual products, to facilitate data management, the offset can use an integer multiple of 48 KB as its granularity.

In the SSD, the storage address of the data is expressed in the form of "page group + offset". The SSD can include multiple pages, and these pages can be divided into multiple page groups. Each page group can be regarded as a storage area within the SSD, and each page group can be divided into multiple offsets. The offset can be in units of bytes or in units of pages. Exemplarily, if the offset is in units of pages, an offset can be expressed as "beginning at the second page and ending at the third page" of a certain page group; alternatively, the offset only describes "starting from the second page" of a certain page group, and the end page is described with an additional length parameter.
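A "zone + offset" address with a 48 KB granularity might be modeled as in the following Python sketch. The `StorageAddress` type, the `make_address` helper, and the 256 MiB zone length are assumptions for illustration; the text does not specify a zone length. The sketch also accepts an offset of 0 to denote the start of the zone, which is a simplification.

```python
from dataclasses import dataclass

GRANULARITY = 48 * 1024          # 48 KB granularity from the example in the text
ZONE_SIZE = 256 * 1024 * 1024    # assumed zone length; not given in the text


@dataclass(frozen=True)
class StorageAddress:
    zone: str      # storage area, for example "zone3"
    offset: int    # distance in bytes from the starting position of the zone


def make_address(zone, offset):
    """Build a "zone + offset" address, rejecting offsets that fall outside
    the zone or that are not a multiple of the granularity."""
    if not 0 <= offset < ZONE_SIZE:
        raise ValueError("offset lies outside the zone")
    if offset % GRANULARITY != 0:
        raise ValueError("offset is not a multiple of 48 KB")
    return StorageAddress(zone, offset)


addr = make_address("zone3", 48 * 1024)   # the (zone3, 48KB) example
```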

As can be seen from the above introduction of SMR hard disks and SSDs, both types of media can use "zone + offset" to describe the storage location of data.

Taking the above SMR hard disk as an example, assume that the file "BB.doc" is stored on the SMR hard disk of a storage device, and the storage address of the file on the SMR hard disk is "zone1 + offset1"; that is, the data of the file is stored at offset1 of zone1 on the SMR hard disk (in other words, offset1 in zone1 is the starting position for storing BB.doc). The storage path of the file in the file system of the storage device is the path described above with reference to FIG. 1. In the prior art, in the tree structure of the file system, the storage address "zone1 + offset1" is recorded in the third layer, the storage address of "AA" is recorded in the second layer, and the storage address of "/" is recorded in the first layer.

On this basis, if GC is performed on zone1, where "BB.doc" is located, the valid data of zone1 is migrated. Assuming that the valid data of zone1 is migrated to zone2 and "BB.doc" is migrated to offset2 of zone2, the storage address of "BB.doc" changes from "zone1 + offset1" before migration to "zone2 + offset2". In the prior art, the storage address of "BB.doc" recorded in the third layer needs to be modified first, changing its value from "zone1 + offset1" to "zone2 + offset2". Then the storage address of "AA" recorded in the second layer and the storage address of "/" recorded in the first layer need to be modified layer by layer. A large amount of data may be stored in zone1, and after this data is migrated, the metadata of all of it needs to be modified.

Therefore, using the above method, when the hard disk performs GC and the storage location of the data files in the file system changes, multiple metadata modifications need to be performed, resulting in low data management efficiency of the hard disk.

The technical solutions in the embodiments of the present application can solve the foregoing problems.

It should be noted that the "storage address" described in the embodiments of the present application refers to the address used to identify the data storage location at the operating system level. In a specific implementation, the "storage address" needs to be further converted by the underlying software below the operating system to obtain the actual physical address on the hard disk. For the specific conversion process, refer to methods in the prior art; it is not specifically limited in the embodiments of the present application. The storage address can be the zone + offset mentioned above.

FIG. 2 is a schematic flowchart of an embodiment of a data migration method provided by an embodiment of the present application. The method is executed by the foregoing storage device, which may be a device having a data storage function, such as a desktop computer, a notebook computer, a server, or a storage controller plus hard disks. The storage device includes a controller and at least one hard disk, and the controller is configured to manage data stored in the hard disk. For non-dedicated storage devices, a processor (or a combination of a processor and other auxiliary devices, such as a combination of a processor and a memory) can be understood as a storage controller from a functional perspective.

For ease of understanding, this embodiment uses the first data as an object to describe a data migration process. The first data may be any data in a hard disk of the storage device. Exemplarily, the first data may correspond to a file managed by the file system.

Before data migration is performed (for example, data migration caused by GC, by tiered storage of hot and cold data, or by other reasons), the first data is stored at a first storage address on a hard disk, and the storage address recorded in the metadata of the first data is a first virtual address. There is a one-to-one correspondence between the first storage address and the first virtual address, which for convenience of description is called the first mapping relationship.

Optionally, the first virtual address is not a storage address but a virtualized storage address, that is, an identifier recorded in the metadata of the first data. The first virtual address and the first storage address have a first mapping relationship, so the first storage address of the first data can be obtained through the first virtual address recorded in the metadata and the first mapping relationship. In the embodiment of the present application, a virtual address (virtualized storage address) is introduced, and the storage address recorded in the metadata is replaced with the virtual address. Therefore, when the storage address of the data changes, it is only necessary to update the mapping relationship between the virtual address and the storage address, and the virtual address in the metadata can remain unchanged. A large number of metadata updates caused by data migration are thus avoided, the metadata management efficiency is improved, and the load on the storage system is reduced.

As shown in Figure 2, the method includes:

S201. The storage device migrates the first data from the first storage address to the second storage address.

Optionally, before this step, the storage device has determined that the storage area where the first data is located needs to be GC. Further, in this step, the storage device migrates the first data from the first storage address to the second storage address.

The above-mentioned "migration" process may specifically be:

First, the first data is written into the second storage address. Optionally, the first data stored on the first storage address may be subsequently deleted.

S202. The storage device adjusts the first mapping relationship to a second mapping relationship, where the second mapping relationship is a correspondence relationship between the second storage address and the first virtual address.

Optionally, when the first data is saved to a hard disk, the storage device may allocate a first virtual address to the first data and record the first virtual address in the metadata of the first data. After the first data is migrated from the first storage address to the second storage address, the storage device does not need to modify the first virtual address; it only needs to modify the mapping relationship corresponding to the first data, that is, modify the first mapping relationship to the second mapping relationship. Before the migration, the first storage address of the first data can be obtained through the first virtual address recorded in the metadata and the first mapping relationship. After the migration, the storage address of the first data can still be obtained through the first virtual address recorded in the metadata and the modified second mapping relationship. Throughout this process, the metadata of the first data is not modified.

It should be noted that this mapping relationship (for example, the first mapping relationship and the second mapping relationship) may be recorded in the metadata or recorded separately in another location, as long as it can be read by the processor or the storage controller.

Optionally, the first mapping relationship and the second mapping relationship may be stored in the form of a mapping table. Exemplarily, a mapping table of data may be stored in the storage device. For the above-mentioned first data, the first mapping relationship is recorded in the mapping table before data migration, and the second mapping relationship is recorded in the mapping table after data migration. The following Table 1 is an example of the mapping table before the first data is migrated, and the following Table 2 is an example of the mapping table after the first data is migrated.
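Steps S201 and S202 can be sketched together in Python as shown below. The dictionaries standing in for the hard disk, the mapping relationship, and the metadata, as well as the `migrate` and `read` helpers, are hypothetical names chosen for this illustration.

```python
# Hypothetical stand-ins for the hard disk, the mapping relationship, and the
# metadata of the first data.
disk = {"first_storage_address": b"first data"}
mapping = {"first_virtual_address": "first_storage_address"}   # first mapping relationship
metadata = {"virtual_address": "first_virtual_address"}


def migrate(virtual_address, new_storage_address):
    """Move the data behind virtual_address and adjust only the mapping."""
    old_storage_address = mapping[virtual_address]
    disk[new_storage_address] = disk[old_storage_address]   # S201: write to the new address
    mapping[virtual_address] = new_storage_address          # S202: second mapping relationship
    del disk[old_storage_address]                           # optional later deletion


def read(meta):
    """Resolve the virtual address in the metadata and read the data."""
    return disk[mapping[meta["virtual_address"]]]


metadata_before = dict(metadata)
migrate("first_virtual_address", "second_storage_address")
data_after = read(metadata)
```

The metadata dictionary is never written during `migrate`, yet `read` still returns the first data, which mirrors the property described above.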

Table 1

Virtual address	Storage address
First virtual address	First storage address

Table 2

Virtual address	Storage address
First virtual address	Second storage address

In the specific implementation process, during the operation of the storage device, the above mapping table may be generated in the memory of the storage device. As the data on the hard disk of the storage device reads and writes, the storage device updates and maintains the above mapping table. When the storage device needs to stop running, for example, when it needs to be shut down, the storage device can save the above mapping table to the hard disk of the storage device. When the storage device is restarted next time, the mapping table stored in the hard disk is loaded into the memory, and then reading and writing are continued based on the mapping table, and the mapping table is updated and maintained.
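The save-on-shutdown and load-on-restart behavior could look like the following Python sketch. The JSON file format, the file name, and the `save_mapping` and `load_mapping` helpers are assumptions; the application does not say how the table is serialized. Writing to a temporary file and then replacing the target is a design choice of this sketch, intended to avoid leaving a half-written table behind.

```python
import json
import os
import tempfile


def save_mapping(table, path):
    """Persist the in-memory mapping table, for example before shutdown."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(table, f)
    os.replace(tmp_path, path)    # swap the complete file into place


def load_mapping(path):
    """Load the mapping table into memory at startup; empty if none was saved."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as directory:
    table_path = os.path.join(directory, "mapping_table.json")   # hypothetical location
    table = {"first_virtual_address": "second_storage_address"}
    save_mapping(table, table_path)
    restored = load_mapping(table_path)
    missing = load_mapping(os.path.join(directory, "absent.json"))
```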

In this embodiment, the first virtual address is recorded in the metadata of the first data, and before the first data is migrated, the first mapping relationship between the first virtual address and the first storage address storing the first data is recorded in the storage device. After the first data is migrated from the first storage address to the second storage address, it is only necessary to modify the first mapping relationship to the second mapping relationship between the first virtual address and the second storage address. The metadata of the first data therefore does not need to be modified, and the first data can still be accessed correctly after the migration. This embodiment thus greatly reduces the number of metadata modifications while ensuring that the data is accessed correctly, which greatly improves metadata management efficiency and reduces the load on the storage system.

For different types of hard disks, the storage address may be expressed differently.

In one mode, for the type of hard disk represented by the above-mentioned SMR hard disk and SSD, the storage address can be regarded as being expressed in a "storage area + storage offset" manner. A hard disk of this type is divided into multiple storage areas, and each storage area can include multiple storage offsets. For example, in an SMR hard disk, a zone represents a storage area, and an offset within each zone represents a storage offset within the zone. As another example, in the SSD, a page group represents a storage area, and an offset within each page group represents a storage offset within the page group.

For this type of hard disk, the storage address of data on the hard disk can be determined through the storage area ID and the storage offset in the storage area. Such hard disks write data sequentially: new data cannot directly overwrite old data, and the space occupied by old data needs to be released through garbage collection before it can hold new data.

Exemplarily, assuming that the storage device has learned that a certain data D is stored at offset1 of zone1, the storage address of the data D can be expressed as zone1 + offset1; that is, the storage address of data D is jointly determined by the storage area ID zone1 and the storage offset offset1. When reading the data D, the storage device can address the storage area where the data D is located through the identifier zone1, and then address the storage location of the data D through the offset offset1. Starting from this storage location, the data D can be read according to its length. The length can be recorded in the metadata of the data D and/or carried in the read request for the data D.
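Reading data D by storage area, storage offset, and length can be sketched in Python as follows. The in-memory `zones` buffer, the concrete offset 16, and the `read_data` helper are hypothetical and stand in for real device I/O.

```python
# Hypothetical zone contents: zone1 is a 64-byte buffer with data D stored
# at storage offset 16 (playing the role of offset1).
zones = {"zone1": bytearray(64)}
zones["zone1"][16:21] = b"hello"


def read_data(zone_id, offset, length):
    """Address the zone, seek to the offset, and read length bytes."""
    zone = zones[zone_id]                        # address the storage area
    return bytes(zone[offset:offset + length])   # read from the storage location


# The length comes from the metadata of data D or from the read request.
data_d = read_data("zone1", 16, 5)
```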

Further, optionally, the virtual address of the data may be jointly identified by the identifier of the virtual area and the virtual offset.

Specifically, for the above-mentioned first data, the first virtual address of the first data may be jointly identified by the identifier of the first virtual area and the first virtual offset.

Exemplarily, assuming that data D is stored in the storage device, the virtual address recorded in the metadata of the data D may be vzone1 + voffset1, where vzone1 represents the identifier of the virtual area of data D, and voffset1 represents the virtual offset of data D.

It should be noted that the above-mentioned way of representing the virtual address by "vzone1 + voffset1" is highly similar to the way of representing the storage address "zone1 + offset1", and therefore has advantages such as being easily understood by device managers. In fact, virtual addresses can also take other forms. For example, you can use hexadecimal numbers, Roman letters, English letters to describe virtual addresses, or a combination of multiple symbols to describe virtual addresses, as long as it can indicate that "zone + offset" has a corresponding relationship.

For example, the first 48KB of zone1 is represented by the string AA1, the second 48KB of zone1 is represented by the string AA2, the third 48KB of zone1 is represented by the string AA3, and so on.

Optionally, if the storage address of the data in the storage device is jointly determined by the storage area ID and the storage offset in the storage area, and the virtual address of the data is collectively identified by the identification of the virtual area and the virtual offset, the corresponding mapping of the data The relationship can be expressed by the mapping relationship between the identification of the virtual area and the storage area, and the mapping relationship between the virtual offset and the storage offset.

Specifically, for the first data, before the first data is migrated, the first mapping relationship corresponding to the first data may include:

The mapping relationship between the identifier of the first virtual area and the first storage area, and the mapping relationship between the first virtual offset and the first storage offset.

After the first data is migrated, the mapping relationship corresponding to the first data is modified to a second mapping relationship, and the second mapping relationship may include:

The mapping relationship between the identifier of the first virtual area and the second storage area, and the mapping relationship between the first virtual offset and the second storage offset.

The following describes an optional implementation manner of the above mapping relationship.

As mentioned above, the mapping relationship corresponding to the data can be represented by the mapping relationship between the identification of the virtual area and the storage area, and the mapping relationship between the virtual offset and the storage offset. The following describes optional implementations of these two mapping relationships in turn.

1. Mapping relationship between the identification of the virtual area and the storage area

Optionally, the mapping relationship between the identification of the virtual area and the storage area may be saved in the form of a mapping table.

For a piece of data in a storage device, when the data is written to the hard disk of the storage device for the first time, the storage device assigns a virtual area identifier to the data. The virtual area identifier may be one that already exists in the mapping table or the identifier of a new virtual area. If it is an existing virtual area identifier, the storage device may directly use the mapping relationship of the existing virtual area identifier. If it is the identifier of a new virtual area, the storage device adds a mapping relationship between the identifier of the new virtual area and the storage area to the mapping table according to the storage area allocated for the data. After the data is migrated, the storage device modifies the storage area corresponding to the identifier of the virtual area in the above mapping table to the storage area after migration. When the data needs to be read, the storage device queries the above mapping table to obtain the storage area of the data.

An example is described below.

Assume that there is data D to be stored in the storage device. The identifier of the virtual area allocated by the storage device to the data D is vzone1, and the storage area allocated to the data D is zone1. After the data D is saved, the above mapping table may be as shown in Table 3 below.

Table 3

Identification of the virtual area	Storage area
vzone1	zone1

After data D is migrated from zone1 to zone2, the above mapping table is modified into the form shown in Table 4 below.

Table 4

Identification of the virtual area	Storage area
vzone1	zone2
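The maintenance of this mapping table, from the first write through the migration, can be sketched in Python. The `zone_map` dictionary and the two helper functions are hypothetical names; the sketch reproduces the states shown in Table 3 and Table 4.

```python
zone_map = {}   # identification of the virtual area -> storage area


def assign_virtual_zone(vzone, zone):
    """On first write, reuse an existing entry or add a new one."""
    if vzone not in zone_map:
        zone_map[vzone] = zone       # new virtual area: add its mapping
    return zone_map[vzone]           # existing virtual area: use its mapping


def remap_after_migration(vzone, new_zone):
    """After migration, point the virtual area at the new storage area."""
    zone_map[vzone] = new_zone


assign_virtual_zone("vzone1", "zone1")
table_3 = dict(zone_map)             # state after data D is saved

remap_after_migration("vzone1", "zone2")
table_4 = dict(zone_map)             # state after data D is migrated to zone2
```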

2. Mapping relationship between the virtual offset and the storage offset

In an optional manner, the mapping relationship between the virtual offset and the storage offset may be determined by a preset formula.

Optionally, after the storage device allocates a virtual area, a storage area, and a storage offset for a piece of data to be stored, a virtual offset corresponding to the storage offset may be directly calculated by using a preset first formula. Furthermore, when the data needs to be read, a storage offset corresponding to the virtual offset may be calculated by a preset second formula, and further, data is read from a position corresponding to the storage offset.

In one example, the first formula may be the following formula (1), and the second formula may be the following formula (2).

voffset = offset + gctimes * sizeof(zone)    (1)

Here, offset is the storage offset allocated by the storage device for the data to be stored, zone is the ID of the storage area allocated by the storage device for the data to be stored, sizeof(zone) denotes the size of that storage area, and gctimes is the number of migrations of the virtual area allocated by the storage device for the data to be stored.

Optionally, in the storage device, each virtual area allocated by the storage device may correspond to a migration identifier, which identifies the number of migrations of that virtual area. For example, the migration identifier corresponding to the first virtual area identifies the number of migrations of the first virtual area. The initial value of the migration identifier can be 0. Taking the first virtual area as an example, when the storage device has migrated all the data in the storage area corresponding to the first storage address to the storage area corresponding to the second storage address, the migration of the first virtual area is complete, and the storage device can update the migration identifier. For example, the storage device can add 1 to the original value of the migration identifier and use the result as the new migration identifier.

In this example, because the value of the migration identifier is included in formula (1), the virtual offsets of data written after a migration do not overlap with the virtual offsets of data written before it, which avoids errors when reading and writing data.

offset = voffset % sizeof(zone)    (2)

Here, voffset is the virtual offset of the data, and zone is the ID of the storage area allocated by the storage device for the data.

Before the data is read, the storage area of the data may first be looked up according to the identifier of the virtual area of the data; formula (2) is then used to determine the storage offset of the data.
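Formulas (1) and (2) can be exercised with a short sketch. The zone size of 256 MB is an assumed value chosen only for illustration, and the function names are hypothetical; the sketch also assumes that a storage offset always lies inside its storage area.

```python
ZONE_SIZE = 256 * 1024 * 1024  # assumed sizeof(zone), in bytes

def to_virtual_offset(offset, gctimes, zone_size=ZONE_SIZE):
    """Formula (1): voffset = offset + gctimes * sizeof(zone)."""
    if not 0 <= offset < zone_size:
        raise ValueError("storage offset must lie inside the storage area")
    return offset + gctimes * zone_size

def to_storage_offset(voffset, zone_size=ZONE_SIZE):
    """Formula (2): offset = voffset % sizeof(zone)."""
    return voffset % zone_size

# The same storage offset, allocated before and after one migration
# of the virtual area, yields two different virtual offsets.
v_old = to_virtual_offset(48 * 1024, gctimes=0)
v_new = to_virtual_offset(48 * 1024, gctimes=1)
```

Because each value of gctimes shifts the virtual offset by a whole zone size, virtual offsets produced under different migration counts fall into disjoint ranges, while formula (2) still recovers the original storage offset from either of them.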

In another optional manner, the mapping relationship between the virtual offset and the storage offset may be saved in the form of a mapping table.

For a piece of data in a storage device, when the data is written to the hard disk of the storage device for the first time, the storage device allocates a virtual area identifier, a storage area, and a storage offset for the data. The storage device then determines the virtual offset and saves the correspondence between the storage offset and the virtual offset in a mapping table. After the data is migrated, the storage device changes the storage offset recorded for the virtual offset in the mapping table to the storage offset after migration. When the data needs to be read, the storage device can query the mapping table to obtain the storage offset of the data.

An example is described below.

Assume that the storage device has data D to be stored, the storage area allocated by the storage device for the data D is zone1, the virtual offset is voffset1, and the storage offset allocated for the data D is offset1. After the data D is saved, the above mapping table may be as shown in Table 5 below.

Table 5

Virtual offset	Storage offset
voffset1	offset1

After that, the data D is migrated from offset1 of zone1 to offset2 of zone2, and the above mapping table is modified into the form shown in Table 6 below.

Table 6

Virtual offset	Storage offset
voffset1	offset2

It should be noted that in the specific implementation process, the above two optional methods may be implemented separately or in combination.

The combined implementation process is:

When data is first written to the storage device, the storage device can determine the virtual offset corresponding to the storage offset of the data according to formula (1) above. At this point, the correspondence between the storage offset and the virtual offset can be regarded as represented by formula (1), and there is no need to record it in the storage device. After the data is migrated, the storage offset corresponding to the virtual offset of the data changes, and the correspondence between the virtual offset of the data and the migrated storage offset can then be written into the mapping table. When the data needs to be read, the storage device may first determine whether a mapping table of virtual offsets and storage offsets currently exists and, if so, whether it contains the virtual offset of the data. If it does, the storage offset recorded for that virtual offset is the storage offset of the data. If the mapping table does not contain the virtual offset of the data, or no such mapping table exists, the storage offset corresponding to the virtual offset of the data can be determined according to formula (2) above; in this case, the correspondence between the storage offset and the virtual offset can be regarded as represented by formula (2).
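The combined lookup order described above (mapping table first, formula as fallback) might look like the following sketch. Keying the table by the pair of virtual area identifier and virtual offset is an assumption made here for illustration, as are the zone size and all names.

```python
ZONE_SIZE = 1024 * 1024  # assumed sizeof(zone), in bytes

# Only migrated data gets an entry (hypothetical layout):
# (virtual area identifier, virtual offset) -> storage offset after migration.
offset_overrides = {}

def record_migration(vzone_id, voffset, new_offset):
    """After migration, remember where the data now lives."""
    offset_overrides[(vzone_id, voffset)] = new_offset

def resolve_offset(vzone_id, voffset):
    """Prefer a table entry; otherwise fall back to formula (2)."""
    key = (vzone_id, voffset)
    if key in offset_overrides:
        return offset_overrides[key]
    return voffset % ZONE_SIZE

first = resolve_offset("vzone1", 4096)     # never migrated: formula (2)
record_migration("vzone1", 4096, 0)        # data moved to offset 0
second = resolve_offset("vzone1", 4096)    # now served from the table
untouched = resolve_offset("vzone1", 8192) # other data still uses formula (2)
```

One consequence of this design is that the table only grows with migrated data, so data that has never been migrated costs no table space.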

The optional representations of the storage address and the virtual address of the data and the optional representations of the mapping relationship between the storage address and the virtual address are explained above. In a specific implementation process, the storage device may store data according to the foregoing optional representation manner, and establish a mapping relationship corresponding to the data.

The following continues to use the example in the corresponding description of FIG. 1 to describe data storage and the establishment of a mapping relationship in the embodiment of the present application.

Assume that a file BB.doc is saved in the file system and that its storage location is “/AA/BB.doc”. FIG. 3 shows the storage structure of the file BB.doc in the tree structure of the file system. As shown in FIG. 3, the first layer of the tree structure, that is, the root, records the identifier of the virtual area and the virtual offset of the root directory. The mapping relationship between the identifier of the virtual area and the storage area is stored in the first mapping table, and the mapping relationship between the virtual offset and the storage offset is stored in the second mapping table or expressed by a formula. Through these two mapping relationships, the storage device can obtain the storage area and storage offset of the root directory "/". The content stored at the storage address determined by that storage area and storage offset is the content of the second layer of the tree structure, which includes the identifier of the virtual area and the virtual offset of "AA". Again, the mapping relationship between the identifier of the virtual area and the storage area is stored in the first mapping table, and the mapping relationship between the virtual offset and the storage offset is stored in the second mapping table or expressed by a formula. Through these two mapping relationships, the storage device can obtain the storage area and storage offset of "AA". The content stored at the storage address determined by that storage area and storage offset is the content of the third layer of the tree structure, which includes the identifier of the virtual area and the virtual offset of "BB.doc". The storage device continues to follow the mapping relationships and finally obtains the storage address of "BB.doc".

In the above example, the metadata recorded in each layer of the tree structure includes the identifier of the virtual area and the virtual offset, rather than the storage area ID and storage offset of the data. Therefore, after the data is migrated, the metadata in each layer of the tree structure does not need to be modified; only the corresponding mapping relationship needs to be modified.
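The layer-by-layer resolution of "/AA/BB.doc" can be illustrated with a small simulation. Everything here is an assumption made for the sketch: the zone size, the dictionary standing in for the hard disk, the virtual area names, and the choice to keep the same storage offset when the file is moved.

```python
ZONE_SIZE = 1024  # assumed sizeof(zone)

# First mapping table: virtual area identifier -> storage area.
zone_map = {"vz_root": "zone1", "vz_dir": "zone1", "vz_file": "zone2"}

# Simulated media: (storage area, storage offset) -> stored content.
# A directory's content maps child names to the child's virtual address.
media = {
    ("zone1", 0): {"AA": ("vz_dir", 64)},
    ("zone1", 64): {"BB.doc": ("vz_file", 128)},
    ("zone2", 128): b"file bytes",
}

def to_storage_address(vaddr):
    """Table lookup for the area, formula (2) for the offset."""
    vzone_id, voffset = vaddr
    return zone_map[vzone_id], voffset % ZONE_SIZE

def resolve(path, root_vaddr):
    """Walk the tree, translating each layer's virtual address."""
    vaddr = root_vaddr
    for name in [part for part in path.split("/") if part]:
        directory = media[to_storage_address(vaddr)]
        vaddr = directory[name]
    return to_storage_address(vaddr)

addr_before = resolve("/AA/BB.doc", ("vz_root", 0))

# Migrate the file to zone3: move the bytes and adjust one table entry.
media[("zone3", 128)] = media.pop(("zone2", 128))
zone_map["vz_file"] = "zone3"
addr_after = resolve("/AA/BB.doc", ("vz_root", 0))
```

The directory entries for "AA" and "BB.doc" are never rewritten in this sketch; the lookup reaches the new location purely through the changed mapping.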

The following describes the process of reading and writing data from the storage device under the above-mentioned optional representation mode.

FIG. 4 is a schematic flowchart of writing second data in a storage device. As shown in FIG. 4, the writing process includes:

S401. Allocate an identifier of a second virtual area to the second data to be stored, and allocate a third storage area corresponding to a third storage address for storing the second data.

After determining that the second data is to be stored, the storage device first allocates a virtual area identifier and a storage area for the second data, that is, the identifier of the second virtual area and the third storage area.

Optionally, the storage device may allocate a virtual area identifier and a storage area for the second data according to a preset policy.

S402. Allocate a third storage offset in the third storage area for the second data, and the third storage area and the third storage offset determine the third storage address together.

After the identifier of the virtual area and the storage area are allocated for the second data, the storage device further allocates a storage offset for the second data, that is, a third storage offset. Optionally, the storage device may allocate a storage offset for the second data starting from the first free storage offset of the allocated storage area.

It should be noted that, in a specific implementation, pieces of data differ in size, so one piece of data may occupy one or more storage offsets. The storage offset described in the embodiments of the present application can therefore be regarded as the starting position of the data in the storage area. The storage device needs to combine this starting position with the length of the data to obtain the complete storage range of the data; the length can also be stored in the metadata of the data.

For example, the storage address allocated by the storage device for the second data is storage area zone2 plus storage offset offset2. Assuming that the value of offset2 is 48KB, the starting position of the second data is 48KB from the starting position of zone2. The second data also has length information, for example, a length of 20KB. When the second data is read, its storage area ID (zone2) and storage offset (48KB) are obtained through the mapping relationship between the virtual address and the storage address, which gives the starting position of the second data; reading 20KB from that position then yields the complete content of the second data.

S403. Determine a second virtual offset corresponding to the second data according to the third storage offset and the migration identifier.

The specific process can refer to the corresponding description of the above formula (1), which will not be repeated here.

S404. Write the second data into the third storage address, and record an identifier of the second virtual area and the second virtual offset in metadata of the second data.

After determining the third storage area and the third storage offset for storing the second data, the storage device has determined the storage address of the second data, that is, the third storage address. In this step, the storage device writes the second data into the third storage address determined by the third storage area and the third storage offset.

Optionally, the storage device starts from the starting position indicated by the third storage offset, and writes the second data into the starting position and the subsequent storage positions in a sequential writing manner.

Furthermore, the storage device records the identifier of the virtual area allocated for the second data and the determined second virtual offset in the metadata of the second data. Optionally, the storage device further writes length information of the second data into metadata of the second data.

S405. Establish a third mapping relationship, where the third mapping relationship includes: the mapping relationship between the identifier of the second virtual area and the third storage area, and the mapping relationship between the second virtual offset and the third storage offset.

Optionally, the storage device may establish the mapping relationship between the identifier of the second virtual area and the third storage area according to the method described for Table 3 above, and establish the mapping relationship between the second virtual offset and the third storage offset according to the method described for formula (1) or Table 5 above. Details are not repeated here.
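Steps S401 to S405 can be strung together in a short sketch. It assumes that the allocation policy of S401 is applied by the caller, that each storage area is an in-memory byte array of an assumed size, and that the offset mapping is expressed by formula (1) rather than by a table; the class and field names are hypothetical.

```python
from dataclasses import dataclass, field

ZONE_SIZE = 1024 * 1024  # assumed sizeof(zone), in bytes

@dataclass
class Metadata:
    vzone_id: str   # identifier of the virtual area
    voffset: int    # virtual offset
    length: int     # length information of the data

@dataclass
class Store:
    zone_map: dict = field(default_factory=dict)   # vzone id -> zone id
    gctimes: dict = field(default_factory=dict)    # vzone id -> migration count
    next_free: dict = field(default_factory=dict)  # zone id -> first free offset
    media: dict = field(default_factory=dict)      # zone id -> bytearray

    def write(self, data: bytes, vzone_id: str, zone_id: str) -> Metadata:
        # S401: virtual area and storage area were chosen by the caller's policy.
        zone = self.media.setdefault(zone_id, bytearray(ZONE_SIZE))
        # S402: allocate the first free storage offset (append-only writing).
        offset = self.next_free.get(zone_id, 0)
        if offset + len(data) > ZONE_SIZE:
            raise ValueError("storage area is full")
        # S403: formula (1), using the migration identifier of the virtual area.
        voffset = offset + self.gctimes.setdefault(vzone_id, 0) * ZONE_SIZE
        # S404: write sequentially from the starting position.
        zone[offset:offset + len(data)] = data
        self.next_free[zone_id] = offset + len(data)
        # S405: record the area mapping; the offset mapping is implied by formula (1).
        self.zone_map[vzone_id] = zone_id
        return Metadata(vzone_id, voffset, len(data))

store = Store()
m1 = store.write(b"hello", "vzone2", "zone3")
m2 = store.write(b"world!", "vzone2", "zone3")
```

The returned `Metadata` holds only the virtual address and the length, which is what S404 records in the metadata of the second data.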

FIG. 5 is a schematic flowchart of reading the first data from a storage device. As shown in FIG. 5, the process of reading the first data includes:

S501. Determine the first storage address storing the first data according to the first mapping relationship and a first virtual address recorded in metadata of the first data.

Optionally, when the foregoing first storage address is expressed in the "storage area + storage offset" manner, the first storage address includes a first storage area ID and a first storage offset, and the first virtual address includes the identifier of the first virtual area and the first virtual offset. Furthermore, according to the foregoing embodiments, the first mapping relationship corresponding to the first data includes the mapping relationship between the identifier of the first virtual area and the first storage area, and the mapping relationship between the first virtual offset and the first storage offset.

Further, in this step, the identifier of the first virtual area and the first virtual offset corresponding to the first data may first be obtained from the metadata of the first data. Then, the first storage area ID corresponding to the first storage address is obtained according to the mapping relationship between the identifier of the first virtual area and the first storage area, and the first storage offset corresponding to the first storage address is obtained according to the mapping relationship between the first virtual offset and the first storage offset.

As an example, the storage device may obtain the first storage area ID corresponding to the first storage address by reading the mapping table.

As an example, the storage device may obtain the first storage offset corresponding to the first storage address by using the foregoing formula (2).

S502. Read the first data from the first storage address.

After obtaining the first storage area and the first storage offset of the first data, the storage device has the starting storage position of the first data. The storage device can then use the length information of the first data to read the complete first data.

For example, assume that the first storage area of the first data obtained in step S501 is zone1, the first storage offset is 48KB, and the length of the first data recorded in the metadata of the first data is 20KB. The storage device first addresses the position within zone1 that is 48KB from the starting position of zone1, which is the starting position of the first data. It then reads 20KB sequentially from this position; the data read is the complete content of the first data.
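The read path of S501 and S502 with the 48KB and 20KB figures from this example can be sketched as follows. The zone size, the byte array standing in for zone1, the payload contents, and the dictionary form of the metadata are all assumptions made for illustration.

```python
ZONE_SIZE = 1024 * 1024  # assumed sizeof(zone), in bytes
KB = 1024

zone_map = {"vzone1": "zone1"}           # virtual area -> storage area
media = {"zone1": bytearray(ZONE_SIZE)}  # simulated storage area

# Place 20KB of recognizable data at 48KB from the start of zone1.
payload = bytes(range(256)) * 80
media["zone1"][48 * KB:48 * KB + len(payload)] = payload

# Metadata of the first data: virtual address plus length.
metadata = {"vzone_id": "vzone1", "voffset": 48 * KB, "length": 20 * KB}

def read(meta):
    zone_id = zone_map[meta["vzone_id"]]   # S501: mapping table lookup
    offset = meta["voffset"] % ZONE_SIZE   # S501: formula (2)
    end = offset + meta["length"]
    return bytes(media[zone_id][offset:end])  # S502: sequential read

data = read(metadata)
```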

FIG. 6 is a schematic block diagram of a data migration apparatus 600 according to an embodiment of the present application. The apparatus is applied to a storage device. Optionally, the device may be a controller of the storage device, or be a part of the controller of the storage device. The first data is stored in a first storage address, a first virtual address is recorded in the metadata of the first data, and there is a first mapping relationship between the first storage address and the first virtual address. As shown in FIG. 6, the apparatus 600 includes: a processing module 601.

The processing module 601 is configured to migrate the first data from the first storage address to the second storage address, and to adjust the first mapping relationship to a second mapping relationship, where the second mapping relationship is a correspondence between the second storage address and the first virtual address.

This device is used to implement the foregoing method embodiments, and its implementation principles and technical effects are similar, and details are not described herein again.
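As a rough illustration of what such a processing module does, the sketch below migrates the still-valid data of one virtual area into a new storage area, adjusts the mapping relationships, and updates the migration identifier. The compaction order, the tiny zone size, the offset table keyed by virtual area and virtual offset, and all names are assumptions of this sketch rather than details taken from the application.

```python
ZONE_SIZE = 64  # deliberately small assumed sizeof(zone)

zone_map = {"vzone1": "zone1"}  # virtual area -> storage area
gctimes = {"vzone1": 0}         # migration identifier per virtual area
offset_table = {}               # (vzone id, voffset) -> migrated storage offset
media = {"zone1": bytearray(ZONE_SIZE), "zone2": bytearray(ZONE_SIZE)}

def storage_address(vzone_id, voffset):
    """Table entry if the data was migrated, otherwise formula (2)."""
    offset = offset_table.get((vzone_id, voffset), voffset % ZONE_SIZE)
    return zone_map[vzone_id], offset

def migrate(vzone_id, live, dst_zone):
    """Copy live (voffset, length) items to dst_zone, then adjust the mappings."""
    src_zone = zone_map[vzone_id]
    cursor, moves = 0, {}
    for voffset, length in live:
        _, src_off = storage_address(vzone_id, voffset)
        media[dst_zone][cursor:cursor + length] = media[src_zone][src_off:src_off + length]
        moves[(vzone_id, voffset)] = cursor
        cursor += length
    offset_table.update(moves)     # second mapping: voffset -> new storage offset
    zone_map[vzone_id] = dst_zone  # second mapping: virtual area -> new storage area
    gctimes[vzone_id] += 1         # update the migration identifier

media["zone1"][0:4] = b"dead"      # deleted data, not migrated
media["zone1"][4:8] = b"live"      # valid data at voffset 4
migrate("vzone1", [(4, 4)], "zone2")

moved_to = storage_address("vzone1", 4)
# Data written after the migration at storage offset 4 gets a new virtual offset.
new_voffset = 4 + gctimes["vzone1"] * ZONE_SIZE
new_addr = storage_address("vzone1", new_voffset)
```

The virtual address (vzone1, 4) recorded in the metadata is unchanged by the migration, and the later write at storage offset 4 receives virtual offset 68, so the two do not collide.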

In an optional implementation manner, the storage address is jointly determined by a storage area ID and a storage offset in the storage area.

In an optional implementation manner, the first virtual address is jointly identified by an identifier of the first virtual area and a first virtual offset.

In an optional implementation manner, the first mapping relationship includes a mapping relationship between the identifier of the first virtual area and the first storage area, and a mapping relationship between the first virtual offset and the first storage offset.

In an optional implementation manner, the processing module 601 is further configured to:

After all data in the storage area corresponding to the first storage address is migrated to the storage area corresponding to the second storage address, a migration identifier is updated, and the migration identifier is used to identify the number of migration times of the first virtual area.

An identifier of a second virtual area is allocated to the second data to be stored, and a third storage area corresponding to a third storage address used to store the second data is allocated.

A third storage offset in a third storage area is allocated to the second data, and the third storage area and the third storage offset jointly determine the third storage address.

Determining a second virtual offset corresponding to the second data according to the third storage offset and the migration identifier.

Write the second data into the third storage address, and record the identifier of the second virtual area and the second virtual offset in metadata of the second data.

Determining the first storage address storing the first data according to the first mapping relationship and a first virtual address recorded in metadata of the first data.

Reading the first data from the first storage address.

In an optional implementation manner, the processing module 601 is specifically configured to:

According to the mapping relationship between the identifier of the first virtual area and the first storage area, a first storage area ID corresponding to the first storage address is obtained.

According to the mapping relationship between the first virtual offset and the first storage offset, a first storage offset corresponding to the first storage address is obtained.

FIG. 7 is a physical block diagram of a storage device 700 according to an embodiment of the present application. As shown in FIG. 7, the storage device includes a controller 701 and at least one hard disk 702. The controller 701 is configured to manage data stored in the hard disk 702. The first data is stored in a first storage address, a first virtual address is recorded in the metadata of the first data, and there is a first mapping relationship between the first storage address and the first virtual address. The controller 701 is also used for:

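Migrating the first data from the first storage address to the second storage address, and adjusting the first mapping relationship to a second mapping relationship, where the second mapping relationship is a correspondence between the second storage address and the first virtual address.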

In an optional implementation manner, the controller 701 is further configured to:

Reading the first data from the first storage address.

In an optional implementation manner, the controller 701 is specifically configured to:

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present invention are wholly or partially generated. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wirelessly (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a solid state disk (SSD)), or the like.

Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Therefore, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

This application is described with reference to the flowcharts and/or block diagrams of the method, apparatus (system), and computer program product according to the embodiments of the present application. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, where the instruction device implements the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of steps is performed on the computer or other programmable device to produce a computer-implemented process. The instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

Although the preferred embodiments of the present application have been described, those skilled in the art can make other changes and modifications to these embodiments once they know the basic inventive concepts. Therefore, the following claims are intended to be construed to include the preferred embodiments and all changes and modifications that fall within the scope of this application.

Obviously, those skilled in the art can make various modifications and variations to this application without departing from the spirit and scope of this application. In this way, if these modifications and variations of the present application fall within the scope of the claims of the present application and their equivalent technologies, the present application also intends to include these changes and variations.

Claims

A data migration method, applied to a storage device, wherein the storage device includes a controller and at least one hard disk, the controller is used to manage data stored in the hard disk, first data is stored at a first storage address, a first virtual address is recorded in the metadata of the first data, and there is a first mapping relationship between the first storage address and the first virtual address, the method including:

The storage device migrates the first data from the first storage address to a second storage address;

The storage device adjusts the first mapping relationship to a second mapping relationship, and the second mapping relationship is a correspondence relationship between the second storage address and the first virtual address.
The method according to claim 1, wherein the storage address is jointly determined by a storage area ID and a storage offset in the storage area.
The method according to claim 2, wherein the first virtual address is jointly identified by an identifier of the first virtual area and a first virtual offset.
The method according to claim 3, wherein:

The first mapping relationship includes a mapping relationship between an identifier of the first virtual area and a first storage area, and a mapping relationship between the first virtual offset and a first storage offset;

The second mapping relationship includes a mapping relationship between an identifier of the first virtual area and a second storage area, and a mapping relationship between the first virtual offset and a second storage offset.
The method according to claim 4, wherein after the storage device adjusts the first mapping relationship to a second mapping relationship, the method further comprises:

After the storage device migrates all data in the storage area corresponding to the first storage address to the storage area corresponding to the second storage address, the storage device updates a migration identifier, where the migration identifier is used to identify the number of migrations of the first virtual area.
The method according to claim 5, further comprising:

Assigning an identifier of a second virtual area to the second data to be stored and assigning a third storage area corresponding to a third storage address for storing the second data;

Assigning a third storage offset in a third storage area to the second data, and the third storage area and the third storage offset jointly determine the third storage address;

Determining a second virtual offset corresponding to the second data according to the third storage offset and the migration identifier;

Writing the second data into the third storage address, and recording the identifier of the second virtual area and the second virtual offset in metadata of the second data;

Establishing a third mapping relationship, where the third mapping relationship includes: a mapping relationship between an identifier of the second virtual area and a third storage area, and a mapping relationship between the second virtual offset and a third storage offset.
The method according to claim 5 or 6, further comprising:

Determining the first storage address storing the first data according to the first mapping relationship and a first virtual address recorded in metadata of the first data;

Reading the first data from the first storage address.
The method according to claim 7, wherein the determining the first storage address storing the first data according to the first mapping relationship and a first virtual address recorded in metadata of the first data includes:

Obtaining a first storage area ID corresponding to the first storage address according to a mapping relationship between the identifier of the first virtual area and the first storage area;

Obtaining a first storage offset corresponding to the first storage address according to the mapping relationship between the first virtual offset and the first storage offset;

The mapping relationship between the first virtual offset and the first storage offset is identified by a mapping table of the first virtual offset and the first storage offset, or is determined by a preset formula.
A data migration device, applied to a storage device, wherein first data is stored at a first storage address, a first virtual address is recorded in metadata of the first data, and there is a first mapping relationship between the first storage address and the first virtual address, the device including a processing module;

The processing module is configured to migrate the first data from the first storage address to a second storage address; and

Adjusting the first mapping relationship to a second mapping relationship, where the second mapping relationship is a correspondence between the second storage address and the first virtual address.
The device according to claim 9, wherein the storage address is jointly determined by a storage area ID and a storage offset in the storage area.
The device according to claim 10, wherein the first virtual address is jointly identified by an identifier of the first virtual area and a first virtual offset.
The device according to claim 11, wherein:

The first mapping relationship includes a mapping relationship between an identifier of the first virtual area and a first storage area, and a mapping relationship between the first virtual offset and a first storage offset;

The second mapping relationship includes a mapping relationship between an identifier of the first virtual area and a second storage area, and a mapping relationship between the first virtual offset and a second storage offset.
The apparatus according to claim 12, wherein the processing module is further configured to:

After all the data in the storage area corresponding to the first storage address is migrated to the storage area corresponding to the second storage address, the migration identifier is updated, and the migration identifier is used to identify the number of migration times of the first virtual area.
The apparatus according to claim 13, wherein the processing module is further configured to:

Assigning an identifier of a second virtual area to the second data to be stored and assigning a third storage area corresponding to a third storage address for storing the second data;

Assigning a third storage offset in a third storage area to the second data, and the third storage area and the third storage offset jointly determine the third storage address;

Determining a second virtual offset corresponding to the second data according to the third storage offset and the migration identifier;

Writing the second data into the third storage address, and recording the identifier of the second virtual area and the second virtual offset in metadata of the second data;

Establishing a third mapping relationship, where the third mapping relationship includes: a mapping relationship between the identifier of the second virtual area and the third storage area, and a mapping relationship between the second virtual offset and the third storage offset.
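The write steps above can be sketched as follows. The claims do not state how the second virtual offset is derived from the third storage offset and the migration identifier, so the formula in `virtual_offset` is purely an assumption chosen for illustration, as are `AREA_CAPACITY`, the class `Writer`, and its field names.

```python
AREA_CAPACITY = 1024  # hypothetical number of slots in one storage area


def virtual_offset(storage_offset, migration_id, capacity=AREA_CAPACITY):
    # Assumed formula (not taken from the claims): shifting by the migration
    # count keeps newly assigned virtual offsets distinct from the virtual
    # offsets still held by data written before earlier migrations.
    return migration_id * capacity + storage_offset


class Writer:
    """Hypothetical append-only write path."""

    def __init__(self):
        self.areas = {}         # storage area ID -> {storage offset: data}
        self.area_map = {}      # virtual area identifier -> storage area ID
        self.offset_map = {}    # (virtual area identifier, virtual offset) -> storage offset
        self.migration_id = {}  # virtual area identifier -> number of migrations
        self.next_offset = {}   # storage area ID -> next free storage offset

    def write(self, virt_area, storage_area, data):
        # Map the virtual area identifier to the allocated storage area.
        self.area_map[virt_area] = storage_area
        # Allocate the next storage offset in append order.
        storage_off = self.next_offset.get(storage_area, 0)
        self.next_offset[storage_area] = storage_off + 1
        # Derive the virtual offset from the storage offset and migration count.
        virt_off = virtual_offset(storage_off, self.migration_id.get(virt_area, 0))
        # Write the data and establish the offset mapping.
        self.areas.setdefault(storage_area, {})[storage_off] = data
        self.offset_map[(virt_area, virt_off)] = storage_off
        # The returned dictionary stands in for the metadata of the data.
        return {"virtual_area": virt_area, "virtual_offset": virt_off}
```

Only the virtual address is recorded in the returned metadata, so a later migration does not require the metadata to be rewritten.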
The apparatus according to claim 13 or 14, wherein the processing module is further configured to:

Determining the first storage address storing the first data according to the first mapping relationship and a first virtual address recorded in metadata of the first data;

Reading the first data from the first storage address.
The apparatus according to claim 15, wherein the processing module is specifically configured to:

Obtaining a first storage area ID corresponding to the first storage address according to a mapping relationship between the identifier of the first virtual area and the first storage area;

Obtaining a first storage offset corresponding to the first storage address according to the mapping relationship between the first virtual offset and the first storage offset;

The mapping relationship between the first virtual offset and the first storage offset is identified by a mapping table of the first virtual offset and the first storage offset, or is determined by a preset formula.
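The read path above, with the offset mapping given either as a table or as a preset formula, can be sketched as follows. The function names `locate` and `read`, the tuple layout of the metadata, and the example formula in the usage note are hypothetical and serve only to illustrate the two options.

```python
from typing import Callable, Dict, Tuple, Union

# The offset mapping is either a lookup table or a preset formula.
OffsetMapping = Union[Dict[int, int], Callable[[int], int]]


def locate(metadata: Tuple[str, int],
           area_map: Dict[str, int],
           offset_mapping: OffsetMapping) -> Tuple[int, int]:
    """Resolve the virtual address in metadata to (storage area ID, storage offset)."""
    virt_area, virt_off = metadata
    storage_area = area_map[virt_area]
    if callable(offset_mapping):
        storage_off = offset_mapping(virt_off)   # preset formula
    else:
        storage_off = offset_mapping[virt_off]   # mapping table
    return storage_area, storage_off


def read(metadata, area_map, offset_mapping, areas):
    """Read the data stored at the resolved storage address."""
    storage_area, storage_off = locate(metadata, area_map, offset_mapping)
    return areas[storage_area][storage_off]
```

For example, with `areas = {5: {0: b"a", 1: b"b"}}` and `area_map = {"v1": 5}`, the table `{10: 0, 11: 1}` and the assumed formula `lambda v: v - 10` both resolve the virtual address `("v1", 11)` to storage area 5, storage offset 1.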
A storage device, comprising a controller and at least one hard disk, where the controller is configured to manage data stored in the hard disk, characterized in that first data is stored at a first storage address, a first virtual address is recorded in metadata of the first data, and there is a first mapping relationship between the first storage address and the first virtual address; and the controller is further configured to execute the method according to any one of claims 1-8.
A computer program product, wherein the computer program product includes computer program code, and when the computer program code is executed by a computer, the computer is caused to execute the method according to any one of claims 1-8.
A computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions, and when the computer instructions are executed by a computer, the computer is caused to execute the method according to any one of claims 1-8.