WO2016106757A1

WO2016106757A1 - Method for managing storage data, storage manager and storage system

Info

Publication number: WO2016106757A1
Application number: PCT/CN2014/096073
Authority: WO
Inventors: 李育国; 谯志华
Original assignee: 华为技术有限公司
Priority date: 2014-12-31
Filing date: 2014-12-31
Publication date: 2016-07-07
Also published as: CN106462491B; CN106462491A

Abstract

A method for managing stored data, a storage manager, and a storage system. The method comprises: each time a data storage request is received, allocating m storage blocks for the data to be stored, wherein each storage block represents a segment of virtual address space and is configured with a unique block number (S310); specifying n storage containers for the m storage blocks, wherein each storage container represents a segment of physical storage space on a storage device (S320); updating the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, wherein the correspondence between storage blocks and storage containers records which storage container accommodates each allocated storage block (S330); and recording the block numbers of the m storage blocks in the metadata of the data to be stored, wherein the block numbers of the m storage blocks serve as the virtual address of the data to be stored (S340). The method thereby improves the disk space utilization rate.

Description

Storage data management method, storage manager and storage system

Technical field

The present invention relates to the field of computer technologies, and in particular, to a storage data management method, a storage manager, and a storage system.

Background art

A file system generally records the location of data as follows: it records the virtual address of the data and maps the virtual address to a physical address through a mapping table. This approach is logically simple, and the upper layer does not need to understand the underlying layout.

After a system has been running for some time and has undergone multiple rounds of space reclamation, disk fragmentation becomes pronounced and defragmentation is required. The defragmentation process depends on the layout of the file system. Considering space reclamation and data locality, a file system usually stores data/files in storage blocks (a storage block is the smallest or most basic unit in which the file system reads and writes data; different file systems may name it differently, for example basic data chunk or data block) and organizes multiple storage blocks into a storage container (Container, also referred to as a data segment, Segment).

FIG. 1A is a schematic diagram of the CAT index of a prior-art file system that records data location information using a container address translation table (CAT). The scheme combines the number of the storage container and the number of the storage block in which the data is located (<CTID, CKID>) as the virtual address of the data, and maps the CT (short for Container) to a physical address through the CAT. Taking the data <CTID1, CKID2> as an example, the virtual address <CTID1, CKID2> indicates that the data is stored in storage block CK2 in storage container CT1. By querying the CAT, the physical address PA1 of storage container CT1 can be obtained. Each storage container records metadata describing the size, check code, position within the CT, and other information of each CK in that container. After the physical address PA1 of CT1 is obtained, the physical address of the data <CTID1, CKID2> can be determined by querying the metadata in CT1.
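The two-level lookup described for FIG. 1A can be illustrated with a minimal Python sketch. The addresses, chunk sizes, and the names `ChunkMeta`, `cat`, `container_metadata`, and `resolve` are hypothetical and chosen only for illustration; the prior-art system is not stated to be implemented this way.

```python
from dataclasses import dataclass


@dataclass
class ChunkMeta:
    offset: int  # position of the CK inside its container, in bytes
    size: int    # size of the CK, in bytes


# Hypothetical physical addresses of two containers.
PA1, PA2 = 0x1000, 0x9000

# CAT: container number -> physical address of the container.
cat = {"CTID1": PA1, "CTID2": PA2}

# Metadata stored inside each container, keyed here by the container's
# physical address: block number -> location of the block in the container.
container_metadata = {
    PA1: {"CKID1": ChunkMeta(0, 512), "CKID2": ChunkMeta(512, 256)},
    PA2: {"CKID4": ChunkMeta(0, 128), "CKID6": ChunkMeta(128, 128)},
}


def resolve(ctid, ckid):
    """Translate the virtual address <CTID, CKID> into (physical address, size)."""
    base = cat[ctid]                       # first level: CAT lookup
    meta = container_metadata[base][ckid]  # second level: container metadata
    return base + meta.offset, meta.size
```

In this sketch the container number is part of every virtual address, so a block can only be found through the single CAT entry of the container it was originally written to.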

FIG. 1B is a schematic diagram of the CAT index structure after defragmentation in the prior art. After defragmentation, CK4 and CK6 in the original CT2 have been migrated to CT1, and the index entry of CT2 in the CAT is modified to the physical address PA1 of CT1. However, because the prior art uses <CTID, CKID> as the virtual address of data, defragmentation can only migrate the storage blocks of one CT together into a single other CT. As shown in FIG. 1C, if CK4 in CT2 were migrated to CT1 and CK6 in CT2 were migrated to CT3, then, because the physical addresses of CT1 and CT3 differ, the physical address of CT2 in the CAT could no longer be mapped.

Therefore, in the prior art, the way the file system manages stored data makes defragmentation inflexible and inefficient, and disk space utilization after defragmentation remains low.
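The limitation illustrated by FIG. 1B and FIG. 1C can be sketched as follows. The addresses and the helper `remap_ct2` are hypothetical; the sketch only shows that one CAT entry can hold one physical address, so it cannot describe blocks that were scattered over several target containers.

```python
# Hypothetical physical addresses of the target containers CT1 and CT3.
PA1, PA3 = 0x1000, 0x3000

# Where the blocks that used to live in CT2 end up after defragmentation.
whole_migration = {"CK4": PA1, "CK6": PA1}  # FIG. 1B: both blocks go to CT1
split_migration = {"CK4": PA1, "CK6": PA3}  # FIG. 1C: blocks go to CT1 and CT3


def remap_ct2(new_locations):
    """Return the single physical address that the CAT entry of CT2 can be
    rewritten to, or None if the blocks no longer share one container."""
    targets = set(new_locations.values())
    return targets.pop() if len(targets) == 1 else None
```

With `whole_migration` the CAT entry of CT2 can be redirected to PA1; with `split_migration` no single address is valid for both <CTID2, CKID4> and <CTID2, CKID6>.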

Summary of the invention

In view of this, it is necessary to provide a method for managing stored data, a storage manager, and a storage system, so as to improve the flexibility of defragmentation.

In a first aspect, an embodiment of the present invention provides a method for managing stored data, including:

Each time a data save request is received, allocating m storage blocks for the data to be saved, where each storage block represents a segment of virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1;

Specifying n storage containers for the m storage blocks, wherein each storage container represents a piece of physical storage space on the storage device, and n is a natural number greater than or equal to 1;

Updating the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, where the correspondence between storage blocks and storage containers records the correspondence between each allocated storage block and the storage container that accommodates it;

Recording the block numbers of the m storage blocks in the metadata of the file in which the data to be saved is located, where the block numbers of the m storage blocks are used as the virtual address of the data to be saved.
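The four steps of the first aspect can be summarized in a minimal Python sketch. The fixed container capacity, the incrementing block counter, the container naming scheme, and the names `block_to_container`, `file_metadata`, and `save` are assumptions made for illustration; the method itself does not prescribe them.

```python
import itertools

CONTAINER_CAPACITY = 4            # blocks per container (assumed)
_next_block = itertools.count(1)  # source of unique block numbers (assumed)

block_to_container = {}  # correspondence: block number -> container identifier
file_metadata = {}       # file name -> block numbers used as its virtual address


def save(file_name, m):
    """Handle one data save request that needs m storage blocks."""
    # Step 1: allocate m storage blocks, each with a unique block number.
    blocks = [next(_next_block) for _ in range(m)]
    # Steps 2 and 3: specify a container for every block and update the
    # block-to-container correspondence. Containers are filled in order.
    for block_no in blocks:
        container_id = f"CT{(block_no - 1) // CONTAINER_CAPACITY + 1}"
        block_to_container[block_no] = container_id
    # Step 4: record the block numbers in the metadata of the file.
    file_metadata.setdefault(file_name, []).extend(blocks)
    return blocks
```

Note that the recorded virtual address contains only block numbers; the container identifier appears solely in `block_to_container`, which is what later allows blocks to move between containers without touching file metadata.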

With reference to the first aspect, in a first possible implementation, each storage container is configured with a unique identifier, and the identifier of each storage container is used to indicate the physical address corresponding to that storage container.

With reference to the first possible implementation of the first aspect, in a second possible implementation, the correspondence between storage blocks and storage containers includes multiple indexes, where each index represents a pointer for all storage blocks assigned to the same storage container, the key of each index is a representative value of the block numbers of all storage blocks accommodated by that storage container, and the value of each index is the identifier of that storage container.

With reference to the second possible implementation of the first aspect, in a third possible implementation, the block numbers of the m storage blocks allocated each time are configured to be linearly incremented, and the minimum value of the block numbers of the m storage blocks is greater than the maximum value of the block numbers of the storage blocks configured for the previously saved data, or the maximum value of the block numbers of the m storage blocks is less than the minimum value of the block numbers of the storage blocks configured for the data to be saved next time;

The representative value in each index is the smallest block number of the storage block accommodated by the same storage container.
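Because block numbers increase monotonically and each index key is the smallest block number in its container, a lookup reduces to finding the greatest key that does not exceed the queried block number. The following sketch assumes the index is kept as two parallel sorted lists; the sample keys and the name `container_of` are hypothetical.

```python
import bisect

# One index per container: key = smallest block number held by the container,
# value = container identifier. Keys are kept in ascending order.
index_keys = [1, 5, 9]
index_values = ["CT1", "CT2", "CT3"]


def container_of(block_no):
    """Return the identifier of the container that accommodates block_no."""
    pos = bisect.bisect_right(index_keys, block_no) - 1
    if pos < 0:
        raise KeyError(f"block {block_no} precedes every index key")
    return index_values[pos]
```

One index entry covers every block of a container, so the index grows with the number of containers rather than with the number of blocks. The sketch does not check whether a block number was actually allocated.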

With reference to the third possible implementation of the first aspect, in a fourth possible implementation, specifying the n storage containers for the m storage blocks, and updating the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, includes:

a. obtaining the currently working storage container, and assigning the m storage blocks one by one to the currently working storage container in order from the smallest block number to the largest;

b. determining whether the currently working storage container was a free storage container before it accommodated the storage block with the smallest block number among the m storage blocks, and if so, adding an index to the correspondence between storage blocks and storage containers, where the key of the added index is the smallest block number of the m storage blocks and the value of the added index is the identifier of the currently working storage container;

c. when the currently working storage container becomes full, obtaining an updated working storage container, and assigning the remaining storage blocks of the m storage blocks one by one to the updated working storage container in order from the smallest block number to the largest, where the updated working storage container is a free storage container;

d. adding another index to the correspondence between storage blocks and storage containers, where the key of the newly added index is the smallest block number of the remaining storage blocks of the m storage blocks, and the value of the newly added index is the identifier of the updated working storage container;

e. when the updated working storage container becomes full, returning to step c, until all of the m storage blocks have been assigned to the n storage containers.
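Steps a to e can be sketched as a small allocator. The class name `Allocator`, the fixed per-container capacity, and the `CT<n>` identifiers are assumptions for illustration; in particular, the text does not say how a free container is obtained, so the sketch simply takes the next identifier.

```python
class Allocator:
    """Assigns incrementing block numbers to containers and maintains the
    index of (smallest block number, container identifier) pairs."""

    def __init__(self, capacity):
        self.capacity = capacity  # blocks per container (assumed fixed)
        self.index = []           # ascending list of (key, container identifier)
        self.current_id = 1       # number of the currently working container
        self.used = 0             # blocks already held by the working container
        self.next_block = 1       # next unallocated block number

    def save(self, m):
        """Allocate m blocks and assign them in ascending block-number order."""
        blocks = list(range(self.next_block, self.next_block + m))
        self.next_block += m
        for block_no in blocks:
            if self.used == self.capacity:
                # Steps c and e: the working container is full, so switch to
                # a free container.
                self.current_id += 1
                self.used = 0
            if self.used == 0:
                # Steps b and d: the working container was free, so add an
                # index whose key is the smallest block number it will hold.
                self.index.append((block_no, f"CT{self.current_id}"))
            self.used += 1  # step a: the block is now in the working container
        return blocks
```

A request that fits into a partially filled working container adds no index at all, which is why the number of indexes tracks the number of containers in use.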

With reference to the second possible implementation of the first aspect, in a fifth possible implementation, the block numbers of the m storage blocks allocated each time are configured to be linearly decremented, and the minimum value of the block numbers of the m storage blocks is greater than the maximum value of the block numbers of the storage blocks configured for the data to be saved next time, or the maximum value of the block numbers of the m storage blocks is less than the minimum value of the block numbers of the storage blocks configured for the previously saved data;

The representative value in each index is the largest block number of the storage block accommodated in the same storage container.
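When block numbers are decremented and each key is the largest block number in its container, the lookup mirrors the incrementing case: the container is the one with the smallest key that is not less than the queried block number. The sample keys and the name `container_of_desc` below are hypothetical.

```python
import bisect

# Keys are the largest block number held by each container, stored in
# ascending order so that the standard bisect functions can be used.
# Assumed layout: CT1 holds 100..97, CT2 holds 96..93, CT3 holds 92..89.
index_keys = [92, 96, 100]
index_values = ["CT3", "CT2", "CT1"]


def container_of_desc(block_no):
    """Return the container whose key is the smallest key >= block_no."""
    pos = bisect.bisect_left(index_keys, block_no)
    if pos == len(index_keys):
        raise KeyError(f"block {block_no} exceeds every index key")
    return index_values[pos]
```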

With reference to the fifth possible implementation of the first aspect, in a sixth possible implementation, specifying the n storage containers for the m storage blocks, and updating the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, includes:

a. obtaining the currently working storage container, and assigning the m storage blocks one by one to the currently working storage container in order from the largest block number to the smallest;

b. determining whether the currently working storage container was a free storage container before it accommodated the storage block with the largest block number among the m storage blocks, and if so, adding an index to the correspondence between storage blocks and storage containers, where the key of the added index is the largest block number of the m storage blocks and the value of the added index is the identifier of the currently working storage container;

c. when the currently working storage container becomes full, obtaining an updated working storage container, and assigning the remaining storage blocks of the m storage blocks one by one to the updated working storage container in order from the largest block number to the smallest, where the updated working storage container is a free storage container;

d. adding another index to the correspondence between storage blocks and storage containers, where the key of the newly added index is the largest block number of the remaining storage blocks of the m storage blocks, and the value of the newly added index is the identifier of the updated working storage container;

With reference to any one of the third to sixth possible implementations of the first aspect, in a seventh possible implementation, the method further includes: receiving a defragmentation instruction and determining the storage containers to be defragmented, where the storage containers to be defragmented are the storage containers indicated by the correspondence between storage blocks and storage containers;

Scanning the storage containers to be defragmented, and obtaining the non-garbage storage blocks contained in each of them;

Assigning new storage containers to the non-garbage storage blocks, and updating the correspondence between storage blocks and storage containers, where the block numbers of the non-garbage storage blocks accommodated by each new storage container are linearly increasing or linearly decreasing, and the range of block numbers accommodated by each new storage container does not overlap the range of block numbers accommodated by any other new storage container.
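A minimal sketch of this defragmentation pass, for the incrementing case, is given below. The function name `defragment`, its parameters, the way garbage blocks are identified (a plain set), and the numbering of new containers are all assumptions for illustration.

```python
def defragment(containers, garbage, capacity, first_new_id):
    """Repack the live blocks of `containers` into new containers.

    containers:   dict of container identifier -> list of block numbers
    garbage:      set of block numbers that are no longer referenced
    capacity:     blocks per new container (assumed fixed)
    first_new_id: number used for the first new container (assumed scheme)
    Returns (new_containers, new_index), where new_index is an ascending list
    of (smallest block number, container identifier) pairs.
    """
    # Scan the containers and keep only the non-garbage blocks, ordered by
    # block number so that every new container holds one contiguous range.
    live = sorted(
        block_no
        for blocks in containers.values()
        for block_no in blocks
        if block_no not in garbage
    )
    new_containers, new_index = {}, []
    for start in range(0, len(live), capacity):
        container_id = f"CT{first_new_id + start // capacity}"
        chunk = live[start:start + capacity]
        new_containers[container_id] = chunk
        new_index.append((chunk[0], container_id))
    return new_containers, new_index
```

Since file metadata records only block numbers, this repacking changes the index but leaves every recorded virtual address untouched, and blocks from one old container may end up in different new containers.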

With reference to any one of the third to sixth possible implementations of the first aspect, in an eighth possible implementation, the method further includes:

Receiving a defragmentation instruction and determining the storage containers to be defragmented, where the storage containers to be defragmented are those storage containers, among all storage containers indicated by the correspondence between storage blocks and storage containers, that contain garbage storage blocks;

Scanning the storage containers to be defragmented, and obtaining the non-garbage storage blocks contained in logically adjacent storage containers to be defragmented, where logically adjacent means that the keys of the corresponding indexes in the correspondence between storage blocks and storage containers are adjacent to each other;

With reference to any one of the third to sixth possible implementations of the first aspect, in a ninth possible implementation, the method further includes:

After receiving a data read request, querying, according to the information about the data to be read that is carried in the data read request, the metadata of the file in which the data to be read is located, and obtaining the virtual address of the data to be read, where the virtual address of the data to be read includes the block numbers of p storage blocks, and p is a natural number greater than or equal to 1;

Querying the correspondence between storage blocks and storage containers according to the block numbers of the p storage blocks, and determining the q storage containers that accommodate the p storage blocks, where q is a natural number greater than or equal to 1;

Reading the metadata of the q storage containers and determining the physical address information of the p storage blocks, where the metadata of each storage container describes the information of all storage blocks in that container.
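The read path can be sketched end to end as follows, for the incrementing case. All sample data, the dictionary layouts, and the name `locate` are hypothetical; the container identifier is assumed to map to a base physical address, and the container metadata is assumed to store a per-block offset.

```python
import bisect

# File metadata: file name -> block numbers forming its virtual address.
file_metadata = {"report.txt": [3, 4, 5]}

# Index: smallest block number per container -> container identifier.
index_keys = [1, 5]
index_values = ["CT1", "CT2"]

# Container identifier -> physical address of the container (assumed values).
container_base = {"CT1": 0x1000, "CT2": 0x2000}

# Container metadata: block number -> offset of the block in the container.
container_meta = {
    "CT1": {1: 0x000, 2: 0x100, 3: 0x200, 4: 0x300},
    "CT2": {5: 0x000},
}


def locate(file_name):
    """Return (block number, physical address) for each block of the file."""
    result = []
    for block_no in file_metadata[file_name]:  # the p block numbers
        # Find the container with the greatest key <= block_no.
        pos = bisect.bisect_right(index_keys, block_no) - 1
        container_id = index_values[pos]
        offset = container_meta[container_id][block_no]
        result.append((block_no, container_base[container_id] + offset))
    return result
```

In this sample, p = 3 blocks are spread over q = 2 containers.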

In a second aspect, an embodiment of the present invention provides a storage manager applied to a storage system, where the storage system includes a storage device and the storage manager, the storage device includes a storage medium for providing a physical address space, and the storage manager is configured to receive a data save request triggered by an application and forward the data to be saved to the storage device for saving; the storage manager includes:

a storage block management module, configured to allocate m storage blocks for the data to be saved each time a data save request is received, where each storage block represents a segment of virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1;

a storage container management module, configured to specify n storage containers for the m storage blocks, where each storage container represents a segment of physical storage space on the storage device, and n is a natural number greater than or equal to 1;

a recording module, configured to update the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, where the correspondence between storage blocks and storage containers records the correspondence between each allocated storage block and the storage container that accommodates it; and

the recording module is further configured to record the block numbers of the m storage blocks in the metadata of the file in which the data to be saved is located, where the block numbers of the m storage blocks are used as the virtual address of the data to be saved.

With reference to the second aspect, in a first possible implementation, each storage container is configured with a unique identifier, and the identifier of each storage container is used to indicate the physical address corresponding to that storage container.

With reference to the first possible implementation of the second aspect, in a second possible implementation, the correspondence between storage blocks and storage containers recorded by the recording module includes multiple indexes, where each index represents a pointer for all storage blocks assigned to the same storage container, the key of each index is a representative value of the block numbers of the storage blocks accommodated by that storage container, and the value of each index is the identifier of that storage container.

With reference to the second possible implementation of the second aspect, in a third possible implementation, the block numbers of the m storage blocks allocated each time by the storage block management module are configured to be linearly incremented, and the minimum value of the block numbers of the m storage blocks is greater than the maximum value of the block numbers of the storage blocks configured for the previously saved data, or the maximum value of the block numbers of the m storage blocks is less than the minimum value of the block numbers of the storage blocks configured for the data to be saved next time;

The recording module is specifically configured to record the representative value in each index as a minimum block number of a storage block accommodated by the same storage container.

In conjunction with the third possible implementation of the second aspect, in a fourth possible implementation, the storage container management module is specifically configured to perform the following operations:

a. obtaining the currently working storage container, assigning the m storage blocks one by one to the currently working storage container in order from the smallest block number to the largest, and determining whether the currently working storage container was a free storage container before it accommodated the storage block with the smallest block number among the m storage blocks; if so, notifying the recording module to add an index to the correspondence between storage blocks and storage containers, where the key of the added index is the smallest block number of the m storage blocks and the value of the added index is the identifier of the currently working storage container;

b. when the currently working storage container becomes full, obtaining an updated working storage container, and assigning the remaining storage blocks of the m storage blocks one by one to the updated working storage container in order from the smallest block number to the largest, where the updated working storage container is a free storage container; and notifying the recording module again to add another index to the correspondence between storage blocks and storage containers, where the key of the newly added index is the smallest block number of the remaining storage blocks of the m storage blocks, and the value of the newly added index is the identifier of the updated working storage container;

c. when the updated working storage container becomes full, performing step b again, until all of the m storage blocks have been assigned to the n storage containers;

The recording module is specifically configured to perform an operation of adding an index in a correspondence between the storage block and the storage container when receiving a notification of adding an index sent by the storage container management module.

With reference to the second possible implementation of the second aspect, in a fifth possible implementation, the block numbers of the m storage blocks allocated each time by the storage block management module are configured to be linearly decremented, and the minimum value of the block numbers of the m storage blocks is greater than the maximum value of the block numbers of the storage blocks configured for the data to be saved next time, or the maximum value of the block numbers of the m storage blocks is less than the minimum value of the block numbers of the storage blocks configured for the previously saved data;

The recording module is specifically configured to record the representative value in each index as a maximum block number of a storage block accommodated by the same storage container.

With reference to the fifth possible implementation of the second aspect, in a sixth possible implementation, the storage container management module is specifically configured to perform the following operations:

a. obtaining the currently working storage container, assigning the m storage blocks one by one to the currently working storage container in order from the largest block number to the smallest, and determining whether the currently working storage container was a free storage container before it accommodated the storage block with the largest block number among the m storage blocks; if so, notifying the recording module to add an index to the correspondence between storage blocks and storage containers, where the key of the added index is the largest block number of the m storage blocks and the value of the added index is the identifier of the currently working storage container;

b. when the currently working storage container becomes full, obtaining an updated working storage container, and assigning the remaining storage blocks of the m storage blocks one by one to the updated working storage container in order from the largest block number to the smallest, where the updated working storage container is a free storage container; and notifying the recording module again to add another index to the correspondence between storage blocks and storage containers, where the key of the newly added index is the largest block number of the remaining storage blocks of the m storage blocks, and the value of the newly added index is the identifier of the updated working storage container;

With reference to any one of the third to sixth possible implementations of the second aspect, in a seventh possible implementation, the storage manager further includes a defragmentation module, and the defragmentation module is configured to:

Receive a defragmentation instruction and determine the storage containers to be defragmented, where the storage containers to be defragmented are the storage containers indicated by the correspondence between storage blocks and storage containers;

Assign new storage containers to the non-garbage storage blocks, and notify the recording module to update the correspondence between storage blocks and storage containers, where the block numbers of the non-garbage storage blocks accommodated by each new storage container are linearly increasing or linearly decreasing, and the range of block numbers accommodated by each new storage container does not overlap the range of block numbers accommodated by any other new storage container.

With reference to any one of the third to sixth possible implementations of the second aspect, in an eighth possible implementation, the storage manager further includes a defragmentation module, and the defragmentation module is configured to:

Receive a defragmentation instruction and determine the storage containers to be defragmented, where the storage containers to be defragmented are those storage containers, among all storage containers indicated by the correspondence between storage blocks and storage containers, that contain garbage storage blocks;

In a third aspect, an embodiment of the present invention provides a storage system, where the storage system includes a storage device and a storage manager.

The storage device includes a storage medium for providing a physical address space to save data;

The storage manager is configured to allocate m storage blocks for the data to be saved each time a data save request is received, where each storage block represents a segment of virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1;

Specifying n storage containers for the m storage blocks, where each storage container represents a piece of physical storage space on the storage device, and n is a natural number greater than or equal to 1;

Updating the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, where the correspondence between storage blocks and storage containers records the correspondence between each allocated storage block and the storage container that accommodates it;

With reference to the third aspect, in a first possible implementation, each storage container is configured with a unique identifier, and the identifier of each storage container is used to indicate the physical address corresponding to that storage container.

With reference to the first possible implementation of the third aspect, in a second possible implementation, the recording, by the storage manager, of the correspondence between storage blocks and storage containers includes:

The correspondence between storage blocks and storage containers recorded by the storage manager includes multiple indexes, where each index represents a pointer for all storage blocks assigned to the same storage container, the key of each index is a representative value of the block numbers of the storage blocks accommodated by that storage container, and the value of each index is the identifier of that storage container.

With reference to the second possible implementation of the third aspect, in a third possible implementation, the storage manager is specifically configured to: configure the block numbers of the m storage blocks allocated each time to be linearly incremented, and configure the minimum value of the block numbers of the m storage blocks to be greater than the maximum value of the block numbers of the storage blocks configured for the previously saved data, or configure the maximum value of the block numbers of the m storage blocks to be less than the minimum value of the block numbers of the storage blocks configured for the data to be saved next time;

The storage manager is specifically configured to record the representative value in each index as a minimum block number of a storage block accommodated by the same storage container.

In conjunction with the third possible implementation of the third aspect, in a fourth possible implementation, the storage manager is specifically configured to perform the following operations:

With reference to the second possible implementation of the third aspect, in a fifth possible implementation, the storage manager is specifically configured to: configure the block numbers of the m storage blocks allocated each time to be linearly decremented, and configure the minimum value of the block numbers of the m storage blocks to be greater than the maximum value of the block numbers of the storage blocks configured for the data to be saved next time, or configure the maximum value of the block numbers of the m storage blocks to be less than the minimum value of the block numbers of the storage blocks configured for the previously saved data;

The storage manager is specifically configured to record the representative value in each index as a maximum block number of a storage block accommodated by the same storage container.

With reference to the fifth possible implementation of the third aspect, in a sixth possible implementation, when specifying the n storage containers for the m storage blocks and recording the correspondence between storage blocks and storage containers, the storage manager is specifically configured to perform the following operations:

d. adding another index to the correspondence between storage blocks and storage containers, where the key of the newly added index is the largest block number of the remaining storage blocks of the m storage blocks, and the value of the newly added index is the identifier of the updated working storage container;

With reference to any one of the third to sixth possible implementations of the third aspect, in a seventh possible implementation, the storage manager is further configured to:

With reference to any one of the third to sixth possible implementations of the third aspect, in an eighth possible implementation, the storage manager is further configured to:

The storage manager is also used to:

After receiving a data read request, query, according to the information about the data to be read that is carried in the data read request, the metadata of the file in which the data to be read is located, and obtain the virtual address of the data to be read, where the virtual address of the data to be read includes the block numbers of p storage blocks, and p is a natural number greater than or equal to 1; query the correspondence between storage blocks and storage containers according to the block numbers of the p storage blocks, and determine the q storage containers that accommodate the p storage blocks, where q is a natural number greater than or equal to 1; and read the metadata of the q storage containers and determine the physical address information of the p storage blocks, where the metadata of each storage container describes the information of all storage blocks in that container.

In a fourth aspect, an embodiment of the present invention provides a storage manager, including:

An interface for interacting with a storage device, a processor, and a memory, the processor being coupled to the memory via a bus, and the processor interacting with the storage device through the interface;

The memory is configured to store computer-executable instructions; when the storage manager is running, the processor executes the computer-executable instructions stored in the memory, to cause the storage manager to perform the method for managing stored data provided by the first aspect or any possible implementation of the first aspect.

In a fifth aspect, an embodiment of the present invention provides a computer, including: a processor, a memory, a bus, and a communication interface;

The memory is configured to store computer-executable instructions, the processor is coupled to the memory via the bus, and when the computer is running, the processor executes the computer-executable instructions stored in the memory to cause the computer to perform the method for managing stored data provided by the first aspect or any possible implementation manner of the first aspect.

In a sixth aspect, an embodiment of the present invention provides a computer readable medium, including computer-executable instructions; when a processor of a computer executes the computer-executable instructions, the computer performs the method for managing stored data provided by the first aspect or any possible implementation manner of the first aspect.

In the embodiment of the present invention, each time data is to be saved, m storage blocks are allocated for the data to be saved, n storage containers are specified for the m storage blocks, the correspondence between the storage block and the storage container is updated according to the correspondence between the m storage blocks and the n storage containers, and the block numbers of the m storage blocks are recorded in the metadata of the data to be saved, the block numbers of the m storage blocks being used as the virtual address of the data to be saved. As a result, the virtual address of data in the system is independent of the storage container where the data is located, and the correspondence between the storage block and the storage container can be queried according to the block number of the storage block where the data is located, thereby obtaining information about the physical address of the data. With this management method, defragmentation does not need to migrate a storage container as a whole; the disk is defragmented directly at the granularity of the storage block, which improves the efficiency and flexibility of defragmentation and also increases disk space utilization.

DRAWINGS

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without creative effort.

FIG. 1A is a schematic diagram of the CAT index structure of a prior-art file system that uses a container address translation (CAT) table to record the location information of data;

FIG. 1B is a schematic diagram of the CAT index structure after disk defragmentation in a prior-art file system that uses CAT to record the location information of data;

FIG. 1C is a schematic diagram of the principle of disk defragmentation in a prior-art file system that uses CAT to record the location information of data;

FIG. 2A is a schematic structural diagram of a storage system according to an embodiment of the present invention;

FIG. 2B is a schematic diagram of an application scenario according to an embodiment of the present invention;

FIG. 3 is an exemplary flowchart of a method for managing stored data according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a CK2C (Chunk to Container) mapping table from storage blocks to storage containers created according to an embodiment of the present invention;

FIG. 5 is an exemplary flowchart of a method for managing stored data according to an embodiment of the present invention;

FIG. 6A is an exemplary flowchart of a disk sorting method according to an embodiment of the invention;

FIG. 6B is a schematic diagram of a disk sorting method according to an embodiment of the invention;

FIG. 7 is a schematic structural diagram of a storage manager according to an embodiment of the invention;

FIG. 8 is a schematic structural diagram of a computer according to an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described below with reference to the accompanying drawings in the embodiments of the present invention. The described embodiments are some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

To facilitate understanding of the implementation, the embodiment of the present invention first provides a storage system 200. FIG. 2A is a schematic diagram of the logical structure of the storage system 200. The storage system 200 includes a storage manager 210 and a storage device 220; the storage manager 210 communicates with an external device (such as a host or an application server; this solution does not limit the number of external devices) and with the storage device 220.

The storage device 220 may include a storage medium 222 and a storage controller 221 (the embodiment of the present invention does not limit the number of storage media 222 and storage controllers 221; the figure is shown only for convenience of description). The storage medium 222 is configured to provide a physical address space for storing data. In a specific implementation process, the storage medium 222 may be, for example and without limitation, an EEPROM, a ROM, a solid state disk (SSD), a hard disk drive (HDD), a magnetic tape, an optical disc, or another non-volatile storage device, which is not limited in the embodiment of the present invention. The storage controller 221 is used to manage and schedule the plurality of storage media 222. By way of example and not limitation, the storage controller 221 and the storage media 222 may constitute a Redundant Array of Independent Disks (RAID), which is not a limitation on the embodiments of the present invention.

The storage manager 210, as an intermediate device between the external device (host, application server) and the storage device 220, can be configured as an intermediate unit for reading and writing data, and can be set separately on a physical device as shown in FIG. 2A. The storage manager 210 can be implemented by modifying the file system. It should be noted that the file system mentioned in the embodiment of the present invention refers to a system that organizes and allocates the address space of a file storage device (including but not limited to the storage device 220 in FIG. 2A), and is responsible for storing files/data and for managing, retrieving, and protecting the stored files/data; that is, the file system in the embodiment of the present invention includes a file management function and a space management function.

The storage manager 210 is specifically configured to allocate m storage blocks for the data to be saved after receiving a data save request from the host (the storage block (CK) in the embodiment of the present invention refers to the smallest or most basic unit for reading and writing data, and may have different names in different systems, such as basic data chunk or data block), where each storage block is used to represent a section of virtual address space, each storage block is configured with a unique block number (CKID), and m is a natural number greater than or equal to 1. It should be noted that receiving a data save request in the embodiment of the present invention may refer to the storage manager 210 receiving one data save request from the host at a time, or may refer to the storage manager 210 receiving multiple data save requests from the host at a time. The data save request may be triggered by an external application or by an instruction function of the storage manager 210, which is not a limitation on the embodiment of the present solution.

After the storage blocks are allocated, the storage manager 210 specifies n storage containers (CT) for the m storage blocks, where each storage container represents a section of physical storage space on the storage device 220, and n is a natural number greater than or equal to 1. It should be noted that the storage container mentioned in the embodiment of the present invention is used to accommodate a plurality of the storage blocks. The storage container is also referred to in the art as a data container (Container, CT) or a data segment (Segment). The storage container can also hold metadata related to the storage blocks. That each storage container represents a section of physical storage space on the storage device 220 means that each storage container allocated by the storage manager actually corresponds to a section of physical storage space on the storage device 220 (specifically, on the storage medium 222). That section of physical storage space may be composed of a continuous, uninterrupted physical address space, or of discrete, intermittent physical address spaces. By way of example and not limitation, when the storage medium 222 is a disk, each storage container may actually correspond to a contiguous range of logical addresses on a logical volume provided by the storage device, may correspond to contiguous sectors or tracks on the disk, or may be a section of physical storage space composed of discrete, intermittent sectors or tracks on the disk, for example where RAID striping combines the discrete, intermittent sectors or tracks into one physical storage space. Each storage container is configured with metadata, and the metadata of each storage container records related information such as the check code, the data size, and the position in the CT of each storage block accommodated by that storage container.

In the embodiment of the present invention, a new correspondence different from the prior art is saved in the storage manager 210, referred to in the embodiment of the present invention as the correspondence between the storage block and the storage container. The correspondence between the storage block and the storage container is used to record the correspondence between each allocated storage block and the storage container that accommodates it. Unlike the prior art, in which <CTID, CKID> is used in combination as the virtual address, the storage manager 210 in the embodiment of the present invention records the block numbers of the m storage blocks in the metadata of the file in which the data to be saved is located, and the block numbers of the m storage blocks are used as the virtual address of the data to be saved. The storage manager 210 may record the block numbers of the storage blocks as the virtual address of the data to be saved after the storage blocks are allocated, or after the correspondence between the storage block and the storage container is recorded; the embodiment of the present invention does not limit this. It should be noted that the virtual address in the embodiment of the present invention is the address used by the storage manager when addressing data; the upper-layer application and the underlying storage device do not need to perceive the virtual address, and the virtual address has an addressing meaning only for the storage manager. For example, after receiving a data read request, the storage manager determines the virtual address of the data to be read according to the data read request, that is, the location of the data to be read in the virtual storage space defined by the storage manager.
Generally, the virtual address of the data to be saved or of the data to be read is recorded in the storage manager in the form of metadata. The storage manager in the embodiment of the present invention stores the metadata of a plurality of files; each file corresponds to saved data and has its own file metadata. File metadata is well known in the art and includes, for example, file directory information and index node information. In the storage system 200 provided by the embodiment of the present invention, the storage manager 210 is responsible for allocating m storage blocks for the data to be saved, specifying n storage containers for the m storage blocks, updating the correspondence between the storage block and the storage container according to the correspondence between the m storage blocks and the n storage containers, and recording the block numbers of the m storage blocks in the metadata of the data to be saved, the block numbers of the m storage blocks being used as the virtual address of the data to be saved. As a result, the virtual address of the data recorded in the system is independent of the storage container where the data is located, and the correspondence between the storage block and the storage container can be queried according to the storage block where the data is located, thereby obtaining information about the physical address of the data. With this management method, defragmentation does not need to migrate an entire container at the granularity of the storage container; the disk is defragmented directly at the granularity of the storage block, which improves the efficiency and flexibility of defragmentation and greatly improves the space utilization of the disk.

Further, for convenience of recording and management, the storage manager 210 may configure a unique identifier (CTID) for each storage container, and the identifier of each storage container is used to indicate the physical address corresponding to that storage container. In a specific implementation process, by way of example only, the identifier of each storage container may be mapped to a physical address through a mapping table; alternatively, by specifying an initial physical address of the system's storage containers and the space size of each storage container (e.g., 8M), the physical address corresponding to each storage container may be obtained by calculating the offset (CTID*8M). This is not intended to limit the scope of the embodiments of the present invention.
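The offset-based option can be sketched as follows. This is a minimal illustration only, assuming a hypothetical base address `CT_BASE_ADDR` and the 8M container size given as an example above; the names are not taken from the embodiment.

```python
CT_SIZE = 8 * 1024 * 1024  # example container size (8M) mentioned in the text
CT_BASE_ADDR = 0           # hypothetical initial physical address of the container area


def container_physical_address(ctid: int) -> int:
    """Return the starting physical address of the container identified by ctid."""
    if ctid < 0:
        raise ValueError("CTID must be non-negative")
    # Offset calculation from the text: CTID * container size, added to the base.
    return CT_BASE_ADDR + ctid * CT_SIZE
```

With a fixed container size, no per-container mapping entry is needed; a mapping table would be required instead if containers could have different sizes.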

Further, the correspondence between the storage block and the storage container recorded by the storage manager 210 includes multiple indexes, and each index records which storage container all the storage blocks assigned to the same storage container point to, where the key of each index is a representative value of the block numbers of the storage blocks accommodated by the same storage container, and the value of each index is the identifier of the same storage container.

In the embodiment of the present invention, the storage manager 210 records the correspondence between the storage block and the storage container by using multiple indexes. Each index indicates where all the storage blocks assigned to the same storage container point; the key of each index is configured as a representative value of the block numbers of all storage blocks accommodated by the same storage container, and the value of each index is configured as the identifier of the same storage container. Therefore, while the correspondence between the storage block and the storage container is still recorded, each storage container needs only one index to record all the storage blocks it accommodates, which reduces the redundancy of the correspondence between the storage block and the storage container, makes the correspondence easier to query, and improves query efficiency.

As a preferred implementation manner, the storage manager 210 may specifically configure the block numbers of the m storage blocks allocated each time to be linearly incremented, such that the minimum value of the block numbers of the m storage blocks is greater than the maximum value of the block numbers of the storage blocks configured for the data saved the previous time, or the maximum value of the block numbers of the m storage blocks is smaller than the minimum value of the block numbers of the storage blocks configured for the data saved the next time. The storage manager is specifically configured to record the representative value in each index as the minimum block number of the storage blocks accommodated by the same storage container. In a specific implementation process, by way of example only and not limitation, the storage manager 210 configures the block numbers of the m storage blocks allocated each time to increase linearly by the following algorithm: CKID_new = CKID_max + 1, where CKID_new is the block number of a newly configured storage block and CKID_max is the current largest block number in the file system.
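The incrementing rule CKID_new = CKID_max + 1 can be illustrated with a short sketch. The class name `BlockNumberAllocator` and its fields are hypothetical and serve only to show that successive save requests receive disjoint, increasing ranges of block numbers.

```python
from typing import List


class BlockNumberAllocator:
    """Hands out linearly increasing block numbers (CKIDs); illustrative only."""

    def __init__(self, current_max: int = 0) -> None:
        # Current largest block number in the file system (CKID_max).
        self.ckid_max = current_max

    def allocate(self, m: int) -> List[int]:
        """Allocate block numbers for m storage blocks, m >= 1."""
        if m < 1:
            raise ValueError("m must be a natural number greater than or equal to 1")
        ckids = []
        for _ in range(m):
            self.ckid_max += 1            # CKID_new = CKID_max + 1
            ckids.append(self.ckid_max)
        return ckids
```

Because every new number exceeds all earlier ones, the smallest number of one request is always greater than the largest number of the previous request, which is the property the single-index-per-container scheme relies on.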

In the embodiment of the present invention, the storage manager 210 configures the block numbers of the m storage blocks allocated each time to be linearly incremented, such that the minimum value of the block numbers of the m storage blocks is greater than the maximum value of the block numbers of the storage blocks configured for the data saved the previous time, or the maximum value of the block numbers of the m storage blocks is smaller than the minimum value of the block numbers of the storage blocks configured for the data saved the next time; and the representative value in each index is recorded as the smallest block number of the storage blocks accommodated by the same storage container. The algorithm is simple and easy to implement, and each storage container needs only one index to record all the storage blocks it accommodates.
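To show why one index per container is sufficient in the incrementing case, the following sketch looks up the container of a block number by finding the largest index key that does not exceed it. It assumes, as an illustration only, that the correspondence is held as a list of (minimum block number, CTID) pairs sorted by key; the function name `find_container` is hypothetical.

```python
import bisect
from typing import List, Tuple


def find_container(index: List[Tuple[int, int]], ckid: int) -> int:
    """Return the CTID of the container holding block number ckid.

    index is a list of (min_ckid, ctid) pairs sorted by min_ckid, one pair
    per storage container.
    """
    keys = [key for key, _ in index]
    # Position of the largest key that is less than or equal to ckid.
    pos = bisect.bisect_right(keys, ckid) - 1
    if pos < 0:
        raise KeyError(f"block number {ckid} is below every index key")
    return index[pos][1]
```

The lookup is correct only because the block number ranges of different containers do not intersect. The sketch does not check whether the block number has actually been allocated or has since become garbage.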

Further, based on the implementation manner in which the block numbers are linearly incremented, the storage manager 210 specifically implements the assignment of storage blocks to storage containers and the recording of the correspondence between the storage block and the storage container through the following steps:

a. Obtain the currently working storage container, and assign the m storage blocks to the currently working storage container one by one in order from the smallest block number to the largest block number. In the specific implementation process, in order to ensure that the range of block numbers of the storage blocks accommodated in each storage container in the system does not intersect with the range of block numbers of the storage blocks accommodated in any other storage container, the storage manager 210 has at most one storage container configured at any one time for accommodating storage blocks, namely the currently working storage container.

b. Determine whether the currently working storage container is a free storage container before it accommodates the storage block with the smallest block number among the m storage blocks, and if the currently working storage container is a free storage container, add an index to the correspondence between the storage block and the storage container, where the key of the added index is the smallest block number of the m storage blocks, and the value of the added index is the identifier of the currently working storage container. It should be noted that a free storage container is a storage container that does not contain any storage block, such as a new storage container created by the file system, or an old storage container that no longer contains any storage block after its space has been reclaimed.

c. When the currently working storage container is a full storage container, obtain an updated working storage container, and assign the remaining storage blocks among the m storage blocks to the updated working storage container one by one in order from the smallest block number to the largest block number, where the updated working storage container is a free storage container. It should be noted that a full storage container is one that does not have enough space left to accommodate the next storage block to be allocated. For the same reason as in step a, when the currently working storage container is a full storage container, the updated working storage container that is obtained must be a free container.

This embodiment describes in detail how to assign the m storage blocks to the n storage containers and record the correspondence between the storage block and the storage container. The algorithm is simple and easy to implement. Of course, in the specific implementation process, the above steps a to e may be split into more, smaller steps or merged into fewer steps, or the order of execution between steps may be changed; since such transformations can be achieved on the basis of the present embodiment without creative labor, they fall within the scope of protection of this embodiment.
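The assignment procedure for the incrementing case can be sketched as follows. This is an illustration under simplifying assumptions, not the implementation of the embodiment: a container is treated as full once it holds a fixed number of blocks (`capacity`), new containers simply receive the next unused CTID, and the class name `Ck2cTable` is hypothetical. An index is added whenever the first block is placed into a free container, as in step b.

```python
from typing import Dict, List


class Ck2cTable:
    """Toy block-to-container correspondence for the incrementing scheme."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity          # assumed: blocks per container
        self.index: Dict[int, int] = {}   # key: min CKID in container, value: CTID
        self.fill: Dict[int, int] = {0: 0}  # CTID -> number of blocks held
        self.current_ctid = 0             # the single currently working container
        self.next_ctid = 1

    def assign(self, ckids: List[int]) -> Dict[int, int]:
        """Assign blocks to containers; return a CKID -> CTID placement map."""
        placement: Dict[int, int] = {}
        for ckid in sorted(ckids):        # step a: smallest block number first
            if self.fill[self.current_ctid] >= self.capacity:
                # Step c: working container is full, switch to a free one.
                self.current_ctid = self.next_ctid
                self.next_ctid += 1
                self.fill[self.current_ctid] = 0
            if self.fill[self.current_ctid] == 0:
                # Step b: first block placed in a free container adds one index.
                self.index[ckid] = self.current_ctid
            self.fill[self.current_ctid] += 1
            placement[ckid] = self.current_ctid
        return placement
```

For example, with a capacity of 3, assigning blocks 1 to 5 fills container 0 with blocks 1 to 3 and places blocks 4 and 5 in container 1, leaving exactly two index entries, keyed by 1 and 4.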

As a preferred implementation manner, the storage manager 210 is configured to configure the block numbers of the m storage blocks allocated each time to be linearly decremented, such that the minimum value of the block numbers of the m storage blocks is greater than the maximum value of the block numbers of the storage blocks configured for the data saved the next time, or the maximum value of the block numbers of the m storage blocks is smaller than the minimum value of the block numbers of the storage blocks configured for the data saved the previous time;

The storage manager 210 is specifically configured to record the representative value in each index as a maximum block number of a storage block accommodated by the same storage container.

In the embodiment of the present invention, the storage manager 210 configures the block numbers of the m storage blocks allocated each time to be linearly decremented, such that the minimum value of the block numbers of the m storage blocks is greater than the maximum value of the block numbers of the storage blocks configured for the data saved the next time, or the maximum value of the block numbers of the m storage blocks is smaller than the minimum value of the block numbers of the storage blocks configured for the data saved the previous time; and the representative value in each index is recorded as the largest block number of the storage blocks accommodated by the same storage container. The algorithm is simple and easy to implement, and each storage container needs only one index to record all the storage blocks it accommodates.

Optionally, when the storage manager 210 configures the block numbers of the m storage blocks allocated each time to be linearly decremented, the storage manager 210 is specifically configured to perform the following operations:

a. Obtain the currently working storage container, and assign the m storage blocks to the currently working storage container one by one in order from the largest block number to the smallest block number;

This embodiment describes in detail how to assign the m storage blocks to the n storage containers and record the correspondence between the storage block and the storage container. The algorithm is simple and easy to implement. Of course, in the specific implementation process, as in the previous embodiment, modifications based on the present embodiment all fall within the scope of protection of this embodiment.

Further, when the storage manager 210 configures the block number of the m storage blocks to be linearly incremented or linearly decremented each time, the storage manager 210 is further configured to:

Scan the storage containers to be collated, and obtain the non-garbage storage blocks (that is, storage blocks still used by the system) included in each storage container to be collated;

In this embodiment, after receiving a defragmentation instruction, the storage manager 210 can determine the storage containers to be scanned (this can be implemented in a plurality of manners, which is not limited in the embodiment of the present solution), obtain the non-garbage storage blocks included in each storage container to be collated, specify new storage containers for them at the granularity of the non-garbage storage block, and update the correspondence between the storage block and the storage container. In particular, the block numbers of the non-garbage storage blocks accommodated by each new storage container are configured to be linearly increasing or linearly decreasing, and the range of block numbers of the storage blocks accommodated by each new storage container does not overlap with the range of block numbers of the storage blocks accommodated by any other new storage container. In this way, defragmentation can be performed flexibly at the granularity of the storage block, while the correspondence between the storage block and the storage container still ensures that each new storage container needs only one index to record all the non-garbage storage blocks it accommodates, which in turn reduces the mapping cost from virtual addresses to physical addresses.

Receive a defragmentation instruction and determine the storage containers to be collated, where the storage containers to be collated are those storage containers, among all the storage containers indicated by the correspondence between the storage block and the storage container, that include garbage storage blocks (that is, storage blocks that are no longer used by the system or whose space has been reclaimed);

Scan the storage containers to be collated, and acquire the non-garbage storage blocks included in logically adjacent storage containers to be collated, where logically adjacent means that the key values of the corresponding indexes in the correspondence between the storage block and the storage container are adjacent in size (adjacent in size means that the key values of the two indexes are next to each other, that is, the key value interval formed by the key values of the two indexes does not include the key value of any other index);

In this embodiment, after receiving the defragmentation instruction, the storage manager 210 can determine that the storage containers including garbage storage blocks, among all the storage containers indicated by the correspondence between the storage block and the storage container, are the storage containers to be collated, specify new storage containers for the non-garbage storage blocks in logically adjacent storage containers to be collated, and update the correspondence between the storage block and the storage container, where the block numbers of the non-garbage storage blocks accommodated by each new storage container are linearly increasing or linearly decreasing, and the range of block numbers of the storage blocks accommodated by each new storage container does not intersect with the range of block numbers of the storage blocks accommodated by any other new storage container. In this way, storage containers containing garbage storage blocks can be collated flexibly during defragmentation, while the correspondence between the storage block and the storage container still ensures that each new storage container needs only one index to record all the non-garbage storage blocks it accommodates, thereby reducing the mapping cost from virtual addresses to physical addresses.
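The block-granularity defragmentation for the incrementing case can be sketched as below. This is a bookkeeping-only illustration under stated assumptions: the index is a dict from minimum block number to CTID, `contents` maps each CTID to its block numbers in ascending order, a container holds at most `capacity` blocks, and new containers receive the next unused CTID. All names are hypothetical, and no data is actually copied. Only runs of logically adjacent containers are merged, so that the block number ranges of the resulting containers still do not intersect with those of untouched containers.

```python
from typing import Dict, List, Set


def _repack(run: List[int], index: Dict[int, int],
            contents: Dict[int, List[int]], capacity: int, next_ctid: int) -> int:
    """Place the live blocks in run into new containers, one index per container."""
    for i in range(0, len(run), capacity):
        group = run[i:i + capacity]
        contents[next_ctid] = group
        index[group[0]] = next_ctid   # key: smallest block number in the container
        next_ctid += 1
    run.clear()
    return next_ctid


def defragment(index: Dict[int, int], contents: Dict[int, List[int]],
               garbage: Set[int], capacity: int, next_ctid: int) -> int:
    """Repack live blocks of containers holding garbage; return the next free CTID."""
    run: List[int] = []               # live blocks of the current adjacent run
    for key in sorted(index):         # visit containers in key order
        ctid = index[key]
        blocks = contents[ctid]
        if any(b in garbage for b in blocks):
            # Container to be collated: keep its live blocks, drop its index.
            run.extend(b for b in blocks if b not in garbage)
            del index[key]
            del contents[ctid]
        else:
            # An untouched container ends the run of logically adjacent containers.
            next_ctid = _repack(run, index, contents, capacity, next_ctid)
    return _repack(run, index, contents, capacity, next_ctid)
```

Merging containers that are not logically adjacent would break the single-index lookup: a new container could then span the block number range of an untouched container lying between them.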

Further, the method further includes: after receiving a data read request, acquiring the virtual address of the data to be read according to the information about the data to be read carried in the data read request, where the virtual address of the data to be read includes the block numbers of p storage blocks, and p is a natural number greater than or equal to 1. By way of example only and not limitation, the information about the data to be read includes the file name of the data to be read, the offset of the data to be read in the file, the length of the data to be read, and the like. After the information about the data to be read is obtained, the metadata of the file where the data to be read is located may be queried; for example, the directory of the file system is read first to obtain the inode (index node) information of the data to be read, and the virtual address of the data to be read, which includes the block numbers of the p storage blocks, is then queried according to the inode information.

Reading the metadata of the q storage containers to determine the physical address information of the p storage blocks. In a specific implementation process, the metadata of the q storage containers records related information such as the check code, the data size, and the position in the CT of each storage block in the q storage containers. Since the identifiers of the q storage containers indicate the physical addresses of the q storage containers, once the physical addresses of the q storage containers are known, the physical address information of the p storage blocks can be determined by adding the information from the metadata of the q storage containers.

Reading the data to be read from the storage device according to the physical address information of the p storage blocks.
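The address resolution on the read path can be sketched as follows for the incrementing case. The sketch assumes the example container size of 8M with a physical address of CTID*8M, a sorted list of (minimum block number, CTID) index pairs, and per-container metadata that records each block's position in the CT and its data size; `BlockMeta` and `resolve` are hypothetical names, and the check code is omitted.

```python
import bisect
from dataclasses import dataclass
from typing import Dict, List, Tuple

CT_SIZE = 8 * 1024 * 1024  # example container size (8M) from the text


@dataclass
class BlockMeta:
    offset: int  # position of the block inside its container (CT)
    size: int    # data size of the block


def resolve(ckids: List[int],
            index: List[Tuple[int, int]],
            ct_meta: Dict[int, Dict[int, BlockMeta]]) -> List[Tuple[int, int]]:
    """Map block numbers to (physical address, size) pairs."""
    keys = [key for key, _ in index]
    result = []
    for ckid in ckids:
        # Find the container via the largest index key not exceeding ckid.
        pos = bisect.bisect_right(keys, ckid) - 1
        if pos < 0:
            raise KeyError(ckid)
        ctid = index[pos][1]
        meta = ct_meta[ctid][ckid]    # container metadata for this block
        # Container address (CTID * size) plus the block's position in the CT.
        result.append((ctid * CT_SIZE + meta.offset, meta.size))
    return result
```

The file metadata only stores block numbers; the container identifier appears nowhere in the virtual address, which is why blocks can later be moved to other containers without rewriting file metadata.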

Further, the method may further include: after receiving the data read request, acquiring the virtual address of the data to be read according to the information of the data to be read carried in the data read request, where The virtual address of the data to be read includes a block number of p storage blocks, and p is a natural number greater than or equal to 1;

Sending the identifiers of the q storage containers and the location information of the p storage blocks recorded in the metadata of the q storage containers to the storage device, so that the storage device determines the physical address information of the p storage blocks;

Reading the data to be read from the storage device according to the physical address information of the p storage blocks. In a specific implementation process, the metadata records related information such as the check code, the data size, and the position in the CT of each storage block in the storage container corresponding to the block number of the storage block to be addressed.

FIG. 2B is a schematic structural diagram of a specific implementation of a storage manager according to an embodiment of the present invention. 2B includes storage manager A (unnumbered), storage device 230, and storage manager B (unnumbered), two storage managers A and B are shown here for convenience of description, and storage manager B can be used as storage. The backup of the manager A, the number of the storage manager is not limited to the embodiment of the present invention. Generally, a storage manager is provided to implement the embodiment of the present invention. The storage manager A includes a processor 211, an interface (not shown) that interacts with the storage device, and a memory 212 that communicates via a bus (not numbered), the processor 211 executes computer instructions in the memory 212, and Having the storage manager A perform includes, but is not limited to, the embodiment of Figure 2A. The storage manager A communicates with the storage device 230 through the interface interacting with the storage device, the storage device 230 is configured to store data forwarded by the storage manager A, and the function or method and method implemented by the storage manager A The storage device 220 in 2A is similar and will not be described again here.

FIG. 3 is an exemplary flowchart of a method for managing stored data according to an embodiment of the present invention. The method may be applied to, but is not limited to, the storage system shown in FIG. 2A or the application scenario shown in FIG. 2B. Although the processes of method 300 described below include multiple operations occurring in a particular order, it should be clearly understood that these processes may include more operations or be combined into fewer operations, and that the operations may be performed sequentially or in parallel (for example, using a parallel processor or a multi-threaded environment). Changing the order of execution between steps also falls within the scope of protection of embodiments of the present invention. As shown in FIG. 3, the method includes:

Step S310, after receiving a data save request, allocating m storage blocks for the data to be saved, where each storage block represents a section of virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1. It should be noted that, as already described in the embodiment of FIG. 2A, receiving a data save request may refer to receiving one data save request at a time or to receiving multiple data save requests at a time.

Step S320, specifying n storage containers for the m storage blocks, where each storage container represents a section of physical storage space on the storage device, and n is a natural number greater than or equal to 1. The section of physical storage space has been explained in detail in the embodiment of FIG. 2A, and details are not described herein again.

Step S330, updating the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, where the correspondence between storage blocks and storage containers is used to record, for each allocated storage block, the storage container that accommodates it.

Step S340, recording the block numbers of the m storage blocks in the metadata of the data to be saved, where the block numbers of the m storage blocks are used as the virtual address of the data to be saved.

According to the technical solution provided by the embodiment of the present invention, each time data is to be saved, m storage blocks are allocated for the data, n storage containers are specified for the m storage blocks, the correspondence between storage blocks and storage containers is updated according to the correspondence between the m storage blocks and the n storage containers, and the block numbers of the m storage blocks are recorded in the metadata of the data to be saved as its virtual address. The correspondence between storage blocks and storage containers is thus recorded, while the virtual address of the data is independent of the storage container in which the data resides. Therefore, during disk defragmentation, storage containers do not need to be migrated as a whole, which improves the efficiency and flexibility of defragmentation and greatly improves the space utilization of the disk.
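Steps S310 to S340 can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the class name `StorageManagerSketch`, the fixed capacity of four blocks per container, and the per-block dictionary used for the correspondence are assumptions made only for illustration (the embodiments that follow describe a more compact, one-index-per-container form of the correspondence).

```python
from itertools import count


class StorageManagerSketch:
    """Hypothetical model of steps S310 to S340 (names and sizes are assumed)."""

    def __init__(self, container_capacity=4):
        self.container_capacity = container_capacity  # blocks per container (assumed)
        self.block_to_container = {}                  # correspondence updated in S330
        self._block_numbers = count(1)                # unique, increasing block numbers
        self._working_container = 1                   # identifier of the container in use
        self._used = 0                                # blocks already in that container

    def save(self, metadata, m):
        # S310: allocate m storage blocks, each with a unique block number.
        blocks = [next(self._block_numbers) for _ in range(m)]
        for block in blocks:
            # S320: specify a storage container for the block; move on to a
            # fresh container when the current one is full.
            if self._used == self.container_capacity:
                self._working_container += 1
                self._used = 0
            self._used += 1
            # S330: update the block-to-container correspondence.
            self.block_to_container[block] = self._working_container
        # S340: record the block numbers in the data's metadata as its
        # virtual address; no container identifier is stored there.
        metadata["virtual_address"] = blocks
        return blocks


manager = StorageManagerSketch(container_capacity=4)
file_metadata = {}
allocated = manager.save(file_metadata, m=6)
```

Because the metadata holds only block numbers, moving a block to another container changes `block_to_container` but leaves the virtual address of the data untouched.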

Further, in order to facilitate recording and management, each storage container in step S320 is configured with a unique identifier, and the identifier of each storage container is used to indicate a physical address corresponding to each storage container. The embodiment of FIG. 2A has described in detail how to use the identifier of each storage container to indicate the physical address corresponding to each storage container, and details are not described herein again.

Further, the correspondence between storage blocks and storage containers in step S330 includes multiple indexes, where each index indicates the mapping of all storage blocks assigned to the same storage container, the key of each index is a representative value of the block numbers of all the storage blocks accommodated by that storage container, and the value of each index is the identifier of that storage container.

According to the technical solution provided by the embodiment of the present invention, the correspondence between storage blocks and storage containers is recorded by using multiple indexes. Each index indicates the mapping of all storage blocks assigned to the same storage container, the key of each index is configured as a representative value of the block numbers of all the storage blocks accommodated by that storage container, and the value of each index is configured as the identifier of that storage container. Therefore, while the correspondence between storage blocks and storage containers is still recorded, each storage container needs only one index to record all the storage blocks it contains, which reduces the mapping cost of the correspondence between storage blocks and storage containers.
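The saving can be seen in a small Python sketch that collapses a per-block mapping into one index per container. The function name `build_indexes`, the container identifiers, and the choice of the smallest block number as the representative value are assumptions for illustration; the text only requires some representative value per container.

```python
def build_indexes(assignment):
    """Collapse a block -> container mapping into one (key, value) index per container.

    assignment: dict mapping block number -> container identifier.
    The key is the smallest block number held by the container (an assumed
    choice of representative value); the value is the container identifier.
    """
    smallest = {}
    for block, container in assignment.items():
        if container not in smallest or block < smallest[container]:
            smallest[container] = block
    # Sort by key so that the indexes can later be searched in order.
    return sorted((key, container) for container, key in smallest.items())


# Six blocks spread over two containers (illustrative data).
assignment = {1: "CT1", 2: "CT1", 3: "CT1", 4: "CT1", 5: "CT2", 6: "CT2"}
indexes = build_indexes(assignment)
```

Here six per-block entries reduce to two indexes, one per container. This only works if the block-number ranges of different containers do not intersect, which is what the numbering rules in the following paragraphs guarantee.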

Preferably, in a case that the correspondence between storage blocks and storage containers includes multiple indexes, the block numbers of the m storage blocks allocated each time are configured to be linearly increasing, and the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the previously saved data, or the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the data to be saved next time; and the representative value in each index is the smallest block number of the storage blocks accommodated by the same storage container.

In the embodiment of the present invention, the block numbers of the m storage blocks allocated each time are configured to be linearly increasing, and the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the previously saved data, or the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the data to be saved next time; and the representative value in each index is recorded as the smallest block number of the storage blocks accommodated by the same storage container. The algorithm is simple and easy to implement, and each storage container needs only one index to record all the storage blocks it contains.

Preferably, when the block numbers of the m storage blocks allocated each time are configured to be linearly increasing, specifying n storage containers for the m storage blocks, and updating the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, includes:

It should be noted that technical terms mentioned in the foregoing embodiments, such as the currently working storage container, the full storage container, and the idle storage container, have been described in detail in the embodiment of FIG. 2A, and details are not described herein again. This embodiment details how to allocate the n storage containers to the m storage blocks and record the correspondence between storage blocks and storage containers. The algorithm is simple and easy to implement. Of course, in a specific implementation, the above steps a to e may be split into more, smaller steps or merged into fewer steps, and the order of execution between the steps may be changed. Since such transformations are based on the present embodiment and can be achieved without creative effort, they fall within the scope of protection of this embodiment.

Preferably, in a case that the correspondence between storage blocks and storage containers includes multiple indexes, the block numbers of the m storage blocks allocated each time may be configured to be linearly decreasing, and the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the data to be saved next time, or the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the previously saved data;

The representative value in each index is the maximum block number of the storage block accommodated by the same storage container.

In the embodiment of the present invention, the block numbers of the m storage blocks allocated each time are configured to be linearly decreasing, and the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the data to be saved next time, or the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the previously saved data; and the representative value in each index is recorded as the largest block number of the storage blocks accommodated by the same storage container. The algorithm is simple and easy to implement, and each storage container needs only one index to record all the storage blocks it contains.
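As an illustration of the decreasing variant, the following Python sketch finds the container of a block when each index key is the largest block number in its container. The function name `find_container_desc`, the block numbers, and the container identifiers are assumed for the example; the search uses the standard `bisect` module.

```python
from bisect import bisect_left


def find_container_desc(indexes, block_number):
    """Locate the container of a block when keys are per-container maxima.

    indexes: list of (largest block number, container id), sorted by key.
    With linearly decreasing block numbers, a block belongs to the container
    whose key is the smallest key that is >= the block number.
    The sketch does not check that the block actually exists; an
    implementation would consult the container metadata for that.
    """
    keys = [key for key, _ in indexes]
    position = bisect_left(keys, block_number)  # first key >= block_number
    if position == len(keys):
        raise KeyError(block_number)
    return indexes[position][1]


# Assumed allocation: the first container received blocks 100..97 and the
# second one blocks 96..93, so the keys are 100 and 96.
desc_indexes = [(96, "CT2"), (100, "CT1")]
```

With these values, block 98 resolves to "CT1" and block 93 to "CT2", using one index per container exactly as in the increasing case.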

Preferably, when the block numbers of the m storage blocks allocated each time are configured to be linearly decreasing, specifying n storage containers for the m storage blocks, and updating the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, includes:

Determining, before the currently working storage container accommodates the storage block with the largest block number among the m storage blocks, whether the currently working storage container is a free storage container; and if it is, adding an index to the correspondence between storage blocks and storage containers, where the key of the added index is the largest block number of the m storage blocks, and the value of the added index is the identifier of the currently working storage container;

This embodiment describes in detail how to allocate the n storage containers for the m storage blocks and record the correspondence between storage blocks and storage containers. The algorithm is simple and easy to implement. Of course, in a specific implementation, as in the previous embodiment, modifications based on the present embodiment all fall within the scope of protection of this embodiment.

Further, when the block numbers of the m storage blocks allocated each time are configured to be linearly increasing or linearly decreasing, the method further includes:

In this embodiment, after a defragmentation instruction is received, the storage containers to be scanned can be determined (this can be implemented in various manners, which is not limited by the embodiment of the present solution), the non-garbage storage blocks contained in each of these storage containers are obtained, new storage containers are assigned at the granularity of the non-garbage storage blocks, and the correspondence between storage blocks and storage containers is updated. Specifically, the block numbers of the non-garbage storage blocks accommodated in each new storage container are configured to be linearly increasing or linearly decreasing, and the range of block numbers accommodated by each new storage container does not intersect the range accommodated by any other new storage container. During defragmentation, storage blocks can thus be organized flexibly at storage-block granularity, while the correspondence between storage blocks and storage containers still ensures that each new storage container needs only one index to record all the non-garbage storage blocks it contains, which in turn reduces the cost of mapping virtual addresses to physical addresses.

Scanning the storage containers to be collated, and acquiring the non-garbage storage blocks included in logically adjacent storage containers to be collated, where logically adjacent means that the keys of the corresponding indexes in the correspondence between storage blocks and storage containers are adjacent in value (adjacency of key values has been described in detail in the embodiment of FIG. 2A and is not described again);

In this embodiment, after a defragmentation instruction is received, those storage containers that include garbage storage blocks, among all the storage containers indicated by the correspondence between storage blocks and storage containers, can be determined as the storage containers to be collated. New storage containers are specified for the non-garbage storage blocks in logically adjacent storage containers to be collated, and the correspondence between storage blocks and storage containers is updated, where the block numbers of the non-garbage storage blocks accommodated in each new storage container are linearly increasing or linearly decreasing, and the range of block numbers accommodated by each new storage container does not intersect the range accommodated by any other new storage container. Therefore, during defragmentation, the storage containers that contain garbage storage blocks can be collated flexibly, and afterwards the correspondence between storage blocks and storage containers still ensures that each new storage container needs only one index to record all the non-garbage storage blocks it contains, thereby reducing the cost of mapping virtual addresses to physical addresses.
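The notion of logically adjacent storage containers can be sketched in Python as follows. The function name `group_adjacent` and the key values 1, 6, 10, and 16 are assumptions for illustration; the sketch only shows how containers to be collated split into groups whenever a container without garbage lies between them in key order.

```python
def group_adjacent(index_keys, keys_to_collate):
    """Split the containers to be collated into logically adjacent groups.

    index_keys: every index key of the correspondence, in sorted order.
    keys_to_collate: set of keys whose containers hold garbage blocks.
    A group ends as soon as a key that is not to be collated lies in between.
    """
    groups, current = [], []
    for key in index_keys:
        if key in keys_to_collate:
            current.append(key)
        elif current:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


all_keys = [1, 6, 10, 16]                                 # assumed index keys
one_group = group_adjacent(all_keys, {1, 6, 10, 16})      # every container has garbage
two_groups = group_adjacent(all_keys, {1, 10, 16})        # container with key 6 is clean
```

Merging blocks only within one group keeps the block-number range of every new container disjoint from the range of the untouched container that separates the groups.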

FIG. 4 is a schematic structural diagram of a CK2C (Chunk to Container) mapping table from storage blocks to storage containers, created according to an embodiment of the present invention. The CK2C table can be used to record the correspondence between storage blocks and storage containers; that is, the CK2C table is a preferred organization of that correspondence in the above embodiments. For convenience of description, the block numbers recorded in the CK2C table are linearly increasing and never reused (each is unique): the virtual address/CKID allocated to each new storage block differs from the virtual address/CKID of every earlier storage block, and the allocated CKIDs always increase linearly (they could equally always decrease linearly, which is not shown here). Preferably, the CKID is a 64-bit unsigned integer, so the global addressing space of the file system is [0, 2^64-1]. It should be noted that the embodiment of the present invention does not limit the number of bits of the CKID, and the system can adjust it according to actual needs. Depending on the specific application scenario, the CK2C table may be implemented as a linear table or a B+ tree; the specific manner is not limited by the embodiment of the present invention.

As shown in FIG. 4, the CK2C table includes multiple indexes (each column in the CK2C table is an index; for example, <CKID1, CTID1> is an index). The indexes correspond one-to-one to the storage containers, so the number of indexes in the CK2C table equals the number of storage containers (CTs) recorded in the table. For convenience of description only three CTs, that is, three indexes, are shown, which does not limit the present invention. The CKID ranges of the storage containers corresponding to different indexes do not coincide. For example, in the figure CT1 contains CKIDs 1 to 4, CT2 contains CKIDs 5 to 8, and so on, and the CKID ranges of the CTs neither overlap nor intersect.

In a specific implementation this can be achieved in several ways; the following is only an example and not a limitation. At any time the file system has only one container in the working state (the currently working storage container). When new storage blocks are written (the embodiment of the present invention does not limit their number), they are written to the currently working storage container. If that container is full or its remaining free space is insufficient, a new storage container (an idle container) may be created as the new working container to store the new storage blocks. A previously used storage container cannot be reused even if it still has free space (that is, it will not be selected as the new working container) unless it has been space-reclaimed or defragmented and has become a blank container (one with no storage blocks inside), in which case it can be selected as the new working container. Together with the fact that CKIDs increase or decrease linearly and are never reused, this ensures that the CKID ranges of the CTs neither overlap nor intersect. It is also possible to achieve non-overlapping CKID ranges by allocating different CKID segments to different CTs; this is not a limitation on the embodiments of the present solution.

Optionally, when the CKID increases linearly, the CK2C table is recorded as key-value pairs. The key of each index recorded in the CK2C table is uniformly configured as the smallest CKID in the storage container CT corresponding to that index (as shown in FIG. 4), and the value of each index is the CTID of that storage container. Each CTID uniquely determines a physical address. By way of example only and not limitation, the CTID may correspond to the physical address directly through an offset calculation (for example, assuming that the size of a CT is 8M, CTID*8M=PA), or may be mapped to the physical address through a mapping. Therefore, given the virtual address (CKID) of a piece of data, its physical address can be uniquely determined by querying the CK2C table. For example, suppose the key of each index is the smallest CKID in the corresponding CT, CKIDs increase linearly, and the data whose physical address is required has the virtual address CKID7. The CK2C table is first queried for the key CKID7; if it is present, the search ends. If not, all keys in the CK2C table smaller than CKID7 are determined (CKID1 and CKID5 in FIG. 4), and among them the key closest to CKID7 is selected, which is CKID5 in FIG. 4. CKID7 must then fall into the storage container whose key is CKID5 (CTID2). Finally, by querying the metadata in CTID2 and the physical address PA2 corresponding to CTID2, the physical address of CKID7 can be uniquely determined.
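The query described above amounts to finding the largest key that does not exceed the CKID. A minimal Python sketch follows, assuming the table is held as a sorted list and using the 8M container size from the example; the function name `lookup`, the third key 9, and the integer CTIDs are illustrative assumptions, and the block's position inside the container (kept in the container metadata) is not modeled.

```python
from bisect import bisect_right

CT_SIZE = 8 * 1024 * 1024  # 8M per container, the example value used in the text


def lookup(ck2c, ckid):
    """Return (CTID, container physical address) for a CKID.

    ck2c: list of (smallest CKID in the container, CTID), sorted by key.
    The matching index is the one with the largest key <= ckid.
    """
    keys = [key for key, _ in ck2c]
    position = bisect_right(keys, ckid) - 1
    if position < 0:
        raise KeyError(ckid)  # smaller than every key: no container holds it
    ctid = ck2c[position][1]
    # Offset-style mapping from the example: PA = CTID * 8M.
    return ctid, ctid * CT_SIZE


# Table resembling FIG. 4: CT1 holds CKIDs 1 to 4 and CT2 holds 5 to 8;
# the third key (9) is an assumed continuation of that pattern.
ck2c_table = [(1, 1), (5, 2), (9, 3)]
ctid, container_address = lookup(ck2c_table, 7)
```

For CKID 7 the search settles on the key 5 and therefore on CTID 2, matching the walk-through above. A sorted list with binary search corresponds to the linear-table option; a B+ tree would serve the same query for larger tables.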

In the embodiment of the present invention, a CK2C table is provided for recording the correspondence between storage blocks and storage containers. The smallest or largest CKID in a storage container is used as the key of the CK2C table index corresponding to that storage container, the CKID is used as the virtual address of the data, and the CKID ranges of the storage containers neither overlap nor intersect. As a result, each CT corresponds to only one CK2C table index, the physical addresses of all the storage blocks in one storage container CT can be determined through one index, and the CK2C table is small, which reduces the cost of mapping virtual addresses to physical addresses.

FIG. 5 is an exemplary flowchart of a method for managing stored data according to an embodiment of the present invention. The method uses the CK2C table shown in FIG. 4 to manage and record the spatial location information of data, which is only an example and not a limitation. In this method the CKID is configured to increase linearly and is never reused, and the key of each index recorded in the CK2C table is uniformly configured as the smallest CKID in the storage container CT corresponding to that index. It should be noted that although the flow of the method described below includes multiple operations occurring in a particular order, it should be clearly understood that the flow may include more operations or be combined into fewer operations, and that the operations may be executed sequentially or in parallel (for example, using a parallel processor or a multi-threaded environment). As shown in FIG. 5, the method includes the following steps:

Step S510, receiving a new storage block to be saved. In a specific implementation, the new storage block to be saved may be a single storage block, or may be the m storage blocks generated in step S310 in the embodiment of FIG. 3, transmitted one by one in order from the smallest block number to the largest. The following description takes the m storage blocks in the embodiment of FIG. 3 as an example, which does not limit the embodiment of the present invention.

Step S511, the current working storage container is obtained (the currently working storage container is described in the embodiments of FIG. 2A and FIG. 4, and details are not described herein again);

Step S512, determining whether the available space of the currently working storage container (working CT) is greater than or equal to the data size (CK Size) of the new storage block to be saved, that is, whether the working CT can accommodate the new storage block to be saved.

Step S513, if the available space of the currently working storage container is greater than or equal to the data size (CK Size) of the new storage block to be saved, subtracting a space of size CK Size from the available space of the currently working storage container and allocating it to the new storage block to be saved. As an exemplary supplementary explanation, in a specific implementation, after the currently working storage container receives the new storage block to be saved, the metadata in the currently working storage container needs to record attribute information of the new storage block, such as its size, its check code, and its position in the CT.

Step S514, if the available space of the currently working storage container is smaller than the data size (CK Size) of the new storage block to be saved, replacing the currently working storage container so that the space requirement of the new storage block to be saved is met. In a specific implementation, a new storage container (that is, a free container that contains no storage block) can be created as the updated working CT to accommodate the new storage block to be saved; alternatively, a previously used container that has become a blank container after space reclamation can be selected as the updated working CT.

Step S515, subtracting a space of size CK Size from the available space of the updated working CT and allocating it to the new storage block to be saved. Other operations are similar to step S513 and are not described herein again.

Step S516, since the updated working CT has no record in the CK2C table, adding/inserting a new index in the CK2C table to record the updated working CT, where the key of the new index is the block number CKID_new of the new storage block to be saved, and the value is the container number CTID of the updated working CT. In a specific implementation, the new index is preferably inserted so that the indexes in the CK2C table remain ordered by key (CKID).

Step S517, finally returning a virtual address, that is, the block number CKID_new, for the new storage block to be saved. The block numbers of the m storage blocks in the embodiment of FIG. 3 thus form the virtual address of the data to be saved.

It should be noted that, to make the solution more complete, an operation may preferably be added after step S513 to determine whether the currently working storage container (working CT) already contains a storage block. If it does, the CK2C table already has an index for the currently working storage container, and step S517 is executed directly. If the currently working storage container is a free container, that is, it contains no storage block, the CK2C table has no index for it, and an index for the currently working storage container needs to be inserted into the CK2C table, with the block number CKID_new of the new storage block to be saved as the key and the CTID of the currently working storage container as the value. This case is unusual, since the working storage container is generally not a free container, but it cannot be ruled out: for example, when the system starts working, a new free container is created as the working CT, and a request for a VA then arrives.
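The flow of FIG. 5 can be summarized in a short Python sketch. It is an illustration under assumptions, not the embodiment itself: the class name `Ck2cAllocator`, the integer CTIDs, and the rule that a fresh container is always created (rather than a reclaimed one being reused) are all hypothetical simplifications.

```python
class Ck2cAllocator:
    """Hypothetical model of steps S510 to S517 with linearly increasing CKIDs."""

    def __init__(self, ct_size):
        self.ct_size = ct_size    # capacity of every container (assumed uniform)
        self.ck2c = []            # (smallest CKID, CTID) pairs, sorted by key
        self.next_ckid = 1        # CKIDs increase linearly and are never reused
        self.next_ctid = 1
        self.working_ctid = None  # no working container exists at start-up
        self.free_space = 0

    def save_block(self, ck_size):
        # S510: a new storage block of size ck_size arrives.
        if ck_size > self.ct_size:
            raise ValueError("block larger than a container")
        ckid = self.next_ckid
        self.next_ckid += 1
        # S511 / S512: can the working container accommodate the block?
        needs_new_ct = self.working_ctid is None or self.free_space < ck_size
        if needs_new_ct:
            # S514: replace the working container with a fresh one.
            self.working_ctid = self.next_ctid
            self.next_ctid += 1
            self.free_space = self.ct_size
        # S513 / S515: take the space out of the working container.
        self.free_space -= ck_size
        if needs_new_ct:
            # S516: the new container has no index yet. Keys only grow,
            # so appending keeps the table ordered by key.
            self.ck2c.append((ckid, self.working_ctid))
        # S517: the block number is returned as the virtual address.
        return ckid


allocator = Ck2cAllocator(ct_size=4)
virtual_addresses = [allocator.save_block(1) for _ in range(6)]
```

After six one-unit blocks the table holds two indexes, (1, 1) and (5, 2). The start-up case, where the first block meets a container that has no index yet, is handled by the same branch that creates a container.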

FIG. 6A is an exemplary flowchart of one implementation of a disk defragmentation method according to an embodiment of the present invention. The method uses the CK2C table shown in FIG. 4 to manage and record the spatial location information of the data, which is only an example and not a limitation. In this method, the CKID is configured to increase linearly and is never reused (the CKID could also be configured to decrease linearly without reuse; linear increase is chosen here only as an example), and the key of each index recorded in the CK2C table is uniformly configured as the smallest CKID in the storage container CT corresponding to that index. It should be noted that although the flow of the method described below includes multiple operations occurring in a particular order, it should be clearly understood that the flow may include more operations or be combined into fewer operations, and that the operations may be executed sequentially or in parallel (for example, using a parallel processor or a multi-threaded environment). As shown in FIG. 6A, the method includes the following steps:

Step S611, receiving a disk defragmentation request. In a specific implementation, the disk defragmentation request may be initiated periodically or after space reclamation. After space reclamation it is already known which CKs in each CT are garbage (that is, will not be used by the system again), so defragmentation does not need to scan the entire file system and can work directly at the level of storage blocks, which keeps the logic simple. This is not intended to limit the embodiments of the present invention.

Step S612, obtaining a batch of CTs to be collated according to the CK2C table, where a CT to be collated is a storage container, among all the storage containers indicated in the CK2C table, that includes garbage storage blocks. From these, at least two groups of storage containers to be collated with consecutive virtual addresses are further determined, where each group includes k storage containers to be collated whose k indexes are logically consecutive in the correspondence between storage blocks and storage containers, that is, no index of another storage container lies between the k indexes. For example, CT1, CT2, CT3, and CT4 in FIG. 6B are a group of storage containers with consecutive virtual addresses, and CK1, CK6, CK10, and CK16 are four consecutive indexes; CT1, CT3, and CT4 are not such a group, because CK1, CK10, and CK16 are not consecutive indexes: CK6 lies between them. At least two groups need to be determined because there may be storage containers that contain no garbage storage blocks. Such containers are not treated as storage containers to be collated, so gaps can appear between the virtual addresses of the storage containers to be collated. In this case the containers need to be processed in groups to ensure that the range of storage blocks accommodated by each new storage container obtained after defragmentation does not intersect the range accommodated by any other storage container. Scanning and grouping may proceed in order of key size. The following takes CT1, CT2, CT3, and CT4 in FIG. 6B as an example.

Step S613, scanning the non-garbage CKs in each CT one by one in order of the key values of the four CTs to be collated (CT1, CT2, CT3, CT4), that is, in the order CT1→CT2→CT3→CT4.

Step S614, determining whether the scanned non-garbage CKs fill a storage container (that is, there is no space left for the next non-garbage CK) or whether all the CTs to be collated (CT1, CT2, CT3, CT4) have been scanned. If yes, go to step S615; if no, go back to step S613. As shown in FIG. 6B, when CK8 is scanned, CK1 to CK8 are found to fill a storage container, and step S615 is then performed.

Step S615, when the scanned non-garbage CKs fill a storage container or all the CTs to be collated (CT1, CT2, CT3, CT4) have been scanned, applying for a new CT (a free container), for example the newly created storage container CT5 in FIG. 6B.

Step S616, migrating the scanned non-garbage CKs to the new CT. As shown in FIG. 6B, CK1 to CK8 are migrated into the new CT5.

Step S617, inserting/adding an index in the CK2C table to record the correspondence between the storage blocks in the new CT and the new CT. As shown in FIG. 6B, an index <CKID1, CTID5> is inserted in the CK2C table.

Step S618, determining whether the scan is complete. If it is, step S619 is performed; if not, the process returns to step S613.

In step S619, when all the storage containers to be sorted are scanned, the defragmentation operation is ended. As shown in Figure 6B, CT4 has all been scanned.
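Steps S613 to S619 can be sketched in Python for one group of logically consecutive containers. The function name `defragment`, the data layout, the assumption that all blocks have the same size, and the garbage pattern in the example are hypothetical; the patent's FIG. 6B is only imitated in its outcome (two new containers, CT5 and CT6, with the first new index keyed by CKID1).

```python
def defragment(ck2c, containers, capacity, next_ctid):
    """Repack the non-garbage blocks of one group of containers.

    ck2c: dict of smallest CKID -> CTID for the containers to be collated.
    containers: dict of CTID -> list of (ckid, is_garbage), in CKID order.
    capacity: blocks per container (equal block sizes are assumed).
    next_ctid: identifier to give to the first newly created container.
    Returns (new_ck2c, new_containers).
    """
    new_ck2c, new_containers, batch = {}, {}, []

    def flush():
        nonlocal next_ctid
        if not batch:
            return
        new_containers[next_ctid] = list(batch)  # S615, S616: new CT, migrate
        new_ck2c[batch[0]] = next_ctid           # S617: key is the smallest CKID
        next_ctid += 1
        batch.clear()

    for key in sorted(ck2c):                     # S613: scan in key order
        for ckid, is_garbage in containers[ck2c[key]]:
            if is_garbage:
                continue
            batch.append(ckid)
            if len(batch) == capacity:           # S614: a container is full
                flush()
    flush()                                      # S614: every CT has been scanned
    return new_ck2c, new_containers              # S619: defragmentation ends


# Illustrative group of four containers; True marks a garbage block.
old_ck2c = {1: 1, 6: 2, 10: 3, 16: 4}
old_containers = {
    1: [(1, False), (2, True), (3, False), (4, True)],
    2: [(6, False), (7, True), (8, False), (9, True)],
    3: [(10, False), (11, True), (12, True), (13, True)],
    4: [(16, True), (17, False), (18, False), (19, True)],
}
new_ck2c, new_containers = defragment(old_ck2c, old_containers, capacity=4, next_ctid=5)
```

Because blocks are visited in increasing CKID order, each new container receives a contiguous slice of the surviving CKIDs, so the ranges of the new containers cannot intersect and one index per container remains sufficient.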

FIG. 6B is a schematic diagram of a disk defragmentation method according to an embodiment of the invention. The method uses the CK2C table shown in FIG. 4 to manage and record the spatial location information of the data, which is only an example and not a limitation. In this method, the CKID is configured to increase linearly and is never reused (the CKID could also be configured to decrease linearly without reuse; linear increase is chosen here only as an example), and the key of each index recorded in the CK2C table is uniformly configured as the smallest CKID in the storage container CT corresponding to that index. FIG. 6B applies the disk defragmentation method of FIG. 6A and finally migrates the non-garbage CKs in the four storage containers to be collated (the original containers CT1, CT2, CT3, and CT4) into the new storage container CT5 and the new storage container CT6. After defragmentation, the CK2C table that records the correspondence between storage blocks and storage containers has only two indexes, corresponding to the new storage containers CT5 and CT6 respectively. It can be seen that the number of indexes in the CK2C table stays dynamically consistent with the number of storage containers and does not grow as the system runs, which ultimately reduces the cost of mapping virtual addresses to physical addresses.

FIG. 7 is a schematic diagram showing the logical structure of a storage manager 700 according to an embodiment of the invention. The storage manager 700 can be, but is not limited to, the storage manager 210 of FIG. 2A or the storage manager A of FIG. 2B, and can perform, but is not limited to performing, the methods described in FIGS. 3, 5, and 6A. As shown in FIG. 7, the storage manager 700 includes a storage block management module 710, a storage container management module 720, and a recording module 730.

The storage block management module 710 is configured to allocate, each time a data save request is received, m storage blocks for the data to be saved this time, where each storage block represents a segment of virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1;

The storage container management module 720 is configured to specify n storage containers for the m storage blocks, where each storage container represents a segment of physical storage space on the storage device, and n is a natural number greater than or equal to 1;

The recording module 730 is configured to update the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, where the correspondence between storage blocks and storage containers records the correspondence between each allocated storage block and the storage container accommodating it; the recording module 730 is further configured to record the block numbers of the m storage blocks in the metadata of the data to be saved this time, the block numbers of the m storage blocks serving as the virtual address of that data.

Optionally, in order to facilitate recording and management, each storage container is configured with a unique identifier, and the identifier of each storage container is used to indicate a physical address corresponding to each storage container.

Optionally, when each storage container is configured with a unique identifier, the correspondence between storage blocks and storage containers recorded by the recording module 730 includes a plurality of indexes. Each index indicates where all the storage blocks assigned to the same storage container point; the key of each index is a representative value of the block numbers of the storage blocks accommodated by that storage container, and the value of each index is the identifier of that storage container.

Optionally, the block numbers of the m storage blocks allocated each time by the storage block management module 710 are configured to increase linearly: the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the previously saved data, or the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the data to be saved next time;

The recording module 730 is specifically configured to record the representative value in each index as a minimum block number of a storage block accommodated by the same storage container.
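When each key is the minimum block number held by a container and block numbers only increase, resolving a block number to its container reduces to a "largest key not greater than the block number" search. The sketch below illustrates this under stated assumptions: the table is modelled as a Python list of `(key, container_id)` pairs kept in ascending key order, and `find_container` is a hypothetical helper name, not something defined by the embodiment.

```python
from bisect import bisect_right


def find_container(ck2c, block_number):
    """Return the container identifier whose key range covers block_number.

    ck2c: list of (minimum block number, container id) pairs, ascending by key.
    Raises KeyError if block_number is smaller than every key.
    """
    keys = [key for key, _ in ck2c]
    # Position of the last key that is <= block_number.
    pos = bisect_right(keys, block_number) - 1
    if pos < 0:
        raise KeyError(block_number)
    return ck2c[pos][1]
```

Note that the index alone cannot tell whether a block number beyond the last key was ever allocated; in this sketch that check is assumed to be left to the metadata of the container that the lookup returns.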

Preferably, the storage container management module 720 is specifically configured to perform the following operations:

a. Obtain the currently working storage container and assign the m storage blocks to it one by one in order from the smallest block number to the largest; determine whether the currently working storage container was a free storage container before accommodating the storage block with the smallest block number among the m storage blocks, and if so, notify the recording module 730 to add an index to the correspondence between storage blocks and storage containers, where the key of the added index is the smallest block number of the m storage blocks and the value of the added index is the identifier of the currently working storage container;

b. When the currently working storage container becomes full, obtain an updated working storage container, which is a free storage container, and assign the remaining storage blocks of the m storage blocks to it one by one in order from the smallest block number to the largest; notify the recording module 730 again to add a further index to the correspondence between storage blocks and storage containers, where the key of the further index is the smallest block number of the remaining storage blocks of the m storage blocks and the value of the further index is the identifier of the updated working storage container;

The recording module 730 is specifically configured to perform an operation of adding an index in a correspondence between the storage block and the storage container when receiving the notification of adding an index sent by the storage container management module 720.
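Operations a and b, together with the index-adding behaviour of the recording module, can be sketched in a few lines. This is a minimal sketch under several assumptions that the embodiment does not prescribe: the class name `Allocator`, a fixed per-container capacity counted in blocks, numeric container identifiers handed out in sequence, and an in-memory list standing in for the correspondence table.

```python
class Allocator:
    """Toy model of block allocation with linearly increasing block numbers."""

    def __init__(self, capacity):
        self.capacity = capacity   # blocks per container (assumed fixed)
        self.next_block = 1        # block numbers grow and are never reused
        self.next_container = 1    # hypothetical container identifier source
        self.current = None        # currently working container
        self.used = 0              # blocks already placed in self.current
        self.index = []            # (minimum block number, container id)

    def _open_container(self):
        # Obtain a free container and make it the working container.
        self.current = self.next_container
        self.next_container += 1
        self.used = 0

    def save(self, m):
        """Allocate m blocks for one save request and return their numbers."""
        blocks = list(range(self.next_block, self.next_block + m))
        self.next_block += m
        if self.current is None:
            self._open_container()
        for block in blocks:                  # smallest block number first
            if self.used == self.capacity:    # operation b: container is full
                self._open_container()
            if self.used == 0:
                # The container was free before this block: add one index
                # keyed by the smallest block number it will hold.
                self.index.append((block, self.current))
            self.used += 1
        return blocks
```

A request that fits in the working container adds no index at all, so the table grows with the number of containers rather than with the number of blocks or requests.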

Preferably, the block numbers of the m storage blocks allocated each time by the storage block management module 710 are configured to decrease linearly: the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the data to be saved next time, or the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the previously saved data;

The recording module 730 is specifically configured to record the representative value in each index as the maximum block number of the storage block accommodated by the same storage container.

In this case, the storage container management module 720 is specifically configured to perform the following operations:

a. Obtain the currently working storage container and assign the m storage blocks to it one by one in order from the largest block number to the smallest; determine whether the currently working storage container was a free storage container before accommodating the storage block with the largest block number among the m storage blocks, and if so, notify the recording module 730 to add an index to the correspondence between storage blocks and storage containers, where the key of the added index is the largest block number of the m storage blocks and the value of the added index is the identifier of the currently working storage container;

b. When the currently working storage container becomes full, obtain an updated working storage container, which is a free storage container, and assign the remaining storage blocks of the m storage blocks to it one by one in order from the largest block number to the smallest; notify the recording module 730 again to add a further index to the correspondence between storage blocks and storage containers, where the key of the further index is the largest block number of the remaining storage blocks of the m storage blocks and the value of the further index is the identifier of the updated working storage container;
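For the linearly decreasing variant, each key is the maximum block number held by a container, so a lookup becomes a "smallest key not less than the block number" search. The following sketch mirrors the increasing case; the list-of-pairs table and the helper name `find_container_desc` are assumptions made for illustration only.

```python
from bisect import bisect_left


def find_container_desc(ck2c, block_number):
    """Return the container identifier for block_number when keys are maxima.

    ck2c: list of (maximum block number, container id) pairs, ascending by key.
    Raises KeyError if block_number is larger than every key.
    """
    keys = [key for key, _ in ck2c]
    # Position of the first key that is >= block_number.
    pos = bisect_left(keys, block_number)
    if pos == len(keys):
        raise KeyError(block_number)
    return ck2c[pos][1]
```

Because block numbers shrink over time in this variant, newly added indexes have the smallest keys; a real table would therefore insert at the front of the ordering, which this sketch leaves to the caller.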

Optionally, the storage manager 700 further includes a defragmentation module (not shown), where the defragmentation module is configured to:

FIG. 8 is a schematic diagram showing the logical structure of a computer 800 according to an embodiment of the present invention. The computer 800 of the embodiment of the present invention may include:

A processor 801, a memory 802, a system bus 803, and a communication interface 804. The processor 801, the memory 802, and the communication interface 804 are connected via the system bus 803 and communicate with one another.

The processor 801 may be a single-core or multi-core central processing unit, an application-specific integrated circuit, or one or more integrated circuits configured to implement embodiments of the present invention.

The memory 802 may be a high speed RAM memory or a non-volatile memory such as at least one disk memory.

The memory 802 is used to store computer-executable instructions (not shown). Specifically, the computer-executable instructions may include program code 805.

When the computer is running, the processor 801 executes the computer-executable instructions and can perform the method flow described in any one of FIG. 3, FIG. 5, or FIG. 6A.

Those of ordinary skill in the art will appreciate that various aspects of the present invention, or possible implementations of various aspects, may be embodied as a system, method, or computer program product. Thus, aspects of the invention, or possible implementations of various aspects, may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, etc.), or an embodiment combining software and hardware aspects, all of which are collectively referred to herein as "circuits," "modules," or "systems." Furthermore, aspects of the invention, or possible implementations of various aspects, may take the form of a computer program product, which refers to computer readable program code stored in a computer readable medium.

The computer readable medium can be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium includes, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing, such as random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, or portable compact disc read-only memory (CD-ROM).

The processor in the computer reads the computer readable program code stored in the computer readable medium, so that the processor can perform the functional actions specified in each step, or combination of steps, of the flowcharts, and so that an apparatus is produced that implements the functional actions specified in each block, or combination of blocks, of the block diagrams.

The computer readable program code can execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. It should also be noted that in some alternative implementations, the functions noted in the various steps of the flowcharts or in the blocks of the block diagrams may not occur in the order noted. For example, two steps, or two blocks, shown in succession may be executed substantially concurrently, or the blocks may be executed in the reverse order.

Those of ordinary skill in the art will appreciate that the elements and algorithm steps of the various examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the solution. A person skilled in the art can use different methods for implementing the described functions for each particular application, but such implementation should not be considered to be beyond the scope of the present invention.

For the purpose of explanation, the foregoing description has been made with reference to specific embodiments. However, the above illustrative description is not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and changes are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention. Any modifications, equivalent substitutions, improvements, etc. made within the spirit and scope of the present invention are intended to be included within the scope of the present invention.

Claims

A method for managing stored data, the method comprising:

Each time a data save request is received, m storage blocks are allocated for the data to be saved this time, wherein each storage block represents a segment of virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1;

Specifying n storage containers for the m storage blocks, wherein each storage container represents a piece of physical storage space on the storage device, and n is a natural number greater than or equal to 1;

Updating the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, wherein the correspondence between storage blocks and storage containers records the correspondence between each allocated storage block and the storage container accommodating it;

Recording the block numbers of the m storage blocks in the metadata of the file in which the data to be saved this time is located, wherein the block numbers of the m storage blocks are used as the virtual address of the data to be saved this time.
The method according to claim 1, wherein each of the storage containers is configured with a unique identifier, and the identifier of each of the storage containers is used to indicate a physical address corresponding to each of the storage containers.
The method according to claim 2, wherein the correspondence between storage blocks and storage containers comprises a plurality of indexes, wherein each index indicates where all storage blocks assigned to the same storage container point, the key of each index is a representative value of the block numbers of all storage blocks accommodated by the same storage container, and the value of each index is the identifier of the same storage container.
The method according to claim 3, wherein the block numbers of the m storage blocks allocated each time are configured to increase linearly, and the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the previously saved data, or the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the data to be saved next time;

The representative value in each index is the smallest block number of the storage block accommodated by the same storage container.
The method of claim 4 wherein:

Specifying the n storage containers for the m storage blocks, and updating the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, comprises:

a. Obtaining the currently working storage container, and assigning the m storage blocks to it one by one in order from the smallest block number to the largest;

b. Determining whether the currently working storage container was a free storage container before accommodating the storage block with the smallest block number among the m storage blocks, and if so, adding an index to the correspondence between storage blocks and storage containers, wherein the key of the added index is the smallest block number of the m storage blocks and the value of the added index is the identifier of the currently working storage container;

c. When the currently working storage container is full, obtaining an updated working storage container, which is a free storage container, and assigning the remaining storage blocks of the m storage blocks to it one by one in order from the smallest block number to the largest;

d. Adding a further index to the correspondence between storage blocks and storage containers, wherein the key of the further index is the smallest block number of the remaining storage blocks of the m storage blocks and the value of the further index is the identifier of the updated working storage container;

e. When the updated working storage container is full, returning to step c until all m storage blocks have been assigned to the n storage containers.
The method according to claim 3, wherein the block numbers of the m storage blocks allocated each time are configured to decrease linearly, and the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the data to be saved next time, or the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the previously saved data;

The representative value in each index is the largest block number of the storage block accommodated in the same storage container.
The method of claim 6 wherein:

Specifying the n storage containers for the m storage blocks, and updating the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, comprises:

a. Obtaining the currently working storage container, and assigning the m storage blocks to it one by one in order from the largest block number to the smallest;

b. Determining whether the currently working storage container was a free storage container before accommodating the storage block with the largest block number among the m storage blocks, and if so, adding an index to the correspondence between storage blocks and storage containers, wherein the key of the added index is the largest block number of the m storage blocks and the value of the added index is the identifier of the currently working storage container;

c. When the currently working storage container is full, obtaining an updated working storage container, which is a free storage container, and assigning the remaining storage blocks of the m storage blocks to it one by one in order from the largest block number to the smallest;

d. Adding a further index to the correspondence between storage blocks and storage containers, wherein the key of the further index is the largest block number of the remaining storage blocks of the m storage blocks and the value of the further index is the identifier of the updated working storage container;

e. When the updated working storage container is full, returning to step c until all m storage blocks have been assigned to the n storage containers.
The method of any of claims 4-7, further comprising:

Receiving a defragmentation instruction and determining the storage containers to be defragmented, wherein the storage containers to be defragmented are the storage containers indicated by the correspondence between storage blocks and storage containers;

Scanning the storage containers to be defragmented, and obtaining the non-garbage storage blocks contained in each of them;

Assigning new storage containers to the non-garbage storage blocks, and updating the correspondence between storage blocks and storage containers, wherein the block numbers of the non-garbage storage blocks accommodated by each new storage container increase or decrease linearly, and the range of block numbers of the storage blocks accommodated by each new storage container does not intersect the range of block numbers of the storage blocks accommodated by any other new storage container.
The method of any of claims 4-7, further comprising:

Receiving a defragmentation instruction and determining the storage containers to be defragmented, wherein the storage containers to be defragmented are those storage containers that contain garbage storage blocks among all storage containers indicated by the correspondence between storage blocks and storage containers;

Determining at least two groups of storage containers to be defragmented whose virtual addresses are consecutive, wherein each group comprises k storage containers to be defragmented, the k storage containers correspond to k logically consecutive indexes in the correspondence between storage blocks and storage containers, and k is a natural number greater than or equal to 2;

Assigning new storage containers to the non-garbage storage blocks contained in each group of storage containers to be defragmented, and updating the correspondence between storage blocks and storage containers, wherein the block numbers of the non-garbage storage blocks accommodated by each new storage container increase or decrease linearly, and the range of block numbers of the storage blocks accommodated by each new storage container does not intersect the range of block numbers of the storage blocks accommodated by any other new storage container.
The method of any of claims 2-9, wherein the method further comprises:

After receiving a data read request, querying, according to the information about the data to be read carried in the data read request, the file metadata of the file in which the data to be read is located, and obtaining the virtual address of the data to be read, wherein the virtual address of the data to be read includes the block numbers of p storage blocks, and p is a natural number greater than or equal to 1;

Querying the correspondence between storage blocks and storage containers according to the block numbers of the p storage blocks, and determining the q storage containers accommodating the p storage blocks, where q is a natural number greater than or equal to 1;

Reading the metadata of the q storage containers and determining the physical address information of the p storage blocks, wherein the metadata of each storage container describes all storage blocks in that storage container.
A storage manager, characterized in that it is applied to a storage system, the storage system comprising a storage device and the storage manager, the storage device comprising a storage medium for providing a physical address space, and the storage manager being configured to receive a data save request triggered by an application and forward the data to be saved to the storage device for saving; the storage manager includes:

a storage block management module, configured to allocate, each time a data save request is received, m storage blocks for the data to be saved this time, where each storage block represents a segment of virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1;

a storage container management module, configured to specify n storage containers for the m storage blocks, where each storage container represents a segment of physical storage space on the storage device, and n is a natural number greater than or equal to 1;

a recording module, configured to update the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, where the correspondence between storage blocks and storage containers records the correspondence between each allocated storage block and the storage container accommodating it; and

the recording module is further configured to record the block numbers of the m storage blocks in the metadata of the file in which the data to be saved this time is located, the block numbers of the m storage blocks being used as the virtual address of the data to be saved this time.
The storage manager according to claim 11, wherein each of the storage containers is configured with a unique identifier, and the identifier of each storage container is used to indicate the physical address corresponding to that storage container.
The storage manager according to claim 12, wherein

The correspondence between storage blocks and storage containers recorded by the recording module includes a plurality of indexes, each index indicates where all storage blocks assigned to the same storage container point, the key of each index is a representative value of the block numbers of the storage blocks accommodated by the same storage container, and the value of each index is the identifier of the same storage container.
The storage manager according to claim 13, wherein the block numbers of the m storage blocks allocated each time by the storage block management module are configured to increase linearly, and the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the previously saved data, or the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the data to be saved next time;

The recording module is specifically configured to record the representative value in each index as a minimum block number of a storage block accommodated by the same storage container.
The storage manager according to claim 14, wherein the storage container management module is specifically configured to perform the following operations:

a. Obtaining the currently working storage container and assigning the m storage blocks to it one by one in order from the smallest block number to the largest; determining whether the currently working storage container was a free storage container before accommodating the storage block with the smallest block number among the m storage blocks, and if so, notifying the recording module to add an index to the correspondence between storage blocks and storage containers, wherein the key of the added index is the smallest block number of the m storage blocks and the value of the added index is the identifier of the currently working storage container;

b. When the currently working storage container is full, obtaining an updated working storage container, which is a free storage container, and assigning the remaining storage blocks of the m storage blocks to it one by one in order from the smallest block number to the largest; notifying the recording module again to add a further index to the correspondence between storage blocks and storage containers, wherein the key of the further index is the smallest block number of the remaining storage blocks of the m storage blocks and the value of the further index is the identifier of the updated working storage container;

c. When the updated working storage container is full, performing step b again until all m storage blocks have been assigned to the n storage containers;

The recording module is specifically configured to perform an operation of adding an index in a correspondence between the storage block and the storage container when receiving a notification of adding an index sent by the storage container management module.
The storage manager according to claim 13, wherein the block numbers of the m storage blocks allocated each time by the storage block management module are configured to decrease linearly, and the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the data to be saved next time, or the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the previously saved data;

The recording module is specifically configured to record the representative value in each index as a maximum block number of a storage block accommodated by the same storage container.
The storage manager according to claim 16, wherein the storage container management module is specifically configured to perform the following operations:

a. Obtaining the currently working storage container and assigning the m storage blocks to it one by one in order from the largest block number to the smallest; determining whether the currently working storage container was a free storage container before accommodating the storage block with the largest block number among the m storage blocks, and if so, notifying the recording module to add an index to the correspondence between storage blocks and storage containers, wherein the key of the added index is the largest block number of the m storage blocks and the value of the added index is the identifier of the currently working storage container;

b. When the currently working storage container is full, obtaining an updated working storage container, which is a free storage container, and assigning the remaining storage blocks of the m storage blocks to it one by one in order from the largest block number to the smallest; notifying the recording module again to add a further index to the correspondence between storage blocks and storage containers, wherein the key of the further index is the largest block number of the remaining storage blocks of the m storage blocks and the value of the further index is the identifier of the updated working storage container;

c. When the updated working storage container is full, performing step b again until all m storage blocks have been assigned to the n storage containers;

The recording module is specifically configured to perform an operation of adding an index in a correspondence between the storage block and the storage container when receiving a notification of adding an index sent by the storage container management module.
The storage manager according to any one of claims 14-17, wherein the storage manager further comprises a defragmentation module, and the defragmentation module is configured to:

Receiving a defragmentation instruction and determining the storage containers to be defragmented, wherein the storage containers to be defragmented are the storage containers indicated by the correspondence between storage blocks and storage containers;

Scanning the storage containers to be defragmented, and obtaining the non-garbage storage blocks contained in each of them;

Assigning new storage containers to the non-garbage storage blocks, and notifying the recording module to update the correspondence between storage blocks and storage containers, wherein the block numbers of the non-garbage storage blocks accommodated by each new storage container increase or decrease linearly, and the range of block numbers of the storage blocks accommodated by each new storage container does not intersect the range of block numbers of the storage blocks accommodated by any other new storage container.
The storage manager according to any one of claims 14-17, wherein the storage manager further comprises a defragmentation module, and the defragmentation module is configured to:

Receive a defragmentation instruction and determine the storage containers to be defragmented, wherein the storage containers to be defragmented are those storage containers that include garbage storage blocks among all storage containers indicated by the correspondence between the storage block and the storage container;

Determine at least two groups of storage containers to be defragmented whose virtual addresses are consecutive, wherein each group comprises k storage containers to be defragmented, the k storage containers correspond to k logically consecutive indexes in the correspondence between the storage block and the storage container, and k is a natural number greater than or equal to 2;

Assign new storage containers to the non-garbage storage blocks included in each group of storage containers to be defragmented, and notify the recording module to update the correspondence between the storage block and the storage container, wherein the block numbers of the non-garbage storage blocks accommodated by each new storage container are linearly increasing or linearly decreasing, and the range of block numbers of the storage blocks accommodated by each new storage container does not intersect the range of block numbers of the storage blocks accommodated by any other new storage container.
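As a non-limiting illustration, the grouping of storage containers by logically consecutive indexes can be sketched in Python as follows. The function name `consecutive_groups`, the representation of the correspondence as a dictionary, and the choice to return every maximal run of at least k containers are assumptions of this sketch, being one possible reading of the grouping step.

```python
def consecutive_groups(index, dirty, k=2):
    """Group containers holding garbage whose index entries are adjacent.

    index : dict, index key (representative block number) -> container id
    dirty : set of container ids that include garbage storage blocks
    k     : minimum number of containers per group (k >= 2)
    """
    groups, run = [], []
    for key in sorted(index):        # walk the indexes in logical order
        cid = index[key]
        if cid in dirty:
            run.append(cid)          # extend the current consecutive run
        else:
            if len(run) >= k:
                groups.append(run)
            run = []                 # a clean container breaks the run
    if len(run) >= k:
        groups.append(run)
    return groups
```

Each returned group could then be repacked on its own, so that live blocks with neighbouring block numbers end up in the same new storage containers.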
A storage system, characterized in that the storage system includes a storage device and a storage manager;

The storage device includes a storage medium for providing a physical address space to save data;

The storage manager is configured to: each time a data saving request is received, allocate m storage blocks for the data to be saved, where each storage block represents a segment of virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1; and specify n storage containers for the m storage blocks, wherein each storage container represents a segment of physical storage space on the storage device, and n is a natural number greater than or equal to 1;

Update the correspondence between the storage block and the storage container according to the correspondence between the m storage blocks and the n storage containers, where the correspondence between the storage block and the storage container records the correspondence between each allocated storage block and the storage container accommodating that allocated storage block;

Record the block numbers of the m storage blocks in the metadata of the file in which the data to be saved is located, where the block numbers of the m storage blocks are used as the virtual address of the data to be saved.
The storage system according to claim 20, wherein each of the storage containers is configured with a unique identifier, and the identifier of each storage container is used to indicate a physical address corresponding to each of the storage containers.
The storage system according to claim 21, wherein the storage manager is configured to record a correspondence between the storage block and the storage container, and specifically includes:

The correspondence between the storage block and the storage container recorded by the storage manager includes multiple indexes, where each index represents a pointer for all storage blocks assigned to the same storage container, the key of each index is a representative value of the block numbers of the storage blocks accommodated by that storage container, and the value of each index is the identifier of that storage container.
The storage system according to claim 22, wherein the storage manager is configured to configure the block numbers of the m storage blocks allocated each time to be linearly increasing, and to configure the minimum block number of the m storage blocks to be greater than the maximum block number of the storage blocks configured for the previous data to be saved, or the maximum block number of the m storage blocks to be smaller than the minimum block number of the storage blocks configured for the subsequent data to be saved;

The storage manager is specifically configured to record the representative value in each index as a minimum block number of a storage block accommodated by the same storage container.
The storage system according to claim 23, wherein the storage manager is specifically configured to perform the following operations:

a. Obtain the currently working storage container, and assign the m storage blocks one by one to the currently working storage container in order from the smallest block number to the largest;

b. Determine whether the currently working storage container is an idle storage container before it accommodates the storage block with the smallest block number among the m storage blocks, and if the currently working storage container is an idle storage container, add an index to the correspondence between the storage block and the storage container, where the key of the added index is the smallest block number of the m storage blocks, and the value of the added index is the identifier of the currently working storage container;

c. When the currently working storage container is a full storage container, obtain an updated working storage container, which is an idle storage container, and assign the remaining storage blocks of the m storage blocks one by one to the updated working storage container in order from the smallest block number to the largest;

d. Add a further index to the correspondence between the storage block and the storage container, where the key of the further index is the smallest block number of the remaining storage blocks of the m storage blocks, and the value of the further index is the identifier of the updated working storage container;

e. When the updated working storage container is a full storage container, return to step c, until the m storage blocks have all been assigned to the n storage containers.
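As a non-limiting illustration, steps a to e can be sketched in Python as follows. The class name `Allocator`, the fixed `CONTAINER_CAPACITY`, and the sequential container identifiers are assumptions made for this sketch only; the claims do not prescribe a container size or an identifier scheme.

```python
CONTAINER_CAPACITY = 4  # assumed number of storage blocks per storage container


class Allocator:
    def __init__(self):
        self.next_block = 0         # next unused block number (ascending)
        self.next_container_id = 0  # hypothetical sequential container identifiers
        self.current = None         # identifier of the working container
        self.used = 0               # blocks already held by the working container
        self.index = {}             # key: smallest block number -> container id

    def _new_container(self):
        # Obtain an idle storage container as the updated working container.
        self.current = self.next_container_id
        self.next_container_id += 1
        self.used = 0

    def save(self, m):
        """Allocate m ascending block numbers and place them into containers."""
        blocks = list(range(self.next_block, self.next_block + m))
        self.next_block += m
        for block in blocks:                      # step a: small to large
            if self.current is None or self.used == CONTAINER_CAPACITY:
                self._new_container()             # step c: working container full
            if self.used == 0:
                self.index[block] = self.current  # steps b and d: add an index
            self.used += 1
        return blocks
```

With this sketch, saving 3 blocks and then 6 blocks yields three index entries, keyed by block numbers 0, 4 and 8, so the index holds one entry per container rather than one per block.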
The storage system according to claim 22, wherein the storage manager is configured to configure the block numbers of the m storage blocks allocated each time to be linearly decreasing, and to configure the minimum block number of the m storage blocks to be greater than the maximum block number of the storage blocks configured for the subsequent data to be saved, or the maximum block number of the m storage blocks to be smaller than the minimum block number of the storage blocks configured for the previous data to be saved;

The storage manager is specifically configured to record the representative value in each index as a maximum block number of a storage block accommodated by the same storage container.
The storage system according to claim 25, wherein the storage manager specifies n storage containers for the m storage blocks and records the correspondence between the storage blocks and the storage containers, and is specifically configured to perform the following operations:

a. Obtain the currently working storage container, and assign the m storage blocks one by one to the currently working storage container in order from the largest block number to the smallest;

b. Determine whether the currently working storage container is an idle storage container before it accommodates the storage block with the largest block number among the m storage blocks, and if the currently working storage container is an idle storage container, add an index to the correspondence between the storage block and the storage container, where the key of the added index is the largest block number of the m storage blocks, and the value of the added index is the identifier of the currently working storage container;

c. When the currently working storage container is a full storage container, obtain an updated working storage container, which is an idle storage container, and assign the remaining storage blocks of the m storage blocks one by one to the updated working storage container in order from the largest block number to the smallest;

d. Add a further index to the correspondence between the storage block and the storage container, where the key of the further index is the largest block number of the remaining storage blocks of the m storage blocks, and the value of the further index is the identifier of the updated working storage container;

e. When the updated working storage container is a full storage container, return to step c, until the m storage blocks have all been assigned to the n storage containers.
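As a non-limiting illustration, the descending variant can be sketched in Python as follows. The names `State` and `save_descending`, the starting block number, and the fixed container capacity are assumptions of this sketch; the only point illustrated is that each index key is the largest block number held by its container.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    next_block: int = 1_000_000  # assumed top of the block-number space
    container: int = -1          # working container id; -1 means none yet
    used: int = 0                # blocks held by the working container
    capacity: int = 4            # assumed blocks per container
    index: dict = field(default_factory=dict)  # largest block number -> container id


def save_descending(state, m):
    """Allocate m descending block numbers and place them into containers."""
    blocks = [state.next_block - i for i in range(m)]
    state.next_block -= m        # later requests receive smaller block numbers
    for block in blocks:         # step a: large block number to small
        if state.container < 0 or state.used == state.capacity:
            state.container += 1                  # step c: take an idle container
            state.used = 0
        if state.used == 0:
            state.index[block] = state.container  # steps b and d: add an index
        state.used += 1
    return blocks
```

Starting from `State(next_block=100)`, saving 6 blocks and then 3 blocks produces index entries keyed by 100, 96 and 92.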
The storage system according to any one of claims 23 to 26, wherein the storage manager is further configured to:

Receive a defragmentation instruction and determine the storage containers to be defragmented, wherein the storage containers to be defragmented are the storage containers indicated by the correspondence between the storage block and the storage container;

Scan the storage containers to be defragmented, and obtain the non-garbage storage blocks included in each storage container to be defragmented;

Assign new storage containers to the non-garbage storage blocks, and update the correspondence between the storage block and the storage container, wherein the block numbers of the non-garbage storage blocks accommodated by each new storage container are linearly increasing or linearly decreasing, and the range of block numbers of the storage blocks accommodated by each new storage container does not intersect the range of block numbers of the storage blocks accommodated by any other new storage container.
The storage system according to any one of claims 23 to 26, wherein the storage manager is further configured to:

Receive a defragmentation instruction and determine the storage containers to be defragmented, wherein the storage containers to be defragmented are those storage containers that include garbage storage blocks among all storage containers indicated by the correspondence between the storage block and the storage container;

Determine at least two groups of storage containers to be defragmented whose virtual addresses are consecutive, wherein each group comprises k storage containers to be defragmented, the k storage containers correspond to k logically consecutive indexes in the correspondence between the storage block and the storage container, and k is a natural number greater than or equal to 2;

Assign new storage containers to the non-garbage storage blocks included in each group of storage containers to be defragmented, and update the correspondence between the storage block and the storage container, wherein the block numbers of the non-garbage storage blocks accommodated by each new storage container are linearly increasing or linearly decreasing, and the range of block numbers of the storage blocks accommodated by each new storage container does not intersect the range of block numbers of the storage blocks accommodated by any other new storage container.
The storage system according to any one of claims 22 to 28, wherein the storage manager is further configured to:

After receiving a data read request, query, according to the information about the data to be read carried in the data read request, the metadata of the file in which the data to be read is located, and obtain the virtual address of the data to be read, wherein the virtual address of the data to be read includes the block numbers of p storage blocks, and p is a natural number greater than or equal to 1; query the correspondence between the storage block and the storage container according to the block numbers of the p storage blocks, and determine q storage containers accommodating the p storage blocks, where q is a natural number greater than or equal to 1; and read the metadata of the q storage containers to determine the physical address information of the p storage blocks, wherein the metadata of each storage container describes the information of all storage blocks in that storage container.
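As a non-limiting illustration, the read path can be sketched in Python as follows, assuming the index keys are the smallest block numbers held by each container and are kept in sorted order. The function name `locate`, the parallel-list layout of the index, and the shape of `container_meta` are assumptions of this sketch only.

```python
import bisect


def locate(block_numbers, index_keys, index_values, container_meta):
    """Resolve block numbers to (container id, physical address) pairs.

    index_keys     : sorted list of smallest block numbers, one per container
    index_values   : container ids aligned with index_keys
    container_meta : container id -> {block number: physical address}
    """
    result = {}
    for block in block_numbers:
        # Find the last index whose key is <= block.
        pos = bisect.bisect_right(index_keys, block) - 1
        if pos < 0:
            raise KeyError(block)  # block number precedes every index key
        cid = index_values[pos]
        # The container's own metadata gives the block's physical address.
        result[block] = (cid, container_meta[cid][block])
    return result
```

Each lookup is a binary search over one key per container, after which only the metadata of the containers actually hit needs to be read.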
A storage manager, comprising: an interface for interacting with a storage device, a processor, and a memory, the processor being coupled to the memory via a bus, and the processor exchanging information with the storage device through the interface;

The memory is configured to store computer-executable instructions, and when the storage manager is running, the processor executes the computer-executable instructions stored in the memory to cause the storage manager to perform the method of managing stored data according to any one of claims 1-10.
A computer, comprising: a processor, a memory, a bus, and a communication interface;

The memory is configured to store computer-executable instructions, the processor is coupled to the memory via the bus, and when the computer is running, the processor executes the computer-executable instructions stored in the memory to cause the computer to perform the method of managing stored data according to any one of claims 1-10.
A computer-readable medium, comprising computer-executable instructions which, when executed by a processor of a computer, cause the computer to perform the method of managing stored data according to any one of claims 1-10.