WO2016106757A1 - Method for managing storage data, storage manager and storage system - Google Patents

Method for managing storage data, storage manager and storage system

Info

Publication number
WO2016106757A1
Authority
WO
WIPO (PCT)
Prior art keywords
storage
storage container
block
blocks
container
Prior art date
Application number
PCT/CN2014/096073
Other languages
French (fr)
Chinese (zh)
Inventor
李育国
谯志华
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN201480016987.4A priority Critical patent/CN106462491B/en
Priority to PCT/CN2014/096073 priority patent/WO2016106757A1/en
Publication of WO2016106757A1 publication Critical patent/WO2016106757A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02: Addressing or allocation; Relocation

Definitions

  • the present invention relates to the field of computer technologies, and in particular, to a storage data management method, a storage manager, and a storage system.
  • A file system generally records the location information of data as follows: it records the virtual address of the data and maps the virtual address to a physical address through a mapping table. This approach is logically simple, and the upper layer does not need to understand the underlying layout.
  • After the file system has been running for a period of time and has undergone multiple rounds of space reclamation, disk fragmentation becomes a prominent problem and defragmentation is required. The defragmentation process depends on the layout of the file system. Considering space reclamation and data locality, the file system usually records data/files as storage blocks (a storage block is the smallest or most basic unit of reading and writing data in the file system, and may be named differently in different file systems, for example basic data chunk or data block), and organizes a plurality of storage blocks in the form of a storage container (Container, also referred to as a data segment, Segment).
  • FIG. 1A is a schematic diagram of a CAT index of a file system for recording data location information using a container address translation table CAT in the prior art.
  • The scheme combines the storage container number and the storage block number (<CTID, CKID>) where the data is located as the virtual address of the data, and maps the CT (abbreviation of Container) to a physical address through the container address translation table CAT.
  • From the CAT, the physical address PA1 of the storage container CT1 can be obtained. Each storage container records metadata describing each CK it holds, such as the size of the CK, its check code, and its position in the CT.
  • The physical address of the data <CTID1, CKID2> can then be determined by querying the metadata in CT1.
  • FIG. 1B is a schematic diagram of a CAT index structure after defragmentation in the prior art. After defragmentation, CK4 and CK6 in the original CT2 have been migrated to CT1, and the index information of CT2 in the CAT is modified to the physical address PA1 of CT1. However, because the prior art uses <CTID, CKID> as the virtual address of data, the storage blocks in one CT can only be migrated together to one other CT during defragmentation. As shown in FIG. 1C, if CK4 in CT2 were migrated to CT1 and CK6 in CT2 were migrated to CT3, the physical addresses of CT1 and CT3 would differ, and the physical address of CT2 could no longer be mapped in the CAT.
  • As a result, the way the prior-art file system manages stored data makes defragmentation inflexible and inefficient, and disk space utilization remains low even after defragmentation.
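The prior-art limitation described above can be illustrated with a minimal Python sketch. The dictionary layout, the helper names `resolve` and `migrate_container`, and the 4096-byte offsets are assumptions made for illustration; the text only states that the CAT maps each CTID to one physical address and that per-container metadata records the position of each CK.

```python
# Hypothetical model of the prior-art container address translation table (CAT):
# each CTID maps to exactly one container physical address.
cat = {"CT1": "PA1", "CT2": "PA2", "CT3": "PA3"}

# Per-container metadata keyed by physical address.
# Maps the virtual address (CTID, CKID) to an offset; offsets are illustrative.
container_meta = {
    "PA1": {("CT1", "CK1"): 0, ("CT1", "CK2"): 4096},
    "PA2": {("CT2", "CK4"): 0, ("CT2", "CK6"): 4096},
    "PA3": {("CT3", "CK7"): 0},
}

def resolve(ctid, ckid):
    """Resolve the virtual address <CTID, CKID> to (container address, offset)."""
    pa = cat[ctid]
    return pa, container_meta[pa][(ctid, ckid)]

def migrate_container(src_ctid, dst_ctid):
    """Prior-art style migration: all blocks of src move into dst as a whole."""
    src_pa, dst_pa = cat[src_ctid], cat[dst_ctid]
    # Append after the last block already stored in the destination.
    base = max(container_meta[dst_pa].values(), default=-4096) + 4096
    for i, key in enumerate(sorted(container_meta[src_pa])):
        container_meta[dst_pa][key] = base + i * 4096
    container_meta[src_pa] = {}
    # A CAT entry holds a single address, so src can point at only one target.
    cat[src_ctid] = dst_pa
```

Because `cat["CT2"]` can hold only one physical address, moving CK4 to CT1 and CK6 to CT3 cannot be expressed in this model, which is the restriction FIG. 1C illustrates.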
  • an embodiment of the present invention provides a method for managing stored data, including:
  • each storage block is used to represent a virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1;
  • each storage container represents a piece of physical storage space on the storage device, and n is a natural number greater than or equal to 1;
  • each storage container is configured with a unique identifier, and the identifier of each storage container is used to indicate a physical address corresponding to each storage container.
  • the correspondence between the storage block and the storage container includes multiple indexes, where each index points to all storage blocks assigned to the same storage container, the key of each index is a representative value of the block numbers of all storage blocks accommodated by that storage container, and the value of each index is the identifier of that storage container.
  • the block numbers of the m storage blocks allocated each time are configured to be linearly incremented, and the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the previously saved data, or the maximum block number of the m storage blocks is less than the minimum block number of the storage blocks configured for the data to be saved the next time;
  • the representative value in each index is the smallest block number of the storage block accommodated by the same storage container.
  • specifying the n storage containers for the m storage blocks, and updating the correspondence between the storage block and the storage container according to the correspondence between the m storage blocks and the n storage containers, includes:
  • the key of the added index is the smallest block number of the m storage blocks, and the value of the added index is an identifier of the currently working storage container;
  • the updated working storage container is an idle storage container
  • Step e: when the updated working storage container is a full storage container, return to step c until the m storage blocks have all been assigned to the n storage containers.
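The incrementing variant above can be sketched in Python as follows. The class name `Ck2cIndex`, the fixed per-container capacity counted in blocks, and the `CT<n>` identifier format are assumptions for illustration; the text only requires that block numbers increase, that an index is added when a free container receives its first block, and that a new working container is taken when the current one is full.

```python
class Ck2cIndex:
    """Sketch of a storage-block-to-container index with incrementing block numbers."""

    def __init__(self, capacity):
        self.capacity = capacity        # blocks per container (assumed fixed)
        self.next_block = 1             # next block number to hand out
        self.next_container = 1
        self.working_id = self._new_container()
        self.working_used = 0           # blocks already in the working container
        self.keys = []                  # smallest block number per container
        self.values = []                # matching container identifiers

    def _new_container(self):
        cid = f"CT{self.next_container}"
        self.next_container += 1
        return cid

    def save(self, m):
        """Allocate m blocks, assign them to containers, and return their numbers."""
        blocks = list(range(self.next_block, self.next_block + m))
        self.next_block += m
        for ck in blocks:               # smallest block number first
            if self.working_used == self.capacity:
                # Working container is full: switch to a fresh one.
                self.working_id = self._new_container()
                self.working_used = 0
            if self.working_used == 0:
                # Container was free before this block: add an index whose
                # key is the smallest block number it will hold.
                self.keys.append(ck)
                self.values.append(self.working_id)
            self.working_used += 1
        return blocks                   # recorded as the data's virtual address
```

With a capacity of four blocks, two saves of three blocks each leave two indexes, one keyed by block 1 and one keyed by block 5, because the second save spills into a new container after block 4.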
  • the block numbers of the m storage blocks allocated each time are configured to be linearly decremented, and the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the data to be saved the next time, or the maximum block number of the m storage blocks is less than the minimum block number of the storage blocks configured for the previously saved data;
  • the representative value in each index is the largest block number of the storage block accommodated in the same storage container.
  • specifying the n storage containers for the m storage blocks, and updating the correspondence between the storage block and the storage container according to the correspondence between the m storage blocks and the n storage containers, includes:
  • the key of the added index is the largest block number of the m storage blocks, and the value of the added index is the identifier of the currently working storage container;
  • the updated working storage container is an idle storage container
  • Step e: when the updated working storage container is a full storage container, return to step c until the m storage blocks have all been assigned to the n storage containers.
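For the decrementing variant, where each index key is the largest block number in its container, a lookup can be sketched as below. The function name `find_container_desc` and the choice to keep the keys in an ascending sorted list are assumptions for illustration, not part of the described method.

```python
from bisect import bisect_left

def find_container_desc(keys, values, block_number):
    """Find the container for a block when keys are per-container maxima.

    keys   -- ascending list of the largest block number held by each container
    values -- container identifiers in the same order as keys
    """
    # The owning container is the one with the smallest key >= block_number.
    i = bisect_left(keys, block_number)
    if i == len(keys):
        raise KeyError(block_number)
    # Whether the block is still valid is left to the container metadata.
    return values[i]

# Example: blocks 100..97 were placed in CT1 and blocks 96..95 in CT2.
keys = [96, 100]
values = ["CT2", "CT1"]
```

In this example, block 98 resolves to CT1 and block 95 resolves to CT2, while a block number above 100 has no index and raises `KeyError`.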
  • the method further includes: receiving a defragmentation instruction, and determining a storage container to be defragmented, wherein the storage container to be defragmented is a storage container indicated by the correspondence between the storage block and the storage container;
  • the method further includes:
  • the storage container to be defragmented is, among all storage containers indicated by the correspondence between the storage block and the storage container, a storage container that includes a garbage storage block;
  • the method further includes:
  • After receiving a data read request, querying, according to the information about the data to be read carried in the data read request, the file metadata of the file where the data to be read is located, and acquiring the virtual address of the data to be read, wherein the virtual address of the data to be read includes the block numbers of p storage blocks, and p is a natural number greater than or equal to 1;
  • Reading the metadata of the q storage containers and determining the physical address information of the p storage blocks, where the metadata of each storage container describes all storage blocks in that container.
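The read path above can be sketched in Python under the incrementing scheme, where each index key is the smallest block number in a container. The names `index_keys`, `index_values`, `container_metadata`, and `locate`, as well as the `base` field and the example addresses, are assumptions for illustration.

```python
from bisect import bisect_right

# Index of the block-to-container correspondence (illustrative values):
index_keys = [1, 5, 9]                    # smallest block number per container
index_values = ["CT1", "CT2", "CT3"]      # container identifiers

# Per-container metadata: block number -> (position in container, size).
container_metadata = {
    "CT1": {"base": 0x10000, "blocks": {1: (0, 4096), 2: (4096, 4096), 4: (8192, 2048)}},
    "CT2": {"base": 0x20000, "blocks": {5: (0, 4096), 6: (4096, 1024)}},
    "CT3": {"base": 0x30000, "blocks": {9: (0, 512)}},
}

def locate(block_numbers):
    """Map each block number to (container id, physical address, size)."""
    result = {}
    for ck in block_numbers:
        # The owning container has the largest key that is <= ck.
        i = bisect_right(index_keys, ck) - 1
        if i < 0:
            raise KeyError(ck)
        ctid = index_values[i]
        meta = container_metadata[ctid]
        position, size = meta["blocks"][ck]   # KeyError if the block is absent
        result[ck] = (ctid, meta["base"] + position, size)
    return result
```

A request whose virtual address is the block numbers 2 and 5 touches two containers here, so q is 2: block 2 is found through CT1's metadata and block 5 through CT2's.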
  • an embodiment of the present invention provides a storage manager, which is applied to a storage system, where the storage system includes a storage device and a storage manager, where the storage device includes a storage medium for providing a physical address space.
  • the storage manager is configured to receive a data save request triggered by the application, and forward the data to be saved to the storage device for saving; the storage manager includes:
  • a storage block management module configured to allocate m storage blocks for the data to be saved each time after receiving the data save request, where each storage block is used to represent a virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1;
  • a storage container management module configured to specify n storage containers for the m storage blocks, where each storage container represents a piece of physical storage space on the storage device, and n is a natural number greater than or equal to 1;
  • a recording module configured to update a correspondence between the storage block and the storage container according to the correspondence between the m storage blocks and the n storage containers, where the correspondence between the storage block and the storage container is used to record the correspondence between the already allocated storage blocks and the storage containers accommodating the already allocated storage blocks;
  • the recording module is further configured to record the block numbers of the m storage blocks in the metadata of the file where the data to be saved is located, where the block numbers of the m storage blocks are used as the virtual address of the data to be saved this time.
  • each storage container is configured with a unique identifier, and the identifier of each storage container is used to indicate a physical address corresponding to each storage container.
  • the correspondence between the storage block and the storage container recorded in the recording module includes multiple indexes, where each index points to all storage blocks assigned to the same storage container, the key of each index is a representative value of the block numbers of the storage blocks accommodated by that storage container, and the value of each index is the identifier of that storage container.
  • the block numbers of the m storage blocks allocated by the storage block management module are configured to be linearly incremented, and the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the previously saved data, or the maximum block number of the m storage blocks is less than the minimum block number of the storage blocks configured for the data to be saved the next time.
  • the recording module is specifically configured to record the representative value in each index as a minimum block number of a storage block accommodated by the same storage container.
  • the storage container management module is specifically configured to perform the following operations:
  • obtaining the currently working storage container; assigning the m storage blocks one by one to the currently working storage container in order from the smallest block number to the largest; determining whether the currently working storage container was a free storage container before it stored the storage block with the smallest block number among the m storage blocks; and, if the currently working storage container was a free storage container, notifying the recording module to add an index to the correspondence between the storage block and the storage container, where the key of the added index is the smallest block number of the m storage blocks and the value of the added index is the identifier of the currently working storage container;
  • when the updated working storage container is a free storage container, the recording module is notified again to add another index to the correspondence between the storage block and the storage container, where the key of the newly added index is the smallest block number of the remaining storage blocks among the m storage blocks and the value of the newly added index is the identifier of the updated working storage container;
  • when the updated working storage container is a full storage container, step b is performed again until the m storage blocks have all been assigned to the n storage containers;
  • the recording module is specifically configured to perform an operation of adding an index in a correspondence between the storage block and the storage container when receiving a notification of adding an index sent by the storage container management module.
  • the block numbers of the m storage blocks allocated by the storage block management module are configured to be linearly decremented, and the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the data to be saved the next time, or the maximum block number of the m storage blocks is less than the minimum block number of the storage blocks configured for the previously saved data.
  • the recording module is specifically configured to record the representative value in each index as a maximum block number of a storage block accommodated by the same storage container.
  • the storage container management module is specifically configured to perform the following operations:
  • obtaining the currently working storage container; assigning the m storage blocks one by one to the currently working storage container in order from the largest block number to the smallest; determining whether the currently working storage container was a free storage container before it stored the storage block with the largest block number among the m storage blocks; and, if the currently working storage container was a free storage container, notifying the recording module to add an index to the correspondence between the storage block and the storage container, where the key of the added index is the largest block number of the m storage blocks and the value of the added index is the identifier of the currently working storage container;
  • the currently working storage container is a full storage container
  • obtain the updated working storage container and assign the remaining storage blocks in the m storage blocks one by one in order from the large block number to the small block number.
  • the updated working storage container is an idle storage container;
  • the recording module is notified again to add another index to the correspondence between the storage block and the storage container, where the key of the newly added index is the largest block number of the remaining storage blocks among the m storage blocks and the value of the newly added index is the identifier of the updated working storage container;
  • when the updated working storage container is a full storage container, step b is performed again until the m storage blocks have all been assigned to the n storage containers;
  • the recording module is specifically configured to perform an operation of adding an index in a correspondence between the storage block and the storage container when receiving a notification of adding an index sent by the storage container management module.
  • the storage manager further includes a defragmentation module.
  • the defragmentation module is used to:
  • receiving a defragmentation instruction, and determining a storage container to be defragmented, wherein the storage container to be defragmented is a storage container indicated by the correspondence between the storage block and the storage container;
  • the block number is linearly increasing or linearly decreasing, and the range of the block number of the storage block accommodated by each of the new storage containers does not intersect with the range of the block number of the storage block accommodated by the other new storage containers.
  • the storage manager further includes a defragmentation module, and the defragmentation module is used to:
  • the block number is linearly increasing or linearly decreasing, and the range of the block number of the storage block accommodated by each of the new storage containers does not intersect with the range of the block number of the storage block accommodated by the other new storage containers.
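The defragmentation steps are only partly reproduced in this section, so the following Python sketch fills the gaps with stated assumptions: garbage blocks are given as a set of block numbers, valid blocks are repacked in ascending block-number order into new containers of a fixed capacity, and a fresh index keyed by the smallest block number is built. The function name `defragment` and all parameters are hypothetical.

```python
def defragment(containers, garbage, capacity, new_ids):
    """Repack valid blocks into new containers with disjoint block-number ranges.

    containers -- dict: container id -> list of block numbers it holds
    garbage    -- set of block numbers that are no longer referenced
    capacity   -- blocks per new container (assumed fixed)
    new_ids    -- list of identifiers to use for the new containers
    Returns (new_containers, index), where index is a list of
    (smallest block number, container id) pairs.
    """
    # Collect every valid block and order it by block number, so that each
    # new container covers one contiguous slice of the ordering.
    valid = sorted(
        ck for blocks in containers.values() for ck in blocks if ck not in garbage
    )
    new_containers, index = {}, []
    for start in range(0, len(valid), capacity):
        chunk = valid[start:start + capacity]
        cid = new_ids[len(new_containers)]
        new_containers[cid] = chunk
        index.append((chunk[0], cid))   # key: smallest block number in the chunk
    return new_containers, index
```

Because blocks are moved individually and looked up by block number alone, the blocks of one old container may end up in several new containers, and the file metadata that stores block numbers does not have to change.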
  • an embodiment of the present invention provides a storage system, where the storage system includes a storage device and a storage manager.
  • the storage device includes a storage medium for providing a physical address space to save data
  • the storage manager is configured to allocate m storage blocks for the data to be saved each time after receiving the data save request, where each storage block is used to represent a virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1;
  • specify n storage containers for the m storage blocks, where each storage container represents a piece of physical storage space on the storage device, and n is a natural number greater than or equal to 1;
  • each storage container is configured with a unique identifier, and the identifier of each storage container is used to indicate a physical address corresponding to each storage container.
  • the recording, by the storage manager, of the correspondence between the storage block and the storage container includes:
  • the correspondence between the storage block and the storage container recorded by the storage manager includes a plurality of indexes, where each index points to all storage blocks assigned to the same storage container, the key of each index is a representative value of the block numbers of the storage blocks accommodated by that storage container, and the value of each index is the identifier of that storage container.
  • the storage manager is specifically configured to: configure the block numbers of the m storage blocks allocated each time to be linearly incremented, such that the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the previously saved data, or the maximum block number of the m storage blocks is less than the minimum block number of the storage blocks configured for the data to be saved the next time;
  • the storage manager is specifically configured to record the representative value in each index as a minimum block number of a storage block accommodated by the same storage container.
  • the storage manager is specifically configured to perform the following operations:
  • the key of the added index is the smallest block number of the m storage blocks, and the value of the added index is an identifier of the currently working storage container;
  • the updated working storage container is an idle storage container
  • Step e: when the updated working storage container is a full storage container, return to step c until the m storage blocks have all been assigned to the n storage containers.
  • the storage manager is specifically configured to configure the block numbers of the m storage blocks allocated each time to be linearly decremented, such that the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the data to be saved the next time, or the maximum block number of the m storage blocks is less than the minimum block number of the storage blocks configured for the previously saved data;
  • the storage manager is specifically configured to record the representative value in each index as a maximum block number of a storage block accommodated by the same storage container.
  • when the storage manager specifies n storage containers for the m storage blocks and records the correspondence between the storage block and the storage container, it is specifically used to perform the following operations:
  • the key of the added index is the largest block number of the m storage blocks, and the value of the added index is the identifier of the currently working storage container;
  • the updated working storage container is an idle storage container
  • the key of the index added again is the largest block number of the remaining storage blocks among the m storage blocks, and the value of the index added again is the identifier of the updated working storage container;
  • Step e: when the updated working storage container is a full storage container, return to step c until the m storage blocks have all been assigned to the n storage containers.
  • the storage manager is further used to:
  • receiving a defragmentation instruction, and determining a storage container to be defragmented, wherein the storage container to be defragmented is a storage container indicated by the correspondence between the storage block and the storage container;
  • the storage manager is further used to:
  • the storage container to be defragmented is, among all storage containers indicated by the correspondence between the storage block and the storage container, a storage container that includes a garbage storage block;
  • the storage manager is also used to:
  • After receiving a data read request, querying, according to the information about the data to be read carried in the data read request, the file metadata of the file in which the data to be read is located, and acquiring the virtual address of the data to be read, wherein the virtual address of the data to be read includes the block numbers of p storage blocks, and p is a natural number greater than or equal to 1; querying the correspondence between the storage block and the storage container according to the block numbers of the p storage blocks, and determining q storage containers accommodating the p storage blocks, where q is a natural number greater than or equal to 1; and reading the metadata of the q storage containers and determining the physical address information of the p storage blocks, where the metadata of each storage container describes all storage blocks in that container.
  • a storage manager including:
  • an interface for interacting with a storage device, where the processor is coupled to the memory via a bus, and the processor interacts with the storage device through the interface;
  • the memory is configured to store computer-executable instructions, and when the storage manager is running, the processor executes the computer-executable instructions stored in the memory to cause the storage manager to perform the method for managing stored data provided by the first aspect or any possible implementation of the first aspect.
  • an embodiment of the present invention provides a computer, including: a processor, a memory, a bus, and a communication interface;
  • the memory is configured to store computer execution instructions
  • the processor is coupled to the memory via the bus, and when the computer is running, the processor executes the computer-executable instructions stored in the memory to cause the computer to perform the method for managing stored data provided by the first aspect or any possible implementation of the first aspect.
  • an embodiment of the present invention provides a computer-readable medium including computer-executable instructions; when a processor of a computer executes the computer-executable instructions, the computer performs the method for managing stored data provided by the first aspect or any possible implementation of the first aspect.
  • Each time, m storage blocks are allocated for the data to be saved, n storage containers are specified for the m storage blocks, the correspondence between the storage block and the storage container is updated according to the correspondence between the m storage blocks and the n storage containers, and the block numbers of the m storage blocks are recorded in the metadata of the data to be saved and used as the virtual address of the data to be saved. The virtual address of data in the system is therefore independent of the storage container where the data is located, and information about the physical address of the data can be obtained by querying the correspondence between the storage block and the storage container according to the block number of the storage block where the data is located. With this management method, defragmentation does not need to migrate a storage container as a whole; the disk is defragmented directly at storage-block granularity, which improves the efficiency and flexibility of defragmentation and also increases disk space utilization.
  • FIG. 1A is a schematic diagram showing a CAT index structure of a file system that uses a container address translation table (CAT) to record location information of data in the prior art;
  • FIG. 1B is a schematic diagram of the CAT index structure after disk defragmentation of a file system that uses CAT to record location information of data in the prior art;
  • FIG. 1C is a schematic diagram showing the principle of disk defragmentation of a file system that uses CAT to record location information of data in the prior art;
  • FIG. 2A is a schematic structural diagram of a storage system according to an embodiment of the present invention.
  • FIG. 2B is a schematic diagram of an application scenario according to an embodiment of the present invention.
  • FIG. 3 is an exemplary flowchart of a method for managing stored data according to an embodiment of the present invention
  • FIG. 4 is a schematic structural diagram of a CK2C (Chunk to Container) storage block to a storage container mapping table created according to an embodiment of the present invention
  • FIG. 5 is an exemplary flowchart of a method for managing stored data according to an embodiment of the present invention.
  • FIG. 6A is an exemplary flowchart of a disk sorting method according to an embodiment of the invention.
  • FIG. 6B is a schematic diagram of a disk sorting method according to an embodiment of the invention.
  • FIG. 7 is a schematic structural diagram of a storage manager according to an embodiment of the invention.
  • FIG. 8 is a schematic structural diagram of a computer according to an embodiment of the present invention.
  • the embodiment of the present invention first provides a storage system 200.
  • FIG. 2A is a schematic diagram of a logical structure of the storage system 200.
  • The storage system 200 includes a storage manager 210 and a storage device 220. The storage manager 210 communicates with an external device (such as a host or an application server; this solution does not limit the number of external devices) and with the storage device 220.
  • The storage device 220 may include a storage medium 222 and a storage controller 221 (the embodiment of the present invention does not limit the number of storage media 222 or storage controllers 221; the number shown in the figure is only for convenience of description).
  • the storage medium 222 is configured to provide a physical address space for storing data.
  • The storage medium 222 may be, for example but not limited to, an EEPROM, a ROM, a solid state drive (SSD), a hard disk drive (HDD), a tape, an optical disk, or another non-volatile storage device, which is not a limitation on the embodiments of the present invention. The storage controller 221 is used to manage and schedule the plurality of storage media 222.
  • By way of example and not limitation, the storage controller 221 and the storage medium 222 may constitute a Redundant Array of Independent Disks (RAID), which is not a limitation on the embodiments of the present invention.
  • The storage manager 210, located between an external device (a host or an application server) and the storage device 220, can be configured as an intermediate unit for reading and writing data, and can be deployed separately on a physical device as shown in FIG. 2A.
  • the storage manager 210 can be implemented by modifying the file system.
  • The file system mentioned in the embodiment of the present invention refers to a system that organizes and allocates the address space of a file storage device (including but not limited to the storage device 220 in FIG. 2A), is responsible for storing files/data, and manages, retrieves, and protects the stored files/data; that is, the file system in the embodiment of the present invention includes a file management function and a space management function.
  • The storage manager 210 is specifically configured to allocate m storage blocks for the data to be saved each time after receiving a data save request from the host (the storage block CK in the embodiment of the present invention refers to the smallest or most basic unit of reading and writing data, and may have different names in different systems, such as basic data chunk or data block), where each storage block is used to represent a virtual address space, each storage block is configured with a unique block number (CKID), and m is a natural number greater than or equal to 1.
  • The storage manager 210 may receive one data save request from the host at a time, or may receive the data of the data save request from the host at one time. The data save request may be triggered by an external application or by an instruction function of the storage manager 210, which is not a limitation on the embodiments of this solution.
  • The storage manager 210 specifies n storage containers (CT) for the m storage blocks, where each storage container represents a piece of physical storage space on the storage device 220 and n is a natural number greater than or equal to 1. It should be noted that the storage container mentioned in the embodiment of the present invention is used to accommodate a plurality of the storage blocks. The storage container is also referred to in the art as a data container (Container, CT) or a data segment (Segment).
  • the storage container can also hold metadata related to the storage block.
  • That each storage container represents a piece of physical storage space on the storage device 220 means that each storage container allocated by the storage manager actually corresponds to a piece of physical storage space on the storage device 220 (specifically, on the storage medium 222). The physical storage space may consist of a continuous, uninterrupted physical address space, or of discrete, intermittent physical address spaces. By way of example and not limitation, when the storage medium 222 is a disk, each storage container may correspond to a contiguous range of logical addresses on a logical volume provided by the storage device, to contiguous sectors or tracks on the disk, or to discrete, intermittent sectors or tracks on the disk that are combined into one piece of physical storage space, for example by RAID striping.
  • Each storage container is configured with metadata, which records, for each storage block accommodated by that storage container, related information such as a check code, the data size, and the position in the CT.
  • A new correspondence, different from the prior art, is saved in the storage manager 210. It is referred to in the embodiment of the present invention as the correspondence between the storage block and the storage container, and it records which storage container accommodates each storage block that has been allocated.
  • Unlike the prior art, which uses the combination <CTID, CKID> as the virtual address, the storage manager 210 records the block numbers of the m storage blocks in the metadata of the file in which the data to be saved is located, and the block numbers of the m storage blocks are used as the virtual address of the data to be saved this time.
  • The storage manager 210 may set the block numbers of the storage blocks as the virtual address of the data to be saved either after the storage blocks are allocated or after the correspondence between the storage block and the storage container is recorded; the embodiment of the present invention does not limit this. It should be noted that the virtual address in the embodiment of the present invention is the address the storage manager uses to address data. The upper-layer application and the underlying storage device do not need to be aware of the virtual address; it has addressing meaning only for the storage manager. For example, after receiving a data read request, the storage manager determines the virtual address of the data to be read according to the data read request, that is, the location of the data to be read in the virtual storage space defined by the storage manager.
  • the virtual address of the data to be saved or the virtual address of the data to be read is recorded in the storage manager in the form of metadata.
  • The storage manager in the embodiment of the present invention stores the metadata of a plurality of files. Each file corresponds to saved data and has its own metadata; the metadata of a file is as known in the art and includes, for example, file directory information and index node information.
  • The storage manager 210 is responsible for allocating m storage blocks for the data to be saved, designating n storage containers for the m storage blocks, recording the correspondence between the storage blocks and the storage containers according to the correspondence between the m storage blocks and the n storage containers, and recording the block numbers of the m storage blocks in the metadata of the data to be saved, where the block numbers of the m storage blocks are used as the virtual address of the data to be saved this time. The virtual address recorded in the system is therefore independent of the storage container where the data is located, and the correspondence between the storage block and the storage container can be queried according to the storage block where the data is located to obtain information about the physical address of the data. With this management method, defragmentation does not need to migrate whole containers at storage-container granularity; the disk can be defragmented directly at storage-block granularity, which improves the efficiency and flexibility of defragmentation and the space utilization of the disk.
  • The storage manager 210 may configure a unique identifier (CTID) for each storage container, and the identifier of each storage container is used to indicate the physical address corresponding to that storage container. The identifier of each storage container may be mapped to a physical address by mapping, or by specifying an initial physical address for the storage containers of the system and a space size for each storage container (e.g., 8M) and then calculating the offset (CTID*8M) to obtain the physical address corresponding to each storage container.
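  • The offset calculation mentioned above can be illustrated with a minimal sketch. The function name, the `base_address` parameter, and the choice of bytes as the unit are assumptions made for illustration only; the text itself specifies only the 8M example size and the CTID*8M offset.

```python
# Example container size from the text (8M); the unit of bytes is an assumption.
CONTAINER_SIZE = 8 * 1024 * 1024


def container_physical_address(ctid: int, base_address: int = 0,
                               container_size: int = CONTAINER_SIZE) -> int:
    """Return the start address of container `ctid` as base + CTID * size.

    `base_address` stands for the initial physical address of the system's
    storage containers (a hypothetical parameter name).
    """
    if ctid < 0:
        raise ValueError("CTID must be non-negative")
    return base_address + ctid * container_size
```

  • With this scheme no per-container mapping entry is needed, at the cost of requiring equally sized containers laid out back to back.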
  • The correspondence between the storage block and the storage container includes multiple indexes, where each index points to all storage blocks assigned to the same storage container; the key of each index is a representative value of the block numbers of the storage blocks accommodated by that storage container, and the value of each index is the identifier of that storage container.
  • The storage manager 210 records the correspondence between the storage block and the storage container by using multiple indexes. Each index points to all the storage blocks assigned to the same storage container; its key is configured as a representative value of the block numbers of all storage blocks accommodated by that storage container, and its value is configured as the identifier of that storage container. Therefore, while the correspondence between the storage block and the storage container is still recorded, each storage container needs only one index to record all the storage blocks it holds. This reduces the redundancy of the correspondence between the storage block and the storage container, makes the correspondence easier to look up, and improves query efficiency.
  • The storage manager 210 may specifically configure the block numbers of the m storage blocks allocated each time to be linearly incremented, such that the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the previous data to be saved, or the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the next data to be saved;
  • the storage manager is specifically configured to record the representative value in each index as the minimum block number of the storage blocks accommodated by the same storage container.
  • In this embodiment, the storage manager 210 configures the block numbers of the m storage blocks allocated each time to be linearly incremented, such that the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the previous data to be saved, or the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the next data to be saved; and the representative value in each index is recorded as the smallest block number of the storage blocks accommodated by the same storage container. The algorithm is simple and easy to implement, and each storage container needs only one index to record all the storage blocks it holds.
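  • As a hedged illustration of why one index per container suffices when block numbers only increase, the sketch below looks up the container for a block number by binary search over the index keys. The two parallel lists and the function name `find_container` are illustrative assumptions, not structures named in the text.

```python
from bisect import bisect_right


def find_container(index_keys, index_ctids, ckid):
    """Return the CTID of the container whose block number range covers `ckid`.

    index_keys:  sorted list of the smallest block number held by each container
    index_ctids: container identifiers, parallel to index_keys
    """
    # The covering index is the last one whose key is <= ckid.
    pos = bisect_right(index_keys, ckid) - 1
    if pos < 0:
        raise KeyError(f"block number {ckid} is below every index key")
    return index_ctids[pos]


# Three containers holding the block number ranges [0, 99], [100, 249], [250, ...].
keys = [0, 100, 250]
ctids = [7, 3, 9]
print(find_container(keys, ctids, 120))
```

  • The lookup only identifies the candidate container; whether the block is still present there would be confirmed from the container's own metadata.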
  • the storage manager 210 specifically implements the storage block designation to the storage container and the record of the correspondence between the storage block and the storage container by:
  • a. Obtain the currently working storage container, and assign the m storage blocks to it one by one in order from the smallest block number to the largest. In a specific implementation, to ensure that the block number range of the storage blocks accommodated in each storage container in the system does not intersect with the block number range of the storage blocks accommodated in any other storage container, the storage manager 210 has at most one storage container configured to accommodate storage blocks at any one time, namely the currently working storage container.
  • A free (idle) storage container is a storage container that does not contain any storage block, such as a new storage container created by the file system, or an old storage container that no longer contains storage blocks after space reclamation.
  • When the currently working storage container is a full storage container, obtain an updated working storage container and assign the remaining storage blocks of the m storage blocks to it one by one in order from the smallest block number to the largest, where the updated working storage container is an idle storage container. It should be noted that a full storage container is one that does not have enough space left to accommodate the next storage block to be allocated.
  • The updated working storage container obtained must be a free container.
  • e. When the updated working storage container is a full storage container, return to step c until the m storage blocks have been assigned to the n storage containers.
  • This embodiment describes in detail how to allocate the n storage containers to the m storage blocks and record the correspondence between the storage blocks and the storage containers.
  • the algorithm is simple and easy to implement.
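  • The assignment procedure can be sketched roughly as follows. This is a simplified illustration under stated assumptions: every container holds a fixed number of blocks (`blocks_per_container`), free containers are always available and are numbered sequentially, and an index is added when the first block enters a container. The class and attribute names are hypothetical.

```python
class BlockAllocator:
    """Assigns increasing block numbers to one working container at a time."""

    def __init__(self, blocks_per_container):
        self.capacity = blocks_per_container  # assumed fixed per container
        self.next_ckid = 0      # block numbers only grow and are never reused
        self.next_ctid = 0      # source of free container identifiers (assumed)
        self.current_ctid = None
        self.current_used = 0
        self.index = []         # (smallest CKID in container, CTID), sorted by key

    def _switch_to_free_container(self):
        self.current_ctid = self.next_ctid
        self.next_ctid += 1
        self.current_used = 0

    def save(self, m):
        """Allocate m blocks and return their block numbers (the virtual address)."""
        if m < 1:
            raise ValueError("m must be a natural number >= 1")
        ckids = list(range(self.next_ckid, self.next_ckid + m))
        self.next_ckid += m
        for ckid in ckids:  # smallest block number first
            if self.current_ctid is None or self.current_used == self.capacity:
                # Working container is full: continue in a free one and add
                # an index keyed by the first block number placed in it.
                self._switch_to_free_container()
                self.index.append((ckid, self.current_ctid))
            self.current_used += 1
        return ckids
```

  • Because only one container accepts blocks at a time and block numbers never decrease, the block number ranges of different containers cannot overlap, which is what allows a single index per container.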
  • Of course, in a specific implementation, steps a to e above may be split into more, smaller steps or merged into fewer steps, or the order of execution between steps may be changed. Since such variations can be achieved on the basis of the present embodiment without creative effort, they fall within the scope of protection of this embodiment.
  • The storage manager 210 is configured to configure the block numbers of the m storage blocks allocated each time to be linearly decremented, such that the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the next data to be saved, or the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the previous data to be saved;
  • the storage manager 210 is specifically configured to record the representative value in each index as the maximum block number of the storage blocks accommodated by the same storage container.
  • In this embodiment, the storage manager 210 configures the block numbers of the m storage blocks allocated each time to be linearly decremented, such that the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the next data to be saved, or the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the previous data to be saved; and the representative value in each index is recorded as the largest block number of the storage blocks accommodated by the same storage container. The algorithm is simple and easy to implement, and each storage container needs only one index to record all the storage blocks it holds.
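  • For the decrementing variant, where each index key is the largest block number in its container, a lookup might search for the smallest key that is not below the requested block number. This is only a sketch; the list layout and the name `find_container_by_max_key` are assumptions.

```python
from bisect import bisect_left


def find_container_by_max_key(index_keys, index_ctids, ckid):
    """Return the CTID whose index key (largest block number) covers `ckid`.

    index_keys is kept sorted in ascending order; index_ctids is parallel to it.
    """
    # The covering index is the first one whose key is >= ckid.
    pos = bisect_left(index_keys, ckid)
    if pos == len(index_keys):
        raise KeyError(f"block number {ckid} is above every index key")
    return index_ctids[pos]
```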
  • When the storage manager 210 configures the block numbers of the m storage blocks allocated each time to be linearly decremented, the storage manager 210 is specifically configured to perform the following operations:
  • the key of the added index is the largest block number of the m storage blocks, and the value of the added index is the identifier of the currently working storage container;
  • the updated working storage container is an idle storage container;
  • the key of the index added again is the maximum block number of the remaining storage blocks among the m storage blocks, and the value of the index added again is the identifier of the updated working storage container;
  • e. When the updated working storage container is a full storage container, return to step c until the m storage blocks have been assigned to the n storage containers.
  • This embodiment details how to designate the n storage containers for the m storage blocks and record the correspondence between the storage blocks and the storage containers. The algorithm is simple and easy to implement. Of course, in a specific implementation, as in the previous embodiment, variations based on the present embodiment all fall within the scope of protection of this embodiment.
  • When the storage manager 210 configures the block numbers of the m storage blocks allocated each time to be linearly incremented or linearly decremented, the storage manager 210 is further configured to:
  • receive a defragmentation instruction and determine the storage containers to be collated, where the storage containers to be collated are storage containers indicated by the correspondence between the storage block and the storage container;
  • In this embodiment, the storage manager 210 can determine the storage containers to be scanned (this can be implemented in a number of ways, which does not limit the embodiment of the present solution), obtain the non-garbage storage blocks contained in each storage container to be collated, and reorganize them at the granularity of the non-garbage storage block. Storage blocks can thus be organized flexibly at storage-block granularity, and the correspondence between the storage block and the storage container still ensures that each new storage container needs only one index to record all the non-garbage storage blocks in that new storage container, which in turn reduces the cost of mapping virtual addresses to physical addresses.
  • When the storage manager 210 configures the block numbers of the m storage blocks allocated each time to be linearly incremented or linearly decremented, the storage manager 210 is further configured such that:
  • the storage containers to be collated are those storage containers, among all the storage containers indicated by the correspondence between the storage block and the storage container, that contain garbage storage blocks (that is, storage blocks that are no longer used by the system or that have been space-reclaimed);
  • logically adjacent means that the key values of the indexes in the correspondence between the storage block and the storage container are adjacent in size (adjacent in size means that the key values of two indexes are next to each other, and the key-value interval formed by the key values of the two indexes does not contain the key value of any other index);
  • In this embodiment, the storage manager 210 can determine that the storage containers containing garbage storage blocks, among all the storage containers indicated by the correspondence between the storage block and the storage container, are the storage containers to be collated. It then designates new storage containers for the non-garbage storage blocks in logically adjacent storage containers to be collated and updates the correspondence between the storage block and the storage container, where the block numbers of the non-garbage storage blocks accommodated by each new storage container are linearly increasing or linearly decreasing, and the block number range of the storage blocks accommodated by each new storage container does not intersect with that of any other new storage container.
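  • The collation of logically adjacent containers can be sketched as below for the incrementing case. The data layout (an index list of `(smallest CKID, CTID)` pairs, a dictionary from CTID to the block numbers it holds, a set of garbage block numbers, and a fixed per-container block capacity) is an assumption chosen for illustration, not the layout prescribed by the text.

```python
from itertools import count


def defragment(index, containers, garbage, capacity, new_ctids):
    """Repack live blocks of adjacent containers that contain garbage.

    index:      list of (smallest CKID, CTID), sorted by key
    containers: dict CTID -> list of CKIDs in increasing order (mutated)
    garbage:    set of CKIDs that are no longer used
    capacity:   maximum number of blocks per container (assumed fixed)
    new_ctids:  iterator yielding identifiers of free containers
    Returns the updated index.
    """
    new_index = []
    pending = []  # live blocks from a run of logically adjacent dirty containers

    def flush():
        for i in range(0, len(pending), capacity):
            chunk = pending[i:i + capacity]
            ctid = next(new_ctids)
            containers[ctid] = chunk
            new_index.append((chunk[0], ctid))  # one index per new container
        pending.clear()

    for key, ctid in index:
        blocks = containers[ctid]
        if any(b in garbage for b in blocks):
            pending.extend(b for b in blocks if b not in garbage)
            del containers[ctid]  # the old container becomes free
        else:
            flush()  # a clean container ends the current adjacent run
            new_index.append((key, ctid))
    flush()
    return new_index
```

  • Since live blocks are gathered in index order and repacked in increasing block number order, each new container still covers a block number range disjoint from every other container, and file metadata that stores only block numbers does not need to be rewritten.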
  • The method further includes: after receiving a data read request, acquiring the virtual address of the data to be read according to the information of the data to be read carried in the data read request, where the virtual address of the data to be read includes the block numbers of p storage blocks, and p is a natural number greater than or equal to 1;
  • By way of example and not limitation, the information of the data to be read includes the name of the file where the data to be read is located, the offset of the data to be read in the file, the length of the data to be read, and the like.
  • The metadata of the file where the data to be read is located may be queried; for example, the directory of the file system is read first to obtain the inode (index node) information of the data to be read, and the virtual address of the data to be read, which includes the block numbers of p storage blocks, is then queried according to that inode information.
  • the method may further include: after receiving the data read request, acquiring the virtual address of the data to be read according to the information of the data to be read carried in the data read request, where The virtual address of the data to be read includes a block number of p storage blocks, and p is a natural number greater than or equal to 1;
  • the metadata records related information such as a check code, a data size, a location in the CT, and the like of each storage block in the storage container corresponding to the block number of the storage block to be addressed.
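  • A rough sketch of the read path for the incrementing case follows: each block number from the file metadata is resolved to a container through the index, and the block's position is then taken from that container's metadata. The shape of `container_meta` (CTID to a mapping of CKID to offset and size) and the function name are assumptions for illustration; check codes are omitted.

```python
from bisect import bisect_right


def resolve_blocks(ckids, index_keys, index_ctids, container_meta):
    """Map each block number to (CTID, offset in container, size).

    index_keys / index_ctids: sorted smallest block numbers and matching CTIDs
    container_meta: dict CTID -> dict CKID -> (offset, size), an assumed layout
    """
    located = []
    for ckid in ckids:
        pos = bisect_right(index_keys, ckid) - 1
        if pos < 0:
            raise KeyError(f"no index covers block number {ckid}")
        ctid = index_ctids[pos]
        # Raises KeyError if the container no longer holds this block.
        offset, size = container_meta[ctid][ckid]
        located.append((ctid, offset, size))
    return located
```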
  • FIG. 2B is a schematic structural diagram of a specific implementation of a storage manager according to an embodiment of the present invention.
  • FIG. 2B includes storage manager A (unnumbered), storage device 230, and storage manager B (unnumbered). Two storage managers A and B are shown here for convenience of description; storage manager B can serve as a backup of storage manager A, and the number of storage managers does not limit the embodiment of the present invention.
  • a storage manager is provided to implement the embodiment of the present invention.
  • The storage manager A includes a processor 211, an interface (not shown) that interacts with the storage device, and a memory 212, which communicate via a bus (not numbered). The processor 211 executes computer instructions in the memory 212, causing the storage manager A to perform operations including, but not limited to, those of the embodiment of FIG. 2A.
  • The storage manager A communicates with the storage device 230 through the interface that interacts with the storage device. The storage device 230 is configured to store data forwarded by the storage manager A; the functions or methods it implements are similar to those of the storage device 220 in FIG. 2A and are not described again here.
  • FIG. 3 is an exemplary flowchart of a method for managing stored data according to an embodiment of the present invention.
  • the management method of the stored data may be, but is not limited to, applied to the storage system as shown in FIG. 2A or the application scenario shown in FIG. 2B.
  • Although the processes of method 300 described below include multiple operations occurring in a particular order, it should be clearly understood that these processes may include more or fewer operations, which may be performed sequentially or in parallel (for example, using a parallel processor or a multi-threaded environment); changing the order of execution between steps also falls within the scope of protection of embodiments of the present invention.
  • the method includes:
  • Step S310: after receiving the data save request, allocate m storage blocks for the data to be saved, where each storage block represents a segment of virtual address space and is configured with a unique block number, and m is a natural number greater than or equal to 1. It should be noted that, as already described in the embodiment of FIG. 2A, receiving the data save request may refer to receiving one data save request at a time, or to receiving multiple data save requests at a time.
  • Step S320 specifying n storage containers for the m storage blocks, wherein each storage container represents a piece of physical storage space on the storage device, and n is a natural number greater than or equal to 1.
  • the section of the physical storage space has been explained in detail in the embodiment of FIG. 2A, and details are not described herein again.
  • Step S330: update the correspondence between the storage block and the storage container according to the correspondence between the m storage blocks and the n storage containers, where the correspondence between the storage block and the storage container records which storage container accommodates each allocated storage block.
  • Step S340: record the block numbers of the m storage blocks in the metadata of the data to be saved, where the block numbers of the m storage blocks are used as the virtual address of the data to be saved.
  • In this embodiment, each time m storage blocks are allocated for the data to be saved, n storage containers are specified for the m storage blocks, the correspondence between the storage block and the storage container is updated according to the correspondence between the m storage blocks and the n storage containers, and the block numbers of the m storage blocks are used as the virtual address of the data to be saved. The correspondence between the storage block and the storage container is thus recorded while the virtual address of the data remains independent of the storage container where the data is located. As a result, when the disk is defragmented, storage containers do not need to be migrated as a whole, which improves the efficiency and flexibility of defragmentation and the space utilization of the disk.
  • each storage container in step S320 is configured with a unique identifier, and the identifier of each storage container is used to indicate a physical address corresponding to each storage container.
  • the embodiment of FIG. 2A has described in detail how to use the identifier of each storage container to indicate the physical address corresponding to each storage container, and details are not described herein again.
  • The correspondence between the storage block and the storage container in step S330 includes multiple indexes, where each index points to all storage blocks assigned to the same storage container; the key of each index is a representative value of the block numbers of all the storage blocks accommodated by that storage container, and the value of each index is the identifier of that storage container.
  • In this embodiment, the correspondence between the storage block and the storage container is recorded by using multiple indexes. Each index points to all storage blocks assigned to the same storage container; its key is configured as a representative value of the block numbers of all the storage blocks accommodated by that storage container, and its value is configured as the identifier of that storage container. Therefore, while the correspondence between the storage block and the storage container is still recorded, each storage container needs only one index to record all the storage blocks it holds, which reduces the mapping cost of the correspondence between the storage block and the storage container.
  • The block numbers of the m storage blocks allocated each time are configured to be linearly incremented, such that the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the previous data to be saved, or the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the next data to be saved;
  • the representative value in each index is the smallest block number of the storage blocks accommodated by the same storage container.
  • In this embodiment, the block numbers of the m storage blocks allocated each time are configured to be linearly incremented, such that the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the previous data to be saved, or the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the next data to be saved; and the representative value in each index is recorded as the smallest block number of the storage blocks accommodated by the same storage container. The algorithm is simple and easy to implement, and each storage container needs only one index to record all the storage blocks it holds.
  • Specifying the n storage containers for the m storage blocks, and updating the correspondence between the storage block and the storage container according to the correspondence between the m storage blocks and the n storage containers, includes:
  • the key of the added index is the smallest block number of the m storage blocks, and the value of the added index is the identifier of the currently working storage container;
  • the updated working storage container is an idle storage container;
  • e. When the updated working storage container is a full storage container, return to step c until the m storage blocks have been assigned to the n storage containers.
  • The block numbers of the m storage blocks allocated each time may be configured to be linearly decremented, such that the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the next data to be saved, or the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the previous data to be saved;
  • the representative value in each index is the maximum block number of the storage blocks accommodated by the same storage container.
  • In this embodiment, the block numbers of the m storage blocks allocated each time are configured to be linearly decremented, such that the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the next data to be saved, or the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the previous data to be saved; and the representative value in each index is recorded as the maximum block number of the storage blocks accommodated by the same storage container. The algorithm is simple and easy to implement, and each storage container needs only one index to record all the storage blocks it holds.
  • Specifying the n storage containers for the m storage blocks, and updating the correspondence between the storage block and the storage container according to the correspondence between the m storage blocks and the n storage containers, includes:
  • the key of the added index is the largest block number of the m storage blocks, and the value of the added index is the identifier of the currently working storage container;
  • the updated working storage container is an idle storage container;
  • e. When the updated working storage container is a full storage container, return to step c until the m storage blocks have been assigned to the n storage containers.
  • This embodiment describes in detail how to designate the n storage containers for the m storage blocks and record the correspondence between the storage blocks and the storage containers. The algorithm is simple and easy to implement. Of course, in a specific implementation, as in the previous embodiment, variations based on the present embodiment all fall within the scope of protection of this embodiment.
  • the method further includes:
  • receiving a defragmentation instruction and determining the storage containers to be collated, where the storage containers to be collated are storage containers indicated by the correspondence between the storage block and the storage container;
  • In this embodiment, the storage containers to be scanned can be determined (this can be implemented in various ways, which does not limit the embodiment of the present solution), the non-garbage storage blocks contained in each storage container to be collated are obtained, and new storage containers are specified for them such that the block numbers of the non-garbage storage blocks accommodated by each new storage container are linearly increasing or linearly decreasing, and the block number range of the storage blocks accommodated by each new storage container does not intersect with that of any other new storage container. Storage blocks can thus be organized flexibly at storage-block granularity, and the correspondence between the storage block and the storage container still ensures that each new storage container needs only one index to record all the non-garbage storage blocks in that new storage container, which in turn reduces the cost of mapping virtual addresses to physical addresses.
  • the method further includes:
  • the storage containers to be collated are those storage containers, among all the storage containers indicated by the correspondence between the storage block and the storage container, that contain garbage storage blocks (that is, storage blocks that are no longer used by the system or that have been space-reclaimed);
  • In this embodiment, the storage containers containing garbage storage blocks, among all the storage containers indicated by the correspondence between the storage block and the storage container, can be determined as the storage containers to be collated. New storage containers are specified for the non-garbage storage blocks in logically adjacent storage containers to be collated, and the correspondence between the storage block and the storage container is updated, where the block numbers of the non-garbage storage blocks accommodated by each new storage container are linearly increasing or linearly decreasing, and the block number range of the storage blocks accommodated by each new storage container does not intersect with that of any other new storage container.
  • The storage containers containing garbage storage blocks can thus be collated flexibly, and after collation the correspondence between the storage block and the storage container still ensures that each new storage container needs only one index to record all the non-garbage storage blocks in that new storage container, thereby reducing the cost of mapping virtual addresses to physical addresses.
  • FIG. 4 is a schematic structural diagram of a CK2C (Chunk to Container) mapping table from storage blocks to storage containers, created according to an embodiment of the present invention, where the CK2C table can be used to record the correspondence between storage blocks and storage containers.
  • the CK2C table is a preferred organization form of the correspondence between the storage block and the storage container in the above respective embodiments.
  • the block numbers of the storage blocks recorded in the CK2C table are chosen to increase linearly and are never reused (each is unique); that is, the virtual address/CKID allocated to each new storage block differs from the virtual address/CKID of every earlier storage block, and the allocated CKIDs always increase linearly (they could equally always decrease linearly, which is not shown here).
  • the CKID is a 64-bit unsigned integer (int type), so the global addressing space of the file system is [0, 2^64 - 1]; it should be noted that the embodiment of the present invention does not limit the number of CKIDs, and the system can adjust it according to actual needs.
  • the CK2C table may be implemented by using a linear table or a B+ tree, and the specific manner is not limited to the embodiment of the present invention.
  • the CK2C table includes multiple indexes (each column in the CK2C table is an index; for example, <CKID1, CTID1> is an index), and the indexes and the storage containers are in one-to-one correspondence.
  • the number of indexes in the CK2C table is therefore equal to the number of storage containers CT recorded in the CK2C table (for convenience of description only three CTs, that is, three indexes, are shown here, which does not limit the present invention). It should also be noted that the CKID value ranges of the storage containers corresponding to different indexes do not overlap.
  • CT1 includes a range of CKIDs of 1 to 4
  • CT2 includes a range of CKIDs of 5 to 8 ..., and so on
  • the range of CKIDs included in each CT does not overlap or intersect.
  • the CKID value ranges of the CTs can be kept free of overlap or intersection in a number of ways; the following is only an example and not a limitation. At any time there is only one container in the working state in the file system (that is, the currently working storage container).
  • when a new storage block is written (the embodiment of the present invention does not limit the number of new storage blocks), the new storage block is written to the currently working storage container.
  • when the currently working storage container cannot hold the new storage block, a new storage container may be created as the new working container to store the new storage block.
  • a previously used storage container cannot be reused even if it still has space available internally (that is, it will not be called up as the new working container to store the new storage block), unless it has been space-reclaimed or defragmented and has become a blank container (that is, there is no storage block inside it). In that case, the previously used storage container can be called up as the new working container.
  • the CK2C table is recorded in the form of key-value pairs: the key of each index recorded in the CK2C table is uniformly configured as the smallest CKID in the storage container CT corresponding to that index (as shown in FIG. 4), the value of each index is the CTID of the storage container corresponding to that index, and each CTID uniquely determines a physical address.
  • the physical address of the data can therefore be uniquely determined by querying the CK2C table.
  • the specific query mode may be as follows, for the case in which the key of each index recorded in the CK2C table is uniformly configured as the smallest CKID in the storage container CT corresponding to that index and the CKIDs of the file system increase linearly.
  • taking a lookup of CKID7 as an example, the CK2C table is first queried for a key equal to CKID7, and if one exists, the search ends. If not, all the keys in the CK2C table that are smaller than CKID7 (CKID1 and CKID5 in FIG. 4) are determined, and among them the key whose value is closest to CKID7 is selected, which in FIG. 4 is CKID5. CKID7 must then fall into the storage container whose key is CKID5 (CTID2). Finally, by querying the metadata in CTID2 and the physical address PA2 corresponding to CTID2, the physical address of CKID7 can be uniquely determined.
  • this embodiment provides a CK2C table for recording the correspondence between storage blocks and storage containers. The smallest or largest CKID in a storage container is used as the key of the CK2C table index corresponding to that storage container, and the CKID is used as the virtual address of the data. Because the CKID ranges of the storage containers neither overlap nor intersect, each CT corresponds to exactly one CK2C table index, and the physical addresses of all the storage blocks in one storage container CT can be determined through one index. The CK2C table is therefore small, which reduces the cost of mapping virtual addresses to physical addresses.
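The query described for FIG. 4 amounts to a predecessor search over the sorted index keys. The following is only a minimal sketch under stated assumptions: the table layout, the names `ck2c_keys`, `ck2c_ctids`, `ct_physical` and `find_container`, and the third index (key 9, CT3, PA3) are hypothetical and do not come from the patent text, which only gives CT1 as 1 to 4 and CT2 as 5 to 8.

```python
from bisect import bisect_right

# Hypothetical CK2C table: each key is the smallest CKID held by a
# container, kept in ascending order; values are container IDs (CTIDs).
ck2c_keys = [1, 5, 9]
ck2c_ctids = ["CT1", "CT2", "CT3"]

# Hypothetical CTID -> physical address map (each CTID determines one PA).
ct_physical = {"CT1": "PA1", "CT2": "PA2", "CT3": "PA3"}


def find_container(ckid):
    """Return the CTID whose key is the largest key <= ckid."""
    pos = bisect_right(ck2c_keys, ckid)  # number of keys <= ckid
    if pos == 0:
        raise KeyError(f"CKID {ckid} precedes every index key")
    return ck2c_ctids[pos - 1]


ctid = find_container(7)
print(ctid, ct_physical[ctid])  # CT2 PA2
```

The sketch only identifies the container. Whether the block is still present in it, and at which offset, would still have to be read from the container's own metadata, as the text describes.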
  • FIG. 5 is an exemplary flowchart of a method for managing stored data according to an embodiment of the present invention.
  • the method uses the CK2C table shown in FIG. 4 to manage and record spatial location information of data, which is only an example and not a limitation.
  • the CKID is configured to grow linearly and never be reused, and the key of each index recorded in the CK2C table used by the method is uniformly configured as the smallest CKID in the storage container CT corresponding to that index.
  • although the flow of the method described below includes multiple operations occurring in a particular order, it should be clearly understood that these operations may be split into more operations or combined into fewer operations, and may be executed sequentially or in parallel (for example, using a parallel processor or a multi-threaded environment). As shown in FIG. 5, the method includes the following steps:
  • Step S510, the new storage block to be saved is received. In a specific implementation, the new storage block to be saved may be a single storage block, or may be the m storage blocks generated in step S310 in the embodiment of FIG. 3.
  • in the latter case the storage blocks are transmitted one by one in order from the smallest block number to the largest. The following description uses the m storage blocks of the embodiment of FIG. 3 as an example, which does not limit the embodiment of the present invention.
  • Step S511 the current working storage container is obtained (the currently working storage container is described in the embodiments of FIG. 2A and FIG. 4, and details are not described herein again);
  • Step S512, determining whether the available space of the currently working storage container (working CT) is greater than or equal to the data size (CK Size) of the new storage block to be saved, that is, whether the working CT can accommodate the new storage block to be saved.
  • Step S513, if the available space of the currently working storage container is greater than or equal to the data size (CK Size) of the new storage block to be saved, a space of size CK Size is subtracted from the available space of the currently working storage container and allocated to the new storage block to be saved.
  • at the same time, the metadata in the currently working storage container needs to record a series of attribute information about the new storage block to be saved, such as its size, its check code, and its position in the CT.
  • Step S514, if the available space of the currently working storage container is smaller than the data size (CK Size) of the new storage block to be saved, the currently working storage container is replaced by one that satisfies the space requirement of the new storage block to be saved: a new storage container (that is, a free container that does not contain any storage block), or an old container that has become a blank container after being space-reclaimed, is used as the updated working CT to accommodate the new storage block to be saved.
  • Step S515 subtracting/removing the space of the size CK Size from the available space of the updated working CT to allocate the new storage block to be saved.
  • Other operations are similar to step S513, and are not described herein again.
  • Step S516, since the updated working CT is not yet recorded in the CK2C table, a new index needs to be added to the CK2C table to record the updated working CT. The key of the new index is the block number CKID of the new storage block to be saved, which is the smallest CKID in the updated working CT, and the value of the new index is the container number CTID of the updated working CT.
  • when the new index is inserted, it is preferably inserted at the position given by the order of the index keys (CKIDs) in the CK2C table.
  • Step S517, finally, a virtual address, that is, the block number CKID_new of the new storage block to be saved, is returned for the new storage block to be saved. The block numbers of the m storage blocks in the embodiment of FIG. 3 are therefore the virtual address of the data to be saved.
  • after the space has been allocated in step S513, step S517 is directly executed if the currently working storage container already holds storage blocks. If the currently working storage container is found to be a free container, that is, it does not yet contain any storage block, the CK2C table does not yet record that container, and an index for it needs to be added in the same way as in step S516. This case is special: the system does not generally have a free container as the working storage container, but the possibility cannot be ruled out. For example, when the system first starts working, a new free container is created as the working CT, and then a request applying for a VA arrives.
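The flow of steps S510 to S517 can be sketched roughly as follows. This is an illustrative sketch rather than the patented implementation: the class and attribute names (`Container`, `StorageManager`, `save_block`, `ck2c`) are hypothetical, space is counted in bytes, check codes are omitted, and reuse of reclaimed blank containers is left out for brevity.

```python
from bisect import insort
from itertools import count


class Container:
    def __init__(self, ctid, capacity):
        self.ctid = ctid
        self.capacity = capacity
        self.free = capacity   # available space, in bytes
        self.meta = {}         # CKID -> (offset in CT, size)


class StorageManager:
    def __init__(self, capacity):
        self.capacity = capacity
        self._ckids = count(1)   # CKIDs only grow and are never reused
        self._ctids = count(1)
        self.ck2c = []           # sorted (smallest CKID, CTID) pairs
        self.containers = {}
        self.working = self._new_container()

    def _new_container(self):
        ct = Container(next(self._ctids), self.capacity)
        self.containers[ct.ctid] = ct
        return ct

    def save_block(self, size):
        if size > self.capacity:
            raise ValueError("block larger than a container")
        if self.working.free < size:                   # S512 fails -> S514
            self.working = self._new_container()
        ct = self.working
        ckid = next(self._ckids)
        if not ct.meta:                                # free container -> S516
            insort(self.ck2c, (ckid, ct.ctid))
        ct.meta[ckid] = (ct.capacity - ct.free, size)  # S513 / S515
        ct.free -= size
        return ckid                                    # S517


mgr = StorageManager(capacity=100)
print([mgr.save_block(40) for _ in range(5)])  # [1, 2, 3, 4, 5]
print(mgr.ck2c)                                # [(1, 1), (3, 2), (5, 3)]
```

Five 40-byte blocks fill three 100-byte containers, yet the table holds only three entries, one per container, which is the property the text relies on.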
  • FIG. 6A is an exemplary flowchart of one embodiment of a disk sorting method according to an embodiment of the present invention.
  • the method uses the CK2C table shown in FIG. 4 to manage and record the spatial location information of the data, which is only an example and not a limitation.
  • the CKID is configured to grow linearly and never be reused (the CKID can of course also be configured to decrease linearly and never be reused; linear increase is used here only as the example), and the key of each index recorded in the CK2C table used by the method is uniformly configured as the smallest CKID in the storage container CT corresponding to that index.
  • Step S611, receiving a disk defragmentation request. In a specific implementation, the disk defragmentation request may be initiated periodically or after space reclamation. After space reclamation it is already known which CKs in each CT are garbage (will not be used by the system again), so defragmentation does not need to scan the entire file system at the level of storage blocks, and the logic is simple. This is not intended to limit the embodiments of the present invention.
  • Step S612, obtaining a batch of CTs to be collated according to the CK2C table, where a CT to be collated is a storage container that includes a garbage storage block, among all the storage containers indicated in the CK2C table.
  • when the storage containers to be collated do not all have consecutive virtual addresses, at least two groups of storage containers to be collated are further determined, each group having consecutive virtual addresses. Each group includes k storage containers to be collated, and the indexes of those k storage containers are k logically consecutive indexes in the correspondence between storage blocks and storage containers, that is, no index of any other storage container lies between the k indexes.
  • for example, CT1, CT2, CT3, and CT4 in FIG. 6B are a group of storage containers with consecutive virtual addresses, and CK1, CK6, CK10, and CK16 are four consecutive indexes, whereas CT1, CT3, and CT4 are not a group of storage containers with consecutive virtual addresses.
  • this is because CK1, CK10, and CK16 are not consecutive indexes: CK6 lies between them.
  • at least two groups of storage containers to be collated may need to be determined because some storage containers may contain no garbage storage block. Such a storage container is not regarded as a storage container to be collated, so gaps may appear between the virtual addresses of the storage containers to be collated.
  • in that case the storage containers need to be processed group by group, to ensure that the block-number range of the storage blocks accommodated in each new storage container obtained after defragmentation does not intersect the block-number range of the storage blocks accommodated by any other storage container.
  • the specific scanning and grouping may, for example, proceed in order of key size.
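The grouping of step S612 can be illustrated with a short sketch. The function name `group_consecutive` and its data layout are hypothetical. The keys 1, 6, 10 and 16 follow the CK1, CK6, CK10 and CK16 indexes mentioned for FIG. 6B, while the second call, in which CT2 holds no garbage, is an invented scenario used only to show a split.

```python
def group_consecutive(ck2c, has_garbage):
    """Split the containers to be collated into runs of adjacent indexes.

    ck2c:        list of (key, ctid) pairs sorted by key
    has_garbage: set of ctids that contain at least one garbage block
    """
    groups, current = [], []
    for _key, ctid in ck2c:
        if ctid in has_garbage:
            current.append(ctid)
        elif current:
            # A container without garbage interrupts the run.
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


ck2c = [(1, "CT1"), (6, "CT2"), (10, "CT3"), (16, "CT4")]
print(group_consecutive(ck2c, {"CT1", "CT2", "CT3", "CT4"}))
# [['CT1', 'CT2', 'CT3', 'CT4']]
print(group_consecutive(ck2c, {"CT1", "CT3", "CT4"}))
# [['CT1'], ['CT3', 'CT4']]
```

Collating each run separately keeps every new container's block-number range clear of the untouched container that sits between the runs.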
  • the following is an example of CT1, CT2, CT3, and CT4 in FIG. 6B.
  • Step S613, scanning the non-garbage CKs in each CT one by one in the order of the key values of the four CTs to be collated (CT1, CT2, CT3, CT4), that is, in the order CT1 → CT2 → CT3 → CT4.
  • Step S614, determining whether the scanned non-garbage CKs make up a full storage container (that is, there is no spare space to store the next non-garbage CK) or whether all the CTs to be collated (CT1, CT2, CT3, CT4) have been scanned. If yes, go to step S615; if no, go back to step S613.
  • for example, when CK8 is scanned as shown in FIG. 6B, it is found that CK1 to CK8 already make up a full storage container, and the next step S615 is performed.
  • Step S615, when the scanned non-garbage CKs make up a full storage container or all the CTs to be collated (CT1, CT2, CT3, CT4) have been scanned, applying for a new CT (a free container), for example the new CT5 in FIG. 6B.
  • Step S616, migrating the scanned non-garbage CKs to the new CT. As shown in FIG. 6B, CK1 to CK8 are migrated into the new CT5.
  • Step S617, inserting an index in the CK2C table to record the correspondence between the storage blocks in the new CT and the new CT.
  • for example, an index <CKID1, CTID5> is inserted in the CK2C table.
  • step S618 it is determined whether the scanning is completed. If the scanning is completed, the step S619 is performed; if the scanning is not completed, the processing returns to the step S613.
  • step S619 when all the storage containers to be sorted are scanned, the defragmentation operation is ended. As shown in Figure 6B, CT4 has all been scanned.
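The scan-and-migrate loop of steps S613 to S619 can be sketched as follows for one group of containers. Several simplifications are assumptions of this sketch and not statements from the patent: capacity is counted in blocks (as if all blocks had equal size), the old containers and their indexes are retired in one step, and the function name `defragment` and the sample garbage pattern are hypothetical.

```python
def defragment(containers, ck2c, group, capacity, first_new_ctid):
    """Migrate the live blocks of `group` into fresh containers.

    containers: dict CTID -> ascending list of live (non-garbage) CKIDs
    ck2c:       dict smallest CKID -> CTID
    group:      CTIDs to collate, in ascending key order
    capacity:   blocks per container (assumes equal-sized blocks)
    """
    # S613: scan non-garbage blocks container by container, in key order.
    live = [ckid for ctid in group for ckid in containers[ctid]]

    # Retire the old containers and their indexes (a simplification).
    old = set(group)
    for key in [k for k, ctid in ck2c.items() if ctid in old]:
        del ck2c[key]
    for ctid in group:
        del containers[ctid]

    # S614-S617: every `capacity` live blocks fill one new container,
    # which gets exactly one index keyed by its smallest CKID.
    new_ctid = first_new_ctid
    for start in range(0, len(live), capacity):
        batch = live[start:start + capacity]
        containers[new_ctid] = batch
        ck2c[batch[0]] = new_ctid
        new_ctid += 1


containers = {1: [1, 2, 4], 2: [6, 7, 8], 3: [10, 13], 4: [16, 18]}
ck2c = {1: 1, 6: 2, 10: 3, 16: 4}
defragment(containers, ck2c, group=[1, 2, 3, 4], capacity=6, first_new_ctid=5)
print(sorted(ck2c.items()))  # [(1, 5), (10, 6)]
print(containers)            # {5: [1, 2, 4, 6, 7, 8], 6: [10, 13, 16, 18]}
```

Because the live blocks are consumed in ascending CKID order, each new container covers a contiguous slice of that order, so the ranges of the new containers cannot intersect and the CKIDs stored in file metadata stay valid without being rewritten.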
  • FIG. 6B is a schematic diagram of a disk sorting method according to an embodiment of the invention.
  • the method uses the CK2C table shown in FIG. 4 to manage and record the spatial location information of the data, which is only an example and not a limitation.
  • the CKID is configured to grow linearly and never be reused (the CKID can of course also be configured to decrease linearly and never be reused; linear increase is used here only as the example), and the key of each index recorded in the CK2C table used by the method is uniformly configured as the smallest CKID in the storage container CT corresponding to that index.
  • FIG. 6B uses the disk sorting method of FIG. 6A.
  • after defragmentation, the CK2C table recording the correspondence between storage blocks and storage containers has only two indexes, corresponding to the new storage container CT5 and the new storage container CT6 respectively. It can be seen that the number of indexes in the CK2C table stays consistent with the number of storage containers and does not grow as the system runs, which ultimately reduces the cost of mapping virtual addresses to physical addresses.
  • FIG. 7 is a schematic diagram showing the logical structure of a storage manager 700 according to an embodiment of the invention.
  • the storage manager 700 can be, but is not limited to, the storage manager 210 of FIG. 2A or the storage manager A of FIG. 2B, and can be used to perform, but is not limited to performing, the methods described in FIGS. 3, 5, and 6A.
  • the storage manager 700 includes a storage block management module 710, a storage container management module 720, and a recording module 730.
  • the storage block management module 710 is configured to allocate m storage blocks for the data to be saved each time a data storage request is received, where each storage block represents a section of virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1;
  • a storage container management module 720 configured to specify n storage containers for the m storage blocks, where each storage container represents a physical storage space on the storage device, and n is a natural number greater than or equal to 1;
  • the recording module 730 is configured to update the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, where the correspondence between storage blocks and storage containers records, for each allocated storage block, the storage container that accommodates it; the recording module 730 is further configured to record the block numbers of the m storage blocks in the metadata of the data to be saved this time.
  • the block numbers of the m storage blocks are used as the virtual address of the data to be saved.
  • each storage container is configured with a unique identifier, and the identifier of each storage container is used to indicate a physical address corresponding to each storage container.
  • the correspondence maintained by the recording module 730 includes a plurality of indexes, where each index points to all the storage blocks assigned to one and the same storage container, the key of each index is a representative value of the block numbers of the storage blocks accommodated by that storage container, and the value of each index is the identifier of that storage container.
  • the block numbers of the m storage blocks allocated by the storage block management module 710 are configured to be linearly incremented: the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the previous data to be saved, or, equivalently, the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the next data to be saved;
  • the recording module 730 is specifically configured to record the representative value in each index as a minimum block number of a storage block accommodated by the same storage container.
  • the storage container management module 720 is specifically configured to perform the following operations:
  • a. obtaining the currently working storage container and assigning the m storage blocks to it one by one in order from the smallest block number to the largest; determining whether the currently working storage container was a free storage container before the storage block with the smallest block number of the m storage blocks was stored in it, and if it was a free storage container, notifying the recording module 730 to add an index to the correspondence between storage blocks and storage containers, where the key of the added index is the minimum block number of the m storage blocks and the value of the added index is the identifier of the currently working storage container;
  • b. when the currently working storage container is a full storage container, obtaining an updated working storage container and assigning the remaining storage blocks of the m storage blocks to it one by one in order from the smallest block number to the largest, where the updated working storage container is a free storage container; notifying the recording module 730 again to add a further index to the correspondence between storage blocks and storage containers, where the key of the further index is the smallest block number of the remaining storage blocks of the m storage blocks and the value of the further index is the identifier of the updated working storage container;
  • c. when the updated working storage container is a full storage container, performing step b again until all m storage blocks have been assigned to the n storage containers;
  • the recording module 730 is specifically configured to perform an operation of adding an index in a correspondence between the storage block and the storage container when receiving the notification of adding an index sent by the storage container management module 720.
  • the block numbers of the m storage blocks allocated by the storage block management module 710 are configured to be linearly decremented: the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the next data to be saved, or, equivalently, the maximum block number of the m storage blocks is smaller than the minimum block number of the storage blocks configured for the previous data to be saved;
  • the recording module 730 is specifically configured to record the representative value in each index as the maximum block number of the storage block accommodated by the same storage container.
  • the storage container management module 720 is specifically configured to perform the following operations:
  • a. obtaining the currently working storage container and assigning the m storage blocks to it one by one in order from the largest block number to the smallest; determining whether the currently working storage container was a free storage container before the storage block with the largest block number of the m storage blocks was stored in it, and if it was a free storage container, notifying the recording module 730 to add an index to the correspondence between storage blocks and storage containers, where the key of the added index is the maximum block number of the m storage blocks and the value of the added index is the identifier of the currently working storage container;
  • b. when the currently working storage container is a full storage container, obtaining an updated working storage container and assigning the remaining storage blocks of the m storage blocks to it one by one in order from the largest block number to the smallest, where the updated working storage container is a free storage container; notifying the recording module 730 again to add a further index to the correspondence between storage blocks and storage containers, where the key of the further index is the largest block number of the remaining storage blocks of the m storage blocks and the value of the further index is the identifier of the updated working storage container;
  • c. when the updated working storage container is a full storage container, performing step b again until all m storage blocks have been assigned to the n storage containers;
  • the recording module 730 is specifically configured to perform an operation of adding an index in a correspondence between the storage block and the storage container when receiving the notification of adding an index sent by the storage container management module 720.
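For the linearly decreasing variant, where each index key is the largest block number in its container, the lookup becomes a successor search instead of a predecessor search. The following is a small sketch under assumed data: the key values, the container names, and the function name `find_container_desc` are hypothetical.

```python
from bisect import bisect_left

# Hypothetical table for linearly decreasing block numbers: each key is
# the LARGEST CKID in its container. Keys are stored in ascending order.
keys = [92, 96, 100]            # containers hold 89..92, 93..96, 97..100
ctids = ["CT3", "CT2", "CT1"]   # CT1 was filled first, with the highest IDs


def find_container_desc(ckid):
    """Return the CTID whose key is the smallest key >= ckid."""
    pos = bisect_left(keys, ckid)   # index of the first key >= ckid
    if pos == len(keys):
        raise KeyError(f"CKID {ckid} is above every index key")
    return ctids[pos]


print(find_container_desc(94))  # CT2
print(find_container_desc(97))  # CT1
```

Either convention keeps one index per container; only the direction of the nearest-key search changes.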
  • the storage manager 700 further includes a defragmentation module (not shown), where the defragmentation module is configured to:
  • receive a defragmentation instruction and determine the storage containers to be collated, where the storage containers to be collated are storage containers indicated by the correspondence between storage blocks and storage containers;
  • specify new storage containers for the non-garbage storage blocks in the storage containers to be collated and update the correspondence between storage blocks and storage containers, where the block numbers of the non-garbage storage blocks accommodated in each new storage container are linearly increasing or linearly decreasing, and the range of block numbers of the storage blocks accommodated by each new storage container does not intersect the range of block numbers of the storage blocks accommodated by any other new storage container.
  • the storage manager 700 further includes a defragmentation module (not shown), where the defragmentation module is configured to:
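  • receive a defragmentation instruction and determine the storage containers to be collated, where the storage containers to be collated are the storage containers that include a garbage storage block, among all the storage containers indicated by the correspondence between storage blocks and storage containers;
  • specify new storage containers for the non-garbage storage blocks in the storage containers to be collated and update the correspondence between storage blocks and storage containers, where the block numbers of the non-garbage storage blocks accommodated in each new storage container are linearly increasing or linearly decreasing, and the range of block numbers of the storage blocks accommodated by each new storage container does not intersect the range of block numbers of the storage blocks accommodated by any other new storage container.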
  • FIG. 8 is a schematic diagram showing the logical structure of a computer 800 according to an embodiment of the present invention.
  • the computer 800 of the embodiment of the present invention may include a processor 801, a memory 802, a system bus 803, and a communication interface 804.
  • the processor 801, the memory 802, and the communication interface 804 are connected by the system bus 803 and communicate with one another over it.
  • the processor 801 may be a single-core or multi-core central processing unit, an application-specific integrated circuit, or one or more integrated circuits configured to implement embodiments of the present invention.
  • the memory 802 may be a high speed RAM memory or a non-volatile memory such as at least one disk memory.
  • the memory 802 is used to hold computer-executable instructions (not shown). Specifically, the computer-executable instructions may include program code 805.
  • when the processor 801 runs the computer-executable instructions, the method flow described in any one of FIG. 3, FIG. 5, or FIG. 6A can be performed.
  • aspects of the present invention, or possible implementations of various aspects may be embodied as a system, method, or computer program product.
  • aspects of the invention, or possible implementations of various aspects may be in the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, etc.), or a combination of software and hardware aspects, They are collectively referred to herein as "circuits," “modules,” or “systems.”
  • aspects of the invention, or possible implementations of various aspects may take the form of a computer program product, which is a computer readable program code stored in a computer readable medium.
  • the computer readable medium can be a computer readable signal medium or a computer readable storage medium.
  • the computer readable storage medium includes, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing, such as random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, or portable read-only memory (CD-ROM).
  • the processor in the computer reads the computer readable program code stored in the computer readable medium, such that the processor is capable of performing the functional actions specified in each step, or combination of steps, of the flowchart, and of producing a device that implements the functions specified in each block, or combination of blocks, of the block diagram.
  • the computer readable program code can execute entirely on the user's computer, partly on the user's computer, as a separate software package, partly on the user's computer and partly on the remote computer, or entirely on the remote computer or server.
  • the functions noted in the various steps in the flowcharts or in the blocks in the block diagrams may not occur in the order noted. For example, two steps, or two blocks, shown in succession may be executed substantially concurrently or the blocks may be executed in the reverse order.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method for managing storage data, a storage manager and a storage system, the method comprising: each time a data storage request is received, allocating m storage blocks for the data to be stored, wherein each storage block is used for representing a section of virtual address space, and each of the storage blocks is configured with a unique block number (S310); assigning n storage containers for the m storage blocks, wherein each storage container represents a section of physical storage space on a storage device (S320); updating a correlation between the storage blocks and the storage containers according to a correlation between the m storage blocks and the n storage containers, wherein the correlation between the storage blocks and the storage containers is used for recording a correlation between an allocated storage block and a storage container accommodating the allocated storage block (S330); and recording the block numbers of the m storage blocks into metadata of the data to be stored, wherein the block numbers of the m storage blocks are used as virtual addresses of the data to be stored (S340), thereby improving the disk space utilization rate.

Description

Method for managing storage data, storage manager and storage system
Technical field
The present invention relates to the field of computer technologies, and in particular, to a method for managing storage data, a storage manager, and a storage system.
Background
A file system generally records the location information of data in the following way: it records the virtual address of the data and maps the virtual address to a physical address through a mapping table. This approach is logically simple, and the upper layer does not need to understand the underlying layout.
After the system has been running for a period of time and has undergone multiple rounds of space reclamation, disk fragmentation becomes a prominent problem and defragmentation is required. The defragmentation process depends on the layout of the file system. Considering space reclamation and data locality, a file system usually records data/files in the form of storage blocks (a storage block is the smallest or most basic unit for reading and writing data in the file system, and may be named differently in different file systems, for example Chunk or Data Block), and organizes multiple storage blocks in the form of a storage container, Container (also referred to in the art as a data segment, Segment).
图1A为现有技术中一种使用容器地址转换表CAT记录数据位置信息的文件系统的CAT索引示意图。该方案将数据所在的存储容器号和存储块号(<CTID,CKID>)组合起来作为数据的虚拟地址,再通过容器地址转换表CAT将CT(Container的简称)映射到物理地址。以数据<CTID1,CKID2>为例,该虚拟地址<CTID1,CKID2>即代表存储容器CT1中的存储块CK2中所存的数据。通过查询CAT表可知存储容器CT1的物理地址PA1,同时在每个存储容器中都有元数据记录该存储容器中每个CK的大小,校验码,在CT中的位置等相关信息,因此在得知了CT1的物理地址PA1后,再通过查询CT1中的元数据即可确定所述数据<CTID1,CKID2>的物理地址。FIG. 1A is a schematic diagram of a CAT index of a file system for recording data location information using a container address translation table CAT in the prior art. The scheme combines the storage container number and the storage block number (<CTID, CKID>) where the data is located as the virtual address of the data, and maps the CT (abbreviation of Container) to the physical address through the container address conversion table CAT. Taking the data <CTID1, CKID2> as an example, the virtual address <CTID1, CKID2> represents the data stored in the storage block CK2 in the storage container CT1. By querying the CAT table, the physical address PA1 of the storage container CT1 can be known, and in each storage container, metadata is recorded in each storage container, and the size of each CK in the storage container, the check code, the position in the CT, and the like are related. After learning the physical address PA1 of CT1, the physical address of the data <CTID1, CKID2> can be determined by querying the metadata in CT1.
FIG. 1B is a schematic diagram of the CAT index structure after defragmentation in the prior art. After defragmentation, CK4 and CK6, originally in CT2, have been migrated to CT1, and the index information of CT2 in the CAT is changed to the physical address PA1 of CT1. However, because this prior art uses the combination <CTID, CKID> as the virtual address of data, the storage blocks in one CT must be migrated to another CT as a whole during defragmentation. As shown in FIG. 1C, if CK4 in CT2 were migrated to CT1 while CK6 in CT2 were migrated to CT3, the physical address of CT2 could no longer be mapped in the CAT, because CT1 and CT3 have different physical addresses.
Therefore, in the prior art, the way the file system manages stored data results in poor flexibility and efficiency during defragmentation, and disk space utilization remains low after defragmentation.
Summary of the invention
In view of this, it is necessary to provide a storage data management method, a storage manager, and a storage system to improve the flexibility of defragmentation.
In a first aspect, an embodiment of the present invention provides a storage data management method, including:
each time a data save request is received, allocating m storage blocks for the data to be saved this time, where each storage block represents a segment of virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1;
assigning n storage containers to the m storage blocks, where each storage container represents a segment of physical storage space on a storage device, and n is a natural number greater than or equal to 1;
updating a correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, where the correspondence between storage blocks and storage containers records the correspondence between allocated storage blocks and the storage containers that accommodate them; and
recording the block numbers of the m storage blocks in the metadata of the file to which the data to be saved this time belongs, where the block numbers of the m storage blocks serve as the virtual address of the data to be saved this time.
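The four steps of the first aspect can be sketched roughly as follows. This is a minimal illustration rather than the claimed implementation: the class name `Manager`, the fixed `CONTAINER_CAPACITY`, the `CT<n>` identifier format, and the use of a plain per-block dictionary for the correspondence are all assumptions.

```python
import itertools

CONTAINER_CAPACITY = 4  # assumed number of storage blocks per storage container


class Manager:
    def __init__(self) -> None:
        self._next_block = itertools.count(1)      # source of unique block numbers
        self._next_container = itertools.count(1)  # source of container identifiers
        self.block_to_container = {}  # correspondence: block number -> container ID
        self.file_metadata = {}       # file name -> block numbers (virtual address)
        self._current = None          # currently working container, if any
        self._used = 0                # blocks already placed in that container

    def save(self, file_name: str, m: int) -> list[int]:
        # Step 1: allocate m storage blocks, each with a unique block number.
        blocks = [next(self._next_block) for _ in range(m)]
        for block_no in blocks:
            # Step 2: assign a container, opening a new one when the current is full.
            if self._current is None or self._used == CONTAINER_CAPACITY:
                self._current = f"CT{next(self._next_container)}"
                self._used = 0
            # Step 3: update the block-to-container correspondence.
            self.block_to_container[block_no] = self._current
            self._used += 1
        # Step 4: record the block numbers in the file metadata.
        self.file_metadata.setdefault(file_name, []).extend(blocks)
        return blocks
```

Note that the file metadata holds block numbers only. The container identifier is not part of the virtual address, so a block can later be moved to any container by changing the correspondence alone.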
With reference to the first aspect, in a first possible implementation, each storage container is configured with a unique identifier, and the identifier of each storage container indicates the physical address corresponding to that storage container.
With reference to the first possible implementation of the first aspect, in a second possible implementation, the correspondence between storage blocks and storage containers includes multiple index entries, where each index entry indicates the storage container to which all storage blocks assigned to the same storage container point; the key of each index entry is a representative value of the block numbers of all storage blocks accommodated by the same storage container, and the value of each index entry is the identifier of that storage container.
With reference to the second possible implementation of the first aspect, in a third possible implementation, the block numbers of the m storage blocks allocated each time are configured to increase linearly, and the smallest block number configured for the m storage blocks is greater than the largest block number of the storage blocks configured for the data saved the previous time, or the largest block number configured for the m storage blocks is less than the smallest block number of the storage blocks configured for the data to be saved the next time;
the representative value in each index entry is the smallest block number of the storage blocks accommodated by the same storage container.
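Because block numbers increase monotonically and each index entry is keyed by the smallest block number in its container, the container holding a given block can be found with a binary search over the keys. The following sketch assumes a sorted in-memory list as the index; the sample entries and the function name `container_of` are illustrative, and the lookup does not check whether the block still exists inside the container.

```python
import bisect

# Each index entry: (smallest block number in the container, container ID).
# The entries below are assumed sample data, sorted by key.
index = [(1, "CT1"), (5, "CT2"), (9, "CT3")]
keys = [key for key, _ in index]


def container_of(block_no: int) -> str:
    """Return the container whose key is the largest key <= block_no."""
    pos = bisect.bisect_right(keys, block_no) - 1
    if pos < 0:
        raise KeyError(block_no)  # smaller than every key: not indexed
    return index[pos][1]
```

With this layout the index needs one entry per container rather than one per block.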
With reference to the third possible implementation of the first aspect, in a fourth possible implementation, the assigning n storage containers to the m storage blocks, and the updating the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, include:
a. obtaining the currently working storage container, and assigning the m storage blocks to the currently working storage container one by one in ascending order of block number;
b. determining whether the currently working storage container was an idle storage container before it accommodated the storage block with the smallest block number among the m storage blocks, and if so, adding an index entry to the correspondence between storage blocks and storage containers, where the key of the added index entry is the smallest block number among the m storage blocks, and the value of the added index entry is the identifier of the currently working storage container;
c. when the currently working storage container becomes a full storage container, obtaining an updated working storage container, and assigning the remaining storage blocks among the m storage blocks to the updated working storage container one by one in ascending order of block number, where the updated working storage container is an idle storage container;
d. adding a further index entry to the correspondence between storage blocks and storage containers, where the key of the further index entry is the smallest block number of the remaining storage blocks among the m storage blocks, and the value of the further index entry is the identifier of the updated working storage container;
e. when the updated working storage container becomes a full storage container, returning to step c, until the m storage blocks have been assigned to the n storage containers.
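Steps a to e above can be sketched as a single loop. This is a hedged illustration: the `state` dictionary, the fixed `capacity`, the `new_container` callback, and the list-of-tuples index are assumptions introduced here, not structures named in the text.

```python
import itertools


def assign_blocks(blocks, index, state, capacity, new_container):
    """Assign blocks in ascending order; add an index entry per newly used container.

    state holds 'current' (container ID or None) and 'used' (blocks placed in it).
    """
    for block_no in sorted(blocks):  # step a: small block number to large
        if state["current"] is None or state["used"] == capacity:
            state["current"] = new_container()  # step c: switch to an idle container
            state["used"] = 0
        if state["used"] == 0:
            # steps b and d: the container was idle before this block, so its
            # index key is the smallest block number it will hold.
            index.append((block_no, state["current"]))
        state["used"] += 1  # step e is the loop itself: repeat until all are placed


_ids = itertools.count(1)


def new_container():
    return f"CT{next(_ids)}"  # hypothetical identifier format


index, state = [], {"current": None, "used": 0}
assign_blocks([1, 2, 3], index, state, 4, new_container)           # first save request
assign_blocks([4, 5, 6, 7, 8, 9], index, state, 4, new_container)  # second save request
```

In this run, block 4 joins the partly filled first container without creating an index entry, because that container was not idle when the second request arrived.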
With reference to the second possible implementation of the first aspect, in a fifth possible implementation, the block numbers of the m storage blocks allocated each time are configured to decrease linearly, and the smallest block number configured for the m storage blocks is greater than the largest block number of the storage blocks configured for the data to be saved the next time, or the largest block number configured for the m storage blocks is less than the smallest block number of the storage blocks configured for the data saved the previous time;
the representative value in each index entry is the largest block number of the storage blocks accommodated by the same storage container.
With reference to the fifth possible implementation of the first aspect, in a sixth possible implementation, the assigning n storage containers to the m storage blocks, and the updating the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, include:
a. obtaining the currently working storage container, and assigning the m storage blocks to the currently working storage container one by one in descending order of block number;
b. determining whether the currently working storage container was an idle storage container before it accommodated the storage block with the largest block number among the m storage blocks, and if so, adding an index entry to the correspondence between storage blocks and storage containers, where the key of the added index entry is the largest block number among the m storage blocks, and the value of the added index entry is the identifier of the currently working storage container;
c. when the currently working storage container becomes a full storage container, obtaining an updated working storage container, and assigning the remaining storage blocks among the m storage blocks to the updated working storage container one by one in descending order of block number, where the updated working storage container is an idle storage container;
d. adding a further index entry to the correspondence between storage blocks and storage containers, where the key of the further index entry is the largest block number of the remaining storage blocks among the m storage blocks, and the value of the further index entry is the identifier of the updated working storage container;
e. when the updated working storage container becomes a full storage container, returning to step c, until the m storage blocks have been assigned to the n storage containers.
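The descending variant mirrors the ascending one, with the index key being the largest block number in each container. A lookup then selects the entry with the smallest key that is greater than or equal to the block number. The sketch below assumes the same hypothetical `state`, `capacity`, and `new_container` helpers as a stand-in for container management; none of these names come from the text.

```python
import bisect
import itertools


def assign_descending(blocks, index, state, capacity, new_container):
    """Assign blocks from large block number to small; key = largest block number."""
    for block_no in sorted(blocks, reverse=True):
        if state["current"] is None or state["used"] == capacity:
            state["current"] = new_container()  # switch to an idle container
            state["used"] = 0
        if state["used"] == 0:
            index.append((block_no, state["current"]))
        state["used"] += 1


def container_of(index, block_no):
    """Return the container whose key is the smallest key >= block_no."""
    entries = sorted(index)
    keys = [key for key, _ in entries]
    pos = bisect.bisect_left(keys, block_no)
    if pos == len(keys):
        raise KeyError(block_no)  # larger than every key: not indexed
    return entries[pos][1]


_ids = itertools.count(1)


def new_container():
    return f"CT{next(_ids)}"  # hypothetical identifier format


index, state = [], {"current": None, "used": 0}
assign_descending([100, 99, 98], index, state, 4, new_container)
assign_descending([97, 96, 95], index, state, 4, new_container)
```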
With reference to any one of the third to the sixth possible implementations of the first aspect, in a seventh possible implementation, the method further includes: receiving a defragmentation instruction and determining the storage containers to be defragmented, where the storage containers to be defragmented are the storage containers indicated by the correspondence between storage blocks and storage containers;
scanning the storage containers to be defragmented, and obtaining the non-garbage storage blocks contained in each storage container to be defragmented; and
assigning new storage containers to the non-garbage storage blocks, and updating the correspondence between storage blocks and storage containers, where the block numbers of the non-garbage storage blocks accommodated by each new storage container increase linearly or decrease linearly, and the range of block numbers of the storage blocks accommodated by each new storage container does not intersect with the range of block numbers of the storage blocks accommodated by any other new storage container.
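The defragmentation described above amounts to collecting the live blocks, sorting them by block number, and repacking them into new containers so that the per-container ranges do not overlap. The sketch below covers the ascending case only; the dictionary layout, the `garbage` set, the fixed `capacity`, and the `new_container` callback are assumptions for illustration.

```python
import itertools


def defragment(containers, garbage, capacity, new_container):
    """Repack non-garbage blocks into new containers in ascending block order.

    containers: dict of container ID -> list of block numbers it holds.
    Returns (new containers, new index keyed by smallest block number).
    """
    live = sorted(
        block_no
        for blocks in containers.values()
        for block_no in blocks
        if block_no not in garbage
    )
    new_containers, new_index = {}, []
    for start in range(0, len(live), capacity):
        group = live[start:start + capacity]  # contiguous slice: ranges cannot overlap
        cid = new_container()
        new_containers[cid] = group
        new_index.append((group[0], cid))
    return new_containers, new_index


_ids = itertools.count(4)
old = {"CT1": [1, 2, 3, 4], "CT2": [5, 6, 7, 8], "CT3": [9, 10, 11, 12]}
packed, new_index = defragment(
    old, garbage={2, 3, 6, 7, 8, 10}, capacity=4,
    new_container=lambda: f"CT{next(_ids)}",
)
```

In this example the surviving block 5 of CT2 lands in a container together with blocks from CT1 and CT3. Only the correspondence changes; the block numbers stored in file metadata stay valid.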
With reference to any one of the third to the sixth possible implementations of the first aspect, in an eighth possible implementation, the method further includes:
receiving a defragmentation instruction and determining the storage containers to be defragmented, where the storage containers to be defragmented are those storage containers, among all storage containers indicated by the correspondence between storage blocks and storage containers, that contain garbage storage blocks;
scanning the storage containers to be defragmented, and obtaining the non-garbage storage blocks contained in logically adjacent storage containers to be defragmented, where logically adjacent means that the keys of the corresponding index entries in the correspondence between storage blocks and storage containers are adjacent in value; and
assigning new storage containers to the non-garbage storage blocks, and updating the correspondence between storage blocks and storage containers, where the block numbers of the non-garbage storage blocks accommodated by each new storage container increase linearly or decrease linearly, and the range of block numbers of the storage blocks accommodated by each new storage container does not intersect with the range of block numbers of the storage blocks accommodated by any other new storage container.
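One way to read this variant is that only containers holding garbage are rewritten, and runs of such containers that are neighbours in the index are merged together. The sketch below follows that reading for the ascending case and rebuilds only the index; the function name, the data layout, and the assumption that each container lists its blocks in ascending order are illustrative, not taken from the text.

```python
import itertools


def defragment_adjacent(index, containers, garbage, capacity, new_container):
    """Merge runs of index-adjacent containers that contain garbage blocks.

    index: list of (smallest block number, container ID).
    Returns the rebuilt index; containers without garbage are kept unchanged.
    """
    new_index, run = [], []

    def flush():
        # Live blocks of the current run, in ascending order because the run
        # follows index order and each container lists its blocks ascending.
        live = [b for cid in run for b in containers[cid] if b not in garbage]
        for start in range(0, len(live), capacity):
            new_index.append((live[start], new_container()))
        run.clear()

    for key, cid in sorted(index):
        if any(b in garbage for b in containers[cid]):
            run.append(cid)  # logically adjacent to the previous run member
        else:
            flush()          # a clean container ends the run
            new_index.append((key, cid))
    flush()
    return new_index


_ids = itertools.count(5)
containers = {
    "CT1": [1, 2, 3, 4], "CT2": [5, 6, 7, 8],
    "CT3": [9, 10, 11, 12], "CT4": [13, 14, 15, 16],
}
old_index = [(1, "CT1"), (5, "CT2"), (9, "CT3"), (13, "CT4")]
rebuilt = defragment_adjacent(
    old_index, containers, garbage={2, 3, 6, 7, 14}, capacity=4,
    new_container=lambda: f"CT{next(_ids)}",
)
```

Restricting each merge to index-adjacent containers keeps the block-number ranges of the new containers disjoint from those of the untouched containers.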
With reference to any one of the third to the sixth possible implementations of the first aspect, in a ninth possible implementation, the method further includes:
after a data read request is received, querying, according to the information about the data to be read that is carried in the data read request, the file metadata of the file to which the data to be read belongs, and obtaining the virtual address of the data to be read, where the virtual address of the data to be read includes the block numbers of p storage blocks, and p is a natural number greater than or equal to 1;
querying the correspondence between storage blocks and storage containers according to the block numbers of the p storage blocks, and determining the q storage containers that accommodate the p storage blocks, where q is a natural number greater than or equal to 1; and
reading the metadata of the q storage containers, and determining the physical address information of the p storage blocks, where the metadata of each storage container describes information about all storage blocks in that container.
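The read path can be sketched as three lookups: file metadata, the block-to-container index, and the container metadata. All sample data, addresses, and names below (`file_metadata`, `container_base`, `container_meta`, `locate`) are assumptions for illustration, and the index is taken to be the ascending variant keyed by smallest block number.

```python
import bisect

file_metadata = {"report.txt": [3, 4, 5]}        # file -> block numbers (virtual address)
index = [(1, "CT1"), (5, "CT2")]                 # (smallest block number, container ID)
container_base = {"CT1": 0x1000, "CT2": 0x9000}  # container ID -> physical address
container_meta = {                               # block number -> (offset, size)
    "CT1": {1: (0, 100), 2: (100, 100), 3: (200, 100), 4: (300, 100)},
    "CT2": {5: (0, 100)},
}


def locate(file_name):
    """Return (block number, physical address, size) for each block of the file."""
    keys = [key for key, _ in index]
    result = []
    for block_no in file_metadata[file_name]:          # p block numbers
        pos = bisect.bisect_right(keys, block_no) - 1  # find the owning container
        if pos < 0:
            raise KeyError(block_no)
        cid = index[pos][1]
        offset, size = container_meta[cid][block_no]   # consult container metadata
        result.append((block_no, container_base[cid] + offset, size))
    return result
```

Here p = 3 blocks resolve to q = 2 containers.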
In a second aspect, an embodiment of the present invention provides a storage manager applied to a storage system. The storage system includes a storage device and the storage manager; the storage device contains a storage medium for providing a physical address space; the storage manager is configured to receive a data save request triggered by an application and forward the data to be saved to the storage device for saving. The storage manager includes:
a storage block management module, configured to allocate, each time a data save request is received, m storage blocks for the data to be saved this time, where each storage block represents a segment of virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1;
a storage container management module, configured to assign n storage containers to the m storage blocks, where each storage container represents a segment of physical storage space on the storage device, and n is a natural number greater than or equal to 1; and
a recording module, configured to update a correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, where the correspondence between storage blocks and storage containers records the correspondence between allocated storage blocks and the storage containers that accommodate them;
the recording module is further configured to record the block numbers of the m storage blocks in the metadata of the file to which the data to be saved this time belongs, where the block numbers of the m storage blocks serve as the virtual address of the data to be saved this time.
With reference to the second aspect, in a first possible implementation, each storage container is configured with a unique identifier, and the identifier of each storage container indicates the physical address corresponding to that storage container.
With reference to the first possible implementation of the second aspect, in a second possible implementation, the correspondence between storage blocks and storage containers recorded in the recording module includes multiple index entries, where each index entry indicates the storage container to which all storage blocks assigned to the same storage container point; the key of each index entry is a representative value of the block numbers of the storage blocks accommodated by the same storage container, and the value of each index entry is the identifier of that storage container.
With reference to the second possible implementation of the second aspect, in a third possible implementation, the block numbers of the m storage blocks allocated each time by the storage block management module are configured to increase linearly, and the smallest block number of the m storage blocks is greater than the largest block number of the storage blocks configured for the data saved the previous time, or the largest block number of the m storage blocks is less than the smallest block number of the storage blocks configured for the data to be saved the next time;
the recording module is specifically configured to record the representative value in each index entry as the smallest block number of the storage blocks accommodated by the same storage container.
With reference to the third possible implementation of the second aspect, in a fourth possible implementation, the storage container management module is specifically configured to perform the following operations:
a. obtaining the currently working storage container, and assigning the m storage blocks to the currently working storage container one by one in ascending order of block number; determining whether the currently working storage container was an idle storage container before it accommodated the storage block with the smallest block number among the m storage blocks, and if so, notifying the recording module to add an index entry to the correspondence between storage blocks and storage containers, where the key of the added index entry is the smallest block number among the m storage blocks, and the value of the added index entry is the identifier of the currently working storage container;
b. when the currently working storage container becomes a full storage container, obtaining an updated working storage container, and assigning the remaining storage blocks among the m storage blocks to the updated working storage container one by one in ascending order of block number, where the updated working storage container is an idle storage container; and notifying the recording module again to add a further index entry to the correspondence between storage blocks and storage containers, where the key of the further index entry is the smallest block number of the remaining storage blocks among the m storage blocks, and the value of the further index entry is the identifier of the updated working storage container;
c. when the updated working storage container becomes a full storage container, performing step b again, until the m storage blocks have been assigned to the n storage containers;
the recording module is specifically configured to add an index entry to the correspondence between storage blocks and storage containers upon receiving a notification to add an index entry from the storage container management module.
With reference to the second possible implementation of the second aspect, in a fifth possible implementation, the block numbers of the m storage blocks allocated each time by the storage block management module are configured to decrease linearly, and the smallest block number of the m storage blocks is greater than the largest block number of the storage blocks configured for the data to be saved the next time, or the largest block number of the m storage blocks is less than the smallest block number of the storage blocks configured for the data saved the previous time;
the recording module is specifically configured to record the representative value in each index entry as the largest block number of the storage blocks accommodated by the same storage container.
With reference to the fifth possible implementation of the second aspect, in a sixth possible implementation, the storage container management module is specifically configured to perform the following operations:
a. obtaining the currently working storage container, and assigning the m storage blocks to the currently working storage container one by one in descending order of block number; determining whether the currently working storage container was an idle storage container before it accommodated the storage block with the largest block number among the m storage blocks, and if so, notifying the recording module to add an index entry to the correspondence between storage blocks and storage containers, where the key of the added index entry is the largest block number among the m storage blocks, and the value of the added index entry is the identifier of the currently working storage container;
b. when the currently working storage container becomes a full storage container, obtaining an updated working storage container, and assigning the remaining storage blocks among the m storage blocks to the updated working storage container one by one in descending order of block number, where the updated working storage container is an idle storage container; and notifying the recording module again to add a further index entry to the correspondence between storage blocks and storage containers, where the key of the further index entry is the largest block number of the remaining storage blocks among the m storage blocks, and the value of the further index entry is the identifier of the updated working storage container;
c. when the updated working storage container becomes a full storage container, performing step b again, until the m storage blocks have been assigned to the n storage containers;
the recording module is specifically configured to add an index entry to the correspondence between storage blocks and storage containers upon receiving a notification to add an index entry from the storage container management module.
With reference to any one of the third to the sixth possible implementations of the second aspect, in a seventh possible implementation, the storage manager further includes a defragmentation module, configured to:
receive a defragmentation instruction and determine the storage containers to be defragmented, where the storage containers to be defragmented are the storage containers indicated by the correspondence between storage blocks and storage containers;
scan the storage containers to be defragmented, and obtain the non-garbage storage blocks contained in each storage container to be defragmented; and
assign new storage containers to the non-garbage storage blocks, and notify the recording module to update the correspondence between storage blocks and storage containers, where the block numbers of the non-garbage storage blocks accommodated by each new storage container increase linearly or decrease linearly, and the range of block numbers of the storage blocks accommodated by each new storage container does not intersect with the range of block numbers of the storage blocks accommodated by any other new storage container.
With reference to any one of the third to the sixth possible implementations of the second aspect, in an eighth possible implementation, the storage manager further includes a defragmentation module, configured to:
receive a defragmentation instruction and determine the storage containers to be defragmented, where the storage containers to be defragmented are those storage containers, among all storage containers indicated by the correspondence between storage blocks and storage containers, that contain garbage storage blocks;
scan the storage containers to be defragmented, and obtain the non-garbage storage blocks contained in logically adjacent storage containers to be defragmented, where logically adjacent means that the keys of the corresponding index entries in the correspondence between storage blocks and storage containers are adjacent in value; and
assign new storage containers to the non-garbage storage blocks, and notify the recording module to update the correspondence between storage blocks and storage containers, where the block numbers of the non-garbage storage blocks accommodated by each new storage container increase linearly or decrease linearly, and the range of block numbers of the storage blocks accommodated by each new storage container does not intersect with the range of block numbers of the storage blocks accommodated by any other new storage container.
In a third aspect, an embodiment of the present invention provides a storage system, where the storage system includes a storage device and a storage manager;
the storage device contains a storage medium and is configured to provide a physical address space for saving data;
the storage manager is configured to: each time a data save request is received, allocate m storage blocks for the data to be saved this time, where each storage block represents a segment of virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1;
assign n storage containers to the m storage blocks, where each storage container represents a segment of physical storage space on the storage device, and n is a natural number greater than or equal to 1;
update a correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, where the correspondence between storage blocks and storage containers records the correspondence between allocated storage blocks and the storage containers that accommodate them; and
record the block numbers of the m storage blocks in the metadata of the file to which the data to be saved this time belongs, where the block numbers of the m storage blocks serve as the virtual address of the data to be saved this time.
结合第三方面,在第一种可能的实现方式中,所述每个存储容器配置有唯一的标识,所述每个存储容器的标识用于指示到所述每个存储容器所对应的物理地址。With reference to the third aspect, in a first possible implementation, each storage container is configured with a unique identifier, and the identifier of each storage container is used to indicate a physical address corresponding to each storage container .
With reference to the first possible implementation of the third aspect, in a second possible implementation, the recording, by the storage manager, of the correspondence between storage blocks and storage containers specifically includes:
The correspondence between storage blocks and storage containers recorded by the storage manager includes multiple indexes, and each index points all the storage blocks assigned to one storage container to that container, where the key of each index is a representative value of the block numbers of the storage blocks accommodated by that storage container, and the value of each index is the identifier of that storage container.
With reference to the second possible implementation of the third aspect, in a third possible implementation, the storage manager is specifically configured to configure the block numbers of the m storage blocks allocated each time to increase linearly, such that either the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the previously saved data, or the maximum block number of the m storage blocks is less than the minimum block number of the storage blocks configured for the next data to be saved;
The storage manager is specifically configured to record the representative value in each index as the minimum block number of the storage blocks accommodated by the corresponding storage container.
With reference to the third possible implementation of the third aspect, in a fourth possible implementation, the storage manager is specifically configured to perform the following operations:
a. Obtain the current working storage container, and assign the m storage blocks to it one by one in order from the smallest block number to the largest;
b. Determine whether the current working storage container was a free storage container before it accommodated the storage block with the smallest block number among the m storage blocks; if it was, add an index to the correspondence between storage blocks and storage containers, where the key of the added index is the smallest block number among the m storage blocks and the value of the added index is the identifier of the current working storage container;
c. When the current working storage container becomes full, obtain a new working storage container, which is a free storage container, and assign the remaining storage blocks of the m storage blocks to it one by one in order from the smallest block number to the largest;
d. Add another index to the correspondence between storage blocks and storage containers, where the key of this index is the smallest block number among the remaining storage blocks of the m storage blocks and the value of this index is the identifier of the new working storage container;
e. When the new working storage container becomes full, return to step c, until all m storage blocks have been assigned to the n storage containers.
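Steps a to e above can be sketched as follows. This is a minimal illustration only, not the patented implementation: the class names `Container` and `StorageManager`, the fixed per-container block capacity, and the use of a plain dictionary for the index are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    """Hypothetical storage container (CT) holding a bounded number of blocks."""
    ctid: int
    capacity: int                          # assumed: fixed number of blocks per container
    blocks: list = field(default_factory=list)

    def is_empty(self):
        return not self.blocks

    def is_full(self):
        return len(self.blocks) >= self.capacity


class StorageManager:
    """Hypothetical manager that issues increasing block numbers (CKIDs)."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.next_ctid = 0
        self.max_ckid = 0                  # largest block number issued so far
        self.index = {}                    # key: smallest CKID in a container, value: CTID
        self.current = self._new_container()

    def _new_container(self):
        ct = Container(self.next_ctid, self.capacity)
        self.next_ctid += 1
        return ct

    def allocate(self, m):
        # Each new block number is one greater than the current maximum.
        ckids = list(range(self.max_ckid + 1, self.max_ckid + m + 1))
        self.max_ckid += m
        return ckids

    def assign(self, ckids):
        for ckid in sorted(ckids):         # step a: smallest block number first
            if self.current.is_full():     # steps c and e: switch to a free container
                self.current = self._new_container()
            if self.current.is_empty():    # steps b and d: first block becomes the key
                self.index[ckid] = self.current.ctid
            self.current.blocks.append(ckid)
```

In this sketch, with a capacity of four blocks per container, saving 3 blocks and then 6 blocks leaves the index as `{1: 0, 5: 1, 9: 2}`: one entry per container, keyed by the smallest block number it holds.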
With reference to the second possible implementation of the third aspect, in a fifth possible implementation, the storage manager is specifically configured to configure the block numbers of the m storage blocks allocated each time to decrease linearly, such that either the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the next data to be saved, or the maximum block number of the m storage blocks is less than the minimum block number of the storage blocks configured for the previously saved data;
The storage manager is specifically configured to record the representative value in each index as the maximum block number of the storage blocks accommodated by the corresponding storage container.
With reference to the fifth possible implementation of the third aspect, in a sixth possible implementation, when specifying n storage containers for the m storage blocks and recording the correspondence between storage blocks and storage containers, the storage manager is specifically configured to perform the following operations:
a. Obtain the current working storage container, and assign the m storage blocks to it one by one in order from the largest block number to the smallest;
b. Determine whether the current working storage container was a free storage container before it accommodated the storage block with the largest block number among the m storage blocks; if it was, add an index to the correspondence between storage blocks and storage containers, where the key of the added index is the largest block number among the m storage blocks and the value of the added index is the identifier of the current working storage container;
c. When the current working storage container becomes full, obtain a new working storage container, which is a free storage container, and assign the remaining storage blocks of the m storage blocks to it one by one in order from the largest block number to the smallest;
d. Add another index to the correspondence between storage blocks and storage containers, where the key of this index is the largest block number among the remaining storage blocks of the m storage blocks and the value of this index is the identifier of the new working storage container;
e. When the new working storage container becomes full, return to step c, until all m storage blocks have been assigned to the n storage containers.
With reference to any one of the third to sixth possible implementations of the third aspect, in a seventh possible implementation, the storage manager is further configured to:
receive a disk defragmentation instruction and determine the storage containers to be defragmented, where the storage containers to be defragmented are the storage containers indicated by the correspondence between storage blocks and storage containers;
scan the storage containers to be defragmented, and obtain the non-garbage storage blocks contained in each of them;
assign the non-garbage storage blocks to new storage containers and update the correspondence between storage blocks and storage containers, where the block numbers of the non-garbage storage blocks accommodated by each new storage container increase or decrease linearly, and the block-number range of the storage blocks in each new storage container does not intersect the block-number range of the storage blocks in any other new storage container.
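The defragmentation flow of the seventh implementation can be sketched as follows, for the increasing-block-number case. This is only an illustration under stated assumptions: the function name `defragment`, the dictionary and set representations, and the fixed per-container capacity are hypothetical and are not taken from the text.

```python
def defragment(containers, garbage, capacity, next_ctid):
    """Repack live blocks into new containers with disjoint block-number ranges.

    containers: dict mapping CTID -> list of CKIDs currently held (assumed layout)
    garbage:    set of CKIDs that are no longer referenced
    capacity:   assumed maximum number of blocks per container
    next_ctid:  first identifier to use for the new containers
    Returns (new_containers, new_index), where new_index maps the smallest
    CKID of each new container to that container's CTID.
    """
    # Collect every non-garbage block, ordered by block number.
    live = sorted(
        ckid
        for blocks in containers.values()
        for ckid in blocks
        if ckid not in garbage
    )

    new_containers = {}
    new_index = {}
    # Consecutive slices of a sorted list cannot overlap, so the ranges are disjoint.
    for start in range(0, len(live), capacity):
        chunk = live[start:start + capacity]
        new_containers[next_ctid] = chunk
        new_index[chunk[0]] = next_ctid
        next_ctid += 1
    return new_containers, new_index
```

Because the block numbers themselves never change in this sketch, only the index is rewritten; the block numbers already recorded in file metadata stay valid, which matches the stated goal of keeping the virtual address independent of the container.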
With reference to any one of the third to sixth possible implementations of the third aspect, in an eighth possible implementation, the storage manager is further configured to:
receive a disk defragmentation instruction and determine the storage containers to be defragmented, where the storage containers to be defragmented are those storage containers, among all storage containers indicated by the correspondence between storage blocks and storage containers, that contain garbage storage blocks;
scan the storage containers to be defragmented, and obtain the non-garbage storage blocks contained in logically adjacent storage containers to be defragmented, where logically adjacent means that the keys of the corresponding indexes in the correspondence between storage blocks and storage containers are adjacent in value;
assign the non-garbage storage blocks to new storage containers and update the correspondence between storage blocks and storage containers, where the block numbers of the non-garbage storage blocks accommodated by each new storage container increase or decrease linearly, and the block-number range of the storage blocks in each new storage container does not intersect the block-number range of the storage blocks in any other new storage container.
The storage manager is further configured to:
after receiving a data read request, query, according to the information about the data to be read carried in the request, the file metadata of the file to which the data to be read belongs, and obtain the virtual address of the data to be read, where the virtual address includes the block numbers of p storage blocks and p is a natural number greater than or equal to 1; query the correspondence between storage blocks and storage containers according to the block numbers of the p storage blocks to determine the q storage containers that accommodate the p storage blocks, where q is a natural number greater than or equal to 1; and read the metadata of the q storage containers to determine the physical address information of the p storage blocks, where the metadata of each storage container describes all the storage blocks in that container.
In a fourth aspect, an embodiment of the present invention provides a storage manager, including:
an interface for interacting with a storage device, a processor, and a memory, where the memory is connected to the processor via a bus, and the processor exchanges information with the storage device through the interface;
The memory is configured to store computer-executable instructions. When the storage manager runs, the processor executes the computer-executable instructions stored in the memory, so that the storage manager performs the method for managing stored data provided by the first aspect or any possible implementation of the first aspect.
In a fifth aspect, an embodiment of the present invention provides a computer, including a processor, a memory, a bus, and a communication interface;
The memory is configured to store computer-executable instructions, and the processor is connected to the memory via the bus. When the computer runs, the processor executes the computer-executable instructions stored in the memory, so that the computer performs the method for managing stored data provided by the first aspect or any possible implementation of the first aspect.
In a sixth aspect, an embodiment of the present invention provides a computer-readable medium including computer-executable instructions. When a processor of a computer executes the computer-executable instructions, the computer performs the method for managing stored data provided by the first aspect or any possible implementation of the first aspect.
In the embodiments of the present invention, m storage blocks are allocated each time for the data to be saved, n storage containers are specified for the m storage blocks, the correspondence between storage blocks and storage containers is updated according to the correspondence between the m storage blocks and the n storage containers, and the block numbers of the m storage blocks are recorded in the metadata of the data to be saved, where they serve as the virtual address of that data. The virtual address of data in the system is therefore independent of the storage container in which the data resides: the correspondence between storage blocks and storage containers can be queried by the block number of the storage block holding the data to obtain information about the physical address of the data. With this method of managing stored data, disk defragmentation does not need to migrate whole storage containers; it can operate directly at storage-block granularity, which improves the efficiency and flexibility of defragmentation and the space utilization of the disk.
Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings required by the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1A is a schematic diagram of the CAT index structure of a prior-art file system that uses a container address translation (CAT) table to record the location information of data;
FIG. 1B is a schematic diagram of the CAT index structure, after disk defragmentation, of a prior-art file system that uses a CAT to record the location information of data;
FIG. 1C is a schematic diagram of the principle of disk defragmentation in a prior-art file system that uses a CAT to record the location information of data;
FIG. 2A is a schematic structural diagram of a storage system according to an embodiment of the present invention;
FIG. 2B is a schematic diagram of an application scenario of an embodiment of the present invention;
FIG. 3 is an exemplary flowchart of a method for managing stored data according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a CK2C (Chunk to Container) mapping table, from storage blocks to storage containers, created according to an embodiment of the present invention;
FIG. 5 is an exemplary flowchart of a method for managing stored data according to an embodiment of the present invention;
FIG. 6A is an exemplary flowchart of a disk defragmentation method according to an embodiment of the present invention;
FIG. 6B is a schematic diagram of the principle of a disk defragmentation method according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a storage manager according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a computer according to an embodiment of the present invention.
Detailed Description
To make the objectives, features, and advantages of the present invention clearer and easier to understand, the technical solutions in the embodiments of the present invention are described below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
For ease of understanding, an embodiment of the present invention first provides a storage system 200. FIG. 2A is a schematic diagram of the logical structure of the storage system 200. The storage system 200 includes a storage manager 210 and a storage device 220, and the storage manager 210 is communicatively connected both to external devices (such as hosts and application servers; this solution does not limit the number of external devices) and to the storage device 220.
The storage device 220 may include a storage medium 222 and a storage controller 221. The embodiments of the present invention do not limit the number of storage media 222 or storage controllers 221; the numbers shown in the figure are chosen only for convenience of description. The storage medium 222 provides the physical address space in which data is saved. In a specific implementation, the storage medium 222 may be implemented by, for example but not limited to, an EEPROM, a ROM, a solid-state drive (SSD), a hard disk drive (HDD), a magnetic tape, an optical disc, or another non-volatile storage device, and this does not limit the embodiments of the present invention. The storage controller 221 manages and schedules the storage media 222. By way of example and not limitation, the storage controller 221 and the storage media 222 may form a RAID (Redundant Arrays of Independent Disks) disk array, and this does not limit the embodiments of the present invention.
The storage manager 210 acts as an intermediate unit through which external devices (hosts, application servers) read data from and write data to the storage device 220. It may be deployed separately on its own physical device, as shown in FIG. 2A. As one specific implementation, the storage manager 210 may be implemented by modifying a file system. Note that the file system referred to in the embodiments of the present invention is a system that organizes and allocates the address space of a file storage device (including but not limited to the storage device 220 in FIG. 2A), is responsible for storing files and data, and manages, retrieves, and protects the stored files and data. That is, the file system in the embodiments of the present invention includes both a file management function and a space management function.
The storage manager 210 is specifically configured to, each time a data save request is received from a host, allocate m storage blocks for the data to be saved this time, where each storage block represents a segment of virtual address space, each storage block is configured with a unique block number (CKID), and m is a natural number greater than or equal to 1. A storage block (CK) in the embodiments of the present invention is the smallest or most basic unit for reading and writing data; different systems may name it differently, for example a basic data chunk (Chunk) or a data block (Data Block). Note that "each time a data save request is received" may mean that the storage manager 210 receives one data save request from a host at a time, or that it receives multiple data save requests from hosts at a time. The data save request may be triggered by an external application or by an instruction function inside the storage manager 210. None of this limits the embodiments of this solution.
After the storage blocks are allocated, the storage manager 210 specifies n storage containers (CT) for the m storage blocks, where each storage container represents a segment of physical storage space on the storage device 220, and n is a natural number greater than or equal to 1. Note that a storage container in the embodiments of the present invention accommodates multiple storage blocks; in the art it is also called a data container (Container, CT) or a data segment (Segment). A storage container may also hold metadata related to its storage blocks. That each storage container represents a segment of physical storage space on the storage device 220 means that each storage container allocated by the storage manager actually corresponds to a segment of physical address space on the storage device 220 (specifically, on the storage medium 222). This segment of physical storage space may consist of a continuous, uninterrupted physical address space, or of discrete, discontinuous physical address spaces. By way of example and not limitation, when the storage medium 222 is a disk, each storage container may correspond to a contiguous range of logical addresses on a logical volume provided by the storage device, to a contiguous run of sectors or tracks on the disk, or to a segment of physical storage space made up of discrete, discontinuous sectors or tracks on the disk, for example sectors or tracks combined through RAID striping. Each storage container is configured with metadata, which records, for each storage block the container accommodates, related information such as its check code, its data size, and its position within the CT.
In the embodiments of the present invention, the storage manager 210 stores a new correspondence that differs from the prior art, referred to here as the correspondence between storage blocks and storage containers, which records which storage container accommodates each allocated storage block. Unlike the prior-art method of combining <CTID, CKID> into a virtual address, the storage manager 210 records the block numbers of the m storage blocks in the metadata of the file to which the data to be saved belongs, and these block numbers serve as the virtual address of the data to be saved this time.
The storage manager 210 may record the block numbers of the storage blocks as the virtual address of the data to be saved either immediately after the storage blocks are allocated or after the correspondence between storage blocks and storage containers has been recorded; the embodiments of the present invention do not limit this. Note that the virtual address described here is the address the storage manager uses to locate data. Neither the upper-layer application nor the underlying storage device needs to be aware of it; the virtual address has addressing meaning only for the storage manager. For example, after receiving a data read request, the storage manager determines the virtual address of the data to be read according to the request, that is, it locates the data in the virtual storage space defined by the storage manager. Generally, the virtual address of the data to be saved or read is recorded in the storage manager in the form of metadata. The storage manager in the embodiments of the present invention stores the metadata of multiple files. Each file corresponds to saved data and has its own file metadata; file metadata is common knowledge in the art and includes, for example, file directory information and index node information.
In the storage system 200 provided by the embodiments of the present invention, the storage manager 210 allocates m storage blocks for the data to be saved, specifies n storage containers for the m storage blocks, updates the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, and records the block numbers of the m storage blocks in the metadata of the data to be saved, where they serve as the virtual address of that data. The virtual address recorded in the system is therefore independent of the storage container in which the data resides: the correspondence between storage blocks and storage containers can be queried by the storage block holding the data to obtain information about the physical address of the data. With this method of managing stored data, disk defragmentation does not need to migrate whole storage containers; it can operate directly at storage-block granularity, which improves the efficiency and flexibility of defragmentation and greatly improves the space utilization of the disk.
Further, for ease of recording and management, the storage manager 210 may configure each storage container with a unique identifier (CTID), which is used to indicate the physical address corresponding to that storage container. In a specific implementation, by way of example only, the identifier of each storage container may be mapped to a physical address through a mapping table, or the physical address may be computed: the starting physical address of the system's storage containers and the size of each storage container (for example, 8M) are specified, and the physical address of each storage container is obtained from the offset (CTID*8M). This does not limit the protection scope of the embodiments of the present invention.
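The offset calculation mentioned above can be illustrated with a short sketch. The function name `container_physical_address`, the default base address of 0, and the reading of "8M" as 8 MiB are assumptions made for this example.

```python
CONTAINER_SIZE = 8 * 1024 * 1024  # assumed: "8M" read as 8 MiB per storage container


def container_physical_address(ctid, base_address=0):
    """Return the start address of container `ctid`.

    base_address is the (hypothetical) starting physical address of the
    region from which the system's storage containers are carved.
    """
    return base_address + ctid * CONTAINER_SIZE
```

With fixed-size containers, no per-container mapping table is needed: the identifier alone determines the address.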
Further, the correspondence between storage blocks and storage containers recorded by the storage manager 210 includes multiple indexes, and each index points all the storage blocks assigned to one storage container to that container, where the key of each index is a representative value of the block numbers of the storage blocks accommodated by that storage container, and the value of each index is the identifier of that storage container.
In the embodiments of the present invention, the storage manager 210 records the correspondence between storage blocks and storage containers as multiple indexes. Each index covers all the storage blocks assigned to one storage container: its key is a representative value of the block numbers of all the storage blocks in that container, and its value is the identifier of that container. The correspondence is thus still fully recorded, while a single index per storage container is enough to cover every storage block in it. This reduces the redundancy of the correspondence, makes it simpler to query, and improves lookup efficiency.
As a preferred implementation, the storage manager 210 may configure the block numbers of the m storage blocks allocated each time to increase linearly, such that the minimum block number configured for the m storage blocks is greater than the maximum block number of the storage blocks configured for the previous data to be saved, or the maximum block number configured for the m storage blocks is less than the minimum block number of the storage blocks configured for the next data to be saved. The storage manager is specifically configured to record the representative value in each index as the minimum block number of the storage blocks held by the same storage container. In a specific implementation, by way of example only and not limitation, the storage manager 210 may make the block numbers of the m storage blocks allocated each time increase linearly by using the following algorithm: CKIDnew = CKIDmax + 1, where CKIDnew is the block number of each newly configured storage block and CKIDmax is the current maximum data block number in the file system.
In this embodiment of the present invention, the storage manager 210 configures the block numbers of the m storage blocks allocated each time to increase linearly, such that the minimum block number configured for the m storage blocks is greater than the maximum block number of the storage blocks configured for the previous data to be saved, or the maximum block number configured for the m storage blocks is less than the minimum block number of the storage blocks configured for the next data to be saved; and it records the representative value in each index as the minimum block number of the storage blocks held by the same storage container. This keeps the algorithm simple and makes it easy to ensure that each storage container needs only one index to record all the storage blocks it holds.
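To illustrate why one index per storage container is enough when block numbers increase linearly, the following sketch keeps the index keys (minimum block numbers) sorted and finds the container of any block with a binary search. The class and method names are hypothetical, and the use of Python's `bisect` module is a choice made for the example; the text does not prescribe a data structure.

```python
import bisect


class ContainerIndex:
    """One entry per container: key = minimum block number, value = CTID."""

    def __init__(self):
        self._keys = []   # minimum block numbers, kept in ascending order
        self._ctids = []  # CTID stored at the same position as its key

    def add(self, min_block, ctid):
        """Record that the container `ctid` starts at block `min_block`."""
        pos = bisect.bisect_left(self._keys, min_block)
        self._keys.insert(pos, min_block)
        self._ctids.insert(pos, ctid)

    def lookup(self, block):
        """Return the CTID whose key is the largest key not exceeding `block`."""
        pos = bisect.bisect_right(self._keys, block) - 1
        if pos < 0:
            raise KeyError(block)
        return self._ctids[pos]
```

Because the block-number ranges of different containers do not intersect, the entry with the largest key not exceeding a block number identifies the only container that can hold that block. The index alone cannot tell whether the block still exists; in this sketch that check is left to the container's own metadata.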
Further, based on the above implementation in which storage block numbers increase linearly, the storage manager 210 assigns storage blocks to storage containers and records the correspondence between storage blocks and storage containers as follows:
a. Obtain the current working storage container, and assign the m storage blocks to it one by one in order from the smallest block number to the largest. In a specific implementation, to ensure that the range of block numbers held by any one storage container in the system does not intersect the range held by any other storage container, the storage manager 210 configures at most one storage container to accept storage blocks at any moment; that storage container is the current working storage container.
b. Determine whether the current working storage container was a free storage container before it accepted the storage block with the smallest block number among the m storage blocks. If it was, add an index to the correspondence between storage blocks and storage containers; the key of the added index is the smallest block number among the m storage blocks, and the value of the added index is the identifier of the current working storage container. Note that a free storage container is a storage container that contains no storage blocks, such as a storage container newly created by the file system, or an old storage container that contains no storage blocks after space reclamation.
c. When the current working storage container is a full storage container, obtain an updated working storage container, and assign the remaining storage blocks of the m storage blocks to it one by one in order from the smallest block number to the largest; the updated working storage container is a free storage container. Note that a full storage container is a storage container that no longer has enough space to hold the next storage block to be assigned. For the same reason as in step a, when the current working storage container is full, the updated working storage container that is obtained must be a free storage container.
d. Add another index to the correspondence between storage blocks and storage containers; the key of this index is the smallest block number among the remaining storage blocks of the m storage blocks, and the value of this index is the identifier of the updated working storage container.
e. When the updated working storage container is a full storage container, return to step c, until the m storage blocks have been assigned to the n storage containers.
This embodiment describes in detail how to assign the n storage containers to the m storage blocks and record the correspondence between storage blocks and storage containers. The algorithm is simple and easy to implement. In a specific implementation, steps a to e may of course be split into more, smaller steps or merged into fewer steps, and the order in which the steps are executed may be changed. Because such variations are based on this embodiment and can be made without creative effort, they all fall within the scope of protection of this embodiment.
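Steps a to e can be sketched as a small allocator. This is an illustration under stated assumptions rather than the patented implementation: every container is assumed to hold a fixed number of equally sized blocks (the text only says a container is full when it cannot hold the next block), new containers are assumed to receive consecutive CTIDs, and the class and attribute names are hypothetical.

```python
class Allocator:
    """Assigns linearly increasing block numbers to one working container at a time."""

    def __init__(self, blocks_per_container):
        self.capacity = blocks_per_container  # assumption: fixed block count per container
        self.max_block = -1                   # CKIDmax; -1 means nothing allocated yet
        self.index = {}                       # key: minimum block number, value: CTID
        self.working = None                   # CTID of the current working container
        self.used = 0                         # blocks already held by the working container
        self.next_ctid = 0                    # assumption: free containers get consecutive CTIDs

    def _take_free_container(self):
        """Step c: switch to a free container as the new working container."""
        self.working = self.next_ctid
        self.next_ctid += 1
        self.used = 0

    def save(self, m):
        """Allocate m blocks for one save request and return their block numbers."""
        blocks = []
        for _ in range(m):
            block = self.max_block + 1        # CKIDnew = CKIDmax + 1
            self.max_block = block
            if self.working is None or self.used == self.capacity:
                self._take_free_container()
            if self.used == 0:
                # Steps b and d: the first block placed in a free container becomes its index key.
                self.index[block] = self.working
            self.used += 1                    # step a: assign the block to the working container
            blocks.append(block)
        return blocks
```

For example, with four blocks per container, saving three blocks and then six blocks produces the index `{0: 0, 4: 1, 8: 2}`: three containers, three index entries, regardless of how many blocks were written.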
As a preferred implementation, the storage manager 210 is specifically configured to configure the block numbers of the m storage blocks allocated each time to decrease linearly, such that the minimum block number configured for the m storage blocks is greater than the maximum block number of the storage blocks configured for the next data to be saved, or the maximum block number configured for the m storage blocks is less than the minimum block number of the storage blocks configured for the previous data to be saved;
the storage manager 210 is specifically configured to record the representative value in each index as the maximum block number of the storage blocks held by the same storage container.
In this embodiment of the present invention, the storage manager 210 configures the block numbers of the m storage blocks allocated each time to decrease linearly, such that the minimum block number configured for the m storage blocks is greater than the maximum block number of the storage blocks configured for the next data to be saved, or the maximum block number configured for the m storage blocks is less than the minimum block number of the storage blocks configured for the previous data to be saved; and it records the representative value in each index as the maximum block number of the storage blocks held by the same storage container. This keeps the algorithm simple and makes it easy to ensure that each storage container needs only one index to record all the storage blocks it holds.
Optionally, when the storage manager 210 configures the block numbers of the m storage blocks allocated each time to decrease linearly, the storage manager 210 is specifically configured to perform the following operations:
a. Obtain the current working storage container, and assign the m storage blocks to it one by one in order from the largest block number to the smallest.
b. Determine whether the current working storage container was a free storage container before it accepted the storage block with the largest block number among the m storage blocks. If it was, add an index to the correspondence between storage blocks and storage containers; the key of the added index is the largest block number among the m storage blocks, and the value of the added index is the identifier of the current working storage container.
c. When the current working storage container is a full storage container, obtain an updated working storage container, and assign the remaining storage blocks of the m storage blocks to it one by one in order from the largest block number to the smallest; the updated working storage container is a free storage container.
d. Add another index to the correspondence between storage blocks and storage containers; the key of this index is the largest block number among the remaining storage blocks of the m storage blocks, and the value of this index is the identifier of the updated working storage container.
e. When the updated working storage container is a full storage container, return to step c, until the m storage blocks have been assigned to the n storage containers.
This embodiment describes in detail how to assign the n storage containers to the m storage blocks and record the correspondence between storage blocks and storage containers. The algorithm is simple and easy to implement. As stated for the previous embodiment, variations made on the basis of this embodiment in a specific implementation all fall within the scope of protection of this embodiment.
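In the linearly decreasing variant the index key is the maximum block number of each container, so a lookup searches for the smallest key that is not less than the requested block number. The sketch below shows only this lookup; the function name and the two parallel lists are assumptions made for the example.

```python
import bisect


def lookup_descending(keys, ctids, block):
    """Find the container of `block` when each key is a container's maximum block number.

    `keys` must be sorted in ascending order and `ctids[i]` is the CTID for `keys[i]`.
    """
    pos = bisect.bisect_left(keys, block)  # first key that is >= block
    if pos == len(keys):
        raise KeyError(block)
    return ctids[pos]
```

For instance, if one container holds blocks 996 to 999 (key 999) and the next holds blocks 992 to 995 (key 995), block 996 resolves to the first container and block 995 to the second.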
Further, when the storage manager 210 configures the block numbers of the m storage blocks allocated each time to increase linearly or decrease linearly, the storage manager 210 is further configured to:
receive a defragmentation instruction and determine the storage containers to be defragmented, where the storage containers to be defragmented are the storage containers indicated by the correspondence between storage blocks and storage containers;
scan the storage containers to be defragmented and obtain the non-garbage storage blocks (that is, storage blocks that the system will still use) contained in each storage container to be defragmented;
assign new storage containers to the non-garbage storage blocks and update the correspondence between storage blocks and storage containers, where the block numbers of the non-garbage storage blocks held by each new storage container increase linearly or decrease linearly, and the range of block numbers of the storage blocks held by each new storage container does not intersect the range held by any other new storage container.
In this embodiment, after receiving the defragmentation instruction, the storage manager 210 can determine the storage containers to be scanned (this can be done in various ways and does not limit this embodiment), obtain the non-garbage storage blocks contained in each storage container to be defragmented, and assign new storage containers at the granularity of the non-garbage storage blocks while updating the correspondence between storage blocks and storage containers. Specifically, the block numbers of the non-garbage storage blocks held by each new storage container are configured to increase linearly or decrease linearly, and the range of block numbers held by each new storage container does not intersect the range held by any other new storage container. As a result, defragmentation can be performed flexibly at storage-block granularity, and after defragmentation the correspondence still guarantees that each new storage container needs only one index to record all the non-garbage storage blocks it holds, which reduces the cost of mapping virtual addresses to physical addresses.
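The regrouping step can be sketched as follows for the linearly increasing case. This is a simplified illustration: it assumes a fixed number of blocks per container and consecutive CTIDs for the new containers, and the function name is hypothetical. It does not move any data; it only computes the new layout and the new index.

```python
def defragment(live_blocks, capacity, first_new_ctid):
    """Regroup non-garbage blocks into new containers in ascending block order.

    Returns (new_index, layout): new_index maps each new container's minimum
    block number to its CTID, and layout maps each CTID to the blocks it holds.
    """
    ordered = sorted(live_blocks)
    new_index = {}
    layout = {}
    for start in range(0, len(ordered), capacity):
        ctid = first_new_ctid + start // capacity  # assumption: consecutive new CTIDs
        group = ordered[start:start + capacity]
        new_index[group[0]] = ctid                 # one index entry per new container
        layout[ctid] = group
    return new_index, layout
```

The block numbers themselves are left unchanged, so the virtual addresses recorded in file metadata stay valid; only the correspondence between storage blocks and storage containers is rewritten.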
Further, when the storage manager 210 configures the block numbers of the m storage blocks allocated each time to increase linearly or decrease linearly, the storage manager 210 is further configured to:
receive a defragmentation instruction and determine the storage containers to be defragmented, where the storage containers to be defragmented are those storage containers, among all the storage containers indicated by the correspondence between storage blocks and storage containers, that contain garbage storage blocks (that is, storage blocks that the system will no longer use or that have already been reclaimed);
scan the storage containers to be defragmented and obtain the non-garbage storage blocks contained in logically adjacent storage containers to be defragmented, where logically adjacent means that the key values of the corresponding indexes in the correspondence between storage blocks and storage containers are adjacent in magnitude (that is, the key values of the two indexes are next to each other, and the interval bounded by the two key values contains no key value of any other index);
assign new storage containers to the non-garbage storage blocks and update the correspondence between storage blocks and storage containers, where the block numbers of the non-garbage storage blocks held by each new storage container increase linearly or decrease linearly, and the range of block numbers of the storage blocks held by each new storage container does not intersect the range held by any other new storage container.
In this embodiment, after receiving the defragmentation instruction, the storage manager 210 can determine that the storage containers to be defragmented are those storage containers, among all the storage containers indicated by the correspondence between storage blocks and storage containers, that contain garbage storage blocks. It assigns new storage containers to the non-garbage storage blocks in the logically adjacent storage containers to be defragmented and updates the correspondence between storage blocks and storage containers, where the block numbers of the non-garbage storage blocks held by each new storage container increase linearly or decrease linearly, and the range of block numbers held by each new storage container does not intersect the range held by any other new storage container. As a result, defragmentation can be performed flexibly on the storage containers that contain garbage storage blocks, and after defragmentation the correspondence still guarantees that each new storage container needs only one index to record all the non-garbage storage blocks it holds, which reduces the cost of mapping virtual addresses to physical addresses.
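One way to find logically adjacent storage containers that contain garbage is to walk the index in key order and collect consecutive entries whose containers are marked as holding garbage. The sketch below is an assumption-laden illustration: the function name, the `garbage_ctids` set, and the decision to treat a single isolated container as a run of length one are all choices made for the example.

```python
def adjacent_garbage_runs(index, garbage_ctids):
    """Group containers with garbage into runs that are adjacent in index-key order.

    `index` maps a representative block number to a CTID; `garbage_ctids` is the
    set of CTIDs known to contain garbage blocks. Returns a list of CTID lists.
    """
    runs = []
    current = []
    for key in sorted(index):
        ctid = index[key]
        if ctid in garbage_ctids:
            current.append(ctid)
        elif current:
            # A container without garbage ends the current run.
            runs.append(current)
            current = []
    if current:
        runs.append(current)
    return runs
```

Compacting each run separately keeps the block-number ranges of the new containers disjoint from those of the untouched containers between runs, which preserves the one-index-per-container property.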
Further, the method further includes: after a data read request is received, obtaining the virtual address of the data to be read according to the information about the data to be read carried in the data read request, where the virtual address of the data to be read includes the block numbers of p storage blocks, and p is a natural number greater than or equal to 1. By way of example only and not limitation, the information about the data to be read includes the file name of the data to be read, the offset of the data to be read in the file, the length of the data to be read, and other related information. After this information is obtained, the metadata of the file containing the data to be read can be queried: for example, the file system directory is read first to obtain the inode (index node) information of the data to be read, and the virtual address of the data to be read, which includes the block numbers of the p storage blocks, is then looked up according to that inode information;
querying the correspondence between storage blocks and storage containers according to the block numbers of the p storage blocks, and determining the q storage containers that hold the p storage blocks, where q is a natural number greater than or equal to 1;
reading the metadata of the q storage containers and determining the physical address information of the p storage blocks. In a specific implementation, the metadata of the q storage containers records, for each storage block in the q storage containers, the check code, the data size, the position in the CT, and other related information. Because the identifiers of the q storage containers indicate the physical addresses of the q storage containers, once those physical addresses are known, combining them with the metadata in the q storage containers determines the physical address information of the p storage blocks;
reading the data to be read from the storage device according to the physical address information of the p storage blocks.
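The read path above can be sketched end to end for the linearly increasing case: block numbers are resolved to containers through the index, and each container's metadata supplies the position of the block inside the container. The function name, the shape of `container_meta`, the offset-based container addressing, and the 8M container size are assumptions made for the example.

```python
import bisect

# Assumption: containers are 8 * 1024 * 1024 bytes and addressed as base + CTID * size.
CONTAINER_SIZE = 8 * 1024 * 1024


def resolve_blocks(block_numbers, keys, ctids, container_meta, base_address=0):
    """Translate block numbers into (physical address, size) pairs.

    `keys` holds the minimum block number of each container in ascending order,
    `ctids[i]` is the CTID for `keys[i]`, and `container_meta[ctid][block]` is a
    hypothetical (offset in container, data size) record for each block.
    """
    locations = []
    for block in block_numbers:
        pos = bisect.bisect_right(keys, block) - 1  # largest key <= block
        if pos < 0:
            raise KeyError(block)
        ctid = ctids[pos]
        offset, size = container_meta[ctid][block]  # KeyError if the block is gone
        container_start = base_address + ctid * CONTAINER_SIZE
        locations.append((container_start + offset, size))
    return locations
```

The file metadata only ever stores block numbers, so moving a block to another container during defragmentation changes the index and the container metadata but not the virtual address held by the file.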
Further, the method may further include: after a data read request is received, obtaining the virtual address of the data to be read according to the information about the data to be read carried in the data read request, where the virtual address of the data to be read includes the block numbers of p storage blocks, and p is a natural number greater than or equal to 1;
querying the correspondence between storage blocks and storage containers according to the block numbers of the p storage blocks, and determining the q storage containers that hold the p storage blocks, where q is a natural number greater than or equal to 1;
sending the identifiers of the q storage containers, together with the position information of the p storage blocks recorded in the metadata of the q storage containers, to the storage device, so that the storage device determines the physical address information of the p storage blocks;
reading the data to be read from the storage device according to the physical address information of the p storage blocks. In a specific implementation, the metadata records, for each storage block in the storage container corresponding to the block number of a storage block to be addressed, the check code, the data size, the position in the CT, and other related information.
FIG. 2B is a schematic composition diagram of a specific implementation of the storage manager according to an embodiment of the present invention. FIG. 2B includes a storage manager A (not numbered), a storage device 230, and a storage manager B (not numbered). Two storage managers, A and B, are shown only for ease of description; storage manager B may serve as a backup of storage manager A. The number of storage managers does not limit the embodiments of the present invention, and in general a single storage manager is sufficient to implement them. Storage manager A includes a processor 211, an interface for interacting with the storage device (not shown), and a memory 212. The processor 211 and the memory 212 communicate over a bus (not numbered), and the processor 211 executes computer instructions in the memory 212 so that storage manager A performs operations including, but not limited to, those of the embodiment in FIG. 2A. Storage manager A communicates with the storage device 230 through the interface for interacting with the storage device, and the storage device 230 is configured to store data forwarded by storage manager A. The functions or methods implemented by storage manager A are similar to those of the storage device 220 in FIG. 2A and are not described again here.
FIG. 3 is an exemplary flowchart of a method for managing stored data according to an embodiment of the present invention. The method may be applied to, but is not limited to, the storage system shown in FIG. 2A or the application scenario shown in FIG. 2B. Although the flow of the method 300 described below includes multiple operations that appear in a particular order, it should be clearly understood that the flow may include more operations or be merged into fewer operations, that the operations may be executed sequentially or in parallel (for example, by using parallel processors or a multi-threaded environment), and that the order of the steps may be changed; all such variations fall within the scope of protection of the embodiments of the present invention. As shown in FIG. 3, the method includes:
Step S310: Each time a data save request is received, allocate m storage blocks for the data to be saved this time, where each storage block represents a segment of virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1. Note that, as already described in the embodiment of FIG. 2A, receiving a data save request each time may mean receiving one data save request at a time or receiving multiple data save requests at a time.
Step S320: Assign n storage containers to the m storage blocks, where each storage container represents a segment of physical storage space on the storage device, and n is a natural number greater than or equal to 1. The segment of physical storage space has been explained in detail in the embodiment of FIG. 2A and is not described again here.
Step S330: Update the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, where the correspondence between storage blocks and storage containers records which storage containers hold the storage blocks that have already been allocated.
Step S340: Record the block numbers of the m storage blocks in the metadata of the data to be saved this time, where the block numbers of the m storage blocks serve as the virtual address of the data to be saved this time.
According to the technical solution provided in this embodiment of the present invention, m storage blocks are allocated each time for the data to be saved, n storage containers are assigned to the m storage blocks, the correspondence between storage blocks and storage containers is updated according to the correspondence between the m storage blocks and the n storage containers, and the block numbers of the m storage blocks are recorded in the metadata of the data to be saved this time and serve as its virtual address. The correspondence between storage blocks and storage containers is thereby recorded, and the virtual address of the data is independent of the storage container in which the data resides. Consequently, defragmentation does not require migrating whole storage containers as a unit, which improves the efficiency and flexibility of defragmentation and greatly improves disk space utilization.
Further, to simplify recording and management, each storage container in step S320 is configured with a unique identifier, and the identifier of each storage container indicates the physical address corresponding to that storage container. The embodiment of FIG. 2A has described in detail how the identifier of each storage container is used to indicate the corresponding physical address, and details are not described again here.
Further, the correspondence between storage blocks and storage containers in step S330 includes multiple indexes, where each index indicates where all the storage blocks assigned to one storage container point, the key of each index is a representative value of the block numbers of all storage blocks held by that storage container, and the value of each index is the identifier of that storage container.
According to the technical solution provided in this embodiment of the present invention, the correspondence between storage blocks and storage containers is recorded by using multiple indexes. Each index indicates where all the storage blocks assigned to one storage container point, the key of each index is configured as a representative value of the block numbers of all storage blocks held by that storage container, and the value of each index is configured as the identifier of that storage container. Therefore, while the correspondence can still be recorded, each storage container needs only one index to record all the storage blocks it holds, which reduces the mapping cost of the correspondence between storage blocks and storage containers.
优选的,在所述存储块与存储容器的对应关系包括多条索引的情况下,每次分配的所述m个存储块的块号配置为线性递增,且所述m个存储块被配置的块号的最小值大于前一次待保存数据所配置的存储块的块号的最大值 或者所述m个存储块被配置的块号的最大值小于后一次待保存数据所配置的存储块的块号的最小值;Preferably, in a case that the correspondence between the storage block and the storage container includes multiple indexes, the block number of the m storage blocks allocated each time is configured to be linearly incremented, and the m storage blocks are configured. The minimum value of the block number is greater than the maximum value of the block number of the memory block configured by the previous data to be saved. Or the maximum value of the block number of the m storage blocks being configured is smaller than the minimum value of the block number of the storage block configured by the data to be saved in the previous time;
the representative value in each index is the smallest block number among the storage blocks held by the same storage container.
In this embodiment of the present invention, the block numbers of the m storage blocks allocated each time are configured to increase linearly, and the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the previous data to be saved, or the maximum block number of the m storage blocks is less than the minimum block number of the storage blocks configured for the next data to be saved; the representative value in each index is recorded as the smallest block number among the storage blocks held by the same storage container. This keeps the algorithm simple and makes it easy for each storage container to need only one index to record all the storage blocks it holds.
Preferably, when the block numbers of the m storage blocks allocated each time are configured to increase linearly, specifying n storage containers for the m storage blocks, and updating the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, includes:
a. Obtain the currently working storage container, and assign the m storage blocks to it one by one in order from the smallest block number to the largest;
b. Determine whether the currently working storage container was an idle storage container before it received the storage block with the smallest block number among the m storage blocks; if it was, add an index to the correspondence between storage blocks and storage containers, where the key of the added index is the smallest block number among the m storage blocks and the value of the added index is the identifier of the currently working storage container;
c. When the currently working storage container becomes a full storage container, obtain an updated working storage container, which is an idle storage container, and assign the remaining storage blocks of the m storage blocks to it one by one in order from the smallest block number to the largest;
d. Add a further index to the correspondence between storage blocks and storage containers, where the key of the further index is the smallest block number among the remaining storage blocks of the m storage blocks and the value of the further index is the identifier of the updated working storage container;
e. When the updated working storage container becomes a full storage container, return to step c, until all m storage blocks have been assigned to the n storage containers.
Note that the technical terms used in the above embodiment, such as currently working storage container, full storage container, and idle storage container, have been described in detail in the embodiment of FIG. 2A and are not described again here. This embodiment also explains in detail how to allocate the n storage containers to the m storage blocks and record the correspondence between storage blocks and storage containers; the algorithm is simple and easy to implement. In a specific implementation, steps a to e may be split into more, smaller steps or merged into fewer steps, and the order of execution between steps may be changed. Because such variations are based on this embodiment and can be achieved without creative effort, they all fall within the scope of protection of this embodiment.
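Steps a to e can be sketched as follows. This is only a minimal illustration, not the implementation of the embodiment: it assumes a hypothetical fixed `capacity` counted in blocks per container, and the names `assign_blocks`, `index`, `containers`, and `state` are invented for the example.

```python
def assign_blocks(block_ids, index, containers, state, capacity):
    """Assign ascending block numbers to containers, one index per container.

    index:      dict mapping smallest block number -> container id
    containers: dict mapping container id -> list of block numbers held
    state:      dict with 'working' (current container id) and 'next_id'
    capacity:   hypothetical fixed number of blocks a container can hold
    """
    for ckid in sorted(block_ids):                 # step a: small to large
        working = state["working"]
        if len(containers[working]) >= capacity:   # steps c/e: container is full
            working = state["next_id"]             # switch to a fresh idle container
            state["next_id"] += 1
            containers[working] = []
            state["working"] = working
        if not containers[working]:                # steps b/d: container was idle
            index[ckid] = working                  # key = smallest block number
        containers[working].append(ckid)


index, containers = {}, {1: []}
state = {"working": 1, "next_id": 2}
assign_blocks(range(1, 7), index, containers, state, capacity=4)   # first batch
assign_blocks(range(7, 11), index, containers, state, capacity=4)  # next batch
print(index)       # one entry per container
print(containers)
```

Because each batch starts above the previous batch's largest block number, a container that is only partly filled by one batch can keep accepting blocks from the next without needing a second index entry.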
Preferably, when the correspondence between storage blocks and storage containers includes multiple indexes, the block numbers of the m storage blocks allocated each time may instead be configured to decrease linearly, and the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the next data to be saved, or the maximum block number of the m storage blocks is less than the minimum block number of the storage blocks configured for the previous data to be saved;
the representative value in each index is the largest block number among the storage blocks held by the same storage container.
In this embodiment of the present invention, the block numbers of the m storage blocks allocated each time are configured to decrease linearly, and the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the next data to be saved, or the maximum block number of the m storage blocks is less than the minimum block number of the storage blocks configured for the previous data to be saved; the representative value in each index is recorded as the largest block number among the storage blocks held by the same storage container. This keeps the algorithm simple and makes it easy for each storage container to need only one index to record all the storage blocks it holds.
Preferably, when the block numbers of the m storage blocks allocated each time are configured to decrease linearly, specifying n storage containers for the m storage blocks, and updating the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, includes:
a. Obtain the currently working storage container, and assign the m storage blocks to it one by one in order from the largest block number to the smallest;
b. Determine whether the currently working storage container was an idle storage container before it received the storage block with the largest block number among the m storage blocks; if it was, add an index to the correspondence between storage blocks and storage containers, where the key of the added index is the largest block number among the m storage blocks and the value of the added index is the identifier of the currently working storage container;
c. When the currently working storage container becomes a full storage container, obtain an updated working storage container, which is an idle storage container, and assign the remaining storage blocks of the m storage blocks to it one by one in order from the largest block number to the smallest;
d. Add a further index to the correspondence between storage blocks and storage containers, where the key of the further index is the largest block number among the remaining storage blocks of the m storage blocks and the value of the further index is the identifier of the updated working storage container;
e. When the updated working storage container becomes a full storage container, return to step c, until all m storage blocks have been assigned to the n storage containers.
This embodiment explains in detail how to allocate the n storage containers to the m storage blocks and record the correspondence between storage blocks and storage containers; the algorithm is simple and easy to implement. As stated for the previous embodiment, variations based on this embodiment in a specific implementation all fall within the scope of protection of this embodiment.
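In the descending variant the index key is the largest block number in a container, so a lookup has to find the smallest key that is greater than or equal to the requested block number. The following is a minimal sketch under that reading; the function name `find_container_desc` and the sample data are assumptions made for illustration, and the index is modeled as two parallel lists sorted by key.

```python
import bisect


def find_container_desc(keys, ctids, ckid):
    """Return the container that would hold block `ckid`.

    keys:  index keys (largest block number per container), sorted ascending
    ctids: container identifiers, in the same order as `keys`
    """
    pos = bisect.bisect_left(keys, ckid)   # first key >= ckid
    if pos == len(keys):
        raise KeyError(ckid)               # above every recorded key
    return ctids[pos]


# Containers filled in descending order (sample data):
# CT1 holds 100..97, CT2 holds 96..93, CT3 holds 92..91.
keys = [92, 96, 100]
ctids = [3, 2, 1]
print(find_container_desc(keys, ctids, 94))
```

The index only narrows the search to one candidate container; whether the block is actually present, and where, would still be resolved from that container's own metadata.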
Further, when the block numbers of the m storage blocks allocated each time are configured to increase linearly or decrease linearly, the method further includes:
receiving a defragmentation instruction and determining the storage containers to be defragmented, where the storage containers to be defragmented are the storage containers indicated by the correspondence between storage blocks and storage containers;
scanning the storage containers to be defragmented and obtaining the non-garbage storage blocks (that is, storage blocks that will still be used by the system) contained in each of them;
assigning new storage containers to the non-garbage storage blocks and updating the correspondence between storage blocks and storage containers, where the block numbers of the non-garbage storage blocks held by each new storage container increase linearly or decrease linearly, and the range of block numbers held by each new storage container does not intersect the range of block numbers held by any other new storage container.
In this embodiment, after a defragmentation instruction is received, the storage containers to be scanned can be determined (this can be done in several ways and is not a limitation on the embodiments of this solution), the non-garbage storage blocks contained in each storage container to be defragmented are obtained, new storage containers are assigned at the granularity of the non-garbage storage blocks, and the correspondence between storage blocks and storage containers is updated. Specifically, this is achieved by making the block numbers of the non-garbage storage blocks held by each new storage container increase linearly or decrease linearly, and by ensuring that the range of block numbers held by each new storage container does not intersect the range held by any other new storage container. As a result, defragmentation can be carried out flexibly at storage block granularity, and after defragmentation the correspondence between storage blocks and storage containers still guarantees that each new storage container needs only one index to record all the non-garbage storage blocks it holds, which reduces the cost of mapping virtual addresses to physical addresses.
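The repacking described above can be illustrated roughly as follows, for the ascending case. This is a sketch under stated assumptions rather than the method of the embodiment: containers are modeled as lists of block numbers, `live` is a hypothetical set of non-garbage block numbers produced by space reclamation, and `capacity` is a hypothetical fixed block count per container.

```python
def defragment(containers, live, capacity, next_id):
    """Repack non-garbage blocks into new containers and rebuild the index.

    containers: dict container id -> list of block numbers (to be defragmented)
    live:       set of block numbers still used by the system
    capacity:   hypothetical fixed number of blocks per container
    next_id:    first unused container id
    Returns (new_containers, new_index).
    """
    # Sorting the survivors makes every new container ascending and keeps the
    # block-number ranges of different new containers disjoint.
    survivors = sorted(
        b for blocks in containers.values() for b in blocks if b in live
    )
    new_containers, new_index = {}, {}
    for start in range(0, len(survivors), capacity):
        chunk = survivors[start:start + capacity]
        new_containers[next_id] = chunk
        new_index[chunk[0]] = next_id      # key = smallest block number
        next_id += 1
    return new_containers, new_index


old = {1: [1, 2, 3, 4], 2: [5, 6, 7, 8], 3: [9, 10]}
live = {2, 4, 5, 8, 9, 10}
new_containers, new_index = defragment(old, live, capacity=4, next_id=4)
print(new_containers)
print(new_index)
```

Three partly empty containers collapse into two, and the rebuilt index still has exactly one entry per container.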
Further, when the block numbers of the m storage blocks allocated each time are configured to increase linearly or decrease linearly, the method further includes:
receiving a defragmentation instruction and determining the storage containers to be defragmented, where the storage containers to be defragmented are those, among all storage containers indicated by the correspondence between storage blocks and storage containers, that contain garbage storage blocks (that is, storage blocks that will no longer be used by the system or that have already been reclaimed);
scanning the storage containers to be defragmented and obtaining the non-garbage storage blocks contained in logically adjacent storage containers to be defragmented, where logically adjacent means that the keys of the corresponding indexes in the correspondence between storage blocks and storage containers are adjacent in value (adjacency of key values has been described in detail in the embodiment of FIG. 2A and is not described again);
assigning new storage containers to the non-garbage storage blocks and updating the correspondence between storage blocks and storage containers, where the block numbers of the non-garbage storage blocks held by each new storage container increase linearly or decrease linearly, and the range of block numbers held by each new storage container does not intersect the range of block numbers held by any other new storage container.
In this embodiment, after a defragmentation instruction is received, the storage containers that contain garbage storage blocks, among all storage containers indicated by the correspondence between storage blocks and storage containers, are determined to be the storage containers to be defragmented. New storage containers are assigned to the non-garbage storage blocks in logically adjacent storage containers to be defragmented, and the correspondence between storage blocks and storage containers is updated, where the block numbers of the non-garbage storage blocks held by each new storage container increase linearly or decrease linearly, and the range of block numbers held by each new storage container does not intersect the range held by any other new storage container. As a result, defragmentation can be carried out flexibly on just the storage containers that contain garbage storage blocks, and after defragmentation the correspondence between storage blocks and storage containers still guarantees that each new storage container needs only one index to record all the non-garbage storage blocks it holds, which reduces the cost of mapping virtual addresses to physical addresses.
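One way to picture the selection of logically adjacent containers is to walk the index in key order and collect consecutive containers that contain garbage. The sketch below assumes that reading of "logically adjacent"; the function name `adjacent_runs`, the `live` set, and the sample data are illustrative assumptions. Merging only within such a run keeps the new block-number ranges from overlapping containers that were left untouched.

```python
def adjacent_runs(index, containers, live):
    """Group garbage-holding containers into runs of logically adjacent ones.

    index:      dict smallest block number -> container id
    containers: dict container id -> list of block numbers
    live:       set of block numbers still used by the system
    Returns a list of runs, each a list of container ids in key order.
    """
    runs, current = [], []
    for key in sorted(index):                       # walk the index by key value
        ctid = index[key]
        has_garbage = any(b not in live for b in containers[ctid])
        if has_garbage:
            current.append(ctid)                    # extend the current run
        else:
            if current:                             # a clean container ends the run
                runs.append(current)
            current = []
    if current:
        runs.append(current)
    return runs


index = {1: 1, 5: 2, 9: 3, 13: 4}
containers = {
    1: [1, 2, 3, 4],
    2: [5, 6, 7, 8],
    3: [9, 10, 11, 12],
    4: [13, 14, 15, 16],
}
live = {1, 2, 5, 9, 10, 11, 12, 13, 14}
runs = adjacent_runs(index, containers, live)
print(runs)
```

In this sample, containers 1 and 2 form one run and container 4 another, because container 3 holds no garbage and separates them.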
FIG. 4 is a schematic structural diagram of a CK2C (Chunk to Container) table, a mapping table from storage blocks to storage containers, created according to an embodiment of the present invention. The CK2C table can be used to record the correspondence between storage blocks and storage containers; that is, the CK2C table is a preferred way of organizing the correspondence between storage blocks and storage containers in the above embodiments. For ease of description, the block numbers of the storage blocks recorded in the CK2C table are linearly increasing and never reused (each is unique). That is, the virtual address/CKID allocated to each new storage block differs from the virtual address/CKID of every earlier storage block, and the allocated CKIDs keep increasing linearly; they could equally keep decreasing linearly, which is not shown here. Preferably, the CKID is a 64-bit unsigned integer (int type), in which case the global addressing space of the file system is [0, 2^64-1]. Note that the embodiments of the present invention do not limit the order of magnitude of the CKID, and a system may adjust it as needed. Depending on the application scenario, the CK2C table may be implemented as a linear table or a B+ tree; which is used is not a limitation on the embodiments of the present invention.
As shown in FIG. 4, the CK2C table contains multiple indexes (each column in the CK2C table is one index; for example, <CKID1,CTID1> is one index). Indexes and storage containers correspond one to one, so the number of indexes in the CK2C table equals the number of storage containers (CTs) recorded in the table (for ease of description only three CTs, and therefore three indexes, are shown; this is not a limitation on the present invention). Note also that the CKID ranges of the storage containers corresponding to the indexes never coincide. For example, in the figure the CKID range of CT1 is 1 to 4, the CKID range of CT2 is 5 to 8, and so on; the CKID ranges of the CTs never overlap or intersect. In a specific implementation this can be achieved in several ways. As an example only and not a limitation, at any moment the file system has at most one container in the working state (the currently working storage container). When new storage blocks are written (the embodiments of the present invention do not limit their number), they are written to the currently working storage container; if that container is full or its remaining free space is insufficient, a new storage container (an idle container) can be created as the new working container to store the new storage blocks. Note also that a previously used storage container cannot be reused even if it still has free space (that is, it will not be called again as the new working container to store new storage blocks), unless space reclamation or defragmentation has turned it into a blank container (one that holds no storage blocks), in which case it can be called as the new working container. Combined with the fact that CKIDs increase or decrease linearly and are never reused, this guarantees that the CKID ranges of the CTs never overlap or intersect. Alternatively, non-overlapping CKID ranges can be achieved by allocating different CKID segments to different CTs; this is not a limitation on the embodiments of this solution.
Optionally, when CKIDs increase linearly, the CK2C table is recorded as key-value pairs. The key of each index recorded in the CK2C table is uniformly configured as the smallest CKID in the storage container CT corresponding to that index (as shown in FIG. 4), and the value of each index is the CTID of that storage container; each CTID uniquely determines one physical address. As an example only and not a limitation, a CTID may correspond directly to a physical address through an offset calculation (for example, if the CT size is 8M, then CTID*8M=PA), or it may be mapped to a physical address through a mapping. Therefore, once the virtual address (CKID) of some data is known, its physical address can be uniquely determined by querying the CK2C table. As an example only, the query may proceed as follows when the key (primary key) of each index is the smallest CKID in the corresponding storage container CT and the CKIDs of the file system increase linearly and are never reused. Taking FIG. 4 as an example, suppose the virtual address of the data whose physical address is required is CKID7. First check whether the CK2C table contains a key equal to CKID7; if it does, the search ends. If not, determine all keys in the CK2C table that are smaller than CKID7 (CKID1 and CKID5 in FIG. 4), and from them select the key closest in value to CKID7, which is CKID5 in FIG. 4. CKID7 must then fall within the storage container whose key is CKID5 (CTID2). Finally, querying the metadata inside CTID2 together with the physical address PA2 corresponding to CTID2 uniquely determines the physical address of CKID7.
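The query above amounts to finding the largest key that does not exceed the requested CKID. A minimal sketch follows; it assumes the linear-table form of the CK2C table held as two parallel sorted lists, takes the 8M container size of the example as 8 MiB, and uses the offset formula CTID*8M=PA. The function name `locate` is invented for the illustration, and the per-block offset from container metadata is not modeled.

```python
import bisect

CONTAINER_SIZE = 8 * 1024 * 1024   # 8M example size from the text, taken as 8 MiB


def locate(keys, ctids, ckid):
    """Return (CTID, container base address) for a block number.

    keys:  smallest CKID of each container, sorted ascending
    ctids: container identifiers, in the same order as `keys`
    """
    pos = bisect.bisect_right(keys, ckid) - 1   # largest key <= ckid
    if pos < 0:
        raise KeyError(ckid)                    # below every recorded key
    ctid = ctids[pos]
    return ctid, ctid * CONTAINER_SIZE          # PA = CTID * container size


# The three indexes of FIG. 4: <CKID1,CTID1>, <CKID5,CTID2>, <CKID9,CTID3>
keys = [1, 5, 9]
ctids = [1, 2, 3]
ctid, base = locate(keys, ctids, 7)
print(ctid, base)
```

A binary search handles the exact-match case and the closest-smaller-key case in one step, so the two-stage check in the prose collapses into a single lookup.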
The embodiments of this solution provide a CK2C table for recording the correspondence between storage blocks and storage containers. The smallest or largest CKID in a storage container serves as the key of that container's index in the CK2C table, the CKID serves as the virtual address of the data, and the CKID ranges of the storage containers do not overlap or intersect. Each CT therefore corresponds to only one CK2C index, and one index is enough to determine the physical addresses of all storage blocks in a storage container CT. The CK2C table is small, which reduces the cost of mapping virtual addresses to physical addresses.
FIG. 5 is an exemplary flowchart of a method for managing stored data according to an embodiment of the present invention. The method uses the CK2C table shown in FIG. 4 to manage and record the spatial location information of data. As an example only and not a limitation, in this method CKIDs are configured to increase linearly and are never reused, and the key of each index recorded in the CK2C table is uniformly configured as the smallest CKID in the storage container CT corresponding to that index. Note that although the flow described below includes multiple operations in a particular order, these operations may be split into more operations or merged into fewer, and they may be executed sequentially or in parallel (for example, using parallel processors or a multithreaded environment). As shown in FIG. 5, the method includes the following steps:
Step S510: receive a new storage block to be saved. In a specific implementation, the new storage block to be saved may be a single storage block, or it may be one of the m storage blocks generated in step S310 of the embodiment of FIG. 3, delivered one by one in order from the smallest block number to the largest. The description below uses the m storage blocks of the embodiment of FIG. 3 as an example; this is not a limitation on the embodiments of the present invention.
Step S511: obtain the currently working storage container (described in the embodiments of FIG. 2A and FIG. 4 and not described again here).
Step S512: determine whether the free space of the currently working storage container (working CT) is greater than or equal to the data size (CK Size) of the new storage block to be saved, that is, whether the working CT can hold the new storage block.
Step S513: if the free space of the currently working storage container is greater than or equal to the data size (CK Size) of the new storage block to be saved, take a space of size CK Size out of the free space of the currently working storage container and allocate it to the new storage block. As a supplementary example only, in a specific implementation, after the currently working storage container receives the new storage block, the metadata in that container needs to record a series of attributes of the new storage block, such as its size, its check code, and its position in the CT.
Step S514: if the free space of the currently working storage container is smaller than the data size (CK Size) of the new storage block to be saved, replace the currently working storage container so that the space requirement of the new storage block can be met. In a specific implementation, a new storage container (an idle container that holds no storage blocks) may be created as the updated working CT to hold the new storage block; alternatively, a previously used container that has become a blank container after space reclamation may be selected as the updated working CT to hold the new storage block.
Step S515: take a space of size CK Size out of the free space of the updated working CT and allocate it to the new storage block to be saved. The other operations are similar to step S513 and are not described again here.
Step S516: because the updated working CT has no record in the CK2C table, a new index must be added to (inserted into) the CK2C table to record it. The key of the new index is the block number CKIDnew of the new storage block to be saved, and the value is the container number CTID of the updated working CT. In a specific implementation, the new index is preferably inserted in order of key (CKID) value among the indexes already in the CK2C table.
Step S517: finally, return the virtual address for the new storage block to be saved, that is, its block number CKIDnew. The block numbers of the m storage blocks in the embodiment of FIG. 3 are thus the virtual addresses of the data to be saved.
Note that, to make the solution more complete, the following operation may preferably be added after step S513: determine whether the currently working storage container (working CT) already contains storage blocks. If it does, the CK2C table already holds an index for that container, and step S517 is executed directly. If the currently working storage container is found to be an idle container, that is, it contains no storage blocks, the CK2C table holds no index for it, so an index recording the currently working storage container must first be inserted into the CK2C table; the key of that index is the block number CKIDnew of the new storage block to be saved, and its value is the CTID of the currently working storage container. This case is unusual, because the working storage container is generally not an idle container, but it cannot be ruled out: for example, at system start-up a new idle container is created as the working CT and then a request for a VA arrives.
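Steps S510 to S517, together with the idle-working-container check described above, can be sketched as follows. This is an illustrative model only: the class `StorageManager`, its fields, and the choice to allocate the CKID inside `store` are assumptions made for the example, and always creating a new container in S514 is just one of the two options the text allows. Container metadata (block size, check code, position) is not modeled.

```python
import itertools


class StorageManager:
    """Toy model of the FIG. 5 flow with ascending, never-reused CKIDs."""

    def __init__(self, container_size):
        self.container_size = container_size
        self.ck2c = {}                       # smallest CKID -> CTID
        self.free = {}                       # CTID -> remaining free space
        self.used = {}                       # CTID -> number of blocks held
        self._ckids = itertools.count(1)     # next CKID to hand out
        self._ctids = itertools.count(1)     # next CTID to hand out
        self.working = self._new_container()

    def _new_container(self):
        ctid = next(self._ctids)
        self.free[ctid] = self.container_size
        self.used[ctid] = 0
        return ctid

    def store(self, ck_size):
        """Place one new block of size ck_size; return its virtual address."""
        if ck_size > self.container_size:
            raise ValueError("block is larger than a container")
        if self.free[self.working] < ck_size:    # S512 fails, so S514
            self.working = self._new_container()
        ctid = self.working                      # S511
        ckid = next(self._ckids)
        if self.used[ctid] == 0:                 # S516, or an idle working CT
            self.ck2c[ckid] = ctid               # key = first (smallest) CKID
        self.free[ctid] -= ck_size               # S513 / S515
        self.used[ctid] += 1
        return ckid                              # S517


mgr = StorageManager(container_size=100)
ids = [mgr.store(size) for size in (40, 40, 40, 70, 20)]
print(ids)
print(mgr.ck2c)
```

Checking `used[ctid] == 0` in one place covers both the freshly created container of S516 and the unusual case where the existing working container is still idle.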
FIG. 6A is an exemplary flowchart of one implementation of a defragmentation method according to an embodiment of the present invention. The method uses the CK2C table shown in FIG. 4 to manage and record the spatial location information of data. By way of example and not limitation, CKIDs in this method are configured to increase linearly and are never reused (CKIDs could equally be configured to decrease linearly and never be reused; linear increase is chosen here only as an example), and the key of every index recorded in the CK2C table is uniformly configured as the smallest CKID in the storage container CT that the index corresponds to. Note that although the flow described below comprises multiple operations presented in a particular order, these operations may be split into more operations or merged into fewer, and they may be executed sequentially or in parallel (for example, with parallel processors or in a multithreaded environment). As shown in FIG. 6A, the method includes the following steps:
Step S611: Receive a defragmentation request. In a specific implementation, the request may be issued periodically, or it may be issued after space reclamation. After reclamation it is already known which CKs in each CT are garbage (no longer used by the system), so defragmentation does not need to scan the whole file system and can operate directly at the storage-block level, which keeps the logic simple. This is not a limitation on the embodiments of the present invention.
Step S612: Obtain a batch of CTs to be collated from the CK2C table. The CTs to be collated are those storage containers, among all storage containers indicated in the CK2C table, that contain garbage storage blocks. Further, at least two groups of to-be-collated storage containers with consecutive virtual addresses are determined. Each group contains k storage containers to be collated, which appear in the correspondence between storage blocks and storage containers as k logically consecutive indexes, with no index of any other storage container between them. For example, CT1, CT2, CT3, and CT4 in FIG. 6B form one group of storage containers with consecutive virtual addresses, and CK1, CK6, CK10, and CK16 are four consecutive indexes. CT1, CT3, and CT4, by contrast, do not form such a group: CK1, CK10, and CK16 are not consecutive indexes, because CK6 lies between them. More than one group may be needed because some storage containers contain no garbage storage blocks. Such containers are not treated as containers to be collated, so the virtual addresses of the containers to be collated may have gaps. In that case the containers are processed group by group, which ensures that the range of storage blocks held by each new storage container obtained after defragmentation does not intersect the range held by any other storage container. Scanning and grouping may proceed in order of key size; CT1, CT2, CT3, and CT4 in FIG. 6B are used as the example below.
Step S613: Scan the non-garbage CKs of the four CTs to be collated (CT1, CT2, CT3, CT4) one CT at a time, in ascending order of their keys, that is, in the order CT1→CT2→CT3→CT4.
Step S614: Determine whether the non-garbage CKs scanned so far fill one storage container (that is, no space remains for the next non-garbage CK), or whether all CTs to be collated (CT1, CT2, CT3, CT4) have been scanned. If so, go to step S615; otherwise, return to step S613. In FIG. 6B, for example, when CK8 is scanned, CK1 to CK8 are found to fill one storage container, so step S615 is executed next.
Step S615: When the scanned non-garbage CKs fill one storage container, or all CTs to be collated (CT1, CT2, CT3, CT4) have been scanned, apply for a new CT (a free container), such as the newly created storage container CT5 in FIG. 6B.
Step S616: Migrate the scanned non-garbage CKs into the new CT. In FIG. 6B, CK1 to CK8 are migrated into the new container CT5.
Step S617: Insert an index into the CK2C table to record the correspondence between the storage blocks in the new CT and the new CT. In FIG. 6B, the index <CKID1, CTID5> is inserted into the CK2C table.
Step S618: Determine whether scanning is complete. If so, go to step S619; otherwise, return to step S613.
Step S619: When all storage containers to be collated have been scanned, end the defragmentation operation. In FIG. 6B, this is the point at which CT4 has been fully scanned.
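Steps S613 to S619 can be sketched for a single group of logically adjacent containers as follows. This is an illustrative sketch only: the function name `defragment`, the fixed container capacity, the dictionary that models container contents, and the choice to free the old containers and drop their indexes are assumptions, and the sample garbage set is invented rather than read from FIG. 6B.

```python
import itertools


def defragment(ck2c, containers, garbage, capacity, new_ctid):
    """Compact one group of logically adjacent containers.

    ck2c       -- list of (smallest CKID, CTID) pairs sorted by key; updated in place
    containers -- dict mapping CTID -> ascending list of CKIDs it holds
    garbage    -- set of CKIDs the system no longer uses
    capacity   -- blocks per container (assumed to be fixed)
    new_ctid   -- callable that returns an unused container ID
    """
    old_ctids = [ctid for _, ctid in ck2c]
    pending = []    # live blocks scanned but not yet migrated
    new_index = []  # replacement CK2C entries

    def flush():
        ctid = new_ctid()                     # S615: apply for a new CT
        containers[ctid] = list(pending)      # S616: migrate the live blocks
        new_index.append((pending[0], ctid))  # S617: key is the smallest CKID
        pending.clear()

    for ctid in old_ctids:                    # S613: scan CTs in key order
        for ckid in containers[ctid]:
            if ckid in garbage:
                continue
            pending.append(ckid)
            if len(pending) == capacity:      # S614: a full container
                flush()
    if pending:                               # S614: every CT was scanned
        flush()

    for ctid in old_ctids:                    # assumption: old CTs are freed
        del containers[ctid]
    ck2c[:] = new_index


# Hypothetical data loosely modelled on FIG. 6B (the garbage set is invented).
containers = {1: [1, 2, 3, 4, 5], 2: [6, 7, 8, 9],
              3: [10, 11, 12, 13, 14, 15], 4: [16, 17, 18, 19, 20]}
ck2c = [(1, 1), (6, 2), (10, 3), (16, 4)]
garbage = {2, 4, 7, 9, 11, 12, 14, 17, 19}
defragment(ck2c, containers, garbage, capacity=8,
           new_ctid=itertools.count(5).__next__)
print(ck2c)  # prints [(1, 5), (16, 6)]
```

Because the blocks are scanned in ascending key order, each new container receives a contiguous slice of the live block numbers, so the ranges held by the new containers cannot overlap.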
FIG. 6B is a schematic diagram of the principle of a defragmentation method according to an embodiment of the present invention. The method uses the CK2C table shown in FIG. 4 to manage and record the spatial location information of data. By way of example and not limitation, CKIDs in this method are configured to increase linearly and are never reused (CKIDs could equally be configured to decrease linearly and never be reused; linear increase is chosen here only as an example), and the key of every index recorded in the CK2C table is uniformly configured as the smallest CKID in the storage container CT that the index corresponds to. FIG. 6B applies the defragmentation method of FIG. 6A: the non-garbage CKs in the four original containers to be collated, CT1, CT2, CT3, and CT4, are migrated into the new storage containers CT5 and CT6, and after defragmentation the CK2C table that records the correspondence between storage blocks and storage containers holds only two indexes, one for CT5 and one for CT6. The number of indexes in the CK2C table therefore stays dynamically consistent with the number of storage containers and does not grow as the system keeps running, which ultimately reduces the cost of mapping virtual addresses to physical addresses.
FIG. 7 is a schematic diagram of the logical structure of a storage manager 700 according to an embodiment of the present invention. The storage manager 700 may serve as, but is not limited to, the storage manager 210 in FIG. 2A or the storage manager A in FIG. 2B, and it may perform, but is not limited to, the methods described with reference to FIG. 3, FIG. 5, and FIG. 6A. As shown in FIG. 7, the storage manager 700 includes a storage block management module 710, a storage container management module 720, and a recording module 730.
The storage block management module 710 is configured to allocate, each time a data save request is received, m storage blocks for the data to be saved this time, where each storage block represents a segment of virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1;
The storage container management module 720 is configured to assign n storage containers to the m storage blocks, where each storage container represents a segment of physical storage space on a storage device, and n is a natural number greater than or equal to 1;
The recording module 730 is configured to update the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, where the correspondence between storage blocks and storage containers records which storage containers hold the storage blocks that have already been allocated. The recording module 730 is further configured to record the block numbers of the m storage blocks in the metadata of the data to be saved this time, where the block numbers of the m storage blocks serve as the virtual address of that data.
Optionally, to simplify recording and management, each storage container is configured with a unique identifier, and the identifier of each storage container points to the physical address corresponding to that storage container.
Optionally, when each storage container is configured with a unique identifier, the correspondence between storage blocks and storage containers recorded by the recording module 730 includes multiple indexes. Each index expresses where all the storage blocks assigned to one storage container point: the key of each index is a representative value of the block numbers of the storage blocks held by that storage container, and the value of each index is the identifier of that storage container.
Optionally, the block numbers of the m storage blocks allocated each time by the storage block management module 710 are configured to increase linearly, and either the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the previous data to be saved, or the maximum block number of the m storage blocks is less than the minimum block number of the storage blocks configured for the next data to be saved;
The recording module 730 is specifically configured to record, as the representative value in each index, the smallest block number of the storage blocks held by the corresponding storage container.
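The passage does not spell out how a block number is resolved to a container. One plausible reading, given that each key is the smallest block number in its container and block numbers only increase, is a range search over the sorted keys. The sketch below assumes the correspondence is held as two parallel lists; the function name `find_container` and this layout are illustrative and not taken from the patent.

```python
import bisect


def find_container(keys, ctids, ckid):
    """Return the identifier of the container whose key range covers ckid.

    keys  -- ascending list of index keys (smallest block number per container)
    ctids -- container identifiers, parallel to keys
    """
    # The covering index is the last one whose key is <= ckid.
    pos = bisect.bisect_right(keys, ckid) - 1
    if pos < 0:
        raise KeyError(f"block number {ckid} precedes every index key")
    return ctids[pos]
```

With keys `[1, 6, 10, 16]` and identifiers `[1, 2, 3, 4]`, block number 7 resolves to container 2. The search only locates the container whose range covers the number; whether that block is still live would have to be checked inside the container.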
Preferably, the storage container management module 720 is specifically configured to perform the following operations:
a. Obtain the current working storage container and assign the m storage blocks to it one by one in ascending order of block number; determine whether the current working storage container was a free storage container before it accommodated the storage block with the smallest block number among the m storage blocks; if it was, notify the recording module 730 to add an index to the correspondence between storage blocks and storage containers, where the key of the added index is the smallest block number among the m storage blocks and its value is the identifier of the current working storage container;
b. When the current working storage container becomes full, obtain an updated working storage container, which is a free storage container, and assign the remaining storage blocks among the m storage blocks to it one by one in ascending order of block number; notify the recording module 730 to add a further index to the correspondence between storage blocks and storage containers, where the key of the further index is the smallest block number among the remaining storage blocks and its value is the identifier of the updated working storage container;
c. When the updated working storage container becomes full, perform step b again, until all m storage blocks have been assigned to the n storage containers;
The recording module 730 is specifically configured to add an index to the correspondence between storage blocks and storage containers when it receives a notification to add an index from the storage container management module 720.
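Operations a to c can be sketched as a single loop over the m block numbers. The sketch is an assumption-laden illustration: the function name `assign_blocks`, the `working` dictionary, the fixed `capacity`, and the `new_container` callable are hypothetical, and the notification to the recording module is modelled as a direct append to the index list.

```python
import itertools


def assign_blocks(block_ids, working, capacity, ck2c, new_container):
    """Assign block numbers to containers in ascending order.

    block_ids     -- the m block numbers allocated for this save request
    working       -- dict with 'ctid' and 'blocks' for the working container
    capacity      -- blocks per container (assumed to be fixed)
    ck2c          -- list of (smallest block number, container ID) indexes
    new_container -- callable that returns the ID of a free container
    Returns the IDs of the n containers that received blocks.
    """
    used = []
    for ckid in sorted(block_ids):
        if len(working["blocks"]) == capacity:
            # Steps b and c: the working container is full, take a free one.
            working["ctid"] = new_container()
            working["blocks"] = []
        if not working["blocks"]:
            # The container was free: add an index keyed by the first
            # (smallest) block number it receives.
            ck2c.append((ckid, working["ctid"]))
        working["blocks"].append(ckid)
        if working["ctid"] not in used:
            used.append(working["ctid"])
    return used
```

Appending is enough to keep the index list sorted here, because block numbers increase linearly across requests and every new key is larger than all existing keys.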
Preferably, the block numbers of the m storage blocks allocated each time by the storage block management module 710 are configured to decrease linearly, and either the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the next data to be saved, or the maximum block number of the m storage blocks is less than the minimum block number of the storage blocks configured for the previous data to be saved;
The recording module 730 is specifically configured to record, as the representative value in each index, the largest block number of the storage blocks held by the corresponding storage container.
Preferably, the storage container management module 720 is specifically configured to perform the following operations:
a. Obtain the current working storage container and assign the m storage blocks to it one by one in descending order of block number; determine whether the current working storage container was a free storage container before it accommodated the storage block with the largest block number among the m storage blocks; if it was, notify the recording module 730 to add an index to the correspondence between storage blocks and storage containers, where the key of the added index is the largest block number among the m storage blocks and its value is the identifier of the current working storage container;
b. When the current working storage container becomes full, obtain an updated working storage container, which is a free storage container, and assign the remaining storage blocks among the m storage blocks to it one by one in descending order of block number; notify the recording module 730 to add a further index to the correspondence between storage blocks and storage containers, where the key of the further index is the largest block number among the remaining storage blocks and its value is the identifier of the updated working storage container;
c. When the updated working storage container becomes full, perform step b again, until all m storage blocks have been assigned to the n storage containers;
The recording module 730 is specifically configured to add an index to the correspondence between storage blocks and storage containers when it receives a notification to add an index from the storage container management module 720.
Optionally, the storage manager 700 further includes a defragmentation module (not shown), which is configured to:
receive a defragmentation instruction and determine the storage containers to be collated, where the storage containers to be collated are the storage containers indicated by the correspondence between storage blocks and storage containers;
scan the storage containers to be collated and obtain the non-garbage storage blocks contained in each of them;
assign new storage containers to the non-garbage storage blocks and notify the recording module to update the correspondence between storage blocks and storage containers, where the block numbers of the non-garbage storage blocks held by each new storage container increase linearly or decrease linearly, and the range of block numbers held by each new storage container does not intersect the range of block numbers held by any other new storage container.
Optionally, the storage manager 700 further includes a defragmentation module (not shown), which is configured to:
receive a defragmentation instruction and determine the storage containers to be collated, where the storage containers to be collated are those storage containers, among all storage containers indicated by the correspondence between storage blocks and storage containers, that contain garbage storage blocks;
scan the storage containers to be collated and obtain the non-garbage storage blocks contained in logically adjacent storage containers to be collated, where logically adjacent means that the keys of their indexes in the correspondence between storage blocks and storage containers are adjacent in value;
assign new storage containers to the non-garbage storage blocks and notify the recording module to update the correspondence between storage blocks and storage containers, where the block numbers of the non-garbage storage blocks held by each new storage container increase linearly or decrease linearly, and the range of block numbers held by each new storage container does not intersect the range of block numbers held by any other new storage container.
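The grouping of logically adjacent containers to be collated can be sketched as a single pass over the sorted indexes. This is a minimal sketch under assumptions: the function name `group_adjacent` and the `has_garbage` predicate are hypothetical, and the index table is modelled as a list of `(key, container ID)` pairs sorted by key.

```python
def group_adjacent(ck2c, has_garbage):
    """Split the containers to be collated into runs of consecutive indexes.

    ck2c        -- list of (key, container ID) pairs sorted by key
    has_garbage -- callable: container ID -> True if it holds garbage blocks
    """
    groups, current = [], []
    for _, ctid in ck2c:
        if has_garbage(ctid):
            current.append(ctid)
        elif current:
            # A container without garbage interrupts the run.
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups
```

For the indexes of CT1 to CT4, if only CT1, CT3, and CT4 hold garbage, the result is `[[1], [3, 4]]`: CT2 sits between CT1 and CT3 in key order, so they fall into separate groups and are compacted separately.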
FIG. 8 is a schematic diagram of the logical structure of a computer 800 according to an embodiment of the present invention. The computer 800 may include:
a processor 801, a memory 802, a system bus 803, and a communication interface 804. The processor 801, the memory 802, and the communication interface 804 are connected through the system bus 803 and communicate with one another over it.
The processor 801 may be a single-core or multi-core central processing unit, an application-specific integrated circuit, or one or more integrated circuits configured to implement the embodiments of the present invention.
The memory 802 may be high-speed RAM, or it may be non-volatile memory, for example at least one disk memory.
The memory 802 is configured to store computer-executable instructions (not shown). Specifically, the computer-executable instructions may include program code 805.
When the computer runs, the processor 801 executes the computer-executable instructions and can perform the method flow described with reference to any one of FIG. 3, FIG. 5, or FIG. 6A.
A person of ordinary skill in the art will understand that aspects of the present invention, or possible implementations of those aspects, may be embodied as a system, a method, or a computer program product. Accordingly, aspects of the present invention, or possible implementations of those aspects, may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, and so on), or an embodiment combining software and hardware, all of which are referred to herein as a "circuit", "module", or "system". In addition, aspects of the present invention, or possible implementations of those aspects, may take the form of a computer program product, meaning computer-readable program code stored in a computer-readable medium.
The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium includes, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of these, such as random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, and portable compact-disc read-only memory (CD-ROM).
A processor in a computer reads the computer-readable program code stored in the computer-readable medium, so that the processor can perform the functions and actions specified in each step, or combination of steps, of the flowcharts, and produce means for implementing the functions and actions specified in each block, or combination of blocks, of the block diagrams.
The computer-readable program code may execute entirely on the user's computer, partly on the user's computer, as a standalone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Note also that in some alternative implementations, the functions noted in the steps of the flowcharts or the blocks of the block diagrams may occur out of the order shown in the figures. For example, depending on the functions involved, two steps or two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order.
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementations should not be considered to go beyond the scope of the present invention.
For purposes of explanation, the foregoing description has been given with reference to specific embodiments. The illustrative discussion above is not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in light of the above teachings. The embodiments were chosen and described to best explain the principles of the invention and its practical applications, so that others skilled in the art can make the best use of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (32)

  1. A method for managing stored data, wherein the method comprises:
    each time a data save request is received, allocating m storage blocks for the data to be saved this time, wherein each storage block represents a segment of virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1;
    assigning n storage containers to the m storage blocks, wherein each storage container represents a segment of physical storage space on a storage device, and n is a natural number greater than or equal to 1;
    updating a correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, wherein the correspondence between storage blocks and storage containers records which storage containers hold the storage blocks that have already been allocated; and
    recording the block numbers of the m storage blocks in the metadata of the file that contains the data to be saved this time, wherein the block numbers of the m storage blocks serve as the virtual address of the data to be saved this time.
  2. The method according to claim 1, wherein each storage container is configured with a unique identifier, and the identifier of each storage container points to the physical address corresponding to that storage container.
  3. The method according to claim 2, wherein the correspondence between storage blocks and storage containers comprises multiple indexes, each index expresses where all the storage blocks assigned to one storage container point, the key of each index is a representative value of the block numbers of all the storage blocks held by that storage container, and the value of each index is the identifier of that storage container.
  4. The method according to claim 3, wherein the block numbers of the m storage blocks allocated each time are configured to increase linearly, and either the minimum block number configured for the m storage blocks is greater than the maximum block number of the storage blocks configured for the previous data to be saved, or the maximum block number configured for the m storage blocks is less than the minimum block number of the storage blocks configured for the next data to be saved;
    the representative value in each index is the smallest block number of the storage blocks held by the corresponding storage container.
  5. The method according to claim 4, wherein
    specifying n storage containers for the m storage blocks, and updating the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, comprises:
    a. obtaining the currently working storage container, and assigning the m storage blocks one by one to the currently working storage container in order from the smallest block number to the largest block number;
    b. determining whether the currently working storage container was a free storage container before it accommodated the storage block with the smallest block number among the m storage blocks, and, if the currently working storage container was a free storage container, adding an index to the correspondence between storage blocks and storage containers, wherein the key of the added index is the smallest block number among the m storage blocks, and the value of the added index is the identifier of the currently working storage container;
    c. when the currently working storage container is a full storage container, obtaining an updated working storage container, and assigning the remaining storage blocks among the m storage blocks one by one to the updated working storage container in order from the smallest block number to the largest block number, wherein the updated working storage container is a free storage container;
    d. adding a further index to the correspondence between storage blocks and storage containers, wherein the key of the further index is the smallest block number of the remaining storage blocks among the m storage blocks, and the value of the further index is the identifier of the updated working storage container;
    e. when the updated working storage container is a full storage container, returning to step c, until the m storage blocks have all been assigned to the n storage containers.
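Steps a to e of claim 5 can be sketched as follows. This is only an illustration under stated assumptions, not the claimed implementation: the names `Container` and `Allocator`, the fixed per-container block capacity, and the use of a plain dictionary for the index are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    """Hypothetical storage container with a fixed block capacity."""
    cid: int
    capacity: int
    blocks: list = field(default_factory=list)

    def is_free(self):
        return not self.blocks

    def is_full(self):
        return len(self.blocks) >= self.capacity


class Allocator:
    """Assigns ascending block numbers to containers and maintains the index."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.next_cid = 0
        self.index = {}  # key: smallest block number, value: container id
        self.current = self._new_container()

    def _new_container(self):
        container = Container(self.next_cid, self.capacity)
        self.next_cid += 1
        return container

    def assign(self, block_numbers):
        for b in sorted(block_numbers):      # step a: small to large
            if self.current.is_full():       # steps c and e: switch to a free container
                self.current = self._new_container()
            if self.current.is_free():       # steps b and d: first block opens an index entry
                self.index[b] = self.current.cid
            self.current.blocks.append(b)


alloc = Allocator(capacity=4)
alloc.assign([0, 1, 2, 3, 4, 5])   # first save request, m = 6
alloc.assign([6, 7, 8])            # second save request, m = 3
print(alloc.index)                 # {0: 0, 4: 1, 8: 2}
```

Only three index entries are kept for nine blocks, because each entry covers every block in one container.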
  6. The method according to claim 3, wherein the block numbers of the m storage blocks allocated each time are configured to decrease linearly, and either the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the next data to be saved, or the maximum block number of the m storage blocks is less than the minimum block number of the storage blocks configured for the previous data to be saved;
    the representative value in each index is the maximum block number of the storage blocks accommodated by the same storage container.
  7. The method according to claim 6, wherein
    specifying n storage containers for the m storage blocks, and updating the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, comprises:
    a. obtaining the currently working storage container, and assigning the m storage blocks one by one to the currently working storage container in order from the largest block number to the smallest block number;
    b. determining whether the currently working storage container was a free storage container before it accommodated the storage block with the largest block number among the m storage blocks, and, if the currently working storage container was a free storage container, adding an index to the correspondence between storage blocks and storage containers, wherein the key of the added index is the largest block number among the m storage blocks, and the value of the added index is the identifier of the currently working storage container;
    c. when the currently working storage container is a full storage container, obtaining an updated working storage container, and assigning the remaining storage blocks among the m storage blocks one by one to the updated working storage container in order from the largest block number to the smallest block number, wherein the updated working storage container is a free storage container;
    d. adding a further index to the correspondence between storage blocks and storage containers, wherein the key of the further index is the largest block number of the remaining storage blocks among the m storage blocks, and the value of the further index is the identifier of the updated working storage container;
    e. when the updated working storage container is a full storage container, returning to step c, until the m storage blocks have all been assigned to the n storage containers.
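The descending variant of claim 7 differs only in traversal order and in which block number becomes the index key. A compact sketch, assuming a hypothetical `state` dictionary that carries the working container between save requests and a fixed capacity of three blocks per container:

```python
def assign_descending(blocks, state):
    """Place blocks from large to small block number; index key = largest block number."""
    for b in sorted(blocks, reverse=True):              # step a: large to small
        if len(state["current"]) >= state["capacity"]:  # steps c and e: working container is full
            state["cid"] += 1
            state["current"] = []
        if not state["current"]:                        # steps b and d: free container gets an index entry
            state["index"][b] = state["cid"]
        state["current"].append(b)


state = {"capacity": 3, "cid": 0, "current": [], "index": {}}
assign_descending([100, 99, 98, 97], state)   # earlier request holds the larger numbers
assign_descending([96, 95, 94], state)        # later request receives smaller numbers
print(state["index"])                         # {100: 0, 97: 1, 94: 2}
```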
  8. The method according to any one of claims 4 to 7, further comprising:
    receiving a defragmentation instruction and determining the storage containers to be defragmented, wherein the storage containers to be defragmented are the storage containers indicated by the correspondence between storage blocks and storage containers;
    scanning the storage containers to be defragmented, and obtaining the non-garbage storage blocks contained in each storage container to be defragmented;
    assigning new storage containers to the non-garbage storage blocks, and updating the correspondence between storage blocks and storage containers, wherein the block numbers of the non-garbage storage blocks accommodated by each new storage container increase linearly or decrease linearly, and the range of block numbers of the storage blocks accommodated by each new storage container does not intersect the range of block numbers of the storage blocks accommodated by any other new storage container.
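The defragmentation of claim 8 can be illustrated with a short sketch for the ascending scheme. The function name `defragment`, the dictionary representation of containers, and the explicit `garbage` set are assumptions made for the example; the claim does not say how garbage blocks are identified.

```python
def defragment(containers, garbage, capacity, first_new_cid):
    """Repack live blocks into new containers with disjoint, ascending block ranges.

    containers: dict mapping container id -> list of block numbers
    garbage:    set of block numbers that no longer hold valid data
    """
    # Scan every container and keep only the non-garbage blocks, in block-number order.
    live = sorted(b for blocks in containers.values() for b in blocks if b not in garbage)
    new_containers, new_index = {}, {}
    for i in range(0, len(live), capacity):
        cid = first_new_cid + i // capacity
        chunk = live[i:i + capacity]
        new_containers[cid] = chunk
        new_index[chunk[0]] = cid   # ascending scheme: key is the smallest block number
    return new_containers, new_index


old = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7], 2: [8, 9, 10, 11]}
garbage = {1, 2, 5, 6, 7, 9}
new_containers, new_index = defragment(old, garbage, capacity=4, first_new_cid=3)
print(new_containers)   # {3: [0, 3, 4, 8], 4: [10, 11]}
print(new_index)        # {0: 3, 10: 4}
```

Because the live blocks are sorted before being cut into chunks, the block-number ranges of the new containers cannot overlap, which is the property the claim requires.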
  9. The method according to any one of claims 4 to 7, further comprising:
    receiving a defragmentation instruction and determining the storage containers to be defragmented, wherein the storage containers to be defragmented are those storage containers, among all storage containers indicated by the correspondence between storage blocks and storage containers, that contain garbage storage blocks;
    determining at least two groups of storage containers to be defragmented whose virtual addresses are contiguous, wherein each group contains k storage containers to be defragmented, the k storage containers correspond to k logically consecutive indexes in the correspondence between storage blocks and storage containers, and k is a natural number greater than or equal to 2;
    assigning, separately for each group, new storage containers to the non-garbage storage blocks contained in the storage containers of that group, and updating the correspondence between storage blocks and storage containers, wherein the block numbers of the non-garbage storage blocks accommodated by each new storage container increase linearly or decrease linearly, and the range of block numbers of the storage blocks accommodated by each new storage container does not intersect the range of block numbers of the storage blocks accommodated by any other new storage container.
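One way to find the groups described in claim 9 is to walk the index in key order and collect runs of consecutive entries whose containers hold garbage. The sketch below is an assumption-laden illustration: `contiguous_groups` and the `dirty` set are hypothetical names, and it returns maximal runs of length at least 2 rather than groups of a fixed size k.

```python
def contiguous_groups(index, dirty):
    """Group containers that are adjacent in the index and contain garbage.

    index: dict mapping representative block number -> container id
    dirty: set of container ids that contain garbage blocks
    """
    groups, run = [], []
    for key in sorted(index):            # logical (virtual-address) order
        cid = index[key]
        if cid in dirty:
            run.append(cid)
        else:
            if len(run) >= 2:            # k must be at least 2
                groups.append(run)
            run = []
    if len(run) >= 2:
        groups.append(run)
    return groups


index = {0: 10, 4: 11, 8: 12, 12: 13, 16: 14, 20: 15}
dirty = {10, 11, 13, 14, 15}
groups = contiguous_groups(index, dirty)
print(groups)   # [[10, 11], [13, 14, 15]]
```

Each group can then be repacked on its own, so live blocks are only merged with neighbours in virtual address order.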
  10. The method according to any one of claims 2 to 9, further comprising:
    after receiving a data read request, querying, according to the information about the data to be read that is carried in the data read request, the file metadata of the file in which the data to be read is located, and obtaining the virtual address of the data to be read, wherein the virtual address of the data to be read comprises the block numbers of p storage blocks, and p is a natural number greater than or equal to 1;
    querying the correspondence between storage blocks and storage containers according to the block numbers of the p storage blocks, and determining the q storage containers that accommodate the p storage blocks, wherein q is a natural number greater than or equal to 1;
    reading the metadata of the q storage containers and determining the physical address information of the p storage blocks, wherein the metadata of each storage container describes all the storage blocks in that storage container.
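The read path of claim 10 amounts to a range lookup in the index followed by a lookup in the container metadata. The following sketch assumes the ascending scheme of claim 4 (key = smallest block number in the container); the function `locate`, the layout of `container_meta`, and the byte offsets are hypothetical.

```python
import bisect


def locate(block_numbers, index, container_meta):
    """Resolve block numbers to (container id, offset inside the container)."""
    keys = sorted(index)
    result = {}
    for b in block_numbers:
        pos = bisect.bisect_right(keys, b) - 1   # largest key that is <= b
        if pos < 0:
            raise KeyError(b)
        cid = index[keys[pos]]
        result[b] = (cid, container_meta[cid][b])
    return result


index = {0: 0, 4: 1, 8: 2}
container_meta = {                       # per-container metadata: block number -> offset
    0: {0: 0, 1: 4096, 2: 8192, 3: 12288},
    1: {4: 0, 5: 4096, 6: 8192, 7: 12288},
    2: {8: 0},
}
found = locate([3, 4, 8], index, container_meta)
print(found)   # {3: (0, 12288), 4: (1, 0), 8: (2, 0)}
```

For the descending scheme of claim 6, where the key is the largest block number, the lookup would instead take the smallest key that is greater than or equal to the block number.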
  11. A storage manager, applied in a storage system, wherein the storage system comprises a storage device and the storage manager, the storage device comprises a storage medium for providing a physical address space, and the storage manager is configured to receive a data save request triggered by an application and forward the data to be saved to the storage device for saving; the storage manager comprises:
    a storage block management module, configured to allocate, each time a data save request is received, m storage blocks for the data to be saved this time, wherein each storage block represents a segment of virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1;
    a storage container management module, configured to specify n storage containers for the m storage blocks, wherein each storage container represents a segment of physical storage space on the storage device, and n is a natural number greater than or equal to 1;
    a recording module, configured to update the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, wherein the correspondence between storage blocks and storage containers records the correspondence between the storage blocks that have been allocated and the storage containers that accommodate them; and
    the recording module is further configured to record the block numbers of the m storage blocks in the metadata of the file in which the data to be saved this time is located, wherein the block numbers of the m storage blocks serve as the virtual address of the data to be saved this time.
  12. The storage manager according to claim 11, wherein each storage container is configured with a unique identifier, and the identifier of each storage container indicates the physical address corresponding to that storage container.
  13. The storage manager according to claim 12, wherein
    the correspondence between storage blocks and storage containers recorded by the recording module comprises a plurality of indexes, each index indicates the location of all storage blocks assigned to the same storage container, the key of each index is a representative value of the block numbers of the storage blocks accommodated by the same storage container, and the value of each index is the identifier of the same storage container.
  14. The storage manager according to claim 13, wherein the block numbers of the m storage blocks allocated each time by the storage block management module are configured to increase linearly, and either the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the previous data to be saved, or the maximum block number of the m storage blocks is less than the minimum block number of the storage blocks configured for the next data to be saved;
    the recording module is specifically configured to record the representative value in each index as the minimum block number of the storage blocks accommodated by the same storage container.
  15. The storage manager according to claim 14, wherein the storage container management module is specifically configured to perform the following operations:
    a. obtaining the currently working storage container, assigning the m storage blocks one by one to the currently working storage container in order from the smallest block number to the largest block number, and determining whether the currently working storage container was a free storage container before it accommodated the storage block with the smallest block number among the m storage blocks; if the currently working storage container was a free storage container, notifying the recording module to add an index to the correspondence between storage blocks and storage containers, wherein the key of the added index is the smallest block number among the m storage blocks, and the value of the added index is the identifier of the currently working storage container;
    b. when the currently working storage container is a full storage container, obtaining an updated working storage container, and assigning the remaining storage blocks among the m storage blocks one by one to the updated working storage container in order from the smallest block number to the largest block number, wherein the updated working storage container is a free storage container; and notifying the recording module again to add a further index to the correspondence between storage blocks and storage containers, wherein the key of the further index is the smallest block number of the remaining storage blocks among the m storage blocks, and the value of the further index is the identifier of the updated working storage container;
    c. when the updated working storage container is a full storage container, performing step b again, until the m storage blocks have all been assigned to the n storage containers;
    the recording module is specifically configured to add an index to the correspondence between storage blocks and storage containers upon receiving, from the storage container management module, a notification to add an index.
  16. The storage manager according to claim 13, wherein the block numbers of the m storage blocks allocated each time by the storage block management module are configured to decrease linearly, and either the minimum block number of the m storage blocks is greater than the maximum block number of the storage blocks configured for the next data to be saved, or the maximum block number of the m storage blocks is less than the minimum block number of the storage blocks configured for the previous data to be saved;
    the recording module is specifically configured to record the representative value in each index as the maximum block number of the storage blocks accommodated by the same storage container.
  17. The storage manager according to claim 16, wherein the storage container management module is specifically configured to perform the following operations:
    a. obtaining the currently working storage container, assigning the m storage blocks one by one to the currently working storage container in order from the largest block number to the smallest block number, and determining whether the currently working storage container was a free storage container before it accommodated the storage block with the largest block number among the m storage blocks; if the currently working storage container was a free storage container, notifying the recording module to add an index to the correspondence between storage blocks and storage containers, wherein the key of the added index is the largest block number among the m storage blocks, and the value of the added index is the identifier of the currently working storage container;
    b. when the currently working storage container is a full storage container, obtaining an updated working storage container, and assigning the remaining storage blocks among the m storage blocks one by one to the updated working storage container in order from the largest block number to the smallest block number, wherein the updated working storage container is a free storage container; and notifying the recording module again to add a further index to the correspondence between storage blocks and storage containers, wherein the key of the further index is the largest block number of the remaining storage blocks among the m storage blocks, and the value of the further index is the identifier of the updated working storage container;
    c. when the updated working storage container is a full storage container, performing step b again, until the m storage blocks have all been assigned to the n storage containers;
    the recording module is specifically configured to add an index to the correspondence between storage blocks and storage containers upon receiving, from the storage container management module, a notification to add an index.
  18. The storage manager according to any one of claims 14 to 17, wherein the storage manager further comprises a defragmentation module, and the defragmentation module is configured to:
    receive a defragmentation instruction and determine the storage containers to be defragmented, wherein the storage containers to be defragmented are the storage containers indicated by the correspondence between storage blocks and storage containers;
    scan the storage containers to be defragmented, and obtain the non-garbage storage blocks contained in each storage container to be defragmented;
    assign new storage containers to the non-garbage storage blocks, and notify the recording module to update the correspondence between storage blocks and storage containers, wherein the block numbers of the non-garbage storage blocks accommodated by each new storage container increase linearly or decrease linearly, and the range of block numbers of the storage blocks accommodated by each new storage container does not intersect the range of block numbers of the storage blocks accommodated by any other new storage container.
  19. The storage manager according to any one of claims 14 to 17, wherein the storage manager further comprises a defragmentation module, and the defragmentation module is configured to:
    receive a defragmentation instruction and determine the storage containers to be defragmented, wherein the storage containers to be defragmented are those storage containers, among all storage containers indicated by the correspondence between storage blocks and storage containers, that contain garbage storage blocks;
    determine at least two groups of storage containers to be defragmented whose virtual addresses are contiguous, wherein each group contains k storage containers to be defragmented, the k storage containers correspond to k logically consecutive indexes in the correspondence between storage blocks and storage containers, and k is a natural number greater than or equal to 2;
    assign, separately for each group, new storage containers to the non-garbage storage blocks contained in the storage containers of that group, and notify the recording module to update the correspondence between storage blocks and storage containers, wherein the block numbers of the non-garbage storage blocks accommodated by each new storage container increase linearly or decrease linearly, and the range of block numbers of the storage blocks accommodated by each new storage container does not intersect the range of block numbers of the storage blocks accommodated by any other new storage container.
  20. A storage system, wherein the storage system comprises a storage device and a storage manager;
    the storage device comprises a storage medium configured to provide a physical address space for saving data;
    the storage manager is configured to allocate, each time a data save request is received, m storage blocks for the data to be saved this time, wherein each storage block represents a segment of virtual address space, each storage block is configured with a unique block number, and m is a natural number greater than or equal to 1; and to specify n storage containers for the m storage blocks, wherein each storage container represents a segment of physical storage space on the storage device, and n is a natural number greater than or equal to 1;
    to update the correspondence between storage blocks and storage containers according to the correspondence between the m storage blocks and the n storage containers, wherein the correspondence between storage blocks and storage containers records the correspondence between the storage blocks that have been allocated and the storage containers that accommodate them; and
    to record the block numbers of the m storage blocks in the metadata of the file in which the data to be saved this time is located, wherein the block numbers of the m storage blocks serve as the virtual address of the data to be saved this time.
  21. The storage system according to claim 20, wherein each storage container is configured with a unique identifier, and the identifier of each storage container indicates the physical address corresponding to that storage container.
  22. The storage system according to claim 21, wherein the storage manager being configured to record the correspondence between storage blocks and storage containers specifically comprises:
    the correspondence between storage blocks and storage containers recorded by the storage manager comprises a plurality of indexes, each index indicates the location of all storage blocks assigned to the same storage container, the key of each index is a representative value of the block numbers of the storage blocks accommodated by the same storage container, and the value of each index is the identifier of the same storage container.
  23. The storage system according to claim 22, wherein the storage manager is specifically configured to configure the block numbers of the m storage blocks allocated each time to increase linearly, and either to configure the minimum block number of the m storage blocks to be greater than the maximum block number of the storage blocks configured for the previous data to be saved, or to configure the maximum block number of the m storage blocks to be less than the minimum block number of the storage blocks configured for the next data to be saved;
    the storage manager is specifically configured to record the representative value in each index as the minimum block number of the storage blocks accommodated by the same storage container.
  24. The storage system according to claim 23, wherein the storage manager is specifically configured to perform the following operations:
    a. obtaining the currently working storage container, and assigning the m storage blocks one by one to the currently working storage container in order from the smallest block number to the largest block number;
    b. determining whether the currently working storage container was a free storage container before it accommodated the storage block with the smallest block number among the m storage blocks, and, if the currently working storage container was a free storage container, adding an index to the correspondence between storage blocks and storage containers, wherein the key of the added index is the smallest block number among the m storage blocks, and the value of the added index is the identifier of the currently working storage container;
    c. when the currently working storage container is a full storage container, obtaining an updated working storage container, and assigning the remaining storage blocks among the m storage blocks one by one to the updated working storage container in order from the smallest block number to the largest block number, wherein the updated working storage container is a free storage container;
    d. adding a further index to the correspondence between storage blocks and storage containers, wherein the key of the further index is the smallest block number of the remaining storage blocks among the m storage blocks, and the value of the further index is the identifier of the updated working storage container;
    e. when the updated working storage container is a full storage container, returning to step c, until the m storage blocks have all been assigned to the n storage containers.
  25. The storage system according to claim 22, wherein the storage manager is specifically configured to configure the block numbers of the m storage blocks allocated each time to decrease linearly, and either to configure the minimum block number of the m storage blocks to be greater than the maximum block number of the storage blocks configured for the next data to be saved, or to configure the maximum block number of the m storage blocks to be less than the minimum block number of the storage blocks configured for the previous data to be saved;
    the storage manager is specifically configured to record the representative value in each index as the maximum block number of the storage blocks accommodated by the same storage container.
  26. The storage system according to claim 25, wherein the storage manager, in specifying n storage containers for the m storage blocks and recording the correspondence between storage blocks and storage containers, is specifically configured to perform the following operations:
    a. obtaining the currently working storage container, and assigning the m storage blocks one by one to the currently working storage container in order from the largest block number to the smallest block number;
    b. determining whether the currently working storage container was a free storage container before it accommodated the storage block with the largest block number among the m storage blocks, and, if the currently working storage container was a free storage container, adding an index to the correspondence between storage blocks and storage containers, wherein the key of the added index is the largest block number among the m storage blocks, and the value of the added index is the identifier of the currently working storage container;
    c. when the currently working storage container is a full storage container, obtaining an updated working storage container, and assigning the remaining storage blocks among the m storage blocks one by one to the updated working storage container in order from the largest block number to the smallest block number, wherein the updated working storage container is a free storage container;
    d. adding a further index to the correspondence between storage blocks and storage containers, wherein the key of the further index is the largest block number of the remaining storage blocks among the m storage blocks, and the value of the further index is the identifier of the updated working storage container;
    e. when the updated working storage container is a full storage container, returning to step c, until the m storage blocks have all been assigned to the n storage containers.
  27. The storage system according to any one of claims 23 to 26, wherein the storage manager is further configured to:
    receive a defragmentation instruction and determine the storage containers to be defragmented, wherein the storage containers to be defragmented are the storage containers indicated by the correspondence between storage blocks and storage containers;
    scan the storage containers to be defragmented, and obtain the non-garbage storage blocks contained in each storage container to be defragmented;
    reassign new storage containers to the non-garbage storage blocks, and update the correspondence between storage blocks and storage containers, wherein the block numbers of the non-garbage storage blocks accommodated by each new storage container are linearly increasing or linearly decreasing, and the range of block numbers of the storage blocks accommodated by each new storage container does not intersect the range of block numbers of the storage blocks accommodated by any other new storage container.
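The defragmentation flow of claim 27 can be sketched as follows. This is an illustration under assumptions, not the claimed implementation: `defragment`, the `garbage` set, the fixed `capacity`, and the `N1`, `N2` naming of new containers are all hypothetical. Sorting the surviving blocks once and cutting the sorted list into chunks is one simple way to obtain monotonic, non-overlapping block-number ranges.

```python
def defragment(containers, garbage, capacity):
    """Repack live blocks into new containers with disjoint, descending ranges.

    containers: dict of container id -> list of block numbers it holds
    garbage: set of block numbers that are no longer referenced
    Returns (new_index, new_containers).
    """
    # Scan every container and keep only the non-garbage blocks.
    live = sorted(
        (b for blocks in containers.values() for b in blocks if b not in garbage),
        reverse=True,
    )
    new_containers = {}
    new_index = {}
    for i in range(0, len(live), capacity):
        chunk = live[i:i + capacity]
        cid = f"N{i // capacity + 1}"  # hypothetical id for a new container
        new_containers[cid] = chunk
        new_index[chunk[0]] = cid  # key: largest block number in the container
    return new_index, new_containers


old = {"C1": [9, 8, 7], "C2": [6, 5, 4], "C3": [3, 2, 1]}
new_index, new_containers = defragment(old, garbage={8, 5, 4, 2}, capacity=3)
```

In this example the five surviving blocks are repacked into two containers: N1 holds 9, 7, 6 and N2 holds 3, 1, so the ranges of the two new containers do not overlap.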
  28. The storage system according to any one of claims 23 to 26, wherein the storage manager is further configured to:
    receive a defragmentation instruction and determine the storage containers to be defragmented, wherein the storage containers to be defragmented are those storage containers, among all the storage containers indicated by the correspondence between storage blocks and storage containers, that contain garbage storage blocks;
    determine at least two groups of storage containers to be defragmented whose virtual addresses are contiguous, wherein each group contains k storage containers to be defragmented, the k storage containers to be defragmented correspond to k logically consecutive indexes in the correspondence between storage blocks and storage containers, and k is a natural number greater than or equal to 2;
    for each group of storage containers to be defragmented, reassign new storage containers to the non-garbage storage blocks contained in the group, and update the correspondence between storage blocks and storage containers, wherein the block numbers of the non-garbage storage blocks accommodated by each new storage container are linearly increasing or linearly decreasing, and the range of block numbers of the storage blocks accommodated by each new storage container does not intersect the range of block numbers of the storage blocks accommodated by any other new storage container.
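The grouping step of claim 28 can be sketched in Python. The sketch rests on assumptions that are not taken from the claim: the index is modeled as a dict keyed by block number, "logically consecutive" is read as adjacent entries once the keys are sorted, and each group is a maximal run of at least two adjacent containers with garbage. The name `consecutive_groups` is hypothetical.

```python
def consecutive_groups(index, has_garbage):
    """Group containers with garbage that are adjacent in the key-sorted index.

    index: dict of index key (block number) -> container id
    has_garbage: set of container ids that contain garbage blocks
    Returns a list of groups, each a list of at least two container ids.
    """
    groups, run = [], []
    for key in sorted(index):
        cid = index[key]
        if cid in has_garbage:
            run.append(cid)
            continue
        # A container without garbage ends the current run.
        if len(run) >= 2:
            groups.append(run)
        run = []
    if len(run) >= 2:
        groups.append(run)
    return groups


index = {10: "C1", 20: "C2", 30: "C3", 40: "C4", 50: "C5", 60: "C6"}
groups = consecutive_groups(index, has_garbage={"C1", "C2", "C4", "C5", "C6"})
```

Here C3 contains no garbage, so it splits the containers into two groups, C1 and C2 on one side and C4, C5 and C6 on the other. Each group could then be repacked on its own, which keeps the rewritten block-number ranges local to that group.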
  29. The storage system according to any one of claims 22 to 28, wherein the storage manager is further configured to:
    after receiving a data read request, query, according to the information about the data to be read that is carried in the data read request, the file metadata of the file in which the data to be read is located, and obtain the virtual address of the data to be read, wherein the virtual address of the data to be read includes the block numbers of p storage blocks, and p is a natural number greater than or equal to 1; query the correspondence between storage blocks and storage containers according to the block numbers of the p storage blocks, and determine the q storage containers that accommodate the p storage blocks, wherein q is a natural number greater than or equal to 1; and read the metadata of the q storage containers to determine the physical address information of the p storage blocks, wherein the metadata of each storage container describes the information of all storage blocks in that storage container.
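The index lookup in the read path can be sketched with a sorted-key search. The sketch assumes, as an illustration only, that each index key is the largest block number held by a container and that containers cover disjoint block-number ranges, so the container for a block is the one with the smallest key not less than the block number. The function name `find_containers` is hypothetical, and the final step of reading container metadata to obtain physical addresses is not modeled.

```python
import bisect


def find_containers(index, block_numbers):
    """Map each requested block number to the container that should hold it.

    index: dict of largest block number in a container -> container id
    Returns a dict of block number -> container id.
    """
    keys = sorted(index)
    result = {}
    for block in block_numbers:
        # First key that is greater than or equal to the block number.
        pos = bisect.bisect_left(keys, block)
        if pos == len(keys):
            raise KeyError(f"block {block} is not covered by the index")
        result[block] = index[keys[pos]]
    return result


index = {7: "C1", 4: "C2", 1: "C3"}
located = find_containers(index, [6, 5, 3])
```

For p = 3 requested blocks the lookup touches q = 2 containers: blocks 6 and 5 resolve to C1 and block 3 resolves to C2. Because the index stores one entry per container rather than one per block, it stays small enough to search quickly.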
  30. A storage manager, comprising an interface for interacting with a storage device, a processor, and a memory, wherein the processor is connected to the processor via a bus, and the processor exchanges information with the storage device through the interface;
    the memory is configured to store computer-executable instructions, and when the storage manager runs, the processor executes the computer-executable instructions stored in the memory, so that the storage manager performs the storage data management method according to any one of claims 1 to 10.
  31. A computer, comprising a processor, a memory, a bus, and a communication interface;
    the memory is configured to store computer-executable instructions, the processor is connected to the memory via the bus, and when the computer runs, the processor executes the computer-executable instructions stored in the memory, so that the computer performs the storage data management method according to any one of claims 1 to 10.
  32. A computer-readable medium, comprising computer-executable instructions, wherein when a processor of a computer executes the computer-executable instructions, the computer performs the storage data management method according to any one of claims 1 to 10.
PCT/CN2014/096073 2014-12-31 2014-12-31 Method for managing storage data, storage manager and storage system WO2016106757A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201480016987.4A CN106462491B (en) 2014-12-31 2014-12-31 Management method of stored data, storage manager and storage system
PCT/CN2014/096073 WO2016106757A1 (en) 2014-12-31 2014-12-31 Method for managing storage data, storage manager and storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2014/096073 WO2016106757A1 (en) 2014-12-31 2014-12-31 Method for managing storage data, storage manager and storage system

Publications (1)

Publication Number Publication Date
WO2016106757A1 true WO2016106757A1 (en) 2016-07-07

Family

ID=56284026

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/096073 WO2016106757A1 (en) 2014-12-31 2014-12-31 Method for managing storage data, storage manager and storage system

Country Status (2)

Country Link
CN (1) CN106462491B (en)
WO (1) WO2016106757A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109254733A (en) * 2018-09-04 2019-01-22 北京百度网讯科技有限公司 Methods, devices and systems for storing data
CN110019031A (en) * 2017-08-31 2019-07-16 华为技术有限公司 File creation method and file management device
WO2021233187A1 (en) * 2020-05-18 2021-11-25 中科寒武纪科技股份有限公司 Method and device for allocating storage addresses for data in memory

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109656886B (en) * 2018-12-26 2021-11-09 百度在线网络技术(北京)有限公司 Key value pair-based file system implementation method, device, equipment and storage medium
CN113282582B (en) * 2021-05-21 2023-06-20 海南超船电子商务有限公司 Efficient storage method and system for ship position data

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040230766A1 (en) * 2003-05-12 2004-11-18 Cameron Douglas J. Thin-provisioning with snapshot technology
CN1916875A (en) * 2005-08-17 2007-02-21 联发科技股份有限公司 Memory management method and system
CN103052945A (en) * 2010-08-06 2013-04-17 阿尔卡特朗讯 A method of managing computer memory, corresponding computer program product
CN103853665A (en) * 2012-12-03 2014-06-11 华为技术有限公司 Storage space allocation method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101676882B (en) * 2008-09-16 2013-01-16 美光科技公司 Built-in mapping message of memory device
CN105493051B (en) * 2013-06-25 2019-03-08 马维尔国际贸易有限公司 Adaptive cache Memory Controller

Also Published As

Publication number Publication date
CN106462491B (en) 2020-08-14
CN106462491A (en) 2017-02-22

Similar Documents

Publication Publication Date Title
US9684462B2 (en) Method and apparatus utilizing non-uniform hash functions for placing records in non-uniform access memory
KR102042643B1 (en) Managing multiple namespaces in a non-volatile memory (nvm)
US9626286B2 (en) Hardware and firmware paths for performing memory read processes
JP6344675B2 (en) File management method, distributed storage system, and management node
JP4931810B2 (en) FAT analysis for optimized sequential cluster management
US7395384B2 (en) Method and apparatus for maintaining data on non-volatile memory systems
US8214583B2 (en) Direct file data programming and deletion in flash memories
US10481837B2 (en) Data storage device and method for operating data storage device with efficient trimming operations
CN107239526B (en) File system implementation method, defragmentation method and operation position positioning method
JP4738038B2 (en) Memory card
TWI533152B (en) Data storage apparatus and method
JP5129156B2 (en) Access device and write-once recording system
WO2016147281A1 (en) Distributed storage system and control method for distributed storage system
WO2016106757A1 (en) Method for managing storage data, storage manager and storage system
US9767120B2 (en) Multi-way checkpoints in a data storage system
KR20150106657A (en) Device and method for storing data in distributed storage system
KR20130018602A (en) Memory system including key-value store
US20150186259A1 (en) Method and apparatus for storing data in non-volatile memory
CN105912475A (en) System and method for copy on write on an SSD
US10922276B2 (en) Online file system check
CA2758235A1 (en) Device and method for storage, retrieval, relocation, insertion or removal of data in storage units
KR20150071500A (en) Method and Apparatus for Managing Data
KR101579941B1 (en) Method and apparatus for isolating input/output of virtual machines
CN103530067A (en) Data operation method and device
CN113515469A (en) Method for creating and deleting name space and storage device

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 14909563; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 14909563; Country of ref document: EP; Kind code of ref document: A1)