WO2021043026A1 - Storage space management method and device - Google Patents

Storage space management method and device

Info

Publication number
WO2021043026A1
WO2021043026A1 PCT/CN2020/111002 CN2020111002W WO2021043026A1 WO 2021043026 A1 WO2021043026 A1 WO 2021043026A1 CN 2020111002 W CN2020111002 W CN 2020111002W WO 2021043026 A1 WO2021043026 A1 WO 2021043026A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
space
storage
storage system
metadata
Prior art date
Application number
PCT/CN2020/111002
Other languages
English (en)
French (fr)
Inventor
任仁
朱芳芳
郭平静
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2021043026A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608 Saving storage space on storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0652 Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0674 Disk device
    • G06F3/0676 Magnetic disk device
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0679 Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices
    • G06F3/0688 Non-volatile semiconductor memory arrays

Definitions

  • This application relates to the field of storage technology, and in particular to a method and device for managing storage space.
  • a data deduplication technology has been proposed: if a piece of data is stored as multiple copies in the storage system, the redundant copies are deleted and only one copy of the data is kept, thereby reducing the storage space occupied by the data.
  • the implementation process of one deduplication technology is as follows: calculate the fingerprint of each piece of data, store the fingerprints in the deduplication metadata space, and determine the duplicate fingerprints among the fingerprints stored in the deduplication metadata space.
  • the data corresponding to a duplicate fingerprint is the data to be deduplicated, and batch deduplication is performed on the data to be deduplicated.
  • This application provides a storage space management method and device to provide a method for setting a deduplication metadata space.
  • a storage space management method is provided.
  • the data volume of the data stored in the storage system is first obtained, and then, according to the size of that data volume, the size of the deduplication metadata space that the storage system uses to store fingerprint record items is adjusted.
  • the fingerprint record item is used to record the fingerprint of the data.
  • because the size of the deduplication metadata space can be flexibly adjusted, the efficiency of deduplication can be improved by increasing the size of the deduplication metadata space. Moreover, because the adjustment uses the amount of data stored in the storage system as the adjustment factor, the size of the deduplication metadata space set by this method can also meet users' data storage needs without affecting the storage performance of the storage system.
  • obtaining the data volume of the data stored in the storage system can include but is not limited to the following two methods:
  • the first method is to obtain the data amount of data stored in the data storage space of the storage system, and the data storage space is used to store data.
  • the storage space of the storage system can be divided into data storage space and metadata storage space, and the data stored in the storage system is the data stored in the data storage space. Therefore, when the data volume of the data stored in the storage system needs to be obtained, the amount of data stored in the data storage space can be obtained directly, which is simple.
  • the second method is to obtain the data volume of metadata stored in the metadata storage space of the storage system, and calculate the data volume of the data stored in the storage system according to the data volume of the metadata and the preset ratio.
  • the metadata storage space is used to store metadata of data.
  • the data volume of the data stored in the storage system can be derived from the data volume of the metadata in the storage system. Because the metadata storage space is smaller than the data storage space, the search space and the time delay can both be reduced.
  • adjusting the size of the deduplication metadata space according to the data volume of the data stored in the storage system can include but is not limited to the following three methods:
  • the first adjustment method is to increase the deduplication metadata space when the amount of data stored in the storage system is not greater than the first threshold.
  • the deduplication metadata space can be increased to improve the efficiency of deduplication.
  • the second adjustment method is to reduce the deduplication metadata space when the amount of data stored in the storage system is not less than the second threshold.
  • the storage system can pre-store the correspondence between the proportion of the deduplication metadata space and the amount of data stored in the storage system. Then, after the maximum data volume that the storage system can store is obtained, the size of the deduplication metadata space is adjusted according to the maximum data volume, the data volume of the data stored in the storage system, and the preset correspondence.
  • the size of the deduplication metadata space in the current situation can be determined according to the maximum amount of data allowed by the storage system and the aforementioned preset correspondence, which can improve the accuracy of adjusting the size of the deduplication metadata space.
  • the size of the deduplication metadata space can be adjusted through the data storage space.
  • the data storage space can be reduced to increase the deduplication metadata space.
  • the data storage space can be increased to reduce the deduplication metadata space.
  • a storage space management device may be a storage node or a storage server, or may be a device in a storage node or a storage server.
  • the storage space management device includes a processor, configured to implement the method described in the first aspect.
  • the storage space management device may also include a memory for storing program instructions and data.
  • the memory is coupled with the processor, and the processor can call and execute the program instructions stored in the memory to implement any one of the methods described in the first aspect.
  • the storage space management device may further include an interface, and the interface communicates with the processor.
  • the storage space management device includes a processor and an interface, and the interface communicates with the processor; wherein, the processor is configured to:
  • the size of the deduplication metadata space of the storage system is adjusted; wherein the deduplication metadata space is used to store fingerprint record items; the fingerprint record items are used to record fingerprints of the data.
  • the processor is specifically used for:
  • the data storage space is used to store data
  • the metadata storage space is used to store metadata of the data.
  • the processor is specifically used for:
  • the deduplication metadata space is reduced.
  • the processor is specifically used for:
  • adjust the size of the deduplication metadata space according to the maximum data volume, the size of the data volume, and the preset correspondence between the proportion of the deduplication metadata space and the data volume of the data stored in the storage system.
  • the processor is specifically used for:
  • a storage space management device may be a storage node or a storage server, or may be a device in a storage node or a storage server.
  • the storage space management apparatus may include a processing unit and an acquiring unit, and these units may perform corresponding functions performed in any of the design examples of the first aspect, specifically:
  • the acquiring unit is used to acquire the data volume of the data stored in the storage system
  • the processing unit is configured to adjust the size of the deduplication metadata space of the storage system according to the size of the data volume; wherein the deduplication metadata space is used for storing fingerprint record items, and the fingerprint record items are used for recording the fingerprints of the data.
  • an embodiment of the present application provides a computer-readable storage medium that stores a computer program, and the computer program includes program instructions that, when executed by a computer, cause the computer to execute the method described in any one of the first aspect.
  • an embodiment of the present application provides a computer program product, the computer program product stores a computer program, the computer program includes program instructions, and when executed by a computer, the program instructions cause the computer to execute the method described in any one of the first aspect.
  • the present application provides a chip system.
  • the chip system includes a processor and may also include a memory for implementing the method described in the first aspect.
  • the chip system can be composed of chips, or it can include chips and other discrete devices.
  • an embodiment of the present application provides a storage system that includes a storage device and the storage space management apparatus described in the second aspect and any one of the designs of the second aspect, or the storage system includes a storage device and the storage space management apparatus described in the third aspect and any one of the designs of the third aspect.
  • FIG. 1 is a flowchart of a method for deleting duplicate data in the prior art
  • FIG. 2 is a schematic diagram of an example of fingerprint records before performing a data deduplication operation and fingerprint records after performing a data deduplication operation in the prior art
  • FIG. 3 is a flowchart of a data storage method provided by an embodiment of the application.
  • FIG. 4 is a structural diagram of an example of a storage space management device provided by an embodiment of the application.
  • FIG. 5 is a structural diagram of another example of a storage space management apparatus provided by an embodiment of the application.
  • “multiple” refers to two or more than two. In view of this, “multiple” may also be understood as “at least two” in the embodiments of the present application. “At least one” can be understood as one or more, for example, one, two or more. For example, including at least one refers to including one, two or more, and does not limit which ones are included. For example, including at least one of A, B, and C, then the included can be A, B, C, A and B, A and C, B and C, or A and B and C.
  • ordinal numbers such as “first” and “second” mentioned in the embodiments of the present application are used to distinguish multiple objects, and are not used to limit the order, timing, priority, or importance of multiple objects.
  • data deduplication technology can be divided into an online deduplication mode and a post-deduplication mode according to the time at which the deduplication operation is performed.
  • the online deduplication mode performs the deduplication operation before the data in the cache of the storage system is stored to the storage device.
  • the post-deduplication mode performs the deduplication operation after the data in the cache has been stored in the storage device.
  • the technical solution in the embodiment of the present application is an improvement for the post-deduplication mode.
  • the storage node of the storage system generates and stores a fingerprint record item corresponding to each data.
  • the storage node calculates the fingerprint of each data. After the storage node stores the data at a storage address, it generates a fingerprint record item for the data and stores the fingerprint record item in the deduplication metadata space used to store fingerprint record items. The fingerprint record item contains the correspondence between the fingerprint of the data and the storage address of the data.
  • the deduplication metadata space stores the fingerprint record items corresponding to 10 pieces of data; the 10 fingerprint record items are shown in Figure 2(a).
  • the fingerprint record item corresponding to a piece of data includes three parts, namely a serial number, a fingerprint (FP), and a token.
  • the serial number can indicate the order in which the storage node generates the fingerprint record items corresponding to the data, and the token is used to indicate the storage address of the data and other information.
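  • a fingerprint record item of this kind could be modeled as in the following minimal sketch. The SHA-256 hash, the `FingerprintRecord` class, the `make_record` helper, and the use of a plain integer address as the token are illustrative assumptions; the text does not specify a fingerprint algorithm or a token layout.

```python
import hashlib
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class FingerprintRecord:
    """One fingerprint record item: serial number, fingerprint (FP), token."""
    serial: int       # order in which the record was generated
    fingerprint: str  # fingerprint of the data (SHA-256 here, an assumption)
    token: int        # stands in for the storage address of the data


_serial = count(1)  # hypothetical generator of serial numbers


def make_record(data: bytes, storage_address: int) -> FingerprintRecord:
    """Build the record after the data has been written to storage_address."""
    fp = hashlib.sha256(data).hexdigest()
    return FingerprintRecord(next(_serial), fp, storage_address)


# The deduplication metadata space is modeled as a simple list of records.
dedup_metadata_space = [
    make_record(b"block-A", 0x1000),
    make_record(b"block-B", 0x2000),
    make_record(b"block-A", 0x3000),  # same content, different address
]
```

Identical content yields identical fingerprints, which is what lets the later steps find repeatedly stored data without comparing the data itself.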
  • the storage node sorts multiple fingerprint record items, obtains and stores sorted fingerprint records.
  • multiple fingerprint record items can be sorted by FP identifier in ascending order. After sorting, the fingerprint record items with the same FP identifier are arranged together, yielding the sorted fingerprint records shown in Figure 2(b), and the sorted fingerprint records are stored in the deduplication metadata space.
  • the storage node determines duplicate fingerprints from the sorted fingerprint records.
  • the storage node pre-stores a threshold for judging whether it is a duplicate fingerprint.
  • the threshold may be configured by the user through the client of the storage system, or may be agreed in advance, which is not limited here. Then, the storage node judges whether the number of fingerprint record items containing the same fingerprint in the sorted fingerprint records is greater than or equal to the threshold, and if so, it determines that the fingerprint is a duplicate fingerprint. If a fingerprint is a duplicate fingerprint, it means that the data corresponding to the fingerprint is the same, that is, the data is repeatedly stored in the storage device.
  • the threshold may be 3.
  • the storage node determines that FP_1 and FP_4 are duplicate fingerprints.
  • the device in order to increase the deduplication rate, the device does not need to repeat the fingerprint threshold. For example, as long as the fingerprint record item contains duplicate fingerprints, the deduplication will be performed.
  • the storage node performs a data deduplication operation on the data corresponding to the duplicate fingerprint.
  • after the storage node determines the duplicate fingerprint, it uses the duplicate fingerprint to query the fingerprint table. If the duplicate fingerprint is found in the fingerprint table, it means that the storage system has already stored the unique data corresponding to the duplicate fingerprint and that the fingerprint table records the storage address of the unique data; the correspondence between the access address of the data and the storage address of the data is then changed to the correspondence between the access address of the data and the fingerprint.
  • the access address refers to the address at which data is externally presented, for example, a logical block address (LBA), which is not limited in the embodiment of the present invention.
  • if the duplicate fingerprint is not found in the fingerprint table, the storage node selects a fingerprint record item from the at least one fingerprint record item containing the duplicate fingerprint, reads the data at the storage address corresponding to the duplicate fingerprint in that fingerprint record item, stores the data in the deduplication area to obtain the new storage address of the data, establishes the mapping between the fingerprint and the new storage address in the fingerprint table, and changes the correspondence between the access address of the data and the storage address of the data to the correspondence between the access address of the data and the fingerprint.
  • the fingerprint table is used to record the mapping between the fingerprint of the unique data after deduplication and the storage address of the unique data in the deduplication area.
  • the deduplication area refers to the storage area in the storage system that stores the unique data after deduplication.
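  • the deduplication operation can be sketched as follows, with dictionaries and a list standing in for the real structures. All names (`deduplicate`, `address_map`, `dedup_area`, `read`) are hypothetical, and releasing the redundant copies from the ordinary data area is an assumption that the text does not spell out.

```python
def deduplicate(fp, records, fingerprint_table, address_map, storage, dedup_area):
    """Deduplicate all data whose fingerprint is the duplicate fingerprint `fp`.

    records:           list of (access_address, fingerprint, storage_address)
    fingerprint_table: fingerprint -> index of the unique copy in dedup_area
    address_map:       access address -> ("addr", storage_address)
                       or ("fp", fingerprint)
    storage:           storage_address -> data (ordinary data area)
    dedup_area:        list acting as the deduplication area
    """
    matching = [r for r in records if r[1] == fp]
    if fp not in fingerprint_table:
        # No unique copy yet: read one copy and store it in the dedup area.
        _, _, source_address = matching[0]
        dedup_area.append(storage[source_address])
        fingerprint_table[fp] = len(dedup_area) - 1
    for access_address, _, storage_address in matching:
        # Point the access address at the fingerprint, not the raw address.
        address_map[access_address] = ("fp", fp)
        storage.pop(storage_address, None)  # assumed: redundant copy released


def read(access_address, fingerprint_table, address_map, storage, dedup_area):
    """Resolve an access address (e.g. an LBA) to its data."""
    kind, ref = address_map[access_address]
    return dedup_area[fingerprint_table[ref]] if kind == "fp" else storage[ref]


storage = {0x10: b"same", 0x20: b"same", 0x30: b"other"}
records = [(100, "FP_1", 0x10), (200, "FP_1", 0x20), (300, "FP_2", 0x30)]
address_map = {100: ("addr", 0x10), 200: ("addr", 0x20), 300: ("addr", 0x30)}
fingerprint_table, dedup_area = {}, []

deduplicate("FP_1", records, fingerprint_table, address_map, storage, dedup_area)
```

After the call, access addresses 100 and 200 both resolve through the fingerprint table to a single copy in the deduplication area, while address 300 still resolves directly to its storage address.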
  • the efficiency of deduplication is related to the fingerprint record items stored in the deduplication metadata space.
  • the deduplication metadata space refers to the memory space of the storage system.
  • if the deduplication metadata space is large, the storage space that users can use to store data is reduced, so users need more storage space to store the same amount of data.
  • if the deduplication metadata space is small, the number of fingerprints that can be stored is limited, so the probability of finding duplicate fingerprints is low, which reduces the efficiency of deduplication. Therefore, how to set the size of the deduplication metadata space reasonably is an important factor affecting the storage performance of the storage system.
  • the amount of data stored in the storage system changes in real time. For example, when the storage system is first used, little data is stored in it; as the time in use increases, more and more data is stored. When little data is stored in the storage system, the remaining storage space is relatively large. In this case, part of the remaining storage space can be used for deduplication, which can increase both the deduplication rate and the speed of deduplication. When more and more data is stored in the storage system, the storage system is gradually filling up and the data requires more storage space. In this case, part of the storage space occupied by deduplication can be released to meet the user's data storage needs.
  • an embodiment of the present application provides a storage space management method to provide a method for setting a deduplication metadata space.
  • the method can be applied to a storage system, and the storage system can be a distributed storage system or a centralized storage system.
  • the storage system may be a file storage system, a block storage system, or an object storage system, or a combination of the foregoing storage systems, which is not limited in the embodiments of the present application.
  • the storage space of the storage system can be divided into data storage space and metadata storage space.
  • the data storage space is used to store data
  • the metadata storage space is used to store metadata of the data.
  • there is a preset ratio between the data volume of the data and the data volume of its metadata.
  • the preset ratio is determined by the storage system.
  • the preset ratio can be 10/1 or 20/1, that is, 10 megabytes of storage.
  • the first method is to obtain the data amount of data stored in the data storage space of the storage system.
  • the data amount of data stored in the data storage space is the data amount of data stored in the storage system.
  • the management node in the storage system, or the array controller of the storage array, can obtain the data volume of the data stored in each of the multiple storage nodes it manages; the sum of the data volumes stored in the multiple storage nodes is the data volume of all the data stored in the storage system.
  • the management node is used to manage 3 storage nodes, namely storage node 1 to storage node 3.
  • the management node obtains 10MB of data stored in each storage node of storage node 1 to storage node 3, and the storage node is determined to be stored
  • the acquisition method is simple and does not increase the computational complexity of the management node.
  • the second method is to obtain the data volume of metadata stored in the metadata storage space of the storage system, and to calculate the data volume of the data stored in the storage system according to the data volume of the metadata and the preset ratio.
  • the management node or controller in the storage system obtains the data volume of metadata stored in each of the multiple storage nodes it manages; the sum of the data volumes of the metadata stored in the multiple storage nodes is the data volume of all metadata stored in the storage system. Then, the management node calculates the data volume of all the data stored in the storage system through the preset ratio between the data volume of the data and the data volume of its metadata.
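  • the two ways of obtaining the data volume can be sketched as follows. The function names and the per-node metadata figures are hypothetical, and the preset ratio is passed as a plain number (10 for a ratio of 10/1).

```python
def data_volume_direct(node_data_mb):
    """First method: sum the data volume reported by each storage node."""
    return sum(node_data_mb)


def data_volume_from_metadata(node_metadata_mb, preset_ratio):
    """Second method: sum the per-node metadata volume, then scale by the
    preset data-to-metadata ratio (e.g. 10 for a ratio of 10/1)."""
    return sum(node_metadata_mb) * preset_ratio


# Three storage nodes, each storing 10MB of data.
print(data_volume_direct([10, 10, 10]))          # 30
# Hypothetical: each node holds 1MB of metadata and the ratio is 10/1.
print(data_volume_from_metadata([1, 1, 1], 10))  # 30
```

The second method only touches the much smaller metadata storage space, at the cost of relying on the preset ratio being accurate.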
  • the size of the deduplication metadata space varies according to the amount of data stored in the storage system. After obtaining the data volume of the data stored in the storage system, the size of the deduplication metadata space can be adjusted with the data volume.
  • adjusting the size of the deduplication metadata space can include but is not limited to the following two ways:
  • the first adjustment method is to increase the deduplication metadata space when the data amount is not greater than the first threshold; or, when the data amount is not less than the second threshold, decrease the deduplication metadata space.
  • the storage system sets an initial size for the deduplication metadata space, for example, 10MB. The storage system also pre-stores thresholds for judging whether the size of the deduplication metadata space needs to be adjusted, for example, a first threshold for judging whether to increase the deduplication metadata space and a second threshold for judging whether to reduce the deduplication metadata space. The first threshold may be smaller than the second threshold, or the first threshold may be the same as the second threshold, which is not limited here.
  • the management node may determine to increase or decrease the deduplication metadata space according to the size relationship between the amount of data stored in the storage system and the first threshold or the second threshold.
  • an adjustment value can be preset, and the deduplication metadata space can be increased or decreased according to the adjustment value. For example, if the adjustment value is 10MB, when it is determined to increase or decrease the deduplication metadata space, 10MB is added or subtracted on the basis of the current value of the deduplication metadata space.
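  • the threshold comparison with a preset adjustment value can be sketched as follows. The function name, the threshold values in the usage lines, and the `min_mb` floor are illustrative assumptions; when the two thresholds are equal and the data volume equals them, this sketch chooses to increase.

```python
def adjust_by_step(current_mb, data_volume, first_threshold, second_threshold,
                   step_mb=10, min_mb=0):
    """Return the new size of the deduplication metadata space.

    Grow by `step_mb` when the data volume is not greater than the first
    threshold; shrink by `step_mb` when it is not less than the second.
    `min_mb` is an assumed floor so the size never goes negative.
    """
    if data_volume <= first_threshold:
        return current_mb + step_mb
    if data_volume >= second_threshold:
        return max(min_mb, current_mb - step_mb)
    return current_mb


# Hypothetical thresholds of 100 and 500 (same unit as the data volume).
print(adjust_by_step(10, 50, 100, 500))   # 20
print(adjust_by_step(20, 300, 100, 500))  # 20
print(adjust_by_step(20, 600, 100, 500))  # 10
```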
  • the management node can periodically obtain the data volume of the data stored in the storage system, and adjust the size of the deduplication metadata space multiple times.
  • the target value can be set in advance. For example, two target values can be set: a first target value corresponding to increasing the deduplication metadata space and a second target value corresponding to decreasing the deduplication metadata space. When it is determined that the deduplication metadata space needs to be increased, the deduplication metadata space is adjusted to the first target value, and when it is determined that the deduplication metadata space needs to be reduced, the deduplication metadata space is adjusted to the second target value.
  • after the management node periodically obtains the data volume of the data stored in the storage system, if the current judgment result is the same as the last judgment result (for example, both are to increase the deduplication metadata space), the management node does not need to adjust again, because the last adjustment has already set the size of the deduplication metadata space to the target value. This can lower the load of the management node.
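  • the target-value variant can be sketched as follows, including the skipping of repeated adjustments. The class name, the attribute names, and all numeric values are hypothetical.

```python
class TargetValueAdjuster:
    """Snap the deduplication metadata space to one of two target sizes."""

    def __init__(self, initial_mb, first_target_mb, second_target_mb,
                 first_threshold, second_threshold):
        self.size_mb = initial_mb
        self.first_target_mb = first_target_mb    # size used when increasing
        self.second_target_mb = second_target_mb  # size used when decreasing
        self.first_threshold = first_threshold
        self.second_threshold = second_threshold
        self.last_decision = None
        self.adjustments = 0  # number of real resize operations performed

    def on_period(self, data_volume):
        """Called once per period with the current stored data volume."""
        if data_volume <= self.first_threshold:
            decision = "increase"
        elif data_volume >= self.second_threshold:
            decision = "decrease"
        else:
            return self.size_mb  # between the thresholds: nothing to do
        if decision != self.last_decision:
            # Only resize when the judgment differs from the previous one.
            self.size_mb = (self.first_target_mb if decision == "increase"
                            else self.second_target_mb)
            self.adjustments += 1
            self.last_decision = decision
        return self.size_mb


adjuster = TargetValueAdjuster(initial_mb=10, first_target_mb=40,
                               second_target_mb=5,
                               first_threshold=100, second_threshold=500)
sizes = [adjuster.on_period(v) for v in (50, 60, 300, 600, 700)]
print(sizes, adjuster.adjustments)  # [40, 40, 40, 5, 5] 2
```

Five periodic checks trigger only two real adjustments, which illustrates the reduced load on the management node.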
  • the second adjustment method is to preset the correspondence between the proportion of storage space occupied by the deduplication metadata space and the data volume of the data stored in the storage system. After the data volume of the data currently stored in the storage system is obtained, the size of the deduplication metadata space is adjusted according to the maximum data volume that can be stored in the storage system, the data volume of the data currently stored in the storage system, and the correspondence.
  • the correspondence between the proportion of the storage space of the deduplication metadata space and the data volume of the data stored in the storage system may be as shown in Table 1.
  • as shown in Table 1, when the data volume of the stored data is 0%, the deduplication metadata space accounts for its largest proportion in the storage system, which is 2%; when the data volume of the stored data reaches 60%, the proportion of the deduplication metadata space in the storage system drops to 1%; when the data volume of the stored data reaches 85%, the proportion of the deduplication metadata space in the storage system drops to 0.2%.
  • after obtaining the data volume of the data currently stored in the storage system, the management node queries Table 1 to determine the proportion of the deduplication metadata space in the current situation. For example, if the amount of data currently stored is 10MB, which is less than 20MB in Table 1, it can be determined that the proportion of the deduplication metadata space is 2%.
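  • querying the correspondence can be sketched as a step-function lookup. Only the three points mentioned in the text (0%, 60%, 85%) are used; treating them as lower bounds of ranges is an assumption, since Table 1 itself is not reproduced here, and the names are hypothetical.

```python
from bisect import bisect_right

# (lower bound of stored-data percentage, proportion of dedup metadata space)
# Only these three points appear in the text; the step-function reading is
# an assumption because Table 1 is not reproduced.
CORRESPONDENCE = [(0.0, 0.02), (60.0, 0.01), (85.0, 0.002)]


def dedup_space_proportion(stored_percent):
    """Look up the proportion for the current stored-data percentage."""
    bounds = [lower for lower, _ in CORRESPONDENCE]
    index = bisect_right(bounds, stored_percent) - 1
    return CORRESPONDENCE[max(index, 0)][1]


print(dedup_space_proportion(10))  # 0.02
print(dedup_space_proportion(60))  # 0.01
print(dedup_space_proportion(90))  # 0.002
```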
  • the management node determines the maximum amount of data that the storage system can store.
  • in order to store as much data as possible, after receiving the data to be stored sent by the user, the storage system compresses the data to be stored and then stores the compressed data.
  • the maximum amount of data that can be stored by the storage system refers to the amount of data before the compression operation. If the reduction rate used during the compression operation is different, the maximum amount of data that the storage system can store is also different, and the reduction rate used by the storage system may continue to change. Therefore, before determining the maximum amount of data that the storage system can store, it is necessary to determine the reduction rate used by the storage system.
  • the management node can obtain the data volume of all metadata currently stored in the storage system, and then calculate the first data volume, that is, the volume of the data received by the storage system before the compression operation, according to the preset ratio between the data volume of the metadata and the data volume of the data corresponding to the metadata. Then, the second data volume of all the data stored in the data storage space of the storage system is obtained, and the ratio of the first data volume to the second data volume is the reduction rate currently used by the storage system.
  • the management node takes the product of the total amount of storage space of the storage system and the reduction rate currently used by the storage system as the maximum amount of data that the storage system can store. For example, if the above calculations determine that the current reduction rate of the storage system is 3 and the total storage space of the storage system is 100TB, the management node determines that the maximum amount of data that the storage system can store is 300TB.
  • the management node takes the product of the proportion of the deduplication metadata space and the maximum amount of data that the storage system can store as the value of the deduplication metadata space, and adjusts the deduplication metadata space to this value.
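  • the sizing calculation can be worked through in a few lines. The function names are hypothetical, and the metadata and stored volumes are made-up inputs chosen so that the reduction rate comes out as 3; the 100TB capacity and the 2% proportion are the figures used in the text.

```python
def reduction_rate(metadata_volume, preset_ratio, stored_volume):
    """Ratio of pre-compression data volume to the volume actually stored.

    first data volume  = metadata volume * preset data-to-metadata ratio
    second data volume = stored_volume (data held in the data storage space)
    """
    first_volume = metadata_volume * preset_ratio
    return first_volume / stored_volume


def max_data_volume(total_capacity, rate):
    """Maximum pre-compression data volume the storage system can store."""
    return total_capacity * rate


def dedup_space_size(proportion, max_volume):
    """Size of the deduplication metadata space for the given proportion."""
    return proportion * max_volume


# Hypothetical inputs chosen so the reduction rate is 3: 3TB of metadata
# at a 10/1 ratio means 30TB was received, while 10TB is actually stored.
rate = reduction_rate(metadata_volume=3, preset_ratio=10, stored_volume=10)
max_tb = max_data_volume(total_capacity=100, rate=rate)
print(rate, max_tb)                    # 3.0 300.0
print(dedup_space_size(0.02, max_tb))  # 6.0
```

With a 2% proportion, the deduplication metadata space would be set to 6TB under these assumed inputs.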
  • Two types of fingerprint record items are stored in the deduplication metadata space: unsorted fingerprint record items and sorted fingerprint record items.
  • Unsorted fingerprint record items can be removed in batches through an eviction policy when the deduplication metadata space is insufficient, so that the storage system can quickly obtain usable deduplication metadata space.
  • Sorted fingerprint record items must be removed depending on the sorting result, for example, on whether the number of fingerprint record items with the same fingerprint is greater than or equal to a preset number. Removing sorted fingerprint record items therefore incurs a certain delay.
  • The storage space used to store unsorted fingerprint record items in the deduplication metadata space is called the first deduplication metadata space.
  • The storage space used to store sorted fingerprint record items is called the second deduplication metadata space.
  • Table 2 and Table 3 respectively show the correspondence between the proportion of the first deduplication metadata space and the amount of data stored in the storage system, and the correspondence between the proportion of the second deduplication metadata space and the amount of data stored in the storage system.
  • In Table 2 and Table 3, the amount of data stored is expressed as a percentage of the total storage space.
  • In Table 2, the maximum proportion of the first deduplication metadata space is 2%.
  • When little data is stored in the storage system, the first deduplication metadata space can take this maximum proportion.
  • Only when the amount of data reaches 60% or more of the total storage space does the proportion of the first deduplication metadata space need to be reduced, for example, from 2% to 1%.
  • When the amount of data reaches 85% or more of the total storage space, the proportion of the first deduplication metadata space is adjusted to be close to the proportion required by the storage system. For example, if the storage system requires the proportion to be only 0.1%, the proportion can be adjusted to a value close to 0.1%, for example, reduced to 0.2%.
  • When the amount of data reaches 95% or more of the total storage space, the proportion of the first deduplication metadata space is strictly held at the value required by the storage system (for example, 0.1%).
  • the rate of decrease in the proportion of the first deduplication metadata space depends on the speed at which the storage system eliminates unsorted fingerprint record items. The faster the storage system eliminates unsorted fingerprint record items, the slower the rate of decrease in the proportion of the first deduplication metadata space.
  • Table 3 (amount of data stored: proportion of the second deduplication metadata space): 0%: 0.40%; 25%: 0.30%; 50%: 0.20%; 75%: 0.10%.
  • In Table 3, the maximum proportion of the second deduplication metadata space is 0.4%.
  • When little data is stored in the storage system, the second deduplication metadata space can take this maximum proportion. Because sorted fingerprint record items are evicted slowly, the proportion of the second deduplication metadata space needs to be reduced once the amount of data reaches 25% or more of the total storage space, for example, from 0.4% to 0.3%.
  • The other entries in Table 3 are not described one by one here.
  • After the management node determines the proportion of the first deduplication metadata space and the proportion of the second deduplication metadata space, it determines that the size of the first deduplication metadata space is the product of the proportion of the first deduplication metadata space and the maximum amount of data that the storage system can store, and that the size of the second deduplication metadata space is the product of the proportion of the second deduplication metadata space and the maximum amount of data. It then adjusts the first deduplication metadata space and the second deduplication metadata space to the corresponding values.
  • the method of obtaining the maximum amount of data that can be stored by the storage system is similar to the method in the foregoing example, and will not be repeated here.
  • The deduplication metadata space can occupy a large storage space for a long period of time. More fingerprint record items are then stored in the deduplication metadata space, so a single deduplication operation can identify more duplicate fingerprints, which improves the deduplication rate and the deduplication speed of the storage system.
  • Increasing the size of the deduplication metadata space can be understood as reducing the data storage space in order to enlarge the deduplication metadata space, or as occupying part of the data storage space for storing fingerprint record items.
  • Reducing the size of the deduplication metadata space can be understood as enlarging the data storage space in order to shrink the deduplication metadata space, or as the deduplication metadata space returning the part of the data storage space that it occupies.
  • Besides fingerprint record items, the metadata storage space of the storage system also stores other metadata, such as the metadata of the data stored in the storage system, the metadata of the storage pool of the storage system, and the volume metadata of the logical volumes of the storage system. The metadata storage space can therefore be divided, according to the role of the stored metadata, into a deduplication metadata space and other metadata space.
  • In the other metadata space, the amount of storage pool metadata and volume metadata is relatively fixed. The amount of metadata stored in the other metadata space is therefore correlated with the amount of metadata of the data stored in the storage system.
  • When the amount of data stored in the storage system is small, the amount of metadata in the other metadata space is also small. When the other metadata space has remaining storage space, the deduplication metadata space can use that remaining storage space.
  • The deduplication metadata space can also use the remaining storage space in the other metadata space to adjust its size. The available storage space of the deduplication metadata space (the remaining storage space of the metadata space) is the difference between the maximum amount of data that the storage system can store and the amount of other metadata in the storage system, excluding the deduplication metadata space.
  • the specific process of adjusting the size of the deduplication metadata space is similar to the aforementioned method of adjusting the size of the deduplication metadata space by occupying the data storage space, and will not be repeated here.
  • Because the size of the deduplication metadata space can be adjusted flexibly, the efficiency of deduplication can be improved by increasing the size of the deduplication metadata space. Moreover, because the amount of data stored in the storage system is the adjustment factor, the size of the deduplication metadata space set by this method can still meet the data storage needs of users and does not affect the storage performance of the storage system.
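  • The storage system may include a hardware structure and/or a software module, and implements the functions above in the form of a hardware structure, a software module, or a hardware structure plus a software module.
  • Whether a given function is performed by a hardware structure, a software module, or a hardware structure plus a software module depends on the specific application and design constraints of the technical solution.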
  • FIG. 4 shows a schematic structural diagram of an apparatus 400 for managing storage space.
  • the storage space management apparatus 400 may be used to implement the functions of a storage node of a storage system or an array controller of a storage array.
  • the storage space management device 400 may be a hardware structure, a software module, or a hardware structure plus a software module.
  • the storage space management device 400 may be implemented by a chip system. In the embodiments of the present application, the chip system may be composed of chips, or may include chips and other discrete devices.
  • the storage space management apparatus 400 may include a processing unit 401 and an acquiring unit 402.
  • the acquiring unit 402 may be used to perform step S31 in the embodiment shown in FIG. 3, and/or used to support other processes of the technology described herein.
  • The acquiring unit 402 may be used to communicate with the processing unit 401, or may be used for communication between the storage space management apparatus 400 and other modules. It may be a circuit, a device, an interface, a bus, a software module, a transceiver, or any other apparatus that can realize communication.
  • the processing unit 401 may be used to execute step S32 in the embodiment shown in FIG. 3, and/or to support other processes of the technology described herein.
  • the division of modules in the embodiment shown in FIG. 4 is illustrative, and is only a logical function division. In actual implementation, there may be other division methods.
  • The functional modules in the various embodiments of the present application may be integrated in one processor, may exist alone physically, or two or more modules may be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or software function modules.
  • FIG. 5 shows a storage space management apparatus 500 provided by an embodiment of the present application.
  • the storage space management apparatus 500 may be used to implement the functions of a storage node of a storage system or an array controller of a storage array.
  • the device 500 for managing the storage space may be a chip system.
  • the chip system may be composed of chips, or may include chips and other discrete devices.
  • the storage space management apparatus 500 includes at least one processor 520, which is configured to implement or support the storage space management apparatus 500 to implement the functions of the storage node or the array controller of the storage array in the method provided in the embodiment of the present application.
  • the processor 520 may adjust the size of the deduplication metadata space of the storage system according to the amount of data stored in the storage system. For details, refer to the detailed description in the method example, which is not repeated here.
  • the storage space management apparatus 500 may further include at least one memory 530 for storing program instructions and/or data.
  • the memory 530 and the processor 520 are coupled.
  • the coupling in the embodiments of the present application is an indirect coupling or communication connection between devices, units or modules, and may be in electrical, mechanical or other forms, and is used for information exchange between devices, units or modules.
  • the processor 520 may cooperate with the memory 530 to operate.
  • the processor 520 may execute program instructions stored in the memory 530. At least one of the at least one memory may be included in the processor.
  • the storage space management apparatus 500 may further include an interface 510 for communicating with the processor 520 or for communicating with other devices through a transmission medium, so that the storage space management apparatus 500 can communicate with other devices.
  • the other device may be a storage client or a storage device.
  • the processor 520 may use the interface 510 to send and receive data.
  • the specific connection medium between the aforementioned interface 510, the processor 520, and the memory 530 is not limited in the embodiment of the present application.
  • the memory 530, the processor 520, and the interface 510 are connected by a bus 550.
  • The bus is represented by a thick line in FIG. 5. The connections between other components are shown only for illustration and are not limiting.
  • the bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of presentation, only one thick line is used to represent in FIG. 5, but it does not mean that there is only one bus or one type of bus.
  • The processor 520 may be a general-purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application.
  • the general-purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware processor, or executed by a combination of hardware and software modules in the processor.
  • The memory 530 may be a non-volatile memory, such as a hard disk drive (HDD) or a solid-state drive (SSD), and may also be a volatile memory, for example, random-access memory (RAM).
  • the memory is any other medium that can be used to carry or store desired program codes in the form of instructions or data structures and that can be accessed by a computer, but is not limited to this.
  • the memory in the embodiments of the present application may also be a circuit or any other device capable of realizing a storage function for storing program instructions and/or data.
  • An embodiment of the present application also provides a computer-readable storage medium, including instructions, which when run on a computer, cause the computer to execute the method executed by the storage node or the array controller in the embodiment shown in FIG. 3.
  • An embodiment of the present application also provides a computer program product, including instructions, which when run on a computer, cause the computer to execute the method executed by the storage node or the array controller in the embodiment shown in FIG. 3.
  • the embodiment of the present application provides a chip system.
  • the chip system includes a processor and may also include a memory for implementing the functions of the storage node or the array controller in the foregoing method.
  • the chip system can be composed of chips, or it can include chips and other discrete devices.
  • An embodiment of the present application provides a storage system, which includes a storage device and a storage node or an array controller in the embodiment shown in FIG. 3.
  • the methods provided in the embodiments of the present application may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • When implemented by software, the methods can be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, network equipment, user equipment, or other programmable devices.
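  • The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave).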
  • a computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or data center integrated with one or more available media.
  • The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, an SSD).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

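A method and apparatus for managing storage space. In the method, the amount of data stored in a storage system is first obtained, and the size of the deduplication metadata space used in the storage system to store fingerprint record items is then adjusted according to that amount; a fingerprint record item records the fingerprint of data. Because the size of the deduplication metadata space can be adjusted flexibly, the efficiency of deduplication can be improved by increasing the size of the deduplication metadata space. Moreover, because the amount of data stored in the storage system is the adjustment factor, the size of the deduplication metadata space set by this method can still meet the user's data storage needs and does not affect the storage performance of the storage system.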

Description

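Method and Apparatus for Managing Storage Space
Technical Field
This application relates to the field of storage technologies, and in particular to a method and apparatus for managing storage space.
Background
As technology develops, more and more data needs to be stored in storage systems. To save storage space, deduplication technology has been proposed: if multiple copies of a piece of data are stored in the storage system, the redundant copies are deleted and only one copy is kept, so that the data is reduced and occupies less storage space.
At present, one implementation of deduplication is as follows: compute the fingerprint of each piece of data and store the fingerprints in a deduplication metadata space; determine duplicate fingerprints among the fingerprints stored in the deduplication metadata space, the data corresponding to a duplicate fingerprint being the data to be deduplicated; and deduplicate that data in batches.
Because the storage space of a storage system is limited, a larger deduplication metadata space leaves less storage space for the user's data, while a smaller deduplication metadata space holds only a limited number of fingerprints, so duplicate fingerprints are less likely to appear and deduplication efficiency suffers. How to set the size of the deduplication metadata space reasonably is therefore an important factor affecting the storage performance of the storage system.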
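Summary
This application provides a method and apparatus for managing storage space, so as to provide a way of setting the deduplication metadata space.
According to a first aspect, a method for managing storage space is provided. In the method, the amount of data stored in a storage system is first obtained, and the size of the deduplication metadata space used in the storage system to store fingerprint record items is then adjusted according to that amount; a fingerprint record item records the fingerprint of data.
In the above technical solution, because the size of the deduplication metadata space can be adjusted flexibly, the efficiency of deduplication can be improved by increasing the size of the deduplication metadata space. Moreover, because the amount of data stored in the storage system is the adjustment factor, the size of the deduplication metadata space set by this method can still meet the user's data storage needs and does not affect the storage performance of the storage system.
In a possible design, obtaining the amount of data stored in the storage system may include, but is not limited to, the following two manners:
First manner: obtain the amount of data stored in the data storage space of the storage system, where the data storage space is used to store data.
In the above technical solution, the storage space of the storage system can be divided into a data storage space and a metadata storage space, and the data stored in the storage system is the data stored in the data storage space. Therefore, when the amount of data stored in the storage system is needed, the amount of data stored in the data storage space can be obtained directly, which is simple.
Second manner: obtain the amount of metadata stored in the metadata storage space of the storage system, and calculate the amount of data stored in the storage system from the amount of metadata and a preset ratio, where the metadata storage space is used to store the metadata of data.
In the above technical solution, because a preset ratio exists between the amount of data and the amount of its metadata, the amount of data stored in the storage system can be obtained from the amount of metadata in the storage system. Because the metadata storage space is smaller than the data storage space, this reduces the search space and the latency.
In a possible design, adjusting the size of the deduplication metadata space according to the amount of data stored in the storage system may include, but is not limited to, the following three manners:
First adjustment manner: when the amount of data stored in the storage system is not greater than a first threshold, increase the deduplication metadata space.
In the above technical solution, when little data is stored in the storage system, the storage system has much free storage space, so the deduplication metadata space can be increased to improve the efficiency of deduplication.
Second adjustment manner: when the amount of data stored in the storage system is not less than a second threshold, decrease the deduplication metadata space.
In the above technical solution, when much data is stored in the storage system, the storage system is about to be filled and the data needs more storage space, so the deduplication metadata space can be reduced to make room for data, which avoids an impact on the storage performance of the storage system.
Third adjustment manner: the storage system may pre-store a correspondence between the proportion of the deduplication metadata space and the amount of data stored in the storage system. Then, after the maximum amount of data that the storage system can store is obtained, the size of the deduplication metadata space is adjusted according to the maximum amount, the amount of data stored in the storage system, and the preset correspondence.
In the above technical solution, the size of the deduplication metadata space in the current situation can be determined from the maximum amount of data allowed by the storage system and the preset correspondence, which improves the accuracy of the adjustment.
In a possible design, the size of the deduplication metadata space can be adjusted by way of the data storage space. When the deduplication metadata space needs to be increased, the data storage space is reduced in order to increase the deduplication metadata space; when the deduplication metadata space needs to be decreased, the data storage space is increased in order to decrease the deduplication metadata space.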
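According to a second aspect, an apparatus for managing storage space is provided. The apparatus may be a storage node or a storage server, or an apparatus in a storage node or a storage server. The apparatus includes a processor configured to implement the method described in the first aspect. The apparatus may further include a memory configured to store program instructions and data. The memory is coupled to the processor, and the processor can invoke and execute the program instructions stored in the memory to implement any of the methods described in the first aspect. The apparatus may further include an interface that communicates with the processor.
In a possible design, the apparatus includes a processor and an interface, the interface communicating with the processor, where the processor is configured to:
obtain the amount of data stored in a storage system; and
adjust the size of the deduplication metadata space of the storage system according to the amount of data, where the deduplication metadata space is used to store fingerprint record items, and a fingerprint record item records the fingerprint of data.
In a possible design, the processor is specifically configured to:
obtain the amount of data stored in the data storage space of the storage system, where the data storage space is used to store data; or
obtain the amount of metadata stored in the metadata storage space of the storage system, and calculate the amount of data stored in the storage system from the amount of metadata and a preset ratio, where the metadata storage space is used to store the metadata of data.
In a possible design, the processor is specifically configured to:
increase the deduplication metadata space when the amount of data is not greater than a first threshold; or
decrease the deduplication metadata space when the amount of data is not less than a second threshold.
In a possible design, the processor is specifically configured to:
obtain the maximum amount of data that the storage system can store; and
adjust the size of the deduplication metadata space according to the maximum amount, the amount of data, and a preset correspondence between the proportion of the deduplication metadata space and the amount of data stored in the storage system.
In a possible design, the processor is specifically configured to:
reduce the data storage space of the storage system to increase the size of the deduplication metadata space; or
increase the data storage space to reduce the size of the deduplication metadata space.
According to a third aspect, an apparatus for managing storage space is provided. The apparatus may be a storage node or a storage server, or an apparatus in a storage node or a storage server. The apparatus may include a processing unit and an acquiring unit, which can perform the corresponding functions in any design example of the first aspect. Specifically:
the acquiring unit is configured to obtain the amount of data stored in a storage system; and
the processing unit is configured to adjust the size of the deduplication metadata space of the storage system according to the amount of data, where the deduplication metadata space is used to store fingerprint record items, and a fingerprint record item records the fingerprint of data.
According to a fourth aspect, an embodiment of this application provides a computer-readable storage medium storing a computer program. The computer program includes program instructions that, when executed by a computer, cause the computer to perform the method according to any item of the first aspect.
According to a fifth aspect, an embodiment of this application provides a computer program product storing a computer program. The computer program includes program instructions that, when executed by a computer, cause the computer to perform the method according to any item of the first aspect.
According to a sixth aspect, this application provides a chip system. The chip system includes a processor, and may further include a memory, for implementing the method of the first aspect. The chip system may consist of a chip, or may include a chip and other discrete devices.
According to a seventh aspect, an embodiment of this application provides a storage system. The storage system includes a storage device and the apparatus for managing storage space according to the second aspect or any design of the second aspect; or the storage system includes a storage device and the apparatus for managing storage space according to the third aspect or any design of the third aspect.
For the beneficial effects of the second to seventh aspects and their implementations, refer to the description of the beneficial effects of the method of the first aspect and its implementations.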
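Brief Description of the Drawings
FIG. 1 is a flowchart of a deduplication method in the prior art;
FIG. 2 is a schematic diagram of an example of the fingerprint records before and after a deduplication operation in the prior art;
FIG. 3 is a flowchart of the data storage method provided by an embodiment of this application;
FIG. 4 is a structural diagram of an example of the apparatus for managing storage space provided by an embodiment of this application;
FIG. 5 is a structural diagram of another example of the apparatus for managing storage space provided by an embodiment of this application.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the embodiments are described in further detail below with reference to the drawings.
In the embodiments of this application, "a plurality of" means two or more, and can therefore also be understood as "at least two". "At least one" means one or more, for example, one, two, or more. For example, "including at least one" means including one, two, or more, without limiting which ones are included; "including at least one of A, B, and C" may mean including A, B, C, A and B, A and C, B and C, or A and B and C. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may mean A alone, both A and B, or B alone. In addition, unless otherwise specified, the character "/" generally indicates an "or" relationship between the objects before and after it. In the embodiments of this application, "node" and "节点" can be used interchangeably.
Unless otherwise stated, ordinal numbers such as "first" and "second" in the embodiments of this application are used to distinguish between objects and do not limit the order, timing, priority, or importance of the objects.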
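To facilitate understanding of the method in the embodiments of this application, deduplication technology is described first.
At present, depending on when the deduplication operation is performed, deduplication can be divided into inline deduplication and post-process deduplication. In inline deduplication, the deduplication operation is performed before the data in the cache of the storage system is stored to the storage device; in post-process deduplication, it is performed after the data in the cache has been stored to the storage device. The technical solution in the embodiments of this application improves on post-process deduplication.
Refer to FIG. 1, which is a flowchart of post-process deduplication. The flowchart is described as follows:
S11. A storage node of the storage system generates and stores a fingerprint record item for each piece of data.
Specifically, the storage node computes the fingerprint of each piece of data. After storing the data at a storage address, the storage node generates the fingerprint record item of the data and stores it in the deduplication metadata space, which is used to store fingerprint record items. A fingerprint record item contains the correspondence between the fingerprint of the data and the storage address of the data.
As an example, refer to FIG. 2. Suppose 10 pieces of data are stored in the storage device. The deduplication metadata space then stores the 10 fingerprint record items corresponding to the 10 pieces of data, as shown in FIG. 2(a). In FIG. 2(a), the fingerprint record item for a piece of data has three parts: a sequence number, a fingerprint (FP), and a token. The sequence number indicates the order in which the storage node generated the fingerprint record items, and the token indicates information such as the storage address of the data.
S12. The storage node sorts the fingerprint record items, and obtains and stores the sorted fingerprint records.
Specifically, the fingerprint record items can be sorted in ascending order of FP identifier. After sorting, fingerprint record items with the same FP identifier are adjacent, which gives the sorted fingerprint records shown in FIG. 2(b). The sorted fingerprint records are stored in the deduplication metadata space.
S13. The storage node determines duplicate fingerprints from the sorted fingerprint records.
The storage node pre-stores a threshold for deciding whether a fingerprint is a duplicate fingerprint. The threshold may be configured by the user through a client of the storage system or agreed on in advance, which is not limited here. The storage node then checks whether the number of fingerprint records in the sorted fingerprint records that contain the same fingerprint is greater than or equal to the threshold; if so, the fingerprint is determined to be a duplicate fingerprint. If a fingerprint is a duplicate fingerprint, the data corresponding to that fingerprint is identical, that is, the data has been stored repeatedly in the storage device.
As an example, the threshold may be 3. In the fingerprint records shown in FIG. 2(b), 3 fingerprint record items include FP_1 and 4 fingerprint record items include FP_4, so the storage node determines that FP_1 and FP_4 are duplicate fingerprints.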
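Steps S12 and S13 can be sketched in a few lines of Python. This is only an illustrative sketch: the record layout (sequence number, fingerprint, token), the token values, and the function names are assumptions, and the ten sample records are invented so that FP_1 occurs 3 times and FP_4 occurs 4 times, as in the example above. The actual contents of FIG. 2 are not reproduced here.

```python
from collections import Counter

# Hypothetical fingerprint record items: (sequence number, fingerprint, token).
# The values are invented for illustration; only the counts for FP_1 (3) and
# FP_4 (4) follow the example in the text.
RECORDS = [
    (1, "FP_4", "addr_a"), (2, "FP_1", "addr_b"), (3, "FP_3", "addr_c"),
    (4, "FP_4", "addr_d"), (5, "FP_1", "addr_e"), (6, "FP_2", "addr_f"),
    (7, "FP_4", "addr_g"), (8, "FP_3", "addr_h"), (9, "FP_1", "addr_i"),
    (10, "FP_4", "addr_j"),
]


def sort_records(records):
    """S12: sort record items by fingerprint so equal fingerprints are adjacent."""
    return sorted(records, key=lambda item: item[1])


def find_duplicate_fingerprints(sorted_records, threshold=3):
    """S13: a fingerprint is a duplicate if it occurs at least `threshold` times."""
    counts = Counter(fp for _, fp, _ in sorted_records)
    return {fp for fp, n in counts.items() if n >= threshold}


sorted_records = sort_records(RECORDS)
duplicates = find_duplicate_fingerprints(sorted_records)
```

With this sample data, running the detection on only the first five sorted records finds fewer duplicate fingerprints than running it on all ten, which is the effect the description relies on when it argues for a larger deduplication metadata space.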
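In another implementation, to improve the deduplication rate, no duplicate-fingerprint threshold is set; for example, deduplication is performed as soon as the fingerprint record items are found to contain a repeated fingerprint.
S14. The storage node performs the deduplication operation on the data corresponding to the duplicate fingerprints.
After determining a duplicate fingerprint, the storage node uses the duplicate fingerprint to query a fingerprint table. If the duplicate fingerprint is found in the fingerprint table, the storage system already stores the unique data corresponding to the duplicate fingerprint, and the fingerprint table records the storage address of that unique data; the correspondence between the access address of the data and the storage address of the data is then changed to a correspondence between the access address of the data and the fingerprint. The access address is the address that the data presents externally, for example, a logical block address (LBA), which is not limited in the embodiments of the present invention. If the duplicate fingerprint is not found in the fingerprint table, the storage system has not stored the unique data corresponding to the duplicate fingerprint. The storage node then selects one fingerprint record item from the at least one fingerprint record item containing the duplicate fingerprint, reads the data at the storage address that the selected item associates with the duplicate fingerprint, stores the data in the deduplication area to obtain a new storage address of the data, establishes a mapping between the fingerprint and the new storage address in the fingerprint table, and changes the correspondence between the access address of the data and the storage address of the data to a correspondence between the access address of the data and the fingerprint.
It should be noted that the fingerprint table records the mapping between the fingerprint of unique data after deduplication and the storage address of that unique data in the deduplication area. The deduplication area is the storage area in the storage system used to store the unique data after deduplication.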
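The two branches of S14 (fingerprint already in the fingerprint table, or not yet) can be illustrated with plain dictionaries. The sketch below is an assumption-laden simplification: the dictionary layouts, the `lba` field on each record, the `dedup_N` address naming, and the function name `deduplicate` are all hypothetical, and reclaiming the space of the redundant copies is left out.

```python
def deduplicate(fp, records, storage, dedup_area, fingerprint_table, access_map):
    """S14 for one duplicate fingerprint `fp` (illustrative sketch only).

    records:           list of dicts with keys "fp", "addr", "lba" (hypothetical layout)
    storage:           storage address -> data
    dedup_area:        deduplication-area address -> unique data
    fingerprint_table: fingerprint -> address in the deduplication area
    access_map:        access address (LBA) -> ("addr", address) or ("fp", fingerprint)
    """
    matching = [r for r in records if r["fp"] == fp]
    if not matching:
        return
    if fp not in fingerprint_table:
        # No unique copy yet: copy the data of one record into the dedup area
        # and register its new address in the fingerprint table.
        chosen = matching[0]
        new_addr = f"dedup_{len(dedup_area)}"
        dedup_area[new_addr] = storage[chosen["addr"]]
        fingerprint_table[fp] = new_addr
    # Point every access address at the fingerprint instead of a storage address.
    for r in matching:
        access_map[r["lba"]] = ("fp", fp)


storage = {"a1": b"hello", "a2": b"hello", "a3": b"hello", "a4": b"world"}
records = [
    {"fp": "FP_1", "addr": "a1", "lba": 0},
    {"fp": "FP_1", "addr": "a2", "lba": 8},
    {"fp": "FP_1", "addr": "a3", "lba": 16},
    {"fp": "FP_2", "addr": "a4", "lba": 24},
]
access_map = {r["lba"]: ("addr", r["addr"]) for r in records}
dedup_area, fingerprint_table = {}, {}
deduplicate("FP_1", records, storage, dedup_area, fingerprint_table, access_map)
```

After the call, a read of any of the three FP_1 access addresses would resolve through the fingerprint table to the single copy in the deduplication area, while the FP_2 mapping is untouched.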
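As the above process shows, the efficiency of deduplication is related to the fingerprint record items stored in the deduplication metadata space. For example, as shown in FIG. 2(b), when 10 fingerprint record items are stored in the deduplication metadata space, one run of the method shown in FIG. 1 determines two duplicate fingerprints, FP_1 and FP_4; when only the first 5 fingerprints shown in FIG. 2(b) are stored in the deduplication metadata space, one run of the method determines only one duplicate fingerprint, FP_1. The more fingerprint record items are stored in the deduplication metadata space, the more duplicate data a single deduplication pass can delete. From this perspective, increasing the deduplication metadata space helps improve the efficiency of deduplication. In one implementation of the embodiments of the present invention, the deduplication metadata space is memory space of the storage system.
However, because the storage space of a storage system is limited, a larger deduplication metadata space leaves less storage space for the user's data, so that a user who wants to store the same amount of data needs a storage system with more storage space, which increases cost. A smaller deduplication metadata space holds only a limited number of fingerprints, so duplicate fingerprints are less likely to appear and deduplication efficiency suffers. How to set the size of the deduplication metadata space reasonably is therefore an important factor affecting the storage performance of the storage system.
In actual use, the amount of data stored in the storage system changes in real time. For example, when the storage system has just been put into use, little data is stored in it; as time passes, more and more data is stored. While little data is stored, the remaining storage space of the storage system is large, and part of the remaining storage space can be used for deduplication, which increases the deduplication rate and the deduplication speed. As more and more data is stored, the storage system gradually fills up and the data needs more storage space; the part of the remaining storage space occupied for deduplication can then be released to meet the user's data storage needs.
In view of this, the embodiments of this application provide a method for managing storage space, so as to provide a way of setting the deduplication metadata space. The method can be applied to a storage system, which may be a distributed storage system or a centralized storage system. The storage system may be a file storage system, a block storage system, an object storage system, or a combination of these, which is not limited in the embodiments of this application.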
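Refer to FIG. 3, which is a flowchart of the method. The flowchart is described as follows:
S31. Obtain the amount of data stored in the storage system.
The storage space of the storage system can be divided into a data storage space and a metadata storage space. The data storage space is used to store data, and the metadata storage space is used to store the metadata of the data. The amount of data and the amount of its metadata are in a preset ratio that is determined by the storage system; the preset ratio may be 10/1, 20/1, and so on. That is, with a ratio of 10/1, storing 10 megabytes (MB) of data produces 1/10 × 10 MB = 1 MB of metadata. Therefore, in the embodiments of this application, obtaining the amount of data stored in the storage system includes, but is not limited to, the following two manners:
First manner: obtain the amount of data stored in the data storage space of the storage system; the amount of data stored in the data storage space is the amount of data stored in the storage system.
Specifically, a management node in the storage system, or an array controller of a storage array, obtains the amount of data stored in each of the storage nodes that it manages; the sum of these amounts is the amount of all data stored in the storage system. For example, the management node manages 3 storage nodes, storage node 1 to storage node 3. The management node learns that each of storage node 1 to storage node 3 stores 10 MB of data, and determines that the amount of all data stored in the storage system is 10 × 3 = 30 MB. This manner is simple and does not increase the computational complexity of the management node.
Second manner: obtain the amount of metadata stored in the metadata storage space of the storage system, and calculate the amount of data stored in the storage system from the amount of metadata and the preset ratio.
Specifically, a management node or controller in the storage system obtains the amount of metadata stored in each of the storage nodes that it manages; the sum of these amounts is the amount of all metadata stored in the storage system. The management node then uses the preset ratio between the amount of data and the amount of its metadata to calculate the amount of all data stored in the storage system. Continuing the example above, the management node learns that each of storage node 1 to storage node 3 stores 1 MB of metadata, and determines that the amount of all metadata stored in the storage system is 1 × 3 = 3 MB. Assuming the ratio between the amount of data and the amount of its metadata is 10/1, the management node determines that the amount of data stored in the storage system is 3 × 10 = 30 MB. Because the metadata storage space is smaller than the data storage space, this reduces the search space and the latency.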
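The two manners of S31 amount to a sum, optionally followed by a multiplication with the preset ratio. The following sketch reproduces the 3-node example; the function names and the choice of MB as the unit are assumptions made for illustration.

```python
def stored_data_direct(data_per_node_mb):
    """First manner: sum the data amount reported by each storage node."""
    return sum(data_per_node_mb)


def stored_data_from_metadata(metadata_per_node_mb, data_to_metadata_ratio=10):
    """Second manner: sum the metadata amounts, then scale by the preset ratio.

    `data_to_metadata_ratio` is the preset data/metadata ratio (10 for "10/1").
    """
    return sum(metadata_per_node_mb) * data_to_metadata_ratio


direct_mb = stored_data_direct([10, 10, 10])       # three nodes, 10 MB of data each
estimated_mb = stored_data_from_metadata([1, 1, 1])  # three nodes, 1 MB of metadata each
```

Both calls yield the same 30 MB as the worked example; the second manner only has to scan the smaller metadata storage space.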
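S32. Adjust the size of the deduplication metadata space of the storage system according to the amount of data.
In the embodiments of this application, the size of the deduplication metadata space varies with the amount of data stored in the storage system. After the amount of data stored in the storage system is obtained, the size of the deduplication metadata space can be adjusted according to that amount.
Specifically, adjusting the size of the deduplication metadata space may include, but is not limited to, the following two manners:
First adjustment manner: when the amount of data is not greater than a first threshold, increase the deduplication metadata space; or, when the amount of data is not less than a second threshold, decrease the deduplication metadata space.
Specifically, the storage system sets an initial size for the deduplication metadata space, for example, 10 MB. The storage system also pre-stores thresholds for deciding whether the size of the deduplication metadata space needs to be adjusted, for example, a first threshold for deciding whether to increase the deduplication metadata space and a second threshold for deciding whether to decrease it. The first threshold may be less than the second threshold, or the first threshold may be equal to the second threshold, which is not limited here. The management node then decides to increase or decrease the deduplication metadata space according to the relationship between the amount of data stored in the storage system and the first threshold or the second threshold.
As an example, an adjustment value can be preset, and the deduplication metadata space is increased or decreased by that value. For example, if the adjustment value is 10 MB, then when it is determined that the deduplication metadata space is to be increased or decreased, 10 MB is added to or subtracted from the current value of the deduplication metadata space.
In this case, the management node can adjust the size of the deduplication metadata space multiple times by periodically obtaining the amount of data stored in the storage system.
As another example, target values can be preset, for example, two target values: a first target value for increasing the deduplication metadata space and a second target value for decreasing it. When it is determined that the deduplication metadata space needs to be increased, it is adjusted to the first target value; when it is determined that the deduplication metadata space needs to be decreased, it is adjusted to the second target value.
In this case, after the management node periodically obtains the amount of data stored in the storage system, if the current decision is the same as the previous one (for example, both are to increase the deduplication metadata space), the previous adjustment has already brought the deduplication metadata space to the target value, so the management node does not need to adjust it again, which lowers the load of the management node.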
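The threshold-based manner, in both its fixed-step and fixed-target variants, can be sketched as follows. The function names, the use of MB as the unit, and the sample thresholds are assumptions for illustration. Clamping the size at zero is an added safeguard that the description does not mention, and when the two thresholds are equal this sketch lets the "increase" branch win, which is one possible reading.

```python
def adjust_by_step(current_mb, stored_mb, first_threshold_mb, second_threshold_mb, step_mb=10):
    """Grow or shrink the deduplication metadata space by a preset adjustment value."""
    if stored_mb <= first_threshold_mb:       # "not greater than the first threshold"
        return current_mb + step_mb
    if stored_mb >= second_threshold_mb:      # "not less than the second threshold"
        return max(0, current_mb - step_mb)   # clamp at 0: an assumption, not from the text
    return current_mb


def adjust_to_target(current_mb, stored_mb, first_threshold_mb, second_threshold_mb,
                     grow_target_mb, shrink_target_mb):
    """Jump to a preset target value; repeated identical decisions change nothing."""
    if stored_mb <= first_threshold_mb:
        return grow_target_mb
    if stored_mb >= second_threshold_mb:
        return shrink_target_mb
    return current_mb


# Hypothetical thresholds: grow at or below 100 MB stored, shrink at or above 800 MB.
after_grow = adjust_by_step(10, stored_mb=50, first_threshold_mb=100, second_threshold_mb=800)
after_shrink = adjust_by_step(30, stored_mb=900, first_threshold_mb=100, second_threshold_mb=800)
```

Calling `adjust_by_step` once per polling period gives the repeated adjustment described above, whereas `adjust_to_target` is idempotent for consecutive identical decisions, which is why the management node can skip the second adjustment.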
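Second adjustment manner: preset a correspondence between the proportion of the storage space taken by the deduplication metadata space and the amount of data stored in the storage system. After the amount of data currently stored in the storage system is obtained, adjust the size of the deduplication metadata space according to the maximum amount of data that the storage system can store, the amount of data currently stored in the storage system, and the correspondence.
As an example, the correspondence between the proportion of the storage space taken by the deduplication metadata space and the amount of data stored in the storage system may be as shown in Table 1. In Table 1, when the amount of stored data is 0%, the proportion of the deduplication metadata space in the storage system is at its maximum, 2%; once the amount of stored data reaches 60%, the proportion drops to 1%; once the amount of stored data reaches 85%, the proportion drops to 0.2%.
Table 1
Amount of stored data    Proportion of the deduplication metadata space
20MB    2.00%
200TB    1.00%
240TB    0.20%
After obtaining the amount of data currently stored in the storage system, the management node looks up Table 1 to determine the proportion of the deduplication metadata space in the current situation. For example, if the amount of data currently stored is 10 MB, which is less than the 20 MB in Table 1, the proportion of the deduplication metadata space is determined to be 2%.
The management node then determines the maximum amount of data that the storage system can store.
In a storage system, to enable the system to store as much data as possible, the storage system performs operations such as compression on the data to be stored after receiving it from the user, and then stores the compressed data. In the embodiments of this application, the maximum amount of data that the storage system can store refers to the amount of data before compression. If the reduction rate used during compression differs, the maximum amount of data that the storage system can store also differs, and the reduction rate used by the storage system may keep changing. Therefore, before determining the maximum amount of data that the storage system can store, the reduction rate used by the storage system must first be determined. For example, the management node can obtain the amount of all metadata currently stored in the storage system and then, using the preset ratio between the amount of metadata and the amount of the corresponding data, calculate a first data amount, which is the amount of data received by the storage system before compression. It then obtains a second data amount, which is the amount of all data stored in the data storage space of the storage system. The ratio of the first data amount to the second data amount is the reduction rate currently used by the storage system.
The management node then determines the product of the total storage space of the storage system and the reduction rate currently used by the storage system; this product is the maximum amount of data that the storage system can store. For example, if the calculation above gives a current reduction rate of 3 and the total storage space of the storage system is 100 TB, the management node determines that the maximum amount of data that the storage system can store is 300 TB.
Finally, the management node determines the product of the proportion of the deduplication metadata space and the maximum amount of data that the storage system can store; this product is the value of the deduplication metadata space, and the deduplication metadata space is adjusted to this value.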
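The chain of calculations above (reduction rate, maximum data amount, size of the deduplication metadata space) can be written out as a short sketch. The function names are assumptions, and the inputs of 3 TB of metadata and 10 TB of compressed data are invented so that the reduction rate comes out as 3, matching the 100 TB and 300 TB figures in the example.

```python
def reduction_rate(metadata_amount, data_to_metadata_ratio, stored_compressed_amount):
    """Ratio of the pre-compression data amount to the amount actually stored."""
    first_amount = metadata_amount * data_to_metadata_ratio  # data received before compression
    second_amount = stored_compressed_amount                 # data in the data storage space
    return first_amount / second_amount


def max_data_amount(total_capacity, rate):
    """Maximum pre-compression data amount the storage system can hold."""
    return total_capacity * rate


def dedup_space_size(proportion, max_amount):
    """Size of the deduplication metadata space for a given proportion (0.02 = 2%)."""
    return proportion * max_amount


# Hypothetical inputs in TB: 3 TB of metadata at a 10/1 ratio, 10 TB stored after compression.
rate = reduction_rate(metadata_amount=3, data_to_metadata_ratio=10, stored_compressed_amount=10)
max_tb = max_data_amount(total_capacity=100, rate=rate)
size_tb = dedup_space_size(0.02, max_tb)
```

With a proportion of 2% looked up from the preset correspondence, the sketch sizes the deduplication metadata space at about 6 TB for a 300 TB maximum data amount.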
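As another example, the description of the post-process deduplication flow shown in FIG. 1 shows that two kinds of fingerprint record items are stored in the deduplication metadata space: unsorted fingerprint record items and sorted fingerprint record items. Unsorted fingerprint record items can be removed in batches through an eviction policy when the deduplication metadata space is insufficient, so that the storage system can quickly obtain usable deduplication metadata space. Sorted fingerprint record items must be removed depending on the sorting result, for example, on whether the number of fingerprint record items with the same fingerprint is greater than or equal to a preset number; removing sorted fingerprint record items therefore incurs a certain delay. Therefore, in the embodiments of this application, different correspondences between the proportion of the storage space taken by the deduplication metadata space and the amount of data stored in the storage system can be set for the two kinds of fingerprint record items, which makes the adjustment of the deduplication metadata space more precise.
For ease of description, hereinafter the storage space in the deduplication metadata space used to store unsorted fingerprint record items is called the first deduplication metadata space, and the storage space used to store sorted fingerprint record items is called the second deduplication metadata space. As an example, refer to Table 2 and Table 3, which respectively give the correspondence between the proportion of the first deduplication metadata space and the amount of data stored in the storage system, and the correspondence between the proportion of the second deduplication metadata space and the amount of data stored in the storage system. In Table 2 and Table 3, the amount of stored data is expressed as a percentage of the total storage space.
Table 2
Figure PCTCN2020111002-appb-000001
Figure PCTCN2020111002-appb-000002
In Table 2, the maximum proportion of the first deduplication metadata space is 2%. When little data is stored in the storage system, the first deduplication metadata space can take this maximum proportion. Only when the amount of data reaches 60% or more of the total storage space does the proportion of the first deduplication metadata space need to be reduced, for example, from 2% to 1%. When the amount of data reaches 85% or more of the total storage space, the proportion of the first deduplication metadata space is adjusted to be close to the proportion required by the storage system; for example, if the storage system requires the proportion of the first deduplication metadata space to be only 0.1%, the proportion can be adjusted to a value close to 0.1%, for example, reduced to 0.2%. When the amount of data reaches 95% or more of the total storage space, the proportion of the first deduplication metadata space is strictly held at the value required by the storage system (for example, 0.1%).
It should be noted that the rate at which the proportion of the first deduplication metadata space decreases depends on the speed at which the storage system evicts unsorted fingerprint record items. The faster the storage system evicts unsorted fingerprint record items, the slower the proportion of the first deduplication metadata space decreases.
Table 3
Amount of stored data    Proportion of the second deduplication metadata space
0%    0.40%
25%    0.30%
50%    0.20%
75%    0.10%
In Table 3, the maximum proportion of the second deduplication metadata space is 0.4%. When little data is stored in the storage system, the second deduplication metadata space can take this maximum proportion. Because sorted fingerprint record items are evicted slowly, the proportion of the second deduplication metadata space needs to be reduced once the amount of data reaches 25% or more of the total storage space, for example, from 0.4% to 0.3%. The other entries in Table 3 are not described one by one here.
As Table 2 and Table 3 show, the proportion of the second deduplication metadata space decreases faster than the proportion of the first deduplication metadata space.
After determining the proportion of the first deduplication metadata space and the proportion of the second deduplication metadata space, the management node determines that the size of the first deduplication metadata space is the product of the proportion of the first deduplication metadata space and the maximum amount of data that the storage system can store, and that the size of the second deduplication metadata space is the product of the proportion of the second deduplication metadata space and the maximum amount of data. It then adjusts the first deduplication metadata space and the second deduplication metadata space to the corresponding values. The maximum amount of data that the storage system can store is obtained in a way similar to that in the preceding example, which is not repeated here.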
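A table-driven lookup of the two proportions, followed by the multiplication with the maximum data amount, might look like the sketch below. `TABLE_3` copies the rows of Table 3; `TABLE_2` is reconstructed from the prose description of Table 2 (2% below 60%, 1% from 60%, 0.2% from 85%, 0.1% from 95%), since the table itself is only available as an image, so it should be read as an assumption. The function names and the step-function interpretation of the tables are also assumptions.

```python
import bisect

# (lower bound of stored data as % of total storage space, proportion in %)
TABLE_2 = [(0, 2.0), (60, 1.0), (85, 0.2), (95, 0.1)]      # reconstructed from the prose
TABLE_3 = [(0, 0.40), (25, 0.30), (50, 0.20), (75, 0.10)]  # rows of Table 3


def lookup_proportion(table, usage_percent):
    """Return the proportion (in %) of the last row whose lower bound is <= usage."""
    bounds = [bound for bound, _ in table]
    index = bisect.bisect_right(bounds, usage_percent) - 1
    return table[max(index, 0)][1]


def dedup_space_sizes(usage_percent, max_data_amount):
    """Sizes of the first and second deduplication metadata spaces."""
    first = lookup_proportion(TABLE_2, usage_percent) / 100 * max_data_amount
    second = lookup_proportion(TABLE_3, usage_percent) / 100 * max_data_amount
    return first, second


# Hypothetical state: the system is 70% full and can hold at most 300 TB of data.
first_tb, second_tb = dedup_space_sizes(usage_percent=70, max_data_amount=300)
```

At 70% usage the sketch selects 1% from the reconstructed Table 2 and 0.20% from Table 3, giving roughly 3 TB for unsorted and 0.6 TB for sorted fingerprint record items when the maximum data amount is 300 TB.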
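As Table 1 to Table 3 show, the deduplication metadata space can occupy a large storage space for a long period of time. More fingerprint record items are then stored in the deduplication metadata space, so a single deduplication operation can identify more duplicate fingerprints, which improves the deduplication rate and the deduplication speed of the storage system.
Because the size of the storage space in the storage system is fixed, and the storage space can be divided into a data storage space and a metadata storage space, in the embodiments of this application increasing the size of the deduplication metadata space can be understood as reducing the data storage space in order to enlarge the deduplication metadata space, or as occupying part of the data storage space for storing fingerprint record items. Correspondingly, reducing the size of the deduplication metadata space can be understood as enlarging the data storage space in order to shrink the deduplication metadata space, or as the deduplication metadata space returning the part of the data storage space that it occupies.
In addition, besides fingerprint record items, the metadata storage space of the storage system also stores other metadata, for example, the metadata of the data stored in the storage system, the metadata of the storage pool of the storage system, and the volume metadata of the logical volumes of the storage system. The metadata storage space can therefore be divided, according to the role of the stored metadata, into a deduplication metadata space and other metadata space. In the other metadata space, the amount of storage pool metadata and volume metadata is relatively fixed, so the amount of metadata stored in the other metadata space is correlated with the amount of metadata of the data stored in the storage system: when the amount of data stored in the storage system is small, the amount of metadata in the other metadata space is also small. Thus, when the other metadata space has remaining storage space, the deduplication metadata space can also use that remaining storage space.
As an example, suppose the capacity of the metadata space is 1.5% of the maximum amount of data that the storage system can store. The amount of metadata stored in the other metadata space grows as the data stored in the storage system grows. Therefore, when little data is stored in the storage system, the other metadata space also has remaining storage space, and the deduplication metadata space can use that remaining storage space to adjust its size. The available storage space of the deduplication metadata space (the remaining storage space of the metadata space) is the difference between the maximum amount of data that the storage system can store and the amount of other metadata in the storage system, excluding the deduplication metadata space. In this case, the specific process of adjusting the size of the deduplication metadata space is similar to the aforementioned way of adjusting it by occupying data storage space, and is not repeated here.
In the above technical solution, because the size of the deduplication metadata space can be adjusted flexibly, the efficiency of deduplication can be improved by increasing the size of the deduplication metadata space. Moreover, because the amount of data stored in the storage system is the adjustment factor, the size of the deduplication metadata space set by this method can still meet the user's data storage needs and does not affect the storage performance of the storage system.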
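In the embodiments provided above, to implement the functions in the method provided by the embodiments of this application, the storage system may include a hardware structure and/or a software module, and implements the functions in the form of a hardware structure, a software module, or a hardware structure plus a software module. Whether a given function is performed by a hardware structure, a software module, or a hardware structure plus a software module depends on the specific application and design constraints of the technical solution.
FIG. 4 shows a schematic structural diagram of an apparatus 400 for managing storage space. The apparatus 400 may be used to implement the functions of a storage node of a storage system or of an array controller of a storage array. The apparatus 400 may be a hardware structure, a software module, or a hardware structure plus a software module. The apparatus 400 may be implemented by a chip system. In the embodiments of this application, a chip system may consist of a chip, or may include a chip and other discrete devices.
The apparatus 400 may include a processing unit 401 and an acquiring unit 402.
The acquiring unit 402 may be used to perform step S31 in the embodiment shown in FIG. 3, and/or to support other processes of the technology described herein. In a possible implementation, the acquiring unit 402 may be used to communicate with the processing unit 401, or may be used for communication between the apparatus 400 and other modules; it may be a circuit, a device, an interface, a bus, a software module, a transceiver, or any other apparatus that can realize communication.
The processing unit 401 may be used to perform step S32 in the embodiment shown in FIG. 3, and/or to support other processes of the technology described herein.
All related content of the steps in the method embodiments above can be cited in the functional descriptions of the corresponding functional modules, and is not repeated here.
The division of modules in the embodiment shown in FIG. 4 is illustrative and is only a logical functional division; in actual implementation, other division methods are possible. In addition, the functional modules in the embodiments of this application may be integrated in one processor, may exist alone physically, or two or more modules may be integrated into one module. The integrated modules can be implemented in the form of hardware or in the form of software functional modules.
FIG. 5 shows an apparatus 500 for managing storage space provided by an embodiment of this application. The apparatus 500 may be used to implement the functions of a storage node of a storage system or of an array controller of a storage array. The apparatus 500 may be a chip system. In the embodiments of this application, a chip system may consist of a chip, or may include a chip and other discrete devices.
The apparatus 500 includes at least one processor 520, configured to implement, or to support the apparatus 500 in implementing, the functions of the storage node or of the array controller of the storage array in the method provided by the embodiments of this application. For example, the processor 520 may adjust the size of the deduplication metadata space of the storage system according to the amount of data stored in the storage system. For details, refer to the detailed description in the method example, which is not repeated here.
The apparatus 500 may further include at least one memory 530 for storing program instructions and/or data. The memory 530 is coupled to the processor 520. Coupling in the embodiments of this application is an indirect coupling or communication connection between apparatuses, units, or modules; it may be electrical, mechanical, or in another form, and is used for information exchange between the apparatuses, units, or modules. The processor 520 may operate in cooperation with the memory 530. The processor 520 may execute the program instructions stored in the memory 530. At least one of the at least one memory may be included in the processor.
The apparatus 500 may further include an interface 510 for communicating with the processor 520, or for communicating with other devices through a transmission medium, so that the apparatus 500 can communicate with other devices. For example, the other device may be a storage client or a storage device. The processor 520 may use the interface 510 to send and receive data.
The specific connection medium between the interface 510, the processor 520, and the memory 530 is not limited in the embodiments of this application. In FIG. 5, the memory 530, the processor 520, and the interface 510 are connected by a bus 550, which is represented by a thick line; the connections between other components are shown only for illustration and are not limiting. The bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of presentation, only one thick line is used in FIG. 5, but this does not mean that there is only one bus or one type of bus.
In the embodiments of this application, the processor 520 may be a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of this application. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of this application may be directly embodied as being executed by a hardware processor, or executed by a combination of hardware and software modules in the processor.
In the embodiments of this application, the memory 530 may be a non-volatile memory, such as a hard disk drive (HDD) or a solid-state drive (SSD), and may also be a volatile memory, for example, random-access memory (RAM). The memory may also be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to this. The memory in the embodiments of this application may also be a circuit or any other apparatus capable of realizing a storage function, for storing program instructions and/or data.
An embodiment of this application also provides a computer-readable storage medium including instructions that, when run on a computer, cause the computer to perform the method performed by the storage node or the array controller in the embodiment shown in FIG. 3.
An embodiment of this application also provides a computer program product including instructions that, when run on a computer, cause the computer to perform the method performed by the storage node or the array controller in the embodiment shown in FIG. 3.
An embodiment of this application provides a chip system. The chip system includes a processor, and may further include a memory, for implementing the functions of the storage node or the array controller in the foregoing method. The chip system may consist of a chip, or may include a chip and other discrete devices.
An embodiment of this application provides a storage system, which includes a storage device and the storage node or array controller in the embodiment shown in FIG. 3.
The methods provided in the embodiments of this application may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, they can be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, network equipment, user equipment, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, an SSD).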

Claims (17)

  1. 一种存储空间的管理方法,其特征在于,包括:
    获取存储系统中存储的数据的数据量;
    根据所述数据量的大小,调整所述存储系统的重删元数据空间的大小;其中,所述重删元数据空间用于存储指纹记录项;指纹记录项用于记录数据的指纹。
  2. 根据权利要求1所述的方法,其特征在于,获取存储系统中存储的数据的数据量,包括:
    获取存储在所述存储系统的数据存储空间中的数据的数据量,所述数据存储空间用于存储数据;或,
    获取存储在所述存储系统的元数据存储空间中的元数据的数据量,根据所述元数据的数据量与预设比值,计算得到所述存储系统中存储的数据的数据量,所述元数据存储空间用于存储数据的元数据。
  3. 根据权利要求1或2所述的方法,其特征在于,根据所述数据量的大小,调整重删元数据空间的大小,包括:
    在所述数据量不大于第一阈值时,增大所述重删元数据空间;或,
    在所述数据量不小于第二阈值时,减小所述重删元数据空间。
  4. 根据权利要求1或2所述的方法,其特征在于,根据所述数据量的大小,调整重删元数据空间的大小,包括:
    获取所述存储系统中能够存储的数据的最大数据量;
    根据所述最大数据量、所述数据量的大小,以及,预设的重删元数据空间的占比与所述存储系统中存储的数据的数据量之间的对应关系,调整所述重删元数据空间的大小。
  5. 根据权利要求1-4中任一项所述的方法,其特征在于,调整所述重删元数据空间的大小,包括:
    减小所述存储系统的数据存储空间以增大所述重删元数据空间的大小;或,
    增大所述数据存储空间以减小所述重删元数据空间的大小。
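  6. A storage space management apparatus, comprising an interface and a processor, wherein the interface communicates with the processor, and the processor is configured to:
    obtain a data amount of data stored in a storage system; and
    adjust a size of a deduplication metadata space of the storage system based on the data amount, wherein the deduplication metadata space is used to store fingerprint entries, and each fingerprint entry records a fingerprint of data.
  7. The apparatus according to claim 6, wherein the processor is specifically configured to:
    obtain a data amount of data stored in a data storage space of the storage system, wherein the data storage space is used to store data; or
    obtain a data amount of metadata stored in a metadata storage space of the storage system, and calculate the data amount of the data stored in the storage system based on the data amount of the metadata and a preset ratio, wherein the metadata storage space is used to store metadata of data.
  8. The apparatus according to claim 6 or 7, wherein the processor is specifically configured to:
    increase the deduplication metadata space when the data amount is not greater than a first threshold; or
    reduce the deduplication metadata space when the data amount is not less than a second threshold.
  9. The apparatus according to claim 6 or 7, wherein the processor is specifically configured to:
    obtain a maximum data amount of data that can be stored in the storage system; and
    adjust the size of the deduplication metadata space based on the maximum data amount, the data amount, and a preset correspondence between a proportion of the deduplication metadata space and the data amount of the data stored in the storage system.
  10. The apparatus according to any one of claims 6 to 9, wherein the processor is specifically configured to:
    reduce a data storage space of the storage system to increase the size of the deduplication metadata space; or
    increase the data storage space to reduce the size of the deduplication metadata space.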
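  11. A storage space management apparatus, comprising an obtaining unit and a processing unit, wherein
    the obtaining unit is configured to obtain a data amount of data stored in a storage system; and
    the processing unit is configured to adjust a size of a deduplication metadata space of the storage system based on the data amount, wherein the deduplication metadata space is used to store fingerprint entries, and each fingerprint entry records a fingerprint of data.
  12. The apparatus according to claim 11, wherein the obtaining unit is configured to:
    obtain a data amount of data stored in a data storage space of the storage system, wherein the data storage space is used to store data; or
    obtain a data amount of metadata stored in a metadata storage space of the storage system, and calculate the data amount of the data stored in the storage system based on the data amount of the metadata and a preset ratio, wherein the metadata storage space is used to store metadata of data.
  13. The apparatus according to claim 11 or 12, wherein the processing unit is configured to:
    increase the deduplication metadata space when the data amount is not greater than a first threshold; or
    reduce the deduplication metadata space when the data amount is not less than a second threshold.
  14. The apparatus according to claim 11 or 12, wherein the processing unit is configured to:
    obtain a maximum data amount of data that can be stored in the storage system; and
    adjust the size of the deduplication metadata space based on the maximum data amount, the data amount, and a preset correspondence between a proportion of the deduplication metadata space and the data amount of the data stored in the storage system.
  15. The apparatus according to any one of claims 11 to 14, wherein the processing unit is configured to:
    reduce a data storage space of the storage system to increase the size of the deduplication metadata space; or
    increase the data storage space to reduce the size of the deduplication metadata space.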
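  16. A computer storage medium, wherein the computer storage medium stores instructions that, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 5.
  17. A computer program product, wherein the computer program product stores instructions that, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 5.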
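
The method of claims 1 to 5 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the applicant's implementation: the `StoragePool` class, the function names, the fixed `step_bytes` increment, the reading of the preset ratio as "data bytes per metadata byte", and the percentage-based `share_table` are all hypothetical choices made for the example. The claims do not fix any of these details.

```python
from dataclasses import dataclass


@dataclass
class StoragePool:
    """Hypothetical pool shared by the data space and the dedup metadata space."""
    total_bytes: int        # capacity shared by both spaces (assumption)
    dedup_meta_bytes: int   # space currently reserved for fingerprint entries

    @property
    def data_space_bytes(self) -> int:
        # Claim 5: growing one space shrinks the other.
        return self.total_bytes - self.dedup_meta_bytes


def estimate_data_amount(metadata_bytes: int, data_per_metadata: int) -> int:
    """Claim 2, second branch: derive the data amount from the metadata amount.

    The preset ratio is assumed here to mean data bytes per metadata byte.
    """
    return metadata_bytes * data_per_metadata


def adjust_by_threshold(pool: StoragePool, data_bytes: int,
                        first_threshold: int, second_threshold: int,
                        step_bytes: int) -> None:
    """Claim 3: grow at or below the first threshold, shrink at or above the second.

    Assumes first_threshold < second_threshold; step_bytes is a hypothetical
    fixed increment.
    """
    if data_bytes <= first_threshold:
        pool.dedup_meta_bytes = min(pool.total_bytes,
                                    pool.dedup_meta_bytes + step_bytes)
    elif data_bytes >= second_threshold:
        pool.dedup_meta_bytes = max(0, pool.dedup_meta_bytes - step_bytes)


def adjust_by_mapping(pool: StoragePool, data_bytes: int, max_data_bytes: int,
                      share_table: list[tuple[int, int]]) -> None:
    """Claim 4: pick the dedup metadata share that matches the current fill level.

    share_table holds (upper fill percent, dedup share percent) pairs sorted by
    fill percent; this table layout is an assumption of the sketch.
    """
    fill_percent = data_bytes * 100 / max_data_bytes
    share_percent = share_table[-1][1]  # fallback: last row of the table
    for upper_fill_percent, candidate in share_table:
        if fill_percent <= upper_fill_percent:
            share_percent = candidate
            break
    pool.dedup_meta_bytes = pool.total_bytes * share_percent // 100


if __name__ == "__main__":
    pool = StoragePool(total_bytes=1000, dedup_meta_bytes=100)
    adjust_by_threshold(pool, data_bytes=200, first_threshold=300,
                        second_threshold=800, step_bytes=50)
    print(pool.dedup_meta_bytes, pool.data_space_bytes)
```

Both adjustment functions only move the boundary between the two spaces, which mirrors claim 5: when little data is stored, the spare capacity holds more fingerprint entries, and the space is handed back to data storage as the system fills up.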
PCT/CN2020/111002 2019-09-05 2020-08-25 Storage space management method and apparatus WO2021043026A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910835398.3A CN110750211B (zh) 2019-09-05 2019-09-05 Storage space management method and apparatus
CN201910835398.3 2019-09-05

Publications (1)

Publication Number Publication Date
WO2021043026A1 true WO2021043026A1 (zh) 2021-03-11

Family

ID=69276170

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/111002 WO2021043026A1 (zh) 2019-09-05 2020-08-25 Storage space management method and apparatus

Country Status (2)

Country Link
CN (1) CN110750211B (zh)
WO (1) WO2021043026A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
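CN110750211B (zh) * 2019-09-05 2021-05-04 华为技术有限公司 Storage space management method and apparatus
CN113625938A (zh) * 2020-05-06 2021-11-09 华为技术有限公司 Metadata storage method and device
CN112181291B (zh) * 2020-09-04 2022-08-02 杭州宏杉科技股份有限公司 Data write-back method and apparatus, electronic device, and machine-readable storage medium
CN112148791B (zh) * 2020-09-15 2024-05-24 张立旭 Distributed data dynamic adjustment storage method and system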

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1767050A (zh) * 2005-09-23 2006-05-03 北京中星微电子有限公司 Programmable storage apparatus and method
US20070088920A1 (en) * 2005-10-19 2007-04-19 Philip Garcia Managing data for memory, a data store, and a storage device
CN103309812A (zh) * 2012-03-15 2013-09-18 中国移动通信集团公司 Method and apparatus for allocating smart card memory space
CN110069215A (zh) * 2019-03-27 2019-07-30 浙江宇视科技有限公司 Method and apparatus for dynamically adjusting storage units based on block storage
CN110750211A (zh) * 2019-09-05 2020-02-04 华为技术有限公司 Storage space management method and apparatus

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
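CN103064639B (zh) * 2012-12-28 2016-08-03 华为技术有限公司 Data storage method and apparatus
CN105335100B (zh) * 2015-09-29 2018-09-21 华为技术有限公司 Data processing method and apparatus, and flash device
CN109074226B (zh) * 2016-09-28 2020-03-20 华为技术有限公司 Data deduplication method in storage system, storage system, and controller
CN106843762B (zh) * 2017-01-17 2019-11-12 深圳忆联信息系统有限公司 Method for managing storage areas, and solid-state drive
CN107632791A (zh) * 2017-10-10 2018-01-26 郑州云海信息技术有限公司 Storage space allocation method and system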

Also Published As

Publication number Publication date
CN110750211A (zh) 2020-02-04
CN110750211B (zh) 2021-05-04

Similar Documents

Publication Publication Date Title
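WO2021043026A1 (zh) Storage space management method and apparatus
US11960393B2 (en) Data processing method and apparatus, and flash device
WO2021027541A1 (zh) Method and apparatus for deleting duplicate data
US11360705B2 (en) Method and device for queuing and executing operation commands on a hard disk
WO2021073635A1 (zh) Data storage method and apparatus
WO2021036689A1 (zh) Cache space management method and apparatus
KR20180086120A (ko) Tail-latency-aware foreground garbage collection algorithm
CN111475507B (zh) Key-value data indexing method for workload-adaptive single-level LSMT
CN108089825B (zh) Storage system based on distributed cluster
WO2023020247A1 (zh) Method and apparatus for precision reduction of time-series metric data, and computer device
US10013425B1 (en) Space-efficient persistent block reservation optimized for compression
US11461239B2 (en) Method and apparatus for buffering data blocks, computer device, and computer-readable storage medium
CN116700634B (zh) Garbage collection method and apparatus for distributed storage system, and distributed storage system
CN111427920B (zh) Data acquisition method, apparatus, and system, computer device, and storage medium
TWI722392B (zh) Memory device
CN114116634B (zh) Caching method and apparatus, and readable storage medium
WO2023050856A1 (zh) Data processing method and storage system
EP4321981A1 (en) Data processing method and apparatus
CN115167778A (zh) Storage management method, system, and server
US10795596B1 (en) Delayed deduplication using precalculated hashes
WO2021016728A1 (zh) Data processing method and apparatus in storage system, and computer-readable storage medium
CN110658999A (zh) Information updating method, apparatus, and device, and computer-readable storage medium
US20240143449A1 (en) Data Processing Method and Apparatus
WO2024032015A1 (zh) Data reduction method, apparatus, and system
CN114237507B (zh) Method and apparatus for managing data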

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20861593

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20861593

Country of ref document: EP

Kind code of ref document: A1