CN116700634B - Garbage recycling method and device for distributed storage system and distributed storage system - Google Patents

Publication number
CN116700634B
Authority
CN
China
Prior art keywords
storage system
distributed storage
determining
garbage collection
read
Legal status
Active
Application number
CN202310988513.7A
Other languages
Chinese (zh)
Other versions
CN116700634A (en)
Inventor
孙润宇
侯斌
Current Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0602: Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614: Improving the reliability of storage systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02: Addressing or allocation; Relocation
    • G06F12/0223: User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023: Free address space management
    • G06F12/0253: Garbage collection, i.e. reclamation of unreferenced memory
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0668: Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067: Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides a garbage collection method and device for a distributed storage system and the distributed storage system, wherein the method comprises the following steps: acquiring historical operation data of a distributed storage system; determining performance indexes and current service pressure of the distributed storage system according to the historical operation data; determining a target garbage collection strategy according to the size relation between the performance index and the current service pressure; and determining target garbage collection parameters of the distributed storage system according to the target garbage collection strategy, so as to collect garbage from the distributed storage system based on the target garbage collection parameters. According to the method provided by the scheme, the performance index and the current service pressure of the distributed storage system are comprehensively considered, and the target garbage collection parameter is set, so that the problem that the garbage collection influences the normal processing of the distributed storage system on the service is avoided, and the stability of the distributed storage system is improved.

Description

Garbage recycling method and device for distributed storage system and distributed storage system
Technical Field
The present application relates to the field of distributed storage technologies, and in particular, to a method and an apparatus for garbage collection in a distributed storage system, and a distributed storage system.
Background
At present, with the rapid development of information technology, large amounts of data need to be stored, and distributed file storage systems have developed rapidly as a reliable solution. Distributed storage systems often generate garbage data while storing data, so how to reclaim that garbage data has become an important research topic.
In the prior art, the garbage collection rate is generally configured according to the current data occupation degree of the distributed storage system.
However, if the garbage is recovered at a larger garbage recovery rate under the condition of higher service load of the distributed storage system, the service processing of the distributed storage system is affected, and the stability of the distributed storage system is reduced.
Disclosure of Invention
The application provides a garbage collection method and device for a distributed storage system, and a distributed storage system, to overcome defects of the prior art such as the reduced stability of the distributed storage system.
The first aspect of the application provides a garbage collection method of a distributed storage system, comprising the following steps:
acquiring historical operation data of a distributed storage system;
determining performance indexes and current service pressure of the distributed storage system according to the historical operation data;
determining a target garbage collection strategy according to the magnitude relationship between the performance index and the current service pressure;
and determining target garbage collection parameters of the distributed storage system according to the target garbage collection strategy, so as to carry out garbage collection on the distributed storage system based on the target garbage collection parameters.
In an alternative embodiment, the determining the performance index of the distributed storage system according to the historical operation data includes:
determining read-write IOPS and read-write bandwidth of the distributed storage system according to the historical operation data;
and determining the performance index of the distributed storage system according to the read-write IOPS and the read-write bandwidth of the distributed storage system.
In an alternative embodiment, the determining the performance index of the distributed storage system according to the read-write IOPS and the read-write bandwidth of the distributed storage system includes:
according to a first preset granularity, determining a performance statistical result of the distributed storage system according to the read-write IOPS and the read-write bandwidth of the distributed storage system;
and determining the performance index of the distributed storage system according to the performance statistical result of the distributed storage system.
In an alternative embodiment, the determining, according to the first preset granularity, the performance statistics of the distributed storage system according to the read-write IOPS and the read-write bandwidth of the distributed storage system includes:
according to a first preset granularity, determining the average read-write IOPS and the average read-write bandwidth of the distributed storage system in the first preset granularity according to the read-write IOPS and the read-write bandwidth of the distributed storage system;
and determining a performance statistical result of the distributed storage system according to the average read-write IOPS and the average read-write bandwidth of the distributed storage system in the first preset granularity.
In an alternative embodiment, the determining the performance index of the distributed storage system according to the performance statistics of the distributed storage system includes:
determining a first total number of the performance statistics according to the ratio between the first preset granularity and the second preset granularity;
obtaining the latest consecutive performance statistics, equal in number to the first total number, as a first target set;
determining average read-write IOPS and average read-write bandwidth of the distributed storage system in the second preset granularity according to the first target set;
determining a performance index of the distributed storage system according to the average read-write IOPS and the average read-write bandwidth of the distributed storage system in the second preset granularity;
wherein the second preset granularity is greater than the first preset granularity.
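The aggregation over the coarser window can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `PerfStat` and `rollup`, the use of seconds as the unit, and the choice to compute the count as the second granularity divided by the first are all assumptions introduced here.

```python
from typing import List, NamedTuple


class PerfStat(NamedTuple):
    avg_iops: float        # average read-write IOPS in one fine-grained window
    avg_bandwidth: float   # average read-write bandwidth in the same window


def rollup(stats: List[PerfStat], first_s: int, second_s: int) -> PerfStat:
    """Average the latest, consecutive fine-grained statistics over a coarser window."""
    if second_s <= first_s:
        raise ValueError("the second granularity must exceed the first")
    count = second_s // first_s          # how many fine-grained statistics to aggregate
    if len(stats) < count:
        raise ValueError("not enough statistics collected yet")
    target = stats[-count:]              # target set: the newest `count` entries
    return PerfStat(
        sum(s.avg_iops for s in target) / count,
        sum(s.avg_bandwidth for s in target) / count,
    )


# Hypothetical values: 180 s windows rolled up into a 540 s window (3 statistics).
history = [PerfStat(100, 10.0), PerfStat(200, 20.0),
           PerfStat(300, 30.0), PerfStat(400, 40.0)]
result = rollup(history, first_s=180, second_s=540)
```

The same routine, called with a different coarse window, would fit any other granularity that is a multiple of the first one.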
In an alternative embodiment, the determining the performance index of the distributed storage system according to the average read-write IOPS and the average read-write bandwidth of the distributed storage system within the second preset granularity includes:
determining a latest performance index of the distributed storage system according to the average read-write IOPS and the average read-write bandwidth of the distributed storage system in the second preset granularity;
determining a maximum performance index of the distributed storage system according to the latest performance index and the historical performance index of the distributed storage system;
and taking the maximum performance index of the distributed storage system as the performance index of the distributed storage system.
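A minimal sketch of keeping the maximum performance index is shown below. The text does not say how two indexes, each made of an IOPS value and a bandwidth value, are compared; the field-by-field maximum used here is an assumption, as are the names `PerfIndex` and `update_max`.

```python
from typing import NamedTuple, Optional


class PerfIndex(NamedTuple):
    iops: float
    bandwidth: float


def update_max(latest: PerfIndex, historical: Optional[PerfIndex]) -> PerfIndex:
    """Keep the largest value seen so far, compared per field (an assumption)."""
    if historical is None:
        return latest                      # first sample: nothing to compare against
    return PerfIndex(max(latest.iops, historical.iops),
                     max(latest.bandwidth, historical.bandwidth))


best = None
for sample in (PerfIndex(500, 80.0), PerfIndex(650, 60.0), PerfIndex(400, 90.0)):
    best = update_max(sample, best)
```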
In an alternative embodiment, the determining the current service pressure of the distributed storage system according to the historical operation data includes:
determining a second total number of the performance statistics according to the ratio between the first preset granularity and the third preset granularity;
obtaining the latest consecutive performance statistics, equal in number to the second total number, as a second target set;
determining average read-write IOPS and average read-write bandwidth of the distributed storage system in the third preset granularity according to the second target set;
determining the current service pressure of the distributed storage system according to the average read-write IOPS and the average read-write bandwidth of the distributed storage system in the third preset granularity;
wherein the third preset granularity is greater than the first preset granularity and less than the second preset granularity.
In an optional implementation manner, the determining the target garbage collection policy according to the magnitude relation between the performance index and the current service pressure includes:
determining the current garbage collection level of the distributed storage system according to the size relation between the performance index and the current service pressure;
and determining the target garbage collection strategy according to the current garbage collection level of the distributed storage system.
In an optional embodiment, the determining the current garbage collection level of the distributed storage system according to the magnitude relation between the performance index and the current service pressure includes:
determining a performance comparison value corresponding to the performance index according to a preset proportion;
and when the performance comparison value is smaller than the current service pressure, determining that the current garbage collection level of the distributed storage system is the lowest level.
In an alternative embodiment, the method further comprises:
and when the performance comparison value is not smaller than the current service pressure, determining that the current garbage collection level of the distributed storage system is a general level.
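The level decision can be sketched for a single scalar metric (for example, IOPS). This is an illustration only: the text does not state how the preset proportion is applied, so multiplying the performance index by it is an assumption, as are the default value `0.8` and the names `GcLevel` and `gc_level`.

```python
from enum import Enum


class GcLevel(Enum):
    LOWEST = "lowest"
    GENERAL = "general"


def gc_level(performance_index: float, service_pressure: float,
             proportion: float = 0.8) -> GcLevel:
    """Classify the garbage collection level from one scalar metric."""
    comparison_value = performance_index * proportion  # performance comparison value
    if comparison_value < service_pressure:
        return GcLevel.LOWEST    # comparison value is smaller than the pressure
    return GcLevel.GENERAL       # comparison value is not smaller than the pressure


busy = gc_level(performance_index=1000.0, service_pressure=900.0)
idle = gc_level(performance_index=1000.0, service_pressure=300.0)
```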
In an alternative embodiment, said determining said target garbage collection policy according to a current garbage collection level of said distributed storage system comprises:
when the current garbage collection level of the distributed storage system is the lowest level, determining an object with the invalid data occupation ratio reaching a preset threshold as an object to be collected;
wherein the target garbage collection policy includes an object to be collected.
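Selecting the objects to be collected at the lowest level amounts to a threshold filter. The sketch below assumes each object is identified by a string and that its invalid-data ratio is already known; the function name, the dictionary layout, and the threshold value are hypothetical.

```python
from typing import Dict, List


def objects_to_collect(invalid_ratio: Dict[str, float], threshold: float) -> List[str]:
    """Return the ids of objects whose invalid-data ratio reaches the threshold."""
    return [obj for obj, ratio in invalid_ratio.items() if ratio >= threshold]


ratios = {"obj-a": 0.95, "obj-b": 0.40, "obj-c": 0.80}
selected = objects_to_collect(ratios, threshold=0.8)
```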
In an alternative embodiment, said determining said target garbage collection policy according to a current garbage collection level of said distributed storage system comprises:
when the current garbage collection level of the distributed storage system is a general level, determining expected read-write IOPS and bandwidth limit values of garbage collection according to the difference value between the performance comparison value and the current service pressure;
wherein the target garbage collection policy includes expected read-write IOPS and bandwidth limit values for garbage collection.
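One possible reading of the general-level rule is that garbage collection receives the headroom left between the performance comparison value and the current service pressure. The sketch below takes that reading literally; using the difference directly as the limit, clamping it at zero, and the names `Load` and `gc_limits` are assumptions not confirmed by the text.

```python
from typing import NamedTuple


class Load(NamedTuple):
    iops: float
    bandwidth: float


def gc_limits(comparison_value: Load, service_pressure: Load) -> Load:
    """Use the remaining headroom as the expected IOPS and bandwidth limits."""
    return Load(
        max(comparison_value.iops - service_pressure.iops, 0.0),
        max(comparison_value.bandwidth - service_pressure.bandwidth, 0.0),
    )


limits = gc_limits(Load(800.0, 400.0), Load(300.0, 150.0))
```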
In an alternative embodiment, said determining target garbage collection parameters of said distributed storage system according to said target garbage collection policy comprises:
determining a target write bandwidth and a target read bandwidth of a garbage collection process of the distributed storage system according to the bandwidth limit value represented by the target garbage collection strategy;
the target garbage collection parameter at least comprises a target write bandwidth and a target read bandwidth of the garbage collection process.
In an alternative embodiment, the method further comprises:
acquiring the effective data volume of the garbage collection process;
determining the dormancy time of the garbage collection process according to the target write bandwidth of the garbage collection process and the effective data volume;
the target garbage collection parameter at least comprises the dormancy time of the garbage collection process.
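A common way to hold a background writer to a target bandwidth is to sleep after each batch for whatever remains of the time that batch is allowed to take. The sketch below assumes that approach; the formula, the `elapsed_s` argument, and the function name are assumptions, since the text only says that the sleep time depends on the target write bandwidth and the valid data volume.

```python
def sleep_seconds(valid_bytes: int, target_write_bw: float,
                  elapsed_s: float = 0.0) -> float:
    """Seconds to sleep so that rewriting valid_bytes averages target_write_bw bytes/s."""
    if target_write_bw <= 0:
        raise ValueError("target_write_bw must be positive")
    budget = valid_bytes / target_write_bw   # time the batch may take at the target rate
    return max(budget - elapsed_s, 0.0)      # never sleep a negative duration


# Hypothetical batch: 4 MiB of valid data, 2 MiB/s target, 0.5 s already spent writing.
pause = sleep_seconds(valid_bytes=4 * 1024 * 1024,
                      target_write_bw=2 * 1024 * 1024,
                      elapsed_s=0.5)
```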
In an alternative embodiment, the method further comprises:
acquiring the storage pool capacity occupation ratio of the distributed storage system;
and under the condition that the capacity occupancy ratio of the storage pool reaches a preset occupancy ratio threshold, if the actual garbage collection rate of the garbage collection process is smaller than the current actual service pressure, determining a parameter optimization strategy of the garbage collection process according to the performance index of the distributed storage system.
In an optional implementation manner, the determining the parameter optimization strategy of the garbage collection process according to the performance index of the distributed storage system includes:
re-determining the expected read-write IOPS and bandwidth limit values of garbage collection from the performance index of the distributed storage system and a preset parameter optimization proportion, to obtain an optimized read-write IOPS and an optimized bandwidth limit value;
and determining a parameter optimization strategy of the garbage collection process according to the optimized read-write IOPS and the optimized bandwidth limit value.
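The trigger condition and the re-determination of the limits can be sketched together. This is a hedged illustration: scaling the performance index by the optimization proportion is an assumed formula, and every name and numeric value below is hypothetical.

```python
from typing import Optional, Tuple


def optimized_limits(perf_iops: float, perf_bw: float,
                     pool_usage: float, usage_threshold: float,
                     gc_rate: float, actual_pressure: float,
                     optimization_proportion: float) -> Optional[Tuple[float, float]]:
    """Return re-determined (IOPS, bandwidth) limits, or None when not triggered."""
    if pool_usage < usage_threshold:
        return None   # pool occupancy has not reached the preset threshold
    if gc_rate >= actual_pressure:
        return None   # collection rate is not smaller than the service pressure
    # Assumed formula: scale the performance index by the optimization proportion.
    return (perf_iops * optimization_proportion,
            perf_bw * optimization_proportion)


raised = optimized_limits(1000.0, 400.0, pool_usage=0.85, usage_threshold=0.8,
                          gc_rate=50.0, actual_pressure=120.0,
                          optimization_proportion=0.5)
unchanged = optimized_limits(1000.0, 400.0, pool_usage=0.5, usage_threshold=0.8,
                             gc_rate=50.0, actual_pressure=120.0,
                             optimization_proportion=0.5)
```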
A second aspect of the present application provides a garbage collection device for a distributed storage system, including:
the acquisition module is used for acquiring historical operation data of the distributed storage system;
the determining module is used for determining the performance index and the current service pressure of the distributed storage system according to the historical operation data;
the strategy making module is used for determining a target garbage recycling strategy according to the size relation between the performance index and the current service pressure;
and the recycling module is used for determining target garbage recycling parameters of the distributed storage system according to the target garbage recycling strategy so as to recycle garbage of the distributed storage system based on the target garbage recycling parameters.
A third aspect of the present application provides a distributed storage system comprising: a plurality of storage nodes;
the storage node performs garbage collection using the method described above in the first aspect and the various possible designs of the first aspect.
A fourth aspect of the present application provides an electronic device, comprising: at least one processor and memory;
the memory stores computer-executable instructions;
the at least one processor executes the computer-executable instructions stored by the memory such that the at least one processor performs the method as described above in the first aspect and the various possible designs of the first aspect.
A fifth aspect of the application provides a computer readable storage medium having stored therein computer executable instructions which when executed by a processor implement the method as described above for the first aspect and the various possible designs of the first aspect.
The technical scheme of the application has the following advantages:
the application provides a garbage collection method and device for a distributed storage system and the distributed storage system, wherein the method comprises the following steps: acquiring historical operation data of a distributed storage system; determining performance indexes and current service pressure of the distributed storage system according to the historical operation data; determining a target garbage collection strategy according to the size relation between the performance index and the current service pressure; and determining target garbage collection parameters of the distributed storage system according to the target garbage collection strategy, so as to collect garbage from the distributed storage system based on the target garbage collection parameters. According to the method provided by the scheme, the performance index and the current service pressure of the distributed storage system are comprehensively considered, and the target garbage collection parameter is set, so that the problem that the garbage collection influences the normal processing of the distributed storage system on the service is avoided, and the stability of the distributed storage system is improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, a brief description will be given below of the drawings required for the embodiments or the prior art descriptions, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings for a person having ordinary skill in the art.
FIG. 1 is a schematic diagram of a network on which an embodiment of the present application is based;
FIG. 2 is a schematic flow chart of a garbage collection method for a distributed storage system according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an exemplary performance index provided by an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a garbage collection device of a distributed storage system according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a distributed storage system according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Specific embodiments of the present application have been shown by way of the above drawings and will be described in more detail below. These drawings and the written description are not intended to limit the scope of the disclosed concept in any way, but to illustrate the inventive concept to those skilled in the art by reference to specific embodiments.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. In the following description of the embodiments, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
Currently, one scheme used in distributed storage scenarios partitions an object storage device (Object-based Storage Device, abbreviated as OSD) and deploys it on a high-speed medium such as a solid-state disk (SSD). A high-speed storage pool is created using the OSD, a low-speed, large-capacity storage pool is also created using the OSD, and the two storage pools are bound together. All data requests from clients pass through the high-speed storage pool, which aggregates the data into large 4M objects and writes them to the low-speed storage pool. In this append-write mode, all data in the low-speed storage pool is allocated new physical space: when original data is modified, the modified part of the original data becomes invalid, the new data is written to newly allocated space, and the two are managed through a mapping relationship. After a period of random writes, more and more invalid data accumulates, and garbage collection is needed to maintain the space utilization of the storage pool. In append-write mode, garbage collection must first read out the valid part of each object to be collected, then aggregate it and rewrite it into newly allocated space. A common garbage collection strategy adjusts the collection speed based on the water level, where the water level is the current data occupancy of the distributed storage system: the higher the water level, the faster the collection. The drawback of this strategy is that it does not consider the garbage collection speed and the service load together. When the collection speed is high while the service load is also high, the impact on the service is large, and the performance of the disk is not fully utilized.
In order to solve the above problems, the method and device for recycling garbage in a distributed storage system and the distributed storage system provided by the embodiments of the present application include: acquiring historical operation data of a distributed storage system; determining performance indexes and current service pressure of the distributed storage system according to the historical operation data; determining a target garbage collection strategy according to the size relation between the performance index and the current service pressure; and determining target garbage collection parameters of the distributed storage system according to the target garbage collection strategy, so as to collect garbage from the distributed storage system based on the target garbage collection parameters. According to the method provided by the scheme, the performance index and the current service pressure of the distributed storage system are comprehensively considered, and the target garbage collection parameter is set, so that the problem that the garbage collection influences the normal processing of the distributed storage system on the service is avoided, and the stability of the distributed storage system is improved.
The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
First, the network structure on which the present application is based will be described:
The garbage collection method and device for the distributed storage system and the distributed storage system are suitable for garbage collection of invalid data in the distributed storage system to release storage space. Fig. 1 is a schematic structural diagram of a network according to an embodiment of the present application, which mainly includes a garbage collection device and a distributed storage system. The garbage collection device determines target garbage collection parameters based on the method provided by the embodiment of the application, so as to collect garbage from invalid data in the distributed storage system based on the target garbage collection parameters.
The embodiment of the application provides a garbage collection method for a distributed storage system, which is used to collect invalid data in the distributed storage system as garbage so as to release storage space. The execution subject of the embodiment is an electronic device, such as a server, a desktop computer, a notebook computer, a tablet computer, or any other electronic device that can be used to perform garbage collection on invalid data in a distributed storage system.
As shown in fig. 2, a flow chart of a garbage collection method of a distributed storage system according to an embodiment of the present application is shown, where the method includes:
Step 201, historical operating data of a distributed storage system is obtained.
The historical operation data can be an operation log of the distributed storage system over a past period, and the operation log includes at least the size and the number of the read-write data objects.
Specifically, the size and number of the read-write data objects can be obtained at the point where the data read-write interface of the distributed storage system is called; that is, the operation (OP) count and the object size are accumulated at the call interface. The OP count is used to compute the read-write input/output operations per second (Input/Output Operations Per Second, IOPS for short) of an object storage device (OSD) of the distributed storage system, and the object size is used to compute the read-write bandwidth of the OSD.
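The accumulation at the call interface can be sketched as a pair of counters. This is a minimal illustration, not the patent's implementation: the class name `IoCounter`, its method names, and the explicit `elapsed_seconds` argument are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class IoCounter:
    """Accumulates the OP count and object sizes at a (hypothetical) read-write call site."""
    op_count: int = 0      # number of read-write operations seen
    byte_count: int = 0    # total size of the objects read or written

    def record(self, object_size: int) -> None:
        # Called once per read-write request: one more OP, plus its object size.
        self.op_count += 1
        self.byte_count += object_size

    def rates(self, elapsed_seconds: float) -> Tuple[float, float]:
        # IOPS = operations / seconds; bandwidth = bytes / seconds.
        if elapsed_seconds <= 0:
            raise ValueError("elapsed_seconds must be positive")
        return (self.op_count / elapsed_seconds,
                self.byte_count / elapsed_seconds)


counter = IoCounter()
for size in (4096, 8192, 4096):
    counter.record(size)
iops, bandwidth = counter.rates(elapsed_seconds=2.0)
```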
Step 202, determining performance indexes and current service pressure of the distributed storage system according to the historical operation data.
Specifically, the read-write IOPS and read-write bandwidth of the object storage device in the past period of time of the distributed storage system can be determined by analyzing the historical operation data, so that the performance index and the current service pressure of the distributed storage system are determined.
Step 203, determining a target garbage collection strategy according to the magnitude relationship between the performance index and the current service pressure.
Specifically, the magnitude relationship between the performance index and the current service pressure determines whether the garbage collection process should currently adopt a high-rate or a low-rate garbage collection strategy, which avoids a high-rate strategy interfering with the service processing of the distributed storage system while the current service pressure is high.
Step 204, determining target garbage collection parameters of the distributed storage system according to the target garbage collection strategy, so as to perform garbage collection on the distributed storage system based on the target garbage collection parameters.
Specifically, the target garbage collection parameters of the garbage collection process of the distributed storage system can be determined according to the target garbage collection strategy, so as to configure the garbage collection rate of that process. This ensures that, when the garbage collection process collects garbage from the distributed storage system based on the target garbage collection parameters, it does not affect the system's processing of other services.
On the basis of the above embodiment, as a practical implementation manner, in an embodiment, determining, according to historical operation data, a performance index of the distributed storage system includes:
Step 2021, determining the read-write IOPS and read-write bandwidth of the distributed storage system according to the historical operation data;
Step 2022, determining the performance index of the distributed storage system according to the read-write IOPS and the read-write bandwidth of the distributed storage system.
It should be noted that, the performance index of the distributed storage system may specifically be an average read-write IOPS and an average read-write bandwidth in a past operation period of the distributed storage system.
Specifically, in an embodiment, according to a first preset granularity, determining a performance statistics result of the distributed storage system according to the read-write IOPS and the read-write bandwidth of the distributed storage system; and determining the performance index of the distributed storage system according to the performance statistical result of the distributed storage system.
The first preset granularity may be three minutes. Specifically, a performance statistical result of the distributed storage system may be determined from its read-write IOPS and read-write bandwidth over the past three minutes, and the performance index of the distributed storage system is then determined from a plurality of consecutive three-minute performance statistical results.
Specifically, in an embodiment, according to a first preset granularity, an average read-write IOPS and an average read-write bandwidth of the distributed storage system within the first preset granularity may be determined according to the read-write IOPS and the read-write bandwidth of the distributed storage system; and determining a performance statistical result of the distributed storage system according to the average read-write IOPS and the average read-write bandwidth of the distributed storage system in the first preset granularity.
Specifically, when the first preset granularity is three minutes, the average read-write IOPS and the average read-write bandwidth of the distributed storage system in the first preset granularity can be determined according to the read-write IOPS and the read-write bandwidth accumulated by the distributed storage system in the past three minutes, so that the performance statistical result of the distributed storage system is obtained. The performance statistical result of the distributed storage system comprises average read-write IOPS and average read-write bandwidth of the distributed storage system in each first preset granularity.
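As a minimal illustrative sketch of the three-minute statistic described above (not part of the original disclosure), the following Python snippet converts totals accumulated over one window into per-second averages. The names `PerfStat` and `window_stat`, and the choice of operation counts and bytes as the accumulated units, are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerfStat:
    """Average load over one first-granularity window (hypothetical type)."""
    avg_iops: float        # read-write operations per second
    avg_bandwidth: float   # read-write bytes per second


def window_stat(total_ops: int, total_bytes: int,
                window_seconds: float = 180.0) -> PerfStat:
    """Turn totals accumulated over one window into per-second averages.

    The default of 180 seconds corresponds to the three-minute example.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return PerfStat(total_ops / window_seconds, total_bytes / window_seconds)


# Example: 360,000 operations and 18 GB transferred in three minutes.
stat = window_stat(total_ops=360_000, total_bytes=18_000_000_000)
```

Each call yields one performance statistical result; a sequence of such results is what the later steps aggregate.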
Specifically, in an embodiment, the first aggregate number of the performance statistics may be determined according to a ratio between the first preset granularity and the second preset granularity; obtaining the latest and continuous performance statistics as a first target set according to the first total quantity of the performance statistics; determining the average read-write IOPS and the average read-write bandwidth of the distributed storage system in the second preset granularity according to the first target set; and determining the performance index of the distributed storage system according to the average read-write IOPS and the average read-write bandwidth of the distributed storage system in the second preset granularity.
Wherein the second preset granularity is larger than the first preset granularity.
Specifically, if the second preset granularity is 1 hour and the first preset granularity is 3 minutes, the first aggregate number of the performance statistics is determined to be 20 (60 minutes/3 minutes), so that the latest and continuous 20 performance statistics are obtained as the first target set. And then determining the average read-write IOPS and the average read-write bandwidth of the distributed storage system in the second preset granularity according to the average calculation result of the performance statistics result in the first target set, so as to obtain the performance index of the distributed storage system.
Fig. 3 is an exemplary schematic diagram of the performance index provided in the embodiment of the present application. As the distributed storage system continues to operate, the number of performance statistical results keeps increasing, and the performance index of the distributed storage system is updated as they accumulate to ensure its accuracy; that is, the first target set is continuously refreshed with the latest 20 consecutive performance statistical results to obtain the latest performance index of the distributed storage system. In Fig. 3, the total data amount counted in three minutes is the read-write IOPS and the read-write bandwidth accumulated by the distributed storage system over the past three minutes, and the average of that total is the performance statistical result of the distributed storage system for those three minutes.
If the distributed storage system has just started running and its running time has not yet reached the second preset granularity (1 hour), the average obtained by dividing the sum of the currently available data by the elapsed time may be used as the performance index of the distributed storage system.
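The rolling aggregation over the second preset granularity can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the patented implementation: each statistic is modelled as a hypothetical `(average IOPS, average bandwidth)` tuple, and the helper names `make_window` and `performance_index` are invented for the example. Because all windows have equal length, averaging the statistics collected so far matches dividing the accumulated data by the elapsed time during start-up.

```python
from collections import deque
from typing import Deque, Optional, Tuple

# (average IOPS, average bandwidth) for one first-granularity window
Stat = Tuple[float, float]


def make_window(first_minutes: int = 3, second_minutes: int = 60) -> Deque[Stat]:
    """Create a buffer that keeps only the latest consecutive statistics."""
    count = second_minutes // first_minutes  # 60 / 3 = 20 in the example
    return deque(maxlen=count)


def performance_index(window: Deque[Stat]) -> Optional[Stat]:
    """Average the statistics currently held in the window.

    Before the window is full (system just started), this averages
    whatever statistics exist so far.
    """
    if not window:
        return None
    n = len(window)
    return (sum(s[0] for s in window) / n, sum(s[1] for s in window) / n)


window = make_window()
for i in range(1, 26):            # 25 statistics arrive; only the last 20 are kept
    window.append((float(i), float(10 * i)))
index = performance_index(window)
```

Using a bounded `deque` means old statistics are discarded automatically as new ones arrive, which mirrors the continuous update of the first target set.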
Specifically, in an embodiment, since the distributed storage system is continuously operated, to further ensure objectivity of the finally obtained performance index, a latest performance index of the distributed storage system may be determined according to an average read-write IOPS and an average read-write bandwidth of the distributed storage system within a second preset granularity; determining the maximum performance index of the distributed storage system according to the latest performance index and the historical performance index of the distributed storage system; and taking the maximum performance index of the distributed storage system as the performance index of the distributed storage system.
Wherein the most recent performance index of the distributed storage system is the most recently updated performance index.
Specifically, each time the performance index is updated, it can be compared with the previously obtained historical performance indexes to determine the maximum performance index of the distributed storage system; the maximum performance index is then used as the performance index in the subsequent formulation of the target garbage collection strategy.
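A possible way to track the maximum performance index is sketched below in Python. The text does not state how two indexes, each containing an IOPS value and a bandwidth value, are compared; the sketch assumes a component-wise maximum, and the name `update_max_index` is hypothetical.

```python
from typing import Optional, Tuple

Index = Tuple[float, float]  # (average IOPS, average bandwidth)


def update_max_index(historical_max: Optional[Index], latest: Index) -> Index:
    """Keep the largest IOPS and the largest bandwidth seen so far.

    Assumption: the two components are maximised independently.
    """
    if historical_max is None:
        return latest
    return (max(historical_max[0], latest[0]),
            max(historical_max[1], latest[1]))


best: Optional[Index] = None
for latest in [(1000.0, 50.0), (1500.0, 40.0), (1200.0, 60.0)]:
    best = update_max_index(best, latest)
```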
On the basis of the above embodiment, as a practical implementation manner, in an embodiment, determining the current service pressure of the distributed storage system according to the historical operation data includes:
step 2023, determining a second aggregate number of the performance statistics according to a ratio between the first preset granularity and the third preset granularity;
step 2024, obtaining the latest and continuous performance statistics as a second target set according to the second aggregate number of performance statistics;
step 2025, determining, according to the second target set, an average read-write IOPS and an average read-write bandwidth of the distributed storage system within a third preset granularity;
in step 2026, the current service pressure of the distributed storage system is determined according to the average read-write IOPS and the average read-write bandwidth of the distributed storage system within the third preset granularity.
Wherein the third preset granularity is larger than the first preset granularity and smaller than the second preset granularity.
Specifically, when the first preset granularity is 3 minutes, the second preset granularity is 1 hour, the third preset granularity may be 12 minutes, the second total number of the performance statistics results is 4, and specifically, the latest and continuous 4 performance statistics results may be obtained as the second target set. And then determining the average read-write IOPS and the average read-write bandwidth of the distributed storage system in the third preset granularity according to the average calculation result of the performance statistics result in the second target set, so as to obtain the current service pressure of the distributed storage system.
For a distributed storage system whose running time has not yet reached the third preset granularity (12 minutes), the average obtained by dividing the sum of the currently available data by the elapsed time may be used as the current service pressure of the distributed storage system.
It should be noted that, to ensure the accuracy of the finally determined current service pressure, the current service pressure may be continuously updated as the running time of the distributed storage system increases. For the specific manner of determining the current service pressure, reference may be made to the manner of determining the performance index provided by the foregoing embodiment; the two processes are similar, the only difference being the preset granularity used.
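The current service pressure can be sketched in the same style, using the latest four statistics when the third preset granularity is 12 minutes. This Python snippet is an illustration under assumptions: statistics are hypothetical `(average IOPS, average bandwidth)` tuples held in a list, and `recent_average` is an invented helper name.

```python
from typing import List, Optional, Tuple

Stat = Tuple[float, float]  # (average IOPS, average bandwidth)


def recent_average(stats: List[Stat], first_minutes: int = 3,
                   target_minutes: int = 12) -> Optional[Stat]:
    """Average the latest consecutive statistics covering target_minutes.

    If fewer statistics exist (short running time), all of them are used.
    """
    count = target_minutes // first_minutes  # 12 / 3 = 4 in the example
    if count < 1:
        raise ValueError("target_minutes must be at least first_minutes")
    recent = stats[-count:]
    if not recent:
        return None
    n = len(recent)
    return (sum(s[0] for s in recent) / n, sum(s[1] for s in recent) / n)


history = [(100.0, 10.0), (200.0, 20.0), (300.0, 30.0),
           (400.0, 40.0), (500.0, 50.0)]
pressure = recent_average(history)
```

Calling the same helper with `target_minutes=60` would give the performance index, which reflects the statement that the two computations differ only in granularity.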
On the basis of the foregoing embodiment, as an implementation manner, in an embodiment, determining the target garbage collection policy according to the magnitude relation between the performance index and the current service pressure includes:
step 2031, determining a current garbage collection level of the distributed storage system according to a size relation between the performance index and the current service pressure;
step 2032, determining a target garbage collection policy according to the current garbage collection level of the distributed storage system.
Specifically, in an embodiment, to further prevent the running garbage collection process from affecting the distributed storage system's processing of other services, a performance comparison value corresponding to the performance index may be determined according to a preset proportion; when the performance comparison value is smaller than the current service pressure, the current garbage collection level of the distributed storage system is determined to be the lowest level.
Accordingly, in one embodiment, when the performance comparison value is not less than the current traffic pressure, the current garbage collection level of the distributed storage system is determined to be a general level.
Wherein, the preset proportion can be 80%.
Specifically, if 80% of the performance index (the performance comparison value) is smaller than the current service pressure, the current service load of the distributed storage system is determined to be heavy and high-rate garbage collection is not suitable, so the current garbage collection level of the distributed storage system is determined to be the lowest level. If 80% of the performance index (the performance comparison value) is not smaller than the current service pressure, the current service load of the distributed storage system is determined to be light and high-rate garbage collection can be performed, so the current garbage collection level of the distributed storage system is determined to be the general level.
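The level decision can be illustrated with a short Python sketch. For simplicity it compares a single scalar metric (for example IOPS only); how the IOPS and bandwidth components are combined is not specified in the text, so that simplification, the `GcLevel` enum, and the function name `gc_level` are all assumptions.

```python
from enum import Enum


class GcLevel(Enum):
    LOWEST = "lowest"
    GENERAL = "general"


def gc_level(performance_index: float, current_pressure: float,
             preset_percent: int = 80) -> GcLevel:
    """Pick the garbage collection level from one scalar metric.

    The performance comparison value is preset_percent of the index.
    """
    comparison_value = performance_index * preset_percent / 100
    if comparison_value < current_pressure:
        return GcLevel.LOWEST    # heavy service load: collect conservatively
    return GcLevel.GENERAL       # light service load: higher rate is allowed


busy = gc_level(10_000, 9_000)   # comparison value 8000 < 9000
idle = gc_level(10_000, 5_000)   # comparison value 8000 >= 5000
```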
Further, in an embodiment, when the current garbage collection level of the distributed storage system is the lowest level, an object whose invalid data occupancy ratio reaches a preset threshold is determined as an object to be collected.
Wherein the target garbage collection policy includes objects to be collected.
Specifically, when the current garbage collection level is the lowest level, only invalid data in objects whose invalid data accounts for 90% of the whole object data is collected; that is, an object whose invalid data occupancy ratio reaches the preset threshold (90%) is determined as an object to be collected. After the object to be collected is determined, the default low-speed garbage collection parameter can be used as the target garbage collection parameter for the subsequent garbage collection operation.
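Selecting the objects to be collected at the lowest level can be sketched as a simple filter. The `StoredObject` type, its fields, and the function name `objects_to_collect` are hypothetical and serve only to illustrate the 90% threshold described above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StoredObject:
    """Hypothetical view of one aggregated object in the low-speed pool."""
    name: str
    total_bytes: int
    invalid_bytes: int


def objects_to_collect(objects: List[StoredObject],
                       threshold_percent: int = 90) -> List[StoredObject]:
    """Return objects whose invalid-data share reaches the threshold.

    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    return [
        o for o in objects
        if o.total_bytes > 0
        and o.invalid_bytes * 100 >= threshold_percent * o.total_bytes
    ]


pool = [
    StoredObject("a", 1000, 900),   # exactly 90% invalid: selected
    StoredObject("b", 1000, 899),   # just below the threshold: skipped
    StoredObject("c", 1000, 1000),  # fully invalid: selected
]
selected = [o.name for o in objects_to_collect(pool)]
```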
Accordingly, in one embodiment, when the current garbage collection level of the distributed storage system is a general level, the expected read-write IOPS and bandwidth limit values of the garbage collection are determined according to the difference between the performance comparison value and the current traffic pressure.
The target garbage collection strategy comprises expected read-write IOPS and bandwidth limiting values of garbage collection.
Specifically, the current service pressure may be subtracted from the performance comparison value to obtain the expected read-write IOPS and bandwidth limit for garbage collection. Since the performance index and the current traffic pressure are continuously updated, the expected read-write IOPS and bandwidth limit values for garbage collection are updated accordingly.
When the current service pressure is 0, the average read-write IOPS and the average read-write bandwidth represented by the performance comparison value can be directly used as the expected read-write IOPS and the bandwidth limit value for garbage collection.
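The subtraction described above can be illustrated as follows. This Python sketch treats the performance index and the current service pressure as hypothetical `(IOPS, bandwidth)` tuples; the function name `gc_limits` and the clamp at zero are assumptions added as a defensive guard and are not taken from the text.

```python
from typing import Tuple

Pair = Tuple[float, float]  # (IOPS, bandwidth)


def gc_limits(performance_index: Pair, current_pressure: Pair,
              preset_percent: int = 80) -> Pair:
    """Expected IOPS and bandwidth limit for garbage collection.

    limit = performance comparison value - current service pressure.
    The max(..., 0.0) clamp is an added guard, not from the source.
    """
    comparison = tuple(v * preset_percent / 100 for v in performance_index)
    return (max(comparison[0] - current_pressure[0], 0.0),
            max(comparison[1] - current_pressure[1], 0.0))


limits = gc_limits((10_000, 500), (3_000, 100))
idle_limits = gc_limits((10_000, 500), (0, 0))
```

With zero service pressure the result equals the performance comparison value itself, matching the special case stated in the text.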
Further, in an embodiment, a target write bandwidth and a target read bandwidth of a garbage collection process of the distributed storage system may be determined based on a bandwidth limit value characterized by a target garbage collection policy.
The target garbage collection parameters at least comprise a target write bandwidth and a target read bandwidth of the garbage collection process.
It should be noted that, when the garbage collection process runs, the valid data in the object to be collected is read out, then the read valid data is written into the newly allocated space, and finally the object to be collected is deleted integrally, so that the parameters of the garbage collection process include the target write bandwidth and the target read bandwidth.
Specifically, according to the maximum read bandwidth and the maximum write bandwidth characterized by the bandwidth limiting value, setting a target write bandwidth and a target read bandwidth of the garbage collection process, for example, taking the maximum write bandwidth as the target write bandwidth and taking the maximum read bandwidth as the target read bandwidth.
On the basis of the foregoing embodiment, as a practical manner, to improve the stability of the garbage recycling process, in one embodiment, the method further includes:
step 301, obtaining effective data volume of a garbage collection process;
step 302, determining the sleep time of the garbage collection process according to the target write bandwidth and the effective data volume of the garbage collection process.
The target garbage collection parameter at least comprises a sleep time of a garbage collection process.
It should be noted that, since the garbage collection process needs to write valid data into the newly allocated space, the object storage device of the distributed storage system has a fixed data writing requirement, for example, after aggregating the data to be written into a 4M large object, it writes the data into the low-speed storage pool.
Specifically, each time the garbage collection process is started, the effective data amount (for example, 4M) of the garbage collection process is first obtained; that is, the garbage collection process must aggregate 4M of valid data before the data can be written into the low-speed storage pool. The sleep time of the garbage collection process can then be determined from the ratio of the effective data amount (4M) to the target write bandwidth (8M/s), which gives 0.5 s. If the target write bandwidth is smaller than the effective data amount, the default minimum sleep time is used as the sleep time of the garbage collection process.
When the garbage collection process reads valid data, it needs to read the valid data of a plurality of objects to be collected. If whole objects are read, the sleep time of the garbage collection process needs to be set with the maximum read bandwidth as the limiting condition; if only the valid data portion of each object is read, the sleep time of the garbage collection process can be set with the expected read IOPS as the limiting condition.
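The sleep-time calculation from the 4M and 8M/s example can be sketched in Python as below. The function name `gc_sleep_seconds` and the default minimum sleep value of 0.1 s are assumptions; the text mentions a default minimum sleep time but does not give its value.

```python
MIB = 1024 * 1024


def gc_sleep_seconds(target_write_bandwidth: float, effective_data: float,
                     min_sleep: float = 0.1) -> float:
    """Sleep time between writes of one aggregated batch of valid data.

    target_write_bandwidth is in bytes per second, effective_data in bytes.
    min_sleep is a hypothetical default; the source gives no value.
    """
    if effective_data <= 0:
        raise ValueError("effective_data must be positive")
    if target_write_bandwidth < effective_data:
        # Follows the text: fall back to the default minimum sleep time.
        return min_sleep
    return effective_data / target_write_bandwidth


sleep_time = gc_sleep_seconds(8 * MIB, 4 * MIB)   # 4M / (8M/s)
```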
Specifically, in an embodiment, since the actual running capability of the garbage collection process may not reach the preset target garbage collection parameter, in order to further improve the garbage collection efficiency, the storage pool capacity occupation ratio of the distributed storage system may be obtained; and under the condition that the capacity occupancy ratio of the storage pool reaches a preset occupancy ratio threshold, if the actual garbage collection rate of the garbage collection process is smaller than the current actual service pressure, determining a parameter optimization strategy of the garbage collection process according to the performance index of the distributed storage system.
The storage pool capacity occupation ratio of the distributed storage system is the current data occupation degree of the distributed storage system.
Specifically, when the capacity occupancy ratio of the storage pool of the distributed storage system reaches a preset occupancy ratio threshold (90%), and the actual garbage collection rate of the garbage collection process is smaller than the current actual service pressure, the rate of the garbage collection process can be increased appropriately, namely, a parameter optimization strategy of the garbage collection process is determined.
Specifically, in an embodiment, according to a preset parameter optimization ratio, according to a performance index of the distributed storage system, the expected read-write IOPS and the bandwidth limit value of garbage collection are redetermined to obtain an optimized read-write IOPS and an optimized bandwidth limit value; and determining a parameter optimization strategy of the garbage collection process according to the optimized read-write IOPS and the optimized bandwidth limit value.
Specifically, the garbage collection rate of the garbage collection process can be adjusted to 70% (the preset parameter optimization ratio); that is, 70% of the performance index is taken as the expected read-write IOPS and bandwidth limit value of garbage collection, which gives the optimized read-write IOPS and the optimized bandwidth limit value. The parameter optimization strategy of the garbage collection process is then redetermined according to the optimized read-write IOPS and the optimized bandwidth limit value. This raises the garbage collection rate of the garbage collection process, applies back pressure to the service so that the service pressure is reduced, and speeds up the release of garbage data in the storage pool so that space is freed as soon as possible.
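The parameter optimization condition can be illustrated with the following Python sketch. It assumes the actual garbage collection rate and the current actual service pressure are scalars in the same unit, and that limits are hypothetical `(IOPS, bandwidth)` tuples; the function name `optimized_limits` is invented for the example.

```python
from typing import Tuple

Pair = Tuple[float, float]  # (IOPS, bandwidth)


def optimized_limits(performance_index: Pair, current_limits: Pair,
                     pool_used_percent: float, actual_gc_rate: float,
                     actual_pressure: float,
                     occupancy_threshold: float = 90,
                     optimize_percent: int = 70) -> Pair:
    """Raise the GC limits when the pool is nearly full and GC lags behind.

    Returns optimize_percent of the performance index when both conditions
    hold; otherwise the current limits are kept unchanged.
    """
    pool_nearly_full = pool_used_percent >= occupancy_threshold
    gc_lagging = actual_gc_rate < actual_pressure
    if pool_nearly_full and gc_lagging:
        return (performance_index[0] * optimize_percent / 100,
                performance_index[1] * optimize_percent / 100)
    return current_limits


boosted = optimized_limits((10_000, 500), (2_000, 100),
                           pool_used_percent=92,
                           actual_gc_rate=100, actual_pressure=300)
```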
According to the garbage collection method for the distributed storage system, provided by the embodiment of the application, historical operation data of the distributed storage system are obtained; determining performance indexes and current service pressure of the distributed storage system according to the historical operation data; determining a target garbage collection strategy according to the size relation between the performance index and the current service pressure; and determining target garbage collection parameters of the distributed storage system according to the target garbage collection strategy, so as to collect garbage from the distributed storage system based on the target garbage collection parameters. According to the method provided by the scheme, the performance index and the current service pressure of the distributed storage system are comprehensively considered, and the target garbage collection parameter is set, so that the problem that the garbage collection influences the normal processing of the distributed storage system on the service is avoided, and the stability of the distributed storage system is improved. And by dynamically adjusting the garbage collection rate according to the actual running condition of the distributed storage system, the garbage collection speed and the service pressure are balanced on the basis of fully utilizing the performance of the magnetic disk, the influence of garbage collection on the service pressure is reduced, and the performance of the storage cluster is improved.
The embodiment of the application provides a garbage collection device of a distributed storage system, which is used for executing the garbage collection method of the distributed storage system.
Fig. 4 is a schematic structural diagram of a garbage collection device of a distributed storage system according to an embodiment of the present application. The garbage collection device 40 of the distributed storage system includes: an acquisition module 401, a determination module 402, a policy making module 403, and a reclamation module 404.
The acquisition module is used for acquiring historical operation data of the distributed storage system; the determining module is used for determining the performance index and the current service pressure of the distributed storage system according to the historical operation data; the strategy making module is used for determining a target garbage recycling strategy according to the size relation between the performance index and the current service pressure; and the recycling module is used for determining target garbage recycling parameters of the distributed storage system according to the target garbage recycling strategy so as to recycle garbage of the distributed storage system based on the target garbage recycling parameters.
The specific manner in which the individual modules perform the operations of the distributed storage system garbage collection device of this embodiment has been described in detail in connection with embodiments of the method, and will not be described in detail herein.
The garbage collection device for the distributed storage system provided by the embodiment of the application is used for executing the garbage collection method for the distributed storage system provided by the embodiment of the application, and the implementation mode and the principle are the same and are not repeated.
The embodiment of the application provides a distributed storage system, which is used for executing the garbage collection method of the distributed storage system.
Fig. 5 is a schematic structural diagram of a distributed storage system according to an embodiment of the present application. The distributed storage system 50 includes: a plurality of storage nodes.
The storage node adopts the garbage collection method of the distributed storage system provided by the embodiment to collect garbage.
The storage node creates partitions on a high-speed medium such as a Solid State Disk (SSD), deploys Object-based Storage Devices (OSDs) on them, and uses these OSDs to create a high-speed storage pool; it likewise creates partitions on a low-speed medium such as a Hard Disk Drive (HDD), deploys OSDs on them, and uses these OSDs to create a low-speed, large-capacity storage pool. The two storage pools are bound together: all data requests from clients pass through the high-speed storage pool first, and the data is then aggregated by the storage pool into 4M large objects and written into the low-speed storage pool. A garbage collection process runs on the storage node and performs garbage collection on invalid data in the low-speed storage pool of the storage node to release storage space.
The embodiment of the application provides electronic equipment for executing the garbage collection method of the distributed storage system.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device 60 includes: at least one processor 61 and a memory 62.
The memory stores computer-executable instructions; at least one processor executes computer-executable instructions stored in the memory, causing the at least one processor to perform the distributed storage system garbage collection method as provided by the embodiments above.
The implementation manner and principle of the electronic device provided by the embodiment of the application are the same as those of the method embodiments above and are not repeated here.
The embodiment of the application provides a computer readable storage medium, wherein computer execution instructions are stored in the computer readable storage medium, and when a processor executes the computer execution instructions, the garbage collection method of the distributed storage system provided by any embodiment is realized.
The storage medium containing computer executable instructions in the embodiments of the present application may be used to store the computer executable instructions of the garbage collection method of the distributed storage system provided in the foregoing embodiments, and the implementation manner and principle of the computer executable instructions are the same and are not repeated.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is merely a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection via some interfaces, devices, or units, and may be in electrical, mechanical, or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in hardware plus software functional units.
The integrated units implemented in the form of software functional units described above may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium, and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor (processor) to perform part of the steps of the methods according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional modules is illustrated, and in practical application, the above-described functional allocation may be performed by different functional modules according to needs, i.e. the internal structure of the apparatus is divided into different functional modules to perform all or part of the functions described above. The specific working process of the above-described device may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.

Claims (18)

1. A method for garbage collection in a distributed storage system, comprising:
acquiring historical operation data of a distributed storage system;
determining performance indexes and current service pressure of the distributed storage system according to the historical operation data;
determining a target garbage collection strategy according to the size relation between the performance index and the current service pressure;
determining target garbage collection parameters of the distributed storage system according to the target garbage collection strategy, so as to collect garbage from the distributed storage system based on the target garbage collection parameters;
the determining, according to the historical operating data, a performance index of the distributed storage system includes:
determining read-write IOPS and read-write bandwidth of the distributed storage system according to the historical operation data;
determining performance indexes of the distributed storage system according to the read-write IOPS and the read-write bandwidth of the distributed storage system;
the determining the performance index of the distributed storage system according to the read-write IOPS and the read-write bandwidth of the distributed storage system comprises the following steps:
according to a first preset granularity, determining a performance statistical result of the distributed storage system according to the read-write IOPS and the read-write bandwidth of the distributed storage system;
and determining the performance index of the distributed storage system according to the performance statistical result of the distributed storage system.
2. The method of claim 1, wherein determining the performance statistics of the distributed storage system based on the read-write IOPS and the read-write bandwidth of the distributed storage system at a first predetermined granularity comprises:
according to a first preset granularity, determining the average read-write IOPS and the average read-write bandwidth of the distributed storage system in the first preset granularity according to the read-write IOPS and the read-write bandwidth of the distributed storage system;
and determining a performance statistical result of the distributed storage system according to the average read-write IOPS and the average read-write bandwidth of the distributed storage system in the first preset granularity.
3. The method of claim 1, wherein determining the performance metrics of the distributed storage system based on the performance statistics of the distributed storage system comprises:
determining a first aggregate number of the performance statistics according to a ratio between a first preset granularity and a second preset granularity;
obtaining the latest and continuous performance statistics as a first target set according to the first total quantity of the performance statistics;
determining average read-write IOPS and average read-write bandwidth of the distributed storage system in the second preset granularity according to the first target set;
determining a performance index of the distributed storage system according to the average read-write IOPS and the average read-write bandwidth of the distributed storage system in the second preset granularity;
wherein the second preset granularity is greater than the first preset granularity.
4. The method of claim 3, wherein the determining the performance index of the distributed storage system based on the average read-write IOPS and the average read-write bandwidth of the distributed storage system within the second preset granularity comprises:
determining a latest performance index of the distributed storage system according to the average read-write IOPS and the average read-write bandwidth of the distributed storage system in the second preset granularity;
determining a maximum performance index of the distributed storage system according to the latest performance index and the historical performance index of the distributed storage system;
and taking the maximum performance index of the distributed storage system as the performance index of the distributed storage system.
5. The method of claim 3, wherein determining a current traffic pressure of the distributed storage system based on the historical operating data comprises:
determining a second total number of the performance statistics according to the ratio between the first preset granularity and the third preset granularity;
obtaining the latest and continuous performance statistics as a second target set according to the second total quantity of the performance statistics;
determining average read-write IOPS and average read-write bandwidth of the distributed storage system in the third preset granularity according to the second target set;
determining the current service pressure of the distributed storage system according to the average read-write IOPS and the average read-write bandwidth of the distributed storage system in the third preset granularity;
wherein the third preset granularity is greater than the first preset granularity and less than the second preset granularity.
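Illustrative note (not part of the claims): claims 3 and 5 together imply three nested granularities, with the third lying between the first and the second. A minimal sketch of that constraint and of the resulting target-set sizes, assuming granularities in whole seconds; `target_set_sizes` is a hypothetical helper.

```python
from typing import Tuple


def target_set_sizes(first_s: int, second_s: int, third_s: int) -> Tuple[int, int]:
    """Return (first total number, second total number) of statistics.

    The first total number sizes the window used for the performance index;
    the second sizes the shorter window used for the current service pressure.
    """
    if not first_s < third_s < second_s:
        raise ValueError("expected first < third < second granularity")
    return second_s // first_s, third_s // first_s
```

For example, with 1-second statistics, a 1-hour window for the performance index, and a 1-minute window for the current service pressure, the target sets hold 3600 and 60 statistics respectively.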
6. The method of claim 1, wherein determining the target garbage collection policy based on the magnitude relationship between the performance index and the current service pressure comprises:
determining the current garbage collection level of the distributed storage system according to the magnitude relationship between the performance index and the current service pressure;
and determining the target garbage collection policy according to the current garbage collection level of the distributed storage system.
7. The method of claim 6, wherein determining the current garbage collection level of the distributed storage system based on the magnitude relationship between the performance index and the current service pressure comprises:
determining a performance comparison value corresponding to the performance index according to a preset proportion;
and when the performance comparison value is smaller than the current service pressure, determining that the current garbage collection level of the distributed storage system is the lowest level.
8. The method of claim 7, wherein the method further comprises:
and when the performance comparison value is not smaller than the current service pressure, determining that the current garbage collection level of the distributed storage system is a general level.
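Illustrative note (not part of the claims): the level decision in claims 7 and 8 can be sketched as a single comparison. This sketch assumes the performance index and the service pressure are scalars in the same unit, and that the performance comparison value is the index multiplied by the preset proportion; the default of 0.8 and the names `GcLevel` and `gc_level` are hypothetical.

```python
from enum import Enum


class GcLevel(Enum):
    LOWEST = "lowest"    # service load is high, garbage collection is kept minimal
    GENERAL = "general"  # there is headroom for rate-limited garbage collection


def gc_level(performance_index: float,
             current_pressure: float,
             preset_proportion: float = 0.8) -> GcLevel:
    """Pick the garbage collection level from the comparison value and the pressure."""
    comparison_value = performance_index * preset_proportion  # assumed derivation
    if comparison_value < current_pressure:
        return GcLevel.LOWEST
    return GcLevel.GENERAL
```

When the comparison value equals the current service pressure, the value is "not smaller", so the sketch returns the general level, as claim 8 requires.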
9. The method of claim 7, wherein determining the target garbage collection policy according to the current garbage collection level of the distributed storage system comprises:
when the current garbage collection level of the distributed storage system is the lowest level, determining an object whose proportion of invalid data reaches a preset threshold as an object to be collected;
wherein the target garbage collection policy includes the object to be collected.
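Illustrative note (not part of the claims): at the lowest level, claim 9 restricts collection to objects whose share of invalid data has reached a threshold. A minimal sketch, assuming the invalid-data proportion of each object is already known as a fraction between 0 and 1; `objects_to_collect` and the object identifiers are hypothetical.

```python
from typing import Dict, List


def objects_to_collect(invalid_ratio_by_object: Dict[str, float],
                       threshold: float) -> List[str]:
    """Return the objects whose invalid-data proportion reaches the threshold."""
    return [
        obj_id
        for obj_id, ratio in invalid_ratio_by_object.items()
        if ratio >= threshold
    ]
```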
10. The method of claim 7, wherein determining the target garbage collection policy according to the current garbage collection level of the distributed storage system comprises:
when the current garbage collection level of the distributed storage system is a general level, determining expected read-write IOPS and bandwidth limit values of garbage collection according to the difference value between the performance comparison value and the current service pressure;
wherein the target garbage collection policy includes expected read-write IOPS and bandwidth limit values for garbage collection.
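Illustrative note (not part of the claims): claim 10 derives the garbage collection limits from the difference between the performance comparison value and the current service pressure, without fixing the exact mapping. This sketch assumes the simplest one, in which the limit equals the remaining headroom for IOPS and bandwidth separately; `Load` and `gc_limits` are hypothetical names.

```python
from typing import NamedTuple


class Load(NamedTuple):
    iops: float
    bandwidth: float


def gc_limits(comparison_value: Load, current_pressure: Load) -> Load:
    """Assumed mapping: garbage collection may use whatever headroom the service leaves."""
    return Load(
        iops=max(comparison_value.iops - current_pressure.iops, 0.0),
        bandwidth=max(comparison_value.bandwidth - current_pressure.bandwidth, 0.0),
    )
```

Clamping at zero is a defensive choice of this sketch; at the general level the comparison value is not smaller than the pressure, so the difference is already non-negative.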
11. The method of claim 10, wherein determining the target garbage collection parameter of the distributed storage system according to the target garbage collection policy comprises:
determining a target write bandwidth and a target read bandwidth of a garbage collection process of the distributed storage system according to the bandwidth limit value represented by the target garbage collection policy;
the target garbage collection parameter at least comprises a target write bandwidth and a target read bandwidth of the garbage collection process.
12. The method of claim 11, wherein the method further comprises:
acquiring the effective data volume of the garbage collection process;
determining the dormancy time of the garbage collection process according to the target write bandwidth of the garbage collection process and the effective data volume;
the target garbage collection parameter at least comprises the dormancy time of the garbage collection process.
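Illustrative note (not part of the claims): claim 12 derives the dormancy time from the target write bandwidth and the valid data volume but does not give a formula. One common throttling approach, assumed here, is to sleep for whatever remains of the time budget that the target bandwidth allows for rewriting the valid data; `gc_sleep_seconds` and the `elapsed_s` parameter are hypothetical.

```python
def gc_sleep_seconds(valid_bytes: int,
                     target_write_bw: float,
                     elapsed_s: float = 0.0) -> float:
    """Sleep time that keeps the average write rate at or below the target.

    valid_bytes:     valid data the garbage collection pass had to rewrite
    target_write_bw: target write bandwidth in bytes per second
    elapsed_s:       time the pass actually took (assumed input, not in the claim)
    """
    if target_write_bw <= 0:
        raise ValueError("target write bandwidth must be positive")
    time_budget_s = valid_bytes / target_write_bw
    return max(time_budget_s - elapsed_s, 0.0)
```

For instance, rewriting 100 MiB at a 50 MiB/s target gives a 2-second budget; if the pass finished in 0.5 seconds, the process would sleep for the remaining 1.5 seconds.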
13. The method as recited in claim 11, further comprising:
acquiring the storage pool capacity usage ratio of the distributed storage system;
and when the storage pool capacity usage ratio reaches a preset usage ratio threshold, if the actual garbage collection rate of the garbage collection process is smaller than the current actual service pressure, determining a parameter optimization strategy of the garbage collection process according to the performance index of the distributed storage system.
14. The method of claim 13, wherein determining the parameter optimization strategy of the garbage collection process according to the performance index of the distributed storage system comprises:
re-determining, according to a preset parameter optimization proportion and the performance index of the distributed storage system, the expected read-write IOPS and bandwidth limit values of garbage collection, to obtain an optimized read-write IOPS and an optimized bandwidth limit value;
and determining a parameter optimization strategy of the garbage collection process according to the optimized read-write IOPS and the optimized bandwidth limit value.
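Illustrative note (not part of the claims): claims 13 and 14 raise the garbage collection limits when the storage pool is nearly full and collection is falling behind the service load. A minimal sketch, assuming scalar rates in the same unit and assuming the optimized limits are the performance index scaled by the preset parameter optimization proportion; `optimized_gc_limits` and all parameter names are hypothetical.

```python
from typing import Optional, Tuple


def optimized_gc_limits(pool_usage: float,
                        usage_threshold: float,
                        gc_rate: float,
                        current_pressure: float,
                        perf_iops: float,
                        perf_bandwidth: float,
                        optimization_proportion: float) -> Optional[Tuple[float, float]]:
    """Return (optimized IOPS limit, optimized bandwidth limit), or None if no change is needed."""
    pool_is_filling = pool_usage >= usage_threshold
    gc_is_behind = gc_rate < current_pressure
    if not (pool_is_filling and gc_is_behind):
        return None
    # Assumed rule: grant garbage collection a fixed share of the observed peak performance.
    return (perf_iops * optimization_proportion,
            perf_bandwidth * optimization_proportion)
```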
15. A distributed storage system garbage collection device, comprising:
the acquisition module is used for acquiring historical operation data of the distributed storage system;
the determining module is used for determining the performance index and the current service pressure of the distributed storage system according to the historical operation data;
the policy making module is used for determining a target garbage collection policy according to the magnitude relationship between the performance index and the current service pressure;
the collection module is used for determining target garbage collection parameters of the distributed storage system according to the target garbage collection policy so as to perform garbage collection on the distributed storage system based on the target garbage collection parameters;
the determining module is specifically configured to:
determining read-write IOPS and read-write bandwidth of the distributed storage system according to the historical operation data;
determining performance indexes of the distributed storage system according to the read-write IOPS and the read-write bandwidth of the distributed storage system;
the determining module is specifically configured to:
determining, at a first preset granularity, performance statistics of the distributed storage system according to the read-write IOPS and the read-write bandwidth of the distributed storage system;
and determining the performance index of the distributed storage system according to the performance statistics of the distributed storage system.
16. A distributed storage system, comprising: a plurality of storage nodes;
the storage node performs garbage collection using the method of any one of claims 1 to 14.
17. An electronic device, comprising: at least one processor and memory;
the memory stores computer-executable instructions;
the at least one processor executes the computer-executable instructions stored in the memory, causing the at least one processor to perform the method of any one of claims 1 to 14.
18. A computer-readable storage medium having stored therein computer-executable instructions which, when executed by a processor, implement the method of any one of claims 1 to 14.
CN202310988513.7A 2023-08-08 2023-08-08 Garbage recycling method and device for distributed storage system and distributed storage system Active CN116700634B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310988513.7A CN116700634B (en) 2023-08-08 2023-08-08 Garbage recycling method and device for distributed storage system and distributed storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310988513.7A CN116700634B (en) 2023-08-08 2023-08-08 Garbage recycling method and device for distributed storage system and distributed storage system

Publications (2)

Publication Number Publication Date
CN116700634A (en) 2023-09-05
CN116700634B (en) 2023-11-03

Family

ID=87837925

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310988513.7A Active CN116700634B (en) 2023-08-08 2023-08-08 Garbage recycling method and device for distributed storage system and distributed storage system

Country Status (1)

Country Link
CN (1) CN116700634B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105630638A (en) * 2014-10-31 2016-06-01 国际商业机器公司 Equipment and method for distributing cache for disk array
CN110764714A (en) * 2019-11-06 2020-02-07 深圳大普微电子科技有限公司 Data processing method, device and equipment and readable storage medium
CN112416814A (en) * 2020-11-25 2021-02-26 合肥大唐存储科技有限公司 Management method for garbage collection in solid state disk, storage medium and electronic device
CN113971137A (en) * 2020-07-22 2022-01-25 华为技术有限公司 Garbage recovery method and device
CN116467267A (en) * 2023-03-31 2023-07-21 阿里巴巴(中国)有限公司 Garbage recycling method, device, storage medium and system

Also Published As

Publication number Publication date
CN116700634A (en) 2023-09-05

Similar Documents

Publication Publication Date Title
US8521986B2 (en) Allocating storage memory based on future file size or use estimates
US20200089624A1 (en) Apparatus and method for managing storage of data blocks
CN103995855B (en) Method and apparatus for data storage
EP2713278A1 (en) Method and system for controlling quality of service of storage system, and storage system
JP2004513456A (en) Adaptive prefetching of data on disk
US11816029B2 (en) Adjustment of garbage collection parameters in a storage system
CN112631520B (en) Distributed block storage system, method, apparatus, device and medium
CN108874324A (en) Access request processing method, device, equipment and readable storage medium
WO2021043026A1 (en) Storage space management method and device
US20230009375A1 (en) Data prefetching method and apparatus, and storage device
US20050097130A1 (en) Tracking space usage in a database
GB2497172A (en) Reserving space on a storage device for new data based on predicted changes in access frequencies of storage devices
US7203713B2 (en) Method and apparatus for optimizing extent size
CN116700634B (en) Garbage recycling method and device for distributed storage system and distributed storage system
CN116643704A (en) Storage management method, storage management device, electronic equipment and storage medium
US20230100110A1 (en) Computing resource management method, electronic equipment and program product
CN114518848B (en) Method, device, equipment and medium for processing stored data
CN114327862B (en) Memory allocation method and device, electronic equipment and storage medium
CN115993932A (en) Data processing method, device, storage medium and electronic equipment
CN114201369A (en) Server cluster management method and device, electronic equipment and storage medium
CN113885803A (en) Data storage method and device, electronic equipment and storage medium
CN117389485B (en) Storage performance optimization method, storage performance optimization device, storage system, electronic equipment and medium
CN111177022B (en) Feature extraction method, device, equipment and storage medium
CN114817169A (en) Storage management method, apparatus and computer program product
CN117076536A (en) Automated statistical information collection method, electronic device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant