CN109284233A - Garbage collection method and related apparatus for a storage system - Google Patents

Garbage collection method and related apparatus for a storage system

Info

Publication number
CN109284233A
Authority
CN
China
Prior art keywords
cover
big block
probability
block space
cover probability
Prior art date
Legal status
Granted
Application number
CN201811087264.XA
Other languages
Chinese (zh)
Other versions
CN109284233B (en)
Inventor
何孝金
Current Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201811087264.XA
Publication of CN109284233A
Application granted
Publication of CN109284233B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/0223 User address space allocation, e.g. contiguous or non-contiguous base addressing
    • G06F 12/023 Free address space management
    • G06F 12/0253 Garbage collection, i.e. reclamation of unreferenced memory
    • G06F 3/061 Improving I/O performance
    • G06F 3/0616 Improving the reliability of storage systems in relation to life time, e.g. increasing Mean Time Between Failures [MTBF]
    • G06F 3/0644 Management of space entities, e.g. partitions, extents, pools
    • G06F 3/0679 Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]
    • G06F 2212/7205 Cleaning, compaction, garbage collection, erase control

Abstract

This application discloses a garbage collection method for a storage system, comprising: obtaining IO features and the overwrite states corresponding to the IO features, and performing machine-learning training according to the IO features and the overwrite states to obtain an overwrite-probability (cover-probability) prediction model; performing prediction on each large block space according to the overwrite-probability prediction model to obtain multiple overwrite probabilities; marking large block spaces whose overwrite probability is less than a preset overwrite probability as large block spaces to be reclaimed; and performing garbage collection on all large block spaces to be reclaimed. The machine-learned prediction model judges whether the valid data in a large block space is likely to become garbage data, so that garbage collection of such a space can be avoided, improving the IO performance of the storage system and avoiding wasted IO. This application also discloses a garbage collection system, a server, and a computer-readable storage medium, which have the same beneficial effects.

Description

Garbage collection method and related apparatus for a storage system
Technical field
This application relates to the field of computer technology, and in particular to a garbage collection method for a storage system, a garbage collection system, a server, and a computer-readable storage medium.
Background art
With the continuous development of information technology, ever more data is stored on the Internet. To improve data-storage efficiency, the AFA (all-flash array) has emerged. An AFA array uses only SSDs (solid-state drives) for storage; because of the write characteristics and limited program/erase cycles of SSDs, discrete data is usually aggregated and written again, implementing garbage collection of large block spaces so that the SSDs are used efficiently.
In general, the garbage collection method provided by the prior art counts the total amount of garbage data in each large block space, selects the large block space with the most garbage data as the space to be reclaimed, and migrates the valid data in that space into a new space, thereby freeing the storage space of the large block space.
However, in the prior art the valid data of a large block space may turn into garbage data after being migrated to the new space. This not only defeats the purpose of garbage collection, but also wastes the storage system's IO (read/write operations) spent migrating the valid data, affecting host performance and shortening the service life of the SSDs.
Therefore, how to improve the effectiveness of garbage collection is an important problem for those skilled in the art.
Summary of the invention
The purpose of this application is to provide a garbage collection method for a storage system, a garbage collection system, a server, and a computer-readable storage medium that use a machine-learned prediction model to judge whether the valid data in a large block space will become garbage data, so that garbage collection of such a space can be avoided, improving the IO performance of the storage system and avoiding wasted IO.
To solve the above technical problem, this application provides a garbage collection method for a storage system, comprising:
obtaining IO features and the overwrite states corresponding to the IO features, and performing machine-learning training according to the IO features and the overwrite states to obtain an overwrite-probability prediction model;
performing prediction on each large block space according to the overwrite-probability prediction model to obtain multiple overwrite probabilities;
marking large block spaces whose overwrite probability is less than a preset overwrite probability as large block spaces to be reclaimed; and
performing garbage collection on all large block spaces to be reclaimed.
Optionally, performing prediction on a large block space according to the overwrite-probability prediction model to obtain the corresponding overwrite probability comprises:
performing probability prediction on all data blocks of the large block space according to the overwrite-probability prediction model to obtain multiple data-block overwrite probabilities for the large block space; and
adding up all the data-block overwrite probabilities of the large block space to obtain the corresponding overwrite probability.
Optionally, performing prediction on a large block space according to the overwrite-probability prediction model to obtain the corresponding overwrite probability comprises:
selecting the data blocks of the large block space to be predicted according to a data-block selection rule;
performing probability prediction on all the selected data blocks of the large block space according to the overwrite-probability prediction model to obtain multiple data-block overwrite probabilities for the large block space; and
adding up all the data-block overwrite probabilities of the large block space to obtain the corresponding overwrite probability.
Optionally, obtaining IO features and the overwrite states corresponding to the IO features, and performing machine-learning training according to the IO features and the overwrite states to obtain the overwrite-probability prediction model, comprises:
obtaining the IO logical addresses within a preset time period and the overwrite states corresponding to the IO logical addresses; and
performing machine learning on the IO logical addresses and their corresponding overwrite states according to the preset time period to obtain the overwrite-probability prediction model.
Optionally, performing garbage collection on all large block spaces to be reclaimed comprises:
reclaiming the valid data of large block spaces to be reclaimed whose overwrite probabilities differ by less than a preset amount into the same new large block space, to complete the garbage collection.
This application also provides a garbage collection system for a storage system, comprising:
a machine-learning training module for obtaining IO features and the overwrite states corresponding to the IO features, and performing machine-learning training according to the IO features and the overwrite states to obtain an overwrite-probability prediction model;
an overwrite-probability prediction module for performing prediction on each large block space according to the overwrite-probability prediction model to obtain multiple overwrite probabilities;
a to-be-reclaimed marking module for marking large block spaces whose overwrite probability is less than a preset overwrite probability as large block spaces to be reclaimed; and
a garbage collection processing module for performing garbage collection on all large block spaces to be reclaimed.
Optionally, the overwrite-probability prediction module comprises:
a data-block probability prediction unit for performing probability prediction on all data blocks of a large block space according to the overwrite-probability prediction model to obtain multiple data-block overwrite probabilities for the large block space; and
an overwrite-probability addition unit for adding up all the data-block overwrite probabilities of the large block space to obtain the corresponding overwrite probability.
Optionally, the machine-learning training module comprises:
an IO-feature acquisition unit for obtaining the IO logical addresses within a preset time period and the overwrite states corresponding to the IO logical addresses; and
a training unit for performing machine learning on the IO logical addresses and their corresponding overwrite states according to the preset time period to obtain the overwrite-probability prediction model.
This application also provides a server, comprising:
a memory for storing a computer program; and
a processor that implements the steps of the garbage collection method described above when executing the computer program.
This application also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the garbage collection method described above.
This application provides a garbage collection method for a storage system, comprising: obtaining IO features and the overwrite states corresponding to the IO features, and performing machine-learning training according to the IO features and the overwrite states to obtain an overwrite-probability prediction model; performing prediction on each large block space according to the overwrite-probability prediction model to obtain multiple overwrite probabilities; marking large block spaces whose overwrite probability is less than a preset overwrite probability as large block spaces to be reclaimed; and performing garbage collection on all large block spaces to be reclaimed.
By performing machine-learning training on the IO features collected from the storage system and their corresponding overwrite states, an overwrite-probability prediction model is obtained that can predict the probability that a given piece of IO data will be overwritten. With this model, the probability that the data of a large block space will be overwritten, that is, the probability that its valid data will become garbage data, can be predicted. Large block spaces whose data is highly likely to become garbage are not collected, while large block spaces whose data is unlikely to become garbage are collected; in other words, garbage collection is performed on the large block spaces whose valid data is least likely to become garbage. This improves the IO performance of the storage system, avoids the useless reads and writes caused by ineffective garbage collection, avoids wasted IO, and extends the service life of the SSDs.
This application also provides a garbage collection system, a server, and a computer-readable storage medium for a storage system, which have the above beneficial effects and are not described again here.
Brief description of the drawings
To explain the technical solutions in the embodiments of this application or the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only embodiments of this application; for those of ordinary skill in the art, other drawings can be obtained from them without creative effort.
Fig. 1 is a flowchart of a garbage collection method for a storage system provided by an embodiment of this application;
Fig. 2 is a flowchart of a prediction method of the garbage collection method provided by an embodiment of this application;
Fig. 3 is a flowchart of another prediction method of the garbage collection method provided by an embodiment of this application;
Fig. 4 is a structural diagram of a garbage collection system for a storage system provided by an embodiment of this application.
Detailed description of the embodiments
The core of this application is to provide a garbage collection method for a storage system, a garbage collection system, a server, and a computer-readable storage medium that use a machine-learned prediction model to judge whether the valid data in a large block space will become garbage data, so that garbage collection of such a space can be avoided, improving the IO performance of the storage system and avoiding wasted IO.
To make the purpose, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of this application.
The garbage collection method provided by the prior art counts the total amount of garbage data in each large block space, selects the large block space with the most garbage data as the space to be reclaimed, and migrates the valid data in that space into a new space to free the storage space of the large block space. However, in the prior art the valid data of a large block space may turn into garbage data after being migrated to the new space; this not only defeats the purpose of garbage collection, but also wastes the storage system's IO (read/write operations) spent migrating the valid data, affecting host performance and shortening the service life of the SSDs.
Therefore, an embodiment of this application provides a garbage collection method for a storage system. Machine-learning training is performed on the IO features collected from the storage system and their corresponding overwrite states to obtain an overwrite-probability prediction model that can predict the probability that a given piece of IO data will be overwritten. With this model, the probability that the data of a large block space will be overwritten, that is, the probability that its valid data will become garbage data, can be predicted. Large block spaces whose data is highly likely to become garbage are not collected, while large block spaces whose data is unlikely to become garbage are collected. This improves the IO performance of the storage system, avoids the useless reads and writes caused by ineffective garbage collection, avoids wasted IO, and extends the service life of the SSDs.
Referring to Fig. 1, Fig. 1 is a flowchart of a garbage collection method for a storage system provided by an embodiment of this application.
The method may include:
S101: obtain IO features and the overwrite states corresponding to the IO features, and perform machine-learning training according to the IO features and the overwrite states to obtain an overwrite-probability prediction model.
This step mainly acquires the feature data for machine learning, namely the IO features in this step and the overwrite states corresponding to those IO features, and then performs machine-learning training on this feature data to obtain the overwrite-probability prediction model. With this model and newly acquired IO features, the probability that the corresponding storage address will be overwritten can be predicted.
The collected feature data may be the logical addresses that are frequently overwritten in each time period; the specific time period and logical address form the acquired IO feature, and the overwrite state of the logical address is whether the data indicated by that IO feature is overwritten. The feature data may also be features of hot-spot data regions; for example, if the data of a certain address region is found to be frequently overwritten, it can serve as feature data for training.
The algorithm used for machine learning in this embodiment may be a Bayesian algorithm, a k-nearest-neighbour algorithm, or any machine-learning algorithm provided by the prior art. It can be seen that the machine-learning algorithm in this step is not unique and is not specifically limited here.
Optionally, this step may include:
Step 1: obtain the IO logical addresses within a preset time period and the overwrite states corresponding to the IO logical addresses;
Step 2: perform machine learning on the IO logical addresses and their corresponding overwrite states according to the preset time period to obtain the overwrite-probability prediction model.
This option performs machine-learning training on the acquired IO logical addresses, their overwrite states, and the corresponding preset time period to obtain the overwrite-probability prediction model. With the model of this option, a time period and an IO logical address can be used to judge the probability that the data at that logical address will be overwritten, that is, the probability that it will become garbage data.
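As an illustration of the two steps above, the training can be sketched as a minimal frequency-based model. This is a hedged sketch, not the algorithm claimed by the patent: it simply uses the observed overwrite frequency of each (time period, logical address) pair as its predicted overwrite probability, and all names are hypothetical.

```python
from collections import defaultdict

def train_cover_model(samples):
    """Hypothetical sketch: estimate an overwrite ("cover") probability
    for each (time_period, logical_address) key as the fraction of
    observed IOs to that key that were later overwritten.

    samples: iterable of (time_period, logical_address, was_overwritten)
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [overwrites, total IOs]
    for period, addr, overwritten in samples:
        entry = counts[(period, addr)]
        entry[0] += int(overwritten)
        entry[1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}

# Address 0x10 was overwritten in 3 of 4 IOs during period 0.
model = train_cover_model([
    (0, 0x10, True), (0, 0x10, True), (0, 0x10, True), (0, 0x10, False),
    (0, 0x20, False), (0, 0x20, False),
])
print(model[(0, 0x10)])  # 0.75
print(model[(0, 0x20)])  # 0.0
```

A real implementation could replace this frequency table with the Bayesian or k-nearest-neighbour model mentioned above; the training interface would stay the same.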
S102: perform prediction on each large block space according to the overwrite-probability prediction model to obtain multiple overwrite probabilities.
On the basis of step S101, this step performs prediction on each large block space according to the overwrite-probability prediction model to obtain multiple overwrite probabilities. That is, it calculates for each of the multiple large block spaces the probability that its valid data will be overwritten, in other words the probability that the valid data will become garbage data. With these overwrite probabilities, garbage collection can be performed selectively on some of the large block spaces rather than on all of them indiscriminately, avoiding wasted IO and improving IO efficiency.
Because overwrite-probability prediction models differ, a large block space can be predicted from different angles. If the model was trained on the valid data of a large block space as a whole, this step performs whole-space prediction: the feature data of the large block space is acquired directly and its overwrite probability is computed by the model. If the model was trained on the IO features of the individual data blocks of the valid data, this step performs prediction on several data blocks in the large block space and then computes the overwrite probability of the space from the resulting data-block overwrite probabilities. The way a large block space is predicted in this step is therefore not unique and is not specifically limited here.
Specifically, per-data-block prediction may cover all data blocks in the large block space or only some of them; either way, several data-block overwrite probabilities are obtained. The overwrite probability of the large block space is then computed from them: the data-block overwrite probabilities may be summed, averaged, or combined as a weighted average to give the overwrite probability of the large block space. The way data blocks are predicted is therefore not unique and is not specifically limited here.
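The three aggregation choices mentioned above (sum, mean, weighted mean) could be sketched as follows; the function names and the weights are illustrative assumptions, not part of the patent.

```python
def aggregate_sum(block_probs):
    """Sum of the data-block overwrite probabilities."""
    return sum(block_probs)

def aggregate_mean(block_probs):
    """Arithmetic mean of the data-block overwrite probabilities."""
    return sum(block_probs) / len(block_probs)

def aggregate_weighted_mean(block_probs, weights):
    """Weighted mean, e.g. weighting each data block by its size."""
    return sum(p * w for p, w in zip(block_probs, weights)) / sum(weights)

probs = [0.1, 0.2, 0.3, 0.4]
print(round(aggregate_sum(probs), 6))                          # 1.0
print(round(aggregate_mean(probs), 6))                         # 0.25
print(round(aggregate_weighted_mean(probs, [1, 1, 1, 7]), 6))  # 0.34
```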
S103: mark large block spaces whose overwrite probability is less than the preset overwrite probability as large block spaces to be reclaimed.
On the basis of step S102, this step marks large block spaces whose overwrite probability is less than the preset overwrite probability as spaces to be reclaimed; that is, such a space is regarded as one on which garbage collection may be performed. The preset overwrite probability may be set according to the overwrite probabilities of all large block spaces, for example the median of all overwrite probabilities or a value below 30%; it may also be a fixed value such as 35%, or a value that varies with the IO. The way the preset overwrite probability is set in this step is therefore not unique and is not specifically limited here.
By marking qualifying large block spaces as spaces to be reclaimed, this step may yield multiple large block spaces to be reclaimed or just one; this is not limited and varies with the actual situation.
Suppose there are currently four large block spaces A, B, C, and D, and prediction on each yields the overwrite probabilities 70%, 50%, 90%, and 20% in turn. The higher the probability, the more easily the space is overwritten, meaning its valid data is more likely to be replaced by new data, so garbage collection should target the spaces with lower overwrite probabilities, whose valid data can then be put to good use. With a preset overwrite probability of 60%, garbage collection is therefore performed on the two spaces B and D.
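The selection in this worked example can be sketched as a simple threshold filter; the names are hypothetical.

```python
def mark_for_reclaim(space_probs, preset_prob):
    """Return the names of large block spaces whose overwrite
    probability is below the preset overwrite probability."""
    return [name for name, p in space_probs.items() if p < preset_prob]

# The worked example: A = 70%, B = 50%, C = 90%, D = 20%, preset 60%.
spaces = {"A": 0.70, "B": 0.50, "C": 0.90, "D": 0.20}
print(mark_for_reclaim(spaces, 0.60))  # ['B', 'D']
```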
S104: perform garbage collection on all large block spaces to be reclaimed.
On the basis of step S103, this step performs garbage collection on the large block spaces to be reclaimed. Any garbage collection process provided by the prior art may be used, that is, all the valid data in the designated large block spaces is migrated and stored centrally in a new space.
Once the overwrite-probability prediction model has been obtained, S102 to S104 can also be executed on their own as a processing method.
In addition, to improve the utilization rate and efficiency of the data, S104 may also include:
reclaiming the valid data of large block spaces to be reclaimed whose overwrite probabilities differ by less than a preset amount into the same new large block space, to complete the garbage collection.
That is, when migrating valid data, valid data with similar overwrite probabilities can be moved into the same space according to their overwrite probabilities, separating hot and cold data and improving data-usage efficiency.
Here, a difference in overwrite probability less than the preset amount means that the overwrite probabilities of any two large block spaces to be reclaimed differ by less than a preset value; for example, if the difference between the two is less than 5%, the valid data of the two spaces may be stored in the same large block space.
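One possible reading of this grouping rule is a greedy pass over the spaces sorted by overwrite probability; this sketch and its 5% default are illustrative assumptions.

```python
def group_for_migration(space_probs, max_diff=0.05):
    """Hypothetical greedy grouping: sort the spaces to be reclaimed by
    overwrite probability and start a new destination space whenever the
    gap to the current group's first member exceeds max_diff."""
    groups = []
    for name, p in sorted(space_probs.items(), key=lambda kv: kv[1]):
        if groups and p - groups[-1][0][1] <= max_diff:
            groups[-1].append((name, p))
        else:
            groups.append([(name, p)])
    return [[name for name, _ in group] for group in groups]

# B (50%) and E (53%) differ by less than 5%, so their valid data goes
# into the same new large block space; D (20%) gets its own.
print(group_for_migration({"B": 0.50, "D": 0.20, "E": 0.53}))  # [['D'], ['B', 'E']]
```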
It should be noted that in this embodiment, when write requests are handled normally, machine learning can also continue according to those write requests so as to update the overwrite-probability prediction model and improve the accuracy of subsequent predictions.
In summary, this embodiment performs machine-learning training on the IO features collected from the storage system and their corresponding overwrite states to obtain an overwrite-probability prediction model that can predict the probability that a given piece of IO data will be overwritten. With this model, the probability that the data of a large block space will be overwritten, that is, the probability that its valid data will become garbage data, can be predicted. Large block spaces whose data is highly likely to become garbage are not collected, while large block spaces whose data is unlikely to become garbage are collected; in other words, garbage collection is performed on the large block spaces whose valid data is least likely to become garbage. This improves the IO performance of the storage system, avoids the useless reads and writes caused by ineffective garbage collection, avoids wasted IO, and extends the service life of the SSDs.
The prediction of multiple large block spaces according to the overwrite-probability prediction model in the previous embodiment can use any prediction method provided by the prior art. To improve prediction accuracy, a prediction method is presented below on the basis of the previous embodiment.
Referring to Fig. 2, Fig. 2 is a flowchart of the prediction method of the garbage collection method provided by an embodiment of this application.
The method may include:
S201: perform probability prediction on all data blocks of a large block space according to the overwrite-probability prediction model to obtain multiple data-block overwrite probabilities for the large block space.
This step performs probability prediction on the data block of each piece of valid data in the large block space according to the overwrite-probability prediction model, so that each large block space yields the data-block overwrite probabilities of all the data blocks it contains.
Specifically, predicting a data block according to the overwrite-probability prediction model means acquiring the IO features of that data block and matching them against the model. The matching may search for the closest recorded IO feature in the model, obtain its corresponding overwrite state, and compute the overwrite probability of the data block from the degree of closeness; it may compute within the model the likelihood of the IO feature yielding its overwrite state, which is the overwrite probability; or, when the overwrite-probability prediction model is a curve model, it may look up the point corresponding to the acquired IO feature on the curve to obtain the corresponding overwrite probability. The way prediction is performed in this step is therefore not unique and is not specifically limited here.
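The first matching strategy described above (find the closest recorded IO feature) could be sketched as a one-nearest-neighbour lookup over a one-dimensional feature; the numeric feature encoding is an illustrative assumption.

```python
def predict_block_probability(model, feature):
    """Hypothetical 1-nearest-neighbour match: find the recorded IO
    feature closest to `feature` and return its overwrite probability.

    model: dict mapping a numeric IO feature (e.g. a logical address)
           to the overwrite probability learned for it.
    """
    nearest = min(model, key=lambda recorded: abs(recorded - feature))
    return model[nearest]

model = {100: 0.9, 200: 0.2, 300: 0.6}
print(predict_block_probability(model, 180))  # 0.2
print(predict_block_probability(model, 120))  # 0.9
```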
S202: add up all the data block cover probabilities of the large block space to obtain the corresponding cover probability.
Building on step S201, this step adds up all the data block cover probabilities of each large block space to obtain the cover probability of that large block space.
It should be noted that this embodiment describes the calculation of the cover probability of a single large block space; when the cover probabilities of multiple large block spaces are calculated, the steps of this embodiment need to be repeated several times to complete the prediction processing.
Suppose there are now 4 large block spaces, denoted A, B, C, and D, each containing 4 data blocks: A1, A2, A3, A4; B1, B2, B3, B4; C1, C2, C3, C4; and D1, D2, D3, D4. Prediction processing is performed by the probability prediction model on all the data blocks in each large block space, that is, on data blocks A1 through D4, which yields in turn the cover probabilities of all the data blocks in the large block spaces A, B, C, and D: the cover probabilities of the four data blocks A1, A2, A3, A4 in A, of B1, B2, B3, B4 in B, of C1, C2, C3, C4 in C, and of D1, D2, D3, D4 in D. Adding up the cover probabilities of A1, A2, A3, A4 gives the cover probability of A; adding up those of B1, B2, B3, B4 gives that of B; adding up those of C1, C2, C3, C4 gives that of C; and adding up those of D1, D2, D3, D4 gives that of D.
In this embodiment, the cover probability of a large block space is obtained by calculating the cover probabilities of all its data blocks, which reduces the unit at which cover probabilities are calculated and improves the calculation accuracy of the cover probability.
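Steps S201 and S202 can be sketched as follows; `predict_block` stands in for whatever trained cover probability prediction model is used and is an assumption of this sketch.

```python
# Sketch of S201/S202: predict a data block cover probability for every data
# block of a large block space, then add them up to obtain the cover
# probability of that space. `predict_block` is a placeholder for the trained
# cover probability prediction model.

def space_cover_probability(blocks, predict_block):
    block_probs = [predict_block(b) for b in blocks]   # S201: per-block prediction
    return sum(block_probs)                            # S202: add them up

# The four example spaces A..D, each with four data blocks:
spaces = {
    "A": ["A1", "A2", "A3", "A4"],
    "B": ["B1", "B2", "B3", "B4"],
    "C": ["C1", "C2", "C3", "C4"],
    "D": ["D1", "D2", "D3", "D4"],
}
```

With a toy predictor that assigns every block a probability of 0.1, each space's cover probability comes out to 0.4; calling the function once per space mirrors the note that the embodiment's steps are repeated for multiple large block spaces.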
The preceding embodiment calculates the cover probability of a large block space from the data block cover probabilities of all its data blocks. To improve the speed of calculating the cover probability, building on the preceding embodiment, the present embodiment calculates the cover probability of a large block space from the data block cover probabilities of only a subset of the data blocks.
Referring to FIG. 3, Fig. 3 is a flowchart of another prediction processing method of the garbage collection method provided by an embodiment of the present application.
This method may include:
S301: select the data blocks to be predicted of a large block space according to a data block selection rule;
The data block selection rule in this step mainly selects a subset of the data blocks in the large block space as the data blocks to be predicted. Reducing the number of data blocks shortens the time for calculating the cover probability and improves the speed of the prediction processing.
The data block selection rule may randomly choose a preset number of data blocks from all the data blocks, may randomly choose a preset proportion of the data blocks, or may choose data blocks at a preset interval as the data blocks to be predicted. The data block selection rule in this step is therefore not unique and is not specifically limited herein, as long as this step selects a subset of all the data blocks as the data blocks to be predicted, thereby reducing the number of data blocks for which cover probabilities are calculated.
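The three example selection rules can be sketched as follows; the function names and the use of Python's `random.sample` are assumptions of this sketch.

```python
# Sketch of the three example data block selection rules: a preset number of
# blocks chosen at random, a preset proportion chosen at random, or every
# Nth block (a preset interval).
import random

def select_fixed_count(blocks, count, rng=random):
    # Randomly choose a preset number of data blocks.
    return rng.sample(blocks, min(count, len(blocks)))

def select_proportion(blocks, ratio, rng=random):
    # Randomly choose a preset proportion of the data blocks.
    count = max(1, int(len(blocks) * ratio))
    return rng.sample(blocks, count)

def select_stride(blocks, stride):
    # Choose data blocks separated by a preset interval.
    return blocks[::stride]
```

Any of the three yields a smaller set of data blocks to be predicted, which is all the step requires.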
S302: perform probabilistic prediction processing on the data blocks to be predicted of the large block space according to the cover probability prediction model, to obtain multiple data block cover probabilities corresponding to the large block space;
Building on step S301, this step performs probabilistic prediction processing on all the data blocks to be predicted in the large block space, obtaining multiple data block cover probabilities.
S303: add up all the data block cover probabilities of each large block space to obtain the corresponding cover probability.
Building on step S302, this step adds up all the obtained data block cover probabilities to obtain the cover probability corresponding to the large block space.
It should be noted that this embodiment describes the calculation of the cover probability of a single large block space; when the cover probabilities of multiple large block spaces are calculated, the steps of this embodiment need to be repeated several times to complete the prediction processing.
Suppose again that there are 4 large block spaces A, B, C, and D, each containing 4 data blocks: A1, A2, A3, A4; B1, B2, B3, B4; C1, C2, C3, C4; and D1, D2, D3, D4. By the selection rule, the first two data blocks of each large block space may be chosen as the data blocks to be predicted, namely A1, A2, B1, B2, C1, C2, D1, and D2.
Prediction processing is performed by the probability prediction model on the data blocks to be predicted of each large block space, that is, on A1, A2, B1, B2, C1, C2, D1, and D2, which yields in turn the cover probabilities of the data blocks to be predicted in the large block spaces A, B, C, and D: the cover probabilities of the two data blocks A1, A2 in A, of B1, B2 in B, of C1, C2 in C, and of D1, D2 in D.
Adding up the cover probabilities of the two data blocks A1, A2 gives the cover probability of A; adding up those of B1, B2 gives that of B; adding up those of C1, C2 gives that of C; and adding up those of D1, D2 gives that of D.
Since only the cover probabilities of a subset of the data blocks are calculated in this embodiment, the number of data block calculations is reduced, and the processing speed of the prediction processing is correspondingly improved.
A garbage collection system provided by an embodiment of the present application is introduced below; the garbage collection system described below and the garbage collection method described above may be referred to in correspondence with each other.
Referring to FIG. 4, Fig. 4 is a structural schematic diagram of a garbage collection system of a storage system provided by an embodiment of the present application.
The system may include:
a machine learning training module 100, configured to obtain IO characteristics and the IO cover states corresponding to the IO characteristics, and perform machine learning training processing according to the IO characteristics and the IO cover states to obtain a cover probability prediction model;
a cover probability prediction module 200, configured to perform prediction processing on each large block space respectively according to the cover probability prediction model, to obtain multiple cover probabilities;
a to-be-reclaimed marking module 300, configured to mark each large block space whose cover probability is less than a preset cover probability as a large block space to be reclaimed;
a garbage collection processing module 400, configured to perform garbage collection processing on all the large block spaces to be reclaimed.
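The cooperation of the marking module 300 and the garbage collection processing module 400 can be sketched as plain functions; the dictionary representation of the spaces and the `reclaim` callback are assumptions of this sketch, not limitations of the system.

```python
# Sketch of modules 300 and 400: mark every large block space whose cover
# probability is below the preset cover probability, then run garbage
# collection on the marked spaces.

def mark_spaces_to_reclaim(space_probs, preset_cover_probability):
    """space_probs: {space_name: cover_probability}. Returns the names of
    large block spaces whose cover probability is below the preset value."""
    return [name for name, prob in space_probs.items()
            if prob < preset_cover_probability]

def garbage_collect(space_probs, preset_cover_probability, reclaim):
    to_reclaim = mark_spaces_to_reclaim(space_probs, preset_cover_probability)
    for name in to_reclaim:
        reclaim(name)  # e.g. migrate valid data, then free the space
    return to_reclaim
```

A space with a low cover probability is unlikely to be overwritten soon, so reclaiming it avoids repeatedly moving data that is about to change anyway.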
Optionally, the cover probability prediction module 200 may include:
a data block probability prediction unit, configured to perform probabilistic prediction processing on all data blocks of the large block space according to the cover probability prediction model, to obtain multiple data block cover probabilities corresponding to the large block space;
a cover probability addition unit, configured to add up all the data block cover probabilities of the large block space to obtain the corresponding cover probability.
Optionally, the machine learning training module 100 may include:
an IO characteristic acquiring unit, configured to obtain IO logical addresses within a preset time period and the IO cover states corresponding to the IO logical addresses;
a training unit, configured to perform machine learning processing on the IO logical addresses and the IO cover states corresponding to the IO logical addresses according to the preset time period, to obtain the cover probability prediction model.
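As one minimal illustration of the training unit, the sketch below builds a per-logical-address overwrite-frequency table from observations gathered over a preset time period. A real implementation might use any machine learning method, so this table-based model is only an illustrative assumption.

```python
# Sketch of the training unit: from (IO logical address, IO cover state)
# observations collected over a preset time period, build a per-address
# overwrite-frequency table that serves as a simple cover probability
# prediction model.
from collections import defaultdict

def train_cover_model(observations):
    """observations: iterable of (io_logical_address, covered) pairs,
    where covered is True if the written data was later overwritten."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for addr, covered in observations:
        totals[addr] += 1
        if covered:
            hits[addr] += 1
    return {addr: hits[addr] / totals[addr] for addr in totals}

def predict(model, addr, default=0.5):
    # Addresses never seen in the time period fall back to a neutral prior.
    return model.get(addr, default)
```

For example, an address overwritten in 2 of 3 observed IOs gets a cover probability of 2/3, while an unseen address receives the default prior.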
An embodiment of the present application further provides a server, comprising:
a memory for storing a computer program; and
a processor, configured to implement, when executing the computer program, the steps of the garbage collection method described in the embodiments above.
An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the garbage collection method described in the embodiments above are implemented.
The computer-readable storage medium may include various media capable of storing program code, such as a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts among the embodiments may be referred to each other. Since the devices disclosed in the embodiments correspond to the methods disclosed in the embodiments, their description is relatively brief; for related details, refer to the description of the method part.
Those skilled in the art may further appreciate that the units and algorithm steps described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are implemented in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled artisans may implement the described functions differently for each particular application, but such implementations should not be considered beyond the scope of the present application.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The garbage collection method, garbage collection system, server, and computer-readable storage medium of a storage system provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application; the description of the above embodiments is only intended to help understand the method of the present application and its core ideas. It should be noted that, for those of ordinary skill in the art, several improvements and modifications may be made to the present application without departing from its principles, and these improvements and modifications also fall within the protection scope of the claims of the present application.

Claims (10)

1. A garbage collection method of a storage system, characterized by comprising:
obtaining IO characteristics and IO cover states corresponding to the IO characteristics, and performing machine learning training processing according to the IO characteristics and the IO cover states to obtain a cover probability prediction model;
performing prediction processing on each large block space respectively according to the cover probability prediction model to obtain multiple cover probabilities;
marking each large block space whose cover probability is less than a preset cover probability as a large block space to be reclaimed; and
performing garbage collection processing on all the large block spaces to be reclaimed.
2. The garbage collection method according to claim 1, characterized in that performing prediction processing on the large block space according to the cover probability prediction model to obtain the corresponding cover probability comprises:
performing probabilistic prediction processing on all data blocks of the large block space according to the cover probability prediction model to obtain multiple data block cover probabilities corresponding to the large block space; and
adding up all the data block cover probabilities of the large block space to obtain the corresponding cover probability.
3. The garbage collection method according to claim 1, characterized in that performing prediction processing on the large block space according to the cover probability prediction model to obtain the corresponding cover probability comprises:
selecting the data blocks to be predicted of the large block space according to a data block selection rule;
performing probabilistic prediction processing on all the data blocks to be predicted of the large block space according to the cover probability prediction model to obtain multiple data block cover probabilities corresponding to the large block space; and
adding up all the data block cover probabilities of the large block space to obtain the corresponding cover probability.
4. The garbage collection method according to claim 1, characterized in that obtaining the IO characteristics and the IO cover states corresponding to the IO characteristics and performing machine learning training processing according to the IO characteristics and the IO cover states to obtain the cover probability prediction model comprises:
obtaining IO logical addresses within a preset time period and the IO cover states corresponding to the IO logical addresses; and
performing machine learning processing on the IO logical addresses and the IO cover states corresponding to the IO logical addresses according to the preset time period to obtain the cover probability prediction model.
5. The garbage collection method according to any one of claims 1 to 4, characterized in that performing garbage collection processing on all the large block spaces to be reclaimed comprises:
reclaiming the valid data of the large block spaces to be reclaimed whose cover probabilities differ by less than a preset variation into the same new large block space, so as to complete the garbage collection processing.
6. A garbage collection system of a storage system, characterized by comprising:
a machine learning training module, configured to obtain IO characteristics and IO cover states corresponding to the IO characteristics, and perform machine learning training processing according to the IO characteristics and the IO cover states to obtain a cover probability prediction model;
a cover probability prediction module, configured to perform prediction processing on each large block space respectively according to the cover probability prediction model to obtain multiple cover probabilities;
a to-be-reclaimed marking module, configured to mark each large block space whose cover probability is less than a preset cover probability as a large block space to be reclaimed; and
a garbage collection processing module, configured to perform garbage collection processing on all the large block spaces to be reclaimed.
7. The garbage collection system according to claim 6, characterized in that the cover probability prediction module comprises:
a data block probability prediction unit, configured to perform probabilistic prediction processing on all data blocks of the large block space according to the cover probability prediction model to obtain multiple data block cover probabilities corresponding to the large block space; and
a cover probability addition unit, configured to add up all the data block cover probabilities of the large block space to obtain the corresponding cover probability.
8. The garbage collection system according to claim 6, characterized in that the machine learning training module comprises:
an IO characteristic acquiring unit, configured to obtain IO logical addresses within a preset time period and the IO cover states corresponding to the IO logical addresses; and
a training unit, configured to perform machine learning processing on the IO logical addresses and the IO cover states corresponding to the IO logical addresses according to the preset time period to obtain the cover probability prediction model.
9. A server, characterized by comprising:
a memory for storing a computer program; and
a processor, configured to implement, when executing the computer program, the steps of the garbage collection method according to any one of claims 1 to 5.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the garbage collection method according to any one of claims 1 to 5.
CN201811087264.XA 2018-09-18 2018-09-18 Garbage recovery method of storage system and related device Active CN109284233B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811087264.XA CN109284233B (en) 2018-09-18 2018-09-18 Garbage recovery method of storage system and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811087264.XA CN109284233B (en) 2018-09-18 2018-09-18 Garbage recovery method of storage system and related device

Publications (2)

Publication Number Publication Date
CN109284233A true CN109284233A (en) 2019-01-29
CN109284233B CN109284233B (en) 2022-02-18

Family

ID=65181006

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811087264.XA Active CN109284233B (en) 2018-09-18 2018-09-18 Garbage recovery method of storage system and related device

Country Status (1)

Country Link
CN (1) CN109284233B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111158598A (en) * 2019-12-29 2020-05-15 北京浪潮数据技术有限公司 Garbage recycling method, device, equipment and medium for full-flash disk array
CN111913649A (en) * 2019-05-09 2020-11-10 深圳大普微电子科技有限公司 Data processing method and device for solid state disk
CN112347000A (en) * 2019-08-08 2021-02-09 爱思开海力士有限公司 Data storage device, method of operating the same, and controller of the data storage device
WO2022017002A1 (en) * 2020-07-22 2022-01-27 华为技术有限公司 Garbage collection method and device
WO2022171001A1 (en) * 2021-02-09 2022-08-18 山东英信计算机技术有限公司 Gc performance prediction method and system for storage system, medium, and device

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103412826A (en) * 2013-07-18 2013-11-27 记忆科技(深圳)有限公司 Garbage collection method and system of solid state disk
US20130346720A1 (en) * 2011-08-11 2013-12-26 Pure Storage, Inc. Garbage collection in a storage system
CN103577338A (en) * 2013-11-14 2014-02-12 华为技术有限公司 Junk data recycling method and storage device
CN104216665A (en) * 2014-09-01 2014-12-17 上海新储集成电路有限公司 Storage management method of multi-layer unit solid state disk
US9141457B1 (en) * 2013-09-25 2015-09-22 Emc Corporation System and method for predicting multiple-disk failures
CN105204783A (en) * 2015-10-13 2015-12-30 华中科技大学 Solid-state disk garbage recycling method based on data life cycle
CN106874213A (en) * 2017-01-12 2017-06-20 杭州电子科技大学 A kind of solid state hard disc dsc data recognition methods for merging various machine learning algorithms
CN107102954A (en) * 2017-04-27 2017-08-29 华中科技大学 A kind of solid-state storage grading management method and system based on failure probability
CN107479825A (en) * 2017-06-30 2017-12-15 华为技术有限公司 A kind of storage system, solid state hard disc and date storage method
CN108241471A (en) * 2017-11-29 2018-07-03 深圳忆联信息系统有限公司 A kind of method for promoting solid state disk performance


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Chen Jinzhong: "A flash translation layer strategy based on page-write correlation", Journal on Communications *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111913649A (en) * 2019-05-09 2020-11-10 深圳大普微电子科技有限公司 Data processing method and device for solid state disk
CN111913649B (en) * 2019-05-09 2022-05-06 深圳大普微电子科技有限公司 Data processing method and device for solid state disk
CN112347000A (en) * 2019-08-08 2021-02-09 爱思开海力士有限公司 Data storage device, method of operating the same, and controller of the data storage device
CN111158598A (en) * 2019-12-29 2020-05-15 北京浪潮数据技术有限公司 Garbage recycling method, device, equipment and medium for full-flash disk array
CN111158598B (en) * 2019-12-29 2022-03-22 北京浪潮数据技术有限公司 Garbage recycling method, device, equipment and medium for full-flash disk array
WO2022017002A1 (en) * 2020-07-22 2022-01-27 华为技术有限公司 Garbage collection method and device
WO2022171001A1 (en) * 2021-02-09 2022-08-18 山东英信计算机技术有限公司 Gc performance prediction method and system for storage system, medium, and device

Also Published As

Publication number Publication date
CN109284233B (en) 2022-02-18

Similar Documents

Publication Publication Date Title
CN109284233A (en) A kind of rubbish recovering method and relevant apparatus of storage system
Hu et al. Write amplification analysis in flash-based solid state drives
CN103577338B (en) A kind of method reclaiming junk data and storage device
US6871272B2 (en) Data sorting in information storage systems
CN103246613B (en) Buffer storage and the data cached acquisition methods for buffer storage
EP1936632A1 (en) Method and apparatus for detecting static data area, wear-leveling, and merging data units in nonvolatile data storage device
CN110673789B (en) Metadata storage management method, device, equipment and storage medium of solid state disk
CN102576330A (en) Memory system having persistent garbage collection
CN102169429A (en) Prefetch unit, data prefetch method and microprocessor
CN109671458A (en) The method of management flash memory module and relevant flash controller
CN106293497B (en) Watt record filesystem-aware in junk data recovery method and device
CN110674056B (en) Garbage recovery method and device
US11204697B2 (en) Wear leveling in solid state devices
CN115756312A (en) Data access system, data access method, and storage medium
CN115951839A (en) Data writing method and device for partition name space solid state disk and electronic equipment
CN103150245A (en) Method for determining visiting characteristic of data entityand store controller
CN110795363A (en) Hot page prediction method and page scheduling method for storage medium
Lin et al. Dynamic garbage collection scheme based on past update times for NAND flash-based consumer electronics
Jung et al. Fass: A flash-aware swap system
CN110532195A (en) The workload sub-clustering of storage system and the method for executing it
CN113253926A (en) Memory internal index construction method for improving query and memory performance of novel memory
Lin et al. Flash-aware linux swap system for portable consumer electronics
WO2023083454A1 (en) Data compression and deduplication aware tiering in a storage system
KR101157763B1 (en) Variable space page mapping method and apparatus for flash memory device with trim command processing
KR101022001B1 (en) Flash memory system and method for managing flash memory

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant