CN109284233A - Garbage collection method for a storage system and related apparatus - Google Patents
Garbage collection method for a storage system and related apparatus Download PDF Info
- Publication number
- CN109284233A (application CN201811087264.XA)
- Authority
- CN
- China
- Prior art keywords
- overwrite
- large block
- probability
- block space
- overwrite probability
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/0223—User address space allocation, e.g. contiguous or non contiguous base addressing
- G06F12/023—Free address space management
- G06F12/0253—Garbage collection, i.e. reclamation of unreferenced memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/061—Improving I/O performance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0614—Improving the reliability of storage systems
- G06F3/0616—Improving the reliability of storage systems in relation to life time, e.g. increasing Mean Time Between Failures [MTBF]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/0644—Management of space entities, e.g. partitions, extents, pools
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0673—Single storage device
- G06F3/0679—Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/72—Details relating to flash memory management
- G06F2212/7205—Cleaning, compaction, garbage collection, erase control
Abstract
This application discloses a garbage collection method for a storage system, comprising: obtaining IO features and the IO overwrite states corresponding to the IO features, and performing machine-learning training on the IO features and IO overwrite states to obtain an overwrite-probability prediction model; performing prediction on each large block space according to the overwrite-probability prediction model to obtain a plurality of overwrite probabilities; marking each large block space whose overwrite probability is below a preset overwrite probability as a large block space to be reclaimed; and performing garbage collection on all large block spaces to be reclaimed. The prediction model obtained by machine-learning training judges whether the valid data in a large block space is about to become garbage data, so that garbage collection on such a space can be skipped, improving the IO performance of the storage system and avoiding wasted IO. Also disclosed herein are a garbage collection system, a server, and a computer-readable storage medium, which have the same beneficial effects.
Description
Technical field
This application relates to the field of computer technology, and in particular to a garbage collection method for a storage system, and to a corresponding garbage collection system, server, and computer-readable storage medium.
Background technique
With the continuous development of information technology, ever more data is stored on the internet. To improve data-storage efficiency, AFA (All-Flash Array) storage has emerged. An AFA array performs all of its storage on SSDs (solid-state drives). Because of the write characteristics of SSDs and the limit on their number of program/erase cycles, discrete data is usually aggregated and rewritten, implementing garbage collection over large block spaces so that the SSDs are used efficiently.
In general, the garbage collection method provided by the prior art counts the total amount of garbage data in each large block space, selects the large block space with the most garbage data as the space to be reclaimed, and migrates the valid data of that space into a new space, thereby freeing the storage of the large block space.
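As an illustrative sketch (not part of the claimed method), the prior-art selection step described above — choose the large block space with the most garbage data as the reclaim victim — might look like this; the block names and byte counts are hypothetical:

```python
# Baseline garbage collection: pick the large block space containing the
# most garbage data; its remaining valid data would then be migrated to a
# new space and the whole block released.

def pick_gc_victim(blocks):
    """blocks: dict of block name -> (garbage_bytes, valid_bytes).
    Returns the name of the block with the most garbage data."""
    return max(blocks, key=lambda name: blocks[name][0])

blocks = {
    "blk0": (700, 300),  # 700 bytes of garbage, 300 bytes still valid
    "blk1": (200, 800),
    "blk2": (500, 500),
}

print(pick_gc_victim(blocks))  # blk0 has the most garbage
```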
However, in the prior art the migrated valid data may itself turn into garbage data after being moved to the new space. In that case garbage collection not only fails to achieve its purpose but also wastes the storage system's IO (read/write operations) on migrating the valid data, degrading host performance and shortening the service life of the SSDs.
How to improve the effectiveness of garbage collection techniques is therefore an important problem of concern to those skilled in the art.
Summary of the invention
The purpose of this application is to provide a garbage collection method for a storage system, together with a garbage collection system, a server, and a computer-readable storage medium, which use a prediction model obtained by machine-learning training to judge whether the valid data in a large block space is about to become garbage data, so that garbage collection on such a space can be skipped, improving the storage system's IO performance and avoiding wasted IO.
To solve the above technical problem, this application provides a garbage collection method for a storage system, comprising:
obtaining IO features and the IO overwrite states corresponding to the IO features, and performing machine-learning training on the IO features and the IO overwrite states to obtain an overwrite-probability prediction model;
performing prediction on each large block space according to the overwrite-probability prediction model to obtain a plurality of overwrite probabilities;
marking each large block space whose overwrite probability is below a preset overwrite probability as a large block space to be reclaimed; and
performing garbage collection on all large block spaces to be reclaimed.
Optionally, performing prediction on a large block space according to the overwrite-probability prediction model to obtain the corresponding overwrite probability comprises:
performing probability prediction on all data blocks of the large block space according to the overwrite-probability prediction model, obtaining a plurality of data-block overwrite probabilities corresponding to the large block space; and
adding all the data-block overwrite probabilities of the large block space to obtain the corresponding overwrite probability.
Optionally, performing prediction on a large block space according to the overwrite-probability prediction model to obtain the corresponding overwrite probability comprises:
selecting the data blocks of the large block space to be predicted according to a data-block selection rule;
performing probability prediction on all the selected data blocks of the large block space according to the overwrite-probability prediction model, obtaining a plurality of data-block overwrite probabilities corresponding to the large block space; and
adding all the data-block overwrite probabilities of the large block space to obtain the corresponding overwrite probability.
Optionally, obtaining the IO features and the corresponding IO overwrite states and performing machine-learning training on them to obtain the overwrite-probability prediction model comprises:
obtaining the IO logical addresses within a preset time period and the IO overwrite states corresponding to those logical addresses; and
performing machine learning on the IO logical addresses and their corresponding IO overwrite states, per preset time period, to obtain the overwrite-probability prediction model.
Optionally, performing garbage collection on all the large block spaces to be reclaimed comprises:
reclaiming the valid data of to-be-reclaimed large block spaces whose overwrite probabilities differ by less than a preset amount into the same new large block space, so as to complete the garbage collection.
This application also provides a garbage collection system for a storage system, comprising:
a machine-learning training module, configured to obtain IO features and the IO overwrite states corresponding to the IO features, and to perform machine-learning training on the IO features and IO overwrite states to obtain an overwrite-probability prediction model;
an overwrite-probability prediction module, configured to perform prediction on each large block space according to the overwrite-probability prediction model, obtaining a plurality of overwrite probabilities;
a to-be-reclaimed marking module, configured to mark each large block space whose overwrite probability is below a preset overwrite probability as a large block space to be reclaimed; and
a garbage collection processing module, configured to perform garbage collection on all large block spaces to be reclaimed.
Optionally, the overwrite-probability prediction module comprises:
a data-block probability prediction unit, configured to perform probability prediction on all data blocks of a large block space according to the overwrite-probability prediction model, obtaining a plurality of data-block overwrite probabilities corresponding to the large block space; and
an overwrite-probability addition unit, configured to add all the data-block overwrite probabilities of the large block space to obtain the corresponding overwrite probability.
Optionally, the machine-learning training module comprises:
an IO feature acquisition unit, configured to obtain the IO logical addresses within a preset time period and the IO overwrite states corresponding to those logical addresses; and
a training unit, configured to perform machine learning on the IO logical addresses and their corresponding IO overwrite states, per preset time period, to obtain the overwrite-probability prediction model.
This application also provides a server, comprising:
a memory for storing a computer program; and
a processor which, when executing the computer program, implements the steps of the garbage collection method described above.
This application also provides a computer-readable storage medium on which a computer program is stored, the computer program implementing the steps of the garbage collection method described above when executed by a processor.
The garbage collection method for a storage system provided herein comprises: obtaining IO features and the IO overwrite states corresponding to the IO features, and performing machine-learning training on them to obtain an overwrite-probability prediction model; performing prediction on each large block space according to the overwrite-probability prediction model to obtain a plurality of overwrite probabilities; marking each large block space whose overwrite probability is below a preset overwrite probability as a large block space to be reclaimed; and performing garbage collection on all large block spaces to be reclaimed.
By performing machine-learning training on the IO features gathered from the storage system and their corresponding IO overwrite states, an overwrite-probability prediction model is obtained that can predict how likely a given piece of IO data is to be overwritten. With this model, the probability that the data in a large block is overwritten — that is, the probability that its valid data becomes garbage data — can be predicted. Large block spaces whose data is highly likely to become garbage are not garbage-collected, while those whose data is unlikely to become garbage are; in other words, garbage collection is applied to the large block spaces that are least likely to turn into garbage. This improves the storage system's IO performance, avoids the useless reads and writes caused by ineffective garbage collection, avoids wasted IO, and extends the service life of the SSDs.
This application also provides a garbage collection system, a server, and a computer-readable storage medium for a storage system, which have the above beneficial effects and are not described again here.
Brief description of the drawings
To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the accompanying drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of this application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a garbage collection method for a storage system provided by an embodiment of this application;
Fig. 2 is a flowchart of a prediction processing method of the garbage collection method provided by an embodiment of this application;
Fig. 3 is a flowchart of another prediction processing method of the garbage collection method provided by an embodiment of this application;
Fig. 4 is a structural diagram of a garbage collection system for a storage system provided by an embodiment of this application.
Specific embodiment
The core of this application is to provide a garbage collection method for a storage system, together with a garbage collection system, a server, and a computer-readable storage medium, which use a prediction model obtained by machine-learning training to judge whether the valid data in a large block space is about to become garbage data, so that garbage collection on such a space can be skipped, improving the storage system's IO performance and avoiding wasted IO.
To make the purposes, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of this application.
The garbage collection method provided by the prior art counts the total amount of garbage data in each large block space, selects the large block space with the most garbage data as the space to be reclaimed, and migrates its valid data into a new space, thereby freeing the storage of the large block space. However, the migrated valid data may itself turn into garbage data after being moved to the new space; garbage collection then not only fails to achieve its purpose but also wastes the storage system's IO (read/write operations) on the migration, degrading host performance and shortening the service life of the SSDs.
Therefore, an embodiment of this application provides a garbage collection method for a storage system. Machine-learning training is performed on the IO features gathered from the storage system and their corresponding IO overwrite states, yielding an overwrite-probability prediction model that can predict how likely a given piece of IO data is to be overwritten. With this model, the probability that the data in a large block is overwritten — that is, the probability that its valid data becomes garbage data — can be predicted. Large block spaces whose data is highly likely to become garbage are not garbage-collected, while those whose data is unlikely to become garbage are; that is, garbage collection is applied to the large block spaces that are least likely to turn into garbage. This improves the storage system's IO performance, avoids the useless reads and writes caused by ineffective garbage collection, avoids wasted IO, and extends the service life of the SSDs.
Referring to Fig. 1, Fig. 1 is a flowchart of a garbage collection method for a storage system provided by an embodiment of this application.
This method may include:
S101: obtain IO features and the IO overwrite states corresponding to the IO features, and perform machine-learning training on the IO features and IO overwrite states to obtain an overwrite-probability prediction model;
This step mainly obtains the feature data for machine learning — the IO features in this step and the IO overwrite states corresponding to those features — and then performs machine-learning training on the feature data to obtain the overwrite-probability prediction model. With this model and newly acquired IO features, the probability that a given storage address is overwritten can be predicted.
The acquired feature data may be the logical addresses that are frequently overwritten in each time period: the specific time period and logical address constitute the acquired IO feature, and the overwrite state of the logical address indicates whether the data referred to by that IO feature was overwritten. The feature data may also describe hot-data regions; for example, if the data of a certain address range is frequently overwritten, that range can serve as feature data for training.
The algorithm used for machine learning in this embodiment may be a Bayesian algorithm or a k-nearest-neighbor algorithm, or any machine-learning algorithm provided by the prior art. The machine-learning algorithm in this step is therefore not unique and is not specifically limited here.
Optionally, this step may include:
Step 1: obtain the IO logical addresses within a preset time period and the IO overwrite states corresponding to those logical addresses;
Step 2: perform machine learning on the IO logical addresses and their corresponding IO overwrite states, per preset time period, to obtain the overwrite-probability prediction model.
In this optional scheme, the acquired IO logical addresses and IO overwrite states, together with the corresponding preset time periods, undergo machine-learning training to obtain the overwrite-probability prediction model. With this model, a time period and an IO logical address can be used to judge the probability that the data at that logical address is overwritten, i.e., the probability that it becomes garbage data.
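A minimal sketch of this optional scheme, using a simple counting estimator in place of the unspecified machine-learning algorithm (the embodiment permits Bayesian, k-nearest-neighbor, or any other algorithm). The periods, addresses, and samples below are hypothetical:

```python
from collections import defaultdict

# Estimate the probability that a logical address is overwritten within a
# given time period from observed (period, address, overwritten?) samples.
# This frequency counter is an illustrative stand-in for the trained model.

class OverwriteModel:
    def __init__(self):
        self.seen = defaultdict(int)         # (period, addr) -> observations
        self.overwritten = defaultdict(int)  # (period, addr) -> overwrites

    def train(self, samples):
        """samples: iterable of (period, logical_addr, was_overwritten)."""
        for period, addr, hit in samples:
            self.seen[(period, addr)] += 1
            if hit:
                self.overwritten[(period, addr)] += 1

    def predict(self, period, addr):
        """Estimated overwrite probability; 0.0 if never observed."""
        n = self.seen[(period, addr)]
        return self.overwritten[(period, addr)] / n if n else 0.0

model = OverwriteModel()
model.train([
    ("t0", 0x100, True), ("t0", 0x100, True), ("t0", 0x100, False),
    ("t0", 0x200, False),
])
print(model.predict("t0", 0x100))  # 2 of 3 observations were overwrites
```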
S102: perform prediction on each large block space according to the overwrite-probability prediction model, obtaining a plurality of overwrite probabilities;
Building on step S101, this step performs prediction on each large block space according to the overwrite-probability prediction model, obtaining a plurality of overwrite probabilities. That is, for each of the large block spaces, the probability that its valid data is overwritten — the probability that the valid data becomes garbage — is calculated. Based on these overwrite probabilities, garbage collection can be applied selectively to a subset of the large block spaces instead of treating all of them alike, avoiding wasted IO and improving IO efficiency.
Depending on how the overwrite-probability prediction model was trained, the prediction over a large block space may be performed from different angles. If the model was trained on the valid data of a large block space as a whole, this step performs a whole-block prediction: the feature data of the large block space is acquired directly, and the overwrite probability is computed by the model. If the model was trained on the IO features of the individual data blocks of valid data in a large block space, this step performs prediction on several data blocks within the space, obtains the corresponding data-block overwrite probabilities, and combines them to compute the overwrite probability of the large block space. The manner of predicting over a large block space in this step is therefore not unique and is not specifically limited here.
Specifically, data-block prediction may be performed on all the data blocks in the large block space or only on a subset of them; either way, several data-block overwrite probabilities are obtained. The overwrite probability of the large block space is then computed from these values: the data-block overwrite probabilities may be summed, with the sum taken as the space's overwrite probability; they may be averaged, with the mean taken as the overwrite probability; or a weighted average of them may be taken as the overwrite probability. The manner of data-block prediction here is therefore not unique and is not specifically limited here.
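The aggregation choices listed above — sum, mean, or weighted mean of the data-block overwrite probabilities — can be sketched as follows; the probabilities and weights are hypothetical:

```python
# Combine per-data-block overwrite probabilities into one value for the
# large block space, in any of the three ways the description mentions.

def aggregate(probs, mode="sum", weights=None):
    if mode == "sum":
        return sum(probs)
    if mode == "mean":
        return sum(probs) / len(probs)
    if mode == "weighted":
        return sum(p * w for p, w in zip(probs, weights)) / sum(weights)
    raise ValueError(f"unknown mode: {mode}")

block_probs = [0.7, 0.5, 0.9, 0.2]  # hypothetical data-block probabilities
print(aggregate(block_probs, "sum"))
print(aggregate(block_probs, "mean"))
print(aggregate(block_probs, "weighted", weights=[4, 3, 2, 1]))
```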
S103: mark each large block space whose overwrite probability is below a preset overwrite probability as a large block space to be reclaimed;
Building on step S102, this step marks each large block space whose overwrite probability is below the preset overwrite probability as a space to be reclaimed; that is, such a space is considered eligible for garbage collection. The preset overwrite probability may be set from the overwrite probabilities of all the large block spaces — for example, the median of all the probabilities, or a value below 30% — or it may be a received fixed value such as 35%, or a value that changes with the IO. The way the preset overwrite probability is set in this step is therefore not unique and is not specifically limited here.
After the qualifying large block spaces are marked as to-be-reclaimed in this step, multiple to-be-reclaimed spaces may be obtained, or just one; this is not limited and varies with the actual situation.
Suppose there are currently four large block spaces A, B, C, and D. Performing prediction on each of them yields the overwrite probabilities 70%, 50%, 90%, and 20%, respectively. The larger the probability, the more easily the space is overwritten — its valid data is more likely to be covered by new data — so garbage collection should target the spaces with the smaller overwrite probabilities, so that their valid data is put to good use. With a preset overwrite probability of 60%, the two spaces B and D therefore undergo garbage collection.
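The worked example above (spaces A through D with overwrite probabilities 70%, 50%, 90%, 20% and a preset threshold of 60%) can be sketched as:

```python
# Mark large block spaces whose predicted overwrite probability falls
# below the preset threshold as reclaim candidates: their valid data is
# unlikely to be overwritten soon, so migrating it is worthwhile.

def mark_for_gc(block_probs, threshold):
    """block_probs: dict name -> overwrite probability.
    Returns the names of blocks to reclaim (probability < threshold)."""
    return [name for name, p in block_probs.items() if p < threshold]

probs = {"A": 0.70, "B": 0.50, "C": 0.90, "D": 0.20}
print(mark_for_gc(probs, 0.60))  # ['B', 'D']
```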
S104: perform garbage collection on all large block spaces to be reclaimed.
Building on step S103, this step performs garbage collection on the large block spaces to be reclaimed. Any garbage collection process provided by the prior art may be used here, i.e., all the valid data in a designated large block space is migrated into a new space and stored centrally.
Once the overwrite-probability prediction model has been obtained, S102 to S104 can be executed on their own as a processing method.
In addition, to improve data utilization, S104 may also include:
reclaiming the valid data of to-be-reclaimed large block spaces whose overwrite probabilities differ by less than a preset amount into the same new large block space, so as to complete the garbage collection.
That is, when migrating valid data, valid data with similar overwrite probabilities can be moved into the same space, separating cold data from hot data and improving data-usage efficiency.
Here, overwrite probabilities differing by less than the preset amount means that the difference between the overwrite probabilities of any two to-be-reclaimed large block spaces is below a preset threshold; for example, if the difference between the two is below 5%, the valid data of both spaces can be stored in the same large block space.
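A possible sketch of this grouping step, under the assumption of a simple greedy pass over the reclaim candidates sorted by overwrite probability (the embodiment does not prescribe a particular grouping algorithm); the block names and probabilities are hypothetical:

```python
# Group to-be-reclaimed blocks whose overwrite probabilities lie within
# `delta` of each other, so that each group's valid data can be migrated
# into the same new large block space (cold/hot separation).

def group_by_probability(block_probs, delta=0.05):
    """block_probs: dict name -> overwrite probability.
    Greedily groups blocks sorted by probability: a block joins the
    current group if it is within delta of the group's first member."""
    groups = []
    for name, p in sorted(block_probs.items(), key=lambda kv: kv[1]):
        if groups and abs(p - groups[-1][0]) < delta:
            groups[-1][1].append(name)
        else:
            groups.append((p, [name]))
    return [members for _, members in groups]

reclaim = {"B": 0.50, "D": 0.20, "E": 0.53, "F": 0.22}
print(group_by_probability(reclaim))  # [['D', 'F'], ['B', 'E']]
```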
It should be noted that in this embodiment, while write requests are handled normally, machine learning can also continue on those write requests, so as to update the overwrite-probability prediction model and improve the accuracy of subsequent predictions.
In summary, this embodiment performs machine-learning training on the IO features gathered from the storage system and their corresponding IO overwrite states, obtaining an overwrite-probability prediction model that can predict how likely a given piece of IO data is to be overwritten. With this model, the probability that the data in a large block is overwritten — the probability that its valid data becomes garbage — can be predicted. Spaces whose data is highly likely to become garbage are not garbage-collected; spaces whose data is unlikely to become garbage are, i.e., garbage collection is applied to the spaces least likely to turn into garbage. This improves the storage system's IO performance, avoids the useless reads and writes caused by ineffective garbage collection, avoids wasted IO, and extends the service life of the SSDs.
In the previous embodiment, the prediction over multiple large block spaces using the overwrite-probability prediction model may use any prediction method provided by the prior art. To improve prediction accuracy, a prediction processing method is presented below on the basis of the previous embodiment.
Referring to Fig. 2, Fig. 2 is a flowchart of the prediction processing method of the garbage collection method provided by an embodiment of this application.
This method may include:
S201: perform probability prediction on all data blocks of a large block space according to the overwrite-probability prediction model, obtaining the plurality of data-block overwrite probabilities corresponding to the large block space;
This step performs probability prediction, according to the overwrite-probability prediction model, on each data block of valid data in the large block space, so that each large block space yields the data-block overwrite probabilities of all the data blocks it contains.
Specifically, predicting a data block according to the overwrite-probability prediction model means matching the acquired IO features of the block against the model. The matching may search the model for the closest recorded IO feature, take its corresponding IO overwrite state, and compute the block's overwrite probability from the degree of closeness; it may compute, within the model, the likelihood of the IO feature having a given overwrite state, which is the overwrite probability directly; or, when the model is a curve model, it may look up the point corresponding to the acquired IO feature on the curve to obtain the overwrite probability. The manner of prediction in this step is therefore not unique and is not specifically limited here.
S202: add all the data-block overwrite probabilities of the large block space to obtain the corresponding overwrite probability.
Building on step S201, this step adds up all the data-block overwrite probabilities of each large block space to obtain that space's overwrite probability.
It should be noted that this embodiment computes the overwrite probability of one large block space; when the overwrite probabilities of multiple large block spaces are to be computed, the steps of this embodiment need to be repeated to complete the prediction.
Suppose there are 4 large block spaces, namely A, B, C and D, each containing 4 data blocks, namely A1, A2, A3, A4, B1, B2, B3, B4, C1, C2, C3, C4, D1, D2, D3, D4. Prediction processing is performed by the probability prediction model on all data blocks of each large block space, that is, on the data blocks A1 through D4, so that the coverage probabilities of all data blocks in the large block spaces A, B, C and D are obtained in turn: the coverage probabilities of the four data blocks A1, A2, A3, A4 in A, of the four data blocks B1, B2, B3, B4 in B, of the four data blocks C1, C2, C3, C4 in C, and of the four data blocks D1, D2, D3, D4 in D. Adding the coverage probabilities of A1, A2, A3 and A4 yields the coverage probability of A; adding those of B1, B2, B3 and B4 yields the coverage probability of B; adding those of C1, C2, C3 and C4 yields the coverage probability of C; and adding those of D1, D2, D3 and D4 yields the coverage probability of D.
In this embodiment, the coverage probability of the large block space is obtained by computing the coverage probabilities of all its data blocks, which reduces the granularity at which the coverage probability is computed and improves the computational accuracy of the coverage probability.
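Steps S201 and S202 can be sketched as follows; the block layout and the constant stand-in predictor are hypothetical, not part of the patent:

```python
def big_block_cover_probability(data_blocks, predict):
    """S201 + S202 sketch: predict a coverage probability for every
    data block in a large block space, then add them to obtain the
    coverage probability of the space itself."""
    block_probs = [predict(block) for block in data_blocks]   # S201
    return sum(block_probs)                                   # S202

# Hypothetical example: 4 large block spaces A..D with 4 blocks each.
spaces = {
    "A": ["A1", "A2", "A3", "A4"],
    "B": ["B1", "B2", "B3", "B4"],
    "C": ["C1", "C2", "C3", "C4"],
    "D": ["D1", "D2", "D3", "D4"],
}
fake_predict = lambda block: 0.1     # stand-in for the trained model
probs = {name: big_block_cover_probability(blocks, fake_predict)
         for name, blocks in spaces.items()}
```

With the stand-in predictor every space sums to the same value; with a real model the per-space sums differ and become the basis for the reclamation decision.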
The previous embodiment computes the coverage probability of the large block space using the data block coverage probabilities of all data blocks. To increase the speed of computing the coverage probability, on the basis of the previous embodiment, this embodiment computes the coverage probability of the large block space using the data block coverage probabilities of only part of the data blocks.
Please refer to FIG. 3, which is a flowchart of another prediction processing method of the garbage collection method provided by an embodiment of the present application.
The method may include:
S301: selecting the data blocks to be predicted of the large block space according to a data block selection rule.
The data block selection rule in this step mainly selects part of the data blocks in the large block space as the data blocks to be predicted. Reducing the number of data blocks reduces the time for computing the coverage probability and increases the speed of the prediction processing.
The data block selection rule may be randomly choosing a preset number of data blocks from all data blocks, randomly choosing a preset proportion of the data blocks, or choosing data blocks at a preset interval as the data blocks to be predicted. The data block selection rule in this step is therefore not unique and is not specifically limited herein, as long as this step selects part of the data blocks from all data blocks as the data blocks to be predicted, thereby reducing the number of data blocks for which the coverage probability is computed.
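A minimal sketch of the three selection rules named above (the function and parameter names are my own, not the patent's):

```python
import random

def select_blocks(blocks, rule="count", count=2, ratio=0.5, stride=2):
    """Three hypothetical data block selection rules: a preset number
    chosen at random, a preset proportion chosen at random, or one
    block every preset interval."""
    if rule == "count":    # random preset number of blocks
        return random.sample(blocks, count)
    if rule == "ratio":    # random preset proportion of blocks
        return random.sample(blocks, max(1, int(len(blocks) * ratio)))
    if rule == "stride":   # one block per preset interval
        return blocks[::stride]
    raise ValueError(f"unknown selection rule: {rule}")

blocks = ["A1", "A2", "A3", "A4"]
picked = select_blocks(blocks, rule="stride", stride=2)  # → ["A1", "A3"]
```

Any of the three rules halves the prediction workload in this 4-block example; which rule fits best depends on how uniformly coverage behaviour is distributed within a large block space.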
Specifically, performing prediction processing on a data block according to the coverage probability prediction model in this step means matching the IO characteristics of the data block against the coverage probability prediction model. The matching may look up, according to the IO characteristics, the closest IO characteristics recorded in the model, obtain their corresponding IO coverage states, and compute the coverage probability of the data block from the degree of closeness of the IO characteristics. The matching may also compute, within the model, the likelihood of the IO coverage state for the given IO characteristics, that is, the coverage probability. Alternatively, when the coverage probability prediction model is a curve model, the matching may look up the point on the curve corresponding to the acquired IO characteristics and read off the corresponding coverage probability. The manner of the prediction processing in this step is therefore not unique and is not specifically limited herein.
S302: performing probability prediction processing on the data blocks to be predicted of the large block space according to the coverage probability prediction model, to obtain multiple data block coverage probabilities corresponding to the large block space.
On the basis of step S301, this step performs probability prediction processing on all data blocks to be predicted in the large block space to obtain multiple data block coverage probabilities.
S303: adding all data block coverage probabilities of each large block space to obtain the corresponding coverage probability.
On the basis of step S302, this step adds all the obtained data block coverage probabilities to obtain the coverage probability corresponding to the large block space.
It should be noted that this embodiment describes the method for computing the coverage probability of one large block space; when computing the coverage probabilities of multiple large block spaces, the steps of this embodiment need to be repeated several times to complete the prediction processing.
Suppose there are 4 large block spaces, namely A, B, C and D, each containing 4 data blocks, namely A1, A2, A3, A4, B1, B2, B3, B4, C1, C2, C3, C4, D1, D2, D3, D4. The selection rule may choose the first two data blocks of each large block space as the data blocks to be predicted, that is, A1, A2, B1, B2, C1, C2, D1 and D2. Prediction processing is performed by the probability prediction model on the data blocks to be predicted of each large block space, that is, on A1, A2, B1, B2, C1, C2, D1 and D2, so that the coverage probabilities of the data blocks to be predicted in the large block spaces A, B, C and D are obtained in turn: the coverage probabilities of the two data blocks A1, A2 in A, of the two data blocks B1, B2 in B, of the two data blocks C1, C2 in C, and of the two data blocks D1, D2 in D. Adding the coverage probabilities of A1 and A2 yields the coverage probability of A; adding those of B1 and B2 yields the coverage probability of B; adding those of C1 and C2 yields the coverage probability of C; and adding those of D1 and D2 yields the coverage probability of D.
Since only the coverage probabilities of part of the data blocks are computed in this embodiment, the number of data block computations is reduced, and the processing speed of the prediction processing is correspondingly improved.
A garbage collection system provided by the embodiments of the present application is introduced below; the garbage collection system described below and the garbage collection method described above may be mutually referred to.
Please refer to FIG. 4, which is a schematic structural diagram of a garbage collection system of a storage system provided by an embodiment of the present application.
The system may include:
a machine learning training module 100, configured to obtain IO characteristics and IO coverage states corresponding to the IO characteristics, and perform machine learning training processing according to the IO characteristics and the IO coverage states to obtain a coverage probability prediction model;
a coverage probability prediction module 200, configured to perform prediction processing on each large block space according to the coverage probability prediction model to obtain multiple coverage probabilities;
a to-be-reclaimed marking module 300, configured to mark a large block space whose coverage probability is less than a preset coverage probability as a large block space to be reclaimed; and
a garbage collection processing module 400, configured to perform garbage collection processing on all large block spaces to be reclaimed.
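The cooperation of the marking module 300 and the garbage collection processing module 400 can be sketched as follows (the module interfaces and the probability values are assumptions for illustration, not the patent's actual code):

```python
def mark_and_reclaim(space_probs, threshold, reclaim):
    """Modules 300 + 400 sketch: mark each large block space whose
    coverage probability is below the preset threshold as
    to-be-reclaimed, then run garbage collection on every marked
    space."""
    to_reclaim = [name for name, p in space_probs.items() if p < threshold]
    for name in to_reclaim:
        reclaim(name)      # garbage collection processing
    return to_reclaim

# Hypothetical coverage probabilities produced by prediction module 200.
probs = {"A": 0.4, "B": 1.9, "C": 0.2, "D": 3.1}
reclaimed = []
marked = mark_and_reclaim(probs, threshold=1.0, reclaim=reclaimed.append)
# marked → ["A", "C"]
```

Spaces with a low coverage probability hold data that is unlikely to be overwritten soon, so reclaiming them first minimizes wasted data migration.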
Optionally, the coverage probability prediction module 200 may include:
a data block probability prediction unit, configured to perform probability prediction processing on all data blocks of the large block space according to the coverage probability prediction model to obtain multiple data block coverage probabilities corresponding to the large block space; and
a coverage probability addition unit, configured to add all data block coverage probabilities of the large block space to obtain the corresponding coverage probability.
Optionally, the machine learning training module 100 may include:
an IO characteristic acquisition unit, configured to obtain IO logical addresses within a preset time period and the IO coverage states corresponding to the IO logical addresses; and
a training unit, configured to perform machine learning processing on the IO logical addresses and the corresponding IO coverage states according to the preset time period to obtain the coverage probability prediction model.
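As an illustrative sketch of these optional units — the empirical-frequency "model" is my own simplification, since the patent does not fix a particular learning algorithm — the acquisition unit collects (IO logical address, IO coverage state) pairs over a preset time period, and the training unit fits a model on them:

```python
from collections import defaultdict

def train_cover_model(samples):
    """Training unit sketch. `samples` is a list of
    (address_bucket, covered) pairs collected by the acquisition unit
    during the preset time period, where `covered` is 1 if the IO
    logical address was overwritten within the period, else 0.
    The 'model' is the empirical coverage frequency per address
    bucket -- a stand-in for real machine learning training."""
    hits = defaultdict(int)
    total = defaultdict(int)
    for bucket, covered in samples:
        hits[bucket] += covered
        total[bucket] += 1
    return {b: hits[b] / total[b] for b in total}

# Addresses in bucket 0 were overwritten 2 of 3 times; bucket 1 never.
samples = [(0, 1), (0, 1), (0, 0), (1, 0), (1, 0)]
model = train_cover_model(samples)
# model[0] → 2/3, model[1] → 0.0
```

A production system would replace the frequency table with a trained classifier or curve model, but the inputs and outputs keep the same shape: IO characteristics in, coverage probability out.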
An embodiment of the present application further provides a server, including:
a memory, configured to store a computer program; and
a processor, configured to implement the steps of the garbage collection method described in the above embodiments when executing the computer program.
An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the garbage collection method described in the above embodiments.
The computer-readable storage medium may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts among the embodiments may be referred to each other. For the device disclosed in the embodiments, since it corresponds to the method disclosed in the embodiments, the description is relatively simple, and the relevant parts may refer to the description of the method.
Those skilled in the art may further appreciate that the units and algorithm steps described in connection with the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of their functions. Whether these functions are implemented in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present application.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium well known in the art.
The garbage collection method, garbage collection system, server, and computer-readable storage medium of a storage system provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application; the description of the above embodiments is only intended to help understand the method of the present application and its core ideas. It should be noted that those of ordinary skill in the art may also make improvements and modifications to the present application without departing from the principles of the present application, and such improvements and modifications also fall within the protection scope of the claims of the present application.
Claims (10)
1. A garbage collection method for a storage system, characterized by comprising:
obtaining IO characteristics and IO coverage states corresponding to the IO characteristics, and performing machine learning training processing according to the IO characteristics and the IO coverage states to obtain a coverage probability prediction model;
performing prediction processing on each large block space according to the coverage probability prediction model to obtain multiple coverage probabilities;
marking a large block space whose coverage probability is less than a preset coverage probability as a large block space to be reclaimed; and
performing garbage collection processing on all large block spaces to be reclaimed.
2. The garbage collection method according to claim 1, characterized in that performing prediction processing on the large block space according to the coverage probability prediction model to obtain the corresponding coverage probability comprises:
performing probability prediction processing on all data blocks of the large block space according to the coverage probability prediction model to obtain multiple data block coverage probabilities corresponding to the large block space; and
adding all data block coverage probabilities of the large block space to obtain the corresponding coverage probability.
3. The garbage collection method according to claim 1, characterized in that performing prediction processing on the large block space according to the coverage probability prediction model to obtain the corresponding coverage probability comprises:
selecting the data blocks to be predicted of the large block space according to a data block selection rule;
performing probability prediction processing on all the data blocks to be predicted of the large block space according to the coverage probability prediction model to obtain multiple data block coverage probabilities corresponding to the large block space; and
adding all data block coverage probabilities of the large block space to obtain the corresponding coverage probability.
4. The garbage collection method according to claim 1, characterized in that obtaining the IO characteristics and the IO coverage states corresponding to the IO characteristics, and performing machine learning training processing according to the IO characteristics and the IO coverage states to obtain the coverage probability prediction model, comprises:
obtaining IO logical addresses within a preset time period and the IO coverage states corresponding to the IO logical addresses; and
performing machine learning processing on the IO logical addresses and the corresponding IO coverage states according to the preset time period to obtain the coverage probability prediction model.
5. The garbage collection method according to any one of claims 1 to 4, characterized in that performing garbage collection processing on all the large block spaces to be reclaimed comprises:
reclaiming the valid data of those large block spaces to be reclaimed whose coverage probabilities differ by less than a preset variation into the same new large block space, so as to complete the garbage collection processing.
6. A garbage collection system for a storage system, characterized by comprising:
a machine learning training module, configured to obtain IO characteristics and IO coverage states corresponding to the IO characteristics, and perform machine learning training processing according to the IO characteristics and the IO coverage states to obtain a coverage probability prediction model;
a coverage probability prediction module, configured to perform prediction processing on each large block space according to the coverage probability prediction model to obtain multiple coverage probabilities;
a to-be-reclaimed marking module, configured to mark a large block space whose coverage probability is less than a preset coverage probability as a large block space to be reclaimed; and
a garbage collection processing module, configured to perform garbage collection processing on all large block spaces to be reclaimed.
7. The garbage collection system according to claim 6, characterized in that the coverage probability prediction module comprises:
a data block probability prediction unit, configured to perform probability prediction processing on all data blocks of the large block space according to the coverage probability prediction model to obtain multiple data block coverage probabilities corresponding to the large block space; and
a coverage probability addition unit, configured to add all data block coverage probabilities of the large block space to obtain the corresponding coverage probability.
8. The garbage collection system according to claim 6, characterized in that the machine learning training module comprises:
an IO characteristic acquisition unit, configured to obtain IO logical addresses within a preset time period and the IO coverage states corresponding to the IO logical addresses; and
a training unit, configured to perform machine learning processing on the IO logical addresses and the corresponding IO coverage states according to the preset time period to obtain the coverage probability prediction model.
9. A server, characterized by comprising:
a memory, configured to store a computer program; and
a processor, configured to implement the steps of the garbage collection method according to any one of claims 1 to 5 when executing the computer program.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when executed by a processor, the computer program implements the steps of the garbage collection method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811087264.XA CN109284233B (en) | 2018-09-18 | 2018-09-18 | Garbage recovery method of storage system and related device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811087264.XA CN109284233B (en) | 2018-09-18 | 2018-09-18 | Garbage recovery method of storage system and related device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109284233A true CN109284233A (en) | 2019-01-29 |
CN109284233B CN109284233B (en) | 2022-02-18 |
Family
ID=65181006
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811087264.XA Active CN109284233B (en) | 2018-09-18 | 2018-09-18 | Garbage recovery method of storage system and related device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109284233B (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111158598A (en) * | 2019-12-29 | 2020-05-15 | 北京浪潮数据技术有限公司 | Garbage recycling method, device, equipment and medium for full-flash disk array |
CN111913649A (en) * | 2019-05-09 | 2020-11-10 | 深圳大普微电子科技有限公司 | Data processing method and device for solid state disk |
CN112347000A (en) * | 2019-08-08 | 2021-02-09 | 爱思开海力士有限公司 | Data storage device, method of operating the same, and controller of the data storage device |
WO2022017002A1 (en) * | 2020-07-22 | 2022-01-27 | 华为技术有限公司 | Garbage collection method and device |
WO2022171001A1 (en) * | 2021-02-09 | 2022-08-18 | 山东英信计算机技术有限公司 | Gc performance prediction method and system for storage system, medium, and device |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103412826A (en) * | 2013-07-18 | 2013-11-27 | 记忆科技(深圳)有限公司 | Garbage collection method and system of solid state disk |
US20130346720A1 (en) * | 2011-08-11 | 2013-12-26 | Pure Storage, Inc. | Garbage collection in a storage system |
CN103577338A (en) * | 2013-11-14 | 2014-02-12 | 华为技术有限公司 | Junk data recycling method and storage device |
CN104216665A (en) * | 2014-09-01 | 2014-12-17 | 上海新储集成电路有限公司 | Storage management method of multi-layer unit solid state disk |
US9141457B1 (en) * | 2013-09-25 | 2015-09-22 | Emc Corporation | System and method for predicting multiple-disk failures |
CN105204783A (en) * | 2015-10-13 | 2015-12-30 | 华中科技大学 | Solid-state disk garbage recycling method based on data life cycle |
CN106874213A (en) * | 2017-01-12 | 2017-06-20 | 杭州电子科技大学 | A kind of solid state hard disc dsc data recognition methods for merging various machine learning algorithms |
CN107102954A (en) * | 2017-04-27 | 2017-08-29 | 华中科技大学 | A kind of solid-state storage grading management method and system based on failure probability |
CN107479825A (en) * | 2017-06-30 | 2017-12-15 | 华为技术有限公司 | A kind of storage system, solid state hard disc and date storage method |
CN108241471A (en) * | 2017-11-29 | 2018-07-03 | 深圳忆联信息系统有限公司 | A kind of method for promoting solid state disk performance |
-
2018
- 2018-09-18 CN CN201811087264.XA patent/CN109284233B/en active Active
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130346720A1 (en) * | 2011-08-11 | 2013-12-26 | Pure Storage, Inc. | Garbage collection in a storage system |
CN103412826A (en) * | 2013-07-18 | 2013-11-27 | 记忆科技(深圳)有限公司 | Garbage collection method and system of solid state disk |
US9141457B1 (en) * | 2013-09-25 | 2015-09-22 | Emc Corporation | System and method for predicting multiple-disk failures |
CN103577338A (en) * | 2013-11-14 | 2014-02-12 | 华为技术有限公司 | Junk data recycling method and storage device |
CN104216665A (en) * | 2014-09-01 | 2014-12-17 | 上海新储集成电路有限公司 | Storage management method of multi-layer unit solid state disk |
CN105204783A (en) * | 2015-10-13 | 2015-12-30 | 华中科技大学 | Solid-state disk garbage recycling method based on data life cycle |
CN106874213A (en) * | 2017-01-12 | 2017-06-20 | 杭州电子科技大学 | A kind of solid state hard disc dsc data recognition methods for merging various machine learning algorithms |
CN107102954A (en) * | 2017-04-27 | 2017-08-29 | 华中科技大学 | A kind of solid-state storage grading management method and system based on failure probability |
CN107479825A (en) * | 2017-06-30 | 2017-12-15 | 华为技术有限公司 | A kind of storage system, solid state hard disc and date storage method |
CN108241471A (en) * | 2017-11-29 | 2018-07-03 | 深圳忆联信息系统有限公司 | A kind of method for promoting solid state disk performance |
Non-Patent Citations (1)
Title |
---|
CHEN Jinzhong: "A flash translation layer strategy based on page-write correlation", Journal on Communications (《通信学报》) * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111913649A (en) * | 2019-05-09 | 2020-11-10 | 深圳大普微电子科技有限公司 | Data processing method and device for solid state disk |
CN111913649B (en) * | 2019-05-09 | 2022-05-06 | 深圳大普微电子科技有限公司 | Data processing method and device for solid state disk |
CN112347000A (en) * | 2019-08-08 | 2021-02-09 | 爱思开海力士有限公司 | Data storage device, method of operating the same, and controller of the data storage device |
CN111158598A (en) * | 2019-12-29 | 2020-05-15 | 北京浪潮数据技术有限公司 | Garbage recycling method, device, equipment and medium for full-flash disk array |
CN111158598B (en) * | 2019-12-29 | 2022-03-22 | 北京浪潮数据技术有限公司 | Garbage recycling method, device, equipment and medium for full-flash disk array |
WO2022017002A1 (en) * | 2020-07-22 | 2022-01-27 | 华为技术有限公司 | Garbage collection method and device |
WO2022171001A1 (en) * | 2021-02-09 | 2022-08-18 | 山东英信计算机技术有限公司 | Gc performance prediction method and system for storage system, medium, and device |
Also Published As
Publication number | Publication date |
---|---|
CN109284233B (en) | 2022-02-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109284233A (en) | A kind of rubbish recovering method and relevant apparatus of storage system | |
Hu et al. | Write amplification analysis in flash-based solid state drives | |
CN103577338B (en) | A kind of method reclaiming junk data and storage device | |
US6871272B2 (en) | Data sorting in information storage systems | |
CN103246613B (en) | Buffer storage and the data cached acquisition methods for buffer storage | |
EP1936632A1 (en) | Method and apparatus for detecting static data area, wear-leveling, and merging data units in nonvolatile data storage device | |
CN110673789B (en) | Metadata storage management method, device, equipment and storage medium of solid state disk | |
CN102576330A (en) | Memory system having persistent garbage collection | |
CN102169429A (en) | Prefetch unit, data prefetch method and microprocessor | |
CN109671458A (en) | The method of management flash memory module and relevant flash controller | |
CN106293497B (en) | Watt record filesystem-aware in junk data recovery method and device | |
CN110674056B (en) | Garbage recovery method and device | |
US11204697B2 (en) | Wear leveling in solid state devices | |
CN115756312A (en) | Data access system, data access method, and storage medium | |
CN115951839A (en) | Data writing method and device for partition name space solid state disk and electronic equipment | |
CN103150245A (en) | Method for determining visiting characteristic of data entityand store controller | |
CN110795363A (en) | Hot page prediction method and page scheduling method for storage medium | |
Lin et al. | Dynamic garbage collection scheme based on past update times for NAND flash-based consumer electronics | |
Jung et al. | Fass: A flash-aware swap system | |
CN110532195A (en) | The workload sub-clustering of storage system and the method for executing it | |
CN113253926A (en) | Memory internal index construction method for improving query and memory performance of novel memory | |
Lin et al. | Flash-aware linux swap system for portable consumer electronics | |
WO2023083454A1 (en) | Data compression and deduplication aware tiering in a storage system | |
KR101157763B1 (en) | Variable space page mapping method and apparatus for flash memory device with trim command processing | |
KR101022001B1 (en) | Flash memory system and method for managing flash memory |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |