CN111880748A - Wear balancing method for solid state disk of distributed storage system

Info

Publication number: CN111880748A
Application number: CN202010764608.7A
Authority: CN
Inventors: 呼延晓楠; 田鹏; 陕振; 袁晓光
Original assignee: Beijing Institute of Computer Technology and Applications
Current assignee: Beijing Institute of Computer Technology and Applications
Priority date: 2020-07-30
Filing date: 2020-07-30
Publication date: 2020-11-03
Anticipated expiration: 2040-07-30
Also published as: CN111880748B

Abstract

The invention relates to a wear leveling method for solid state disks of a distributed storage system, comprising the following steps. Step one: collecting the storage media, recording the current weight information of the storage cluster as the initial weight, and grouping the media by fault domain so that storage media under different fault domains fall into different groups. Step two: recording the weight information of the current storage media; generating the wear degree e of each storage medium from the write amount m, the number of bad blocks b, the read error retry rate r and the official write life L of the storage medium, as found in its SMART information; and then determining the measures to be taken by the storage cluster according to the distribution of the wear degree e between 0 and 1. The invention preserves the performance of the storage cluster when the cluster is not idle, and also ensures its stability and controllability.

Description

Wear balancing method for solid state disk of distributed storage system

Technical Field

The invention relates to the field of computing and storage, and in particular to a wear leveling method for solid state disks of a distributed storage system.

Background

With the development of host, disk and network technologies, data storage modes and architectures keep changing: local storage has evolved into network storage and single-machine storage into distributed storage, and this brings new problems. The read and write volume of an enterprise's distributed storage is very large, which naturally places extremely high demands on the reliability and consistency of the underlying physical storage devices. As network performance improves and users demand higher data throughput, more and more cloud storage systems store files on solid state disks to raise file read and write speed and shorten response time. Under the wear mechanism of existing distributed storage systems, once a solid state disk has been in a storage cluster for some time, its wear degree converges with that of the other devices in the cluster. When the solid state disks as a whole reach the end of their service life, it is impossible to know which of them will fail, and this lack of control leads to higher maintenance costs.

Wear leveling is one of the necessary features of cloud storage. Its purpose is to bring the relative lifetimes of the storage devices to an even level, which is achieved by managing how the write data stream is distributed. Reverse wear leveling, however, is crucial to the health of the storage cluster and the stability of its performance.

To realize wear leveling and reverse wear leveling in a storage cluster, the data of each hard disk in the cluster is monitored and a health degree is calculated for each hard disk, yielding a health ranking. The expected distribution of data writes is calculated from this ranking, and the write weights of the hard disks are then changed so that their health degree is altered at the data-write level. For reverse wear leveling, the devices that should be worn first are screened out according to the overall health degree and data is written to them preferentially, so that the storage media of the cluster wear out in a planned way.

Disclosure of Invention

The present invention is directed to a method for wear leveling of solid state disks in a distributed storage system, which is used to solve the above-mentioned problems of the prior art.

The invention discloses a wear leveling method for solid state disks of a distributed storage system, comprising the following steps. Step one: collecting the storage media, recording the current weight information of the storage cluster as the initial weight, and grouping the media by fault domain so that storage media under different fault domains fall into different groups. Step two: recording the weight information of the current storage media; generating the wear degree e of each storage medium from the write amount m, the number of bad blocks b, the read error retry rate r and the official write life L of the storage medium, as found in its SMART information; and then determining the measures to be taken by the storage cluster according to the distribution of the wear degree e between 0 and 1.

In an embodiment of the wear leveling method for solid state disks of a distributed storage system according to the present invention, the SMART information comprises the write amount, the number of bad blocks and the read error retry rate.

In an embodiment of the wear leveling method for solid state disks of a distributed storage system according to the present invention, the SMART information further includes the official write life of the storage medium.

In an embodiment of the wear leveling method for solid state disks of a distributed storage system according to the present invention, the operating condition of the storage cluster is continuously monitored.

In an embodiment of the wear leveling method for solid state disks of a distributed storage system according to the present invention, the wear degree e of the storage medium is generated by e = 0.9 × (L/m) + 0.05 × (b/10000) + 0.05 × r.
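As an illustration only, the formula can be transcribed term by term into a small Python function. The function name and the zero-write guard are assumptions of this sketch, as is the premise that m and L are expressed in the same unit; the text does not state units, and the ratio is kept exactly as printed (L/m) with no clamping to the interval between 0 and 1.

```python
def wear_degree(m: float, b: int, r: float, L: float) -> float:
    """Wear degree e = 0.9*(L/m) + 0.05*(b/10000) + 0.05*r, as printed in the text.

    m: write amount, b: number of bad blocks, r: read error retry rate,
    L: official write life. Assumption: m and L share one unit.
    """
    if m <= 0:
        # Assumption of this sketch: the ratio is undefined without recorded writes.
        raise ValueError("write amount m must be positive")
    return 0.9 * (L / m) + 0.05 * (b / 10000) + 0.05 * r


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the source.
    print(wear_degree(m=100.0, b=2000, r=0.2, L=50.0))
```

The three weights (0.9, 0.05, 0.05) sum to 1, so the write-amount term dominates and the bad-block and retry terms act as small corrections.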

According to an embodiment of the wear leveling method for solid state disks of a distributed storage system, for the wear degree e: if the mean square error of the wear degrees of the storage media of the storage cluster is less than 0.025 and more than 95% of the storage media lie within two standard deviations of the mean wear degree, the storage cluster is considered healthy; the weight information of the storage cluster at that moment is recorded and, if it differs from the previously stored initial weight, the storage cluster is restored to the initial weight through its weight adjustment command. If the mean square error of the wear degrees is greater than 0.025 and fewer than 95% of the storage media lie within two standard deviations of the mean wear degree, data writing across the storage media of the cluster is considered unbalanced and wear leveling is started. If the wear degrees of the storage media of the storage cluster are all above 0.75, reverse wear leveling is performed.

According to an embodiment of the wear leveling method for solid state disks of a distributed storage system, the wear leveling includes: recording the storage media with a wear degree larger than 95% smaller than the average wear degree among the storage media; sorting them by wear degree from largest to smallest; reducing the weight value of the first 50% of these storage media to seventy percent of the original weight value; calculating the new weight values and submitting them to the storage cluster; and, when the storage system operates normally with no data migration in progress, considering the storage system healthy and writing the weights back to the storage cluster using the storage system's own command.

According to an embodiment of the wear leveling method for solid state disks of a distributed storage system, the reverse wear leveling includes: sorting all storage media in the storage cluster by wear degree from high to low; increasing the weight of the first 20 percent of the storage media to 200 percent of the original weight; calculating the new weight values and submitting them to the storage cluster; when the storage system operates normally with no data migration in progress, considering the storage system healthy and writing the weights back to the storage cluster using the storage system's own command; having the program monitor the wear degree of the selected storage media at all times; and, when the wear degree of a selected storage medium reaches 0.98, logging its weight at that moment and setting its weight value to 0 using the storage system's own command.

While the storage cluster works normally, the invention establishes a monitoring system for the storage media and forms an automatic weight-balancing mechanism that does not affect the performance of the storage cluster, thereby automating wear leveling.

Drawings

FIG. 1 is a flow chart of wear leveling;

FIG. 2 is a flow chart of reverse wear leveling.

Detailed Description

In order to make the objects, contents, and advantages of the present invention clearer, the following detailed description of the embodiments of the present invention will be made in conjunction with the accompanying drawings and examples.

FIG. 1 is a flow chart of wear leveling and FIG. 2 is a flow chart of reverse wear leveling. As shown in FIG. 1 and FIG. 2, the present invention provides a wear leveling method for solid state disks of a distributed storage system, including the following steps:

Step one: collecting detailed information about the storage media and calculating the specific health degree of each storage medium through a model. From the SMART information, the average failure time of the hard disk, the total write amount, the number of errors and the total capacity need to be acquired; in addition, the official write life of the storage medium is to be obtained.

After obtaining the data, the health degree of each hard disk is calculated, and the health degrees of the hard disks are classified and arranged according to fault domains.
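The bookkeeping in step one (saving the initial weights and arranging the media by fault domain) might be sketched as follows. The `Medium` record, its field names and the sample identifiers are hypothetical and introduced only for illustration; how the real cluster exposes weights and fault domains is not specified in the text.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Medium:
    """Hypothetical record for one storage medium."""
    name: str
    fault_domain: str
    weight: float


def snapshot_and_group(
    media: Iterable[Medium],
) -> Tuple[Dict[str, float], Dict[str, List[str]]]:
    """Return (initial weights by name, media names grouped by fault domain)."""
    media = list(media)
    # Saved so the cluster can later be restored to its initial weights.
    initial_weights = {m.name: m.weight for m in media}
    groups: Dict[str, List[str]] = defaultdict(list)
    for m in media:
        # Media under different fault domains land in different groups.
        groups[m.fault_domain].append(m.name)
    return initial_weights, dict(groups)


if __name__ == "__main__":
    sample = [
        Medium("ssd-0", "rack-a", 1.0),
        Medium("ssd-1", "rack-b", 1.0),
        Medium("ssd-2", "rack-a", 0.8),
    ]
    weights, groups = snapshot_and_group(sample)
    print(weights)
    print(groups)
```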

Step two: recording the weight information of the current storage media; generating the wear degree e of each storage medium from the write amount m, the number of bad blocks b, the read error retry rate r and the official write life L of the storage medium, as found in its SMART information; and then determining the measures to be taken by the storage cluster according to the distribution of the wear degree e between 0 and 1.

Step three: monitoring the state of the storage cluster; when the overall wear state deviates, starting a temporary wear leveling mechanism and changing the weights of the storage media in the deviating state accordingly; continuing to monitor the state of the storage cluster; and, once the storage cluster has returned to normal, adjusting the weights back to their normal values.

For the wear degree e: if the mean square error of the wear degrees of the storage media of the storage cluster is less than 0.025 and more than 95% of the storage media lie within two standard deviations of the mean wear degree, the storage cluster is considered healthy; the weight information of the storage cluster at that moment is recorded and, if it differs from the previously stored initial weight, the storage cluster is restored to the initial weight through its weight adjustment command. If the mean square error of the wear degrees is greater than 0.025 and fewer than 95% of the storage media lie within two standard deviations of the mean wear degree, data writing across the storage media of the cluster is considered unbalanced and wear leveling is started. If the wear degrees of the storage media of the storage cluster are all above 0.75, reverse wear leveling is performed.
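The three-way decision on the wear degree distribution can be sketched in Python as below. Several points are assumptions of this sketch rather than statements from the text: "mean square error" is read as the population variance about the mean, the two conditions in each branch are combined with "and", the "all above 0.75" test is checked first, and any distribution that matches none of the three cases is reported as "unspecified". The function name and return labels are hypothetical.

```python
from statistics import fmean, pstdev, pvariance
from typing import Sequence


def classify_cluster(wear: Sequence[float]) -> str:
    """Map a list of per-medium wear degrees to the measure to take."""
    if not wear:
        raise ValueError("at least one wear degree is required")
    mean = fmean(wear)
    # Assumption: "mean square error" is the population variance about the mean.
    mse = pvariance(wear, mean)
    sd = pstdev(wear, mean)
    # Share of media whose wear degree lies within two standard deviations.
    within = sum(1 for e in wear if abs(e - mean) <= 2 * sd) / len(wear)

    if all(e > 0.75 for e in wear):
        return "reverse_wear_leveling"
    if mse < 0.025 and within > 0.95:
        return "healthy"
    if mse > 0.025 and within < 0.95:
        return "wear_leveling"
    # The text does not say what to do in the remaining cases.
    return "unspecified"


if __name__ == "__main__":
    print(classify_cluster([0.5] * 20))
    print(classify_cluster([0.1] * 9 + [0.9]))
    print(classify_cluster([0.8, 0.9, 0.85]))
```

With the illustrative inputs above, a uniform cluster is classified as healthy, a cluster with one heavily worn outlier triggers wear leveling, and a cluster whose media are all above 0.75 triggers reverse wear leveling.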

The wear leveling includes: recording the storage media with a wear degree larger than 95% smaller than the average wear degree among the storage media; sorting them by wear degree from largest to smallest; reducing the weight value of the first 50% of these storage media to seventy percent of the original weight value; calculating the new weight values and submitting them to the storage cluster; and, when the storage system operates normally with no data migration in progress, considering the storage system healthy and writing the weights back to the storage cluster using the storage system's own command.
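The weight calculation in wear leveling might look like the sketch below. The selection criterion for the recorded media is hard to pin down from the wording, so the sketch does not guess at it: the caller supplies it as a predicate over (wear degree, mean wear degree). The function name, the predicate parameter and the choice to round the "first 50%" up are all assumptions; submitting the weights to the cluster is left out because the text does not name the command.

```python
from typing import Callable, Dict


def wear_leveling_weights(
    wear: Dict[str, float],
    weights: Dict[str, float],
    is_recorded: Callable[[float, float], bool],
) -> Dict[str, float]:
    """Return new weights with the most worn half of the recorded media at 70%."""
    mean = sum(wear.values()) / len(wear)
    # Record the media matching the caller-supplied criterion.
    recorded = [name for name, e in wear.items() if is_recorded(e, mean)]
    # Sort from largest to smallest wear degree.
    recorded.sort(key=lambda name: wear[name], reverse=True)
    # First 50% of the recorded media (assumption: rounded up).
    top = recorded[: (len(recorded) + 1) // 2]
    new_weights = dict(weights)
    for name in top:
        new_weights[name] = weights[name] * 0.7
    return new_weights


if __name__ == "__main__":
    wear = {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.2}
    weights = {name: 1.0 for name in wear}
    # Hypothetical criterion, for illustration only: wear above the mean.
    print(wear_leveling_weights(wear, weights, lambda e, mean: e > mean))
```

Lowering the weight of the most worn media steers new writes toward the less worn ones, which is how the cluster's wear degrees are pulled back together.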

The reverse wear leveling includes: sorting all storage media in the storage cluster by wear degree from high to low; increasing the weight of the first 20 percent of the storage media to 200 percent of the original weight; calculating the new weight values and submitting them to the storage cluster; when the storage system operates normally with no data migration in progress, considering the storage system healthy and writing the weights back to the storage cluster using the storage system's own command; having the program monitor the wear degree of the selected storage media at all times; and, when the wear degree of a selected storage medium reaches 0.98, logging its weight at that moment and setting its weight value to 0 using the storage system's own command.
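The two phases of reverse wear leveling (doubling the weight of the most worn 20 percent, then retiring a selected medium at 0.98) can be sketched as two small functions. The function names, the returned log format and the choice to round the 20 percent up are assumptions of this sketch; the cluster's own weight command is not named in the text and is therefore not modeled.

```python
from typing import Dict, List, Tuple


def reverse_wear_leveling_weights(
    wear: Dict[str, float], weights: Dict[str, float]
) -> Tuple[Dict[str, float], List[str]]:
    """Double the weight of the most worn 20% of media; return weights and picks."""
    ranked = sorted(wear, key=lambda name: wear[name], reverse=True)
    # First 20 percent of all media (assumption: rounded up).
    selected = ranked[: (len(ranked) + 4) // 5]
    new_weights = dict(weights)
    for name in selected:
        new_weights[name] = weights[name] * 2.0
    return new_weights, selected


def retire_worn_media(
    selected: List[str],
    wear: Dict[str, float],
    weights: Dict[str, float],
    limit: float = 0.98,
) -> Tuple[Dict[str, float], List[Tuple[str, float]]]:
    """Set the weight of selected media at or past the limit to 0, logging old weights."""
    new_weights = dict(weights)
    log: List[Tuple[str, float]] = []
    for name in selected:
        if wear[name] >= limit:
            log.append((name, weights[name]))  # weight at the moment of retirement
            new_weights[name] = 0.0
    return new_weights, log


if __name__ == "__main__":
    wear = {"a": 0.95, "b": 0.90, "c": 0.85, "d": 0.80, "e": 0.78}
    weights = {name: 1.0 for name in wear}
    boosted, picked = reverse_wear_leveling_weights(wear, weights)
    print(boosted, picked)
    wear["a"] = 0.99  # illustrative later reading
    print(retire_worn_media(picked, wear, boosted))
```

Raising the weight concentrates writes on the media that are already closest to end of life, so they wear out first and in a known order instead of the whole cluster failing together.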

Wear leveling and reverse wear leveling are the corrections applied to the storage cluster in step two. Step three is a temporary corrective measure: it addresses a sudden short-term surge in data volume, or an imbalance between newly added storage media and the existing storage cluster.

The invention provides a method for correcting the health degree of a storage cluster. While keeping the storage cluster healthy, it uses the cluster's idle time to correct the weights of the storage media and monitors them in real time, which preserves the performance of the storage cluster when it is not idle and also ensures its stability and controllability. With wear leveling in place, the health degree of the whole storage cluster does not deviate excessively when the administrator changes the storage structure, and the method adjusts automatically to the new storage architecture.

The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several modifications and variations without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as falling within the protection scope of the present invention.

Claims

1. A method for wear leveling of a solid state disk for a distributed storage system, comprising:

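step one: collecting the storage media, recording the current weight information of the storage cluster as the initial weight, and grouping the media by fault domain so that storage media under different fault domains fall into different groups;

step two: recording the weight information of the current storage media; generating the wear degree e of each storage medium from the write amount m, the number of bad blocks b, the read error retry rate r and the official write life L of the storage medium, as found in its SMART information; and then determining the measures to be taken by the storage cluster according to the distribution of the wear degree e between 0 and 1.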

2. The method for wear leveling of the solid state disks of the distributed storage system according to claim 1, wherein the SMART information comprises the write amount, the number of bad blocks and the read error retry rate.

3. The method for wear leveling of the solid state disk for the distributed storage system according to claim 2, wherein the SMART information further includes the official write life of the storage medium.

4. The method for wear leveling of the solid state disks of the distributed storage system according to claim 1, further comprising: continuously monitoring the operating condition of the storage cluster.

5. The method of claim 1, wherein the wear degree e of the storage medium is generated by e = 0.9 × (L/m) + 0.05 × (b/10000) + 0.05 × r.

6. The method for wear leveling of solid state disks for distributed storage system according to claim 1, wherein for a degree of wear e:

if the mean square error of the wear degrees of the storage media of the storage cluster is less than 0.025 and more than 95% of the storage media lie within two standard deviations of the mean wear degree, the storage cluster is considered healthy, the weight information of the storage cluster at that moment is recorded and, if it differs from the previously stored initial weight, the storage cluster is restored to the initial weight through its weight adjustment command;

if the mean square error of the wear degrees of the storage media of the storage cluster is greater than 0.025 and fewer than 95% of the storage media lie within two standard deviations of the mean wear degree, data writing across the storage media of the storage cluster is considered unbalanced and wear leveling is started;

and if the wear degrees of the storage media of the storage cluster are all above 0.75, reverse wear leveling is performed.

7. The method for wear leveling of the solid state disks of the distributed storage system according to claim 6, wherein the wear leveling comprises: recording the storage media with a wear degree larger than 95% smaller than the average wear degree among the storage media; sorting them by wear degree from largest to smallest; reducing the weight value of the first 50% of these storage media to seventy percent of the original weight value; calculating the new weight values and submitting them to the storage cluster; and, when the storage system operates normally with no data migration in progress, considering the storage system healthy and writing the weights back to the storage cluster using the storage system's own command.

8. The method for wear leveling of the solid state disks of the distributed storage system according to claim 6, wherein the reverse wear leveling comprises: sorting all storage media in the storage cluster by wear degree from high to low; increasing the weight of the first 20 percent of the storage media to 200 percent of the original weight; calculating the new weight values and submitting them to the storage cluster; when the storage system operates normally with no data migration in progress, considering the storage system healthy and writing the weights back to the storage cluster using the storage system's own command; having the program monitor the wear degree of the selected storage media at all times; and, when the wear degree of a selected storage medium reaches 0.98, logging its weight at that moment and setting its weight value to 0 using the storage system's own command.