CN112162708A - Hard disk array scanning frequency calculation method - Google Patents
- Publication number
- CN112162708A (application CN202011103972.5A)
- Authority
- CN
- China
- Prior art keywords
- hard disk
- time
- raid
- scanning
- disk
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0683—Plurality of storage devices
- G06F3/0689—Disk arrays, e.g. RAID, JBOD
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0614—Improving the reliability of storage systems
- G06F3/0616—Improving the reliability of storage systems in relation to life time, e.g. increasing Mean Time Between Failures [MTBF]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0653—Monitoring storage devices or systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention discloses a method for calculating the scanning frequency of a hard disk array, comprising the following steps: determining the probability distribution function of the hard disk failure time; when a hard disk fails, taking the data loss cost of the storage system as proportional to the time elapsed from the onset of the failure to the current time, with proportionality coefficient Lc representing the data loss cost per unit time; determining the mean time between failures (MTBF) of the hard disk; determining the mathematical expectation of the cost paid at time t when the hard disk is scanned; obtaining the RAID bit error rate for a RAID array of N hard disks; obtaining the failure rate of a single RAID array group and the failure rate of the whole storage system; and determining the number of scans for a storage system that tolerates single-disk errors, so that the scanning frequency requirement of the hard disk array system is obtained with minimum system cost as the objective. The invention takes into account all the factors through which hard disk scanning affects the system and quantifies their degree of influence; with optimal cost as the objective, it determines a numerical scanning frequency from the evaluated indexes of different hard disk systems.
Description
Technical Field
The invention belongs to the technical field of data storage, and relates to a hard disk array scanning frequency calculation method based on scanning cost.
Background
To improve hard disk reliability, the probability of a hard disk device error can be reduced by resolving potential sector errors on the disk. A common method is hard disk scan detection, which reads the hard disk data by issuing specific commands outside normal read-write access. When the read IO finds a bad track whose data cannot be read, a Remap is performed and the data is rewritten to a free area of the hard disk, thereby protecting the data. However, as the capacity of disk array storage systems continues to grow, scanning too often degrades storage system performance, while scanning too rarely reduces reliability.
To minimize the impact on front-end users, the scan detection operation is typically run as a low-priority process. Each scan is still very short relative to the life cycle of the storage system, and many scans may be required over the life of a hard disk. Hard disk scan detection is generally triggered in one of two ways: at regular intervals or at random. Once the effects of scan detection on system reliability, performance, and energy consumption are taken into account, choosing an appropriate scan period for the hard disk array becomes a problem worth studying.
Disclosure of Invention
Objects of the invention
The purpose of the invention is to provide a method for calculating the scanning frequency of a hard disk array based on scanning cost, in which every factor involved in the scan detection operation is quantified as a cost.
(II) technical scheme
In order to solve the above technical problem, the present invention provides a method for calculating the scan frequency of a hard disk array, which is set based on the following points:
(1) The probability distribution function of the hard disk failure time is F(t), the probability density function is f(t), and the service life of the hard disk is T; then
$$F(t) = P(T \le t) = \int_0^t f(x)\,dx$$
(2) When a hard disk fails, the data loss cost of the storage system is proportional to the time from the onset of the failure to the current time; the proportionality coefficient is Lc, which represents the data loss cost per unit time. It can be obtained from Lc = K × r(t), where K is the value of each data record stored on the hard disk and r(t) is the failure rate of the hard disk, defined as follows:
$$r(t) = \frac{f(t)}{1 - F(t)}$$
(3) It is assumed that hard disk failures within the interval between two adjacent scans are uniformly distributed. The scan period is very small relative to the life of the hard disk, and the MTBF (mean time between failures) of the hard disk is constant within a given period of time.
(4) The cost of a single scan is Sc; the number of scans per unit time is n(t), and the total number of scans up to time t is
$$N(t) = \int_0^t n(x)\,dx$$
Over the operation of the whole storage system, the mathematical expectation E(t) of the cost paid at a given time t is expressed by the following formula:
To find the scan-count function n(t) that minimizes E(t), that is, the total expected cost, the following condition must be satisfied:
That is, when the scan period satisfies this formula, the reliability-related cost of the entire storage system is at its minimum.
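The expressions for E(t) and for the optimality condition do not survive in this text. As a hedged sketch only, the classical inspection-density model is consistent with assumptions (3) and (4): if failures are uniform within a scan interval, the mean detection delay is 1/(2n(t)). The delay term, the hazard-rate form r(t) = f(t)/(1 − F(t)), and the symbol N(t) for the cumulative scan count are assumptions made here, and the resulting formula may differ from the patent's own.

```latex
% Assumptions (not taken from the source): mean detection delay 1/(2 n(t)),
% hazard rate r(t) = f(t) / (1 - F(t)), and N(t) = \int_0^t n(u)\,du with
% N(t)\,(1 - F(t)) \to 0 as t \to \infty.
\begin{align}
E &= \int_0^{\infty} \Big[ S_c\, N(t) + \frac{L_c}{2\,n(t)} \Big] f(t)\,dt \\
  &= \int_0^{\infty} \Big[ S_c\, n(t)\,\big(1 - F(t)\big) + \frac{L_c\, f(t)}{2\,n(t)} \Big] dt
  && \text{(integration by parts on the first term)} \\
0 &= S_c\,\big(1 - F(t)\big) - \frac{L_c\, f(t)}{2\,n(t)^2}
  && \text{(pointwise minimization over } n(t)\text{)} \\
n(t) &= \sqrt{\frac{L_c\, r(t)}{2\,S_c}}
\end{align}
```

Under these assumptions, a higher failure rate or data loss cost raises the scan rate, while a higher per-scan cost lowers it.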
For a typical RAID array of N hard disks, at least one disk error can be tolerated. The RAID bit error rate (the probability that a bit error occurs) is:
$$P_f = (N-1)\cdot S_{BER}\cdot C_{disk}$$
where N is the number of hard disks in the RAID array group, $C_{disk}$ is the capacity of a single disk, and $S_{BER}$ is the potential sector failure rate of the hard disk, generally $10^{-14}$ for SATA disks and $10^{-15}$ for SCSI disks. For a RAID system that tolerates single-disk errors (e.g., RAID1, RAID10, RAID5), the MTTDL (mean time to data loss) is a function of the MTBF:
$$MTTDL_{RAID} = \frac{MTBF}{N\cdot P_f}$$
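The two formulas above can be evaluated together. The following is a minimal sketch, not the patent's implementation: the `RaidGroup` class and its field names are hypothetical, the capacity is assumed to be expressed in bits so that it matches a per-bit error rate, and the example group (8 SCSI disks of 1 TB, MTBF of 1,000,000 hours) is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RaidGroup:
    """Hypothetical container for the quantities used in P_f and MTTDL."""
    n_disks: int          # N, number of disks in the RAID group
    capacity_bits: float  # C_disk, assumed to be expressed in bits
    s_ber_per_bit: float  # S_BER, potential sector failure rate per bit
    mtbf_hours: float     # MTBF of a single disk, in hours

    def p_f(self) -> float:
        # P_f = (N - 1) * S_BER * C_disk
        return (self.n_disks - 1) * self.s_ber_per_bit * self.capacity_bits

    def mttdl_hours(self) -> float:
        # MTTDL_RAID = MTBF / (N * P_f)
        return self.mtbf_hours / (self.n_disks * self.p_f())


# Illustrative values only (assumptions, not from the source):
# 8 SCSI disks, 1 TB = 8e12 bits each, MTBF of 1,000,000 hours.
group = RaidGroup(n_disks=8, capacity_bits=8e12,
                  s_ber_per_bit=1e-15, mtbf_hours=1_000_000)
print(f"P_f = {group.p_f():.3f}, MTTDL = {group.mttdl_hours():,.0f} hours")
```

Keeping the unit conversion explicit in `capacity_bits` makes it easier to see which convention a given P_f value rests on.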
From reliability theory, the failure rate of a single RAID array group is:
where $r(t)_D$ is the failure rate of a single hard disk.
If the whole system consists of M hard disks organized into arrays, the failure rate of the whole storage system is:
The number of scans for a storage system that tolerates single-disk errors should satisfy the following formula:
With this evaluation method, the scanning frequency requirement of the hard disk array system can be obtained with minimum system cost as the objective.
(III) advantageous effects
The method for calculating the scanning frequency of the hard disk array has the advantages that:
1. All the factors through which hard disk scanning affects the system are taken into account, and their degree of influence is quantified;
2. With optimal cost as the objective, a numerical scanning frequency is determined from the evaluated indexes of different hard disk systems.
Detailed Description
In order to make the objects, contents and advantages of the present invention clearer, the following detailed description of the embodiments of the present invention will be given in conjunction with examples.
Based on the calculation method and formula of the invention, the following calculation process can be given in combination with common practical application.
Assume that the storage system contains 1000 hard disks and that each RAID group consists of 5 hard disks, so there are 200 RAID groups in the storage system. The capacity of each hard disk is 120 GB. From the definition of $P_f$:
$$P_f = (5-1)\times 120\,\mathrm{GB}\times 10^{-14}/\mathrm{bit} = 0.384$$
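As an arithmetic check on the expression above, the following sketch evaluates P_f with the capacity converted to bits. The conversion of 120 GB to 9.6 × 10^11 bits (decimal gigabytes, 8 bits per byte) is an assumption, since the text does not state the units it used, and the function name is hypothetical. Under that assumption the product comes out near 0.0384, a factor of ten below the printed 0.384, so the printed figure may rest on a different unit convention or error rate.

```python
def raid_bit_error_probability(n_disks, s_ber_per_bit, capacity_bits):
    """P_f = (N - 1) * S_BER * C_disk, with C_disk expressed in bits."""
    return (n_disks - 1) * s_ber_per_bit * capacity_bits


# Assumption: 120 GB means 120 * 10**9 bytes, at 8 bits per byte.
CAPACITY_BITS = 120 * 10**9 * 8

# Values from the worked example: N = 5 disks, S_BER = 1e-14 per bit.
p_f = raid_bit_error_probability(5, 1e-14, CAPACITY_BITS)
print(f"P_f = {p_f:.4f}")
```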
Assume the bandwidth of a typical hard disk is 60 MB/s to 70 MB/s. If a hard disk is found to have failed, the time required to rebuild it is 120 GB / (60 MB/s to 70 MB/s) ≈ 30 minutes, that is, MTTR = 30 minutes, so the repair rate μ is taken as 1 per hour. Existing data analysis indicates that the annual failure rate r(t) of hard disks is between 1.7% and 8.5%, and sometimes even higher. Electricity generally costs 0.4 to 0.8 yuan per kilowatt-hour, the read speed of a hard disk is 60 MB/s to 70 MB/s, and the power draw of a hard disk is about 13 W. Scanning a 120 GB hard disk therefore takes 120000 MB / (60 MB/s) = 2000 s = 5/9 hour, and the electricity cost of scanning a data center with 1000 hard disks once is about 0.013 × 0.6 × 1000 × 5/9 yuan, so Sc ≈ 4.3 yuan.
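The scan time and per-scan electricity cost in this paragraph can be reproduced with a few lines. This is a sketch under stated assumptions: the price is treated as yuan per kilowatt-hour (implied by the factor 0.013, which is 13 W in kilowatts), 0.6 yuan is the midpoint of the quoted price range, and 65 MB/s is taken as the midpoint bandwidth for the rebuild time. The function names are hypothetical.

```python
def scan_hours(capacity_mb, read_mb_per_s):
    """Time to read a whole disk once, in hours."""
    return capacity_mb / read_mb_per_s / 3600


def scan_cost_yuan(n_disks, power_kw, price_per_kwh, hours):
    """Electricity cost of scanning every disk once."""
    return n_disks * power_kw * price_per_kwh * hours


# 120 GB disk read at 60 MB/s: 2000 s, i.e. 5/9 hour.
hours = scan_hours(120_000, 60)

# 1000 disks at 13 W each; 0.6 yuan/kWh is an assumed midpoint price.
sc = scan_cost_yuan(1000, 0.013, 0.6, hours)

# Rebuild time at an assumed midpoint bandwidth of 65 MB/s, in minutes.
rebuild_minutes = 120_000 / 65 / 60

print(f"scan time = {hours:.3f} h, Sc = {sc:.2f} yuan, "
      f"rebuild = {rebuild_minutes:.1f} min")
```

The result for Sc scales linearly with the electricity price, so the quoted range of 0.4 to 0.8 yuan moves it by about a third in either direction.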
Applying the method to these data gives the hard disk scanning frequency for different data loss costs and hard disk failure rates, as shown in Table 1.
Table 1. Reference scanning frequencies for single-disk fault-tolerant arrays (times per year)
The above describes only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make modifications and variations without departing from the technical principle of the present invention, and such modifications and variations shall also fall within the protection scope of the present invention.
Claims (9)
1. A method for calculating the scanning frequency of a hard disk array is characterized by comprising the following steps:
(1) determining a probability distribution function of hard disk failure time as F (t);
(2) when the hard disk fails, the data loss cost of the storage system is in direct proportion to the time from the beginning of the failure to the current time, and the proportionality coefficient is Lc and represents the data loss cost in unit time;
(3) determining the Mean Time Between Failures (MTBF) value of the hard disk as a constant;
(4) when the hard disk is scanned, the number of scans per unit time is n(t), the cost of a single scan is Sc, and the total number of scans up to time t is
$$N(t) = \int_0^t n(x)\,dx$$
The mathematical expectation of the cost paid at time t is expressed by the following equation:
to find the scan-count function that minimizes E(t), i.e., the total expected cost, the following must be satisfied:
for a RAID array of N hard disks, at least one disk error can be tolerated, and the RAID bit error rate is obtained as:
$$P_f = (N-1)\cdot S_{BER}\cdot C_{disk}$$
wherein N is the number of hard disks in the RAID array group, $C_{disk}$ is the capacity of a single disk, and $S_{BER}$ is the potential sector failure rate of the hard disk;
the mean time to data loss (MTTDL) of a RAID system that tolerates single-disk errors is a function of the MTBF:
$$MTTDL_{RAID} = \frac{MTBF}{N\cdot P_f}$$
according to reliability theory, the failure rate of a single RAID array group is obtained as:
where $r(t)_D$ is the failure rate of a single hard disk;
if the whole system consists of M hard disks organized into arrays, the failure rate of the whole storage system is:
the number of scans for the storage system that tolerates single-disk errors satisfies the following formula:
based on this process, the scanning frequency requirement of the hard disk array system is obtained with minimum system cost as the objective.
3. The method according to claim 2, wherein in step (2), Lc = K × r(t), where K is the value of each data record stored on the hard disk and r(t) is the failure rate of the hard disk.
5. The hard disk array scanning frequency calculation method according to claim 4, characterized in that in process (3), the mean time between failures (MTBF) of the hard disks within a given time period is determined on the assumption that hard disk failures within the interval between two adjacent scans are uniformly distributed.
6. The hard disk array scanning frequency calculation method according to claim 5, wherein in process (4), the RAID bit error rate is the probability that a bit error occurs.
7. The method of claim 6, wherein in process (4), the potential sector failure rate of a SATA disk is $10^{-14}$ and that of a SCSI disk is $10^{-15}$.
8. The hard disk array scanning frequency calculation method according to claim 7, wherein in process (4), the RAID system that tolerates single-disk errors is RAID1, RAID10, or RAID5.
9. Use of the method according to any of claims 1-8 for calculating the scanning frequency of a hard disk array in the field of data storage technology.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011103972.5A CN112162708A (en) | 2020-10-15 | 2020-10-15 | Hard disk array scanning frequency calculation method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011103972.5A CN112162708A (en) | 2020-10-15 | 2020-10-15 | Hard disk array scanning frequency calculation method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112162708A (en) | 2021-01-01 |
Family ID=73867119
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011103972.5A Withdrawn CN112162708A (en) | 2020-10-15 | 2020-10-15 | Hard disk array scanning frequency calculation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112162708A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117795472A (en) * | 2021-08-09 | 2024-03-29 | 美光科技公司 | Adaptive data integrity scan frequency |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103729276A (en) * | 2014-01-28 | 2014-04-16 | 深圳市迪菲特科技股份有限公司 | Method for scanning disk array |
US20170249089A1 (en) * | 2016-02-25 | 2017-08-31 | EMC IP Holding Company LLC | Method and apparatus for maintaining reliability of a raid |
Non-Patent Citations (1)
Title |
---|
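刘军平 (Liu Junping): "磁盘存储系统可靠性技术研究" (Research on Reliability Technology of Disk Storage Systems), 《中国优秀博士学位论文全文数据库信息科技辑》 (China Doctoral Dissertations Full-text Database, Information Science and Technology) * |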
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5078235B2 (en) | Method for maintaining track data integrity in a magnetic disk storage device | |
US7979635B2 (en) | Apparatus and method to allocate resources in a data storage library | |
US9442802B2 (en) | Data access methods and storage subsystems thereof | |
US10025666B2 (en) | RAID surveyor | |
US6922801B2 (en) | Storage media scanner apparatus and method providing media predictive failure analysis and proactive media surface defect management | |
US20130173972A1 (en) | System and method for solid state disk flash plane failure detection | |
US11676671B1 (en) | Amplification-based read disturb information determination system | |
US11989452B2 (en) | Read-disturb-based logical storage read temperature information identification system | |
US10795790B2 (en) | Storage control apparatus, method and non-transitory computer-readable storage medium | |
US11922019B2 (en) | Storage device read-disturb-based block read temperature utilization system | |
US20060215456A1 (en) | Disk array data protective system and method | |
US11922067B2 (en) | Read-disturb-based logical storage read temperature information maintenance system | |
CN112162708A (en) | Hard disk array scanning frequency calculation method | |
US11929135B2 (en) | Read disturb information determination system | |
US11922020B2 (en) | Read-disturb-based read temperature information persistence system | |
US11763898B2 (en) | Value-voltage-distirubution-intersection-based read disturb information determination system | |
US11983424B2 (en) | Read disturb information isolation system | |
US11928354B2 (en) | Read-disturb-based read temperature determination system | |
US11989441B2 (en) | Read-disturb-based read temperature identification system | |
US11983431B2 (en) | Read-disturb-based read temperature time-based attenuation system | |
US20230229577A1 (en) | Storage device read-disturb-based read temperature map utilization system | |
US20230236749A1 (en) | Read-disturb-based read temperature adjustment system | |
US11995340B2 (en) | Read-disturb-based read temperature information access system | |
US20230236928A1 (en) | Read-disturb-based physical storage read temperature information identification system | |
JP2018190192A (en) | Storage device and storage control program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | ||
Application publication date: 20210101 |