WO2017012392A1 - Method and apparatus for disk detection - Google Patents

Method and apparatus for disk detection

Info

Publication number
WO2017012392A1
Authority
WO
WIPO (PCT)
Prior art keywords
disk
delay time
determined
preset
slow
Prior art date
Application number
PCT/CN2016/081438
Other languages
English (en)
French (fr)
Inventor
亢振华
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2017012392A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing

Definitions

  • Solution 1: Record the I/O delay time of each disk one by one, divide the I/O delay time into several intervals, and set a threshold for each interval. Each time an I/O is executed on the disk, its delay time is calculated and the I/O is counted in the corresponding interval. When the number of I/Os in a certain interval exceeds the threshold, the disk is determined to be a slow disk.
  • the other disk includes a disk other than the slow disk to be determined.
  • the disk is determined to be the slow disk to be determined.
  • the disk is determined to be a slow disk to be determined, wherein the ratio preset threshold is greater than 1.
  • the method further includes: before obtaining the delay time that the disk processes two or more I/Os in the preset time,
  • an embodiment of the present invention further provides an apparatus for detecting a disk, where the apparatus includes:
  • the obtaining module is configured to obtain, for each disk in the RAID array, a delay time for the disk to process two or more input/output I/Os within a preset time;
  • the calculation module is configured to calculate an average delay time of the disk according to the obtained delay times;
  • the first determining module is configured to determine, according to an average delay time of the disk, whether the disk is a slow disk to be determined;
  • the second determining module includes:
  • the determining unit is configured to determine whether the left end value of the preset interval corresponding to the maximum count of delay times of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of delay times of the other disks;
  • the other disk includes a disk other than the slow disk to be determined.
  • the determining unit is configured to determine that the disk is a slow disk to be determined when the ratio of the average delay time of the disk to the average delay time of the RAID is greater than a preset threshold; wherein the ratio preset threshold is greater than 1.
  • the device further includes:
  • FIG. 2 is a schematic flowchart of determining whether a slow disk to be determined is a slow disk in an embodiment of the present invention.
  • FIG. 3 is a schematic flow chart of a second embodiment of a disk detecting method according to the present invention.
  • FIG. 5 is a schematic flowchart diagram of a third embodiment of a method for detecting a disk according to the present invention.
  • FIG. 6 is a schematic diagram of functional modules of a first embodiment of a disk detecting apparatus according to the present invention.
  • FIG. 7 is a schematic diagram of functional modules of a second determining module according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of functional modules of a first determining module according to an embodiment of the present invention.
  • Embodiments of the present invention provide a method for detecting a disk.
  • FIG. 1 is a schematic flowchart diagram of a first embodiment of a disk detecting method according to the present invention.
  • the disk detection method includes: for each disk in a disk array (RAID),
  • Step 10: Obtain the delay times of two or more input/output (I/O) operations processed by the disk within a preset time;
  • the user can set, through the configuration menu, the preset time or frequency at which slow disks in the RAID are identified, for example once a day, once a month, or once every half year.
  • the preset time can also be set at the factory. In other implementations the preset time need not be set in advance: for example, when the user needs to identify slow disks in the RAID, the time period is set in real time through the configuration menu, or a default time period is obtained before slow-disk identification starts.
  • within the preset time, the delay time of each of the multiple (two or more) I/Os processed by each disk is obtained.
  • the delay time of the disk processing an I/O is the time from when the terminal sends the I/O request to the disk to when the disk responds to that request.
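  • As an illustration of the delay time defined above (from sending the request to receiving the response), the following minimal Python sketch times a single I/O call. The helper name `timed_io` and the use of `time.perf_counter` are assumptions made for illustration; the source does not specify how the terminal measures the time.

```python
import time


def timed_io(io_call, *args, **kwargs):
    """Run one I/O call and return (result, delay in milliseconds).

    Hypothetical helper: the delay is measured from just before the
    request is issued to just after the response is returned.
    """
    start = time.perf_counter()            # request is sent
    result = io_call(*args, **kwargs)      # disk processes the I/O
    delay_ms = (time.perf_counter() - start) * 1000.0  # response received
    return result, delay_ms
```

  • Any callable that performs the I/O (for example a file read) can be passed as `io_call`; the returned delay can then be fed into the per-disk statistics.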
  • Step 20: Count how many of the obtained delay times fall in each of the pre-divided preset intervals
  • the user can also set, through the configuration menu, a plurality of preset intervals X_1 to X_n for the delay time of the disk processing I/O. In this embodiment, for example, the delay time is divided into 6 preset intervals: X_1 [0, 10 ms), X_2 [10 ms, 100 ms), X_3 [100 ms, 500 ms), X_4 [500 ms, 1 s), X_5 [1 s, 5 s), X_6 [5 s, ∞), with counters x_1 to x_6 recording the number of delay times that fall in each interval.
  • for example, if a delay time belongs to X_3, the value of x_3 becomes 1.
  • the user can also set the preset intervals X_1 to X_n through the configuration menu after the delay times of all the disks within the preset time have been collected; for example, according to the length of the delay times, the range can be divided into [0, 10 ms), [10 ms, 50 ms), [50 ms, 1 s), [1 s, 2 s), [2 s, ∞).
  • within the preset time, the delay time of each I/O processed by each disk may be recorded together with the preset interval it falls in; alternatively, the delay times of each disk may be recorded first and their distribution over the pre-divided preset intervals counted afterwards. Once the number of delay times in each preset interval has been counted, the process proceeds to step 30.
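  • The interval counting of step 20 can be sketched as follows. This is a minimal Python sketch, assuming the six example intervals of this embodiment (X_1 [0, 10 ms) through X_6 [5 s, ∞)); the names `LEFT_EDGES_MS`, `interval_index` and `count_distribution` are hypothetical and not taken from the source.

```python
from bisect import bisect_right

# Left end values (ms) of the six example intervals X_1..X_6:
# [0,10), [10,100), [100,500), [500,1000), [1000,5000), [5000, inf)
LEFT_EDGES_MS = [0, 10, 100, 500, 1000, 5000]


def interval_index(delay_ms, left_edges=LEFT_EDGES_MS):
    """Return the 0-based index of the half-open interval containing delay_ms."""
    if delay_ms < left_edges[0]:
        raise ValueError("delay must not be below the first left end value")
    # bisect_right counts the left edges <= delay_ms; subtract 1 for the index.
    return bisect_right(left_edges, delay_ms) - 1


def count_distribution(delays_ms, left_edges=LEFT_EDGES_MS):
    """Return the counters x_1..x_n: number of delays falling in each interval."""
    counts = [0] * len(left_edges)
    for delay in delays_ms:
        counts[interval_index(delay, left_edges)] += 1
    return counts
```

  • A delay of 150 ms, for instance, maps to index 2, which is interval X_3; a delay of exactly 10 ms falls in X_2 because the intervals are closed on the left and open on the right.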
  • Step 30: Calculate the average delay time of the disk according to the obtained delay times
  • the average delay time of each disk processing I/O is obtained; the calculation formula is as follows:
      $$T_{avgnew} = \frac{1}{n}\sum_{i=1}^{n} T_i$$
  • T_avgnew is the average delay time of the disk processing I/O, T_i is the delay time of the i-th I/O processed by the disk, and n is the total number of I/Os counted.
  • alternatively, the average delay time of the disk processing I/O is calculated incrementally: the average delay time equals the total I/O delay time divided by the total number of times.
  • the total I/O delay time is equal to the previous total number of times multiplied by the old (previous) average delay time, plus the current delay time T_n; the total number of times is equal to the previous number of times plus 1 for the current delay time. The calculation formula is as follows:
      $$T_{avgnew} = \frac{T_{avgold} \cdot Count + T_n}{Count + 1}$$
  • T_avgnew is the average delay time of the disk processing I/O after the update
  • T_avgold is the average delay time of the I/Os that the disk had completed before the update
  • Count is the total number of I/Os counted before the update.
  • written in terms of the total number n of I/Os counted including the current one, the same update reads:
      $$T_{avgnew} = \frac{T_{avgold} \cdot (n - 1) + T_n}{n}$$
  • T_avgnew is the average delay time of the disk processing I/O after the update
  • T_avgold is the average delay time of the I/Os that the disk had completed before the update
  • n is the total number of I/Os counted
  • T_n is the delay time of the n-th recorded I/O.
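  • The incremental update described above (old count multiplied by the old average, plus the current delay, divided by the new count) can be sketched in Python as follows. The function name `update_average` is hypothetical; the arithmetic follows the description in the text.

```python
def update_average(avg_old, count, new_delay):
    """Fold one more I/O delay into a running average.

    avg_old   -- average delay before the update (T_avgold)
    count     -- number of I/Os counted before the update (Count)
    new_delay -- delay of the current I/O (T_n)
    Returns (T_avgnew, new count).
    """
    new_count = count + 1
    avg_new = (avg_old * count + new_delay) / new_count
    return avg_new, new_count
```

  • The advantage of this form is that only the previous average and the count need to be stored, not every individual delay time.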
  • Step 40: Determine, according to the average delay time of the disk, whether the disk is a slow disk to be determined
  • a preliminary screening is performed to find the disks in the RAID that may be slow disks; such a disk is defined as a slow disk to be determined. For example, it is determined whether the average delay time of each disk is greater than a preset time threshold, where the preset time threshold is set according to parameters such as the brand, type, and usage time of the disk. If the average delay time of a disk obtained in step 30 is greater than the preset time threshold, the disk is determined to be a slow disk to be determined. Of course, the check can also be applied to only some of the disks in the RAID, in which case the slow disks to be determined are identified among those disks. If a disk is judged to be a slow disk to be determined, the process proceeds to step 50; if it is judged to be a normal disk, the process returns to step 10 or ends.
  • the implementation may also determine the slow disk to be determined in the RAID according to the count distribution obtained in step 20: when the number of delay times falling in one or more preset intervals is greater than the threshold preset for each of those intervals, the disk is determined to be a slow disk to be determined. For example, in this embodiment it is determined whether the ratio of the number of delay times in the fifth and sixth of the six preset intervals to the total number of I/Os of the disk exceeds a preset ratio threshold, that is, whether the proportion of delay times falling in the longer-delay intervals (for example, the largest and the second largest) exceeds the preset ratio threshold. If the preset ratio threshold is exceeded, the disk is determined to be a slow disk to be determined.
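  • The two preliminary screening variants of step 40 (average delay against a time threshold, and share of delays in the longest intervals against a ratio threshold) can be sketched as follows. The function names and the example threshold values are assumptions for illustration only; the source leaves the concrete thresholds to configuration.

```python
def is_candidate_by_average(avg_delay_ms, time_threshold_ms):
    """Variant 1: the disk's average delay exceeds the preset time threshold."""
    return avg_delay_ms > time_threshold_ms


def is_candidate_by_tail_share(counts, tail_intervals, ratio_threshold):
    """Variant 2: the share of delays in the longest intervals is too high.

    counts          -- per-interval counters x_1..x_n from step 20
    tail_intervals  -- 0-based indices of the long-delay intervals,
                       e.g. (4, 5) for the fifth and sixth interval
    ratio_threshold -- preset ratio threshold (hypothetical value in tests)
    """
    total = sum(counts)
    if total == 0:
        return False  # no I/O recorded, nothing to judge
    tail = sum(counts[i] for i in tail_intervals)
    return tail / total > ratio_threshold
```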
  • Step 50: If the disk is a slow disk to be determined, determine, from the counted distribution of the slow disk to be determined, the maximum count of delay times over all the preset intervals, and determine whether the slow disk to be determined is a slow disk according to the preset interval corresponding to that maximum.
  • for the slow disk to be determined identified in step 40, the preset interval holding the maximum count in its distribution is determined, and whether the slow disk to be determined is a slow disk is then judged according to that preset interval.
  • for example, it is determined whether the percentage that this maximum count accounts for among the counts of all preset intervals of the slow disk to be determined is greater than a preset percentage; if so, the slow disk to be determined is determined to be a slow disk.
  • after a slow disk is determined, it can be handled accordingly. For example, data can still be written to the slow disk, but when data on the slow disk needs to be read, the data may be read from the slow disk itself, or the data of the other disks in the RAID may be read and the data to be read on the slow disk obtained through logical calculation; alternatively, the slow disk can be used as a backup disk.
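  • The source does not say which "logical calculation" is used to recover the slow disk's data from the other disks. One common possibility, assumed here purely for illustration, is XOR parity as used in RAID 5 style arrays: the missing block is the XOR of the remaining data blocks and the parity block. The function names below are hypothetical.

```python
def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def read_without_slow_disk(other_data_blocks, parity_block):
    """Rebuild the slow disk's block from the other disks (XOR parity assumed)."""
    return xor_blocks(list(other_data_blocks) + [parity_block])
```

  • With this approach a read never has to wait for the slow disk, at the cost of reading one block from every other disk in the stripe.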
  • step 20 need not be performed between step 10 and step 30; it only has to be performed before step 50.
  • steps or conditional changes according to the description of the embodiments of the present invention.
  • in the embodiment of the invention, for each disk in the disk array (RAID), the delay times of two or more input/output (I/O) operations processed by the disk within a preset time are obtained; the distribution of the obtained delay times over the pre-divided preset intervals is counted; the average delay time of the disk is calculated from the obtained delay times; whether the disk is a slow disk to be determined is judged according to the average delay time of the disk; and, if the disk is a slow disk to be determined, the maximum count of delay times over all preset intervals is determined from its counted distribution, and whether the slow disk to be determined is a slow disk is judged according to the preset interval corresponding to that maximum.
  • the embodiment of the invention also discloses a disk detecting device. The embodiment of the invention does not need to consider factors such as the disk's I/O model, type, and brand; it improves the accuracy of slow-disk identification and ensures that the RAID works efficiently.
  • FIG. 2 is a schematic flowchart of determining whether a slow disk to be determined is a slow disk according to an embodiment of the present invention.
  • step 50 includes:
  • Step 51: Determine whether the left end value of the preset interval corresponding to the maximum count of delay times of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of delay times of the other disks;
  • the left end value may represent the minimum value of the interval
  • the right end value may represent the maximum value of the interval
  • if the left end value of the preset interval corresponding to the maximum count of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of the other disks, the process proceeds to step 52; otherwise it goes to step 53.
  • Step 52: When the left end value of the preset interval corresponding to the maximum count of delay times of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of delay times of the other disks, determine that the slow disk to be determined is a slow disk;
  • for example, in this embodiment, if the maximum count of the slow disk to be determined lies in X_3 and the maximum count of the other disks lies in X_1, the left end value of X_3 (100 ms) is greater than the right end value of X_1 (10 ms), so the slow disk to be determined is determined to be a slow disk.
  • Step 53: When the left end value of the preset interval corresponding to the maximum count of delay times of the slow disk to be determined is less than or equal to the right end value of the preset interval corresponding to the maximum count of delay times of the other disks, determine that the slow disk to be determined is not a slow disk.
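  • Steps 51 to 53 can be sketched as follows. This minimal Python sketch assumes the six example intervals of this embodiment and assumes that the counts of the other disks are pooled into one distribution before taking the maximum; the source does not state how the other disks are combined, and the names `modal_interval` and `is_slow_disk` are hypothetical.

```python
import math

# (left end value, right end value) in ms of the example intervals X_1..X_6
INTERVALS_MS = [(0, 10), (10, 100), (100, 500),
                (500, 1000), (1000, 5000), (5000, math.inf)]


def modal_interval(counts, intervals=INTERVALS_MS):
    """Return the interval with the highest count (first one wins on ties)."""
    idx = max(range(len(counts)), key=counts.__getitem__)
    return intervals[idx]


def is_slow_disk(candidate_counts, other_counts, intervals=INTERVALS_MS):
    """Step 51: compare the candidate's modal interval with the others'.

    Returns True (step 52) if the left end value of the candidate's modal
    interval is greater than the right end value of the other disks' modal
    interval, otherwise False (step 53).
    """
    cand_left, _ = modal_interval(candidate_counts, intervals)
    _, other_right = modal_interval(other_counts, intervals)
    return cand_left > other_right
```

  • Note that adjacent intervals do not satisfy the condition: a candidate peaking in X_2 (left end 10 ms) is not slower than others peaking in X_1 (right end 10 ms), because 10 is not greater than 10.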
  • FIG. 3 is a schematic flowchart diagram of a second embodiment of a method for detecting a disk according to the present invention.
  • the method of this embodiment further includes:
  • Step 60: Determine whether the disk is a slow disk to be determined according to the number of delay times counted in one or more preset intervals and the threshold preset for each of those intervals.
  • the disk is determined to be a slow disk to be determined.
  • This step may be an embodiment of step 40.
  • the user can assign a count threshold to each interval as its preset threshold according to parameters such as the brand, type, and usage time of each disk; of course, the threshold for each interval can also be assigned automatically according to these parameters. It is also possible to specify a threshold for only one interval. If the number of delay times counted in a certain preset interval of a disk exceeds the threshold preset for that interval, the disk is determined to be a slow disk to be determined.
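  • The per-interval threshold check of step 60 can be sketched as follows. The function name and the dictionary representation of the thresholds are assumptions for illustration; thresholds may be given for all intervals or, as the text allows, for only one.

```python
def exceeds_interval_thresholds(counts, thresholds):
    """Return True if any thresholded interval has more delays than allowed.

    counts     -- per-interval counters x_1..x_n (0-based list)
    thresholds -- dict mapping a 0-based interval index to its preset
                  count threshold; intervals not listed are not checked
    """
    return any(counts[idx] > limit for idx, limit in thresholds.items())
```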
  • FIG. 4 is a schematic diagram of another process for determining whether a disk is a slow disk to be determined according to an embodiment of the present invention.
  • the step 40 may further include:
  • Step 41: Obtain the average delay time of the RAID according to the average delay time of each disk
  • the average delay time of the RAID is calculated from the average delay times of all the disks; of course, it need not be calculated from the average delay times of all the disks.
  • the average delay time of the RAID can be equal to the sum of the average delay times of the disks divided by the total number of disks, as shown in the following equation:
      $$T_{raid} = \frac{1}{N}\sum_{x=1}^{N} T_x$$
  • T_raid is the average delay time of the RAID
  • T_x is the average delay time of the x-th disk
  • N is the total number of disks participating in the calculation in the RAID.
  • N can be the number of all disks in the RAID.
  • the average delay time of the RAID can also be calculated with the data of the slow disk to be determined deducted; the calculation formula is as follows:
      $$T_{raid} = \frac{\sum_{x=1}^{N} T_x - T_k}{N - 1}$$
  • T_raid is the average delay time of the RAID
  • T_x is the average delay time of the x-th disk
  • T_k is the average delay time of the slow disk to be determined
  • N is the total number of disks participating in the calculation in the RAID.
  • N can be the number of all disks in the RAID.
  • the average delay time of the RAID can also be obtained by other means.
  • after the average delay time of the RAID is obtained, the process proceeds to step 42.
  • Step 42: Determine whether the disk is a slow disk to be determined according to the ratio of the average delay time of the disk to the average delay time of the RAID and a preset ratio threshold.
  • when the ratio is greater than the preset ratio threshold, the disk is determined to be a slow disk to be determined, wherein the preset ratio threshold is greater than 1.
  • with the average delay time of the RAID obtained in step 41, the ratio of the average delay time of the disk to the average delay time of the RAID is calculated and compared with the preset ratio threshold, which is greater than 1. If the ratio is greater than the preset ratio threshold, the disk is determined to be a slow disk to be determined.
  • alternatively, the ratio of the average delay time of the disk to the average delay time of the RAID need not be calculated; for example, the disk with the largest average delay time is directly selected as the slow disk to be determined.
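  • Steps 41 and 42 can be sketched as follows: compute the RAID average (optionally with the disk under test deducted) and flag disks whose ratio to that average exceeds a threshold greater than 1. The function names, the dictionary of per-disk averages, and the disk labels in the test are hypothetical.

```python
def raid_average(disk_avgs, exclude=None):
    """Average of the per-disk average delays, optionally deducting one disk."""
    values = [avg for name, avg in disk_avgs.items() if name != exclude]
    if not values:
        raise ValueError("no disks left to average")
    return sum(values) / len(values)


def candidates_by_ratio(disk_avgs, ratio_threshold, exclude_self=False):
    """Return the disks whose average / RAID average exceeds the threshold.

    disk_avgs       -- dict: disk name -> average delay time T_x
    ratio_threshold -- preset ratio threshold, must be greater than 1
    exclude_self    -- if True, deduct the disk under test from the RAID average
    """
    if ratio_threshold <= 1:
        raise ValueError("the ratio threshold must be greater than 1")
    flagged = []
    for name, avg in disk_avgs.items():
        baseline = raid_average(disk_avgs, exclude=name if exclude_self else None)
        if baseline > 0 and avg / baseline > ratio_threshold:
            flagged.append(name)
    return flagged
```

  • Deducting the disk under test makes the check more sensitive in small arrays, where a single slow disk would otherwise pull the RAID average up and reduce its own ratio.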
  • FIG. 5 is a schematic flowchart diagram of a third embodiment of a method for detecting a disk according to the present invention.
  • Step 70: Reset the record information of the I/O delay times of the disks in the RAID.
  • when the terminal starts to determine the slow disk in the RAID, it can first reset the record information of the delay times of all the disks in the RAID, before obtaining the delay times of the disks processing I/O within the preset time: the recorded I/O delay times are cleared and the count of each interval is reset to zero.
  • the embodiment of the present invention resets the record information of the input/output (I/O) delay times of the disks in the RAID before starting to judge the slow disk.
  • the embodiment of the invention can thereby eliminate the influence of historical data on the judgment and ensure the correctness of the judgment result.
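  • The reset of step 70 can be sketched as follows. The class `DiskDelayRecord` and the function `reset_all` are hypothetical names; the sketch only assumes that each disk keeps per-interval counters, a running average and a total count, all of which are cleared before a new detection round.

```python
class DiskDelayRecord:
    """Per-disk record of I/O delay statistics (hypothetical structure)."""

    def __init__(self, n_intervals=6):
        self.n_intervals = n_intervals
        self.reset()

    def reset(self):
        """Clear the recorded delays and set every interval count to zero."""
        self.counts = [0] * self.n_intervals
        self.avg_delay_ms = 0.0
        self.total = 0

    def record(self, delay_ms, interval_idx):
        """Add one I/O delay to the interval counter and the running average."""
        self.counts[interval_idx] += 1
        self.avg_delay_ms = (
            self.avg_delay_ms * self.total + delay_ms
        ) / (self.total + 1)
        self.total += 1


def reset_all(records):
    """Reset the records of all disks in the RAID before a detection round."""
    for record in records.values():
        record.reset()
```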
  • the method of the embodiment of the invention needs to be implemented by the server on which the disk is located, or by a server or device that can communicate with the disk.
  • the embodiment of the invention further provides a computer storage medium, wherein the computer storage medium stores computer executable instructions, and the computer executable instructions are used to execute the disk detection method.
  • the embodiment of the invention further provides a device for detecting a disk.
  • FIG. 6 is a schematic diagram of functional modules of a first embodiment of a disk detecting apparatus according to the present invention.
  • the disk detecting device includes:
  • the obtaining module 10 is configured to obtain, for each disk in the RAID, the delay times of two or more input/output (I/O) operations processed by the disk within a preset time;
  • the user can set, through the configuration menu, the preset time or frequency at which slow disks in the RAID are identified, for example once a day, once a month, or once every half year. Of course, a particular disk in the RAID can also be specified for slow-disk identification.
  • the preset time can also be set at the factory. In other implementations the preset time need not be set in advance: for example, when the user needs to identify slow disks in the RAID, the time period is set in real time through the configuration menu, or a default time period is obtained before slow-disk identification starts.
  • within the preset time, the delay time of each of the multiple (two or more) I/Os processed by each disk is obtained.
  • the delay time of the disk processing an I/O is the time from when the terminal sends the I/O request to the disk to when the disk responds to that request.
  • the statistic module 20 is configured to count how many of the obtained delay times fall in each of the pre-divided preset intervals;
  • the user can also set, through the configuration menu, a plurality of preset intervals X_1 to X_n for the delay time of the disk processing I/O. In this embodiment, for example, the delay time is divided into 6 preset intervals: X_1 [0, 10 ms), X_2 [10 ms, 100 ms), X_3 [100 ms, 500 ms), X_4 [500 ms, 1 s), X_5 [1 s, 5 s), X_6 [5 s, ∞). Each interval has a corresponding counter x_1, x_2, x_3, x_4, x_5, x_6 recording the number of delay times that fall in it; that is, for each I/O delay time obtained by the obtaining module 10, the preset interval containing it is determined and the counter of that interval is incremented.
  • for example, if a delay time belongs to X_3, the value of x_3 becomes 1.
  • the user can also set the preset intervals X_1 to X_n through the configuration menu after the delay times of all the disks within the preset time have been collected; for example, according to the length of the delay times, the range can be divided into [0, 10 ms), [10 ms, 50 ms), [50 ms, 1 s), [1 s, 2 s), [2 s, ∞).
  • within the preset time, the delay time of each I/O processed by each disk may be recorded together with the preset interval it falls in; alternatively, the delay times of each disk may be recorded first and counted per interval afterwards.
  • the calculating module 30 is configured to calculate an average delay time of the disk according to the obtained delay times;
  • the average delay time of each disk processing I/O is obtained from the delay times obtained by the obtaining module 10 and the counts produced by the statistic module 20; the calculation formula is as follows:
      $$T_{avgnew} = \frac{1}{n}\sum_{i=1}^{n} T_i$$
  • T_avgnew is the average delay time of the disk processing I/O
  • T_i is the delay time of the i-th I/O processed by the disk
  • n is the total number of times, which can be obtained from the per-interval counts produced by the statistic module 20.
  • the total number of times is n = x_1 + x_2 + x_3 + x_4 + x_5 + x_6.
  • alternatively, the average delay time of the disk processing I/O is calculated incrementally: the average delay time equals the total I/O delay time divided by the total number of times.
  • the total I/O delay time is equal to the previous total number of times multiplied by the old (previous) average delay time, plus the current delay time T_n; the total number of times is equal to the previous number of times plus 1 for the current delay time. The calculation formula is as follows:
      $$T_{avgnew} = \frac{T_{avgold} \cdot Count + T_n}{Count + 1}$$
  • T_avgnew is the average delay time of the disk processing I/O after the update
  • T_avgold is the average delay time of the I/Os that the disk had completed before the update
  • Count is the total number of I/Os counted before the update.
  • written in terms of the total number n of I/Os counted including the current one, the same update reads:
      $$T_{avgnew} = \frac{T_{avgold} \cdot (n - 1) + T_n}{n}$$
  • T_avgnew is the average delay time of the disk processing I/O after the update
  • T_avgold is the average delay time of the I/Os that the disk had completed before the update
  • n is the total number of I/Os counted
  • T_n is the delay time of the n-th recorded I/O.
  • the first determining module 40 is configured to determine, according to the average delay time of the disk, whether the disk is a slow disk to be determined;
  • a preliminary screening is performed to find the disks in the RAID that may be slow disks; such a disk is defined as a slow disk to be determined. For example, it is determined whether the average delay time of each disk is greater than a preset time threshold, where the preset time threshold is set according to parameters such as the brand, type, and usage time of the disk. If the average delay time of a disk obtained by the calculating module 30 is greater than the preset time threshold, the disk is determined to be a slow disk to be determined. Of course, the check can also be applied to only some of the disks in the RAID, in which case the slow disks to be determined are identified among those disks.
  • the implementation may also determine the slow disk to be determined in the RAID according to the count distribution obtained by the statistic module 20: when the number of delay times falling in one or more preset intervals is greater than the threshold preset for each of those intervals, the disk is determined to be a slow disk to be determined. For example, in this embodiment it is determined whether the ratio of the number of delay times in the fifth and sixth of the six preset intervals to the total number of I/Os of the disk exceeds a preset ratio threshold, that is, whether the proportion of delay times falling in the longer-delay intervals (for example, the largest and the second largest) exceeds the preset ratio threshold. If the preset ratio threshold is exceeded, the disk is determined to be a slow disk to be determined.
  • the first determining module 40 is further configured to determine that the disk is a slow disk to be determined when the number of delay times counted in one or more preset intervals is greater than the threshold preset for each of those intervals.
  • this part of the functions of the embodiment of the present invention can be implemented by providing a count determining unit.
  • the user can assign a count threshold to each interval as its preset threshold according to parameters such as the brand, type, and usage time of each disk; the threshold for each interval can also be assigned automatically according to these parameters. It is also possible to specify a threshold for only one interval. If the number of delay times counted in a certain preset interval of a disk exceeds the threshold preset for that interval, the disk is determined to be a slow disk to be determined.
  • the second judging module 50 is configured to, if the disk is a slow disk to be determined, determine from the counted distribution of the slow disk to be determined the maximum count of delay times over all preset intervals, and determine whether the slow disk to be determined is a slow disk according to the preset interval corresponding to that maximum.
  • that is, the preset interval holding the maximum count is determined, and whether the slow disk to be determined is a slow disk is judged according to that preset interval.
  • for example, it is determined whether the percentage that this maximum count accounts for among the counts of all preset intervals of the slow disk to be determined is greater than a preset percentage; if so, the slow disk to be determined is determined to be a slow disk.
  • after a slow disk is determined, it can be handled accordingly. For example, data can still be written to the slow disk, but when data on the slow disk needs to be read, the data may be read from the slow disk itself, or the data of the other disks in the RAID may be read and the data to be read on the slow disk obtained through logical calculation; alternatively, the slow disk can be used as a backup disk.
  • in the embodiment of the invention, for each disk in the disk array (RAID), the delay times of two or more input/output (I/O) operations processed by the disk within a preset time are obtained; the distribution of the obtained delay times over the pre-divided preset intervals is counted; the average delay time of the disk is calculated from the obtained delay times; whether the disk is a slow disk to be determined is judged according to the average delay time of the disk; and, if the disk is a slow disk to be determined, the maximum count of delay times over all preset intervals is determined from its counted distribution, and whether the slow disk to be determined is a slow disk is judged according to the preset interval corresponding to that maximum.
  • the embodiment of the invention also discloses a disk detecting device.
  • the embodiment of the invention does not need to consider factors such as the disk's I/O model, type, and brand; it improves the accuracy of slow-disk identification and ensures that the RAID works efficiently.
  • FIG. 7 is a schematic diagram of functional modules of a second determining module according to an embodiment of the present invention.
  • the second determining module 50 includes:
  • the judging unit 51 is configured to determine whether the left end value of the preset interval corresponding to the maximum count of delay times of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of delay times of the other disks;
  • the determining unit 52 is configured to determine that the slow disk to be determined is a slow disk if the left end value of the preset interval corresponding to the maximum count of delay times of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of delay times of the other disks;
  • the other disks include the disks other than the slow disk to be determined.
  • according to the judgment result of the judging unit 51, if the left end value of the preset interval corresponding to the maximum count of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of the other disks (for example, in this embodiment, the left end value of X_3, 100 ms, is greater than the right end value of X_1, 10 ms), the slow disk to be determined is determined to be a slow disk.
  • otherwise, the slow disk to be determined is determined not to be a slow disk.
  • FIG. 8 is a schematic diagram of functional modules of a first determining module according to an embodiment of the present invention.
  • the first determining module 40 includes:
  • the obtaining unit 41 is arranged to obtain the average delay time of the RAID according to the average delay time of each disk in the RAID.
  • the average delay time of the RAID is calculated from the average I/O delay times of all the disks obtained by the obtaining module 10; of course, it need not be calculated from the average delay times of all the disks.
  • the average delay time of the RAID can be equal to the sum of the average delay times of the disks divided by the total number of disks, as shown in the following equation: $T_{raid} = \frac{1}{N}\sum_{x=1}^{N} T_x$, where
  • Traid is the average delay time of the RAID,
  • Tx is the average delay time of the x-th disk, and
  • N is the total number of disks in the RAID participating in the calculation.
  • N can be the number of all disks in the RAID.
  • the average delay time of the RAID can also be calculated with the data of the slow disk to be determined excluded, as shown in the following equation: $T_{raid} = \frac{\sum_{x=1}^{N} T_x - T_K}{N-1}$, where
  • Traid is the average delay time of the RAID,
  • Tx is the average delay time of the x-th disk,
  • TK is the average delay time of the slow disk to be determined (the K-th disk), and
  • N is the total number of disks in the RAID participating in the calculation.
  • N can be the number of all disks in the RAID.
  • the average delay time of the RAID can also be obtained in other ways.
  • the determining unit 42 is configured to determine that the disk is a slow disk to be determined when the ratio of the average delay time of the disk to the average delay time of the RAID is greater than a preset threshold; wherein the ratio preset threshold is greater than 1.
  • alternatively, the ratio of the average delay time of the disk to the average delay time of the RAID need not be calculated; for example, the disk with the largest average delay time can be selected directly as the slow disk to be determined.
  • FIG. 9 is a schematic diagram of functional modules of a third embodiment of the apparatus for detecting a disk according to the present invention.
  • the device of the embodiment includes:
  • the reset module 60 is configured to reset the recorded information of the I/O delay times of the disks in the RAID.
  • when the terminal starts to judge the slow disks in the RAID, before obtaining the delay times of the I/Os processed by the disks in the RAID within the preset time, it can also reset the recorded delay-time information of all the disks in the RAID: the recorded I/O delay times and the count of each interval are reset to zero.
  • the embodiment of the present invention resets the recorded information of the input/output (I/O) delay times of the disks in the RAID before starting to judge slow disks.
  • the embodiment of the invention can eliminate the influence of historical data on the disk and ensure the correctness of the judgment result.
  • the above technical solution improves the accuracy of the slow disk recognition and ensures the efficient working state of the RAID.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

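A disk detection method and terminal. For each disk in a disk array (RAID): the delay times of two or more input/output (I/O) operations processed by the disk within a preset time are obtained (10); the distribution of the number of obtained delay times falling in each pre-divided preset interval is counted (20); the average delay time of the disk is calculated from the obtained delay times (30); whether the disk is a slow disk to be determined is judged according to its average delay time (40); if the disk is a slow disk to be determined, the maximum count over all preset intervals in its count distribution is determined, and whether the slow disk to be determined is a slow disk is judged according to the preset interval corresponding to that maximum (50). Factors such as the disk's I/O model, model number, and brand need not be considered, which improves the accuracy of slow-disk identification and keeps the RAID working efficiently.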

Description

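Method and apparatus for disk detection
Technical Field
This document relates to, but is not limited to, the field of storage, and in particular to a method and apparatus for disk detection.
Background
A disk is a hardware device used to store data. There are two main kinds: magnetic-media mechanical hard disks (mechanical hard disks for short) and solid-state disks (also called solid-state drives). A disk array, or RAID (Redundant Array of Independent Disks), is, simply put, a way of combining multiple independent disks in different manners into one disk group, so that data can be read from or stored on several disks at the same time, providing higher performance than a single disk.
The worst-performing disk in the group can severely affect the performance of the RAID. For example, suppose a RAID consists of Disk1, Disk2, and Disk3. If the member disks of the RAID have the same I/O model, one write operation to the RAID is split into three I/Os of equal size that are processed on the three disks. If the delay time of Disk1 is 100 milliseconds (ms), that of Disk2 is 100 ms, and that of Disk3 is 500 ms, then the delay time of this write operation reported to the host is determined by Disk3, that is, 500 ms.
A disk of this kind, whose large delay degrades RAID performance, is defined as a slow disk. Accurately identifying the slow disks in a RAID is crucial to improving overall RAID performance. At present, the industry mainly uses the following solutions to identify slow disks:
Solution 1: The I/O delay time of each disk is counted one by one, the I/O delay time is divided into several intervals, and a threshold is set for each interval. After an I/O is executed on a disk, its delay time is calculated and the I/O is counted in the corresponding interval. When the number of I/Os in some interval exceeds the threshold, the disk is determined to be a slow disk.
Solution 2: Certain attributes of the disk are monitored through the disk's Self-Monitoring, Analysis and Reporting Technology (S.M.A.R.T.), and thresholds are set; when a specified attribute value exceeds its threshold, the disk is marked as a slow disk or a faulty disk.
However, when disks are used in a RAID, Solution 1 has an obvious defect: it does not consider the disk's I/O model. For example, when the I/Os split across the RAID are all large, the time a disk takes to process each I/O is necessarily longer; or, when a disk is already running at full load and the processor keeps issuing I/Os to it, the disk cannot respond to the subsequent I/Os in time, so those I/Os are counted as having large delay times. In these cases all member disks of the RAID may well be misjudged as slow disks, which affects the normal service of the RAID. Solution 2 is strongly affected by brand, model, and even firmware. When disks of different brands are mixed in one RAID, and especially when disks positioned at different grades are mixed, the S.M.A.R.T. data of the worst-performing disk may show no abnormality or may not reach the threshold, so the disk continues to be used and the whole RAID cannot work in its optimal state.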
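Summary
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of protection of the claims.
Embodiments of the present invention provide a method and apparatus for disk detection that can improve the accuracy of slow-disk identification and the working efficiency of the RAID.
A method for disk detection provided by an embodiment of the present invention includes the following steps: for each disk in a disk array (RAID), obtaining the delay times of two or more input/output (I/O) operations processed by the disk within a preset time; counting the distribution of the number of obtained delay times falling in each pre-divided preset interval; calculating the average delay time of the disk from the obtained delay times; judging whether the disk is a slow disk to be determined according to the average delay time of the disk; and, if the disk is a slow disk to be determined, determining from its count distribution the maximum count over all preset intervals, and judging according to the preset interval corresponding to the determined maximum whether the slow disk to be determined is a slow disk.
Optionally, judging whether the slow disk to be determined is a slow disk includes:
judging whether the left end value of the preset interval corresponding to the maximum count of the delay times of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of the delay times of the other disks;
if the left end value of the preset interval corresponding to the maximum count of the delay times of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of the delay times of the other disks, determining that the slow disk to be determined is a slow disk;
wherein the other disks include the disks other than the slow disk to be determined.
Optionally, the method further includes:
when the counted number of delay times in one or more preset intervals is greater than the count preset threshold set in advance for each such preset interval, judging that the disk is the slow disk to be determined.
Optionally, judging whether the disk is a slow disk to be determined according to the average delay time of the disk includes:
obtaining the average delay time of the RAID from the average delay time of each disk in the RAID;
when the ratio of the average delay time of the disk to the average delay time of the RAID is greater than a ratio preset threshold, judging that the disk is a slow disk to be determined, wherein the ratio preset threshold is greater than 1.
Optionally, the method further includes: before obtaining the delay times of two or more I/Os processed by the disk within the preset time,
resetting the recorded information of the I/O delay times of the disks in the RAID.
In addition, an embodiment of the present invention further provides an apparatus for disk detection, the apparatus including:
an obtaining module, configured to obtain, for each disk in a disk array (RAID), the delay times of two or more input/output (I/O) operations processed by the disk within a preset time;
a statistics module, configured to count the distribution of the number of obtained delay times falling in each pre-divided preset interval;
a calculating module, configured to calculate the average delay time of the disk from the obtained delay times;
a first judging module, configured to judge whether the disk is a slow disk to be determined according to the average delay time of the disk;
a second judging module, configured to, if the disk is a slow disk to be determined, determine from its count distribution the maximum count over all preset intervals, and judge according to the preset interval corresponding to the determined maximum whether the slow disk to be determined is a slow disk.
Optionally, the second judging module includes:
a judging unit, configured to judge whether the left end value of the preset interval corresponding to the maximum count of the delay times of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of the delay times of the other disks;
a determining unit, configured to determine that the slow disk to be determined is a slow disk if the left end value of the preset interval corresponding to the maximum count of the delay times of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of the delay times of the other disks;
wherein the other disks include the disks other than the slow disk to be determined.
Optionally, the first judging module further includes a count judging unit,
the count judging unit being configured to judge that the disk is a slow disk to be determined when the counted number of delay times in one or more preset intervals is greater than the count preset threshold set in advance for each such preset interval.
Optionally, the first judging module includes:
an obtaining unit, configured to obtain the average delay time of the RAID from the average delay time of each disk in the RAID;
a judging unit, configured to judge that the disk is a slow disk to be determined when the ratio of the average delay time of the disk to the average delay time of the RAID is greater than a ratio preset threshold, wherein the ratio preset threshold is greater than 1.
Optionally, the apparatus further includes:
a reset module, configured to reset the recorded information of the I/O delay times of the disks in the RAID.
Compared with the related art, the technical solution provided by the present invention includes: for each disk in a disk array (RAID), obtaining the delay times of two or more input/output (I/O) operations processed by the disk within a preset time; counting the distribution of the number of obtained delay times falling in each pre-divided preset interval; calculating the average delay time of the disk from the obtained delay times; judging whether the disk is a slow disk to be determined according to the average delay time of the disk; and, if the disk is a slow disk to be determined, determining from its count distribution the maximum count over all preset intervals, and judging according to the preset interval corresponding to the determined maximum whether the slow disk to be determined is a slow disk. In this way, the embodiments of the present invention identify the slow disk to be determined from the average I/O delay time of each disk within the preset time, and then decide whether it is a slow disk according to the preset interval corresponding to the maximum of its count distribution. Factors such as the disk's I/O model, model number, and brand need not be considered, which improves the accuracy of slow-disk identification and keeps the RAID working efficiently.
Other aspects will become apparent upon reading and understanding the drawings and the detailed description.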
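Brief Description of the Drawings
FIG. 1 is a schematic flowchart of the first embodiment of the method for disk detection of the present invention;
FIG. 2 is a schematic flowchart of judging whether the slow disk to be determined is a slow disk in an embodiment of the present invention;
FIG. 3 is a schematic flowchart of the second embodiment of the disk detection method of the present invention;
FIG. 4 is a schematic flowchart of another way of judging whether a disk is a slow disk to be determined in an embodiment of the present invention;
FIG. 5 is a schematic flowchart of the third embodiment of the method for disk detection of the present invention;
FIG. 6 is a schematic diagram of the functional modules of the first embodiment of the apparatus for disk detection of the present invention;
FIG. 7 is a schematic diagram of the functional modules of the second judging module in an embodiment of the present invention;
FIG. 8 is a schematic diagram of the functional modules of the first judging module in an embodiment of the present invention;
FIG. 9 is a schematic diagram of the functional modules of the third embodiment of the apparatus for disk detection of the present invention.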
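Embodiments of the Invention
Embodiments of the present application are described in detail below with reference to the drawings. It should be noted that, provided there is no conflict, the embodiments of the present application and the features in the embodiments may be combined with one another arbitrarily.
An embodiment of the present invention provides a method for disk detection.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of the first embodiment of the disk detection method of the present invention.
In this embodiment, the disk detection method includes, for each disk in a disk array (RAID):
Step 10: obtaining the delay times of two or more input/output (I/O) operations processed by the disk within a preset time.
A user can set, through a configuration menu, the preset time or frequency for identifying slow disks in the RAID, for example once a month, once a day, or once every six months. Of course, identification need not cover the slow disks of the whole RAID; a particular disk may be examined instead. The preset time may also be factory-set. In further implementations no preset time is configured in advance; for example, when the user needs to identify slow disks in the RAID, a time period is set in real time through the configuration menu, or by default the period from when the disk was first put into use until the moment slow disks need to be identified is used.
Within the preset time, the delay times of multiple (two or more) I/O operations processed by each disk in the RAID are obtained. The delay time of a disk processing an I/O covers the period from the time the terminal issues the I/O request to the disk until the time the disk responds after processing that request. In this embodiment, the terminal or software can record the moment t1 at which each I/O request is issued to the disk and the moment t2 at which the hard disk responds after receiving the I/O; the delay time of that I/O is then T = t2 - t1. Alternatively, the time from issuing each I/O request until the hard disk receives it can be recorded as t1', and the time from the hard disk receiving the I/O request until it responds as t2'; the delay time is then T = t1' + t2'. In implementation, the delay times need not be obtained for every disk in the RAID; for example, when the user designates several disks in the RAID for identification, the delay time of each I/O processed by those designated disks is obtained. After the delay times are obtained, the method proceeds to step 20.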
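The measurement T = t2 - t1 described above can be sketched as follows. This is only an illustrative sketch: `timed_io` and the `io_call` callable are hypothetical names standing in for whatever the terminal actually uses to issue an I/O, and the choice of a monotonic clock is an assumption, not something the description specifies.

```python
import time


def timed_io(io_call, clock=time.monotonic):
    """Run one I/O call and return (result, delay) with delay = t2 - t1."""
    t1 = clock()        # moment the request is issued to the disk
    result = io_call()  # hypothetical callable standing in for the real I/O
    t2 = clock()        # moment the disk's response comes back
    return result, t2 - t1


if __name__ == "__main__":
    _, delay_s = timed_io(lambda: sum(range(1000)))
    print(f"delay: {delay_s * 1000:.3f} ms")
```

A monotonic clock is used here so that wall-clock adjustments cannot produce a negative delay time.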
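Step 20: counting the distribution of the number of obtained delay times falling in each pre-divided preset interval.
To collect the distribution of each disk's I/O delay times, the user can also set multiple preset intervals X1 to Xn for the I/O delay time through the configuration menu. In this embodiment, for example, the I/O delay time is divided into six preset intervals: X1 [0, 10 ms), X2 [10 ms, 100 ms), X3 [100 ms, 500 ms), X4 [500 ms, 1 s), X5 [1 s, 5 s), and X6 [5 s, ∞). Each interval has a corresponding count, x1, x2, x3, x4, x5, x6, of the delay times that fall in it; that is, for each I/O delay time obtained in step 10, the interval it belongs to is determined and that interval's count is incremented. If the delay time of one I/O obtained within the preset time is 450 milliseconds (ms), it belongs to X3, and x3 becomes 1. Of course, the intervals need not be set by the user through the configuration menu. For example, once all of a disk's delay times within the preset time have been collected, the intervals can be divided automatically according to the range of the delay times: if the smallest collected delay time is 5 ms and the largest is 2 s, the intervals may be [0, 10 ms), [10 ms, 50 ms), [50 ms, 1 s), [1 s, 2 s), and [2 s, ∞).
In the embodiments of the present invention, each I/O delay time of each disk may be recorded within the preset time while the per-interval counts are updated at the same time; alternatively, each I/O delay time of each disk may be recorded first and, after recording is complete, the distribution of counts over the pre-divided preset intervals is computed from the recorded delay times T and the preset intervals. Once the count for each preset interval has been obtained, the method proceeds to step 30.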
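As a minimal sketch of this counting step, the six preset intervals of the embodiment can be represented by their left end values and each delay time assigned to an interval by binary search. The names `LEFT_ENDS_MS`, `interval_index`, and `count_distribution` are illustrative assumptions, and delay times are assumed to be expressed in milliseconds.

```python
from bisect import bisect_right

# Left end values (ms) of the six preset intervals X1..X6 of this embodiment:
# [0,10), [10,100), [100,500), [500,1000), [1000,5000), [5000, inf)
LEFT_ENDS_MS = [0, 10, 100, 500, 1000, 5000]


def interval_index(delay_ms):
    """Return the 0-based index of the half-open interval containing delay_ms."""
    if delay_ms < 0:
        raise ValueError("delay time must be non-negative")
    return bisect_right(LEFT_ENDS_MS, delay_ms) - 1


def count_distribution(delays_ms):
    """Return the counts [x1, ..., x6] of delay times per preset interval."""
    counts = [0] * len(LEFT_ENDS_MS)
    for delay in delays_ms:
        counts[interval_index(delay)] += 1
    return counts
```

With these definitions a single 450 ms delay time lands in X3, matching the example in the text; a delay time of exactly 10 ms lands in X2 because the intervals are closed on the left and open on the right.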
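Step 30: calculating the average delay time of the disk from the obtained delay times.
The average I/O delay time of each disk is calculated from all the delay times collected in step 10, using the following formula (Figure PCTCN2016081438-appb-000001):
$$T_{avgnew} = \frac{1}{n}\sum_{i=1}^{n} T_i$$
where Tavgnew is the average delay time of the disk's I/O processing, Ti is the delay time of the i-th I/O processed by the disk, and n is the total count, which can be obtained from the per-interval counts of step 20; in this embodiment n = x1 + x2 + x3 + x4 + x5 + x6. Of course, the total count can also be obtained simply by counting all the I/O delay times collected in step 10.
Optionally, in the embodiments of the present invention, if the average delay time is recalculated each time an I/O request is issued to the disk, the average delay time of the hard disk equals the total delay time of all I/Os (of the RAID) divided by the total number of I/Os. The total delay time equals the previous count multiplied by the previous average delay time, plus the current delay time Tn; the total number equals the previous count plus one for the current delay time. The formula is as follows:
$$T_{avgnew} = \frac{Count \cdot T_{avgold} + T_n}{Count + 1},$$
where Tavgnew is the current average delay time of the disk's I/O processing, Tavgold is the average delay time of the I/Os the disk had completed before the update, and Count is the total number of I/Os counted before the update. An equivalent formula expressed in terms of n may also be used (the formula is not reproduced in this text; given the definitions that follow, it presumably corresponds to Tavgnew = ((n - 1) × Tavgold + Tn) / n), where Tavgnew is the current average delay time of the disk's I/O processing, Tavgold is the average delay time of the I/Os the disk had completed before the update, n is the total number of I/Os counted, and Tn is the delay time of the n-th recorded I/O. Those skilled in the art can make other variations of the formula based on the above description, which are not enumerated here. After the average delay time is obtained, the method proceeds to step 40.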
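The incremental update described in words above (previous count times previous average, plus the current delay time, divided by the new count) can be sketched as a small function. The name `update_average` is an assumption made for illustration; the arithmetic follows the verbal description rather than any particular implementation.

```python
def update_average(old_avg, count, new_delay):
    """Fold one new delay time into a running average.

    old_avg   -- average delay time before the update (Tavgold)
    count     -- number of I/Os counted before the update (Count)
    new_delay -- delay time of the current I/O (Tn)
    Returns (new_avg, new_count).
    """
    new_count = count + 1
    new_avg = (count * old_avg + new_delay) / new_count
    return new_avg, new_count


if __name__ == "__main__":
    avg, n = 0.0, 0
    for delay_ms in (100, 100, 500):
        avg, n = update_average(avg, n, delay_ms)
    print(avg, n)
```

Updating this way gives the same result as summing all delay times and dividing by n, but only the previous average and the count need to be stored per disk.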
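Step 40: judging whether the disk is a slow disk to be determined according to the average delay time of the disk.
A preliminary screening is performed on the average delay times obtained in step 30 to identify the disks in the RAID that may be slow disks; these are defined as slow disks to be determined. For example, it is judged whether the average delay time of each disk is greater than a preset time threshold; the threshold can be set according to parameters such as the disk's brand, type, and time in service. If the average delay time of a disk obtained in step 30 is greater than the preset time threshold, the disk is determined to be a slow disk to be determined. Of course, the check can also be applied to only some of the disks in the RAID, to find the slow disks to be determined among them. Once a slow disk to be determined has been identified, the method proceeds to step 50. If the disk is judged to be a normal disk, the method returns to step 10 or the procedure ends.
Optionally, the slow disk to be determined can also be identified from the count distribution obtained in step 20: when the counted number of delay times in one or more preset intervals is greater than the count preset threshold set in advance for each such interval, the disk is judged to be a slow disk to be determined. For example, in this embodiment it can be judged whether the proportion of the disk's total count that falls in the fifth and sixth of the six preset intervals exceeds a preset proportion threshold, that is, whether the share of counts in the longer-delay intervals (for example, the largest and second largest) exceeds the preset proportion threshold. If it does, the disk is determined to be a slow disk to be determined.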
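The proportion check in the example above can be sketched as follows. The function name, the default of two tail intervals, and the 0.2 proportion threshold are all assumptions chosen for illustration; the description does not give a concrete threshold value.

```python
def tail_share_exceeds(counts, tail_intervals=2, share_threshold=0.2):
    """Return True if the share of counts in the longest-delay intervals
    exceeds share_threshold.

    counts          -- per-interval counts [x1, ..., xn], shortest delays first
    tail_intervals  -- how many of the last intervals to include (assumed 2)
    share_threshold -- preset proportion threshold (assumed value)
    """
    total = sum(counts)
    if total == 0:
        return False  # no I/O recorded, nothing to judge
    tail = sum(counts[len(counts) - tail_intervals:])
    return tail / total > share_threshold
```

For the distribution x1..x6 = 10, 10, 100, 5, 1, 1 the fifth and sixth intervals hold 2 of 127 delay times, so under the assumed threshold this check alone would not flag the disk.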
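Step 50: determining, from the count distribution of the slow disk to be determined, the maximum count over all preset intervals, and judging according to the preset interval corresponding to the determined maximum whether the slow disk to be determined is a slow disk.
For the slow disk to be determined identified in step 40, the preset interval corresponding to the maximum of its count distribution is found, and whether it is a slow disk is then judged according to that interval. For example, it is judged whether the maximum count, as a percentage of the total count, is greater than a preset percentage; if so, the slow disk to be determined is determined to be a slow disk.
After a slow disk has been determined, it can be handled accordingly. For example, data may still be written to the slow disk, but when its data needs to be read, a degraded read is used; or the data of the other disks in the RAID is read and the data that would have been read from the slow disk is obtained by logical calculation; or the slow disk is used as a backup disk, and so on.
Those skilled in the art will understand that step 20 need not be performed between step 10 and step 30; it only needs to be performed before step 50. Of course, those skilled in the art may also make other changes to the steps or conditions based on the description of the embodiments of the present invention.
In the embodiments of the present invention, for each disk in a disk array (RAID), the delay times of two or more input/output (I/O) operations processed by the disk within a preset time are obtained; the distribution of the number of obtained delay times falling in each pre-divided preset interval is counted; the average delay time of the disk is calculated from the obtained delay times; whether the disk is a slow disk to be determined is judged according to its average delay time; and, if the disk is a slow disk to be determined, the maximum count over all preset intervals in its count distribution is determined, and whether the slow disk to be determined is a slow disk is judged according to the preset interval corresponding to that maximum. An embodiment of the present invention also discloses a disk detection apparatus. The embodiments of the present invention do not need to consider factors such as the disk's I/O model, model number, and brand, which improves the accuracy of slow-disk identification and keeps the RAID working efficiently.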
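Referring to FIG. 2, FIG. 2 is a schematic flowchart of judging whether the slow disk to be determined is a slow disk in an embodiment of the present invention.
Based on the first embodiment of the disk detection method of the present invention, step 50 includes:
Step 51: judging whether the left end value of the preset interval corresponding to the maximum count of the delay times of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of the other disks.
Here, the left end value denotes the minimum value of an interval, and the right end value denotes its maximum value.
Based on the slow disk to be determined identified in step 40 and the per-interval count distribution from step 20, the preset interval with the largest count for the slow disk to be determined is found, and it is then judged whether the left end value of that interval is greater than the right end value of the interval with the largest count for the other disks. For example, in this embodiment the distribution of the slow disk to be determined over the preset intervals is x1 = 10, x2 = 10, x3 = 100, x4 = 5, x5 = 1, x6 = 1, so the interval with the maximum count is X3 [100 ms, 500 ms); the distribution of the other disks is x1 = 100, x2 = 10, x3 = 12, x4 = 5, x5 = 0, x6 = 0, so the interval with the maximum count is X1 [0, 10 ms). The X3 interval of the slow disk to be determined is compared with the X1 interval of the other disks to obtain the judgment result. If the left end value of the preset interval corresponding to the maximum count of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of the other disks, the method proceeds to step 52; otherwise it proceeds to step 53.
Step 52: when the left end value of the preset interval corresponding to the maximum count of the delay times of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of the delay times of the other disks, determining that the slow disk to be determined is a slow disk.
According to the judgment result of step 51, if the left end value of the preset interval corresponding to the maximum count of the delay times of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of the delay times of the other disks (in this embodiment, the left end value of X3, 100 ms, is greater than the right end value of X1, 10 ms), the slow disk to be determined is determined to be a slow disk.
Step 53: when the left end value of the preset interval corresponding to the maximum count of the delay times of the slow disk to be determined is less than or equal to the right end value of the preset interval corresponding to the maximum count of the delay times of the other disks, determining that the slow disk to be determined is not a slow disk.
According to the judgment result of step 51, if the left end value of the preset interval corresponding to the maximum count of the slow disk to be determined is less than or equal to the right end value of the preset interval corresponding to the maximum count of the delay times of the other disks, the disk is determined not to be a slow disk. After the judgment is complete, the result can be output or the judgment procedure can be executed again.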
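Steps 51 to 53 can be sketched with the six intervals and the two example distributions given above. Two points in this sketch are assumptions rather than statements of the described method: when several other disks exist, the candidate is compared against each of them and must exceed all of them; and when two intervals tie for the maximum count, the one with the shorter delay is taken.

```python
import math

# The six preset intervals X1..X6 of this embodiment, in milliseconds.
INTERVALS_MS = [(0, 10), (10, 100), (100, 500),
                (500, 1000), (1000, 5000), (5000, math.inf)]


def mode_interval(counts):
    """Return (left, right) of the interval with the largest count.

    On ties the first (shortest-delay) interval wins; this is an assumption.
    """
    idx = max(range(len(counts)), key=lambda i: counts[i])
    return INTERVALS_MS[idx]


def is_slow_disk(candidate_counts, other_disks_counts):
    """Step 51-53 sketch: compare the candidate's left end value with the
    right end value of every other disk's most populated interval."""
    if not other_disks_counts:
        return False  # nothing to compare against
    left_end, _ = mode_interval(candidate_counts)
    return all(left_end > mode_interval(c)[1] for c in other_disks_counts)
```

With the example data, the candidate's most populated interval is X3 (left end 100 ms) and the other disk's is X1 (right end 10 ms), so `is_slow_disk([10, 10, 100, 5, 1, 1], [[100, 10, 12, 5, 0, 0]])` is true. Adjacent intervals do not qualify, because the comparison is strict.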
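Referring to FIG. 3, FIG. 3 is a schematic flowchart of the second embodiment of the method for disk detection of the present invention.
Based on the first embodiment of the disk detection method of the present invention, the method of this embodiment further includes:
Step 60: judging whether the disk is a slow disk to be determined according to the count of delay times in one or more preset intervals and the count preset threshold set in advance for each preset interval.
Optionally, when the counted number of delay times in one or more preset intervals is greater than the count preset threshold set in advance for each such preset interval, the disk is judged to be a slow disk to be determined.
This step can be one implementation of step 40. In implementation, when dividing the intervals through the configuration menu, the user can assign each interval a threshold value to serve as its count preset threshold, based on parameters such as each disk's brand, type, and time in service; the thresholds can also be assigned automatically from those parameters. A threshold may also be assigned to only one interval. Using the per-interval counts from step 20, it is judged whether the count of each preset interval exceeds that interval's count preset threshold. If, for some disk, the count of delay times in some preset interval exceeds the corresponding count preset threshold, the disk is determined to be a slow disk to be determined.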
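A minimal sketch of this per-interval threshold check follows. Representing "no threshold assigned to this interval" as `None` is an assumption made for illustration, as are the function name and the threshold values used in the usage note.

```python
def exceeds_count_threshold(counts, thresholds):
    """Return True if any interval's count exceeds its count preset threshold.

    counts     -- per-interval counts [x1, ..., xn]
    thresholds -- per-interval count preset thresholds; None means that no
                  threshold was assigned to that interval (assumed encoding)
    """
    if len(counts) != len(thresholds):
        raise ValueError("counts and thresholds must have the same length")
    return any(t is not None and c > t for c, t in zip(counts, thresholds))
```

For instance, with thresholds assigned only to the two longest-delay intervals, `exceeds_count_threshold([10, 10, 100, 5, 1, 1], [None, None, None, None, 10, 0])` is true because x6 = 1 exceeds its threshold of 0.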
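Referring to FIG. 4, FIG. 4 is a schematic flowchart of another way of judging whether a disk is a slow disk to be determined in an embodiment of the present invention.
Based on the first embodiment of the method for disk detection of the present invention, optionally, step 40 may further include:
Step 41: obtaining the average delay time of the RAID from the average delay times of the disks.
The average delay time of the RAID is calculated from the average I/O delay times of all the disks (computed from the delay times obtained in step 10); of course, it need not be calculated from the average delay times of all the disks. The average delay time of the RAID can equal the sum of the average delay times of the disks divided by the total number of disks, as shown in the following formula:
$$T_{raid} = \frac{1}{N}\sum_{x=1}^{N} T_x,$$
where Traid is the average delay time of the RAID, Tx is the average delay time of the x-th disk, and N is the total number of disks in the RAID participating in the calculation. N can be the number of all disks in the RAID.
In implementation, the average delay time of the RAID can also be calculated with the data of the slow disk to be determined excluded, as shown in the following formula:
$$T_{raid} = \frac{\sum_{x=1}^{N} T_x - T_K}{N-1},$$
where Traid is the average delay time of the RAID, Tx is the average delay time of the x-th disk, TK is the average delay time of the slow disk to be determined (the K-th disk), and N is the total number of disks in the RAID participating in the calculation. N can be the number of all disks in the RAID. Of course, the average delay time of the RAID can also be obtained in other ways.
After the average delay time of the RAID is obtained, the method proceeds to step 42.
Step 42: judging whether the disk is a slow disk to be determined according to the ratio of the average delay time of the disk to the average delay time of the RAID and a ratio preset threshold. This may include:
when the ratio of the average delay time of the disk to the average delay time of the RAID is greater than the ratio preset threshold, judging that the disk is a slow disk to be determined, wherein the ratio preset threshold is greater than 1.
Using the average delay time of the RAID obtained in step 41, the ratio of the average delay time of the disk to the average delay time of the RAID is calculated, and whether the disk is a slow disk to be determined is judged by comparing that ratio with the ratio preset threshold, which is greater than 1. If the ratio is greater than the ratio preset threshold, the disk is determined to be a slow disk to be determined. In implementation, the ratio need not be calculated; for example, the disk with the largest average delay time can be selected directly as the slow disk to be determined.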
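Steps 41 and 42 can be sketched as follows, covering both ways of computing the RAID average (over all disks, or with the disk under test excluded). The function names and the ratio threshold of 2.0 are assumptions for illustration; the description only requires the threshold to be greater than 1. The sketch assumes at least two disks when the disk under test is excluded.

```python
def raid_average(disk_averages, exclude=None):
    """Average of the per-disk average delay times, optionally leaving out
    the disk at index `exclude`."""
    values = [t for i, t in enumerate(disk_averages) if i != exclude]
    return sum(values) / len(values)


def disks_to_be_determined(disk_averages, ratio_threshold=2.0,
                           leave_one_out=False):
    """Return the indices of disks whose average delay time, divided by the
    RAID average, exceeds the ratio preset threshold (assumed value 2.0)."""
    if ratio_threshold <= 1:
        raise ValueError("ratio preset threshold must be greater than 1")
    flagged = []
    for k, t_k in enumerate(disk_averages):
        t_raid = raid_average(disk_averages,
                              exclude=k if leave_one_out else None)
        if t_raid > 0 and t_k / t_raid > ratio_threshold:
            flagged.append(k)
    return flagged
```

Using the background example of three disks averaging 100 ms, 100 ms, and 500 ms, the RAID average over all disks is about 233 ms and the third disk's ratio is about 2.14; with the third disk excluded the RAID average is 100 ms and the ratio is 5. Excluding the disk under test makes the ratio more sensitive, because a slow disk no longer inflates the average it is compared against.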
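Referring to FIG. 5, FIG. 5 is a schematic flowchart of the third embodiment of the method for disk detection of the present invention.
Based on the first embodiment of the disk detection method of the present invention, optionally, before step 10 the method may further include:
Step 70: resetting the recorded information of the I/O delay times of the disks in the RAID.
When the terminal starts to judge the slow disks in the RAID, before obtaining the delay times of the I/Os processed by the disks in the RAID within the preset time, the recorded delay-time information of all the disks in the RAID can also be reset: the recorded I/O delay times and the count of each interval are reset to 0.
By resetting the recorded information of the input/output (I/O) delay times of the disks in the RAID before starting to judge slow disks, the embodiments of the present invention eliminate the influence of historical data on the disks and ensure the correctness of the judgment result.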
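The reset in step 70 can be sketched with a small per-disk record that holds the interval counts and the accumulated delay time. `DelayRecord`, `reset_all`, and the disk names used as dictionary keys are hypothetical; the description does not say how the recorded information is stored.

```python
from dataclasses import dataclass, field

NUM_INTERVALS = 6  # six preset intervals, as in this embodiment


@dataclass
class DelayRecord:
    """Recorded delay-time information for one disk (illustrative layout)."""
    counts: list = field(default_factory=lambda: [0] * NUM_INTERVALS)
    total_delay_ms: float = 0.0

    def add(self, delay_ms, interval_idx):
        """Record one I/O delay time in the given interval."""
        self.counts[interval_idx] += 1
        self.total_delay_ms += delay_ms

    def reset(self):
        """Clear the recorded delay times and set every interval count to 0."""
        self.counts = [0] * NUM_INTERVALS
        self.total_delay_ms = 0.0


def reset_all(records):
    """Reset the records of every disk before a new detection window starts."""
    for record in records.values():
        record.reset()
```

Calling `reset_all` at the start of each preset time keeps a disk that was slow only in the past from being flagged on stale counts.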
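It should be noted that the method of the embodiments of the present invention can be implemented by the server on which the disks reside, or by a server or apparatus that can communicate with the disks.
An embodiment of the present invention also provides a computer storage medium storing computer-executable instructions, the computer-executable instructions being used to perform the above method for disk detection.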
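An embodiment of the present invention further provides an apparatus for disk detection.
Referring to FIG. 6, FIG. 6 is a schematic diagram of the functional modules of the first embodiment of the apparatus for disk detection of the present invention.
In this embodiment, the disk detection apparatus includes:
an obtaining module 10, configured to obtain, for each disk in a disk array (RAID), the delay times of two or more input/output (I/O) operations processed by the disk within a preset time.
A user can set, through a configuration menu, the preset time or frequency for identifying slow disks in the RAID, for example once a month, once a day, or once every six months. Of course, identification need not cover the slow disks of the whole RAID; a particular disk may be examined instead. The preset time may also be factory-set. In further implementations no preset time is configured in advance; for example, when the user needs to identify slow disks in the RAID, a time period is set in real time through the configuration menu, or by default the period from when the disk was first put into use until the moment slow disks need to be identified is used.
Within the preset time, the delay times of multiple (two or more) I/O operations processed by each disk in the RAID are obtained. The delay time of a disk processing an I/O covers the period from the time the terminal issues the I/O request to the disk until the time the disk responds after processing that request. In this embodiment, the terminal or software can record the moment t1 at which each I/O request is issued to the disk and the moment t2 at which the hard disk responds after receiving the I/O; the delay time of that I/O is then T = t2 - t1. Alternatively, the time from issuing each I/O request until the hard disk receives it can be recorded as t1', and the time from the hard disk receiving the I/O request until it responds as t2'; the delay time is then T = t1' + t2'. In implementation, the delay times need not be obtained for every disk in the RAID; for example, when the user designates several disks in the RAID for identification, the delay time of each I/O processed by those designated disks is obtained.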
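a statistics module 20, configured to count the distribution of the number of obtained delay times falling in each pre-divided preset interval.
To collect the distribution of each disk's I/O delay times, the user can also set multiple preset intervals X1 to Xn for the I/O delay time through the configuration menu. In this embodiment, for example, the I/O delay time is divided into six preset intervals: X1 [0, 10 ms), X2 [10 ms, 100 ms), X3 [100 ms, 500 ms), X4 [500 ms, 1 s), X5 [1 s, 5 s), and X6 [5 s, ∞). Each interval has a corresponding count, x1, x2, x3, x4, x5, x6, of the delay times that fall in it; that is, for each I/O delay time obtained by the obtaining module 10, the interval it belongs to is determined and that interval's count is incremented. If the delay time of one I/O obtained within the preset time is 450 milliseconds (ms), it belongs to X3, and x3 becomes 1. Of course, the intervals need not be set by the user through the configuration menu. For example, once all of a disk's delay times within the preset time have been collected, the intervals can be divided automatically according to the range of the delay times: if the smallest collected delay time is 5 ms and the largest is 2 s, the intervals may be [0, 10 ms), [10 ms, 50 ms), [50 ms, 1 s), [1 s, 2 s), and [2 s, ∞).
In the embodiments of the present invention, each I/O delay time of each disk may be recorded within the preset time while the per-interval counts are updated at the same time; alternatively, each I/O delay time of each disk may be recorded first and, after recording is complete, the distribution of counts over the pre-divided preset intervals is computed from the recorded delay times T and the preset intervals.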
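a calculating module 30, configured to calculate the average delay time of the disk from the obtained delay times.
The average I/O delay time of each disk is calculated from all the delay times collected by the obtaining module 10 and the count distribution collected by the statistics module 20, using the following formula (Figure PCTCN2016081438-appb-000002):
$$T_{avgnew} = \frac{1}{n}\sum_{i=1}^{n} T_i$$
where Tavgnew is the average delay time of the disk's I/O processing, Ti is the delay time of the i-th I/O processed by the disk, and n is the total count, which can be obtained from the per-interval counts of the statistics module 20. In this embodiment n = x1 + x2 + x3 + x4 + x5 + x6. Of course, the total count can also be obtained simply by counting all the I/O delay times collected by the obtaining module 10.
Optionally, in the embodiments of the present invention, if the average delay time is recalculated each time an I/O request is issued to the disk, the average delay time of the hard disk equals the total delay time of all I/Os (of the RAID) divided by the total number of I/Os. The total delay time equals the previous count multiplied by the previous average delay time, plus the current delay time Tn; the total number equals the previous count plus one for the current delay time. The formula is as follows:
$$T_{avgnew} = \frac{Count \cdot T_{avgold} + T_n}{Count + 1},$$
where Tavgnew is the current average delay time of the disk's I/O processing, Tavgold is the average delay time of the I/Os the disk had completed before the update, and Count is the total number of I/Os counted before the update. An equivalent formula expressed in terms of n may also be used (the formula is not reproduced in this text; given the definitions that follow, it presumably corresponds to Tavgnew = ((n - 1) × Tavgold + Tn) / n), where Tavgnew is the current average delay time of the disk's I/O processing, Tavgold is the average delay time of the I/Os the disk had completed before the update, n is the total number of I/Os counted, and Tn is the delay time of the n-th recorded I/O. Those skilled in the art can make other variations of the formula based on the above description, which are not enumerated here.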
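a first judging module 40, configured to judge whether the disk is a slow disk to be determined according to the average delay time of the disk.
A preliminary screening is performed on the average delay times obtained by the calculating module 30 to identify the disks in the RAID that may be slow disks; these are defined as slow disks to be determined. For example, it is judged whether the average delay time of each disk is greater than a preset time threshold; the threshold can be set according to parameters such as the disk's brand, type, and time in service. If the average delay time of a disk obtained by the calculating module 30 is greater than the preset time threshold, the disk is determined to be a slow disk to be determined. Of course, the check can also be applied to only some of the disks in the RAID, to find the slow disks to be determined among them.
Optionally, the slow disk to be determined can also be identified from the count distribution obtained by the statistics module 20: when the counted number of delay times in one or more preset intervals is greater than the count preset threshold set in advance for each such interval, the disk is judged to be a slow disk to be determined. For example, in this embodiment it can be judged whether the proportion of the disk's total count that falls in the fifth and sixth of the six preset intervals exceeds a preset proportion threshold, that is, whether the share of counts in the longer-delay intervals (for example, the largest and second largest) exceeds the preset proportion threshold. If it does, the disk is determined to be a slow disk to be determined.
The first judging module 40 is further configured to judge that the disk is a slow disk to be determined when the counted number of delay times in one or more preset intervals is greater than the count preset threshold set in advance for each such preset interval. In the embodiments of the present invention, this function can be implemented by providing a count judging unit.
Optionally, in implementation, when dividing the intervals through the configuration menu, the user can assign each interval a threshold value to serve as its count preset threshold, based on parameters such as each disk's brand, type, and time in service; the thresholds can also be assigned automatically from those parameters. A threshold may also be assigned to only one interval. Using the per-interval counts from the statistics module 20, it is judged whether the count of each preset interval exceeds that interval's count preset threshold. If, for some disk, the count of delay times in some preset interval exceeds the corresponding count preset threshold, the disk is determined to be a slow disk to be determined.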
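a second judging module 50, configured to, if the disk is a slow disk to be determined, determine from its count distribution the maximum count over all preset intervals, and judge according to the preset interval corresponding to the determined maximum whether the slow disk to be determined is a slow disk.
For the slow disk to be determined identified by the first judging module 40, the preset interval corresponding to the maximum of its count distribution is found, and whether it is a slow disk is then judged according to that interval. For example, it is judged whether the maximum count, as a percentage of the total count, is greater than a preset percentage; if so, the slow disk to be determined is determined to be a slow disk.
After a slow disk has been determined, it can be handled accordingly. For example, data may still be written to the slow disk, but when its data needs to be read, a degraded read is used; or the data of the other disks in the RAID is read and the data that would have been read from the slow disk is obtained by logical calculation; or the slow disk is used as a backup disk, and so on.
In the embodiments of the present invention, for each disk in a disk array (RAID), the delay times of two or more input/output (I/O) operations processed by the disk within a preset time are obtained; the distribution of the number of obtained delay times falling in each pre-divided preset interval is counted; the average delay time of the disk is calculated from the obtained delay times; whether the disk is a slow disk to be determined is judged according to its average delay time; and, if the disk is a slow disk to be determined, the maximum count over all preset intervals in its count distribution is determined, and whether the slow disk to be determined is a slow disk is judged according to the preset interval corresponding to that maximum. An embodiment of the present invention also discloses a disk detection apparatus. The embodiments of the present invention do not need to consider factors such as the disk's I/O model, model number, and brand, which improves the accuracy of slow-disk identification and keeps the RAID working efficiently.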
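FIG. 7 is a schematic diagram of the functional modules of the second judging module in an embodiment of the present invention.
Based on the first embodiment of the disk detection apparatus of the present invention, the second judging module 50 includes:
a judging unit 51, configured to judge whether the left end value of the preset interval corresponding to the maximum count of the delay times of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of the delay times of the other disks.
Based on the slow disk to be determined identified by the first judging module 40 and the per-interval count distribution from the statistics module 20, the preset interval with the largest count for the slow disk to be determined is found, and it is then judged whether the left end value of that interval is greater than the right end value of the interval with the largest count for the other disks. For example, in this embodiment the distribution of the slow disk to be determined over the preset intervals is x1 = 10, x2 = 10, x3 = 100, x4 = 5, x5 = 1, x6 = 1, so the interval with the maximum count is X3 [100 ms, 500 ms); the distribution of the other disks is x1 = 100, x2 = 10, x3 = 12, x4 = 5, x5 = 0, x6 = 0, so the interval with the maximum count is X1 [0, 10 ms). The X3 interval of the slow disk to be determined is compared with the X1 interval of the other disks to obtain the judgment result.
a determining unit 52, configured to determine that the slow disk to be determined is a slow disk if the left end value of the preset interval corresponding to the maximum count of the delay times of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of the delay times of the other disks;
wherein the other disks include the disks other than the slow disk to be determined.
According to the judgment result of the judging unit 51, if the left end value of the preset interval corresponding to the maximum count of the delay times of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of the delay times of the other disks (in this embodiment, the left end value of X3, 100 ms, is greater than the right end value of X1, 10 ms), the slow disk to be determined is determined to be a slow disk. If the left end value is less than or equal to that right end value, the disk is determined not to be a slow disk.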
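FIG. 8 is a schematic diagram of the functional modules of the first judging module in an embodiment of the present invention.
Based on the first embodiment of the disk detection apparatus of the present invention, the first judging module 40 includes:
an obtaining unit 41, configured to obtain the average delay time of the RAID from the average delay time of each disk in the RAID.
The average delay time of the RAID is calculated from the average I/O delay times of all the disks obtained by the obtaining module 10; of course, it need not be calculated from the average delay times of all the disks. The average delay time of the RAID can equal the sum of the average delay times of the disks divided by the total number of disks, as shown in the following formula:
$$T_{raid} = \frac{1}{N}\sum_{x=1}^{N} T_x,$$
where Traid is the average delay time of the RAID, Tx is the average delay time of the x-th disk, and N is the total number of disks in the RAID participating in the calculation. N can be the number of all disks in the RAID.
In implementation, the average delay time of the RAID can also be calculated with the data of the slow disk to be determined excluded, as shown in the following formula:
$$T_{raid} = \frac{\sum_{x=1}^{N} T_x - T_K}{N-1},$$
where Traid is the average delay time of the RAID, Tx is the average delay time of the x-th disk, TK is the average delay time of the slow disk to be determined (the K-th disk), and N is the total number of disks in the RAID participating in the calculation. N can be the number of all disks in the RAID. Of course, the average delay time of the RAID can also be obtained in other ways.
a judging unit 42, configured to judge that the disk is a slow disk to be determined when the ratio of the average delay time of the disk to the average delay time of the RAID is greater than a ratio preset threshold, wherein the ratio preset threshold is greater than 1.
Using the average delay time of the RAID obtained by the obtaining unit 41, the ratio of the average delay time of the disk to the average delay time of the RAID is calculated, and whether the disk is a slow disk to be determined is judged by comparing that ratio with the ratio preset threshold, which is greater than 1. If the ratio is greater than the ratio preset threshold, the disk is determined to be a slow disk to be determined. In implementation, the ratio need not be calculated; for example, the disk with the largest average delay time can be selected directly as the slow disk to be determined.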
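FIG. 9 is a schematic diagram of the functional modules of the third embodiment of the apparatus for disk detection of the present invention.
Based on the first embodiment of the disk detection apparatus of the present invention, optionally, the apparatus of this embodiment includes:
a reset module 60, configured to reset the recorded information of the I/O delay times of the disks in the RAID.
When the terminal starts to judge the slow disks in the RAID, before obtaining the delay times of the I/Os processed by the disks in the RAID within the preset time, the recorded delay-time information of all the disks in the RAID can also be reset: the recorded I/O delay times and the count of each interval are reset to 0.
By resetting the recorded information of the input/output (I/O) delay times of the disks in the RAID before starting to judge slow disks, the embodiments of the present invention eliminate the influence of historical data on the disks and ensure the correctness of the judgment result.
The above are only optional embodiments of the present invention and do not thereby limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.
Those of ordinary skill in the art will understand that all or some of the steps of the above method can be completed by a program instructing the relevant hardware (for example, a processor), and the program can be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk, or an optical disc. Optionally, all or some of the steps of the above embodiments can also be implemented using one or more integrated circuits. Accordingly, each module or unit in the above embodiments can be implemented in the form of hardware, for example by an integrated circuit realizing its function, or in the form of a software functional module, for example by a processor executing programs or instructions stored in a memory to realize its function. The present invention is not limited to any particular combination of hardware and software.
Although the embodiments disclosed in the present application are as described above, the content described is only an implementation adopted to facilitate understanding of the present application and is not intended to limit it, for example the specific implementation methods in the embodiments of the present invention. Any person skilled in the art to which the present application belongs may make any modifications and changes in the form and details of implementation without departing from the spirit and scope disclosed by the present application, but the scope of patent protection of the present application shall still be subject to the scope defined by the appended claims.
Industrial Applicability
The above technical solution improves the accuracy of slow-disk identification and keeps the RAID working efficiently.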

Claims (10)

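  1. A method for disk detection, the method comprising: for each disk in a disk array (RAID),
    obtaining the delay times of two or more input/output (I/O) operations processed by the disk within a preset time;
    counting the distribution of the number of the obtained delay times falling in each pre-divided preset interval;
    calculating the average delay time of the disk from the obtained delay times;
    judging whether the disk is a slow disk to be determined according to the average delay time of the disk;
    if the disk is a slow disk to be determined, determining, from the count distribution of the slow disk to be determined, the maximum count over all preset intervals, and judging according to the preset interval corresponding to the determined maximum whether the slow disk to be determined is a slow disk.
  2. The method according to claim 1, wherein judging whether the slow disk to be determined is a slow disk comprises:
    judging whether the left end value of the preset interval corresponding to the maximum count of the delay times of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of the delay times of the other disks;
    if the left end value of the preset interval corresponding to the maximum count of the delay times of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of the delay times of the other disks, determining that the slow disk to be determined is a slow disk;
    wherein the other disks include the disks other than the slow disk to be determined.
  3. The method according to claim 1, further comprising:
    when the counted number of delay times in one or more preset intervals is greater than the count preset threshold set in advance for each such preset interval, judging that the disk is the slow disk to be determined.
  4. The method according to any one of claims 1 to 3, wherein judging whether the disk is a slow disk to be determined according to the average delay time of the disk comprises:
    obtaining the average delay time of the RAID from the average delay time of each disk in the RAID;
    when the ratio of the average delay time of the disk to the average delay time of the RAID is greater than a ratio preset threshold, judging that the disk is a slow disk to be determined, wherein the ratio preset threshold is greater than 1.
  5. The method according to claims 1 to 3, further comprising: before obtaining the delay times of two or more I/Os processed by the disk within the preset time,
    resetting the recorded information of the I/O delay times of the disks in the RAID.
  6. An apparatus for disk detection, the apparatus comprising:
    an obtaining module, configured to obtain, for each disk in a disk array (RAID), the delay times of two or more input/output (I/O) operations processed by the disk within a preset time;
    a statistics module, configured to count the distribution of the number of the obtained delay times falling in each pre-divided preset interval;
    a calculating module, configured to calculate the average delay time of the disk from the obtained delay times;
    a first judging module, configured to judge whether the disk is a slow disk to be determined according to the average delay time of the disk;
    a second judging module, configured to, if the disk is a slow disk to be determined, determine, from the count distribution of the slow disk to be determined, the maximum count over all preset intervals, and judge according to the preset interval corresponding to the determined maximum whether the slow disk to be determined is a slow disk.
  7. The apparatus according to claim 6, wherein the second judging module comprises:
    a judging unit, configured to judge whether the left end value of the preset interval corresponding to the maximum count of the delay times of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of the delay times of the other disks;
    a determining unit, configured to determine that the slow disk to be determined is a slow disk if the left end value of the preset interval corresponding to the maximum count of the delay times of the slow disk to be determined is greater than the right end value of the preset interval corresponding to the maximum count of the delay times of the other disks;
    wherein the other disks include the disks other than the slow disk to be determined.
  8. The apparatus according to claim 6, wherein the first judging module further comprises a count judging unit,
    the count judging unit being configured to judge that the disk is a slow disk to be determined when the counted number of delay times in one or more preset intervals is greater than the count preset threshold set in advance for each such preset interval.
  9. The apparatus according to any one of claims 6 to 8, wherein the first judging module comprises:
    an obtaining unit, configured to obtain the average delay time of the RAID from the average delay time of each disk in the RAID;
    a judging unit, configured to judge that the disk is a slow disk to be determined when the ratio of the average delay time of the disk to the average delay time of the RAID is greater than a ratio preset threshold, wherein the ratio preset threshold is greater than 1.
  10. The apparatus according to any one of claims 6 to 8, further comprising:
    a reset module, configured to reset the recorded information of the I/O delay times of the disks in the RAID.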
PCT/CN2016/081438 2015-07-17 2016-05-09 一种磁盘检测的方法和装置 WO2017012392A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510422809.8 2015-07-17
CN201510422809.8A CN106354590B (zh) 2015-07-17 2015-07-17 磁盘检测方法和装置

Publications (1)

Publication Number Publication Date
WO2017012392A1 true WO2017012392A1 (zh) 2017-01-26

Family

ID=57833520

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/081438 WO2017012392A1 (zh) 2015-07-17 2016-05-09 一种磁盘检测的方法和装置

Country Status (2)

Country Link
CN (1) CN106354590B (zh)
WO (1) WO2017012392A1 (zh)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110083515A (zh) * 2019-04-24 2019-08-02 苏州元核云技术有限公司 分布式存储系统中慢盘的快速判断方法、装置及存储介质
CN111045881A (zh) * 2018-10-15 2020-04-21 深信服科技股份有限公司 一种慢盘检测方法及系统
CN112416639A (zh) * 2020-11-16 2021-02-26 新华三技术有限公司成都分公司 一种慢盘检测方法、装置、设备及存储介质
CN113312218A (zh) * 2021-03-31 2021-08-27 阿里巴巴新加坡控股有限公司 磁盘的检测方法和装置
CN114003477A (zh) * 2021-10-27 2022-02-01 苏州浪潮智能科技有限公司 慢盘诊断信息收集方法、系统、终端及存储介质
CN114415973A (zh) * 2022-03-28 2022-04-29 阿里云计算有限公司 慢盘检测方法、装置、电子设备及存储介质
CN115051956A (zh) * 2022-06-30 2022-09-13 北京达佳互联信息技术有限公司 一种连接建立方法、装置、设备及存储介质
CN115114099A (zh) * 2022-06-24 2022-09-27 苏州浪潮智能科技有限公司 一种磁盘冗余阵列中慢盘识别处理方法、装置及存储介质
CN117194177A (zh) * 2023-11-03 2023-12-08 四川省华存智谷科技有限责任公司 一种提高存储系统慢盘检测准确率的方法
CN117806890A (zh) * 2024-02-28 2024-04-02 四川省华存智谷科技有限责任公司 一种基于分布式存储的慢盘检测处理方法

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107329710A (zh) * 2017-07-06 2017-11-07 郑州云海信息技术有限公司 一种存储性能优化的方法、系统及存储软件
CN107391042A (zh) * 2017-07-28 2017-11-24 郑州云海信息技术有限公司 一种磁盘阵列的设计方法及系统
CN107465579B (zh) * 2017-09-22 2021-03-09 苏州浪潮智能科技有限公司 一种端口性能统计系统
CN107832202A (zh) * 2017-11-06 2018-03-23 郑州云海信息技术有限公司 一种检测硬盘的方法、装置及计算机可读存储介质
CN109783259B (zh) * 2017-11-15 2022-07-26 成都华为技术有限公司 慢盘检测方法和装置、存储介质
CN109815037B (zh) * 2017-11-22 2021-07-20 华为技术有限公司 慢盘检测方法和存储阵列
CN109684140B (zh) * 2018-12-11 2022-07-01 广东浪潮大数据研究有限公司 一种慢盘检测方法、装置、设备及计算机可读存储介质
CN111399748B (zh) * 2019-01-02 2023-09-05 中国移动通信有限公司研究院 一种数据放置方法、装置和计算机可读存储介质
CN112241343B (zh) * 2019-07-19 2024-02-23 深信服科技股份有限公司 一种慢盘检测方法、装置、电子设备及可读存储介质
CN111290909A (zh) * 2020-01-19 2020-06-16 山东汇贸电子口岸有限公司 一种对ceph集群进行监控和告警的系统及方法
CN115348157B (zh) * 2021-05-14 2023-09-05 中国移动通信集团浙江有限公司 分布式存储集群的故障定位方法、装置、设备及存储介质
CN113849123B (zh) * 2021-08-14 2023-08-25 苏州浪潮智能科技有限公司 一种慢盘的数据处理方法、系统、设备以及介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050240742A1 (en) * 2004-04-22 2005-10-27 Apple Computer, Inc. Method and apparatus for improving performance of data storage systems
US20060106926A1 (en) * 2003-08-19 2006-05-18 Fujitsu Limited System and program for detecting disk array device bottlenecks
CN102147708A (zh) * 2010-02-10 2011-08-10 成都市华为赛门铁克科技有限公司 一种磁盘检测方法及装置
CN103488544A (zh) * 2013-09-26 2014-01-01 华为技术有限公司 检测慢盘的处理方法和装置
CN103810062A (zh) * 2014-03-05 2014-05-21 华为技术有限公司 慢盘检测方法和装置

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6321345B1 (en) * 1999-03-01 2001-11-20 Seachange Systems, Inc. Slow response in redundant arrays of inexpensive disks
CN103019885B (zh) * 2012-11-26 2015-05-27 大唐移动通信设备有限公司 基于嵌入式Linux的硬盘坏道监测方法及系统

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111045881A (zh) * 2018-10-15 2020-04-21 深信服科技股份有限公司 一种慢盘检测方法及系统
CN110083515B (zh) * 2019-04-24 2023-06-20 苏州元核云技术有限公司 分布式存储系统中慢盘的快速判断方法、装置及存储介质
CN110083515A (zh) * 2019-04-24 2019-08-02 苏州元核云技术有限公司 分布式存储系统中慢盘的快速判断方法、装置及存储介质
CN112416639A (zh) * 2020-11-16 2021-02-26 新华三技术有限公司成都分公司 一种慢盘检测方法、装置、设备及存储介质
CN113312218A (zh) * 2021-03-31 2021-08-27 阿里巴巴新加坡控股有限公司 磁盘的检测方法和装置
CN114003477A (zh) * 2021-10-27 2022-02-01 苏州浪潮智能科技有限公司 慢盘诊断信息收集方法、系统、终端及存储介质
CN114003477B (zh) * 2021-10-27 2023-08-22 苏州浪潮智能科技有限公司 慢盘诊断信息收集方法、系统、终端及存储介质
CN114415973A (zh) * 2022-03-28 2022-04-29 阿里云计算有限公司 慢盘检测方法、装置、电子设备及存储介质
CN114415973B (zh) * 2022-03-28 2022-08-30 阿里云计算有限公司 慢盘检测方法、装置、电子设备及存储介质
CN115114099A (zh) * 2022-06-24 2022-09-27 苏州浪潮智能科技有限公司 一种磁盘冗余阵列中慢盘识别处理方法、装置及存储介质
CN115051956A (zh) * 2022-06-30 2022-09-13 北京达佳互联信息技术有限公司 一种连接建立方法、装置、设备及存储介质
CN115051956B (zh) * 2022-06-30 2023-09-26 北京达佳互联信息技术有限公司 一种连接建立方法、装置、设备及存储介质
CN117194177A (zh) * 2023-11-03 2023-12-08 四川省华存智谷科技有限责任公司 一种提高存储系统慢盘检测准确率的方法
CN117806890A (zh) * 2024-02-28 2024-04-02 四川省华存智谷科技有限责任公司 一种基于分布式存储的慢盘检测处理方法
CN117806890B (zh) * 2024-02-28 2024-05-03 四川省华存智谷科技有限责任公司 一种基于分布式存储的慢盘检测处理方法

Also Published As

Publication number Publication date
CN106354590A (zh) 2017-01-25
CN106354590B (zh) 2020-04-24

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16827083

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16827083

Country of ref document: EP

Kind code of ref document: A1