CN106294065A - Hard disk failure monitoring method, Apparatus and system - Google Patents

Hard disk failure monitoring method, apparatus and system

Info

Publication number
CN106294065A
Authority
CN
China
Prior art keywords
hard disk
danger coefficient
span
status data
failure monitoring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610609204.4A
Other languages
Chinese (zh)
Inventor
范瑞展
缪亦奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lenovo Beijing Ltd
Priority to CN201610609204.4A
Publication of CN106294065A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3037 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a memory, e.g. virtual memory, cache

Landscapes

  • Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The embodiments of the present application provide a hard disk failure monitoring method, apparatus and system. The method obtains the current actual consumed life of a hard disk from the status data of the hard disk, and calculates a danger coefficient of the hard disk from the actual consumed life and the preset life of the hard disk. The danger coefficient indicates the probability that the hard disk will fail, so that the user can discover abnormalities of the hard disk in time and can know in advance, from the danger coefficient, whether the hard disk needs to be replaced, thereby avoiding the catastrophic consequences of data loss caused by hard disk damage.

Description

Hard disk failure monitoring method, Apparatus and system
Technical field
The application relates to the field of hard disk technology, and more particularly to a hard disk failure monitoring method, apparatus and system.
Background
The hard disk is the main storage device in an electronic device. As the carrier of the data and information of the user of the electronic device, a hard disk often holds a large amount of confidential information. The mean time between failures of most hard disks exceeds 30,000 to 50,000 hours; nevertheless, for many users, and commercial users in particular, even a single ordinary hard disk failure is enough to cause catastrophic consequences. Discovering hard disk abnormalities in time is the basic premise of keeping an electronic device running stably and protecting data security.
Summary of the invention
In view of this, the invention provides a hard disk failure monitoring method, apparatus and system, to overcome the problem in the prior art that abnormalities of a hard disk cannot be discovered in time, which leads to loss of the data on the hard disk and reduced stability of the electronic device.
To achieve the above object, the invention provides the following technical solutions:
A hard disk failure monitoring method, including:
Obtaining the current actual consumed life of a hard disk according to status data of the hard disk, the status data including temperature information of the hard disk at each time and load information of the hard disk at each time;
Calculating a danger coefficient of the hard disk according to the actual consumed life and the preset life of the hard disk, the danger coefficient indicating the probability that the hard disk will fail.
Preferably, the method further includes:
Determining a source hard disk whose danger coefficient is greater than or equal to a first preset value;
Determining a destination hard disk that satisfies a preset condition;
Generating a hard disk migration instruction, the hard disk migration instruction carrying the address information of the source hard disk and the address information of the destination hard disk.
Preferably, the method further includes:
Determining the current danger level of the hard disk according to the danger coefficient;
Outputting alarm information corresponding to the danger level.
A hard disk failure monitoring apparatus, including:
An acquisition module, configured to obtain the current actual consumed life of a hard disk according to status data of the hard disk, the status data including temperature information of the hard disk at each time and load information of the hard disk at each time;
A computing module, configured to calculate a danger coefficient of the hard disk according to the actual consumed life and the preset life of the hard disk, the danger coefficient indicating the probability that the hard disk will fail.
A hard disk failure monitoring system, including a baseboard management controller, a bus and a monitor, the baseboard management controller being connected to the monitor through the bus;
The monitor is configured to monitor the status data of a hard disk and to transmit the status data to the baseboard management controller over the bus, the status data including temperature information of the hard disk at each time and load information of the hard disk at each time;
The baseboard management controller is configured to obtain the current actual consumed life of the hard disk according to the status data of the hard disk, and to calculate a danger coefficient of the hard disk according to the actual consumed life and the preset life of the hard disk, the danger coefficient indicating the probability that the hard disk will fail.
Preferably, the baseboard management controller is further configured to:
Determine a source hard disk whose danger coefficient is greater than or equal to a first preset value;
Determine a destination hard disk that satisfies a preset condition;
Generate a hard disk migration instruction and send it to the monitor over the bus, the hard disk migration instruction carrying the address information of the source hard disk and the address information of the destination hard disk;
The monitor is further configured to migrate the data of the source hard disk to the destination hard disk according to the hard disk migration instruction.
The status data further includes the remaining storage space of each hard disk, and when determining the destination hard disk that satisfies the preset condition, the baseboard management controller is specifically configured to:
Determine the hard disk with the largest remaining storage space as the destination hard disk;
Or, among the hard disks whose danger coefficient is less than or equal to a third preset value, determine the hard disk with the largest remaining storage space as the destination hard disk;
Or, determine the hard disk with the smallest danger coefficient as the destination hard disk.
Preferably, the baseboard management controller is further configured to:
Determine the current danger level of the hard disk according to the danger coefficient;
Output alarm information corresponding to the danger level.
Wherein,
The bus is an I2C bus and the monitor is an array controller;
Or, the bus is a KCS bus and the monitor has a built-in operating system and application software, wherein the operating system performs data migration operations on the hard disk through the application software.
A hard disk failure monitoring system, including a processor and a memory, wherein:
The memory is configured to store the status data of a hard disk, the status data including temperature information of the hard disk at each time and load information of the hard disk at each time;
The processor is configured to obtain the current actual consumed life of the hard disk according to the status data stored in the memory, and to calculate a danger coefficient of the hard disk according to the actual consumed life and the preset life of the hard disk, the danger coefficient indicating the probability that the hard disk will fail.
As the above technical solutions show, compared with the prior art, the embodiments of the invention provide a hard disk failure monitoring method that obtains the current actual consumed life of a hard disk from its status data and calculates a danger coefficient of the hard disk from the actual consumed life and the preset life of the hard disk. The danger coefficient indicates the probability that the hard disk will fail, so that the user can discover abnormalities of the hard disk in time and can know in advance, from the danger coefficient, whether the hard disk needs to be replaced, thereby avoiding the catastrophic consequences of data loss caused by hard disk damage.
Brief description of the drawings
To describe the technical solutions in the embodiments of the invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Evidently, the accompanying drawings described below show only embodiments of the invention, and a person of ordinary skill in the art may derive other drawings from the provided drawings without creative effort.
Fig. 1 is a schematic diagram of the relationship between the operating ambient temperature of a hard disk and its failure rate;
Fig. 2 is a schematic flowchart of a hard disk failure monitoring method provided by an embodiment of the present application;
Fig. 3 is a schematic flowchart of a method for migrating the data on a hard disk in a hard disk failure monitoring method provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of alarms in a hard disk failure monitoring method provided by an embodiment of the present application;
Fig. 5 is a schematic structural diagram of the hard disk failure monitoring apparatus provided by an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a hard disk failure monitoring system provided by an embodiment of the present application;
Fig. 7 is a schematic structural diagram of one specific implementation of the hard disk failure monitoring system provided by an embodiment of the present application;
Fig. 8 is a schematic structural diagram of another implementation of the hard disk failure monitoring system provided by an embodiment of the present application.
Detailed description of the invention
For ease of reference and clarity, the technical terms, shorthand and abbreviations used below are summarized as follows:
I2C bus: Inter-Integrated Circuit bus;
KCS: Keyboard Controller Style, keyboard-controller-style interface;
BMC: Baseboard Management Controller;
RAID: Redundant Arrays of Independent Disks, disk array;
HDD: Hard Disk Drive.
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the invention. Evidently, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the invention without creative effort fall within the protection scope of the invention.
The hard disk is a very important storage element in an electronic device, and its life is directly related to temperature. Fig. 1 is a schematic diagram of the relationship between the operating ambient temperature of a hard disk and the probability that it fails (hereinafter, the failure rate). In Fig. 1, the abscissa represents the operating ambient temperature of the hard disk, and the ordinate represents the annualized failure rate of the hard disk (the probability of failing within one year).
In Fig. 1, curve 1 corresponds to a hard disk whose accumulated power-on time (Power On Hours, POH) in one year is 2400 hours, and curve 2 corresponds to a hard disk whose POH in one year is 8760 hours.
As can be seen from Fig. 1, when the temperature rises from 30 °C to 70 °C, the annualized failure rate (Annualized Failure Rate, AFR) multiplies.
At present, hard disk health is judged solely by whether bad tracks have appeared: only after bad tracks have actually occurred on a hard disk does the user replace it, by which time the data stored on the hard disk may already be at risk of loss.
At present there is no method for monitoring the life of a hard disk. Usually, only after a hard disk has been damaged and the BMC has recorded an SEL (System Event Log) entry does the user replace the hard disk.
The hard disk failure monitoring method provided by the embodiments of the present application can determine the current danger coefficient of a hard disk, so that the user can judge from the danger coefficient whether the data on the hard disk needs to be migrated, thereby avoiding the risk of losing the data stored on the hard disk.
Fig. 2 is a schematic flowchart of a hard disk failure monitoring method provided by an embodiment of the present application. The method includes:
Step S201: obtain the current actual consumed life of the hard disk according to the status data of the hard disk.
The status data includes temperature information of the hard disk at each time and load information of the hard disk at each time.
Hard disk vendors have found through repeated testing that when the workload of a hard disk is 50% and the operating ambient temperature is 40 °C, the AFR is 0.73%. In actual use, however, the operating ambient temperature of the hard disk is not necessarily 40 °C, and the load of the hard disk varies with the operating state of the electronic device rather than staying at 50%.
Moreover, for any given electronic device, hard disk life concerns only the one or few hard disks in that device, whereas the 0.73% figure above is an annualized failure rate obtained from a sample of thousands of hard disks. The 0.73% figure therefore cannot represent the annualized failure rate of each hard disk in actual use, so estimating in advance the life of the one or more hard disks in each electronic device is very important.
The current actual consumed life of the hard disk, L_real, can be calculated by the following formula:
[formula image missing]
Where T is the power-on time of the hard disk from when it was first put into use to the present; temperature(T) is the actual operating ambient temperature of the hard disk at time T; and Load(T) is the actual workload of the hard disk at time T.
It should be noted that the above formula does not limit the application; those skilled in the art may design their own formula according to the technical idea provided by the invention and the needs of the practical application.
Step S202: calculate the danger coefficient of the hard disk according to the actual consumed life and the preset life of the hard disk.
The danger coefficient indicates the probability that the hard disk will fail.
The preset life of the hard disk, L_total, may refer to [formula image missing], where T1 is the time specified by the hard disk vendor, for example 1 year, 2 years or 5 years.
The danger coefficient X can be L_real/L_total.
It should be noted that the above formula does not limit the application; those skilled in the art may design their own formula according to the technical idea provided by the invention and the needs of the practical application.
It is understood that the closer the consumed life of the hard disk is to the preset life, the greater the probability that the hard disk will fail. Whether to replace the hard disk can be decided according to the danger coefficient X.
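The formulas for L_real and L_total are not reproduced in this text, so the following is only a minimal sketch under an assumed form: consumed life is accumulated as hours × temperature × load over logged intervals, and the preset life is the same quantity for the rated time T1 at the reference point quoted above (40 °C, 50% load). The weighting, the Sample layout and all function names are hypothetical; only X = L_real/L_total comes from the description.

```python
from dataclasses import dataclass
from typing import Iterable

# Reference operating point quoted in the text (40 degrees C, 50 % workload).
REF_TEMP_C = 40.0
REF_LOAD = 0.5


@dataclass(frozen=True)
class Sample:
    """One logged interval of hard disk status data (hypothetical layout)."""
    hours: float   # length of the interval, in power-on hours
    temp_c: float  # operating ambient temperature during the interval
    load: float    # workload during the interval, 0.0 .. 1.0


def consumed_life(samples: Iterable[Sample]) -> float:
    """L_real under the ASSUMED form: sum of hours * temperature * load."""
    return sum(s.hours * s.temp_c * s.load for s in samples)


def preset_life(rated_hours: float) -> float:
    """L_total under the ASSUMED form: rated time T1 at the reference point."""
    return rated_hours * REF_TEMP_C * REF_LOAD


def danger_coefficient(samples: Iterable[Sample], rated_hours: float) -> float:
    """X = L_real / L_total, as stated in the description."""
    return consumed_life(samples) / preset_life(rated_hours)


# A disk rated for 5 years of continuous operation: one year at the
# reference point, then one year hotter and busier.
history = [Sample(8760, 40.0, 0.5), Sample(8760, 50.0, 0.8)]
x = danger_coefficient(history, rated_hours=5 * 8760)
print(f"danger coefficient X = {x:.2f}")
```

Under this assumed weighting, a year spent above the reference temperature and load consumes more of the preset life than a year at the reference point, which is the behaviour the description asks of the danger coefficient.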
The embodiments of the invention provide a hard disk failure monitoring method that obtains the current actual consumed life of a hard disk from its status data and calculates a danger coefficient of the hard disk from the actual consumed life and the preset life of the hard disk. The danger coefficient indicates the probability that the hard disk will fail, so that the user can discover abnormalities of the hard disk in time and can know in advance, from the danger coefficient, whether the hard disk needs to be replaced, thereby avoiding the catastrophic consequences of data loss caused by hard disk damage.
When an electronic device has multiple hard disks, it can automatically migrate the data stored on a hard disk that may fail to another hard disk. The above hard disk failure monitoring method embodiment may therefore further include a method for migrating the data stored on a hard disk. As shown in Fig. 3, the method includes:
Step S301: determine a source hard disk whose danger coefficient is greater than or equal to a first preset value.
The first preset value depends on the actual situation; for example, it can be 80%, 90% or 100%.
In the embodiments of the present application, a hard disk in the electronic device whose danger coefficient is greater than or equal to the first preset value is called a source hard disk.
Step S302: determine a destination hard disk that satisfies a preset condition.
In the embodiments of the present application, a hard disk in the electronic device that satisfies the preset condition is called a destination hard disk.
The status data can also include the remaining storage space of each hard disk. The preset condition can be: the hard disk with the largest remaining storage space is determined as the destination hard disk; or, among the hard disks whose danger coefficient is less than or equal to a third preset value, the hard disk with the largest remaining storage space is determined as the destination hard disk; or, the hard disk with the smallest danger coefficient is determined as the destination hard disk.
Step S303: generate a hard disk migration instruction, the hard disk migration instruction carrying the address information of the source hard disk and the address information of the destination hard disk.
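Steps S301 to S303 can be sketched as follows. This is an illustrative sketch, not the patented implementation: the Disk layout, the address strings and the dictionary form of the instruction are hypothetical, and of the three preset conditions only the second (largest remaining space among disks at or below the third preset value) is shown. Excluding the source disk from the candidates is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Disk:
    address: str     # address information of the disk (format is hypothetical)
    danger: float    # danger coefficient X
    free_bytes: int  # remaining storage space


def pick_sources(disks: List[Disk], first_preset: float) -> List[Disk]:
    """Step S301: disks whose danger coefficient is >= the first preset value."""
    return [d for d in disks if d.danger >= first_preset]


def pick_destination(disks: List[Disk], source: Disk,
                     third_preset: float) -> Optional[Disk]:
    """Step S302, second listed condition: among disks whose danger
    coefficient is <= the third preset value, take the one with the most
    remaining space. Returns None when no disk qualifies."""
    candidates = [d for d in disks
                  if d.address != source.address and d.danger <= third_preset]
    return max(candidates, key=lambda d: d.free_bytes, default=None)


def migration_instruction(source: Disk, destination: Disk) -> dict:
    """Step S303: the instruction carries both addresses."""
    return {"source": source.address, "destination": destination.address}


disks = [Disk("slot0", 0.92, 100), Disk("slot1", 0.30, 500),
         Disk("slot2", 0.10, 200)]
for src in pick_sources(disks, first_preset=0.80):
    dst = pick_destination(disks, src, third_preset=0.50)
    if dst is not None:
        print(migration_instruction(src, dst))
```

The other two conditions differ only in the candidate filter and the ranking key: drop the danger filter for the first, or rank by smallest danger coefficient for the third.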
It is understood that alarm information can be output when the danger coefficient of a hard disk is greater than or equal to a second preset value. For example, no alarm is raised when the danger coefficient is 30%, and an alarm is raised only when the danger coefficient is 80% or above. The second preset value depends on the actual situation; for example, it can be set according to the importance of the data stored on the hard disk: the more important the data, the smaller the second preset value. The second preset value can be set by the user through the display screen of the electronic device, or can be preset before the electronic device leaves the factory.
The alarm information that is output may be the same regardless of the danger coefficient of the hard disk, or may differ. For example, the same alarm information may be output whether the danger coefficient is 80% or 100%, such as a message on the display screen of the electronic device that the hard disk is about to fail. Alternatively, different alarm information may be output for a danger coefficient of 80% and for one of 100%. Specifically, the above hard disk failure monitoring method can further include: determining the current danger level of the hard disk according to the danger coefficient; and outputting alarm information corresponding to the danger level.
For example, a danger coefficient of 80% to 90% is the second danger level, and 91% to 100% is the first danger level. The first danger level can correspond to a red alarm, and the second danger level to a yellow alarm.
As shown in Fig. 4, HDD 1 represents a hard disk. When BMC 41 determines that the danger coefficient X of HDD 1 satisfies 80% ≤ X ≤ 90%, the life of HDD 1 is approaching its specified limit and a yellow alarm can be output. The yellow alarm can remind the user to pay attention to the health of HDD 1, to prepare for data relocation, and to check whether a spare hard disk is available for replacement.
When the danger coefficient X of HDD 1 satisfies 91% ≤ X ≤ 100%, HDD 1 is already at risk of damage and a red alarm can be output. The red alarm can remind the user to migrate the data and replace HDD 1, so as to avoid data loss from the likely damage to HDD 1.
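The mapping from danger coefficient to danger level in this example can be sketched as below. The enum and function names are hypothetical. The example bands leave values between 90% and 91% and above 100% unspecified; this sketch assumes both are treated as the first (red) level.

```python
from enum import Enum


class DangerLevel(Enum):
    NONE = "no alarm"
    SECOND = "yellow alarm"  # 80% to 90% in the example above
    FIRST = "red alarm"      # 91% to 100% in the example above


def danger_level(x: float) -> DangerLevel:
    """Map a danger coefficient to a danger level.

    Assumption of this sketch: values between 90% and 91%, and values
    above 100%, are treated as the first (red) level.
    """
    if x > 0.90:
        return DangerLevel.FIRST
    if x >= 0.80:
        return DangerLevel.SECOND
    return DangerLevel.NONE


for x in (0.30, 0.85, 0.95):
    print(f"X = {x:.0%}: {danger_level(x).value}")
```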
In addition to the above hard disk failure monitoring method, the embodiments of the present application also provide a hard disk failure monitoring apparatus. For a description of each module of the apparatus, refer to the description of the corresponding step of the hard disk failure monitoring method; it is not repeated here. Fig. 5 is a schematic structural diagram of the hard disk failure monitoring apparatus provided by an embodiment of the present application. The apparatus includes an acquisition module 501 and a computing module 502, wherein:
Acquisition module 501 is configured to obtain the current actual consumed life of a hard disk according to status data of the hard disk, the status data including temperature information of the hard disk at each time and load information of the hard disk at each time.
Computing module 502 is configured to calculate a danger coefficient of the hard disk according to the actual consumed life and the preset life of the hard disk, the danger coefficient indicating the probability that the hard disk will fail.
The embodiments of the invention provide a hard disk failure monitoring apparatus in which acquisition module 501 obtains the current actual consumed life of a hard disk from its status data and computing module 502 calculates a danger coefficient of the hard disk from the actual consumed life and the preset life of the hard disk. The danger coefficient indicates the probability that the hard disk will fail, so that the user can discover abnormalities of the hard disk in time and can know in advance, from the danger coefficient, whether the hard disk needs to be replaced, thereby avoiding the catastrophic consequences of data loss caused by hard disk damage.
When an electronic device has multiple hard disks, it can automatically migrate the data stored on a hard disk that may fail to another hard disk. The above hard disk failure monitoring apparatus can therefore further include:
A first determining module, configured to determine a source hard disk whose danger coefficient is greater than or equal to a first preset value.
A second determining module, configured to determine a destination hard disk that satisfies a preset condition.
The status data can also include the remaining storage space of each hard disk. The preset condition can be: the hard disk with the largest remaining storage space is determined as the destination hard disk; or, among the hard disks whose danger coefficient is less than or equal to a third preset value, the hard disk with the largest remaining storage space is determined as the destination hard disk; or, the hard disk with the smallest danger coefficient is determined as the destination hard disk.
A generation module, configured to generate a hard disk migration instruction, the hard disk migration instruction carrying the address information of the source hard disk and the address information of the destination hard disk.
Any of the above hard disk failure monitoring apparatuses can further include:
A third determining module, configured to determine the current danger level of the hard disk according to the danger coefficient; and an output module, configured to output alarm information corresponding to the danger level.
For details, refer to the description of Fig. 4; they are not repeated here.
The embodiments of the present application also provide a hard disk failure monitoring system. As shown in Fig. 6, the system includes BMC 41, bus 61 and monitor 62, and BMC 41 is connected to monitor 62 through bus 61.
Monitor 62 is configured to monitor the status data of a hard disk and to transmit the status data to the baseboard management controller over the bus, the status data including temperature information of the hard disk at each time and load information of the hard disk at each time.
Bus 61 can be an I2C bus, a KCS bus, or the like.
BMC 41 is configured to obtain the current actual consumed life of the hard disk according to the status data of the hard disk, and to calculate a danger coefficient of the hard disk according to the actual consumed life and the preset life of the hard disk, the danger coefficient indicating the probability that the hard disk will fail.
Some electronic devices include a BMC, but the BMC in prior-art electronic devices does not have the functions of BMC 41 in the embodiments of the present application. These functions can be built into the code of the BMC that a prior-art electronic device already includes, so no additional hardware is needed to implement them; that is, hardware cost does not increase.
For a detailed description of monitor 62 and BMC 41, refer to the detailed description of the corresponding steps of the hard disk failure monitoring method in Fig. 2; it is not repeated here.
In the above hard disk failure monitoring system, the baseboard management controller is further configured to: determine a source hard disk whose danger coefficient is greater than or equal to a first preset value; determine a destination hard disk that satisfies a preset condition; and generate a hard disk migration instruction and send it to the monitor over the bus, the hard disk migration instruction carrying the address information of the source hard disk and the address information of the destination hard disk. The monitor is further configured to migrate the data of the source hard disk to the destination hard disk according to the hard disk migration instruction.
The status data further includes the remaining storage space of each hard disk, and when determining the destination hard disk that satisfies the preset condition, the baseboard management controller is specifically configured to: determine the hard disk with the largest remaining storage space as the destination hard disk; or, among the hard disks whose danger coefficient is less than or equal to a third preset value, determine the hard disk with the largest remaining storage space as the destination hard disk; or, determine the hard disk with the smallest danger coefficient as the destination hard disk.
In any of the above hard disk failure monitoring systems, the baseboard management controller is further configured to: determine the current danger level of the hard disk according to the danger coefficient; and output alarm information corresponding to the danger level.
For details, refer to the description of Fig. 4; they are not repeated here.
More understand the hard disk failure monitoring system that the embodiment of the present application provides for those skilled in the art, name two Individual object lesson realizes process to hard disk failure monitoring system and illustrates.
Refer to Fig. 7, for the knot of a kind of specific implementation of the hard disk failure monitoring system that the embodiment of the present application provides Structure schematic diagram.
Bus 61 is I2C bus, and monitor 62 can be the array control unit 71 in RAID.HDD is the hard disk in RAID, In order to more clearly describe hard disk failure monitoring system, the array control unit 71 and HDD in RAID is separated by Fig. 7. Fig. 7 shows two hard disk HDD 1 and HDD2, it is to be understood that the number of hard disk can be 1, now electronic equipment Cannot automatically carry out in hard disk the migration of the data of storage, need user oneself to change, the number of hard disk can be 2 or Multiple, now electronic equipment can not carry out the migration of the data of storage in hard disk automatically, it is also possible to automatically carries out depositing in hard disk The migration of the data of storage.
For BMC, the status data of hard disk can not be directly obtained, need array control unit 71 by each hard disk i.e. The status data of HDD 1 and HDD 2 is by I2C bus transfer to BMC 41, and BMC 41 is for each hard disk, according to HDD 1 The status data of (or HDD 2), obtains HDD 1 (or HDD 2) currently practical consumption life L_real1 (or L_real2), and depends on According to the default life-span L_toatl1 (or L_toatl2) in described actual consumption life-span Yu HDD 1 (or HDD 2), calculate HDD 1 The danger coefficient X1 (or X2) that (or HDD 2) breaks down.
The BMC 41 can determine whether X1 and X2 reach a second preset value (for example, 80%). Suppose it determines that X1 = 90%, which is greater than or equal to 80%, and X2 = 30%. The BMC can then output a warning message, for example a yellow warning.
If the BMC 41 determines that data must be migrated from any hard disk whose danger coefficient is greater than or equal to a first preset value (assumed to be 85%), it can determine that the data on HDD 1 needs to be migrated. The BMC 41 can also calculate the optimal destination for the data migration. Assuming the optimal destination is determined to be HDD 2, the BMC can generate a hard disk migration instruction that includes the address information of HDD 1 and the address information of HDD 2. After the array controller 71 receives the hard disk migration instruction, it can migrate the data stored on HDD 1 to HDD 2.
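The decision described above can be sketched in a few lines. This is a minimal illustration, not the application's implementation: `MigrationInstruction` and `build_migration_instruction` are hypothetical names, disk addresses are represented as plain strings, and the "optimal" destination is assumed here to be the other disk with the lowest danger coefficient.

```python
from dataclasses import dataclass
from typing import Dict, Optional

FIRST_PRESET = 0.85  # migration threshold assumed in the example above


@dataclass(frozen=True)
class MigrationInstruction:
    """Carries the addresses of the source and destination disks."""
    source_address: str
    destination_address: str


def build_migration_instruction(
    coefficients: Dict[str, float],
    threshold: float = FIRST_PRESET,
) -> Optional[MigrationInstruction]:
    """Pick a source disk at or above the threshold and a destination disk.

    Assumption: the destination is the remaining disk with the lowest
    danger coefficient that is itself below the threshold.
    """
    at_risk = [addr for addr, x in coefficients.items() if x >= threshold]
    if not at_risk:
        return None  # nothing needs to be migrated
    source = max(at_risk, key=lambda addr: coefficients[addr])
    candidates = [
        addr for addr, x in coefficients.items()
        if addr != source and x < threshold
    ]
    if not candidates:
        return None  # e.g. a single-disk system: the user must replace the disk
    destination = min(candidates, key=lambda addr: coefficients[addr])
    return MigrationInstruction(source, destination)


instruction = build_migration_instruction({"HDD1": 0.90, "HDD2": 0.30})
```

With the values from the example (X1 = 90%, X2 = 30%), the sketch yields an instruction with HDD 1 as the source and HDD 2 as the destination; with a single disk it returns `None`, matching the case where no automatic migration is possible.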
Refer to Fig. 8, which is a schematic structural diagram of another implementation of the hard disk failure monitoring system provided by the embodiments of the present application.
Bus 61 is a KCS bus. Monitor 62 may have a built-in operating system (OS) 81 and application software 82. Through the application software 82, the operating system 81 can obtain the status data of hard disks HDD 1 and HDD 2 and can also perform data migration operations on the hard disks.
After the operating system 81 obtains the status data of each hard disk, for example HDD 1 and HDD 2, through the application software 82, it can transmit the data to the BMC 41 over the KCS bus (the monitor 63 may be connected to the KCS bus through the basic input/output system (BIOS) 83).
The operating system 81 may obtain the status data by performing comprehensive reads and writes on the registers or sectors of the specified hard disk.
For each hard disk, the BMC 41 obtains the current actual consumed lifespan L_real1 (or L_real2) of HDD 1 (or HDD 2) from that disk's status data, and then, from the actual consumed lifespan and the preset lifespan L_total1 (or L_total2) of HDD 1 (or HDD 2), calculates the danger coefficient X1 (or X2) of HDD 1 (or HDD 2) failing.
The BMC 41 can determine whether X1 and X2 reach a second preset value (for example, 80%). Suppose it determines that X1 = 90%, which is greater than or equal to 80%, and X2 = 30%. The BMC can then output a warning message, for example a yellow warning.
If the BMC 41 determines that data must be migrated from any hard disk whose danger coefficient is greater than or equal to a first preset value (assumed to be 85%), it can determine that the data on HDD 1 needs to be migrated. The BMC 41 can also calculate the optimal destination for the data migration. Assuming the optimal destination is determined to be HDD 2, the BMC can generate a hard disk migration instruction that includes the address information of HDD 1 and the address information of HDD 2. This instruction can be transmitted over the KCS bus and through the BIOS 83 to the operating system 81 of the monitor 63, and the operating system 81 migrates the data stored on HDD 1 to HDD 2 through the application software 82.
An embodiment of the present application provides a hard disk failure monitoring system. The system may be an electronic device such as a computer, mobile phone, tablet computer, PDA (Personal Digital Assistant), POS (Point of Sales) terminal, or vehicle-mounted computer.
The hard disk failure monitoring system may include a memory and a processor.
The memory can be used to store software programs and modules. By running the software programs and modules stored in the memory, the processor performs the various functional applications and data processing of the electronic device. The memory may mainly include a program storage area and a data storage area. The program storage area can store the operating system and the application programs required by at least one function (for example, a danger coefficient calculation function). The data storage area can store data created through use of the electronic device (for example, hard disk status data). In addition, the memory may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
The processor is the control center of the electronic device. It connects all parts of the electronic device through various interfaces and connections, and it performs the functions of the electronic device and processes data by running or executing the software programs and/or modules stored in the memory and calling the data stored in the memory, thereby monitoring the electronic device as a whole. Optionally, the processor may include one or more processing units. Preferably, the processor may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, and application programs, and the modem processor mainly handles wireless communication. It can be understood that the modem processor need not be integrated into the processor.
The processor in the embodiments of the present application can obtain the current actual consumed lifespan of the hard disk from the status data of the hard disk stored in the memory, and can calculate the danger coefficient of the hard disk failing from the actual consumed lifespan and the preset lifespan of the hard disk.
The memory may also store a first preset value and a preset condition, and the processor may be further configured to: determine a source hard disk whose danger coefficient is greater than or equal to the first preset value; determine a destination hard disk that meets the preset condition; and generate a hard disk migration instruction that carries the address information of the source hard disk and the address information of the destination hard disk.
The processor may be further configured to: determine the hard disk with the most remaining storage space as the destination hard disk; or, from among the hard disks whose danger coefficient is less than or equal to a third preset value, determine the hard disk with the most remaining storage space and take it as the destination hard disk; or determine the hard disk with the lowest danger coefficient as the destination hard disk.
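The three alternative selection rules can be compared side by side in a short sketch. The `Disk` record, the function names, and the sample values (including a third preset value of 0.5) are hypothetical and chosen only so that each rule picks a different disk.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Disk:
    address: str
    danger: float    # danger coefficient, 0.0 to 1.0
    free_bytes: int  # remaining storage space


def largest_free(disks: List[Disk]) -> Optional[Disk]:
    """Rule 1: the disk with the most remaining storage space."""
    return max(disks, key=lambda d: d.free_bytes, default=None)


def largest_free_among_safe(disks: List[Disk],
                            third_preset: float) -> Optional[Disk]:
    """Rule 2: most remaining space among disks at or below the third preset value."""
    safe = [d for d in disks if d.danger <= third_preset]
    return max(safe, key=lambda d: d.free_bytes, default=None)


def lowest_danger(disks: List[Disk]) -> Optional[Disk]:
    """Rule 3: the disk with the lowest danger coefficient."""
    return min(disks, key=lambda d: d.danger, default=None)


# Candidate destinations (the source disk is assumed to be excluded already).
candidates = [
    Disk("HDD2", danger=0.60, free_bytes=800),
    Disk("HDD3", danger=0.20, free_bytes=500),
    Disk("HDD4", danger=0.10, free_bytes=300),
]
by_space = largest_free(candidates)
by_safe_space = largest_free_among_safe(candidates, third_preset=0.5)
by_danger = lowest_danger(candidates)
```

In this sample the first rule picks HDD2, which has the most space but is itself fairly worn; the second rule filters it out and picks HDD3; the third rule picks HDD4. The second rule is the only one of the three that weighs both capacity and health.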
The memory may also store danger levels and the correspondence between danger levels and warning messages, and the processor may be further configured to: determine the current danger level of the hard disk according to the danger coefficient; and output the warning message corresponding to that danger level.
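A stored correspondence of this kind could look like the sketch below. Only the pairing of an 80% threshold with a yellow warning comes from the examples in this description; the second bound (95%), the "red warning" message, and the names `LEVEL_LOWER_BOUNDS`, `WARNING_BY_LEVEL`, `danger_level`, and `warning_for` are assumptions made for illustration.

```python
import bisect
from typing import Optional

# Hypothetical correspondence table. Only the 80% -> yellow warning pairing
# appears in the examples; the 95% bound and the red warning are assumed.
LEVEL_LOWER_BOUNDS = [0.80, 0.95]
WARNING_BY_LEVEL = {0: None, 1: "yellow warning", 2: "red warning"}


def danger_level(coefficient: float) -> int:
    """Return 0 (normal), 1, or 2 for a danger coefficient in [0, 1]."""
    # bisect_right counts how many lower bounds the coefficient has reached.
    return bisect.bisect_right(LEVEL_LOWER_BOUNDS, coefficient)


def warning_for(coefficient: float) -> Optional[str]:
    """Look up the warning message for a coefficient, or None if normal."""
    return WARNING_BY_LEVEL[danger_level(coefficient)]


for name, x in {"HDD 1": 0.90, "HDD 2": 0.30}.items():
    message = warning_for(x)
    if message is not None:
        print(f"{name}: {message}")
```

Keeping the bounds and messages in tables rather than in branching code mirrors the idea that the correspondence is stored data, so it can be changed without touching the lookup logic.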
It should be noted that the embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar the embodiments may be referred to one another.
The above description of the disclosed embodiments enables those skilled in the art to make or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A hard disk failure monitoring method, characterized by comprising:
obtaining the current actual consumed lifespan of a hard disk according to status data of the hard disk, the status data including temperature information of the hard disk at each time and load information of the hard disk at each time;
calculating a danger coefficient of the hard disk according to the actual consumed lifespan and a preset lifespan of the hard disk, the danger coefficient indicating the probability that the hard disk fails.
2. The hard disk failure monitoring method according to claim 1, characterized by further comprising:
determining a source hard disk whose danger coefficient is greater than or equal to a first preset value;
determining a destination hard disk that meets a preset condition;
generating a hard disk migration instruction, the hard disk migration instruction carrying the address information of the source hard disk and the address information of the destination hard disk.
3. The hard disk failure monitoring method according to claim 1 or 2, characterized by further comprising:
determining the current danger level of the hard disk according to the danger coefficient;
outputting the warning message corresponding to the danger level.
4. A hard disk failure monitoring apparatus, characterized by comprising:
an acquisition module, configured to obtain the current actual consumed lifespan of a hard disk according to status data of the hard disk, the status data including temperature information of the hard disk at each time and load information of the hard disk at each time;
a calculation module, configured to calculate a danger coefficient of the hard disk according to the actual consumed lifespan and a preset lifespan of the hard disk, the danger coefficient indicating the probability that the hard disk fails.
5. A hard disk failure monitoring system, characterized by comprising a baseboard management controller, a bus, and a monitor, the baseboard management controller being connected to the monitor through the bus;
the monitor being configured to monitor status data of a hard disk and transmit the status data to the baseboard management controller over the bus, the status data including temperature information of the hard disk at each time and load information of the hard disk at each time;
the baseboard management controller being configured to obtain the current actual consumed lifespan of the hard disk according to the status data of the hard disk, and to calculate a danger coefficient of the hard disk failing according to the actual consumed lifespan and a preset lifespan of the hard disk, the danger coefficient indicating the probability that the hard disk fails.
6. The hard disk failure monitoring system according to claim 5, characterized in that the baseboard management controller is further configured to:
determine a source hard disk whose danger coefficient is greater than or equal to a first preset value;
determine a destination hard disk that meets a preset condition;
generate a hard disk migration instruction and send it to the monitor over the bus, the hard disk migration instruction carrying the address information of the source hard disk and the address information of the destination hard disk;
the monitor being further configured to: migrate the data on the source hard disk to the destination hard disk according to the hard disk migration instruction.
7. The hard disk failure monitoring system according to claim 6, characterized in that the status data further includes the remaining storage space of the hard disk, and the baseboard management controller, when determining the destination hard disk that meets the preset condition, is specifically configured to:
determine the hard disk with the most remaining storage space as the destination hard disk;
or, from among the hard disks whose danger coefficient is less than or equal to a third preset value, determine the hard disk with the most remaining storage space and take that hard disk as the destination hard disk;
or, determine the hard disk with the lowest danger coefficient as the destination hard disk.
8. The hard disk failure monitoring system according to any one of claims 5 to 7, characterized in that the baseboard management controller is further configured to:
determine the current danger level of the hard disk according to the danger coefficient;
output the warning message corresponding to the danger level.
9. The hard disk failure monitoring system according to any one of claims 5 to 7, characterized in that
the bus is an I2C bus and the monitor is an array controller;
or, the bus is a KCS bus and the monitor has a built-in operating system and application software, wherein the operating system performs data migration operations on the hard disk through the application software.
10. A hard disk failure monitoring system, characterized by comprising a processor and a memory, wherein:
the memory is configured to store status data of a hard disk, the status data including temperature information of the hard disk at each time and load information of the hard disk at each time;
the processor is configured to obtain the current actual consumed lifespan of the hard disk according to the status data of the hard disk stored in the memory, and to calculate a danger coefficient of the hard disk failing according to the actual consumed lifespan and a preset lifespan of the hard disk, the danger coefficient indicating the probability that the hard disk fails.
CN201610609204.4A 2016-07-28 2016-07-28 Hard disk failure monitoring method, Apparatus and system Pending CN106294065A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610609204.4A CN106294065A (en) 2016-07-28 2016-07-28 Hard disk failure monitoring method, Apparatus and system

Publications (1)

Publication Number Publication Date
CN106294065A true CN106294065A (en) 2017-01-04

Family

ID=57662687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610609204.4A Pending CN106294065A (en) 2016-07-28 2016-07-28 Hard disk failure monitoring method, Apparatus and system

Country Status (1)

Country Link
CN (1) CN106294065A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102467438A (en) * 2010-11-12 2012-05-23 英业达股份有限公司 Method for obtaining fault signal of storage device by baseboard management controller
US20120278661A1 (en) * 2011-04-27 2012-11-01 Hon Hai Precision Industry Co., Ltd. Hard disk backplane and hard disk monitoring system
CN103176919A (en) * 2013-03-07 2013-06-26 洛阳伟信电子科技有限公司 Simple and easy device and simple and easy method for computer hard disk data saving
CN103176884A (en) * 2011-12-20 2013-06-26 鸿富锦精密工业(深圳)有限公司 Hard disk monitoring system and hard disk monitoring method
CN104536855A (en) * 2014-12-03 2015-04-22 曙光信息产业(北京)有限公司 Fault detection method and device

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106980472A (en) * 2017-03-30 2017-07-25 上海与德科技有限公司 The method and device that a kind of EMMC health degrees judge
CN107515731A (en) * 2017-07-31 2017-12-26 华中科技大学 A kind of evolutionary storage system and its method of work based on solid-state disk
CN107544759B (en) * 2017-09-19 2021-01-29 苏州浪潮智能科技有限公司 Disk array IO distribution system and method
CN107544759A (en) * 2017-09-19 2018-01-05 郑州云海信息技术有限公司 A kind of disk array I O assignment system and method
CN107577582A (en) * 2017-09-28 2018-01-12 长沙曙通信息科技有限公司 A kind of storage system hard disk failure intelligent predicting management method
CN108345519A (en) * 2018-01-31 2018-07-31 河南职业技术学院 The processing method and processing device of hard disc of computer failure
CN108958998A (en) * 2018-06-12 2018-12-07 郑州云海信息技术有限公司 Server hard disc uses time detection method and device under a kind of linux
CN109117342A (en) * 2018-08-13 2019-01-01 郑州云海信息技术有限公司 A kind of server and its hard disk health status monitoring system
CN109710443A (en) * 2018-12-24 2019-05-03 平安科技(深圳)有限公司 A kind of data processing method, device, equipment and storage medium
CN109710443B (en) * 2018-12-24 2023-06-16 平安科技(深圳)有限公司 Data processing method, device, equipment and storage medium
CN110928742A (en) * 2019-08-08 2020-03-27 北京盛赞科技有限公司 Hard disk retest period determination method, device, equipment and readable storage medium
CN110928742B (en) * 2019-08-08 2023-06-09 北京盛赞科技有限公司 Hard disk rechecking period determining method, device, equipment and readable storage medium
CN110598802A (en) * 2019-09-26 2019-12-20 腾讯科技(深圳)有限公司 Memory detection model training method, memory detection method and device
CN110598802B (en) * 2019-09-26 2021-07-27 腾讯科技(深圳)有限公司 Memory detection model training method, memory detection method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170104
