CN105260279B - Method and apparatus for dynamically diagnosing hard disk failures based on SMART data


Publication number
CN105260279B
Authority
CN
China
Prior art keywords
hard disk
data
parameter
model
early warning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510738474.0A
Other languages
Chinese (zh)
Other versions
CN105260279A (en)
Inventor
梁效宁
张佳强
杨先珉
杨明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SICHUAN XLY INFORMATION SAFETY TECHNOLOGY Co Ltd
Original Assignee
SICHUAN XLY INFORMATION SAFETY TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a method and apparatus for dynamically diagnosing hard disk failures based on SMART data, relating to the field of data storage security diagnosis. The method comprises the following steps: 101, establishing a cloud storage server that continuously collects three classes of data; 102, establishing a dynamic hard disk failure early-warning model; 103, establishing the normal fluctuation curve and range of S.M.A.R.T. parameters; 104, obtaining a dynamic health diagnosis scoring model through big data analysis. The beneficial effects of the invention are as follows: 1. a cloud storage server is established to continuously collect hard disk health data; 2. the collected data are organized to establish the dynamic hard disk failure early-warning model, the normal fluctuation curve and range of S.M.A.R.T. parameters, and the dynamic health diagnosis scoring model; 3. during hard disk failure diagnosis, machine learning continually improves the correctness of the models.

Description

Method and apparatus for dynamically diagnosing hard disk failures based on SMART data
Technical field
The invention belongs to the field of data storage security diagnosis and relates to a method and apparatus for dynamically diagnosing hard disk failures based on SMART data.
Background
S.M.A.R.T.: the full name is "Self-Monitoring Analysis and Reporting Technology". It is a data security technology in general use on current hard disks. While the disk is running, a monitoring system tracks the state of the motor, circuits, platters, and heads, and issues a warning when an anomaly occurs.
Cloud storage (Cloud Storage): a concept extended and developed from cloud computing (Cloud Computing). It is an emerging network storage technology in which cluster applications, network technology, or distributed file systems are used, through application software, to make a large number of different types of storage devices in a network work together and jointly provide data storage and business access functions to the outside.
Normalization: also called data standardization. It is a way of simplifying calculation: a dimensional expression is transformed into a dimensionless expression, that is, a pure number. It is a basic task in data mining.
Machine learning (Machine Learning): the "machine" referred to here is a computer, such as an electronic computer, neutron computer, photonic computer, or neuro-computer. Machine learning is a branch of artificial intelligence that uses data or past experience, in a "compare, adjust, compare" cycle, to optimize the performance of a program or algorithm. Machine learning is widely applied in fields such as data mining, natural language processing, search engines, medical diagnosis, and speech and handwriting recognition.
S.M.A.R.T. technology emerged in the 1990s to provide relatively safe data protection. Since it became an industry standard in June 1996, S.M.A.R.T. technology has supported hard disk failure prediction to this day, and numerous tools and devices for hard disk fault detection, analysis, and early warning rely on it. However, these tools and devices have the following problems:
1. They simply read the S.M.A.R.T. information from the hard disk's system reserved area and present it as a table, which many non-professional users cannot understand;
2. They show only the current value of each S.M.A.R.T. item and cannot compile the values into a history curve that would support a more accurate diagnosis;
3. They cannot handle the fact that, for SSDs, the S.M.A.R.T. items, attributes, and descriptions differ across manufacturers, controllers, and product models;
4. For predictable hard disk hazards, the warning comes late, or there is no active warning at all;
5. Erroneous warnings cannot be corrected.
Summary of the invention
In view of the deficiencies of the prior art, the present invention provides a method and apparatus for dynamically diagnosing hard disk failures based on SMART data. It effectively addresses the following problems of the prior art: tabular presentation that non-professional users cannot understand; showing only the current value of each S.M.A.R.T. item, so that the values cannot be compiled into a history curve for a more accurate diagnosis; and the inability to handle S.M.A.R.T. items, attributes, and descriptions that differ across SSD manufacturers, controllers, and product models.
To solve the above problems, the invention adopts the following technical solution: a method for dynamically diagnosing hard disk failures based on SMART data, comprising the following steps:
101 Establish a cloud storage server that continuously collects three classes of data: first, hard disk type data, including hard disk brand and model; second, S.M.A.R.T. parameter data and the time at which each parameter was collected; third, hard disk error log data recorded by the operating system;
102 Normalize the collected S.M.A.R.T. parameter data together with the corresponding hard disk brand and model data to generate a normalized S.M.A.R.T. data set; based on the normalized S.M.A.R.T. data set and the collected hard disk error log data, establish a dynamic hard disk failure early-warning model;
103 Group the collected S.M.A.R.T. data by parameter and, in combination with the corresponding hard disk brand and model data, form dynamic S.M.A.R.T. parameter curves for hard disks of different brands and models; derive statistically the normal fluctuation range of each S.M.A.R.T. parameter for healthy operation; establish the normal fluctuation curve and range of S.M.A.R.T. parameters;
104 Through big data analysis, obtain the S.M.A.R.T. parameter weights for hard disks of different brands and models; based on the manufacturer's settings for S.M.A.R.T. warning parameters, combined with training data, derive new warning parameters for hard disks of different brands and models and their weight factors for impact on hard disk health; set a full score and, according to the new S.M.A.R.T. parameter weights and the health impact weight factors, set the deduction standard to obtain a dynamic health diagnosis scoring model; based on the dynamic hard disk failure early-warning model, the normal fluctuation curve and range of S.M.A.R.T. parameters, and the dynamic health diagnosis scoring model, score the health of the hard disk and give targeted suggestions; if the hard disk is at risk, issue a warning automatically; if the warning is erroneous, start machine learning.
Preferably, step 102 comprises the following steps:
201 Normalize the data using the Z-score standardization method. The formula is $x^* = \frac{x - \mu}{\sigma}$, where x is the sample data collected in step 101, x* is the normalized data, μ is the mean of all sample data, and σ is the standard deviation of all sample data;
202 Classify the normalized data by platter, head, head arm, motor, control circuit board, data interface, controller, and flash memory chips;
203 Group the normalized, classified data by hard disk manufacturer, brand, and model;
204 According to the threshold that the manufacturer sets for each S.M.A.R.T. parameter, set the warning values for the normalized, classified, and grouped data; build the S.M.A.R.T. parameter comparison models; combine the comparison models with the warning trigger to form the dynamic hard disk failure early-warning model;
205 Read the S.M.A.R.T. parameter data of the hard disk under test and import it into the comparison model; when any item exceeds its warning value, the warning trigger fires, automatically pushes a warning message, and tells the user where the hard disk fault lies;
206 Read and collect the S.M.A.R.T. parameter data of the hard disk under test, normalize it, and store it on the cloud storage server; according to that data and the erroneous warning item, correct the warning value of the corresponding normalized, classified, and grouped data to generate a new dynamic hard disk failure early-warning model; record the correction on the cloud storage server.
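The Z-score normalization of step 201 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name `zscore_by_group`, the tuple layout of the samples, the choice to normalize each (brand, model, attribute) group separately, and the use of the population standard deviation are all assumptions, since the text states only the formula.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscore_by_group(samples):
    """Normalize raw S.M.A.R.T. values with x* = (x - mu) / sigma.

    samples: list of (brand, model, attribute_id, raw_value) tuples.
    Grouping per (brand, model, attribute_id) is an assumption of this sketch.
    Returns the normalized values in the same order as the input.
    """
    groups = defaultdict(list)
    for brand, model, attr, value in samples:
        groups[(brand, model, attr)].append(value)

    # Mean and population standard deviation of every group.
    stats = {key: (mean(vals), pstdev(vals)) for key, vals in groups.items()}

    normalized = []
    for brand, model, attr, value in samples:
        mu, sigma = stats[(brand, model, attr)]
        # A constant attribute has sigma == 0; map it to 0.0 rather than divide by zero.
        normalized.append(0.0 if sigma == 0 else (value - mu) / sigma)
    return normalized
```

Because μ and σ are recomputed from whatever samples are present, adding new data shifts the normalized values gradually instead of invalidating a fixed minimum and maximum.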
Preferably, step 103 comprises the following steps:
301 Retrieve the S.M.A.R.T. parameters and parameter collection times gathered by the cloud storage server; with a single S.M.A.R.T. parameter on the vertical axis and time on the horizontal axis, generate the curve for that parameter; in the same way, generate the curves for all individual S.M.A.R.T. parameters;
302 From each individual S.M.A.R.T. parameter curve, obtain the normal fluctuation range of that parameter;
303 Read the S.M.A.R.T. parameter data of the hard disk under test and import it into the comparison model; when any item suddenly exceeds its normal fluctuation range, the warning trigger fires, automatically pushes a warning message, and tells the user where the hard disk fault lies;
304 Read and collect the S.M.A.R.T. parameter data that caused an erroneous warning on the hard disk under test, and correct the normal fluctuation range by lowering the minimum (Min) or raising the maximum (Max) warning value, generating a new normal fluctuation range for that parameter; record the correction on the cloud storage server.
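Steps 301 to 303 can be illustrated as follows. This is a sketch under stated assumptions: the record layout `(attribute_id, timestamp, value)` and the helper names `build_curves`, `normal_range`, and `is_abnormal` are hypothetical, and the range is taken directly as the lowest and highest observed value of the curve.

```python
from collections import defaultdict


def build_curves(records):
    """Group records into one time-ordered curve per S.M.A.R.T. attribute.

    records: iterable of (attribute_id, timestamp, value).
    Returns {attribute_id: [(timestamp, value), ...]} sorted by timestamp.
    """
    curves = defaultdict(list)
    for attr, timestamp, value in records:
        curves[attr].append((timestamp, value))
    for points in curves.values():
        points.sort()  # time is the horizontal axis
    return dict(curves)


def normal_range(points):
    """Return (Min, Max): the trough and the peak of one curve."""
    values = [value for _, value in points]
    return min(values), max(values)


def is_abnormal(value, value_range):
    """True when a reading falls outside the normal fluctuation range."""
    low, high = value_range
    return not (low <= value <= high)
```

A reading inside the range leaves the warning trigger idle; a reading outside it is what step 303 reports to the user.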
Preferably, step 104 comprises the following steps:
401 According to the S.M.A.R.T. parameters and the needs of hard disk health diagnosis, set two weight levels as the decision factors of the health diagnosis: level one, hardware fault detection; level two, usage cumulative statistics and usage state monitoring;
402 Based on the manufacturer's original settings, the correction data of the dynamic hard disk failure early-warning model, and the correction data of the normal fluctuation curve and range of S.M.A.R.T. parameters, set and adjust the weight factors of the parameters in the two weight levels of step 401 according to the following rules:
(1) Weight factor for hardware fault detection: 80%; for usage cumulative statistics and usage state monitoring: 10%;
(2) Once a usage cumulative statistic exceeds 60% of the highest cumulative value for that item, its weight factor is raised to 20%; once it exceeds 80%, the weight factor is raised to 40%; once it exceeds 90%, its weight level is raised to the highest level;
(3) If a usage state monitoring value stays outside the normal fluctuation curve and range of the S.M.A.R.T. parameter for one consecutive week, its weight factor is raised to 20%; for two consecutive weeks, to 40%; for one consecutive month, its weight level is raised to the highest level;
403 Determine the scoring rules;
404 Based on the weight factors of step 402 and the scoring rules of step 403, combined with the S.M.A.R.T. parameter settings of each hard disk manufacturer and controller manufacturer, establish the hard disk health diagnosis scoring algorithm model;
405 Diagnose;
406 According to feedback data, correct and fine-tune the weight factors and scoring rules of the S.M.A.R.T. parameters to generate a new dynamic health diagnosis scoring model; record the correction on the cloud storage server.
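The weight adjustment rules of step 402 can be written as two small lookup functions. This is an illustrative sketch only: the function names are hypothetical, "highest level" is interpreted as the level-one weight of 80%, and a month is taken as 30 days; the text specifies neither.

```python
LEVEL_ONE_PERCENT = 80  # hardware fault detection
LEVEL_TWO_PERCENT = 10  # usage cumulative statistics and usage state monitoring


def cumulative_weight_percent(value, highest):
    """Weight factor (in percent) for a usage cumulative statistic.

    highest: the highest cumulative value for this item (must be positive).
    """
    ratio = value / highest
    if ratio > 0.90:
        return LEVEL_ONE_PERCENT  # assumed meaning of "highest level"
    if ratio > 0.80:
        return 40
    if ratio > 0.60:
        return 20
    return LEVEL_TWO_PERCENT


def monitoring_weight_percent(days_out_of_range):
    """Weight factor (in percent) for a usage state monitoring value.

    days_out_of_range: consecutive days outside the normal fluctuation range.
    """
    if days_out_of_range >= 30:  # one month, assumed to be 30 days
        return LEVEL_ONE_PERCENT
    if days_out_of_range >= 14:
        return 40
    if days_out_of_range >= 7:
        return 20
    return LEVEL_TWO_PERCENT
```

Keeping the weights as integer percentages avoids floating-point rounding when they are later multiplied by point values.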
Preferably, the hard disk type data further include the serial number, firmware version, interface type, rotational speed, capacity, and cache size.
To solve the above problems, the invention also adopts the following technical solution: an apparatus for dynamically diagnosing hard disk failures based on SMART data, comprising a cloud storage server, a generation unit for the dynamic hard disk failure early-warning model, a generation unit for the normal fluctuation curve and range of S.M.A.R.T. parameters, and a generation unit for the dynamic health diagnosis scoring model;
The cloud storage server continuously collects three classes of data: first, hard disk type data, including hard disk brand and model; second, S.M.A.R.T. parameter data and the time at which each parameter was collected; third, hard disk error log data recorded by the operating system;
The generation unit for the dynamic hard disk failure early-warning model normalizes the collected S.M.A.R.T. parameter data together with the corresponding hard disk brand and model data to generate a normalized S.M.A.R.T. data set and, based on the normalized S.M.A.R.T. data set and the collected hard disk error log data, establishes the dynamic hard disk failure early-warning model;
The generation unit for the normal fluctuation curve and range of S.M.A.R.T. parameters groups the collected S.M.A.R.T. data by parameter and, in combination with the corresponding hard disk brand and model data, forms dynamic S.M.A.R.T. parameter curves for hard disks of different brands and models, derives statistically the normal fluctuation range of each S.M.A.R.T. parameter for healthy operation, and establishes the normal fluctuation curve and range of S.M.A.R.T. parameters;
The generation unit for the dynamic health diagnosis scoring model obtains, through big data analysis, the S.M.A.R.T. parameter weights for hard disks of different brands and models; based on the manufacturer's settings for S.M.A.R.T. warning parameters, combined with training data, it derives new warning parameters for hard disks of different brands and models and their weight factors for impact on hard disk health; it sets a full score and, according to the new S.M.A.R.T. parameter weights and the health impact weight factors, sets the deduction standard to obtain the dynamic health diagnosis scoring model.
Preferably, the hard disk type data further include the serial number, firmware version, interface type, rotational speed, capacity, and cache size.
The beneficial effects of the invention are as follows:
1. A cloud storage server is established to continuously collect hard disk health data;
2. The collected data are organized to establish the dynamic hard disk failure early-warning model, the normal fluctuation curve and range of S.M.A.R.T. parameters, and the dynamic health diagnosis scoring model;
3. During hard disk failure diagnosis, machine learning continually improves the correctness of the models.
Brief description of the drawings
Fig. 1 is the main flow of dynamically diagnosing hard disk failures based on S.M.A.R.T. data;
Fig. 2 is the main flow of the dynamic hard disk failure early-warning model;
Fig. 3 is the main flow of the normal fluctuation curve and range of S.M.A.R.T. parameters;
Fig. 4 is the main flow of the dynamic health diagnosis scoring model.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments.
As shown in Fig. 1, a method for dynamically diagnosing hard disk failures based on SMART data comprises:
101 Establish a cloud storage server that continuously collects three classes of data: first, hard disk brand, model, serial number, firmware version, interface type, rotational speed, capacity, and cache size; second, S.M.A.R.T. parameter data and the time at which each parameter was collected; third, hard disk error log data recorded by the operating system;
The S.M.A.R.T. parameter data collected include, but are not limited to: 01 read error rate; 02 throughput performance; 03 spin-up time; 04 start/stop count; 05 reallocated sectors count; 06 read channel margin; 07 seek error rate; 08 seek time performance; 09 power-on hours; 0A spin retry count; 0B recalibration retries; 0C power cycle count; 0D soft read error rate; B8 end-to-end error; BB reported uncorrectable errors; BC command timeout; BD high fly writes; BE airflow temperature; BF G-sense error rate; C0 power-off retract count; C1 load/unload cycle count; C2 temperature; C3 hardware ECC recovered error count; C4 reallocation event count; C5 current pending sector count; C6 uncorrectable sector count; C7 UltraDMA CRC error count; C8 write error rate; C9 soft read error rate; CA data address mark errors; CB ECC error rate; CC soft ECC correction; CD thermal asperity rate; CE head flying height; CF spin-up peak current; D0 spin-up buzz/spin-up step; D1 offline seek performance; D3 vibration during write; D4 vibration during write; DC disk shift; DD G-sense error rate; DE loaded hours; DF load/unload retry count; E0 load friction; E1 load/unload cycle count; E2 load-in time; E3 torque amplification count; E4 power-off retract count; E6 GMR head amplitude; E7 temperature; F0 head flying hours; FA read error retry rate; FE free fall protection;
102 Establish the dynamic hard disk failure early-warning model. Normalize the collected S.M.A.R.T. parameter data together with the corresponding hard disk brand and model data to generate a normalized S.M.A.R.T. data set; based on the normalized S.M.A.R.T. data set and the collected hard disk error log data, establish the dynamic hard disk failure early-warning model (see Fig. 2 for the detailed flow);
103 Establish the normal fluctuation curve and range of S.M.A.R.T. parameters. Group the collected S.M.A.R.T. data by parameter and, in combination with the corresponding hard disk brand and model data, form dynamic S.M.A.R.T. parameter curves for hard disks of different brands and models; derive statistically the normal fluctuation range of each S.M.A.R.T. parameter for healthy operation (see Fig. 3 for the detailed flow);
104 Establish the dynamic health diagnosis scoring model. Through big data analysis, obtain the S.M.A.R.T. parameter weights for hard disks of different brands and models; based on the manufacturer's settings for S.M.A.R.T. warning parameters, combined with training data, derive new warning parameters for hard disks of different brands and models and their weight factors for impact on hard disk health; with a full score of 100 points, set the deduction standard according to the new S.M.A.R.T. parameter weights and the health impact weight factors to obtain the dynamic health diagnosis scoring model (see Fig. 4 for the detailed flow);
105 Start the health diagnosis;
106 Determine the hard disk type: mechanical hard disk or solid state drive;
107 Diagnose. Based on the dynamic hard disk failure early-warning model, the normal fluctuation curve and range of S.M.A.R.T. parameters, and the dynamic health diagnosis scoring model, score the health of the hard disk and give targeted suggestions; if the hard disk is at risk, issue a warning automatically; if the warning is erroneous, start machine learning (see Fig. 2, Fig. 3, and Fig. 4 for the detailed training flows);
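The three classes of data collected in step 101 could be modeled as simple records. The sketch below is hypothetical throughout: the class names, field names, and the helper `parse_attribute_id` are illustrative choices, not structures defined by the text. The only fact taken from the text is that attributes are identified by two-digit hexadecimal IDs such as 05 or C2.

```python
from dataclasses import dataclass, field


@dataclass
class DriveInfo:
    """Class 1: hard disk type data."""
    brand: str
    model: str
    serial_number: str = ""
    firmware_version: str = ""
    interface_type: str = ""
    rotational_speed_rpm: int = 0
    capacity_bytes: int = 0
    cache_bytes: int = 0


@dataclass
class SmartSample:
    """Class 2: one S.M.A.R.T. reading and its collection time."""
    attribute_id: int    # e.g. 0xC2 for temperature
    raw_value: int
    collected_at: float  # Unix timestamp (an assumption of this sketch)


@dataclass
class DriveReport:
    """Everything uploaded for one drive, including class 3: OS error log lines."""
    drive: DriveInfo
    samples: list = field(default_factory=list)
    os_error_log: list = field(default_factory=list)


def parse_attribute_id(hex_id: str) -> int:
    """Convert a two-digit hexadecimal attribute ID such as 'C2' to an integer."""
    return int(hex_id, 16)
```

Storing the collection time with every sample is what later allows each attribute to be plotted as a curve over time.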
201 Data normalization. Two normalization methods are in common use: min-max normalization (Min-Max Normalization) and Z-score standardization. The min-max method has a drawback: when new data are added, max and min may change and must be redefined. This scheme therefore uses Z-score standardization. The formula is $x^* = \frac{x - \mu}{\sigma}$, where x is the sample data, x* is the normalized data, μ is the mean of all sample data, and σ is the standard deviation of all sample data;
202 Classify the normalized data by hardware component: platter, head, head arm, motor, control circuit board (PCB), data interface, controller, flash memory chips, and so on;
203 Group the normalized, classified data by hard disk manufacturer, brand, and model;
204 According to the threshold that the manufacturer sets for each S.M.A.R.T. parameter, set the warning values for the normalized, classified, and grouped data; build the S.M.A.R.T. parameter comparison models; combine the comparison models with the warning trigger to form the dynamic hard disk failure early-warning model;
205 Diagnose. Read the S.M.A.R.T. parameter data of the hard disk under test and import it into the comparison model; when any item exceeds its warning value, the warning trigger fires, automatically pushes a warning message, and tells the user where the hard disk fault lies;
206 Erroneous warning, machine learning. Read and collect the S.M.A.R.T. parameter data of the hard disk under test, normalize it, and store it on the cloud storage server. According to that data and the erroneous warning item, correct the warning value of the corresponding normalized, classified, and grouped data to generate a new dynamic hard disk failure early-warning model; record the correction on the cloud storage server.
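The comparison and correction loop of steps 204 to 206 can be sketched as below. This is a simplified illustration under explicit assumptions: warning values are kept in a plain dictionary keyed by attribute ID, "exceeds" is read as "greater than", and a false alarm is corrected by raising the warning value to the observed value. The text does not say how large the correction should be, and the function names are hypothetical.

```python
def check_drive(normalized_values, warning_values):
    """Return the attribute IDs whose normalized value exceeds its warning value.

    normalized_values: {attribute_id: normalized reading of the drive under test}
    warning_values:    {attribute_id: warning value for this brand/model group}
    Attributes without a configured warning value are ignored.
    """
    return sorted(
        attr
        for attr, value in normalized_values.items()
        if attr in warning_values and value > warning_values[attr]
    )


def correct_false_alarm(warning_values, attr, observed):
    """Return a new warning table in which `observed` no longer triggers `attr`.

    The original table is left untouched so the correction can be logged
    alongside the previous model.
    """
    updated = dict(warning_values)
    updated[attr] = max(updated[attr], observed)
    return updated
```

For example, with warning values `{"05": 2.0, "C5": 1.5}`, a drive reporting `{"05": 2.5, "C5": 1.0}` triggers only attribute 05; if that warning is confirmed to be erroneous, the corrected table no longer fires on the same reading.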
301 Generate the curves. Retrieve the S.M.A.R.T. parameters and parameter collection times gathered by the cloud storage server; with a single S.M.A.R.T. parameter on the vertical axis and time on the horizontal axis, generate the curve for that parameter; in the same way, generate the curves for all individual S.M.A.R.T. parameters;
302 From each individual S.M.A.R.T. parameter curve, obtain the normal fluctuation range of that parameter (Min-Max; Min is the trough of the curve and Max is its peak);
303 Diagnose. Read the S.M.A.R.T. parameter data of the hard disk under test and import it into the comparison model; when any item suddenly exceeds its normal fluctuation range (Min-Max), the warning trigger fires, automatically pushes a warning message, and tells the user where the hard disk fault lies;
304 Erroneous warning, machine learning. Read and collect the S.M.A.R.T. parameter data that caused an erroneous warning on the hard disk under test, and correct the normal fluctuation range by lowering Min or raising Max, generating a new normal fluctuation range (Min-Max) for that parameter; record the correction on the cloud storage server.
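The range correction of step 304 amounts to widening the (Min, Max) interval. The following sketch makes two assumptions not stated in the text: the range is widened just far enough to include the reading that caused the false alarm, and the correction record is a dictionary appended to an in-memory list standing in for the cloud storage server. The function name is hypothetical.

```python
def correct_range(value_range, false_alarm_value, correction_log):
    """Widen (Min, Max) so a reading that caused a false alarm counts as normal.

    value_range:       current (Min, Max) tuple for one S.M.A.R.T. parameter
    false_alarm_value: the reading that wrongly triggered a warning
    correction_log:    list standing in for the cloud-side correction record
    """
    low, high = value_range
    new_range = (min(low, false_alarm_value), max(high, false_alarm_value))
    if new_range != value_range:
        correction_log.append(
            {"old": value_range, "new": new_range, "trigger": false_alarm_value}
        )
    return new_range
```

A reading already inside the range leaves both the range and the log unchanged, so repeated feedback on the same value does not produce duplicate records.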
401 Set the parameter weight levels. According to the S.M.A.R.T. parameters and the needs of hard disk health diagnosis, set two weight levels as the decision factors of the health diagnosis: hardware fault detection (level one), and usage cumulative statistics and usage state monitoring (level two);
402 Determine the health impact weight factors. Based on the manufacturer's original settings, the correction data of the dynamic hard disk failure early-warning model, and the correction data of the normal fluctuation curve and range of S.M.A.R.T. parameters, set and adjust the weight factors of the parameters in the two weight levels of step 401 according to the following rules:
(1) Weight factor for hardware fault detection: 80%; for usage cumulative statistics and usage state monitoring: 10%;
(2) Once a usage cumulative statistic exceeds 60% of the highest cumulative value for that item, its weight factor is raised to 20%; once it exceeds 80%, the weight factor is raised to 40%; once it exceeds 90%, its weight level is raised to the highest level;
(3) If a usage state monitoring value stays outside the normal fluctuation curve and range of the S.M.A.R.T. parameter for one consecutive week, its weight factor is raised to 20%; for two consecutive weeks, to 40%; for one consecutive month, its weight level is raised to the highest level;
403 Determine the scoring rules. A hard disk in good health scores the full 100 points. Points are deducted item by item according to the health impact weight factors, down to a minimum of 0 points, which is the alarm state. The deduction rules are as follows:
(1) Level-one weight: 25 points.
If the S.M.A.R.T. parameter reports a value, deduct 25 points multiplied by the 80% weight, that is, 20 points;
If the S.M.A.R.T. parameter exceeds its threshold, deduct 25 points; if the parameter has a correction record, the corrected value replaces the original threshold before the judgment is made;
(2) Level-two weight: 10 points.
If the S.M.A.R.T. parameter reports a value, deduct 10 points multiplied by the 10% weight, that is, 1 point;
If the S.M.A.R.T. parameter exceeds its threshold, deduct 10 points; if the parameter has a correction record, the corrected value replaces the original threshold before the judgment is made;
When the S.M.A.R.T. parameter is a usage cumulative statistic:
Once the parameter value exceeds 60% of the highest cumulative value for that item, deduct 10 points multiplied by the 20% weight, that is, 2 points;
Once the parameter value exceeds 80% of the highest cumulative value for that item, deduct 10 points multiplied by the 40% weight, that is, 4 points;
Once the parameter value exceeds 90% of the highest cumulative value for that item, the weight level is raised to the highest level: deduct 20 points if the threshold is not exceeded and 25 points if it is;
When the S.M.A.R.T. parameter is a usage state monitoring parameter:
If the parameter value stays outside the normal fluctuation curve and range for one consecutive week, deduct 10 points multiplied by the 20% weight, that is, 2 points;
If the parameter value stays outside the normal fluctuation curve and range for two consecutive weeks, deduct 10 points multiplied by the 40% weight, that is, 4 points;
If the parameter value stays outside the normal fluctuation curve and range for one consecutive month, the weight level is raised to the highest level: deduct 20 points if the threshold is not exceeded and 25 points if it is;
404 Generate the dynamic health diagnosis scoring model. Based on the weight factors of step 402 and the scoring rules of step 403, combined with the S.M.A.R.T. parameter settings of each hard disk manufacturer and controller manufacturer, establish the hard disk health diagnosis scoring algorithm model.
405 Diagnose.
406 Adjust the dynamic health diagnosis scoring model. According to feedback data, correct and fine-tune the weight factors and scoring rules of the S.M.A.R.T. parameters to generate a new dynamic health diagnosis scoring model; record the correction on the cloud storage server.
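The deduction rules of step 403 can be checked with a short worked sketch. The names `deduction` and `health_score` and the tuple layout of a finding are hypothetical. A level-two parameter that has been escalated to the highest level is modeled here by passing level 1 with the 80% weight, which reproduces the "20 points below threshold, 25 points above" rule; that equivalence is an interpretation rather than something the text states.

```python
FULL_SCORE = 100
BASE_POINTS = {1: 25, 2: 10}  # points at stake per weight level


def deduction(level, weight_percent, over_threshold):
    """Points deducted for one S.M.A.R.T. parameter that reports a value.

    level:          1 (hardware fault detection) or 2 (usage statistics/monitoring)
    weight_percent: health impact weight factor, e.g. 80, 40, 20, or 10
    over_threshold: True when the (possibly corrected) threshold is exceeded
    """
    base = BASE_POINTS[level]
    if over_threshold:
        return base  # the full base amount is deducted
    return base * weight_percent / 100


def health_score(findings):
    """Score a drive from an iterable of (level, weight_percent, over_threshold)."""
    total = sum(deduction(level, weight, over) for level, weight, over in findings)
    return max(0, FULL_SCORE - total)  # 0 points is the alarm state
```

As a worked example, one level-one parameter reporting a value below its threshold costs 25 × 80% = 20 points and one level-two parameter at the base weight costs 10 × 10% = 1 point, so `health_score([(1, 80, False), (2, 10, False)])` gives 79.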
An apparatus for dynamically diagnosing hard disk failures based on SMART data comprises a cloud storage server, a generation unit for the dynamic hard disk failure early-warning model, a generation unit for the normal fluctuation curve and range of S.M.A.R.T. parameters, and a generation unit for the dynamic health diagnosis scoring model;
The cloud storage server continuously collects three classes of data: first, hard disk type data, including hard disk brand and model; second, S.M.A.R.T. parameter data and the time at which each parameter was collected; third, hard disk error log data recorded by the operating system;
The generation unit for the dynamic hard disk failure early-warning model normalizes the collected S.M.A.R.T. parameter data together with the corresponding hard disk brand and model data to generate a normalized S.M.A.R.T. data set and, based on the normalized S.M.A.R.T. data set and the collected hard disk error log data, establishes the dynamic hard disk failure early-warning model;
The generation unit for the normal fluctuation curve and range of S.M.A.R.T. parameters groups the collected S.M.A.R.T. data by parameter and, in combination with the corresponding hard disk brand and model data, forms dynamic S.M.A.R.T. parameter curves for hard disks of different brands and models, derives statistically the normal fluctuation range of each S.M.A.R.T. parameter for healthy operation, and establishes the normal fluctuation curve and range of S.M.A.R.T. parameters;
The generation unit for the dynamic health diagnosis scoring model obtains, through big data analysis, the S.M.A.R.T. parameter weights for hard disks of different brands and models; based on the manufacturer's settings for S.M.A.R.T. warning parameters, combined with training data, it derives new warning parameters for hard disks of different brands and models and their weight factors for impact on hard disk health; it sets a full score and, according to the new S.M.A.R.T. parameter weights and the health impact weight factors, sets the deduction standard to obtain the dynamic health diagnosis scoring model.
The hard disk type data further include the serial number, firmware version, interface type, rotational speed, capacity, and cache size.

Claims (4)

1. A method for dynamically diagnosing hard disk failure based on SMART data, characterized in that it comprises the following steps:
101 Establish a cloud storage service end and continuously collect three classes of data: first, hard disk type data, including hard disk brand and model; second, S.M.A.R.T. parameter data and parameter collection time data; third, hard disk error log data recorded by the operating system;
102 Normalize the collected S.M.A.R.T. parameter data together with the corresponding hard disk brand and model data to generate a normalized S.M.A.R.T. data set; based on the normalized S.M.A.R.T. data set and the collected hard disk error log data, establish a hard disk failure early warning dynamic model;
103 Group the collected S.M.A.R.T. data by parameter and, in combination with the corresponding hard disk brand and model data, form dynamic change curves of the S.M.A.R.T. parameters for hard disks of different brands and models; statistically derive the normal fluctuation range of each S.M.A.R.T. parameter during healthy hard disk operation, and establish the S.M.A.R.T. parameter normal fluctuation curve and range;
104 Through big data analysis, obtain the S.M.A.R.T. parameter weights for hard disks of different brands and models; based on the hard disk manufacturer's settings for the S.M.A.R.T. early warning parameters, combined with training and learning data, obtain new early warning parameters for hard disks of different brands and models and their influence weight factors on hard disk health; set a full score value, set a deduction standard according to the new S.M.A.R.T. parameter weights and the influence weight factors on hard disk health, and obtain a health diagnosis scoring dynamic model; based on the hard disk failure early warning dynamic model, the S.M.A.R.T. parameter normal fluctuation curve and range, and the health diagnosis scoring dynamic model, diagnose and score the health status of the hard disk and provide targeted suggestions; if the hard disk is at risk, issue an early warning automatically; if the early warning is wrong, start machine learning;
Step 103 comprises the following steps:
301 Call the S.M.A.R.T. parameters and parameter collection times collected by the cloud storage service end; with the data of a single S.M.A.R.T. parameter on the vertical axis and time on the horizontal axis, generate a curve graph for that S.M.A.R.T. parameter; in this way, generate curve graphs for all individual S.M.A.R.T. parameters;
302 From the curve graph of each individual S.M.A.R.T. parameter, obtain the normal fluctuation range of that parameter;
303 Read the S.M.A.R.T. parameter data of the hard disk to be checked and import it into the comparison model; when an item of data suddenly exceeds its normal fluctuation range, the early warning trigger fires and warning information is pushed automatically, informing the user where the hard disk fault lies;
304 Read and collect the S.M.A.R.T. parameter data of hard disks under check for which the early warning was wrong, and correct the normal fluctuation range by lowering the minimum (Min) or raising the maximum (Max) early warning value, generating a new normal fluctuation range for the individual S.M.A.R.T. parameter; record the related corrections at the cloud storage service end.
2. The method for dynamically diagnosing hard disk failure based on SMART data according to claim 1, characterized in that the hard disk type information further includes serial number, firmware version, interface type, disk rotational speed, disk size and cache size data.
3. A device based on the method for dynamically diagnosing hard disk failure based on SMART data according to claim 1, characterized in that it comprises a cloud storage service end, a hard disk failure early warning dynamic model generation unit, a S.M.A.R.T. parameter normal fluctuation curve and range generation unit, and a health diagnosis scoring dynamic model generation unit;
The cloud storage service end continuously collects three classes of data: first, hard disk type data, including hard disk brand and model; second, S.M.A.R.T. parameter data and parameter collection time data; third, hard disk error log data recorded by the operating system;
The hard disk failure early warning dynamic model generation unit normalizes the collected S.M.A.R.T. parameter data together with the corresponding hard disk brand and model data to generate a normalized S.M.A.R.T. data set, and establishes the hard disk failure early warning dynamic model based on the normalized S.M.A.R.T. data set and the collected hard disk error log data;
The S.M.A.R.T. parameter normal fluctuation curve and range generation unit groups the collected S.M.A.R.T. data by parameter and, in combination with the corresponding hard disk brand and model data, forms dynamic change curves of the S.M.A.R.T. parameters for hard disks of different brands and models; it statistically derives the normal fluctuation range of each S.M.A.R.T. parameter during healthy hard disk operation, and establishes the S.M.A.R.T. parameter normal fluctuation curve and range; specifically, it performs the following steps: 301 call the S.M.A.R.T. parameters and parameter collection times collected by the cloud storage service end; with the data of a single S.M.A.R.T. parameter on the vertical axis and time on the horizontal axis, generate a curve graph for that S.M.A.R.T. parameter; in this way, generate curve graphs for all individual S.M.A.R.T. parameters; 302 from the curve graph of each individual S.M.A.R.T. parameter, obtain the normal fluctuation range of that parameter; 303 read the S.M.A.R.T. parameter data of the hard disk to be checked and import it into the comparison model; when an item of data suddenly exceeds its normal fluctuation range, the early warning trigger fires and warning information is pushed automatically, informing the user where the hard disk fault lies; 304 read and collect the S.M.A.R.T. parameter data of hard disks under check for which the early warning was wrong, and correct the normal fluctuation range by lowering the minimum (Min) or raising the maximum (Max) early warning value, generating a new normal fluctuation range for the individual S.M.A.R.T. parameter; record the related corrections at the cloud storage service end;
The health diagnosis scoring dynamic model generation unit obtains, through big data analysis, the S.M.A.R.T. parameter weights for hard disks of different brands and models; based on the hard disk manufacturer's settings for the S.M.A.R.T. early warning parameters, combined with training and learning data, it obtains new early warning parameters for hard disks of different brands and models and their influence weight factors on hard disk health; it sets a full score value, sets a deduction standard according to the new S.M.A.R.T. parameter weights and the influence weight factors on hard disk health, and obtains the health diagnosis scoring dynamic model.
4. The device for dynamically diagnosing hard disk failure based on SMART data according to claim 3, characterized in that the hard disk type information further includes serial number, firmware version, interface type, disk rotational speed, disk size and cache size data.
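Steps 301 to 304 recited in the claims can be illustrated with a minimal sketch for a single S.M.A.R.T. parameter. The claims do not state which statistic yields the normal fluctuation range, so the use of the observed minimum and maximum of healthy samples is an assumption, and the names `FluctuationRange`, `derive_range`, `exceeds_range` and `correct_after_false_alarm` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FluctuationRange:
    """Normal fluctuation range (Min, Max) of one S.M.A.R.T. parameter."""
    min_value: float
    max_value: float


def derive_range(history):
    """Steps 301 and 302: history is a list of (collection_time, value) pairs,
    i.e. the points of the single-parameter curve. The range is taken as the
    observed min and max of healthy samples (an assumed statistic)."""
    values = [value for _, value in history]
    if not values:
        raise ValueError("history must contain at least one sample")
    return FluctuationRange(min(values), max(values))


def exceeds_range(rng, value):
    """Step 303: True means the early warning trigger should fire."""
    return value < rng.min_value or value > rng.max_value


def correct_after_false_alarm(rng, value):
    """Step 304: lower Min or raise Max so that a value which caused a
    wrong early warning falls inside the new normal range."""
    return FluctuationRange(min(rng.min_value, value), max(rng.max_value, value))


# Illustrative temperature-like samples (invented values).
history = [(0, 30.0), (1, 34.0), (2, 32.0)]
rng = derive_range(history)
if exceeds_range(rng, 36.0):
    # Suppose the warning for 36.0 was later confirmed to be wrong.
    rng = correct_after_false_alarm(rng, 36.0)
```

A real deployment would keep one such range per parameter, brand and model, and would persist each correction at the cloud storage service end as step 304 requires.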
CN201510738474.0A 2015-11-04 2015-11-04 Method and apparatus based on SMART data dynamic diagnosis hard disk failure Active CN105260279B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510738474.0A CN105260279B (en) 2015-11-04 2015-11-04 Method and apparatus based on SMART data dynamic diagnosis hard disk failure

Publications (2)

Publication Number Publication Date
CN105260279A CN105260279A (en) 2016-01-20
CN105260279B true CN105260279B (en) 2019-01-01

Family

ID=55099979

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510738474.0A Active CN105260279B (en) 2015-11-04 2015-11-04 Method and apparatus based on SMART data dynamic diagnosis hard disk failure

Country Status (1)

Country Link
CN (1) CN105260279B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109945968A (en) * 2019-03-19 2019-06-28 苏州浪潮智能科技有限公司 Device, method and system for detecting the magnitude of noise impact at multiple locations on a hard disk

Families Citing this family (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107025154B (en) * 2016-01-29 2020-12-01 阿里巴巴集团控股有限公司 Disk failure prediction method and device
EP3446220A4 (en) * 2016-04-22 2020-03-25 Hewlett-Packard Development Company, L.P. Determining the health of a storage drive
CN107172128B (en) * 2017-04-24 2020-05-22 华南理工大学 Cloud-assisted manufacturing equipment big data acquisition method
CN109144833A (en) * 2017-06-27 2019-01-04 中兴通讯股份有限公司 Hard disk analysis method and device
CN107391325B (en) * 2017-06-30 2021-03-12 苏州浪潮智能科技有限公司 Hard disk test method and device and terminal
CN107479836A (en) * 2017-08-29 2017-12-15 郑州云海信息技术有限公司 Disk failure monitoring method, device and storage system
CN108073486B (en) * 2017-12-28 2022-05-10 新华三大数据技术有限公司 Hard disk fault prediction method and device
EP3747008A4 (en) * 2018-01-31 2021-09-15 Hewlett-Packard Development Company, L.P. Hard disk drive lifetime forecasting
CN108446734A (en) * 2018-03-20 2018-08-24 中科边缘智慧信息科技(苏州)有限公司 Disk failure automatic prediction method based on artificial intelligence
CN108681496A (en) * 2018-05-09 2018-10-19 北京奇艺世纪科技有限公司 Disk failure prediction method, device and electronic equipment
CN108763023A (en) * 2018-05-29 2018-11-06 郑州云海信息技术有限公司 Disk grading method, device, equipment and readable storage medium
CN108846833A (en) * 2018-05-30 2018-11-20 郑州云海信息技术有限公司 Method for diagnosing hard disk failure based on TensorFlow image recognition
CN108959006A (en) * 2018-06-29 2018-12-07 郑州云海信息技术有限公司 Hardware detection method and tool
CN109032891A (en) * 2018-07-23 2018-12-18 郑州云海信息技术有限公司 Cloud computing server hard disk failure prediction method and device
CN109144835A (en) * 2018-08-02 2019-01-04 广东浪潮大数据研究有限公司 Automatic prediction method, device, equipment and medium for application service failure
CN109144798B (en) * 2018-08-13 2020-11-10 清华大学 Intelligent management system with machine learning function
CN110888763A (en) * 2018-09-11 2020-03-17 北京奇虎科技有限公司 Disk fault diagnosis method and device, terminal equipment and computer storage medium
CN109828869B (en) * 2018-12-05 2020-12-04 南京中兴软件有限责任公司 Method, device and storage medium for predicting hard disk fault occurrence time
CN109918246A (en) * 2019-02-28 2019-06-21 苏州浪潮智能科技有限公司 Disk state detection method, system, terminal and storage medium
CN110119344B (en) * 2019-04-10 2023-09-01 深圳市科新精密电子有限公司 Hard disk health state analysis method based on S.M.A.R.T. parameters
CN111966569A (en) * 2019-05-20 2020-11-20 中国电信股份有限公司 Hard disk health degree evaluation method and device and computer readable storage medium
CN110427311B (en) * 2019-06-26 2020-07-28 华中科技大学 Disk fault prediction method and system based on time sequence characteristic processing and model optimization
US10969969B2 (en) * 2019-06-26 2021-04-06 Western Digital Technologies, Inc. Use of recovery behavior for prognosticating and in-situ repair of data storage devices
CN110929305A (en) * 2019-08-08 2020-03-27 北京盛赞科技有限公司 Hard disk protection method, device, equipment and computer readable storage medium
CN111382029B (en) * 2020-03-05 2021-09-03 清华大学 Mainboard abnormity diagnosis method and device based on PCA and multidimensional monitoring data
CN111581072B (en) * 2020-05-12 2023-08-15 国网安徽省电力有限公司信息通信分公司 Disk fault prediction method based on SMART and performance log
CN111835593B (en) * 2020-07-14 2022-06-03 杭州海康威视数字技术股份有限公司 Detection method based on nonvolatile storage medium, storage medium and electronic equipment
CN112231141B (en) * 2020-09-18 2022-07-29 苏州浪潮智能科技有限公司 Method and device for improving Raid data backup efficiency
CN112416670B (en) * 2020-11-12 2024-02-09 宁畅信息产业(北京)有限公司 Hard disk testing method, device, server and storage medium
CN112486787A (en) * 2020-11-13 2021-03-12 苏州浪潮智能科技有限公司 Method and device for evaluating warning state of hard disk for distributed storage
CN113886128B (en) * 2021-10-20 2022-09-09 深圳市东方聚成科技有限公司 SSD (solid State disk) fault diagnosis and data recovery method and system
CN117472629B (en) * 2023-11-02 2024-05-28 兰州航空职业技术学院 Multi-fault diagnosis method and system for electronic information system
CN117520104B (en) * 2024-01-08 2024-03-29 中国民航大学 System for predicting abnormal state of hard disk

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8094015B2 (en) * 2009-01-22 2012-01-10 International Business Machines Corporation Wavelet based hard disk analysis
CN103578568A (en) * 2012-07-24 2014-02-12 苏州捷泰科信息技术有限公司 Method and apparatus for testing performances of solid state disks
CN103646114A (en) * 2013-12-26 2014-03-19 北京百度网讯科技有限公司 Method and device for extracting feature data from SMART data of hard disk
CN104503874A (en) * 2014-12-29 2015-04-08 南京大学 Hard disk failure prediction method for cloud computing platform

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Machine Learning Methods for Predicting Failures in Hard Drives: A Multiple-Instance Application; Joseph F. Murray et al.; Journal of Machine Learning Research; 20051231; full text
Online Failure Prediction in Cloud Datacenters by Real-time Message Pattern Learning; Yukihiro Watanabe et al.; 2012 IEEE 4th International Conference on Cloud Computing Technology and Science; 20121231; full text
Predicting hard disk failures on cloud computing platforms using SMART based on the COG-OS framework (基于COG-OS框架利用SMART预测云计算平台的硬盘故障); Song Yunhua (宋云华) et al.; Journal of Computer Applications (《计算机应用》); 20140110; full text

Also Published As

Publication number Publication date
CN105260279A (en) 2016-01-20

Similar Documents

Publication Publication Date Title
CN105260279B (en) Method and apparatus based on SMART data dynamic diagnosis hard disk failure
CN108228377B (en) SMART threshold value optimization method for disk fault detection
US11740619B2 (en) Malfunction early-warning method for production logistics delivery equipment
CN102870057B (en) Plant diagnosis device, diagnosis method, and diagnosis program
CN102999038B (en) Diagnostic device and diagnostic method for a generating set
KR101713985B1 (en) Method and apparatus for prediction maintenance
WO2019199433A1 (en) Predicting failures in electrical submersible pumps using pattern recognition
CN104035331B (en) Unit running optimization instructs system and equipment thereof
CA2972973A1 (en) Machine learning-based fault detection system
US20170261403A1 (en) Abnormality detection procedure development apparatus and abnormality detection procedure development method
JP7180985B2 (en) Diagnostic device and diagnostic method
EP2646884A1 (en) Machine anomaly detection and diagnosis incorporating operational data
WO2015121176A1 (en) Method of identifying anomalies
CN105928710A (en) Diesel engine fault monitoring method
CN109522193A (en) Method, system and device for processing operation and maintenance data
CN102541050A (en) Chemical process fault diagnosis method based on improved support vector machine
CN107710089A (en) Shop equipment diagnostic device and shop equipment diagnostic method
GB2608772A (en) Machine learning based data monitoring
CN105653835B (en) Anomaly detection method based on clustering
CN110119344B (en) Hard disk health state analysis method based on S.M.A.R.T. parameters
JP2006276924A (en) Equipment diagnostic device and equipment diagnostic program
US20240168835A1 (en) Hard disk failure prediction method, system, device and medium
CN116538092B (en) Compressor on-line monitoring and diagnosing method, device, equipment and storage medium
CN112668415A (en) Aircraft engine fault prediction method
Jiang et al. A SVDD and K‐Means Based Early Warning Method for Dual‐Rotor Equipment under Time‐Varying Operating Conditions

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant