WO2017129030A1 - Disk failure prediction method and apparatus - Google Patents

Disk failure prediction method and apparatus

Publication number
WO2017129030A1
Authority
WO
WIPO (PCT)
Prior art keywords
disk
data
sample
tested
module
Prior art date
Application number
PCT/CN2017/071695
Other languages
French (fr)
Chinese (zh)
Inventor
丁永明
周俊
崔卿
瞿神全
Original Assignee
阿里巴巴集团控股有限公司
丁永明
周俊
崔卿
瞿神全
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司, 丁永明, 周俊, 崔卿, 瞿神全
Publication of WO2017129030A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2273 Test methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3037 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a memory, e.g. virtual memory, cache
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3058 Monitoring arrangements for monitoring environmental properties or parameters of the computing system or of the computing system component, e.g. monitoring of power, currents, temperature, humidity, position, vibrations

Definitions

  • the present invention relates to the field of magnetic disks, and in particular to a method and apparatus for predicting failure of a magnetic disk.
  • The hard disk is the main medium for storing data, and a hard disk failure can cause severe data loss. Ensuring the stability of the hard disk is therefore very important.
  • the probability of a hard disk error in 24 hours is about one in ten thousand.
  • the probability of a server hard disk error will rise to one thousandth, and with the current website.
  • the number of hard disks that the server needs to use will increase, and the probability of multiple hard disks failing at the same time will increase.
  • Data storage usually keeps multiple backups: for example, MySQL primary and standby databases, and GFS files, which default to 3 backups.
  • The embodiments of the invention provide a method and an apparatus for predicting disk failure, to at least solve the technical problem that prior-art hard disk failure prediction systems give inaccurate predictions because some factors that readily cause hard disk failure cannot be collected or quantified.
  • A method for predicting disk failure includes: acquiring sample disk data of a disk by using a disk monitoring technology, where the sample disk data includes sample data in multiple dimensions; training on the sample disk data by using a GBDT algorithm to obtain a disk prediction model composed of a plurality of decision trees; and, after receiving disk data of a disk to be tested, processing that disk data by using the disk prediction model composed of the plurality of decision trees to determine whether the disk to be tested is a failed disk.
  • A disk failure prediction apparatus is also provided, the apparatus being configured to: acquire sample disk data of a disk by using a disk monitoring technology, where the sample disk data includes sample data in multiple dimensions; train on the sample disk data by using the GBDT algorithm to obtain a disk prediction model composed of multiple decision trees; and, after receiving disk data of a disk to be tested, process that disk data by using the disk prediction model composed of the multiple decision trees to determine whether the disk to be tested is a failed disk.
  • In the embodiments of the invention, the sample disk data of the disk is obtained by using a disk monitoring technology, where the sample disk data includes sample data in multiple dimensions; the sample disk data is trained by using the GBDT algorithm to obtain a disk prediction model composed of multiple decision trees; and, after the disk data of the disk to be tested is received, it is processed by the disk prediction model to determine whether the disk to be tested is a failed disk. This achieves the technical effect of predicting the fault state of the disk, and thereby solves the technical problem that prior-art hard disk failure prediction systems give inaccurate predictions because some factors that readily cause hard disk failure cannot be collected or quantified.
  • FIG. 1 is a block diagram showing the hardware structure of a computer terminal for predicting a failure of a magnetic disk according to an embodiment of the present invention
  • FIG. 2 is a flowchart of a method for predicting a failure of a magnetic disk according to an embodiment of the present invention
  • FIG. 3 is a schematic diagram of training sample disk data using the GBDT algorithm according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of calculating a predicted value of a disk using a GBDT algorithm according to an embodiment of the present invention
  • FIG. 5 is a flowchart of an optional disk fault prediction method according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a fault prediction apparatus for a magnetic disk according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of an optional disk fault prediction apparatus according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of an optional disk fault prediction apparatus according to an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of an optional disk fault prediction apparatus according to an embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of an optional disk fault prediction apparatus according to an embodiment of the present invention.
  • FIG. 11 is a block diagram showing the structure of a computer terminal according to an embodiment of the present invention.
  • According to an embodiment of the present invention, an embodiment of a method for predicting disk failure is also provided. Note that the steps illustrated in the flowcharts of the accompanying drawings may be performed in a computer system, for example as a set of computer-executable instructions, and although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order from the one described here.
  • FIG. 1 is a hardware block diagram of a computer terminal of a method for predicting a failure of a magnetic disk according to an embodiment of the present invention.
  • computer terminal 10 may include one or more (only one shown) processor 102 (processor 102 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA)
  • a memory 104 for storing data
  • a transmission module 106 for communication functions.
  • Computer terminal 10 may also include more or fewer components than those shown in FIG. 1, or have a different configuration from that shown in FIG. 1.
  • The memory 104 can be used to store software programs and modules of application software, such as the program instructions/modules corresponding to the disk failure prediction method in the embodiment of the present invention. The processor 102 carries out the method by executing the software programs and modules stored in the memory 104.
  • Memory 104 may include high speed random access memory, and may also include non-volatile memory such as one or more magnetic storage devices, flash memory, or other non-volatile solid state memory.
  • memory 104 may further include memory remotely located relative to processor 102, which may be coupled to computer terminal 10 via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • Transmission device 106 is for receiving or transmitting data via a network.
  • Specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal 10.
  • the transmission device 106 includes a Network Interface Controller (NIC) that can be connected to other network devices through a base station to communicate with the Internet.
  • the transmission device 106 can be a Radio Frequency (RF) module for communicating with the Internet wirelessly.
  • FIG. 2 is a flow chart of a method for predicting a failure of a magnetic disk according to an embodiment of the present invention.
  • The method according to the above embodiment can be implemented by software plus a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the better implementation. On this basis, the part of the technical solution of the present invention that is essential, or that contributes over the prior art, may be embodied as a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or a CD-ROM) and including a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, or a network device) to perform the method described in the various embodiments of the present invention.
  • FIG. 2 is a flowchart of a method for predicting disk failure according to Embodiment 1 of the present invention. As shown in FIG. 2, the method includes:
  • Step S21: Obtain sample disk data of the disk by using a disk monitoring technology, where the sample disk data includes sample data in multiple dimensions.
  • The disk monitoring technology monitors the various disk data generated while the disk is in use after leaving the factory, in order to predict the fault state of the disk. The disk user can thus learn that the disk is about to fail before it does, and copy and store the data on the disk elsewhere to avoid data loss.
  • The sample disk data may include: the underlying data read error rate, the start/stop count, the number of remapped sectors, the accumulated power-on time, the number of spindle spin retries, the number of disk calibration retries, the number of disk power-ons, the temperature, and the write error rate. Sample disk data can be collected based on historical disk failure records. For example, positive and negative samples can be collected at a ratio of 1:5, where a positive sample is a failed disk and a negative sample is a disk without a fault.
  • The disks used by the different organizations that predict disk failure are not necessarily the same, and environmental factors at each organization, such as temperature and humidity, affect the disks, so the ratio differs from one organization's disks to another's. The sample disk data can therefore also be collected according to the organization's actual disk damage.
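  • The 1:5 sampling of positive and negative samples described above can be sketched as follows. This is only an illustration: the record layout (`disk_id`, `failed`), the function name `sample_disks`, and the synthetic records are assumptions, not part of the patent.

```python
import random


def sample_disks(records, neg_per_pos=5, seed=0):
    """Keep every failed disk and draw neg_per_pos healthy disks per failed one.

    The 1:5 ratio follows the example in the text; the record layout is hypothetical.
    """
    rng = random.Random(seed)
    positives = [r for r in records if r["failed"]]
    negatives = [r for r in records if not r["failed"]]
    # Never ask for more healthy disks than are available.
    n_neg = min(len(negatives), neg_per_pos * len(positives))
    return positives + rng.sample(negatives, n_neg)


# Synthetic history: 10 failed disks among 1000 (illustrative numbers only).
records = [{"disk_id": i, "failed": i < 10} for i in range(1000)]
subset = sample_disks(records)
```

  • An organization with a different failure rate could pass a different `neg_per_pos`, which matches the remark that the ratio may follow the organization's actual disk damage.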
  • Step S23 Perform sample training on the sample disk data by using a GBDT algorithm to obtain a disk prediction model composed of multiple decision trees.
  • GBDT Gradient Boosting Decision Tree
  • The above decision tree is used as a predictive model. Each decision leads to the next layer of decisions, and the tree includes decision points, state nodes, and result nodes. Each node in the tree represents an object being predicted, and each branch path represents an attribute of that object.
  • For example, suppose the sample disk data is the SMART original value of the disk. During training, a sample disk whose original value is greater than or equal to a preset original value may be considered to have a high probability of failure, and a sample disk whose original value is less than the preset original value may be considered to have a low probability of failure. Accordingly, when the disk prediction model is built, if the original value of a sample disk is greater than or equal to the preset original value, the attribute of that sample disk is confirmed as faulty; if it is less than the preset original value, the attribute is confirmed as non-faulty. A disk prediction model with this decision-making capability is thus established.
  • When a disk to be tested is input to the decision tree, the tree automatically confirms the disk as faulty if its original value is greater than or equal to the preset original value, and as non-faulty if its original value is less than the preset original value.
  • Step S25 After receiving the disk data of the disk to be tested, the disk data of the disk to be tested is processed by using the disk prediction model consisting of multiple decision trees to determine whether the disk to be tested is a faulty disk.
  • The values of the sample disk in each of the multiple dimensions are used as evaluation indexes to obtain a plurality of decision trees, which together form a disk prediction model for testing disks.
  • The decision trees obtained from different dimensions of the disk may or may not be the same. Therefore, when multiple decision trees are combined into a disk prediction model, the weight of each decision tree must be determined according to its importance in the evaluation system in order to obtain the disk prediction model.
  • Because the sample disk data is obtained by the disk monitoring technology, the process of obtaining it is simpler and the acquired data is more comprehensive, which provides rich sample data for training.
  • The training on the sample disk data with the GBDT algorithm may be divided into two steps, and training may be performed one or more times, to improve the accuracy and recall rate of the disk prediction model composed of the decision trees that result from training.
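  • The two quality measures mentioned above can be made concrete with a small sketch. It assumes that the "accuracy" of failed-disk predictions means precision (the share of disks flagged as failed that really failed); that reading, the function name, and the sample labels are assumptions for illustration.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for 0/1 labels, where 1 means a failed disk."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # failed and flagged
    fp = sum(1 for p, a in pairs if p and not a)      # healthy but flagged
    fn = sum(1 for p, a in pairs if not p and a)      # failed but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Five disks: three flagged, three actually failed, two of them in common.
precision, recall = precision_recall([1, 1, 0, 1, 0], [1, 0, 0, 1, 1])
```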
  • The solution of the first embodiment provided by the present application thus solves the technical problem that prior-art hard disk failure prediction systems give inaccurate predictions because some factors that are likely to cause hard disk failure cannot be collected or quantified.
  • the sample disk data includes at least sample data in four dimensions: original value, standard value, worst value, and cumulative value.
  • The original value is the current value of each disk parameter while the disk is running; the standard value is the value of each parameter when the disk runs normally; the worst value is the value at which a monitored parameter has deviated furthest from the normal value while the disk has been running; and the cumulative value is the accumulated result of each monitored parameter from when the disk was first used to the current time.
  • The parameters of the disk may be information describing various attributes of the disk. They may include one or more of the read error rate, the power-on count, the number of reallocated sectors, the number of spin retries, the number of disk calibration retries, and the parity error rate, and may also include other attribute information of the disk.
  • the above steps of the present application can respectively obtain a plurality of different decision trees by using the sample data in the above four dimensions.
  • the sample disk data can be obtained by using software such as HDTune or CrystalDiskInfo.
  • the method further includes:
  • Step S211: Perform any one or more of the following operations on the sample data in each dimension: a difference operation, a square operation, and a distribution sum operation, so that the sample data in any one dimension is expanded into sample data in new dimensions.
  • After an operation is performed, the sample data can be expanded into a new dimension according to the operation result, and the sample data in that dimension is obtained.
  • The sample data of each dimension can undergo several operations to obtain sample data in more dimensions. Starting from the four dimensions, performing the difference, square, and distribution sum operations on each dimension separately yields sample data in sixteen dimensions, and the decision made from the sample data of each dimension has a different focus.
  • Taking the sample data of the original value as an example again, the difference, square, and distribution sum operations are applied to it to obtain four new dimensions of sample data. Training with these four dimensions of sample data as the decision indexes yields four new decision trees.
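  • A minimal sketch of expanding one dimension into four follows. The text does not define the "distribution sum operation", so the sketch assumes it is a running sum of the readings; the difference is taken against the previous reading. The function name and the sample readings are hypothetical.

```python
def expand_dimension(values):
    """Expand one dimension of readings (in time order) into four dimensions.

    Assumptions: "difference" is the change from the previous reading, and the
    "distribution sum" is interpreted as a running sum; the patent does not
    define either precisely.
    """
    diff = [0.0] + [b - a for a, b in zip(values, values[1:])]
    square = [v * v for v in values]
    running_sum = []
    total = 0.0
    for v in values:
        total += v
        running_sum.append(total)
    return {"value": list(values), "diff": diff, "square": square, "sum": running_sum}


# Four consecutive readings of one SMART attribute (made-up numbers).
expanded = expand_dimension([3, 5, 4, 10])
```

  • Applying the same function to each of the four base dimensions gives 4 x 4 = 16 dimensions, which is consistent with the count given in the text.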
  • Training on the sample disk data with the GBDT algorithm to obtain a disk prediction model composed of multiple decision trees includes:
  • Step S231: Use the sample disk data of all disks as training data, and initialize the classification model parameters of the training data with default values.
  • Initializing the classification model parameters of the training data may mean presetting the number of decision trees and the number of layers of each decision tree, that is, setting the initial attributes of the decision trees.
  • Step S233: Extract a plurality of feature data from the training data, create a decision tree with each feature data as a root node, and use the feature value corresponding to each feature data as a leaf node of the corresponding decision tree.
  • Step S235: Calculate the optimal partition of all current leaf nodes and its gain, and split at the leaf node with the largest gain and its corresponding split point, so that the sample disk data is divided into the child nodes.
  • The gain may be measured by the mean square error of the label values: take the square of the difference between each sample's label value and its predicted label value, and sum the squares over all samples. The more samples are predicted wrongly, the larger the mean square error, so the optimal branching basis can be found by minimizing the mean square error.
  • The decision tree may be a binary tree with each feature data as a root node. Each feature data corresponds to a feature value, and that feature value is a leaf node of the decision tree rooted at the feature data. After the leaf nodes of the decision tree are determined, they are divided further. Note that when the gains of several leaf nodes differ, the leaf node with the largest gain is split, so that all sample data can be divided into the corresponding leaf nodes.
  • For example, the sample disks are four disks A, B, C, and D, where disks A and B are normal disks and disks C and D are damaged disks. A normal disk corresponds to 0 and a failed disk corresponds to 1, so the four disks A, B, C, and D correspond to 0, 0, 1, 1, respectively.
  • The feature value of each disk in the first dimension, feature A, is obtained, and the sample disk data is trained by using the GBDT algorithm.
  • FIG. 3 is a schematic diagram of training the sample disk data by using the GBDT algorithm according to an embodiment of the present invention. As shown in FIG. 3, the default initial value is set to 0.5, that is, the probability that each disk is a failed disk is 0.5. The threshold of the first dimension is A0: disks whose feature value in the first dimension is greater than A0 are divided into one child node, and disks whose feature value is less than or equal to A0 are divided into the other child node. The probability that the disks of the two child nodes are failed disks is 0.5.
  • Extracting the plurality of feature data from the training data, creating a decision tree with each feature data as a root node, and using the corresponding feature value as a leaf node of the corresponding decision tree includes:
  • Step S2331: Read the threshold corresponding to any one of the feature data.
  • Step S2333: Compare the feature value of that feature data with the threshold, and obtain the entropy of the two branches according to the comparison result.
  • Step S2335: Determine two new nodes as the two leaf nodes of that feature data according to the entropy of the two branches.
  • Step S2337: Process each feature data by using the above steps until each feature data obtains its two predetermined unique leaf nodes.
  • Every threshold of every feature is tried exhaustively to find the feature and threshold that minimize the entropy of the two branches formed by "feature less than or equal to threshold" and "feature greater than threshold", and two new nodes are obtained by branching on that criterion. The same method is used to continue branching until all samples are split into leaf nodes containing only normal disks or only failed disks, or until the preset termination condition is reached. If a final leaf node does not contain only normal disks or only failed disks, the average label value of all samples on the node is used as the predicted label value of that leaf node.
  • the tag value is the probability that the disk is a failed disk.
  • Minimizing the entropy means making the ratio of positive to negative samples in each branch as far from 1:1 as possible. Entropy is lowest when a branch contains only positive samples or only negative samples, that is, only failed disks or only normal disks.
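  • The exhaustive threshold search by entropy described above can be sketched as follows. The sketch scores a split by the size-weighted binary entropy of its two branches; the weighting, the function names, and the feature values are assumptions for illustration, since the text only says that the entropy of the two branches is minimized.

```python
import math


def entropy(labels):
    """Binary entropy of a list of 0/1 labels (1 = failed disk)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    if p in (0.0, 1.0):
        return 0.0  # a pure branch has zero entropy
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def best_threshold(values, labels):
    """Try every observed value as a threshold; return (threshold, weighted entropy)."""
    best = None
    for t in sorted(set(values))[:-1]:  # the largest value would leave one branch empty
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        if best is None or score < best[1]:
            best = (t, score)
    return best


# Disks A, B (normal) and C, D (failed) with made-up feature values.
split = best_threshold([1, 2, 8, 9], [0, 0, 1, 1])
```

  • With these values the search settles on the threshold that separates A, B from C, D, where both branches are pure.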
  • Each node obtains a predicted value equal to the average of all the label values belonging to the node. To divide a node, every threshold of every feature is tried exhaustively to find the best split point, until the label value of the samples on each leaf node is unique or the preset termination condition is reached. If the label values of the samples on a final leaf node are not unique, the average label value of all samples on the node is used as the predicted label value of that leaf node.
  • Here the optimal partitioning criterion is no longer to minimize the entropy but to minimize the mean square error: take the square of the difference between each sample's label value and its predicted label value, and sum the squares over all samples. The more samples are predicted wrongly, the greater the mean square error, so the optimal branching basis can be found by minimizing the mean square error.
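  • The mean-square-error criterion can be sketched in the same exhaustive style. Each branch predicts the average label value of its samples, and the split with the smallest total squared error is kept. The function names and the feature values are hypothetical; the labels are the residuals -0.5 and 0.5 from the four-disk example.

```python
def squared_error(labels):
    """Sum of squared differences between each label and the branch average."""
    if not labels:
        return 0.0
    mean = sum(labels) / len(labels)
    return sum((y - mean) ** 2 for y in labels)


def best_split_mse(values, labels):
    """Return (threshold, total squared error, left average, right average)."""
    best = None
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        error = squared_error(left) + squared_error(right)
        if best is None or error < best[1]:
            best = (t, error, sum(left) / len(left), sum(right) / len(right))
    return best


# Made-up feature values for disks A, B, C, D and their residual labels.
result = best_split_mse([10, 20, 80, 90], [-0.5, -0.5, 0.5, 0.5])
```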
  • A termination condition can be preset in order to obtain the prediction result closest to the real situation. For example, the termination condition can be an upper limit on the leaves.
  • The method further comprises adjusting the classification model parameters, where the classification model parameters include failed disk samples and non-failed disk samples. In this case, when determining whether the disk to be tested is a failed disk, the proportion of failed disk samples in the classification model parameters is increased.
  • After the disk data of the disk to be tested is received, processing that disk data by using the disk prediction model composed of a plurality of decision trees to determine whether the disk to be tested is a failed disk includes:
  • Step S251: After receiving the disk data of the disk to be tested, assign an initial value to the disk data of the disk to be tested.
  • Step S253: Traverse each decision tree starting from the initial value of the disk to be tested: calculate the prediction result determined by the first decision tree and a first residual, and assign the first residual to the initial value to obtain the updated initial value.
  • Step S255: Using the updated initial value, calculate the prediction result determined by the second decision tree and a second residual, and assign the second residual to the updated initial value; continue in this way through all the decision trees to obtain the result of predicting whether the disk to be tested is a failed disk.
  • Step S257: Each tree learns the residual of the conclusions of all previous trees, where the residual is the amount that, added to the predicted value, gives the true value.
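  • Steps S251 to S257 can be read as starting from an initial value and adding the output of each tree in turn. The following sketch uses one-level trees (stumps) for brevity; the `Stump` class, the feature names, the thresholds, the leaf values, and the 0.5 decision cutoff are all assumptions for illustration, not values from the patent.

```python
from dataclasses import dataclass


@dataclass
class Stump:
    """A one-split tree: outputs high_value above the threshold, else low_value."""
    feature: str
    threshold: float
    low_value: float
    high_value: float

    def output(self, disk):
        return self.high_value if disk[self.feature] > self.threshold else self.low_value


def predict_failure(disk, stumps, initial=0.5, cutoff=0.5):
    """Start from the initial value and add each tree's residual estimate in turn."""
    score = initial
    for stump in stumps:
        score += stump.output(disk)
    return score, score > cutoff


# Hypothetical model and disk reading.
stumps = [
    Stump("reallocated_sectors", 50, -0.25, 0.25),
    Stump("temperature", 45, -0.125, 0.125),
]
disk = {"reallocated_sectors": 120, "temperature": 40}
score, is_failed = predict_failure(disk, stumps)
```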
  • Taking the four disks A, B, C, and D as an example again, feature A can be used to divide them into two parts, A, B and C, D, and each part uses its average label value as the predicted value.
  • FIG. 4 is a schematic diagram of calculating a predicted value of a disk using the GBDT algorithm according to an embodiment of the present invention. As shown in FIG. 4, the residuals replace the original values of A, B, C, and D and are input to the second decision tree.
  • The second tree has only two values, 0.5 and -0.5, so it splits directly into two nodes. At this point the residual of every disk is 0, that is, every disk obtains its true value as the prediction.
  • A decision tree can be obtained according to the amount of sample data, and the predicted value is the sum over all the previous trees. In this embodiment there is only one decision tree before it, so the value is directly 0.5. If there were other decision trees, their outputs would need to be added together as the predicted value of A.
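  • The four-disk example can be worked through numerically. The labels (0, 0, 1, 1) and the initial value 0.5 come from the text; the feature values and the threshold standing in for A0 are made up for illustration.

```python
# Labels from the text: A and B are normal (0), C and D are failed (1).
labels = {"A": 0.0, "B": 0.0, "C": 1.0, "D": 1.0}
feature = {"A": 10, "B": 20, "C": 80, "D": 90}  # hypothetical feature values
threshold = 50                                   # hypothetical stand-in for A0
initial = 0.5                                    # default initial value from the text

# Residual = true value minus current prediction.
residuals = {d: y - initial for d, y in labels.items()}


def leaf_mean(disks):
    """Average residual of the disks that fall into one leaf."""
    return sum(residuals[d] for d in disks) / len(disks)


low = [d for d in labels if feature[d] <= threshold]
high = [d for d in labels if feature[d] > threshold]
low_value, high_value = leaf_mean(low), leaf_mean(high)

# Final prediction = initial value + output of the residual tree.
prediction = {
    d: initial + (high_value if feature[d] > threshold else low_value) for d in labels
}
remaining = {d: labels[d] - prediction[d] for d in labels}
```

  • The residual tree takes only the two values -0.5 and 0.5, and after it is added every remaining residual is 0, as the text states.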
  • FIG. 5 is a flowchart of an optional fault prediction method for a magnetic disk according to an embodiment of the present invention. A preferred embodiment of the present application is described in detail below with reference to FIG. 5.
  • the method may include the following steps S51 to S57:
  • the sample disk data can be obtained by using software such as HDTune or CrystalDiskInfo.
  • The difference operation yields the value obtained by subtracting the feature data of the disk 24 hours earlier from the feature data of the disk at a given time.
  • S55: The first step of training and prediction raises the recall rate.
  • When the proportion of negative samples in the training data is large and the proportion of positive samples is small, for example a ratio of 1000:1, a model trained on all the training data can accurately predict very few positive samples. Because the training data contains few positive samples, many samples whose true value is negative may be misjudged as positive. The first training step therefore aims to make the recall rate of positive samples higher.
  • The training data predicted as positive in the first step is used as the training data of the second step, that is, the samples that are close to positive samples are selected as training samples, so that the trained model predicts positive samples better. As a result, the accuracy on positive samples in the second-step prediction is much higher than in the first step, and accuracy and recall reach a certain balance.
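  • The two-step idea can be illustrated with a deliberately simple stand-in learner. Instead of GBDT, each step below fits a single cutoff on one score; the first step penalizes missed failures heavily (high recall), and the second step is fitted only on the samples the first step flagged. The learner, the cost weights, and the data are assumptions for illustration.

```python
def fit_cutoff(samples, miss_cost):
    """Pick the cutoff minimizing false_positives + miss_cost * false_negatives.

    samples: list of (value, label) with label 1 for a failed disk.
    A disk is predicted as failed when value >= cutoff.
    """
    best_cutoff, best_cost = None, None
    for cutoff in sorted({v for v, _ in samples}):
        fp = sum(1 for v, y in samples if v >= cutoff and y == 0)
        fn = sum(1 for v, y in samples if v < cutoff and y == 1)
        cost = fp + miss_cost * fn
        if best_cost is None or cost < best_cost:
            best_cutoff, best_cost = cutoff, cost
    return best_cutoff


samples = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 0), (6, 0), (7, 1), (8, 1)]

# Step 1: missing a failed disk is very costly, so recall comes first.
stage1 = fit_cutoff(samples, miss_cost=100)
flagged = [(v, y) for v, y in samples if v >= stage1]

# Step 2: retrain only on the samples step 1 predicted as positive.
stage2 = fit_cutoff(flagged, miss_cost=1)


def predict(value):
    """A disk is reported as failed only if both steps flag it."""
    return value >= stage1 and value >= stage2
```

  • In this toy data the first step catches all three failed disks but also flags two healthy ones; the second step removes those false alarms at the cost of one missed failure, which mirrors the balance between accuracy and recall described above.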
  • FIG. 6 is a schematic structural diagram of a fault prediction apparatus for a magnetic disk according to an embodiment of the present invention. As shown in FIG. 6, the apparatus includes an acquisition module 60, a training module 62, and a processing module 64.
  • the obtaining module 60 is configured to acquire sample disk data of the disk by using a disk monitoring technology, where the sample disk data includes sample data in multiple dimensions;
  • the training module 62 is configured to perform sample training on the sample disk data by using a GBDT algorithm to obtain a disk prediction model composed of multiple decision trees;
  • the processing module 64 is configured to, after receiving the disk data of the disk to be tested, process the disk data of the disk to be tested by using the disk prediction model composed of multiple decision trees, and determine whether the disk to be tested is a failed disk.
  • The acquisition module 60, the training module 62, and the processing module 64 correspond to steps S21 to S25 of the first embodiment; the examples and application scenarios they implement are the same as those of the corresponding steps, but are not limited to the content disclosed in the first embodiment. Note that these modules can run in the computer terminal 10 provided in the first embodiment as part of the apparatus.
  • The sample disk data is SMART disk data, where the sample disk data includes at least sample data in four dimensions: original value, standard value, worst value, and cumulative value.
  • the device further includes:
  • the operation module 70 is configured to perform any one or more of the following operations on the sample data in each dimension: a difference operation, a square operation, and a distribution sum operation, so that the sample data in any one dimension is expanded into sample data in new dimensions.
  • The operation module 70 implements the same examples and application scenarios as step S21 to step S25 in the first embodiment, but is not limited to the content disclosed in the first embodiment. Note that this module can run in the computer terminal 10 provided in the first embodiment as part of the apparatus.
  • the training module 62 further includes:
  • the initial module 80 is configured to use sample disk data of all disks as training data, and initialize a classification model parameter of the training data by using a default value;
  • the extracting module 82 is configured to extract a plurality of feature data from the training data, create a decision tree with each feature data as a root node, and use the feature value corresponding to each feature data as a leaf node of the corresponding decision tree;
  • the first calculating module 84 is configured to calculate an optimal partition of all current leaf nodes and a gain thereof, and perform splitting with the leaf node with the largest gain and the corresponding split point, so that the sample disk data is divided into the child nodes.
  • the initial module 80, the extraction module 82, and the first calculation module 84 are the same as the application scenarios implemented in steps S231 to S235 of the embodiment, but are not limited to the foregoing embodiment. Public content. It should be noted that the above module can be operated as part of the device in the computer terminal 10 provided in the first embodiment.
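The search for the split with the largest gain can be sketched as follows. The disclosure does not state how the gain is computed, so this sketch assumes the common regression-tree criterion (reduction in the sum of squared errors); the helper names `sse` and `best_split` and the sample values are hypothetical.

```python
def sse(targets):
    """Sum of squared errors around the mean; 0.0 for an empty list."""
    if not targets:
        return 0.0
    mean = sum(targets) / len(targets)
    return sum((t - mean) ** 2 for t in targets)


def best_split(values, targets):
    """Return (gain, threshold) of the best split on one feature, or (0.0, None)."""
    best = (0.0, None)
    parent = sse(targets)
    # Every distinct value except the largest is a candidate split point.
    for threshold in sorted(set(values))[:-1]:
        left = [t for v, t in zip(values, targets) if v <= threshold]
        right = [t for v, t in zip(values, targets) if v > threshold]
        gain = parent - sse(left) - sse(right)
        if gain > best[0]:
            best = (gain, threshold)
    return best


# Hypothetical feature values and training targets for the samples in one leaf.
gain, threshold = best_split([1, 2, 8, 9], [0.0, 0.0, 1.0, 1.0])
```

Running the same search over every current leaf and splitting the leaf whose best gain is largest gives the leaf-wise growth that the first calculating module describes.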
  • the extracting module 82 includes:
  • the reading module 90 is configured to read a threshold corresponding to any one of the feature data;
  • the comparing module 92 is configured to compare the feature value of that feature data with the threshold, and obtain the entropy of the two branches according to the comparison result;
  • the determining module 94 is configured to determine, according to the entropy of the two branches, two new nodes as the two leaf nodes of that feature data;
  • the processing sub-module 96 is configured to process each feature data by using the above steps until each feature data obtains two predetermined unique leaf nodes.
  • the foregoing reading module 90, the comparing module 92, the determining module 94, and the processing sub-module 96 correspond to steps S2331 to S2337 of the first embodiment; the examples and application scenarios they implement are the same as those of the corresponding steps, but are not limited to the content disclosed in the first embodiment. It should be noted that the above modules can run as part of the device in the computer terminal 10 provided in the first embodiment.
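The threshold comparison and the entropy of the two resulting branches can be sketched in Python. The sketch assumes binary labels (1 for a failed disk, 0 otherwise) and Shannon entropy in bits; the function names and sample data are hypothetical and only illustrate the kind of computation the reading, comparing, and determining modules describe.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of 0/1 labels; 0.0 if empty or pure."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def split_by_threshold(values, labels, threshold):
    """Compare each feature value with the threshold and score both branches."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    return (left, entropy(left)), (right, entropy(right))


# Hypothetical feature values with failure labels (1 = failed disk).
values = [1, 2, 8, 9]
labels = [0, 0, 1, 1]
(left, h_left), (right, h_right) = split_by_threshold(values, labels, threshold=5)
```

A branch with entropy 0.0 contains only one class, so in this example the threshold separates failed and non-failed samples cleanly.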
  • the method further comprises: adjusting the classification model parameters, wherein, in the case where the classification model parameters include failed disk samples and non-failed disk samples, if it is to be determined whether the disk to be tested is a failed disk, the proportion of the failed disk samples in the classification model parameters is increased.
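One way to raise the proportion of failed disk samples is simple oversampling, sketched below. The disclosure does not say how the proportion is increased, so duplication of failed samples is an assumption made here for illustration (class weights would be another option); `oversample_failed`, `factor`, and the sample data are hypothetical.

```python
def oversample_failed(samples, factor):
    """Repeat each failed-disk sample `factor` times; keep other samples once."""
    balanced = []
    for features, is_failed in samples:
        copies = factor if is_failed else 1
        balanced.extend([(features, is_failed)] * copies)
    return balanced


# Hypothetical (features, label) pairs; label 1 marks a failed disk.
samples = [([5, 0], 0), ([7, 1], 0), ([90, 40], 1)]
balanced = oversample_failed(samples, factor=3)
failed_share = sum(label for _, label in balanced) / len(balanced)
```

In this example the share of failed samples rises from one in three to three in five, which makes the rare failure class weigh more heavily during training.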
  • the processing module 64 includes:
  • the receiving module 100 is configured to: after receiving the disk data of the disk to be tested, assign an initial value to the disk data of the disk to be tested;
  • the second calculating module 102 is configured to traverse each decision tree according to the initial value of the disk to be tested, calculate the prediction result and the first residual determined by the first decision tree, and assign the first residual to the initial value to obtain an updated initial value;
  • the traversing module 104 is configured to calculate, from the updated initial value, the prediction result and the second residual determined by the second decision tree, and assign the second residual to the updated initial value, and so on until all the decision trees have been traversed, to obtain the result of predicting whether the disk to be tested is a failed disk.
  • the receiving module 100, the second calculating module 102, and the traversing module 104 correspond to steps S251 to S255 of the first embodiment; the examples and application scenarios they implement are the same as those of the corresponding steps, but are not limited to the content disclosed in the first embodiment. It should be noted that the above modules can run as part of the device in the computer terminal 10 provided in the first embodiment.
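The traversal described for the processing module resembles standard additive GBDT inference, in which a running value starts from an initial value and each tree contributes a correction. The sketch below shows that standard technique rather than the exact procedure of the disclosure; the stump representation, the tree values, the sigmoid mapping, and the name `predict_failure` are all assumptions for illustration.

```python
import math

# Each hypothetical tree is a stump:
# (feature_index, threshold, value_if_less_or_equal, value_if_greater).
TREES = [
    (0, 50.0, -1.0, 2.0),
    (1, 10.0, -0.5, 1.5),
]


def predict_failure(features, trees=TREES, initial_value=0.0, learning_rate=1.0):
    """Traverse every tree in order and accumulate its correction."""
    score = initial_value
    for index, threshold, low, high in trees:
        correction = low if features[index] <= threshold else high
        # Fold this tree's output into the running value before the next tree.
        score += learning_rate * correction
    # Map the accumulated score to a probability of failure.
    probability = 1.0 / (1.0 + math.exp(-score))
    return score, probability


score_bad, prob_bad = predict_failure([90.0, 40.0])
score_ok, prob_ok = predict_failure([5.0, 0.0])
```

A disk would be reported as failed when the probability exceeds a chosen cutoff such as 0.5; the cutoff is likewise an assumption.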
  • Embodiments of the present invention may provide a computer terminal, which may be any one of computer terminal groups.
  • the foregoing computer terminal may also be replaced with a terminal device such as a mobile terminal.
  • the computer terminal may be located in at least one network device of the plurality of network devices of the computer network.
  • the computer terminal may execute the program code of the following steps in the disk failure prediction method: acquiring sample disk data of the disk by using a disk monitoring technology, wherein the sample disk data includes sample data in multiple dimensions; performing sample training on the sample disk data by using the GBDT algorithm to obtain a disk prediction model composed of multiple decision trees; and, after receiving the disk data of the disk to be tested, processing the disk data of the disk to be tested by using the disk prediction model composed of multiple decision trees to determine whether the disk to be tested is a failed disk.
  • FIG. 11 is a structural block diagram of a computer terminal according to an embodiment of the present invention.
  • the computer terminal A may include one or more (only one shown in the figure) processor 111, memory 113, and transmission device 115.
  • the memory can be used to store software programs and modules, such as the program instructions/modules corresponding to the disk failure prediction method and device in the embodiments of the present invention; the processor executes various functional applications and data processing by running the software programs and modules stored in the memory, that is, implements the above-described disk failure prediction method.
  • the memory may include a high speed random access memory, and may also include non-volatile memory such as one or more magnetic storage devices, flash memory, or other non-volatile solid state memory.
  • the memory can further include memory remotely located relative to the processor, which can be connected to terminal A via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • the processor may call, through the transmission device, the information and the application stored in the memory to perform the following steps: the sample disk data is SMART disk data, wherein the sample disk data includes at least sample data in the following four dimensions: original value, standard value, worst value, and cumulative value.
  • the foregoing processor may further execute the following program code: performing any one or more of the following operations on the sample data in each dimension: a difference operation, a square operation, and a distribution sum operation, so that the sample data in any one dimension is expanded into sample data in a new dimension.
  • the foregoing processor may further execute the following program code: using sample disk data of all disks as training data, and initializing a classification model parameter of the training data by using a default value; extracting multiple feature data from the training data, creating multiple decision trees with each feature data as a root node, and using the feature value corresponding to each feature data as a leaf node of the corresponding decision tree; and calculating the optimal partition of all current leaf nodes and its gain, and splitting the leaf node with the largest gain at the corresponding split point, so that the sample disk data is divided among the child nodes.
  • the processor may further execute the following program code: read a threshold corresponding to any one of the feature data; compare the feature value of any one of the feature data with a threshold, and obtain an entropy of the two branches according to the comparison result; Two new nodes are determined as two leaf nodes of any one of the feature data according to the entropy of the two branches; each feature data is processed by the above steps until each feature data obtains two predetermined unique leaf nodes.
  • the foregoing processor may further execute the following program code: after obtaining the disk prediction model composed of multiple decision trees, adjusting the classification model parameters, wherein, in the case where the classification model parameters include failed disk samples and non-failed disk samples, if it is to be determined whether the disk to be tested is a failed disk, the proportion of the failed disk samples in the classification model parameters is increased.
  • the foregoing processor may further execute the following program code: after receiving the disk data of the disk to be tested, assigning an initial value to the disk data of the disk to be tested; traversing each decision tree according to the initial value of the disk to be tested, calculating the prediction result and the first residual determined by the first decision tree, and assigning the first residual to the initial value to obtain an updated initial value; and calculating, from the updated initial value, the prediction result and the second residual determined by the second decision tree, and assigning the second residual to the updated initial value, and so on until all the decision trees have been traversed, to obtain the result of predicting whether the disk to be tested is a failed disk.
  • in the embodiments of the present invention, sample disk data of the disk is acquired by using a disk monitoring technology, wherein the sample disk data includes sample data in multiple dimensions; the sample disk data is sample-trained by using the GBDT algorithm to obtain a disk prediction model composed of multiple decision trees; and, after the disk data of the disk to be tested is received, the disk data is processed by using the disk prediction model composed of multiple decision trees to determine whether the disk to be tested is a failed disk. This achieves the technical effect of predicting the failure state of a disk, and thereby solves the technical problem in prior-art hard disk failure prediction systems that prediction results are inaccurate because some factors that readily cause hard disk failure cannot be collected or quantified.
  • FIG. 11 is only an illustration, and the computer terminal can also be a terminal device such as a smart phone (such as an Android mobile phone or an iOS mobile phone), a tablet computer, a palmtop computer, a mobile Internet device (MID), or a PAD.
  • FIG. 11 does not limit the structure of the above electronic device.
  • computer terminal A may also include more or fewer components (such as a network interface, display device, etc.) than shown in FIG. 11, or have a different configuration than that shown in FIG.
  • Embodiments of the present invention also provide a storage medium.
  • the foregoing storage medium may be used to save the program code executed by the fault prediction method of the disk provided in the first embodiment.
  • the foregoing storage medium may be located in any one of the computer terminal groups in the computer network, or in any one of the mobile terminal groups.
  • the storage medium is configured to store program code for performing the following steps: acquiring sample disk data of the disk by using a disk monitoring technology, wherein the sample disk data includes sample data in multiple dimensions; performing sample training on the sample disk data by using the GBDT algorithm to obtain a disk prediction model composed of multiple decision trees; and, after receiving the disk data of the disk to be tested, processing the disk data of the disk to be tested by using the disk prediction model composed of multiple decision trees to determine whether the disk to be tested is a failed disk.
  • the storage medium is further configured to store program code for performing the following steps: performing any one or more of the following operations on the sample data in each dimension: a difference operation, a square operation, and a distribution sum operation, so that the sample data in any one dimension is expanded into sample data in a new dimension.
  • the storage medium is further configured to store program code for performing the following steps:
  • the sample disk data of all disks is used as training data, and the classification model parameters of the training data are initialized by default values; multiple feature data are extracted from the training data, multiple decision trees are created with each feature data as a root node, and the feature value corresponding to each feature data is used as a leaf node of the corresponding decision tree; the optimal partition of all current leaf nodes and its gain are calculated, and the leaf node with the largest gain is split at the corresponding split point, so that the sample disk data is divided among the child nodes.
  • the storage medium is further configured to store program code for performing the following steps: reading a threshold corresponding to any one of the feature data; comparing the feature value of that feature data with the threshold, and obtaining the entropy of the two branches according to the comparison result; determining, according to the entropy of the two branches, two new nodes as the two leaf nodes of that feature data; and processing each feature data by the above steps until each feature data obtains two predetermined unique leaf nodes.
  • the storage medium is further configured to store program code for performing the following steps: after obtaining the disk prediction model composed of multiple decision trees, adjusting the classification model parameters, wherein, in the case where the classification model parameters include failed disk samples and non-failed disk samples, if it is to be determined whether the disk to be tested is a failed disk, the proportion of the failed disk samples in the classification model parameters is increased.
  • the foregoing storage medium is further configured to store program code for performing the following steps: after receiving the disk data of the disk to be tested, assigning an initial value to the disk data of the disk to be tested; traversing each decision tree according to the initial value of the disk to be tested, calculating the prediction result and the first residual determined by the first decision tree, and assigning the first residual to the initial value to obtain an updated initial value; and calculating, from the updated initial value, the prediction result and the second residual determined by the second decision tree, and assigning the second residual to the updated initial value, and so on until all the decision trees have been traversed, to obtain the result of predicting whether the disk to be tested is a failed disk.
  • the disclosed technical content can be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, unit or module, and may be electrical or otherwise.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium.
  • the technical solution of the present invention, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention.
  • the foregoing storage medium includes: a U disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and the like.

Abstract

Disclosed are a disk failure prediction method and apparatus. The method comprises: acquiring sample disk data of a disk through disk monitoring technology, the sample disk data comprising sample data on a plurality of dimensions (S21); performing sample training on the sample disk data by using a GBDT algorithm, to obtain a disk prediction model consisting of a plurality of decision-making trees (S23); and after disk data of a disk to be predicted is received, processing the disk data of the disk to be predicted by using the disk prediction model consisting of the plurality of decision-making trees, and determining whether the disk to be predicted is a failed disk (S25). The method solves the technical problem in the prior art of an inaccurate prediction result caused by the fact that some factors resulting in hard disk failures cannot be collected or quantized in a hard disk failure prediction system.

Description

Disk failure prediction method and device
Technical field
The present invention relates to the field of magnetic disks, and in particular to a disk failure prediction method and device.
Background art
At present, the hard disk is the main medium for storing data, and a hard disk failure can cause a huge loss of data. Ensuring the stability of hard disks is therefore very important. Under normal conditions, the probability that a hard disk fails within 24 hours is about one in ten thousand; when a server has ten hard disks, the probability that one of the server's hard disks fails rises to about one in a thousand. As websites and other services grow, servers need more and more hard disks, and the probability that multiple hard disks fail at the same time also increases.
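The step from one in ten thousand per disk to roughly one in a thousand per server can be checked with a short calculation. The sketch assumes that disks fail independently of one another, which the text does not state; the function name `any_disk_fails` is hypothetical.

```python
def any_disk_fails(p_single, n_disks):
    """Probability that at least one of n independent disks fails."""
    return 1.0 - (1.0 - p_single) ** n_disks


# Per-disk failure probability of 1 in 10,000 over 24 hours, ten disks per server.
p_server = any_disk_fails(1e-4, 10)
```

Under the independence assumption the result is slightly below one in a thousand, consistent with the approximate figure given above.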
Data is usually stored with multiple backups; for example, MySQL uses primary and standby databases, and GFS files have three replicas by default. On a large-scale data storage platform, if multiple hard disks fail at the same time, the probability that those disks hold replicas of the same file is high; that is, the simultaneous failure of multiple hard disks causes some files to be lost. Most online services depend on the massive amounts of data stored on servers, so a hard disk failure can make such services abnormal or even suspend them.
For these reasons, a system is needed that can predict in advance which hard disks will fail and where data may be lost. Hard disk failures have many causes, the most common being external vibration, temperature and humidity, damage to electrical components, sound, and dust. Some of these factors can be collected, such as temperature, humidity, and some component data, but more of the data cannot be collected or quantified, which makes prediction results inaccurate.
No effective solution has yet been proposed for the problem that, in prior-art hard disk failure prediction systems, prediction results are inaccurate because some factors that readily cause hard disk failure cannot be collected or quantified.
Summary of the invention
Embodiments of the present invention provide a disk failure prediction method and device, to at least solve the technical problem in prior-art hard disk failure prediction systems that prediction results are inaccurate because some factors that readily cause hard disk failure cannot be collected or quantified.
According to one aspect of the embodiments of the present invention, a disk failure prediction method is provided, including: acquiring sample disk data of a disk by using a disk monitoring technology, wherein the sample disk data includes sample data in multiple dimensions; performing sample training on the sample disk data by using the GBDT algorithm to obtain a disk prediction model composed of multiple decision trees; and, after receiving the disk data of a disk to be tested, processing the disk data of the disk to be tested by using the disk prediction model composed of multiple decision trees to determine whether the disk to be tested is a failed disk.
According to another aspect of the embodiments of the present invention, a disk failure prediction device is also provided, including: acquiring sample disk data of a disk by using a disk monitoring technology, wherein the sample disk data includes sample data in multiple dimensions; performing sample training on the sample disk data by using the GBDT algorithm to obtain a disk prediction model composed of multiple decision trees; and, after receiving the disk data of a disk to be tested, processing the disk data of the disk to be tested by using the disk prediction model composed of multiple decision trees to determine whether the disk to be tested is a failed disk.
In the embodiments of the present invention, sample disk data of a disk is acquired by using a disk monitoring technology, wherein the sample disk data includes sample data in multiple dimensions; the sample disk data is sample-trained by using the GBDT algorithm to obtain a disk prediction model composed of multiple decision trees; and, after the disk data of the disk to be tested is received, the disk data is processed by using the disk prediction model composed of multiple decision trees to determine whether the disk to be tested is a failed disk. This achieves the technical effect of predicting the failure state of a disk, and thereby solves the technical problem in prior-art hard disk failure prediction systems that prediction results are inaccurate because some factors that readily cause hard disk failure cannot be collected or quantified.
Brief description of the drawings
The drawings described herein are provided for a further understanding of the present invention and form a part of the present application. The illustrative embodiments of the present invention and their description are used to explain the present invention and do not unduly limit it. In the drawings:
FIG. 1 is a hardware block diagram of a computer terminal for a disk failure prediction method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a disk failure prediction method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of training sample disk data by using the GBDT algorithm according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of calculating a disk prediction value by using the GBDT algorithm according to an embodiment of the present invention;
FIG. 5 is a flowchart of an optional disk failure prediction method according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a disk failure prediction device according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an optional disk failure prediction device according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of an optional disk failure prediction device according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of an optional disk failure prediction device according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of an optional disk failure prediction device according to an embodiment of the present invention; and
FIG. 11 is a structural block diagram of a computer terminal according to an embodiment of the present invention.
Detailed description
To enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, and not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the above drawings of the present invention are used to distinguish similar objects, and are not necessarily used to describe a particular order or sequence. It should be understood that data so used may be interchanged where appropriate, so that the embodiments of the present invention described herein can be implemented in an order other than those illustrated or described herein. In addition, the terms "include" and "have" and any variations thereof are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to such a process, method, product, or device.
Embodiment 1
According to an embodiment of the present invention, an embodiment of a disk failure prediction method is also provided. It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from the one described here.
The method embodiment provided in Embodiment 1 of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a computer terminal as an example, FIG. 1 is a hardware block diagram of a computer terminal for a disk failure prediction method according to an embodiment of the present invention. As shown in FIG. 1, the computer terminal 10 may include one or more (only one is shown in the figure) processors 102 (the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission module 106 for communication functions. Those of ordinary skill in the art will understand that the structure shown in FIG. 1 is merely illustrative and does not limit the structure of the above electronic device. For example, the computer terminal 10 may include more or fewer components than those shown in FIG. 1, or have a configuration different from that shown in FIG. 1.
The memory 104 may be used to store software programs and modules of application software, such as the program instructions/modules corresponding to the disk failure prediction method in the embodiments of the present invention. The processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104, that is, implements the disk failure prediction method described above. The memory 104 may include high-speed random access memory, and may also include non-volatile memory such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory may be connected to the computer terminal 10 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or send data via a network. Specific examples of the network may include a wireless network provided by the communication provider of the computer terminal 10. In one example, the transmission device 106 includes a Network Interface Controller (NIC), which can be connected to other network devices through a base station so as to communicate with the Internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used to communicate with the Internet wirelessly.
In the above operating environment, the present application provides the disk failure prediction method shown in FIG. 2. FIG. 2 is a flowchart of a disk failure prediction method according to an embodiment of the present invention.
It should be noted that, for brevity, the foregoing method embodiments are each described as a series of combined actions, but those skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. In addition, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
From the description of the above embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disk) and includes a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present invention.
As shown in FIG. 2, the disk failure prediction method according to Embodiment 1 of the present invention includes:
Step S21: obtain sample disk data of disks by using a disk monitoring technology, where the sample disk data includes sample data in multiple dimensions.
In the above step, the disk monitoring technology monitors the various disk data generated while a disk is in use after leaving the factory, in order to predict the failure state of the disk. This lets the disk user learn that a failure is imminent before it happens, copy the data on the disk to other storage, and avoid data loss.
In an optional embodiment, the sample disk data may include: the raw read error rate, start/stop count, reallocated sector count, power-on hours, spin retry count, calibration retry count, power cycle count, temperature, and write error rate. The sample disk data may be obtained according to the disks' failure history. For example, samples may be collected at a positive-to-negative ratio of 1:5, where a positive sample is a disk with a failure and a negative sample is a disk without one.
It should be noted here that, when sample disk data is obtained by the disk monitoring technology, the organizations that predict disk failures do not necessarily use the same disks, and environmental factors such as temperature and humidity affect disks differently at each organization, so the proportion of good to bad disks differs between organizations. To make the sample disk data more reliable for training, it may also be collected according to an organization's actual disk failure situation.
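The 1:5 sampling described above can be sketched as follows. This is a minimal illustration only: the function name `sample_by_ratio`, the `(disk_id, failed)` record layout, and the fleet of 100 disks are assumptions made for the example, not part of the described method.

```python
import random

def sample_by_ratio(disks, neg_per_pos=5, seed=0):
    """Keep every faulty disk and draw `neg_per_pos` normal disks per faulty one.

    `disks` is a list of (disk_id, failed) pairs; `failed` is True for a
    positive sample (faulty disk) and False for a negative sample.
    """
    rng = random.Random(seed)
    positives = [d for d in disks if d[1]]
    negatives = [d for d in disks if not d[1]]
    # Cap the request at the number of normal disks actually available.
    wanted = min(len(negatives), neg_per_pos * len(positives))
    return positives + rng.sample(negatives, wanted)

# Hypothetical fleet: 3 faulty disks among 100.
fleet = [(f"disk{i}", i < 3) for i in range(100)]
subset = sample_by_ratio(fleet)
```

An organization with a different failure rate could pass a different `neg_per_pos`, which is one way to reflect the point that the ratio should follow the actual failure situation.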
Step S23: perform sample training on the sample disk data using the GBDT algorithm to obtain a disk prediction model composed of multiple decision trees.
In the above step, GBDT (Gradient Boosting Decision Tree) is an iterative decision tree algorithm. It consists of multiple decision trees, and the final result is obtained by summing the conclusions of all the trees. A decision tree, as a prediction model, makes each level of decision on the basis of the result of the level above. It includes elements such as decision points, state nodes, and result nodes; each node in the tree represents an object being predicted, and each branch path represents a possible attribute of that object.
In an optional embodiment, where the sample disk data is the raw S.M.A.R.T. value of a disk, the sample disks are used for training as follows. When the raw value is greater than or equal to a preset raw value, the sample disk is considered more likely to fail; when the raw value is less than the preset raw value, the sample disk is considered less likely to fail. Accordingly, when the disk prediction model is determined, a sample disk whose raw value is greater than or equal to the preset raw value is labeled as faulty, and a sample disk whose raw value is less than the preset raw value is labeled as non-faulty. A disk prediction model with this decision capability is thus established: when a disk to be tested is input to the decision tree, the tree automatically classifies it as faulty if its raw value is greater than or equal to the preset raw value, and as non-faulty if its raw value is less than the preset raw value.
Step S25: after receiving disk data of a disk to be tested, process the disk data of the disk to be tested using the disk prediction model composed of multiple decision trees, and determine whether the disk to be tested is a faulty disk.
In an optional embodiment, the values of the sample disks in multiple dimensions are used as the evaluation indicators of decision trees to obtain multiple decision trees, the multiple decision trees then form one disk prediction model, and the model is used to test the disk to be tested.
It is worth noting here that the decision trees obtained from the individual dimensions of a disk may or may not be the same. Therefore, when multiple decision trees are combined into a disk prediction model, the weight of each decision tree must be determined according to its importance in the evaluation system.
It should be noted here that using disk monitoring technology to obtain the sample disk data makes the acquisition simpler and the acquired data more comprehensive, which provides rich disk sample data for training. In the above steps, the sample training with the GBDT algorithm may be carried out in two or more rounds, to improve the precision and recall of the disk prediction model formed by the resulting decision trees.
The solution of Embodiment 1 provided by the present application thereby solves the technical problem in prior-art hard disk failure prediction systems that prediction results are inaccurate because some factors that readily cause hard disk failures cannot be collected or quantified.
According to the above embodiment of the present application, in a preferred solution, the sample disk data includes sample data in at least the following four dimensions: raw value, standard value, worst value, and cumulative value.
The raw value is the current value of a parameter while the disk is running. The standard value is the value of each parameter when a normal disk is running. The worst value is the abnormal value, among those a monitored parameter has ever taken while the disk was running, that deviates most from the normal value. The cumulative value is the accumulated result of each monitored parameter from the time the disk was put into use up to the current moment.
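To make the four dimensions concrete, the sketch below tracks them for a single monitored parameter, following only the definitions given above. The class name `AttributeHistory`, the normal value of 35, and the readings are hypothetical, and the sketch does not claim to reproduce how drive firmware computes the fields it reports.

```python
class AttributeHistory:
    """Tracks the four dimensions of one monitored disk parameter."""

    def __init__(self, standard):
        self.standard = standard   # value expected on a normal disk
        self.raw = standard        # most recent reading
        self.worst = standard      # reading furthest from `standard` so far
        self.cumulative = 0        # running total of all readings

    def update(self, reading):
        self.raw = reading
        # Replace the worst value only when the new reading deviates more.
        if abs(reading - self.standard) > abs(self.worst - self.standard):
            self.worst = reading
        self.cumulative += reading

    def as_features(self):
        """Return [raw, standard, worst, cumulative] for model input."""
        return [self.raw, self.standard, self.worst, self.cumulative]

# Hypothetical temperature readings against an assumed normal value of 35.
temperature = AttributeHistory(standard=35)
for reading in (36, 52, 40):
    temperature.update(reading)
```

After the three updates, `temperature.as_features()` holds the latest reading, the normal value, the reading that deviated most, and the running total, in that order.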
In an optional embodiment, the parameters of a disk may be information describing the attributes of the disk. They may include one or more of the read error rate, power-on count, reallocated sector count, spin retry count, calibration retry count, and parity error rate, and may also include other attribute information of the disk.
In the above steps, multiple different decision trees can be obtained from the sample data in each of the above four dimensions.
In an optional embodiment, software such as HDTune or CrystalDiskInfo may be used to obtain the sample disk data.
According to the above embodiment of the present application, in a preferred solution, after the sample disk data of the disks is obtained by the disk monitoring technology, the method further includes:
Step S211: perform any one or more of the following operations on the sample data in each dimension: a difference operation, a square operation, and a distribution summation operation, so that the sample data in any one dimension is expanded into sample data in new dimensions.
In the above step, applying further operations allows the decision trees to be extended to new dimensions according to the results of the operations, and the sample data in those dimensions is obtained.
It is worth noting here that several operations can be applied to the sample data in each dimension to obtain sample data in further dimensions. Starting from four dimensions, applying the difference, square, and distribution summation operations to each of them gives sample data in sixteen dimensions, and the sample data in each dimension puts a different emphasis on the decision.
In an optional embodiment, again taking the sample data in the raw value dimension as an example, the difference, square, and distribution summation operations are applied to the raw value sample data, which gives sample data in four dimensions (the raw values and the three derived series). This sample data is used as decision indicators for training and yields four new decision trees.
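The expansion of one dimension into four can be sketched as below. The text does not define the distribution summation operation precisely, so the sketch reads it as a running total; that reading, the function name `expand_dimension`, the `lag` parameter, and the sample readings are all assumptions.

```python
from itertools import accumulate

def expand_dimension(series, lag=1):
    """Derive extra feature series from one series of readings.

    `series` holds one dimension sampled at a fixed interval; `lag` is how
    many samples back the difference looks.
    """
    # Difference: each reading minus the reading `lag` samples earlier.
    diff = [cur - prev for prev, cur in zip(series, series[lag:])]
    square = [v * v for v in series]
    # Assumption: "distribution summation" is read here as a running total.
    running_sum = list(accumulate(series))
    return {"raw": list(series), "diff": diff,
            "square": square, "sum": running_sum}

# Hypothetical raw values of one parameter at four sampling times.
features = expand_dimension([5, 5, 7, 12])
```

Applying the same function to each of the four base dimensions gives the sixteen series mentioned above. Note that the difference series is shorter than the input by `lag` entries, so a real pipeline would need to align the series before training.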
According to the above embodiment of the present application, in a preferred solution, performing sample training on the sample disk data using the GBDT algorithm to obtain a disk prediction model composed of multiple decision trees includes:
Step S231: use the sample disk data of all disks as training data, and initialize the classification model parameters for the training data with default values.
In the above step, initializing the classification model parameters may mean presetting the number of decision trees and the number of levels of each decision tree, that is, making a preliminary setting of the attributes of the decision trees.
Step S233: extract multiple pieces of feature data from the training data, create the multiple decision trees with each piece of feature data as a root node, and use the feature value corresponding to each piece of feature data as a leaf node of the corresponding decision tree.
Step S235: compute the optimal split of every current leaf node and its gain, and split the leaf node with the largest gain at the corresponding split point, so that the sample disk data is divided among the child nodes.
In the above step, the gain may be the minimized mean squared error of the label values: the predicted label value is subtracted from the label value of each sample, the difference is squared, and the squares are summed over all samples. The more samples are predicted wrongly, the larger the mean squared error, so minimizing it identifies the best basis for branching.
Each decision tree may be a binary tree with one piece of feature data as its root node. Each piece of feature data corresponds to a feature value, and that feature value is a leaf node of the decision tree rooted at the feature data. Once the leaf nodes of a decision tree have been determined, they are split further. It is worth noting here that, when the gains of the leaf nodes differ, the leaf node with the largest gain is split, so that all sample data can be assigned to the appropriate leaf nodes.
In an optional embodiment, take four sample disks A, B, C, and D as an example, where disks A and B are normal and disks C and D are damaged. In this example a normal disk is labeled 0 and a faulty disk is labeled 1, so disks A, B, C, and D are labeled 0, 0, 1, and 1 respectively. The feature value of these disks in the first dimension, feature A, is obtained, and the sample disk data is trained with the GBDT algorithm. FIG. 3 is a schematic diagram of training sample disk data with the GBDT algorithm according to an embodiment of the present invention. As shown in FIG. 3, the default initial value is set to 0.5, that is, each disk has a probability of 0.5 of being a faulty disk, and the threshold of the first dimension is A0. The disks whose feature value in the first dimension is greater than A0 are assigned to one child node, the disks whose feature value is less than or equal to A0 are assigned to the other child node, and the probability that a disk in either child node is a faulty disk is set to 0.5.
It should be noted here that, for ease of explanation, the above embodiment uses only four samples, so only two leaf nodes are obtained. In practice, after the root node has been split into two leaf nodes the splitting can continue, and the larger the amount of sample data, the more levels of splitting there are.
According to the above embodiment of the present application, in a preferred solution, extracting multiple pieces of feature data from the training data, creating the multiple decision trees with each piece of feature data as a root node, and using the feature value corresponding to each piece of feature data as a leaf node of the corresponding decision tree includes:
Step S2331: read the threshold corresponding to any one piece of feature data.
Step S2333: compare the feature value of that piece of feature data with the threshold, and obtain the entropy of the two branches from the comparison result.
Step S2335: determine, according to the entropy of the two branches, two new nodes as the two leaf nodes of that piece of feature data.
Step S2337: process every piece of feature data with the above steps until each piece of feature data has obtained its two predetermined unique leaf nodes.
In the above steps, every threshold of every feature is tried exhaustively to find the feature and threshold for which the two branches, one with feature values less than or equal to the threshold and one with feature values greater than it, have the lowest entropy. Branching on that criterion gives two new nodes. Branching continues in the same way until every sample has been placed in a leaf node that contains only normal disks or only faulty disks, or until a preset termination condition is reached. If a final leaf node does not contain only normal disks or only faulty disks, the average label value of all samples at that node is used as the predicted label value of the leaf node.
It should be noted here that the label value is the probability that the disk is a faulty disk.
It should further be noted that minimizing entropy means keeping the ratio of positive to negative samples in each branch as far from 1:1 as possible. Entropy is lowest when a branch contains only positive samples or only negative samples, that is, only faulty disks or only normal disks.
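The exhaustive search for the lowest-entropy split described above can be sketched as follows. The function names, the choice of observed feature values as candidate thresholds, the size-weighted average of the two branch entropies, and the sample data are assumptions made for illustration.

```python
import math

def entropy(labels):
    """Binary entropy in bits of a list of 0/1 labels (0 for an empty list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)

def best_entropy_split(rows, labels):
    """Try every feature and every observed value as a threshold.

    Returns (weighted_entropy, feature_index, threshold) for the split
    `value <= threshold` versus `value > threshold` with the lowest entropy.
    """
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue  # a split must send samples to both branches
            score = (len(left) * entropy(left) + len(right) * entropy(right)) / n
            if best is None or score < best[0]:
                best = (score, f, threshold)
    return best

# Hypothetical data: column 0 = reallocated sectors, column 1 = temperature.
rows = [(0, 35), (1, 41), (9, 36), (12, 40)]
labels = [0, 0, 1, 1]  # 1 = faulty disk
split = best_entropy_split(rows, labels)
```

With this data the search settles on column 0, because splitting there leaves one branch with only normal disks and the other with only faulty disks, which is the lowest-entropy case the text describes.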
In an optional embodiment, where the decision tree is a regression tree, every node has a predicted value equal to the average of all label values belonging to that node. When a node is split, every threshold of every feature is tried exhaustively to find the best split point, and splitting continues until the label values of the samples at each leaf node are all identical or a preset termination condition is reached. If the label values of the samples at a final leaf node are not all identical, the average label value of all samples at that node is used as the predicted label value of the leaf node.
It should be noted here that in this embodiment the optimal splitting criterion is no longer minimum entropy but minimum mean squared error: the predicted label value is subtracted from the label value of each sample, the difference is squared, and the squares are summed over all samples. The more samples are predicted wrongly, the larger the mean squared error, so minimizing it identifies the best basis for branching.
It should also be noted that it is difficult to make the label values of the samples at every leaf node identical when splitting. To obtain the prediction closest to reality, a termination condition can therefore be preset, for example an upper limit on the number of leaves.
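For the regression tree variant, the split criterion can be sketched for a single feature as below. Each branch predicts the average label of its samples, and the split with the smallest summed squared error is kept. The function names and the data are hypothetical, and no termination condition is modelled.

```python
def squared_error(targets):
    """Sum of squared differences between each target and the mean target."""
    if not targets:
        return 0.0
    avg = sum(targets) / len(targets)
    return sum((t - avg) ** 2 for t in targets)

def best_mse_split(values, targets):
    """Return (error, threshold, left_mean, right_mean) for one feature.

    Candidate thresholds are the observed values except the largest, so
    both branches always receive at least one sample.
    """
    best = None
    for threshold in sorted(set(values))[:-1]:
        left = [t for v, t in zip(values, targets) if v <= threshold]
        right = [t for v, t in zip(values, targets) if v > threshold]
        error = squared_error(left) + squared_error(right)
        if best is None or error < best[0]:
            best = (error, threshold,
                    sum(left) / len(left), sum(right) / len(right))
    return best

# Hypothetical feature values with labels (1 = faulty disk).
values = [1, 2, 3, 10, 11]
targets = [0, 0, 1, 1, 1]
split = best_mse_split(values, targets)
```

Here the threshold 2 separates the labels perfectly, so both branch means equal the labels of their samples and the error is zero.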
According to the above embodiment of the present application, in a preferred solution, after the disk prediction model composed of multiple decision trees is obtained, the method further includes: adjusting the classification model parameters, where, if the classification model parameters include faulty disk samples and non-faulty disk samples and the aim is to determine whether the disk to be tested is a faulty disk, the proportion of faulty disk samples in the classification model parameters is increased.
According to the above embodiment of the present application, in a preferred solution, processing the disk data of the disk to be tested using the disk prediction model composed of multiple decision trees and determining whether the disk to be tested is a faulty disk includes:
Step S251: after the disk data of the disk to be tested is received, assign an initial value to the disk data of the disk to be tested.
Step S253: traverse the decision trees starting from the initial value of the disk to be tested, compute the prediction result determined by the first decision tree and a first residual, and assign the first residual to the initial value to obtain an updated initial value.
Step S255: using the updated initial value, compute the prediction result determined by the second decision tree and a second residual, and assign the second residual to the updated initial value; traverse all the decision trees in this way to obtain the result predicting whether the disk to be tested is a faulty disk.
Step S257: each tree learns the residual of the sum of the conclusions of all previous trees, where the residual is the increment that, added to the predicted value, gives the true value.
In an optional embodiment, again taking the four disks A, B, C, and D as an example, feature A divides the four disks into two parts, A, B and C, D, and each part uses the average label value as its predicted value. The residuals are then computed, where a residual is the difference between the actual value of a disk and its predicted value, so the residual of A is 1-0.5=0.5, and the residuals of A, B, C, and D are 0.5, -0.5, 0.5, and -0.5 respectively. Then, with reference to FIG. 4, which is a schematic diagram of computing the predicted value of a disk with the GBDT algorithm according to an embodiment of the present invention, the residuals replace the original values of A, B, C, and D and are input to the second decision tree for training, which divides them into two leaf nodes according to the comparison with feature B. If the predicted values are equal to the residuals, adding the conclusion of the second tree to that of the first tree gives the actual value of the disk. The second tree has only the two values 0.5 and -0.5, so it splits directly into two nodes. At this point the residual of every disk is 0, that is, every disk has obtained its true predicted value.
It should be noted here that the above embodiment is for illustration and therefore has only two decision trees. In practice, multiple decision trees can be obtained depending on the amount of sample data, and the predicted value is the sum accumulated over all previous trees. In this embodiment only one decision tree precedes the second one, so the predicted value is simply 0.5; if there were other preceding decision trees, their outputs would all have to be added up to give the predicted value of A.
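The idea that each tree learns the residual left by the trees before it, and that the prediction is the running sum, can be sketched with one-level trees as below. The sketch takes the residual as actual value minus current prediction, in line with step S257. The function names, the pre-chosen `(feature, threshold)` pairs, and the readings are assumptions, and the numbers are not those of FIG. 3 or FIG. 4.

```python
def mean(values):
    return sum(values) / len(values) if values else 0.0

def fit_residual_stumps(rows, labels, splits, initial=0.5):
    """Fit one one-level tree per (feature, threshold) pair in `splits`.

    Each tree's two leaf values are the mean residuals left by the initial
    value plus all earlier trees.
    """
    predictions = [initial] * len(rows)
    stumps = []
    for feature, threshold in splits:
        residuals = [y - p for y, p in zip(labels, predictions)]
        left = mean([r for row, r in zip(rows, residuals)
                     if row[feature] <= threshold])
        right = mean([r for row, r in zip(rows, residuals)
                      if row[feature] > threshold])
        stumps.append((feature, threshold, left, right))
        # Accumulate this tree's output into the running prediction.
        predictions = [p + (left if row[feature] <= threshold else right)
                       for row, p in zip(rows, predictions)]
    return stumps

def predict(row, stumps, initial=0.5):
    """Add every tree's output to the initial value."""
    total = initial
    for feature, threshold, left, right in stumps:
        total += left if row[feature] <= threshold else right
    return total

# Hypothetical two-feature readings for disks A, B (normal) and C, D (faulty).
rows = [(1, 30), (2, 45), (8, 31), (9, 44)]
labels = [0, 0, 1, 1]
stumps = fit_residual_stumps(rows, labels, splits=[(1, 40), (0, 5)])
scores = [predict(row, stumps) for row in rows]
```

In this data the first tree splits on a feature that does not separate the disks, so its leaves contribute nothing and the residuals stay at plus or minus 0.5. The second tree then absorbs them, and the summed predictions match the labels.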
FIG. 5 is a flowchart of an optional disk failure prediction method according to an embodiment of the present invention. A preferred embodiment of the present application is described in detail below with reference to FIG. 5.
As shown in FIG. 5, a disk failure prediction method is provided, which may include the following steps S51 to S57:
S51: obtain the sample data of the sample disks.
Specifically, in the above step, software such as HDTune or CrystalDiskInfo may be used to obtain the sample disk data.
S52: perform a difference operation on the sample data.
Specifically, in the above step, the difference operation yields the value obtained by subtracting the disk's feature data from 24 hours earlier from its feature data at a given moment.
S53: perform a distribution summation and/or square operation on the results of the difference operation.
S54: obtain the training data and prediction data.
S55: first round of training and prediction, tuned for high recall.
S56: second round of training and prediction, balancing recall and precision.
Specifically, in the above steps, negative samples make up a very large share of the training data and positive samples a small one, for example a ratio of 1000:1. If all the training data were used for training, very few positive samples could be predicted correctly, and because positive samples are scarce, much data whose true value is negative may be misjudged as positive. The first round of training is therefore tuned so that the recall of positive samples is high. In the second round, the training data that the first round predicted as positive is used as the training data, that is, the samples close to the positive samples are selected as training samples. A model trained in this way is better at picking out positive samples, so the precision of the positive samples in the second round's predictions is substantially higher than in the first round, and precision and recall reach a reasonable balance.
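The two-round scheme of S55 and S56 can be sketched as a cascade. For brevity each round here is a single-feature threshold rule standing in for a GBDT model, which is a deliberate simplification and not the described method. The function names, the two features, and the data are hypothetical.

```python
def recall_threshold(values, labels):
    """Largest threshold t such that `value >= t` catches every positive."""
    return min(v for v, y in zip(values, labels) if y == 1)

def accuracy_threshold(values, labels):
    """Observed value t that makes `value >= t` agree with the most labels."""
    def correct(t):
        return sum((v >= t) == bool(y) for v, y in zip(values, labels))
    return max(sorted(set(values)), key=correct)

def train_two_stage(rows, labels):
    # Round 1: favour recall on feature 0 and accept false alarms.
    t1 = recall_threshold([r[0] for r in rows], labels)
    kept = [(r, y) for r, y in zip(rows, labels) if r[0] >= t1]
    # Round 2: retrain on feature 1 using only what round 1 flagged,
    # so the second model concentrates on separating the false alarms.
    t2 = accuracy_threshold([r[1] for r, _ in kept], [y for _, y in kept])
    return lambda row: row[0] >= t1 and row[1] >= t2

# Hypothetical rows: (reallocated sectors, read-error delta); 1 = faulty.
rows = [(0, 0), (1, 0), (4, 1), (5, 9), (6, 2), (7, 8)]
labels = [0, 0, 0, 1, 0, 1]
predict = train_two_stage(rows, labels)
flags = [predict(r) for r in rows]
```

In this data the first round flags three disks, one of them normal; the second round, trained only on those three, rejects the normal one while keeping both faulty disks.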
Embodiment 2
According to an embodiment of the present invention, a disk failure prediction apparatus for implementing the above disk failure prediction method is also provided. FIG. 6 is a schematic structural diagram of a disk failure prediction apparatus according to an embodiment of the present invention. As shown in FIG. 6, the apparatus includes an acquisition module 60, a training module 62, and a processing module 64.
The acquisition module 60 is configured to obtain sample disk data of disks by using a disk monitoring technology, where the sample disk data includes sample data in multiple dimensions;
The training module 62 is configured to perform sample training on the sample disk data using the GBDT algorithm to obtain a disk prediction model composed of multiple decision trees;
The processing module 64 is configured to, after receiving disk data of a disk to be tested, process the disk data of the disk to be tested using the disk prediction model composed of multiple decision trees, and determine whether the disk to be tested is a faulty disk.
It should be noted here that the acquisition module 60, the training module 62, and the processing module 64 correspond to steps S21 to S25 in Embodiment 1; the examples and application scenarios they implement are the same as those of the corresponding steps, but are not limited to the content disclosed in Embodiment 1. It should be noted that the above modules, as part of the apparatus, can run in the computer terminal 10 provided in Embodiment 1.
According to the above embodiment of the present application, in a preferred solution, the sample disk data is SMART disk data, where the sample disk data includes sample data in at least the following four dimensions: raw value, standard value, worst value, and cumulative value.
According to the above embodiment of the present application, in a preferred solution, as shown in FIG. 7, the apparatus further includes:
An operation module 70, configured to perform any one or more of the following operations on the sample data in each dimension: a difference operation, a square operation, and a distribution summation operation, so that the sample data in any one dimension is expanded into sample data in new dimensions.
It should be noted here that the operation module 70 corresponds to steps S21 to S25 in Embodiment 1; the examples and application scenarios it implements are the same as those of the corresponding steps, but are not limited to the content disclosed in Embodiment 1. It should be noted that the above module, as part of the apparatus, can run in the computer terminal 10 provided in Embodiment 1.
According to the above embodiments of the present application, in a preferred solution, as shown in FIG. 8, the training module 62 further includes:
an initial module 80, configured to use the sample disk data of all disks as training data, and to initialize the classification model parameters of the training data with default values;
an extraction module 82, configured to extract a plurality of items of feature data from the training data, create the plurality of decision trees with each item of feature data as a root node, and use the feature value corresponding to each item of feature data as a leaf node of the corresponding decision tree;
a first calculation module 84, configured to calculate the optimal split of every current leaf node and the gain of that split, and to split the leaf node with the largest gain at the corresponding split point, so that the sample disk data is divided among the child nodes.
It should be noted here that the initial module 80, the extraction module 82, and the first calculation module 84 correspond to steps S231 to S235 of Embodiment 1. The examples and application scenarios implemented by these modules are the same as those of the corresponding steps, but are not limited to what Embodiment 1 discloses. It should also be noted that these modules, as part of the apparatus, may run in the computer terminal 10 provided in Embodiment 1.
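The behavior of the first calculation module 84 (find the best split of every current leaf, then split the leaf whose best split has the largest gain) can be sketched as follows. The application does not define the gain formula; this sketch assumes the reduction in summed squared error, which is a common choice for the regression trees used in gradient boosting. The names `best_split` and `pick_leaf_to_split`, the feature name `realloc`, and the data are hypothetical.

```python
def best_split(rows, feature):
    """Return (gain, threshold) for the best split of one leaf on one feature.

    rows is a list of (features_dict, target) pairs held by the leaf.
    """
    def sse(targets):
        # Summed squared error around the mean of the targets.
        if not targets:
            return 0.0
        mean = sum(targets) / len(targets)
        return sum((t - mean) ** 2 for t in targets)

    parent_error = sse([t for _, t in rows])
    best = (0.0, None)
    for threshold in sorted({x[feature] for x, _ in rows}):
        left = [t for x, t in rows if x[feature] <= threshold]
        right = [t for x, t in rows if x[feature] > threshold]
        if not left or not right:
            continue  # a split must send samples to both children
        gain = parent_error - sse(left) - sse(right)
        if gain > best[0]:
            best = (gain, threshold)
    return best


def pick_leaf_to_split(leaves, feature):
    """Among all current leaves, pick the one whose best split has the largest gain."""
    scored = [(best_split(rows, feature), index) for index, rows in enumerate(leaves)]
    (gain, threshold), index = max(scored, key=lambda item: item[0][0])
    return index, threshold, gain


# Two hypothetical leaves; targets are 0.0 (healthy) or 1.0 (failed).
leaves = [
    [({"realloc": 0}, 0.0), ({"realloc": 1}, 0.0)],
    [({"realloc": 0}, 0.0), ({"realloc": 10}, 1.0)],
]
index, threshold, gain = pick_leaf_to_split(leaves, "realloc")
```

In this example the first leaf is already pure, so its best gain is zero, and the second leaf is the one selected for splitting.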
According to the above embodiments of the present application, in a preferred solution, as shown in FIG. 9, the extraction module 82 includes:
a reading module 90, configured to read the threshold corresponding to any one item of feature data;
a comparison module 92, configured to compare the feature value of that item of feature data with the threshold, and to obtain the entropy of the two branches from the comparison result;
a determination module 94, configured to determine, from the entropy of the two branches, two new nodes as the two leaf nodes of that item of feature data;
a processing sub-module 96, configured to process every item of feature data with the above steps until each item of feature data has obtained its two predetermined, unique leaf nodes.
It should be noted here that the reading module 90, the comparison module 92, the determination module 94, and the processing sub-module 96 correspond to steps S2331 to S2337 of Embodiment 1. The examples and application scenarios implemented by these modules are the same as those of the corresponding steps, but are not limited to what Embodiment 1 discloses. It should also be noted that these modules, as part of the apparatus, may run in the computer terminal 10 provided in Embodiment 1.
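The threshold comparison and the entropy of the two resulting branches can be illustrated with a short sketch. It assumes binary labels (1 for a failed disk, 0 for a healthy one) and Shannon entropy in bits; the application does not state which entropy definition is used, and the function names and data below are hypothetical.

```python
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    # Terms with zero probability contribute nothing and are skipped.
    return -sum(q * log2(q) for q in (p, 1 - p) if q > 0)


def branch_entropies(values, labels, threshold):
    """Compare each feature value with the threshold and return the entropy of both branches."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    return entropy(left), entropy(right)


# Hypothetical feature values for five disks and whether each disk failed.
values = [0, 1, 2, 50, 80]
labels = [0, 0, 0, 1, 1]
clean_split = branch_entropies(values, labels, threshold=10)
mixed_split = branch_entropies(values, labels, threshold=1)
```

A threshold that separates failed from healthy disks perfectly gives zero entropy in both branches, while a poorly placed threshold leaves a mixed branch with higher entropy.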
According to the above embodiments of the present application, in a preferred solution, after the disk prediction model composed of a plurality of decision trees is obtained, the method further includes: adjusting the classification model parameters, wherein, where the classification model parameters include failed-disk samples and non-failed-disk samples, if it is to be determined whether the disk under test is a failed disk, the proportion of failed-disk samples in the classification model parameters is increased.
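One possible way to raise the proportion of failed-disk samples is random oversampling, sketched here. The application does not say how the proportion is adjusted, so oversampling is an assumed technique chosen for illustration; `oversample_failed`, `target_ratio`, and the sample data are hypothetical.

```python
import random


def oversample_failed(samples, target_ratio, seed=0):
    """Duplicate failed-disk samples until they reach target_ratio of the set.

    samples is a list of (features, label) pairs, where label 1 means failed.
    """
    if not 0 < target_ratio < 1:
        raise ValueError("target_ratio must be strictly between 0 and 1")
    failed = [s for s in samples if s[1] == 1]
    if not failed:
        return list(samples)  # nothing to duplicate
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    result = list(samples)
    failed_count = len(failed)
    while failed_count / len(result) < target_ratio:
        result.append(rng.choice(failed))
        failed_count += 1
    return result


# Hypothetical training set: 2 failed disks and 8 healthy disks.
samples = [({"id": i}, 1) for i in range(2)] + [({"id": i}, 0) for i in range(2, 10)]
balanced = oversample_failed(samples, target_ratio=0.5)
```

Failed disks are usually rare in production fleets, so a step of this kind keeps the model from learning to predict "healthy" for every disk. Per-sample weights would be an alternative with a similar effect.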
According to the above embodiments of the present application, in a preferred solution, as shown in FIG. 10, the processing module 64 includes:
a receiving module 100, configured to assign an initial value to the disk data of the disk under test after that disk data is received;
a second calculation module 102, configured to traverse each decision tree according to the initial value of the disk under test, calculate the prediction result determined by the first decision tree and a first residual, and assign the first residual to the initial value to obtain an updated initial value;
a traversal module 104, configured to calculate, with the updated initial value, the prediction result determined by the second decision tree and a second residual, and to assign the second residual to the updated initial value, traversing all the decision trees in this way to obtain the result of predicting whether the disk under test is a failed disk.
It should be noted here that the receiving module 100, the second calculation module 102, and the traversal module 104 correspond to steps S251 to S255 of Embodiment 1. The examples and application scenarios implemented by these modules are the same as those of the corresponding steps, but are not limited to what Embodiment 1 discloses. It should also be noted that these modules, as part of the apparatus, may run in the computer terminal 10 provided in Embodiment 1.
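The tree-by-tree traversal can be read as the usual additive inference of gradient boosted trees: start from an initial value, pass the sample through each tree in turn, and let each tree's output correct the running value. The sketch below follows that reading, which is an assumption about the text rather than a statement of the applicants' exact procedure. The one-split trees, the feature names, the sigmoid mapping, and the 0.5 cutoff are all hypothetical.

```python
from math import exp


def stump(feature, threshold, left_value, right_value):
    """Build a one-split tree that returns left_value if x[feature] <= threshold."""
    def predict(x):
        return left_value if x[feature] <= threshold else right_value
    return predict


def predict_failure(x, trees, initial_score=0.0, learning_rate=1.0, cutoff=0.5):
    """Traverse every tree in order and accumulate each tree's correction."""
    score = initial_score
    for tree in trees:
        score += learning_rate * tree(x)  # the updated value feeds the next tree
    probability = 1.0 / (1.0 + exp(-score))  # map the raw score to (0, 1)
    return probability >= cutoff, probability


# Two hypothetical trees over hypothetical expanded SMART features.
trees = [
    stump("realloc_raw", 10, -1.0, 1.5),
    stump("pending_diff", 0, -0.5, 1.0),
]
suspect_failed, suspect_probability = predict_failure(
    {"realloc_raw": 40, "pending_diff": 3}, trees)
healthy_failed, healthy_probability = predict_failure(
    {"realloc_raw": 0, "pending_diff": 0}, trees)
```

The final decision depends on the sum over all trees, which is why every decision tree has to be traversed before a result is reported.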
Embodiment 3
An embodiment of the present invention may provide a computer terminal, which may be any computer terminal device in a group of computer terminals. Optionally, in this embodiment, the computer terminal may also be replaced with a terminal device such as a mobile terminal.
Optionally, in this embodiment, the computer terminal may be located in at least one of a plurality of network devices in a computer network.
In this embodiment, the computer terminal may execute program code for the following steps of the disk failure prediction method: acquiring sample disk data of disks by a disk monitoring technology, wherein the sample disk data includes sample data in multiple dimensions; performing sample training on the sample disk data with the GBDT algorithm to obtain a disk prediction model composed of a plurality of decision trees; and, after receiving disk data of a disk under test, processing that disk data with the disk prediction model composed of a plurality of decision trees to determine whether the disk under test is a failed disk.
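The three steps (collect samples, train a GBDT model, score a disk under test) can be illustrated end to end with a deliberately small sketch. It assumes a single numeric feature, squared-error loss, and one-split trees, each fitted to the residuals left by the trees before it; a production system would use many features and deeper trees. All names, data, and hyperparameters are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit the one-split tree that best matches the residuals in the least-squares sense."""
    best = None
    # Excluding the largest value guarantees that the right branch is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


def train_gbdt(xs, ys, n_trees=20, learning_rate=0.5):
    """Train boosted stumps; each stump is fitted to the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    trees = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        trees.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, trees


def predict(x, base, trees, learning_rate=0.5):
    """Score one disk: the base value plus the scaled output of every tree."""
    return base + sum(
        learning_rate * (left_value if x <= threshold else right_value)
        for threshold, left_value, right_value in trees
    )


# Hypothetical samples: one SMART-derived feature per disk, label 1 means failed.
xs = [0, 1, 2, 30, 45, 60]
ys = [0, 0, 0, 1, 1, 1]
base, trees = train_gbdt(xs, ys)
score_suspect = predict(50, base, trees)  # disk under test with a high reading
score_healthy = predict(1, base, trees)   # disk under test with a low reading
```

A disk can then be flagged as failed when its score exceeds a chosen cutoff, for example 0.5 with the 0/1 labels used here.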
Optionally, FIG. 11 is a structural block diagram of a computer terminal according to an embodiment of the present invention. As shown in FIG. 11, the computer terminal A may include one or more processors 111 (only one is shown in the figure), a memory 113, and a transmission device 115.
The memory may be used to store software programs and modules, such as the program instructions/modules corresponding to the disk failure prediction method and apparatus in the embodiments of the present invention. By running the software programs and modules stored in the memory, the processor executes various functional applications and data processing, that is, implements the disk failure prediction method described above. The memory may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory located remotely from the processor, and such remote memory may be connected to terminal A through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor may call, through the transmission device, the information and application programs stored in the memory to perform the steps described below, in which the sample disk data is SMART disk data and includes sample data in at least the following four dimensions: raw value, standard value, worst value, and cumulative value.
Optionally, the processor may further execute program code for the following step: performing any one or more of the following operations on the sample data in each dimension: a difference operation, a square operation, and a distribution summation operation, so that the sample data in any one dimension is expanded into sample data in new dimensions.
Optionally, the processor may further execute program code for the following steps: using the sample disk data of all disks as training data, and initializing the classification model parameters of the training data with default values; extracting a plurality of items of feature data from the training data, creating a plurality of decision trees with each item of feature data as a root node, and using the feature value corresponding to each item of feature data as a leaf node of the corresponding decision tree; and calculating the optimal split of every current leaf node and the gain of that split, and splitting the leaf node with the largest gain at the corresponding split point, so that the sample disk data is divided among the child nodes.
Optionally, the processor may further execute program code for the following steps: reading the threshold corresponding to any one item of feature data; comparing the feature value of that item of feature data with the threshold, and obtaining the entropy of the two branches from the comparison result; determining, from the entropy of the two branches, two new nodes as the two leaf nodes of that item of feature data; and processing every item of feature data with the above steps until each item of feature data has obtained its two predetermined, unique leaf nodes.
Optionally, the processor may further execute program code for the following step: after the disk prediction model composed of a plurality of decision trees is obtained, adjusting the classification model parameters, wherein, where the classification model parameters include failed-disk samples and non-failed-disk samples, if it is to be determined whether the disk under test is a failed disk, the proportion of failed-disk samples in the classification model parameters is increased.
Optionally, the processor may further execute program code for the following steps: after receiving the disk data of the disk under test, assigning an initial value to that disk data; traversing each decision tree according to the initial value of the disk under test, calculating the prediction result determined by the first decision tree and a first residual, and assigning the first residual to the initial value to obtain an updated initial value; and calculating, with the updated initial value, the prediction result determined by the second decision tree and a second residual, and assigning the second residual to the updated initial value, traversing all the decision trees in this way to obtain the result of predicting whether the disk under test is a failed disk.
In the embodiments of the present invention, sample disk data of disks is acquired by a disk monitoring technology, the sample disk data including sample data in multiple dimensions; sample training is performed on the sample disk data with the GBDT algorithm to obtain a disk prediction model composed of a plurality of decision trees; and, after disk data of a disk under test is received, that disk data is processed with the disk prediction model to determine whether the disk under test is a failed disk. This achieves the technical effect of predicting the failure state of a disk, and thereby solves the technical problem that, in prior-art hard disk failure prediction systems, some factors that readily cause hard disk failure cannot be collected or quantified, which makes the prediction results inaccurate.
Those of ordinary skill in the art will understand that the structure shown in FIG. 11 is only illustrative. The computer terminal may also be a terminal device such as a smartphone (for example, an Android phone or an iOS phone), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), or a PAD. FIG. 11 does not limit the structure of the above electronic device. For example, computer terminal A may include more or fewer components than shown in FIG. 11 (such as a network interface or a display device), or have a configuration different from that shown in FIG. 11.
Those of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments may be completed by a program that instructs hardware related to the terminal device. The program may be stored in a computer-readable storage medium, and the storage medium may include a flash drive, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, or the like.
Embodiment 4
An embodiment of the present invention further provides a storage medium. Optionally, in this embodiment, the storage medium may be used to store the program code executed by the disk failure prediction method provided in Embodiment 1.
Optionally, in this embodiment, the storage medium may be located in any computer terminal in a group of computer terminals in a computer network, or in any mobile terminal in a group of mobile terminals.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: acquiring sample disk data of disks by a disk monitoring technology, wherein the sample disk data includes sample data in multiple dimensions; performing sample training on the sample disk data with the GBDT algorithm to obtain a disk prediction model composed of a plurality of decision trees; and, after receiving disk data of a disk under test, processing that disk data with the disk prediction model composed of a plurality of decision trees to determine whether the disk under test is a failed disk.
Optionally, the storage medium is further configured to store program code for performing the following step: performing any one or more of the following operations on the sample data in each dimension: a difference operation, a square operation, and a distribution summation operation, so that the sample data in any one dimension is expanded into sample data in new dimensions.
Optionally, the storage medium is further configured to store program code for performing the following steps: using the sample disk data of all disks as training data, and initializing the classification model parameters of the training data with default values; extracting a plurality of items of feature data from the training data, creating a plurality of decision trees with each item of feature data as a root node, and using the feature value corresponding to each item of feature data as a leaf node of the corresponding decision tree; and calculating the optimal split of every current leaf node and the gain of that split, and splitting the leaf node with the largest gain at the corresponding split point, so that the sample disk data is divided among the child nodes.
Optionally, the storage medium is further configured to store program code for performing the following steps: reading the threshold corresponding to any one item of feature data; comparing the feature value of that item of feature data with the threshold, and obtaining the entropy of the two branches from the comparison result; determining, from the entropy of the two branches, two new nodes as the two leaf nodes of that item of feature data; and processing every item of feature data with the above steps until each item of feature data has obtained its two predetermined, unique leaf nodes.
Optionally, the storage medium is further configured to store program code for performing the following step: after the disk prediction model composed of a plurality of decision trees is obtained, adjusting the classification model parameters, wherein, where the classification model parameters include failed-disk samples and non-failed-disk samples, if it is to be determined whether the disk under test is a failed disk, the proportion of failed-disk samples in the classification model parameters is increased.
Optionally, the storage medium is further configured to store program code for performing the following steps: after receiving the disk data of the disk under test, assigning an initial value to that disk data; traversing each decision tree according to the initial value of the disk under test, calculating the prediction result determined by the first decision tree and a first residual, and assigning the first residual to the initial value to obtain an updated initial value; and calculating, with the updated initial value, the prediction result determined by the second decision tree and a second residual, and assigning the second residual to the updated initial value, traversing all the decision trees in this way to obtain the result of predicting whether the disk under test is a failed disk.
The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For a part that is not described in detail in one embodiment, reference may be made to the related descriptions in other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technical content may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
The above is only a preferred implementation of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and these improvements and refinements should also be regarded as falling within the scope of protection of the present invention.

Claims (14)

  1. A disk failure prediction method, characterized by comprising:
    acquiring sample disk data of disks by a disk monitoring technology, wherein the sample disk data includes sample data in multiple dimensions;
    performing sample training on the sample disk data with the GBDT algorithm to obtain a disk prediction model composed of a plurality of decision trees;
    after receiving disk data of a disk under test, processing the disk data of the disk under test with the disk prediction model composed of a plurality of decision trees to determine whether the disk under test is a failed disk.
  2. The method according to claim 1, characterized in that the sample disk data is SMART disk data, wherein the sample disk data includes sample data in at least the following four dimensions: raw value, standard value, worst value, and cumulative value.
  3. The method according to claim 2, characterized in that, after the sample disk data of the disks is acquired by the disk monitoring technology, the method further comprises:
    performing any one or more of the following operations on the sample data in each dimension: a difference operation, a square operation, and a distribution summation operation, so that the sample data in any one dimension is expanded into sample data in new dimensions.
  4. The method according to any one of claims 1 to 3, characterized in that performing sample training on the sample disk data with the GBDT algorithm to obtain a disk prediction model composed of a plurality of decision trees comprises:
    using the sample disk data of all disks as training data, and initializing the classification model parameters of the training data with default values;
    extracting a plurality of items of feature data from the training data, creating the plurality of decision trees with each item of feature data as a root node, and using the feature value corresponding to each item of feature data as a leaf node of the corresponding decision tree;
    calculating the optimal split of every current leaf node and the gain of that split, and splitting the leaf node with the largest gain at the corresponding split point, so that the sample disk data is divided among the child nodes.
  5. The method according to claim 4, characterized in that extracting a plurality of items of feature data from the training data, creating the plurality of decision trees with each item of feature data as a root node, and using the feature value corresponding to each item of feature data as a leaf node of the corresponding decision tree comprises:
    reading the threshold corresponding to any one item of feature data;
    comparing the feature value of that item of feature data with the threshold, and obtaining the entropy of the two branches from the comparison result;
    determining, from the entropy of the two branches, two new nodes as the two leaf nodes of that item of feature data;
    processing every item of feature data with the above steps until each item of feature data has obtained its two predetermined, unique leaf nodes.
  6. The method according to claim 4, characterized in that, after the disk prediction model composed of a plurality of decision trees is obtained, the method further comprises: adjusting the classification model parameters, wherein, where the classification model parameters include failed-disk samples and non-failed-disk samples, if it is to be determined whether the disk under test is a failed disk, the proportion of failed-disk samples in the classification model parameters is increased.
  7. The method according to claim 1, characterized in that processing the disk data of the disk under test with the disk prediction model composed of a plurality of decision trees to determine whether the disk under test is a failed disk comprises:
    after receiving the disk data of the disk under test, assigning an initial value to the disk data of the disk under test;
    traversing each decision tree according to the initial value of the disk under test, calculating the prediction result determined by the first decision tree and a first residual, and assigning the first residual to the initial value to obtain an updated initial value;
    calculating, with the updated initial value, the prediction result determined by the second decision tree and a second residual, and assigning the second residual to the updated initial value, traversing all the decision trees in this way to obtain the result of predicting whether the disk under test is a failed disk.
  8. A disk failure prediction apparatus, characterized by comprising:
    an acquisition module, configured to acquire sample disk data of disks by a disk monitoring technology, wherein the sample disk data includes sample data in multiple dimensions;
    a training module, configured to perform sample training on the sample disk data with the GBDT algorithm to obtain a disk prediction model composed of a plurality of decision trees;
    a processing module, configured to, after receiving disk data of a disk under test, process the disk data of the disk under test with the disk prediction model composed of a plurality of decision trees to determine whether the disk under test is a failed disk.
  9. The apparatus according to claim 8, characterized in that the sample disk data is SMART disk data, wherein the sample disk data includes sample data in at least the following four dimensions: raw value, standard value, worst value, and cumulative value.
  10. The apparatus according to claim 9, characterized in that the apparatus further comprises:
    an operation module, configured to perform any one or more of the following operations on the sample data in each dimension: a difference operation, a square operation, and a distribution summation operation, so that the sample data in any one dimension is expanded into sample data in new dimensions.
  11. The apparatus according to any one of claims 8 to 10, wherein the training module further comprises:
    an initial module, configured to use the sample disk data of all disks as training data, and to initialize the classification model parameters of the training data with default values;
    an extraction module, configured to extract a plurality of feature data from the training data, create the multiple decision trees with each feature data as a root node, and use the feature value corresponding to each feature data as a leaf node of the corresponding decision tree;
    a first calculation module, configured to calculate the optimal split of every current leaf node and its gain, and to split the leaf node with the largest gain at its corresponding split point, so that the sample disk data is divided into child nodes.
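The leaf selection performed by the first calculation module can be sketched as below. The claim does not define how gain is measured; this sketch assumes the reduction in squared error that regression-tree boosting commonly uses, and the names `sse`, `best_split`, and `pick_leaf_to_split` are hypothetical.

```python
from typing import List, Optional, Tuple

Sample = Tuple[List[float], float]  # (feature vector, target or residual)


def sse(targets: List[float]) -> float:
    """Sum of squared errors around the mean; 0.0 for an empty list."""
    if not targets:
        return 0.0
    mean = sum(targets) / len(targets)
    return sum((t - mean) ** 2 for t in targets)


def best_split(samples: List[Sample]) -> Optional[Tuple[float, int, float]]:
    """Return (gain, feature index, threshold) of the best split, or None."""
    parent = sse([t for _, t in samples])
    best = None
    for f in range(len(samples[0][0])):
        values = sorted({x[f] for x, _ in samples})
        # Candidate thresholds are midpoints between neighbouring distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [t for x, t in samples if x[f] <= thr]
            right = [t for x, t in samples if x[f] > thr]
            gain = parent - sse(left) - sse(right)  # assumed gain definition
            if best is None or gain > best[0]:
                best = (gain, f, thr)
    return best


def pick_leaf_to_split(leaves: List[List[Sample]]) -> Optional[Tuple[float, int, int, float]]:
    """Among all current leaves, return (gain, leaf index, feature, threshold) with the largest gain."""
    candidates = []
    for i, leaf in enumerate(leaves):
        split = best_split(leaf) if leaf else None
        if split is not None:
            candidates.append((split[0], i, split[1], split[2]))
    return max(candidates) if candidates else None


# Two hypothetical leaves: the first mixes targets 0 and 1, the second is already pure.
leaf_a = [([1.0], 0.0), ([2.0], 0.0), ([3.0], 1.0), ([4.0], 1.0)]
leaf_b = [([1.0], 0.0), ([2.0], 0.0)]
choice = pick_leaf_to_split([leaf_a, leaf_b])
```

Here the first leaf is chosen because splitting it at 2.5 removes all of its error, while the second leaf has nothing left to gain.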
  12. The apparatus according to claim 11, wherein the extraction module comprises:
    a reading module, configured to read the threshold corresponding to any one feature data;
    a comparison module, configured to compare the feature value of that feature data with the threshold, and to obtain the entropy of two branches according to the comparison result;
    a determining module, configured to determine, according to the entropy of the two branches, two new nodes as the two leaf nodes of that feature data;
    a processing submodule, configured to process every feature data using the above steps until each feature data has obtained its two predetermined, unique leaf nodes.
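The threshold comparison in claim 12 can be illustrated with a short sketch. It assumes binary labels (1 for a faulty disk, 0 otherwise) and Shannon entropy in bits; the claim names neither, and the function names and sample values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def entropy(labels: List[int]) -> float:
    """Shannon entropy in bits of binary labels; 0.0 for an empty or pure branch."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def split_by_threshold(
    values: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[Tuple[List[int], float], Tuple[List[int], float]]:
    """Compare each feature value with the threshold and return both branches with their entropy."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    return (left, entropy(left)), (right, entropy(right))


# Hypothetical feature values (e.g. a reallocated-sector count) and failure labels.
values = [1.0, 3.0, 8.0, 12.0, 40.0, 55.0]
labels = [0, 0, 0, 1, 1, 1]
(left, left_entropy), (right, right_entropy) = split_by_threshold(values, labels, threshold=5.0)
```

With this threshold the left branch is pure and the right branch still mixes one healthy disk with three faulty ones, so its entropy stays above zero; a threshold of 10.0 would make both branches pure.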
  13. The apparatus according to claim 11, wherein, after the disk prediction model composed of multiple decision trees is obtained, the apparatus is further configured to adjust the classification model parameters, wherein, in the case where the classification model parameters include faulty disk samples and non-faulty disk samples, if it is to be determined whether the disk to be tested is a faulty disk, the proportion of faulty disk samples in the classification model parameters is increased.
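One way to raise the proportion of faulty disk samples, as claim 13 describes, is to reweight the training samples. The sketch below is an assumption about how that adjustment could be realized; the claim does not specify a mechanism, and `class_weights` and `failed_share` are hypothetical names.

```python
from typing import List, Sequence


def class_weights(labels: Sequence[int], failed_share: float) -> List[float]:
    """Per-sample weights so that faulty disks (label 1) carry `failed_share` of the total weight."""
    if not 0.0 < failed_share < 1.0:
        raise ValueError("failed_share must be strictly between 0 and 1")
    n_failed = sum(labels)
    n_healthy = len(labels) - n_failed
    if n_failed == 0 or n_healthy == 0:
        raise ValueError("need both faulty and non-faulty samples")
    w_failed = failed_share / n_failed
    w_healthy = (1.0 - failed_share) / n_healthy
    return [w_failed if y == 1 else w_healthy for y in labels]


# Hypothetical sample set: 8 non-faulty disks and 2 faulty disks (20% faulty).
labels = [0] * 8 + [1] * 2
weights = class_weights(labels, failed_share=0.5)
failed_weight = sum(w for w, y in zip(weights, labels) if y == 1)
```

In this example the faulty samples make up 20% of the rows but carry half of the total weight, so a weighted training pass would penalize missed failures more heavily.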
  14. The apparatus according to claim 8, wherein the processing module comprises:
    a receiving module, configured to assign an initial value to the disk data of the disk to be tested after receiving that disk data;
    a second calculation module, configured to traverse each decision tree according to the initial value of the disk to be tested, calculate the prediction result and a first residual determined by the first decision tree, and assign the first residual to the initial value to obtain an updated initial value;
    a traversal module, configured to calculate, using the updated initial value, the prediction result and a second residual determined by the second decision tree, and assign the second residual to the updated initial value, and so on until all decision trees have been traversed, to obtain a result predicting whether the disk to be tested is a faulty disk.
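The tree-by-tree traversal of claim 14 can be sketched in the conventional additive form of gradient boosting, where a running value starts at an initial value and is updated by each tree in turn. This is an interpretation rather than the claimed procedure verbatim: the `Stump` class, the logistic mapping to a probability, the 0.5 cutoff, and all tree parameters are assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Stump:
    """A one-split decision tree: the simplest tree a boosted model can contain."""
    feature: int
    threshold: float
    left_value: float
    right_value: float

    def predict(self, x: Sequence[float]) -> float:
        return self.left_value if x[self.feature] <= self.threshold else self.right_value


def predict_failure(
    x: Sequence[float],
    trees: List[Stump],
    initial_value: float = 0.0,
    cutoff: float = 0.5,
) -> Tuple[float, bool]:
    """Traverse every tree in order, updating the running value after each one."""
    score = initial_value
    for tree in trees:
        # Each tree contributes a correction to the value left by the previous trees.
        score += tree.predict(x)
    probability = 1.0 / (1.0 + math.exp(-score))  # assumed logistic link
    return probability, probability >= cutoff


# Hypothetical two-tree model over two features of the disk to be tested.
trees = [Stump(0, 10.0, -1.0, 1.5), Stump(1, 100.0, -0.5, 0.8)]
p_bad, bad_flag = predict_failure([42.0, 250.0], trees)
p_ok, ok_flag = predict_failure([0.0, 3.0], trees)
```

The first disk exceeds both thresholds and accumulates a score of 2.3, so it is flagged as faulty; the second stays below both and is not.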
PCT/CN2017/071695 2016-01-29 2017-01-19 Disk failure prediction method and apparatus WO2017129030A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610065807.2A CN107025154B (en) 2016-01-29 2016-01-29 Disk failure prediction method and device
CN201610065807.2 2016-01-29

Publications (1)

Publication Number Publication Date
WO2017129030A1 true WO2017129030A1 (en) 2017-08-03

Family

ID=59397412

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/071695 WO2017129030A1 (en) 2016-01-29 2017-01-19 Disk failure prediction method and apparatus

Country Status (3)

Country Link
CN (1) CN107025154B (en)
TW (1) TW201732591A (en)
WO (1) WO2017129030A1 (en)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107392320A (en) * 2017-07-28 2017-11-24 郑州云海信息技术有限公司 A kind of method that hard disk failure is predicted using machine learning
CN107479836A (en) * 2017-08-29 2017-12-15 郑州云海信息技术有限公司 Disk failure monitoring method, device and storage system
CN107861829A (en) * 2017-11-06 2018-03-30 郑州云海信息技术有限公司 A kind of method, system, device and the storage medium of disk failure detection
CN108345958A (en) * 2018-01-10 2018-07-31 拉扎斯网络科技(上海)有限公司 A kind of order goes out to eat time prediction model construction, prediction technique, model and device
CN108376553B (en) * 2018-02-28 2020-11-03 北京奇艺世纪科技有限公司 Monitoring method and system for magnetic disk of video server
CN110389866A (en) * 2018-04-20 2019-10-29 武汉安天信息技术有限责任公司 Disk failure prediction technique, device, computer equipment and computer storage medium
CN108683530B (en) * 2018-04-28 2021-06-01 北京百度网讯科技有限公司 Data analysis method and device for multi-dimensional data and storage medium
CN108681750A (en) * 2018-05-21 2018-10-19 阿里巴巴集团控股有限公司 The feature of GBDT models explains method and apparatus
CN109032891A (en) * 2018-07-23 2018-12-18 郑州云海信息技术有限公司 A kind of cloud computing server hard disk failure prediction technique and device
CN110196792B (en) * 2018-08-07 2022-06-14 腾讯科技(深圳)有限公司 Fault prediction method and device, computing equipment and storage medium
CN109299728B (en) * 2018-08-10 2023-06-27 深圳前海微众银行股份有限公司 Sample joint prediction method, system and medium based on construction of gradient tree model
CN109460588B (en) * 2018-10-22 2022-02-15 武汉大学 Equipment fault prediction method based on gradient lifting decision tree
CN109617715A (en) * 2018-11-27 2019-04-12 中盈优创资讯科技有限公司 Network fault diagnosis method, system
CN110175100B (en) * 2019-04-17 2020-05-19 华中科技大学 Storage disk fault prediction method and prediction system
CN111831389A (en) * 2019-04-23 2020-10-27 上海华为技术有限公司 Data processing method and device and storage medium
CN110489303B (en) * 2019-08-22 2022-09-23 江苏华存电子科技有限公司 Temperature prediction control management method and device based on NVMe SSD
CN111008119A (en) * 2019-12-13 2020-04-14 浪潮电子信息产业股份有限公司 Method, device, equipment and medium for updating hard disk prediction model
CN113127274B (en) * 2019-12-31 2024-03-19 中移动信息技术有限公司 Disk failure prediction method, device, equipment and computer storage medium
CN111767162B (en) * 2020-05-20 2021-02-26 北京大学 Fault prediction method for hard disks of different models and electronic device
CN112118259B (en) * 2020-09-17 2022-04-15 四川长虹电器股份有限公司 Unauthorized vulnerability detection method based on classification model of lifting tree
CN112446557B (en) * 2021-01-29 2021-05-07 北京蒙帕信创科技有限公司 Disk failure prediction evasion method and system based on deep learning
TWI818463B (en) * 2022-03-09 2023-10-11 英業達股份有限公司 Creating method of a classifying model of a efficiency problem of a hard disk, analyzing method of an efficiency problem of a hard disk and classifying model creating system of the efficiency problem of a hard disk

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101866271A (en) * 2010-06-08 2010-10-20 华中科技大学 Security early warning system and method based on RAID
CN104766014A (en) * 2015-04-30 2015-07-08 安一恒通(北京)科技有限公司 Method and system used for detecting malicious website
US20150205657A1 (en) * 2012-09-28 2015-07-23 Longsand Limited Predicting failure of a storage device
CN105069476A (en) * 2015-08-10 2015-11-18 国网宁夏电力公司 Method for identifying abnormal wind power data based on two-stage integration learning
CN105224888A (en) * 2015-09-29 2016-01-06 上海爱数软件有限公司 A kind of data of magnetic disk array protection system based on safe early warning technology

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140059278A1 (en) * 2011-11-14 2014-02-27 Lsi Corporation Storage device firmware and manufacturing software
CN105260279B (en) * 2015-11-04 2019-01-01 四川效率源信息安全技术股份有限公司 Method and apparatus based on SMART data dynamic diagnosis hard disk failure

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109063247A (en) * 2018-06-26 2018-12-21 西安工程大学 Landslide disaster forecasting procedure based on deepness belief network
CN109063247B (en) * 2018-06-26 2023-04-18 西安工程大学 Landslide disaster forecasting method based on deep belief network
CN108959004B (en) * 2018-06-28 2022-02-18 郑州云海信息技术有限公司 Disk failure prediction method, device, equipment and computer readable storage medium
CN108959004A (en) * 2018-06-28 2018-12-07 郑州云海信息技术有限公司 Disk failure prediction technique, device, equipment and computer readable storage medium
US11748185B2 (en) 2018-06-29 2023-09-05 Microsoft Technology Licensing, Llc Multi-factor cloud service storage device error prediction
WO2020000404A1 (en) * 2018-06-29 2020-01-02 Microsoft Technology Licensing, Llc. Multi-factor cloud service storage device error prediction
CN109117327A (en) * 2018-07-20 2019-01-01 郑州云海信息技术有限公司 A kind of hard disk detection method and device
WO2020078098A1 (en) * 2018-10-17 2020-04-23 阿里巴巴集团控股有限公司 Gradient boosting decision tree-based method and device for model training
CN109472296A (en) * 2018-10-17 2019-03-15 阿里巴巴集团控股有限公司 A kind of model training method and device promoting decision tree based on gradient
US11157818B2 (en) 2018-10-17 2021-10-26 Advanced New Technologies Co., Ltd. Model training method and apparatus based on gradient boosting decision tree
WO2020134442A1 (en) * 2018-12-28 2020-07-02 中兴通讯股份有限公司 Disk failure prediction method and device, and storage medium
CN110399238A (en) * 2019-06-27 2019-11-01 浪潮电子信息产业股份有限公司 A kind of disk failure method for early warning, device, equipment and readable storage medium storing program for executing
CN110399238B (en) * 2019-06-27 2023-09-22 浪潮电子信息产业股份有限公司 Disk fault early warning method, device, equipment and readable storage medium
CN111860858A (en) * 2020-04-15 2020-10-30 北京嘀嘀无限科技发展有限公司 Method and device for determining model updating parameters and readable storage medium
CN111858283A (en) * 2020-07-24 2020-10-30 山东海量信息技术研究院 Hard disk fault preprocessing method for edge data center and related components
CN112180312A (en) * 2020-08-24 2021-01-05 南京航空航天大学 Current sensor composite fault diagnosis method
CN112419086A (en) * 2020-11-18 2021-02-26 贵州电网有限责任公司 Fault studying and judging method based on regulation and control data analysis
CN112596964A (en) * 2020-12-15 2021-04-02 中国建设银行股份有限公司 Disk failure prediction method and device
CN113076217A (en) * 2021-04-21 2021-07-06 扬州万方电子技术有限责任公司 Disk fault prediction method based on domestic platform
CN113076217B (en) * 2021-04-21 2024-04-12 扬州万方科技股份有限公司 Disk fault prediction method based on domestic platform
CN113190405A (en) * 2021-04-29 2021-07-30 山东英信计算机技术有限公司 Node health detection method and device, electronic equipment and storage medium
US20230281094A1 (en) * 2022-03-01 2023-09-07 Inventec (Pudong) Technology Corporation Creating method of classification model about hard disk efficiency problem, analysis method of hard disk efficiency problem and classification model creating system about hard disk efficiency problem
CN115729761A (en) * 2022-11-23 2023-03-03 中国人民解放军陆军装甲兵学院 Hard disk fault prediction method, system, device and medium
CN115729761B (en) * 2022-11-23 2023-10-20 中国人民解放军陆军装甲兵学院 Hard disk fault prediction method, system, equipment and medium
CN117234826A (en) * 2023-11-10 2023-12-15 深圳市领德创科技有限公司 Solid state disk reliability verification interference-free test platform and working method
CN117234826B (en) * 2023-11-10 2024-04-05 深圳市领德创科技有限公司 Solid state disk reliability verification interference-free test platform and working method

Also Published As

Publication number Publication date
TW201732591A (en) 2017-09-16
CN107025154A (en) 2017-08-08
CN107025154B (en) 2020-12-01

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17743645

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17743645

Country of ref document: EP

Kind code of ref document: A1