WO2017129032A1

WO2017129032A1 - Disk failure prediction method and apparatus

Info

Publication number: WO2017129032A1
Application number: PCT/CN2017/071699
Authority: WO
Inventors: 丁永明; 周俊; 崔卿; 瞿神全
Original assignee: 阿里巴巴集团控股有限公司; 丁永明; 周俊; 崔卿; 瞿神全
Priority date: 2016-01-29
Filing date: 2017-01-19
Publication date: 2017-08-03
Also published as: TW201732789A; CN107025153A; CN107025153B

Abstract

Disclosed are a disk failure prediction method and apparatus. The method comprises: acquiring sample disk data of a disk through disk monitoring technology, the sample disk data comprising sample data on a plurality of dimensions; binning the sample disk data by using Bucketing technology, and classifying the sample disk data; performing sample training on the classified sample disk data by using an Owlqn model, to obtain a disk prediction model; and after disk data of a disk to be predicted is received, processing the disk data of the disk to be predicted by using the disk prediction model, and determining whether the disk to be predicted is a faulty disk. The method solves the technical problem in the prior art of an inaccurate prediction result caused by the fact that some factors easily causing hard disk failures cannot be collected or quantized in a hard disk failure prediction system.

Description

Disk failure prediction method and device

Technical field

The present invention relates to the field of magnetic disks, and in particular to a method and apparatus for predicting failure of a magnetic disk.

Background technique

At present, the hard disk is the main medium for storing data, and a hard disk failure can cause enormous data loss. Ensuring hard disk stability is therefore very important. Under normal conditions, the probability of a hard disk error within 24 hours is about one in ten thousand. When a server has ten hard disks, the probability of a hard disk error on that server rises to about one in a thousand. As website business grows, the number of hard disks a server needs increases, and so does the probability of multiple hard disks failing at the same time.
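The arithmetic behind these figures can be checked with a short sketch. It assumes disk failures are independent, which the text does not state; the function name `prob_any_failure` is illustrative.

```python
def prob_any_failure(p_single: float, n_disks: int) -> float:
    """Probability that at least one of n_disks fails in the period.

    Assumes failures are independent (an assumption made for this sketch).
    """
    return 1.0 - (1.0 - p_single) ** n_disks


# One disk: about 1 in 10,000 per 24 hours; ten disks: about 1 in 1,000.
p_server = prob_any_failure(1e-4, 10)
print(f"{p_server:.6f}")
```

For small per-disk probabilities the result is close to `n_disks * p_single`, which is why ten disks give roughly ten times the single-disk risk.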

Data storage usually keeps multiple backups: for example, MySQL uses primary and standby databases, and GFS files default to 3 replicas. On a large data storage platform, if multiple hard disks fail at the same time, there is a high probability that these hard disks hold copies of the same file, in which case some files will be lost. Most online services depend on the huge amount of data stored on servers, so a hard disk failure can make such a service behave abnormally or even suspend it.

For these reasons, a system is needed that can indicate in advance which hard disks will fail. Data loss has many causes, the most common being external vibration, temperature and humidity, damage to electrical components, sound, and dust. Some of these factors can be collected, such as temperature, humidity, and some component data, but many cannot be collected or quantified, which leads to inaccurate prediction results.

No effective solution has yet been proposed for the problem that, in prior-art hard disk failure prediction systems, some factors that easily cause hard disk failure cannot be collected or quantified, which makes the prediction results inaccurate.

Summary of the invention

The embodiments of the present invention provide a disk failure prediction method and apparatus, so as to at least solve the technical problem that, in prior-art hard disk failure prediction systems, prediction results are inaccurate because some factors that easily cause hard disk failure cannot be collected or quantified.

According to one aspect of the embodiments of the present invention, a disk failure prediction method is provided, including: acquiring sample disk data of a disk by using a disk monitoring technology, where the sample disk data includes sample data in multiple dimensions; binning the sample disk data by using the Bucketing technology to classify the sample disk data; performing sample training on the classified sample disk data by using the Owlqn model to obtain a disk prediction model; and, after receiving disk data of a disk to be tested, processing the disk data of the disk to be tested by using the disk prediction model to determine whether the disk to be tested is a faulty disk.

According to another aspect of the present invention, a disk failure prediction apparatus is provided, including: an acquisition module, configured to acquire sample disk data of a disk by using a disk monitoring technology, where the sample disk data includes sample data in multiple dimensions; a classification module, configured to bin the sample disk data by using the Bucketing technology to classify the sample disk data; a training module, configured to perform sample training on the classified sample disk data by using the Owlqn model to obtain a disk prediction model; and a determination module, configured to, after receiving disk data of a disk to be tested, process the disk data of the disk to be tested by using the disk prediction model to determine whether the disk to be tested is a faulty disk.

In the embodiments of the present invention, sample disk data of a disk is acquired by using a disk monitoring technology, where the sample disk data includes sample data in multiple dimensions; the sample disk data is binned by using the Bucketing technology to classify it; and the Owlqn model is used to perform sample training on the classified sample disk data to obtain a disk prediction model. After disk data of a disk to be tested is received, it is processed by using the disk prediction model to determine whether the disk to be tested is a faulty disk. This achieves the technical effect of predicting disk failure, and thereby solves the technical problem that, in prior-art hard disk failure prediction systems, prediction results are inaccurate because some factors that easily cause hard disk failure cannot be collected or quantified.

Drawings

The drawings described herein are intended to provide a further understanding of the invention and form a part of the present application. In the drawings:

FIG. 1 is a block diagram showing the hardware structure of a computer terminal for a disk failure prediction method according to an embodiment of the present invention;

FIG. 2 is a flowchart of a disk failure prediction method according to Embodiment 1 of the present invention;

FIG. 3 is a flowchart of an optional disk failure prediction method according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a disk failure prediction apparatus according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of an optional disk failure prediction apparatus according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of an optional disk failure prediction apparatus according to an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of an optional disk failure prediction apparatus according to an embodiment of the present invention;

FIG. 8 is a schematic structural diagram of an optional disk failure prediction apparatus according to an embodiment of the present invention;

FIG. 9 is a structural block diagram of a computer terminal according to an embodiment of the present invention.

Detailed description

To help those skilled in the art better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below. The described embodiments are obviously only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.

It is to be understood that the terms "first", "second", and the like in the specification and claims of the present invention are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. The data so used may be interchanged where appropriate, so that the embodiments of the invention described herein can be implemented in an order other than those illustrated or described herein. In addition, the terms "comprise" and "have" and any variations thereof are intended to cover a non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units explicitly listed, and may include other steps or units that are not explicitly listed or that are inherent to such a process, method, product, or device.

Example 1

According to an embodiment of the present invention, an embodiment of a disk failure prediction method is provided. It is to be noted that the steps illustrated in the flowcharts of the accompanying drawings may be performed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one described herein.

The method embodiment provided in Embodiment 1 of the present application can be executed in a mobile terminal, a computer terminal, or the like. Taking a computer terminal as an example, FIG. 1 is a block diagram of the hardware structure of a computer terminal for a disk failure prediction method according to an embodiment of the present invention. As shown in FIG. 1, the computer terminal 10 may include one or more processors 102 (only one is shown; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission device 106 for communication functions. Those skilled in the art will understand that the structure shown in FIG. 1 is merely illustrative and does not limit the structure of the above electronic device. For example, the computer terminal 10 may include more or fewer components than those shown in FIG. 1, or have a configuration different from that shown in FIG. 1.

The memory 104 can be used to store software programs and modules of application software, such as the program instructions/modules corresponding to the disk failure prediction method in the embodiments of the present invention. The processor 102 runs the software programs and modules stored in the memory 104 to execute various functional applications and data processing, that is, to implement the above disk failure prediction method. The memory 104 may include high-speed random access memory, and may also include non-volatile memory such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the computer terminal 10 via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network may include a wireless network provided by the communication provider of the computer terminal 10. In one example, the transmission device 106 includes a Network Interface Controller (NIC) that can be connected to other network devices through a base station to communicate with the Internet. In one example, the transmission device 106 can be a Radio Frequency (RF) module for communicating with the Internet wirelessly.

In the above operating environment, the present application provides a disk failure prediction method as shown in FIG. 2. FIG. 2 is a flowchart of a disk failure prediction method according to Embodiment 1 of the present invention.

Step S21: Obtain sample disk data of the disk by using a disk monitoring technology, where the sample disk data includes sample data in multiple dimensions.

In the above steps, disk monitoring technology is used to monitor and record the disk status.

In an alternative embodiment, the sample disk data may be data throughput performance of the sample disk, motor startup time, seek error rate, and the like.

It should be noted here that when an existing technology (such as SMART, Self-Monitoring, Analysis and Reporting Technology) is used to monitor a disk, multi-dimensional data reflecting the state of the disk can be obtained, and the monitored data can be analyzed to determine whether the disk has failed or will fail in the near future. Such analysis relies only on the data monitored by the disk monitoring technology, but the state of the disk may also be reflected by other quantities, which may be impossible to detect or to quantify. The present application therefore establishes a disk prediction model and uses it to analyze the fault state of a disk, where the disk prediction model is obtained by training on sample disk data with the Owlqn model. By training on the sample disk data, the solution of the foregoing embodiment builds a fault prediction model, so that after the data of a disk to be tested is input, the fault state of that disk can be analyzed according to the model. This avoids the situation in which analyzing only a single item or a fixed set of sample data lets disk data that cannot be collected or quantified distort the disk failure prediction result.

Step S23: The sample disk data is binned by using the Bucketing technology to classify the sample disk data.

In the above step, multiple binning methods can be used to smooth the data when the sample disk data is binned. These methods include smoothing by the mean of the data in a bin, smoothing by the median of the data in a bin, and smoothing by the boundary values of the data in a bin.

In an optional embodiment, the sample data in the sample disk data set may be divided into multiple bins. In this example, the sample disk data is divided into 5 bins. To do so, the sample disk data is sorted in ascending order, the amount of data that each bin should hold is calculated, and the data is divided into 5 bins accordingly. The data in each bin is then processed. This embodiment smooths the data by the bin mean: the mean of the data in each bin is calculated, and every value in that bin is replaced by the mean.

It should be explained here that the sample disk data is binned in order to smooth the data in each bin. Because the data within a bin is similar, binning makes the data stable and smooth without affecting the result of training on the sample disk data in the next step.

It should be noted that the methods for binning the sample disk data include, but are not limited to, any of those in the foregoing embodiments; any method that makes the data smooth or stable can be used to bin the sample disk data.
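The five-bin mean smoothing described above can be sketched as follows. This is a minimal illustration, assuming equal-frequency bins (each bin holds roughly the same number of values); the function name `smooth_by_bin_means` and the sample values are hypothetical.

```python
from statistics import mean


def smooth_by_bin_means(values, n_bins=5):
    """Sort values, split them into n_bins equal-frequency bins,
    and replace every value with the mean of its bin.

    The result is returned in ascending (sorted) order.
    """
    ordered = sorted(values)
    size, extra = divmod(len(ordered), n_bins)
    smoothed, start = [], 0
    for b in range(n_bins):
        # The first `extra` bins take one additional value each.
        end = start + size + (1 if b < extra else 0)
        chunk = ordered[start:end]
        if chunk:
            smoothed.extend([mean(chunk)] * len(chunk))
        start = end
    return smoothed


readings = [21, 4, 40, 8, 25, 15, 34, 21, 28, 24]
print(smooth_by_bin_means(readings))
# [6, 6, 18, 18, 22.5, 22.5, 26.5, 26.5, 37, 37]
```

Smoothing by bin median or by bin boundaries would follow the same structure with a different replacement value per bin.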

In step S25, the Owlqn model is used to perform sample training on the classified sample disk data to obtain a disk prediction model.

In the above step, training on the sample disk data means inputting the processed sample disk data into the Owlqn model. The sample disk data consists of samples whose true value is known in advance. The true value of a sample may be 1 or 0, indicating a positive or a negative sample respectively: a positive sample is a faulty disk, and a negative sample is a normal disk.

In an optional embodiment, each input sample yields a corresponding output value from the Owlqn model. After the output value of every sample in the sample disk data set is obtained, the output values of all positive samples form the positive-sample output interval and the output values of all negative samples form the negative-sample output interval, which gives the disk prediction model.
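As a rough illustration of how the two output intervals might be formed, the sketch below takes the minimum and maximum output per class. The function name `output_intervals` and the numbers are hypothetical, and treating an interval as a simple min/max range is an assumption.

```python
def output_intervals(outputs, labels):
    """Return ((pos_min, pos_max), (neg_min, neg_max)) from model outputs.

    labels: 1 for a faulty (positive) sample, 0 for a normal (negative) one.
    """
    pos = [o for o, y in zip(outputs, labels) if y == 1]
    neg = [o for o, y in zip(outputs, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both positive and negative samples are required")
    return (min(pos), max(pos)), (min(neg), max(neg))


pos_range, neg_range = output_intervals([0.9, 0.8, 0.2, 0.1, 0.3], [1, 1, 0, 0, 0])
print(pos_range, neg_range)  # (0.8, 0.9) (0.1, 0.3)
```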

Step S27: After receiving the disk data of the disk to be tested, use the disk prediction model to process the disk data of the disk to be tested, and determine whether the disk to be tested is a faulty disk.

It should be further explained that the Owlqn model is trained on the classified sample disk data, that is, on sample data that has been binned. Binning discretizes the sample data in each class so that the sample disk data can be trained.

In an optional embodiment, the sample disk data may include: the raw read error rate, start/stop count, reallocated sector count, accumulated power-on time, spindle spin retry count, disk calibration retry count, disk power-on count, temperature, and write error rate. Sample disk data can be obtained based on historical disk failure conditions. For example, samples can be collected at a ratio of 1:5 between positive and negative samples, where a positive sample is a faulty disk and a negative sample is a disk without faults.

It should be noted that the disks used by different organizations that predict disk failure are not necessarily the same, and environmental factors such as temperature and humidity differ between organizations and affect their disks differently, so the proportion of faulty disks differs from one organization to another. To provide more reliable sample disk data for training, the sample disk data can also be collected according to the actual disk failure situation of the organization.

Therefore, the technical problem that, in prior-art hard disk failure prediction systems, prediction results are inaccurate because some factors that easily cause hard disk failure cannot be collected or quantified is solved.
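One way to collect samples at a 1:5 positive-to-negative ratio is sketched below. The function name `sample_at_ratio`, the fixed random seed, and the disk identifiers are hypothetical; the source specifies only the ratio.

```python
import random


def sample_at_ratio(failed, healthy, neg_per_pos=5, seed=0):
    """Keep every failed disk and draw neg_per_pos healthy disks per failed disk.

    Returns a list of (disk, label) pairs, label 1 = faulty, 0 = normal.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    n_neg = min(len(healthy), neg_per_pos * len(failed))
    negatives = rng.sample(healthy, n_neg)
    return [(d, 1) for d in failed] + [(d, 0) for d in negatives]


failed_disks = ["f1", "f2"]
healthy_disks = [f"h{i}" for i in range(50)]
samples = sample_at_ratio(failed_disks, healthy_disks)
print(len(samples))  # 12: 2 positive + 10 negative
```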

According to the above embodiment of the present application, in a preferred solution, the sample disk data is SMART disk data, where the sample disk data includes at least sample data in the following four dimensions: original value, standard value, worst value, and cumulative value.

The original value is the current value of each parameter while the disk is running; the standard value is the value of each parameter when a normal disk is running; the worst value is the value of each detected parameter that has deviated furthest from the normal value while the disk has been running; the cumulative value is the accumulated result of each detected parameter from the time the disk was first used until the current time.

In an optional embodiment, the parameters of the disk may be information describing various attributes of the disk, and may include one or more of the read error rate, power-on count, reallocated sector count, spin retry count, disk calibration retry count, and parity error rate, and may also include other attribute information of the disk.

In an optional embodiment, the sample disk data can be obtained by using software such as HDTune or CrystalDiskInfo.

According to the above embodiment of the present application, in a preferred solution, after the step S21 acquires the sample disk data of the disk by using the disk monitoring technology, the method further includes:

Step S211, performing any one or more of the following operations on the sample data in each dimension: a difference operation, a square operation, and a distribution sum operation, so that the sample data in any one dimension is expanded into sample data in new dimensions.

In an optional embodiment, a difference operation, a square operation, and a distribution sum operation are performed on the original value in the sample data, yielding the difference value, the square value, and the distribution sum value of the original value. Sample disk data in new dimensions is thus derived from the original value of the sample disk data. The same operations can be applied to the standard value, the worst value, and the cumulative value in the sample data to obtain sample disk data in still more dimensions.

It should be noted that performing multiple operations on the sample disk data to obtain sample disk data in more dimensions improves the utilization of the sample disk data and, when the sample disk data is trained, increases sensitivity to the sample disk data, which improves the accuracy of the fault prediction model.
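The dimension expansion can be illustrated with a short sketch over one attribute's time series. The source does not define the "distribution sum" operation; the sketch assumes it is a running (cumulative) sum, which is only one possible reading. The function name `expand_dimension` and the dictionary keys are hypothetical.

```python
from itertools import accumulate


def expand_dimension(series):
    """Derive new dimensions from one dimension's time series.

    "cumsum" is an ASSUMED interpretation of the distribution sum operation.
    """
    series = list(series)
    # Difference with the previous reading; the first reading has no
    # predecessor, so its difference is set to 0.
    diffs = [0] * min(1, len(series)) + [b - a for a, b in zip(series, series[1:])]
    return {
        "original": series,
        "diff": diffs,
        "square": [v * v for v in series],
        "cumsum": list(accumulate(series)),
    }


print(expand_dimension([1, 3, 6]))
# {'original': [1, 3, 6], 'diff': [0, 2, 3], 'square': [1, 9, 36], 'cumsum': [1, 4, 10]}
```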

According to the above embodiment of the present application, in a preferred solution, step S23, in which the sample disk data is binned by using the Bucketing technology to classify the sample disk data, includes:

In step S231, the value range of each bin divided in advance and the ID value corresponding to each bin are determined.

In the above step, the value range of each bin is determined in order to find the bin corresponding to each item in the sample disk data set: the bin whose range contains an item of sample disk data is the bin to which that item belongs. The ID value of each bin is determined in order to distinguish the bins from one another.

Step S233, classifying the sample disk data by discretizing the sample data in each dimension to a corresponding bin, and obtaining an ID value corresponding to the sample data in each dimension.

In an optional embodiment, after the sample disk data is allocated to different bins, the data assigned to a bin is replaced by the ID number of that bin. That is, the sample disk data in each dimension is replaced with the corresponding bin ID value, so that the data in each dimension of the original sample disk data is replaced with an integer value.

In another optional embodiment, suppose five bins with different value ranges are set, with ID values 1, 2, 3, 4, and 5, and each bin holds different data. When sample disk data A falls within the value ranges of bin 1 and bin 3, the ID value of sample disk data A may be 10100. With the scheme in the above embodiment, the ID value corresponding to the sample data in each dimension can be obtained.
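Replacing a value with the ID of its bin can be sketched with a sorted list of bin boundaries. The boundary values, dimension names, and function names below are hypothetical; the source states only that bin ranges and IDs are fixed in advance.

```python
from bisect import bisect_right


def bin_id(value, edges):
    """Return the 1-based ID of the bin containing value.

    edges are the sorted inner boundaries: n edges define n + 1 bins,
    and a value equal to an edge goes to the higher bin.
    """
    return bisect_right(edges, value) + 1


def discretize_sample(sample, edges_by_dim):
    """Replace each dimension's value with its integer bin ID."""
    return {dim: bin_id(v, edges_by_dim[dim]) for dim, v in sample.items()}


edges = {
    "temperature": [30, 40, 50, 60],          # 5 bins
    "reallocated_sectors": [1, 10, 100, 1000],  # 5 bins
}
print(discretize_sample({"temperature": 35, "reallocated_sectors": 0}, edges))
# {'temperature': 2, 'reallocated_sectors': 1}
```

The same `edges` must be reused when a disk to be tested is discretized, so that training and prediction share one set of bins.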
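The ID value 10100 in this example reads naturally as one indicator digit per bin. A minimal sketch of that reading, with the hypothetical helper `indicator_code`:

```python
def indicator_code(hit_bins, n_bins=5):
    """Build a string with one digit per bin: '1' if the data falls in
    that bin, '0' otherwise. Bins are numbered from 1."""
    return "".join("1" if b in hit_bins else "0" for b in range(1, n_bins + 1))


print(indicator_code({1, 3}))  # 10100
```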

According to the above embodiment of the present application, in a preferred solution, in step S25, the Owlqn model is used to perform sample training on the classified sample disk data to obtain a disk prediction model, including:

Step S251, the Owlqn model trains the ID values corresponding to the sample data in each dimension to obtain the weight values of the sample data in each dimension.

In the above steps, the weight value of the sample data in each dimension is the probability that the sample is "1", that is, the probability that the sample is a positive sample.

In an optional example, the disk data to be tested is represented as [expression not reproduced in the source text], where $y_i$ is 0 or 1. After obtaining the sample data for training, the Owlqn model outputs the weight value of each item of disk characteristic data, that is, the probability that each item of disk characteristic data is faulty disk data. The weight value can be calculated by the following formula: [formula not reproduced in the source text]

Here $i$ denotes the $i$-th sample, $n$ denotes the number of dimensions, $k$ denotes any dimension between 1 and $n$, $w_k$ denotes the weight value in the $k$-th dimension, and $w_0$ is the intercept. Note that the output weight values must satisfy the following condition: [condition not reproduced in the source text]

that is, the optimization objective function $J$ attains its minimum value.
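The formulas themselves are not reproduced in this text. For orientation only, the block below gives one form consistent with the symbols defined above. It assumes a logistic regression model with an L1 penalty, which is the kind of objective the OWL-QN (Orthant-Wise Limited-memory Quasi-Newton) optimizer is designed for. The feature symbol $x_{ik}$, the probability $p_i$, and the regularization strength $\lambda$ are assumptions, not taken from the source.

```latex
% Assumed sample representation: (x_i, y_i), x_i = (x_{i1}, \dots, x_{in}), y_i \in \{0, 1\}
p_i = P(y_i = 1 \mid x_i)
    = \frac{1}{1 + \exp\!\left(-\left(w_0 + \sum_{k=1}^{n} w_k x_{ik}\right)\right)}

% Assumed objective: negative log-likelihood plus an L1 penalty on the weights
J(w) = -\sum_{i} \Bigl[\, y_i \log p_i + (1 - y_i) \log (1 - p_i) \Bigr]
       + \lambda \sum_{k=1}^{n} \lvert w_k \rvert,
\qquad
w^{*} = \arg\min_{w} J(w)
```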

Step S253, determining a disk prediction model according to the sample data and the corresponding weight value in each dimension, wherein the disk prediction model includes the prediction result of the sample data in each dimension.

In an optional embodiment, after the disk data of the disk to be tested is obtained, a predicted value of the disk to be tested is calculated according to the following formula: [formula not reproduced in the source text]

The above predicted value is the prediction result obtained from training on the sample disk data. Because it is already known whether each sample disk is a faulty disk, once the prediction results are obtained, the prediction results of the positive sample disks can be separated from those of the negative sample disks, giving the range of predicted values for faulty disks and the range of predicted values for normal disks.

In an optional embodiment, the ID value corresponding to the sample data is input to the Owlqn model together with the fault state of the sample disk corresponding to that ID value, so that the Owlqn model stores the ID value and the corresponding disk fault state. The ID value is then input to the Owlqn model again to verify whether the Owlqn model outputs the fault state corresponding to that ID value.
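The train-then-verify loop can be sketched in pure Python. The sketch assumes an L1-regularized logistic regression model and, for brevity, substitutes plain proximal gradient descent for the OWL-QN optimizer; it is not the patent's implementation. All names, hyperparameters, and the toy bin-ID data are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal step for the L1 penalty: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def train_l1_logistic(X, y, lam=0.01, lr=0.1, epochs=2000):
    """Fit intercept w0 and weights w by proximal gradient descent.

    Stand-in for OWL-QN: the assumed objective is the same, the optimizer differs.
    """
    n, m = len(X[0]), len(X)
    w0, w = 0.0, [0.0] * n
    for _ in range(epochs):
        g0, g = 0.0, [0.0] * n
        for xi, yi in zip(X, y):
            err = predict(w0, w, xi) - yi
            g0 += err
            for k in range(n):
                g[k] += err * xi[k]
        w0 -= lr * g0 / m  # the intercept is not penalized
        w = [soft_threshold(wk - lr * gk / m, lr * lam) for wk, gk in zip(w, g)]
    return w0, w


def predict(w0, w, x):
    """Predicted probability that a sample is a faulty disk."""
    return sigmoid(w0 + sum(wk * xk for wk, xk in zip(w, x)))


# Toy data: each row holds bin IDs (1..5) for two dimensions; 1 = faulty disk.
X = [[1, 1], [1, 2], [2, 1], [4, 5], [5, 4], [5, 5]]
y = [0, 0, 0, 1, 1, 1]
w0, w = train_l1_logistic(X, y)

# Verification: feed the training inputs back and compare with known states.
recovered = [int(predict(w0, w, x) > 0.5) for x in X]
print(recovered == y)
```

Feeding the training inputs back only confirms that the model reproduces states it has already seen; it says nothing about accuracy on unseen disks.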

According to the above embodiment of the present application, in a preferred embodiment, the prediction result of the sample data in each dimension is a predicted value obtained by classifying the sample disk data.

According to the above embodiment of the present application, in a preferred solution, step S27, in which the disk data of the disk to be tested is processed by using the disk prediction model after it is received to determine whether the disk to be tested is a faulty disk, includes:

Step S271: After receiving the disk data of the disk to be tested, the disk data of the disk to be tested is discretized to a corresponding bin, and the ID value corresponding to the disk data of the disk to be tested is obtained.

In the above step, discretizing the disk data of the disk to be tested to the corresponding bin and obtaining the ID value corresponding to that disk data can be implemented by using the solution in steps S231 to S233 of the above embodiment.

Step S273: Determine a weight value of the disk data of the disk to be tested according to the ID value corresponding to the disk data of the disk to be tested.

In an optional example, the disk data to be tested is represented as [expression not reproduced in the source text], where $y_i$ is 0 or 1. After the Owlqn model obtains the sample data for training, it outputs the weight value of each item of disk characteristic data, that is, the probability that each item of disk characteristic data is faulty disk data.

Step S275: Determine, according to the weight value of the disk data of the disk to be tested, whether the disk to be tested is a failed disk from the disk prediction model.

After the predicted value of the disk to be tested is obtained, it is compared with the value range of the positive samples and the value range of the negative samples obtained by training on the sample disk data. If the predicted value of the disk to be tested falls within the value range of the positive samples, the disk to be tested is considered a faulty disk. If the predicted value falls within the value range of the negative samples, the disk to be tested can be considered a normal disk.

It should be noted that, for simplicity of description, the foregoing method embodiments are each expressed as a series of combined actions, but those skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present invention certain steps may be performed in other orders or concurrently. In addition, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.

Through the description of the above embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions that cause a terminal device (which may be a mobile phone, computer, server, network device, or the like) to perform the methods described in the embodiments of the present invention.
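The comparison against the two value ranges can be sketched as follows. The source does not say what happens when a predicted value falls in neither range or in both, so the "undetermined" outcome is an assumption; the function name `classify` and the ranges are hypothetical.

```python
def classify(pred, pos_range, neg_range):
    """Compare a predicted value with the positive and negative sample ranges.

    Ranges are (low, high) tuples, inclusive at both ends.
    """
    in_pos = pos_range[0] <= pred <= pos_range[1]
    in_neg = neg_range[0] <= pred <= neg_range[1]
    if in_pos and not in_neg:
        return "faulty"
    if in_neg and not in_pos:
        return "normal"
    return "undetermined"  # assumed fallback; not specified in the source


pos_range, neg_range = (0.7, 0.95), (0.05, 0.3)
print(classify(0.85, pos_range, neg_range))  # faulty
print(classify(0.10, pos_range, neg_range))  # normal
print(classify(0.50, pos_range, neg_range))  # undetermined
```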

As shown in FIG. 3, a method for predicting a fault of a magnetic disk is provided. The method may include the following steps S31 to S37:

S31. Obtain sample data of the sample disk.

In the above steps, the sample data of the sample disk may be SMART disk data. Specifically, in the above steps, the sample disk data can be obtained by using software such as HDTune or CrystalDiskInfo.

S32, performing differential operations on the sample data.

Specifically, in the above step, the difference operation computes the difference between the feature data of the disk at a given time and the feature data of the same disk 24 hours earlier.

S33, performing a distribution summation and/or a square operation on the result obtained by the difference operation.

In the above steps, any one or more of the following operations are performed on the sample data in each dimension: a difference operation, a square operation, and a distribution sum operation, so that the sample data in any one dimension is expanded into sample data in new dimensions.
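A minimal sketch of the 24-hour difference, assuming readings for one attribute are stored in a dictionary keyed by timestamp. The function name `diff_24h` and the sample readings are hypothetical.

```python
from datetime import datetime, timedelta


def diff_24h(readings):
    """Map each timestamp t to value(t) - value(t - 24 h).

    Timestamps with no reading exactly 24 hours earlier are skipped.
    """
    day = timedelta(hours=24)
    return {t: v - readings[t - day] for t, v in readings.items() if t - day in readings}


readings = {
    datetime(2016, 1, 1): 100,
    datetime(2016, 1, 2): 103,
    datetime(2016, 1, 3): 110,
}
print(diff_24h(readings))
```

The first reading has no predecessor, so the result contains two entries, with differences 3 and 7.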

S34, obtaining data for training and prediction use.

S35, using a bin to discretize.

In the above step, the value range of each pre-divided bin is used to determine the bin corresponding to each item in the sample disk data set: the bin whose range contains an item of sample disk data is the bin to which that item belongs. The ID value of each bin is determined in order to distinguish the bins, and the data in each bin is discretized.

S36, training through the Owlqn model.

In the above step, the sample data of the sample disk is trained by the Owlqn model to obtain the disk prediction model.

S37, the predicted result of the disk is obtained.

In the above step, the disk prediction model constructed in the preceding steps is used to predict the disk to be tested. After the predicted value is obtained, it is compared with the predicted value ranges in the model to obtain the prediction result for the disk to be tested.

Example 2

According to an embodiment of the present invention, there is also provided an apparatus for implementing the above-described failure prediction method for a magnetic disk. As shown in FIG. 4, the apparatus includes: an acquisition module 40, a classification module 42, a training module 44, and a determination module 46.

The acquisition module 40 is configured to obtain sample disk data of the disk by using a disk monitoring technology, where the sample disk data includes sample data in multiple dimensions. The classification module 42 is configured to perform binning processing on the sample disk data by using a bucketing technology, so as to classify the sample disk data. The training module 44 is configured to perform sample training on the classified sample disk data by using the Owlqn model to obtain a disk prediction model. The determination module 46 is configured to, after receiving the disk data of the disk to be tested, process that disk data by using the disk prediction model to determine whether the disk to be tested is a failed disk.

It should be noted that the above modules correspond to steps S21 to S27 of the first embodiment, and the examples and application scenarios they implement are the same as those of the corresponding steps, but are not limited to the content disclosed in the first embodiment. It should be noted that the above modules can operate as part of the device in the computer terminal 10 provided in the first embodiment.

According to the above embodiment of the present application, in a preferred solution, as shown in FIG. 5, the device further includes:

The operation module 50 is configured to perform any one or more of the following operations on the sample data in each dimension: a difference operation, a square operation, and a distribution sum operation, so that the sample data in any one dimension is expanded into sample data in new dimensions.

It should be noted that the operation module 50 corresponds to step S211 of the first embodiment, and the examples and application scenarios it implements are the same as those of that step, but are not limited to the content disclosed in the first embodiment. It should be noted that the above module can operate as part of the device in the computer terminal 10 provided in the first embodiment.

According to the above embodiment of the present application, in a preferred solution, as shown in FIG. 6, the classification module 42 includes:

a first determining sub-module 60, configured to determine the value range of each pre-divided bin and the ID value corresponding to each bin; and a classification sub-module 62, configured to discretize the sample data in each dimension into the corresponding bins so as to classify the sample disk data and obtain the ID value corresponding to the sample data in each dimension.

It should be noted that the first determining sub-module 60 and the classification sub-module 62 correspond to step S231 and step S233 of the first embodiment, and the examples and application scenarios they implement are the same as those of the corresponding steps, but are not limited to the content disclosed in the first embodiment. It should be noted that the above modules can operate as part of the device in the computer terminal 10 provided in the first embodiment.

According to the above embodiment of the present application, in a preferred solution, as shown in FIG. 7, the training module 44 includes:

The training sub-module 70 is configured to train, by using the Owlqn model, on the ID values corresponding to the sample data in each dimension to obtain the weight value of the sample data in each dimension; and the second determining sub-module 72 is configured to determine the disk prediction model according to the sample data in each dimension and the corresponding weight values, wherein the disk prediction model includes prediction results of the sample data in each dimension.

It should be noted that the training sub-module 70 and the second determining sub-module 72 correspond to step S251 and step S253 of the first embodiment, and the examples and application scenarios they implement are the same as those of the corresponding steps, but are not limited to the content disclosed in the first embodiment. It should be noted that the above modules can operate as part of the device in the computer terminal 10 provided in the first embodiment.

According to the above embodiment of the present application, in a preferred solution, as shown in FIG. 8, the determining module 46 further includes:

The discretization module 80 is configured to discretize the disk data of the disk to be tested into the corresponding bins and obtain the ID value corresponding to the disk data of the disk to be tested; the third determining sub-module 82 is configured to determine, according to the ID value corresponding to the disk data of the disk to be tested, the weight value of that disk data; and the fourth determining sub-module 84 is configured to determine from the disk prediction model, according to the weight value of the disk data of the disk to be tested, whether the disk to be tested is a failed disk.

It should be noted that the discretization module 80, the third determining sub-module 82, and the fourth determining sub-module 84 correspond to steps S271 to S275 of the first embodiment, and the examples and application scenarios they implement are the same as those of the corresponding steps, but are not limited to the content disclosed in the first embodiment. It should be noted that the above modules can operate as part of the device in the computer terminal 10 provided in the first embodiment.

Example 3

Embodiments of the present invention may provide a computer terminal, which may be any computer terminal in a group of computer terminals. Optionally, in this embodiment, the foregoing computer terminal may also be replaced with a terminal device such as a mobile terminal.

Optionally, in this embodiment, the computer terminal may be located in at least one network device of the plurality of network devices of the computer network.

In this embodiment, the computer terminal may execute the program code of the following steps in the method for predicting disk failure: acquiring sample disk data of the disk by using a disk monitoring technology, wherein the sample disk data includes sample data in multiple dimensions; performing binning processing on the sample disk data by using the Bucketing technology to classify the sample disk data; performing sample training on the classified sample disk data by using the Owlqn model to obtain a disk prediction model; and, after receiving the disk data of the disk to be tested, processing the disk data of the disk to be tested by using the disk prediction model to determine whether the disk to be tested is a failed disk.

Optionally, FIG. 9 is a structural block diagram of a computer terminal according to an embodiment of the present invention. As shown in FIG. 9, the computer terminal A may include one or more (only one shown in the figure) processor 91, memory 93, and transmission device 95.

The memory can be used to store software programs and modules, such as the program instructions/modules corresponding to the disk failure prediction method and apparatus in the embodiments of the present invention. The processor executes various functional applications and data processing by running the software programs and modules stored in the memory, that is, it implements the above-described method for predicting disk failure. The memory may include a high speed random access memory, and may also include non-volatile memory such as one or more magnetic storage devices, flash memory, or other non-volatile solid state memory. In some examples, the memory can further include memory remotely located relative to the processor, which can be connected to terminal A via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The processor may call, through the transmission device, the information and application programs stored in the memory to perform the following steps: acquiring sample disk data of the disk by using a disk monitoring technology, where the sample disk data includes sample data in multiple dimensions; performing binning processing on the sample disk data by using the Bucketing technology to classify the sample disk data; performing sample training on the classified sample disk data by using the Owlqn model to obtain a disk prediction model; and, after receiving the disk data of the disk to be tested, processing the disk data of the disk to be tested by using the disk prediction model to determine whether the disk to be tested is a failed disk.

Optionally, the foregoing processor may further execute program code of the following step: the sample disk data is SMART disk data, wherein the sample disk data includes at least sample data in the following four dimensions: original value, standard value, worst value, and cumulative value.

Optionally, the foregoing processor may further execute program code of the following step: performing any one or more of the following operations on the sample data in each dimension: a difference operation, a square operation, and a distribution sum operation, so that the sample data in any one dimension is expanded into sample data in new dimensions.

Optionally, the foregoing processor may further execute program code of the following steps: determining the value range of each pre-divided bin and the ID value corresponding to each bin; and discretizing the sample data in each dimension into the corresponding bins so as to classify the sample disk data and obtain the ID value corresponding to the sample data in each dimension.

Optionally, the foregoing processor may further execute program code of the following steps: training, by using the Owlqn model, on the ID values corresponding to the sample data in each dimension to obtain the weight value of the sample data in each dimension; and determining the disk prediction model according to the sample data in each dimension and the corresponding weight values, wherein the disk prediction model includes prediction results of the sample data in each dimension.

Optionally, the foregoing processor may further execute program code of the following steps: the prediction result of the sample data in each dimension is a predicted value obtained by classifying the sample disk data.

Optionally, the foregoing processor may further execute program code of the following steps: after receiving the disk data of the disk to be tested, discretizing the disk data of the disk to be tested into the corresponding bins and obtaining the ID value corresponding to the disk data of the disk to be tested; determining the weight value of the disk data of the disk to be tested according to that ID value; and determining from the disk prediction model, according to the weight value of the disk data of the disk to be tested, whether the disk to be tested is a failed disk.

In the embodiments of the present invention, sample disk data of the disk is obtained by using a disk monitoring technology, wherein the sample disk data includes sample data in multiple dimensions; the sample disk data is binned by using the Bucketing technology to classify the sample disk data; the Owlqn model is used to perform sample training on the classified sample disk data to obtain a disk prediction model; and after the disk data of the disk to be tested is received, that disk data is processed by using the disk prediction model to determine whether the disk to be tested is a failed disk. This achieves the technical effect of predicting disk failures, and thereby solves the technical problem in the prior art that prediction results are inaccurate because some factors that easily cause hard disk failures cannot be collected or quantified in a hard disk failure prediction system.

A person skilled in the art can understand that the structure shown in FIG. 9 is only an illustration, and the computer terminal can also be a terminal device such as a smart phone (such as an Android mobile phone or an iOS mobile phone), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), or a PAD. FIG. 9 does not limit the structure of the above electronic device. For example, computer terminal A may also include more or fewer components (such as a network interface, display device, etc.) than shown in FIG. 9, or have a different configuration than that shown in FIG. 9.

A person of ordinary skill in the art may understand that all or part of the steps of the methods in the foregoing embodiments may be completed by a program instructing hardware related to the terminal device, and the program may be stored in a computer readable storage medium. The storage medium may include a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and the like.

Example 4

Embodiments of the present invention also provide a storage medium. Optionally, in the embodiment, the storage medium may be used to save the program code executed by the fault prediction method of the disk provided in the first embodiment.

Optionally, in this embodiment, the foregoing storage medium may be located in any one of the computer terminal groups in the computer network, or in any one of the mobile terminal groups.

Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: acquiring sample disk data of the disk by using a disk monitoring technology, wherein the sample disk data includes sample data in multiple dimensions; performing binning processing on the sample disk data by using the Bucketing technology to classify the sample disk data; performing sample training on the classified sample disk data by using the Owlqn model to obtain a disk prediction model; and, after receiving the disk data of the disk to be tested, processing the disk data of the disk to be tested by using the disk prediction model to determine whether the disk to be tested is a failed disk.

Optionally, the storage medium is further configured to store program code for performing the following step: the sample disk data is SMART disk data, wherein the sample disk data includes at least sample data in the following four dimensions: original value, standard value, worst value, and cumulative value.

Optionally, the storage medium is further configured to store program code for performing the following step: performing any one or more of the following operations on the sample data in each dimension: a difference operation, a square operation, and a distribution sum operation, so that the sample data in any one dimension is expanded into sample data in new dimensions.

Optionally, the foregoing storage medium is further configured to store program code for performing the following steps: determining the value range of each pre-divided bin and the ID value corresponding to each bin; and discretizing the sample data in each dimension into the corresponding bins so as to classify the sample disk data and obtain the ID value corresponding to the sample data in each dimension.

Optionally, the storage medium is further configured to store program code for performing the following steps: training, by using the Owlqn model, on the ID values corresponding to the sample data in each dimension to obtain the weight value of the sample data in each dimension; and determining the disk prediction model according to the sample data in each dimension and the corresponding weight values, wherein the disk prediction model includes prediction results of the sample data in each dimension.

Optionally, the storage medium is further configured to store program code for performing the following steps: the prediction result of the sample data on each dimension is a predicted value obtained by classifying the sample disk data.

Optionally, the foregoing storage medium is further configured to store program code for performing the following steps: after receiving the disk data of the disk to be tested, discretizing the disk data of the disk to be tested into the corresponding bins and obtaining the ID value corresponding to the disk data of the disk to be tested; determining the weight value of the disk data of the disk to be tested according to that ID value; and determining from the disk prediction model, according to the weight value of the disk data of the disk to be tested, whether the disk to be tested is a failed disk.

The serial numbers of the embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.

In the above-mentioned embodiments of the present invention, the description of each embodiment has its own emphasis, and for parts that are not detailed in a certain embodiment, reference may be made to the related descriptions of other embodiments.

In the several embodiments provided by the present application, it should be understood that the disclosed technical contents may be implemented in other manners. The device embodiments described above are merely illustrative. For example, the division of the units is only a logical function division, and in actual implementation there may be another division manner. For example, multiple units or components may be combined or integrated into another system, or some features can be ignored or not executed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, unit or module, and may be electrical or otherwise.

The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or may be distributed across multiple network elements. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit. The above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, server or network device, etc.) to perform all or part of the steps of the methods described in various embodiments of the present invention. The foregoing storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk, and the like.

The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can also make several improvements and refinements without departing from the principles of the present invention, and these improvements and refinements should also be considered within the scope of protection of the present invention.

Claims

A method for predicting a fault of a disk, comprising:

Obtaining sample disk data of the disk by using a disk monitoring technology, wherein the sample disk data includes sample data in multiple dimensions;

Performing a binning process on the sample disk data by using a bucketing technique to classify the sample disk data;

Performing sample training on the classified sample disk data by using the Owlqn model to obtain a disk prediction model;

After the disk data of the disk to be tested is received, the disk data of the disk to be tested is processed by using the disk prediction model to determine whether the disk to be tested is a faulty disk.
The method according to claim 1, wherein the sample disk data is SMART disk data, and wherein the sample disk data includes at least sample data in the following four dimensions: original value, standard value, worst value, and cumulative value.
The method according to claim 2, wherein after obtaining the sample disk data of the disk by using the disk monitoring technology, the method further comprises:

The sample data in each dimension is subjected to any one or more of the following operations: a difference operation, a square operation, and a distribution sum operation, so that the sample data in any one dimension is expanded to the sample data in the new dimension.
The method according to any one of claims 1 to 3, wherein performing binning processing on the sample disk data by using a bucketing technique to classify the sample disk data comprises:

Determining the range of values of each bin that is pre-divided and the ID value corresponding to each bin;

The sample disk data is classified by discretizing the sample data in each dimension to a corresponding bin, and the ID value corresponding to the sample data in each dimension is obtained.
The method according to claim 4, wherein performing sample training on the classified sample disk data by using the Owlqn model to obtain a disk prediction model comprises:

The Owlqn model trains ID values corresponding to the sample data in each dimension to obtain weight values of sample data in each dimension;

The disk prediction model is determined according to sample data in each dimension and a corresponding weight value, wherein the disk prediction model includes prediction results of sample data in each dimension.
The method according to claim 5, wherein the prediction result of the sample data in each dimension is a predicted value obtained by classifying the sample disk data.
The method according to claim 6, wherein processing the disk data of the disk to be tested by using the disk prediction model, after the disk data of the disk to be tested is received, to determine whether the disk to be tested is a faulty disk comprises:

After the disk data of the disk to be tested is received, the disk data of the disk to be tested is discretized to a corresponding bin, and the ID value corresponding to the disk data of the disk to be tested is obtained;

Determining, according to an ID value corresponding to the disk data of the disk to be tested, a weight value of the disk data of the disk to be tested;

Determining, according to the weight value of the disk data of the disk to be tested, whether the disk to be tested is a failed disk from the disk prediction model.
A fault prediction device for a magnetic disk, comprising:

An obtaining module, configured to acquire sample disk data of a disk by using a disk monitoring technology, where the sample disk data includes sample data in multiple dimensions;

a classifying module, configured to perform binning processing on the sample disk data by using a bucketing technology, and classify the sample disk data;

a training module, configured to perform sample training on the classified sample disk data by using an Owlqn model to obtain a disk prediction model;

a determining module, configured to, after receiving the disk data of the disk to be tested, process the disk data of the disk to be tested by using the disk prediction model to determine whether the disk to be tested is a failed disk.
The apparatus according to claim 8, wherein said sample disk data is SMART disk data, and wherein said sample disk data includes at least sample data in the following four dimensions: original value, standard value, worst value, and cumulative value.
The device according to claim 9, wherein the device further comprises:

an operation module, configured to perform any one or more of the following operations on the sample data in each dimension: a difference operation, a square operation, and a distribution sum operation, so that the sample data in any one dimension is expanded into sample data in new dimensions.
The apparatus according to any one of claims 8 to 10, wherein the classification module comprises:

a first determining sub-module, configured to determine the value range of each pre-divided bin and the ID value corresponding to each bin;

And a classification sub-module, configured to classify the sample disk data by discretizing sample data in each dimension to a corresponding bin, to obtain an ID value corresponding to the sample data in each dimension.
The apparatus according to claim 11, wherein the training module comprises:

a training sub-module, configured to train, by using the Owlqn model, on the ID values corresponding to the sample data in each dimension to obtain the weight value of the sample data in each dimension;

a second determining submodule, configured to determine the disk prediction model according to the sample data in each dimension and the corresponding weight values, wherein the disk prediction model includes prediction results of the sample data in each dimension.
The apparatus according to claim 12, wherein the prediction result of the sample data in each dimension is a predicted value obtained by classifying the sample disk data.
The device according to claim 13, wherein the determining module further comprises:

a discretization module, configured to, after the disk data of the disk to be tested is received, discretize the disk data of the disk to be tested into the corresponding bins and obtain the ID value corresponding to the disk data of the disk to be tested;

a third determining submodule, configured to determine, according to an ID value corresponding to the disk data of the disk to be tested, a weight value of the disk data of the disk to be tested;

And a fourth determining submodule, configured to determine, according to the weight value of the disk data of the disk to be tested, whether the disk to be tested is a faulty disk.