TW201732591A

TW201732591A - Disk failure prediction method and apparatus

Info

Publication number: TW201732591A
Application number: TW106102676A
Authority: TW
Inventors: Yong-Ming Ding; Jun Zhou; Qing Cui; Shen-Quan Qu
Original assignee: Alibaba Group Services Ltd
Priority date: 2016-01-29
Filing date: 2017-01-24
Publication date: 2017-09-16
Also published as: CN107025154A; WO2017129030A1; CN107025154B

Abstract

Disclosed are a disk failure prediction method and apparatus. The method comprises: acquiring sample disk data of a disk through disk monitoring technology, the sample disk data comprising sample data in a plurality of dimensions; training on the sample disk data by using a GBDT algorithm to obtain a disk prediction model consisting of a plurality of decision trees; and, after disk data of a disk to be tested is received, processing that disk data by using the disk prediction model and determining whether the disk to be tested is a failed disk. The method addresses the technical problem in the prior art that prediction results are inaccurate because some factors that cause hard disk failures cannot be collected or quantified in a hard disk failure prediction system.

Description

Disk failure prediction method and device

The present invention relates to the field of disks, and in particular to a disk failure prediction method and apparatus.

At present, hard disks are the main medium for storing data, and a hard disk failure can cause massive data loss, so ensuring hard disk stability is very important. Under normal conditions, the probability that a hard disk fails within 24 hours is about one in ten thousand. When a server has ten hard disks, the probability that the server experiences a hard disk failure rises to about one in a thousand. As websites and other services grow, servers need more and more hard disks, and the probability that several hard disks fail at the same time also increases.
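The arithmetic behind the "one in a thousand" figure can be checked with a short sketch. It assumes that disks fail independently and with the same probability, which the text does not state; the function name `prob_any_failure` is hypothetical.

```python
# Assumption: each disk fails independently with the same 24-hour probability.
p_single = 1e-4   # one in ten thousand, as stated in the text
n_disks = 10      # disks per server, as stated in the text


def prob_any_failure(p: float, n: int) -> float:
    """Probability that at least one of n independent disks fails."""
    return 1.0 - (1.0 - p) ** n


p_server = prob_any_failure(p_single, n_disks)
print(f"P(at least one failure among {n_disks} disks) = {p_server:.6f}")
```

Under this assumption the result is slightly below 10 x 0.0001, which is consistent with the approximate figure given above.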

Stored data usually has multiple replicas, for example a MySQL primary/standby database pair, or GFS files, which keep three replicas by default. On a large-scale data storage platform, if several hard disks fail at the same time, there is a high probability that those disks hold replicas of the same file; that is, simultaneous failures of several hard disks can cause some files to be lost. Most online services depend on the large amount of data stored on servers, so a hard disk failure can make such services behave abnormally or even suspend them.

For these reasons, a system is needed that can tell us in advance which hard disks are going to fail and which data may be lost. Hard disk failures have many causes, the most common being external vibration, temperature and humidity, damaged electrical components, sound, and dust. Some of these factors can be collected, such as temperature, humidity, and some component data, but many others cannot be collected or quantified, which makes prediction results inaccurate.

No effective solution has yet been proposed for the problem that, in prior-art hard disk failure prediction systems, some factors that readily cause hard disk failures cannot be collected or quantified, which makes prediction results inaccurate.

Embodiments of the present invention provide a disk failure prediction method and apparatus, so as to at least solve the technical problem that, in prior-art hard disk failure prediction systems, prediction results are inaccurate because some factors that readily cause hard disk failures cannot be collected or quantified.

According to one aspect of the embodiments of the present invention, a disk failure prediction method is provided, including: acquiring sample disk data of a disk through disk monitoring technology, wherein the sample disk data includes sample data in multiple dimensions; training on the sample disk data using a GBDT algorithm to obtain a disk prediction model composed of multiple decision trees; and, after receiving disk data of a disk to be tested, processing that disk data using the disk prediction model composed of multiple decision trees to determine whether the disk to be tested is a failed disk.

According to another aspect of the embodiments of the present invention, a disk failure prediction apparatus is also provided, configured to: acquire sample disk data of a disk through disk monitoring technology, wherein the sample disk data includes sample data in multiple dimensions; train on the sample disk data using a GBDT algorithm to obtain a disk prediction model composed of multiple decision trees; and, after receiving disk data of a disk to be tested, process that disk data using the disk prediction model composed of multiple decision trees to determine whether the disk to be tested is a failed disk.

In the embodiments of the present invention, sample disk data of a disk is acquired through disk monitoring technology, wherein the sample disk data includes sample data in multiple dimensions, and the sample disk data is trained using a GBDT algorithm to obtain a disk prediction model composed of multiple decision trees. After disk data of a disk to be tested is received, the disk prediction model processes that disk data to determine whether the disk to be tested is a failed disk. This achieves the technical effect of predicting the failure state of a disk and thereby solves the technical problem that, in prior-art hard disk failure prediction systems, prediction results are inaccurate because some factors that readily cause hard disk failures cannot be collected or quantified.

10‧‧‧Computer terminal

102‧‧‧Processor

104‧‧‧Memory

106‧‧‧Transmission module

60‧‧‧Acquisition module

62‧‧‧Training module

64‧‧‧Processing module

70‧‧‧Operation module

80‧‧‧Initialization module

82‧‧‧Extraction module

84‧‧‧First calculation module

90‧‧‧Reading module

92‧‧‧Comparison module

94‧‧‧Determination module

96‧‧‧Processing submodule

100‧‧‧Receiving module

102‧‧‧Second calculation module

104‧‧‧Traversal module

111‧‧‧Processor

113‧‧‧Memory

115‧‧‧Transmission device

The drawings described here provide a further understanding of the present invention and form a part of this application. The illustrative embodiments of the present invention and their descriptions explain the present invention and do not unduly limit it. In the drawings:
FIG. 1 is a block diagram of the hardware structure of a computer terminal for a disk failure prediction method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a disk failure prediction method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of training sample disk data using the GBDT algorithm according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of calculating a disk prediction value using the GBDT algorithm according to an embodiment of the present invention;
FIG. 5 is a flowchart of an optional disk failure prediction method according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a disk failure prediction apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an optional disk failure prediction apparatus according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of an optional disk failure prediction apparatus according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of an optional disk failure prediction apparatus according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of an optional disk failure prediction apparatus according to an embodiment of the present invention; and
FIG. 11 is a structural block diagram of a computer terminal according to an embodiment of the present invention.

To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort shall fall within the scope of protection of the present invention.

It should be noted that the terms "first", "second", and the like in the specification, the claims, and the above drawings are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described. In addition, the terms "include" and "have", and any variations of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.

Embodiment 1

According to an embodiment of the present invention, an embodiment of a disk failure prediction method is also provided. It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in a different order.

The method embodiment provided in Embodiment 1 of this application may be executed on a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a computer terminal as an example, FIG. 1 is a block diagram of the hardware structure of a computer terminal for a disk failure prediction method according to an embodiment of the present invention. As shown in FIG. 1, the computer terminal 10 may include one or more processors 102 (only one is shown in the figure; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission module 106 for communication. A person of ordinary skill in the art will understand that the structure shown in FIG. 1 is only illustrative and does not limit the structure of the above electronic device. For example, the computer terminal 10 may include more or fewer components than shown in FIG. 1, or have a configuration different from that shown in FIG. 1.

The memory 104 may be used to store software programs and modules of application software, such as the program instructions/modules corresponding to the disk failure prediction method in the embodiments of the present invention. By running the software programs and modules stored in the memory 104, the processor 102 executes various functional applications and data processing, that is, implements the disk failure prediction method described above. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the computer terminal 10 through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.

The transmission module 106 is used to receive or send data over a network. Specific examples of the network may include a wireless network provided by the communication provider of the computer terminal 10. In one example, the transmission module 106 includes a network interface controller (NIC), which can connect to other network devices through a base station and thereby communicate with the Internet. In one example, the transmission module 106 may be a radio frequency (RF) module, which communicates with the Internet wirelessly.

In the above operating environment, this application provides the disk failure prediction method shown in FIG. 2. FIG. 2 is a flowchart of a disk failure prediction method according to an embodiment of the present invention.

It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.

From the description of the above embodiments, those skilled in the art can clearly understand that the method according to the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the essence of the technical solution of the present invention, or the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present invention.

As shown in FIG. 2, the disk failure prediction method includes the following steps. Step S21: acquire sample disk data of a disk through disk monitoring technology, wherein the sample disk data includes sample data in multiple dimensions.

In the above step, disk monitoring technology monitors the disk data generated while a disk is in use after leaving the factory, in order to predict the failure state of the disk. This lets the disk user know that a disk is about to fail before it actually fails, so that the data on the disk can be copied and stored elsewhere and data loss avoided.

In an optional embodiment, the sample disk data may include: raw read error rate, start/stop count, reallocated sector count, power-on hours, spin retry count, disk calibration retry count, power cycle count, temperature, and write error rate. Sample disk data may be acquired according to the historical failure records of the disks. For example, samples may be acquired at a positive-to-negative ratio of 1:5, where a positive sample is a failed disk and a negative sample is a disk without failure.
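The 1:5 sampling ratio described above could be applied roughly as follows. This is a minimal sketch, not the patent's procedure: the function name `sample_training_set`, the disk identifiers, and the use of uniform random sampling for the negative samples are all assumptions made for illustration.

```python
import random


def sample_training_set(failed, healthy, neg_per_pos=5, seed=0):
    """Keep every failed disk (label 1) and draw neg_per_pos healthy disks (label 0) per failed disk."""
    rng = random.Random(seed)
    n_neg = min(len(healthy), neg_per_pos * len(failed))
    negatives = rng.sample(healthy, n_neg)  # uniform sampling is an assumption
    return [(disk, 1) for disk in failed] + [(disk, 0) for disk in negatives]


# Hypothetical disk identifiers
failed_disks = [f"bad-{i}" for i in range(3)]
healthy_disks = [f"ok-{i}" for i in range(100)]

samples = sample_training_set(failed_disks, healthy_disks)
n_pos = sum(label for _, label in samples)
print(n_pos, len(samples) - n_pos)
```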

It should be noted that, when sample disk data is acquired through disk monitoring technology, the organizations that predict disk failures do not necessarily use the same disks, and environmental factors such as temperature and humidity differ between organizations and affect the disks differently, so the ratio of healthy to failed disks differs between organizations. To provide more reliable sample disk data for training, sample disk data may also be acquired according to the actual disk damage situation of the organization.

Step S23: train on the sample disk data using the GBDT algorithm to obtain a disk prediction model composed of multiple decision trees.

In the above step, GBDT (Gradient Boosting Decision Tree) is an iterative decision tree algorithm. It consists of multiple decision trees and obtains the final result by summing the conclusions of all the trees. A decision tree, as a prediction model, makes each level of decision on the basis of the result of the previous level, and includes elements such as decision points, state nodes, and result nodes. Each node in the tree represents an object being predicted, and each branch path represents a possible attribute of that object.
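The additive structure described above, in which the final result is the sum of the outputs of all trees, can be sketched as follows. Each tree is reduced to a single split for brevity. The helper names (`stump`, `gbdt_score`), the feature names, the thresholds, and the leaf values are hypothetical and are not taken from the patent.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]
Tree = Callable[[Record], float]


def stump(feature: str, threshold: float, left: float, right: float) -> Tree:
    """One-split tree: returns `left` when feature <= threshold, otherwise `right`."""
    def predict(x: Record) -> float:
        return left if x[feature] <= threshold else right
    return predict


def gbdt_score(trees: List[Tree], x: Record, init: float = 0.0) -> float:
    """Final score = initial value + sum of the outputs of all trees."""
    return init + sum(tree(x) for tree in trees)


# Hypothetical trees and disk record
trees = [
    stump("reallocated_sectors", 10.0, -0.3, 0.3),
    stump("spin_retry_count", 0.0, -0.1, 0.1),
]
disk = {"reallocated_sectors": 25.0, "spin_retry_count": 3.0}
score = gbdt_score(trees, disk, init=0.5)
print(round(score, 3))
```

A cutoff on the resulting score (for example 0.5) would then decide whether the disk is reported as failed; the choice of cutoff is likewise an assumption.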

In an optional embodiment, where the sample disk data is the raw value of a disk's S.M.A.R.T. attributes, training on the sample disks proceeds as follows. If the raw value is greater than or equal to a preset raw value, the sample disk can be considered more likely to fail; if the raw value is less than the preset raw value, the sample disk can be considered less likely to fail. Therefore, when the disk prediction model is determined, a sample disk whose raw value is greater than or equal to the preset raw value is given the attribute "failed", and a sample disk whose raw value is less than the preset raw value is given the attribute "not failed". A disk prediction model with this decision capability is thus established: when a disk to be tested is input into the decision tree, the tree automatically determines that the disk is failed if its raw value is greater than or equal to the preset raw value, and not failed if its raw value is less than the preset raw value.

Step S25: after receiving disk data of a disk to be tested, process that disk data using the disk prediction model composed of multiple decision trees to determine whether the disk to be tested is a failed disk.

In an optional embodiment, the values of the sample disks in multiple dimensions are used as the evaluation indicators of the decision trees to obtain multiple decision trees, and the multiple decision trees then form a disk prediction model that is used to check the disk to be tested.

It is worth noting that the decision trees obtained from the individual dimensions of a disk may or may not be the same. Therefore, when multiple decision trees are combined into a disk prediction model, a weight must be assigned to each decision tree according to its importance in the evaluation system, which yields the disk prediction model.

It should be noted that acquiring sample disk data through disk monitoring technology makes the acquisition process simpler and the acquired data more comprehensive, which provides rich sample data for training. In the above step, the training on the sample disk data using the GBDT algorithm may be performed in two or more rounds, to improve the accuracy and recall of the disk prediction model formed by the resulting decision trees.

The solution of Embodiment 1 provided by this application thus solves the technical problem that, in prior-art hard disk failure prediction systems, prediction results are inaccurate because some factors that readily cause hard disk failures cannot be collected or quantified.

According to the above embodiment of this application, in a preferred solution, the sample disk data includes sample data in at least the following four dimensions: raw value, standard value, worst value, and cumulative value.

The raw value is the current value of a parameter while the disk is running. The standard value is the value of each parameter when a normal disk is running. The worst value is the abnormal value, among those a disk's monitored parameters have shown during operation, that deviates most from the normal value. The cumulative value is the accumulated result of each monitored parameter from when the disk was first used to the current moment.
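To make the four dimensions concrete, the sketch below derives them from a series of readings for one monitored parameter. The function name `summarize_parameter`, the dictionary keys, and the sample readings are hypothetical, and treating the cumulative value as a plain sum of the readings is only one possible reading of "accumulated result".

```python
from typing import Dict, List


def summarize_parameter(readings: List[float], standard: float) -> Dict[str, float]:
    """Derive the four dimensions for one parameter from its readings (oldest first)."""
    return {
        "raw": readings[-1],                                       # current value
        "standard": standard,                                      # value for a normal disk
        "worst": max(readings, key=lambda v: abs(v - standard)),   # largest deviation so far
        "cumulative": sum(readings),                               # assumption: plain sum
    }


# Hypothetical readings for one parameter
stats = summarize_parameter([100.0, 97.0, 82.0, 95.0], standard=100.0)
print(stats)
```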

In an optional embodiment, the parameters of a disk may be information that describes attributes of the disk. They may include one or more of read error rate, power-on count, reallocated sector count, spin retry count, disk calibration retry count, and parity error rate, and may also include other attribute information of the disk.

The above steps of this application may obtain multiple different decision trees from the sample data in each of the above four dimensions.

In an optional embodiment, software such as HDTune or CrystalDiskInfo may be used to acquire the sample disk data.

According to the above embodiment of this application, in a preferred solution, after the sample disk data is acquired through disk monitoring technology, the method further includes: Step S211: apply any one or more of the following operations to the sample data in each dimension: a difference operation, a square operation, and a distribution-sum operation, so that the sample data in any one dimension is expanded into sample data in new dimensions.

In the above step, further operations are applied to the decision results, so that the decision tree can be expanded into a new dimension according to the result of the operation, which yields sample data in that dimension.

It is worth noting that the sample data in each dimension can undergo several operations, so that sample data in more dimensions is obtained from that dimension. Starting from four dimensions, applying the difference, square, and distribution-sum operations to each dimension yields sample data in sixteen dimensions (the four original dimensions plus three derived dimensions for each), and decisions made from the sample data in each dimension have a different emphasis.

In an optional embodiment, again taking the raw-value dimension as an example, the difference, square, and distribution-sum operations are applied to the raw-value sample data, yielding four dimensions of sample data, counting the raw values themselves. This sample data is used as the decision indicator for training, and four new decision trees are obtained.
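The expansion from four dimensions to sixteen can be sketched as follows. The text does not define the three operations precisely, so this sketch makes explicit assumptions: the difference operation is taken as the change between consecutive readings, and the distribution-sum operation is interpreted as a running (cumulative) sum. The function name `expand_dimension`, the suffixes, and the sample values are hypothetical.

```python
from itertools import accumulate
from typing import Dict, List


def expand_dimension(name: str, values: List[float]) -> Dict[str, List[float]]:
    """Return the original series plus three derived series for one dimension."""
    diffs = [0.0] + [b - a for a, b in zip(values, values[1:])]  # change between readings
    squares = [v * v for v in values]                            # square operation
    running = list(accumulate(values))                           # assumed meaning of "distribution sum"
    return {
        name: list(values),
        f"{name}_diff": diffs,
        f"{name}_sq": squares,
        f"{name}_cumsum": running,
    }


# Hypothetical readings for the four base dimensions of one disk
base = {
    "raw": [1.0, 3.0, 6.0],
    "standard": [100.0, 100.0, 100.0],
    "worst": [1.0, 3.0, 6.0],
    "cumulative": [1.0, 4.0, 10.0],
}

expanded: Dict[str, List[float]] = {}
for dim_name, series in base.items():
    expanded.update(expand_dimension(dim_name, series))

print(len(expanded))
```

Four base dimensions, each kept and extended by three derived series, give the sixteen dimensions mentioned in the text.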

According to the above embodiment of this application, in a preferred solution, training on the sample disk data using the GBDT algorithm to obtain a disk prediction model composed of multiple decision trees includes: Step S231: use the sample disk data of all disks as training data, and initialize the classification model parameters of the training data with preset values.

In the above step, initializing the classification model parameters of the training data may mean presetting the number of decision trees and the number of levels in each decision tree, that is, making an initial configuration of the attributes of the decision trees.

Step S233: extract multiple feature data from the training data, create the multiple decision trees with each feature data as a root node, and use the feature value corresponding to each feature data as a leaf node of the corresponding decision tree.

Step S235: compute the best split and its gain for every current leaf node, and split the leaf node with the largest gain at its split point, so that the sample disk data is divided among the child nodes.

In the above step, the gain may be based on minimizing the mean squared error of the label values: for each sample, take the difference between its label value and its predicted label value, square it, and sum the squares over all samples. The more samples are predicted incorrectly, the larger the mean squared error, so minimizing the mean squared error finds the best basis for branching.
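One way to turn the squared-error criterion above into a split search is sketched below. It assumes that each node predicts the mean label of its samples and that the gain of a split is the reduction in the sum of squared errors; the text does not spell out either detail. The function names and the sample values are hypothetical.

```python
from typing import List, Optional, Tuple


def sum_squared_error(labels: List[float]) -> float:
    """Sum of squared differences between each label and the node's mean label."""
    if not labels:
        return 0.0
    mean = sum(labels) / len(labels)
    return sum((y - mean) ** 2 for y in labels)


def split_gain(values: List[float], labels: List[float], threshold: float) -> float:
    """Reduction in squared error when splitting at `threshold` (<= goes left, > goes right)."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    return sum_squared_error(labels) - (sum_squared_error(left) + sum_squared_error(right))


def best_split(values: List[float], labels: List[float]) -> Tuple[float, Optional[float]]:
    """Try every distinct value except the largest as a threshold; return (gain, threshold)."""
    candidates = sorted(set(values))[:-1]
    return max(((split_gain(values, labels, t), t) for t in candidates), default=(0.0, None))


# Hypothetical feature values and labels (0 = normal, 1 = failed)
gain, threshold = best_split([1.0, 2.0, 8.0, 9.0], [0, 0, 1, 1])
print(gain, threshold)
```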

The decision tree may be a binary tree with each feature data as a root node. Each feature data corresponds to a feature value, and that feature value is a leaf node of the decision tree rooted at that feature data. After the leaf nodes of a decision tree are determined, they are split further. It is worth noting that, when the leaf nodes are split further and their gains differ, the leaf node with the largest gain is split, so that all sample data can be assigned to the corresponding leaf nodes.

In an optional embodiment, take four sample disks A, B, C, and D as an example, where disks A and B are normal and disks C and D are damaged. In this example a normal disk corresponds to 0 and a failed disk to 1, so disks A, B, C, and D correspond to 0, 0, 1, and 1 respectively. The feature value of these disks in a first dimension is obtained (denoted A), and the sample disk data is trained using the GBDT algorithm. FIG. 3 is a schematic diagram of training sample disk data using the GBDT algorithm according to an embodiment of the present invention. With reference to FIG. 3, the default initial value is set to 0.5, that is, each disk has a probability of 0.5 of being a failed disk, and the threshold of the first dimension is A0. Disks whose feature value in the first dimension is greater than A0 are assigned to one child node, and disks whose feature value is less than or equal to A0 are assigned to the other child node, and the probability that a disk in either child node is a failed disk is set to 0.5.
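The four-disk example can be traced numerically with the sketch below. FIG. 3 is not reproduced here, so the sketch assumes the usual gradient-boosting step for squared loss: each leaf stores the mean residual (label minus current prediction) of its samples, and the prediction is updated by adding the leaf value. The feature values and the numeric threshold `A0` are hypothetical.

```python
# Labels from the text: A and B are normal (0), C and D are failed (1).
labels = {"A": 0, "B": 0, "C": 1, "D": 1}

# Hypothetical first-dimension feature values and threshold A0
feature = {"A": 3.0, "B": 5.0, "C": 40.0, "D": 55.0}
A0 = 10.0

init = 0.5  # default initial failure probability, as in the text

# Residual of each disk against the initial prediction
residuals = {disk: labels[disk] - init for disk in labels}

# Split the disks at the threshold
left = [disk for disk in labels if feature[disk] <= A0]
right = [disk for disk in labels if feature[disk] > A0]


def mean(xs):
    return sum(xs) / len(xs)


# Assumption: each leaf value is the mean residual of the disks in that leaf
leaf_left = mean([residuals[d] for d in left])
leaf_right = mean([residuals[d] for d in right])

# Updated prediction after adding the first tree
updated = {d: init + (leaf_left if d in left else leaf_right) for d in labels}
print(left, right, updated)
```

With these hypothetical feature values a single split already separates the normal disks from the failed ones; real data would need further trees and deeper splits.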

It should be noted that, for convenience of description, the above embodiment uses only four sample data items, so only two leaf nodes are obtained. In practical applications, after the root node is divided into two leaf nodes, the division can continue; the larger the amount of sample data, the more levels of division there are.

According to the above embodiment of the present application, in a preferred solution, extracting a plurality of feature data from the training data, creating the plurality of decision trees with each feature data as a root node, and using the feature value corresponding to each feature data as a leaf node of the corresponding decision tree includes: Step S2331, reading a threshold corresponding to any one feature data.

步驟S2333，將所述任意一個特徵資料的特徵值與所述閾值進行比較，並根據比較結果得到兩個分支的熵。 Step S2333, comparing the feature value of any one of the feature data with the threshold value, and obtaining the entropy of the two branches according to the comparison result.

步驟S2335，根據所述兩個分支的熵確定兩個新節點作為所述任意一個特徵資料的兩個葉子節點。 Step S2335, determining two new nodes as two leaf nodes of the arbitrary one of the feature data according to the entropy of the two branches.

步驟S2337，採用上述步驟對每一個特徵資料進行處理，直到每個特徵資料得到預定的兩個唯一的葉子節點。 In step S2337, each feature data is processed by the above steps until each feature data obtains two predetermined unique leaf nodes.

In the above steps, every threshold of every feature is tried exhaustively to find the feature and threshold for which the two branches, one with the feature less than or equal to the threshold and one with the feature greater than the threshold, have the smallest entropy. Branching by this criterion yields two new nodes, and branching continues in the same way until all samples fall into leaf nodes that contain only normal disks or only failed disks, or until a preset termination condition is reached. If a final leaf node does not contain only normal disks or only failed disks, the average label value of all samples on that node is used as the predicted label value of the leaf node.

It should be noted here that the label value is the probability that the disk is a failed disk.

此處仍需要說明的是，熵最小是指盡可能的使每個分枝中，正樣本和負樣本的比例遠離1：1，熵最小的情況為該分枝上只有正樣本或負樣本，即該分支上只有正常的磁碟，或故障磁碟。 It should be noted here that the minimum entropy means that as far as possible, the ratio of positive and negative samples in each branch is far from 1:1, and the case of minimum entropy is that there are only positive or negative samples on the branch. That is, there is only a normal disk on the branch, or a failed disk.
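The exhaustive search described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `entropy` and `best_split`, the size-weighted combination of the two branch entropies, and the sample feature values are all assumptions made for the example.

```python
import math

def entropy(labels):
    """Binary entropy (in bits) of a list of 0/1 labels; 0.0 for an empty or pure list."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def best_split(rows, labels):
    """Try every feature and every observed value as a threshold.

    rows   -- list of feature vectors (lists of numbers)
    labels -- list of 0/1 labels (1 = failed disk)
    Returns (feature_index, threshold, weighted_entropy) for the split with the
    smallest size-weighted entropy, or None if no threshold yields two
    non-empty branches.
    """
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue
            # Assumption: the two branch entropies are combined by sample count.
            score = (len(left) * entropy(left) + len(right) * entropy(right)) / n
            if best is None or score < best[2]:
                best = (f, threshold, score)
    return best

# Hypothetical feature values for disks A, B (normal) and C, D (failed).
rows = [[3.0, 10.0], [5.0, 40.0], [9.0, 20.0], [12.0, 30.0]]
labels = [0, 0, 1, 1]
print(best_split(rows, labels))
```

With these illustrative values the first feature separates the normal disks from the failed disks completely, so both branches are pure and the returned entropy is zero.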

In an optional embodiment in which the decision tree is a regression tree, each node obtains a predicted value equal to the average of the label values of all samples belonging to that node. When a node is divided, every threshold of every feature is tried exhaustively to find the best split point, until the label value of the samples on each leaf node is unique or a preset termination condition is reached. If the label values of the samples on a final leaf node are not unique, the average label value of all samples on that node is used as the predicted label value of the leaf node.

It should be noted here that in the above embodiment the best division criterion is no longer minimizing entropy but minimizing the mean square error: for each sample, the difference between its label value and the predicted label value is squared, and the squares of all differences are summed. The more samples are predicted incorrectly, the larger the mean square error, so minimizing the mean square error finds the best basis for branching.

It should also be noted here that, when dividing, it is difficult to make the label value of the samples on every leaf node unique. Therefore, to obtain a prediction result closest to the real situation, a termination condition can be preset; the termination condition may be an upper limit on the number of leaf nodes.
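The squared-error criterion for a regression tree can be sketched for a single feature as follows. The function names and the sample values are hypothetical; each branch is predicted by the mean of its labels, as the text describes.

```python
def squared_error(values):
    """Sum of squared differences between each value and the mean of the list."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_regression_split(xs, ys):
    """Try every observed value of one feature as a threshold.

    xs -- feature values, one per sample
    ys -- label values, one per sample
    Returns (threshold, total_squared_error) for the split with the smallest
    total error, or None if the feature takes only one distinct value.
    """
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:
            continue
        err = squared_error(left) + squared_error(right)
        if best is None or err < best[1]:
            best = (threshold, err)
    return best

# Hypothetical values: two normal disks (label 0.0) and two failed disks (label 1.0).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [0.0, 0.0, 1.0, 1.0]
print(best_regression_split(xs, ys))
```

A full tree builder would apply this search recursively and stop once a preset limit, such as a maximum number of leaf nodes, is reached.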

According to the above embodiment of the present application, in a preferred solution, after the disk prediction model composed of a plurality of decision trees is obtained, the method further includes: adjusting the classification model parameters, where, in a case where the classification model parameters include failed disk samples and non-failed disk samples, if it is to be determined whether the disk to be tested is a failed disk, the proportion of failed disk samples in the classification model parameters is increased.
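One possible reading of raising the proportion of failed disk samples is to duplicate failed samples in the training set until they reach a chosen share. The sketch below assumes that reading; the function name `raise_failed_ratio` and the `target_ratio` parameter are hypothetical and do not come from the source.

```python
def raise_failed_ratio(samples, labels, target_ratio):
    """Duplicate failed-disk samples (label 1) until they make up at least
    target_ratio of the returned data set.

    The original samples come first, followed by the duplicated failed samples.
    """
    if not 0.0 < target_ratio < 1.0:
        raise ValueError("target_ratio must be strictly between 0 and 1")
    failed = [s for s, y in zip(samples, labels) if y == 1]
    if not failed:
        raise ValueError("at least one failed-disk sample is required")
    out_samples, out_labels = list(samples), list(labels)
    n_failed = len(failed)
    i = 0
    while n_failed / len(out_samples) < target_ratio:
        # Cycle through the failed samples so each is duplicated about equally.
        out_samples.append(failed[i % len(failed)])
        out_labels.append(1)
        n_failed += 1
        i += 1
    return out_samples, out_labels

# Hypothetical data: nine normal disks and one failed disk.
samples = ["disk%d" % k for k in range(10)]
labels = [0] * 9 + [1]
balanced_samples, balanced_labels = raise_failed_ratio(samples, labels, 0.5)
print(len(balanced_samples), sum(balanced_labels))
```

Per-sample weights in the training loss would have a similar effect without copying data; which mechanism the embodiment intends is not stated in the text.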

According to the above embodiment of the present application, in a preferred solution, processing the disk data of the disk to be tested using the disk prediction model composed of a plurality of decision trees and determining whether the disk to be tested is a failed disk includes: Step S251, after the disk data of the disk to be tested is received, assigning an initial value to the disk data of the disk to be tested.

Step S253, traversing each decision tree according to the initial value of the disk to be tested, calculating the prediction result determined by the first decision tree and a first residual, and assigning the first residual to the initial value to obtain an updated initial value.

Step S255, calculating, using the updated initial value, the prediction result determined by the second decision tree and a second residual, assigning the second residual to the updated initial value, and traversing all the decision trees in this way to obtain a result predicting whether the disk to be tested is a failed disk.

Step S257, each tree learns the residual of the sum of the conclusions of all previous trees; this residual is the amount that, when added to the predicted value, yields the true value.

In an optional embodiment, still taking the four disks A, B, C and D as an example, feature A divides the four disks into two parts, namely A, B and C, D, and each part uses its average label value as the predicted value. The residual is then calculated, where the residual is the difference between the actual value of a disk and its predicted value; the residual of A is therefore 1-0.5=0.5, and the residuals of A, B, C and D are 0.5, -0.5, 0.5 and -0.5 respectively. FIG. 4 is a schematic diagram of calculating a disk prediction value using the GBDT algorithm according to an embodiment of the present invention. As shown in FIG. 4, the residuals replace the original values of A, B, C and D and are input to the second decision tree for training, which divides the disks into two leaf nodes according to the comparison with feature B. If the predicted values of the second tree equal the residuals, adding the conclusion of the second tree to that of the first tree gives the actual value of each disk. The second tree has only the two values 0.5 and -0.5, so it is divided directly into two nodes. At this point the residual of every disk is 0, that is, the prediction for every disk equals its true value.

It should be noted here that the above embodiment is for illustration and therefore has only two decision trees; in practical applications, multiple decision trees can be obtained depending on the amount of sample data. The predicted value is the accumulated sum of all previous trees. In this embodiment only one decision tree precedes the second one, so the predicted value is simply 0.5; if there were other decision trees, their outputs would all need to be added up to give the predicted value of A.
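The idea that each tree fits the residual left by the sum of all previous trees can be sketched with one-split trees on a single feature. This is a simplified illustration under stated assumptions: the function names, the single shared feature, the initial value of 0.5 and the feature values are hypothetical, and the numbers are not those of FIG. 4.

```python
def fit_stump(xs, targets):
    """Fit a one-split regression tree on one feature by minimizing squared error.

    Returns (threshold, left_value, right_value), where each value is the mean
    target of the samples on that side of the threshold.
    """
    candidates = sorted(set(xs))[:-1]  # the largest value would leave the right side empty
    if not candidates:
        raise ValueError("the feature must take at least two distinct values")
    best = None
    for threshold in candidates:
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = sum((t - lv) ** 2 for t in left) + sum((t - rv) ** 2 for t in right)
        if best is None or err < best[0]:
            best = (err, threshold, lv, rv)
    return best[1:]

def stump_predict(stump, x):
    """Return the leaf value of the stump for feature value x."""
    threshold, lv, rv = stump
    return lv if x <= threshold else rv

def fit_boosted(xs, ys, n_trees, initial=0.5):
    """Fit n_trees stumps in sequence, each on the residuals of the running sum."""
    trees = []
    preds = [initial] * len(ys)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        trees.append(stump)
        preds = [p + stump_predict(stump, x) for p, x in zip(preds, xs)]
    return trees

def predict(trees, x, initial=0.5):
    """Prediction is the initial value plus the accumulated output of every tree."""
    return initial + sum(stump_predict(t, x) for t in trees)

# Hypothetical feature values for disks A, B (label 0) and C, D (label 1).
xs = [3.0, 5.0, 9.0, 12.0]
ys = [0, 0, 1, 1]
trees = fit_boosted(xs, ys, n_trees=2)
print([predict(trees, x) for x in xs])
```

In this toy case the first stump already removes every residual, so the second stump contributes zero; with real data the residuals shrink gradually over many trees.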

圖5是根據本發明實施例的一種可選的磁碟的故障預測方法的流程圖，下面結合圖5詳細介紹本申請的一種優選的實施例。 FIG. 5 is a flow chart of an optional method for predicting a failure of a magnetic disk according to an embodiment of the present invention. A preferred embodiment of the present application is described in detail below with reference to FIG. 5.

如圖5所示，提供了一種磁碟的故障預測方法，該方法可以包括如下步驟S51至步驟S57：S51，獲取樣本磁碟的樣本資料。 As shown in FIG. 5, a method for predicting a failure of a magnetic disk is provided. The method may include the following steps S51 to S57: S51: acquiring sample data of a sample disk.

具體的，在上述步驟中，可以透過HDTune、CrystalDiskInfo等軟體獲取樣本磁碟資料。 Specifically, in the above steps, the sample disk data can be obtained through software such as HDTune and CrystalDiskInfo.

S52，對樣本資料進行差分運算。 S52, performing differential operations on the sample data.

Specifically, in the above step, the difference operation yields the difference between the feature data of a disk at a given time and the feature data of the same disk 24 hours earlier.

S53，對差分運算得到的結果進行分佈求和及/或平方運算。 S53, performing a distribution summation and/or a square operation on the result obtained by the difference operation.

S54，得到訓練和預測資料。 S54, obtaining training and prediction data.

S55，第一步訓練和預測，使召回率較大。 S55, the first step of training and forecasting, makes the recall rate larger.

S56，第二步訓練和預測，平衡召回率和準確率。 S56, the second step of training and forecasting, balances recall and accuracy.

Specifically, in the above steps, negative samples make up a large proportion of the training data and positive samples a small proportion, for example a ratio of 1000:1. If all the training data is used for training, very few positive samples can be predicted accurately; and because positive samples are scarce in the training data, many samples whose true value is negative may be misjudged as positive. The first step therefore trains for a high recall of positive samples. In the second step, the training data predicted as positive in the first step is used as the training data, that is, the samples close to the positive samples are selected as training samples. A model trained this way is better at predicting positive samples, so the accuracy of positive samples in the second-step predictions is much higher than in the first step, and accuracy and recall reach a certain balance.
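The two-step scheme of S55 and S56 can be sketched as a cascade. The sketch below assumes that each step exposes a scoring function returning a failure score between 0 and 1; the function names, the cut-off parameters `low_cut` and `high_cut`, and the toy scorers are hypothetical stand-ins for trained GBDT models. The "accuracy" of positive samples is computed here as precision.

```python
def precision_recall(predicted, actual):
    """Precision and recall of the positive class (1 = failed disk)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def select_stage_two_training(rows, labels, stage_one, low_cut):
    """Keep only the samples that step one predicts as positive."""
    kept = [(r, y) for r, y in zip(rows, labels) if stage_one(r) >= low_cut]
    return [r for r, _ in kept], [y for _, y in kept]

def two_stage_predict(rows, stage_one, stage_two, low_cut, high_cut):
    """Flag a row as failed only if it passes step one and then step two."""
    result = []
    for row in rows:
        if stage_one(row) < low_cut:
            result.append(0)
        else:
            result.append(1 if stage_two(row) >= high_cut else 0)
    return result

# Toy scorers standing in for trained models: each reads one precomputed score.
def stage_one(row):
    return row[0]

def stage_two(row):
    return row[1]

rows = [[0.1, 0.0], [0.4, 0.2], [0.5, 0.9], [0.9, 0.8], [0.35, 0.1]]
labels = [0, 0, 1, 1, 0]

step_one_pred = [1 if stage_one(r) >= 0.3 else 0 for r in rows]
print(precision_recall(step_one_pred, labels))   # high recall, low precision
final_pred = two_stage_predict(rows, stage_one, stage_two, 0.3, 0.5)
print(precision_recall(final_pred, labels))
```

The low cut-off in step one keeps recall high at the cost of false alarms; step two, trained only on the samples that step one flags, removes many of those false alarms.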

Embodiment 2

According to an embodiment of the present invention, a disk failure prediction apparatus for implementing the above disk failure prediction method is also provided. FIG. 6 is a schematic structural diagram of a disk failure prediction apparatus according to an embodiment of the present invention. As shown in FIG. 6, the apparatus includes an acquisition module 60, a training module 62 and a processing module 64.

The acquisition module 60 is configured to acquire sample disk data of a disk through disk monitoring technology, where the sample disk data includes sample data in a plurality of dimensions. The training module 62 is configured to perform sample training on the sample disk data using the GBDT algorithm to obtain a disk prediction model composed of a plurality of decision trees. The processing module 64 is configured to, after receiving the disk data of a disk to be tested, process that disk data using the disk prediction model composed of a plurality of decision trees and determine whether the disk to be tested is a failed disk.

It should be noted here that the acquisition module 60, the training module 62 and the processing module 64 correspond to steps S21 to S25 in Embodiment 1. The examples and application scenarios implemented by these modules are the same as those of the corresponding steps, but are not limited to the content disclosed in Embodiment 1. It should be noted that the above modules, as part of the apparatus, can run in the computer terminal 10 provided in Embodiment 1.

根據本申請上述實施例，在一種優選的方案中，所述樣本磁碟資料為SMART磁碟資料，其中，所述樣本磁碟資料至少包括如下四個維度上的樣本資料：原始值、標準值、最差值和累積值。 According to the above embodiment of the present application, in a preferred embodiment, the sample disk data is SMART disk data, wherein the sample disk data includes at least sample data in the following four dimensions: original value, standard value , the worst value and the cumulative value.
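A record holding the four dimensions named above could be represented as follows. This is only an illustrative data structure; the class name `SmartSample` and its field names are hypothetical and are not defined by the source or by any SMART tool.

```python
from dataclasses import dataclass, astuple

@dataclass(frozen=True)
class SmartSample:
    """One SMART attribute reading in the four dimensions named in the text."""
    raw: float         # original (raw) value
    normalized: float  # standard (normalized) value
    worst: float       # worst value recorded so far
    cumulative: float  # cumulative value

    def as_vector(self):
        """Return the four dimensions as a flat feature vector."""
        return list(astuple(self))

# Hypothetical reading for a single attribute of one disk.
sample = SmartSample(raw=5.0, normalized=100.0, worst=98.0, cumulative=12.0)
print(sample.as_vector())
```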

根據本申請上述實施例，在一種優選的方案中，結合圖7所示，上述裝置還包括：運算模組70，用於對所述每個維度上的樣本資料進行如下任意一種或多種運算：差分運算、平方運算和分佈求和運算，使得任意一個維度上的樣本資料被擴展出新的維度上的樣本資料。 According to the above embodiment of the present application, in a preferred solution, as shown in FIG. 7, the apparatus further includes: an operation module 70, configured to perform any one or more of the following operations on the sample data in each dimension: The difference operation, the square operation, and the distribution sum operation are such that the sample data in any one dimension is expanded to the sample data in the new dimension.
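The difference, square and summation operations can be sketched for one attribute as follows. The sketch assumes hourly readings, so that a lag of 24 corresponds to the 24-hour difference, and it interprets the summation as a running sum of the differences; both are assumptions, as are the function and key names.

```python
def expand_features(series, lag=24):
    """Expand one attribute's readings into additional dimensions.

    series -- readings of one attribute, oldest first (assumed one per hour)
    lag    -- number of readings between the two values being differenced
    Returns one dict per reading from index lag onward, holding the value,
    its difference from the reading lag steps earlier, the squared
    difference, and the running sum of differences.
    """
    rows = []
    running_sum = 0.0
    for t in range(lag, len(series)):
        diff = series[t] - series[t - lag]
        running_sum += diff
        rows.append({
            "value": series[t],
            "diff": diff,
            "diff_squared": diff ** 2,
            "diff_sum": running_sum,
        })
    return rows

# Short hypothetical series; lag=2 keeps the example small.
series = [100, 100, 102, 105]
expanded = expand_features(series, lag=2)
for row in expanded:
    print(row)
```

Each original dimension thus yields several derived dimensions, which is how a single SMART attribute is extended into new sample data.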

It should be noted here that the operation module 70 corresponds to steps S21 to S25 in Embodiment 1. The examples and application scenarios implemented by this module are the same as those of the corresponding steps, but are not limited to the content disclosed in Embodiment 1. It should be noted that the above module, as part of the apparatus, can run in the computer terminal 10 provided in Embodiment 1.

According to the above embodiment of the present application, in a preferred solution, as shown in FIG. 8, the training module 62 further includes: an initial module 80, configured to use the sample disk data of all disks as training data and to initialize the classification model parameters of the training data with preset values; an extraction module 82, configured to extract a plurality of feature data from the training data, create the plurality of decision trees with each feature data as a root node, and use the feature value corresponding to each feature data as a leaf node of the corresponding decision tree; and a first calculation module 84, configured to calculate the best division of all current leaf nodes and its gain, and to split at the leaf node with the largest gain and the corresponding division point, so that the sample disk data is divided into child nodes.

It should be noted here that the initial module 80, the extraction module 82 and the first calculation module 84 correspond to steps S231 to S235 in Embodiment 1. The examples and application scenarios implemented by these modules are the same as those of the corresponding steps, but are not limited to the content disclosed in Embodiment 1. It should be noted that the above modules, as part of the apparatus, can run in the computer terminal 10 provided in Embodiment 1.

根據本申請上述實施例，在一種優選的方案中，結合圖9所示，所述提取模組82包括：讀取模組90，用於讀取任意一個特徵資料對應的閾值；比較模組92，用於將所述任意一個特徵資料的特徵值與所述閾值進行比較，並根據比較結果得到兩個分支的熵；確定模組94，用於根據所述兩個分支的熵確定兩個新節點作為所述任意一個特徵資料的兩個葉子節點；處理子模組96，用於採用上述步驟對每一個特徵資料進行處理，直到每個特徵資料得到預定的兩個唯一的葉子節點。 According to the above embodiment of the present application, in a preferred solution, as shown in FIG. 9, the extraction module 82 includes: a reading module 90 for reading a threshold corresponding to any feature data; and a comparison module 92. And comparing the feature value of the any one of the feature data with the threshold, and obtaining the entropy of the two branches according to the comparison result; the determining module 94 is configured to determine two new according to the entropy of the two branches. The node serves as two leaf nodes of the arbitrary feature data; the processing sub-module 96 is configured to process each feature data by using the above steps until each feature data obtains two predetermined unique leaf nodes.

It should be noted here that the reading module 90, the comparison module 92, the determination module 94 and the processing sub-module 96 correspond to steps S2331 to S2337 in Embodiment 1. The examples and application scenarios implemented by these modules are the same as those of the corresponding steps, but are not limited to the content disclosed in Embodiment 1. It should be noted that the above modules, as part of the apparatus, can run in the computer terminal 10 provided in Embodiment 1.

According to the above embodiment of the present application, in a preferred solution, as shown in FIG. 10, the processing module 64 includes: a receiving module 100, configured to assign an initial value to the disk data of the disk to be tested after that disk data is received; a second calculation module 102, configured to traverse each decision tree according to the initial value of the disk to be tested, calculate the prediction result determined by the first decision tree and a first residual, and assign the first residual to the initial value to obtain an updated initial value; and a traversal module 104, configured to calculate, using the updated initial value, the prediction result determined by the second decision tree and a second residual, assign the second residual to the updated initial value, and traverse all the decision trees in this way to obtain a result predicting whether the disk to be tested is a failed disk.

It should be noted here that the receiving module 100, the second calculation module 102 and the traversal module 104 correspond to steps S251 to S255 in Embodiment 1. The examples and application scenarios implemented by these modules are the same as those of the corresponding steps, but are not limited to the content disclosed in Embodiment 1. It should be noted that the above modules, as part of the apparatus, can run in the computer terminal 10 provided in Embodiment 1.

Embodiment 3

本發明的實施例可以提供一種電腦終端，該電腦終端可以是電腦終端群中的任意一個電腦終端設備。可選地，在本實施例中，上述電腦終端也可以替換為移動終端等終端設備。 An embodiment of the present invention may provide a computer terminal, which may be any computer terminal device in a computer terminal group. Optionally, in this embodiment, the foregoing computer terminal may also be replaced with a terminal device such as a mobile terminal.

可選地，在本實施例中，上述電腦終端可以位於電腦網路的多個網路設備中的至少一個網路設備。 Optionally, in this embodiment, the computer terminal may be located in at least one network device of the plurality of network devices of the computer network.

在本實施例中，上述電腦終端可以執行磁碟的故障預測方法中以下步驟的程式碼：透過磁碟監控技術獲取磁碟的樣本磁碟資料，其中，樣本磁碟資料包括多個維度上的樣本資料；採用GBDT演算法對樣本磁碟資料進行樣本訓練，得到由多個決策樹組成的磁碟預測模型；在接收到待測磁碟的磁碟資料之後，使用由多個決策樹組成的磁碟預測模型對待測磁碟的磁碟資料進行處理，確定待測磁碟是否為故障磁碟。 In this embodiment, the computer terminal can execute the code of the following steps in the method for predicting the fault of the disk: acquiring the sample disk data of the disk by using a disk monitoring technology, wherein the sample disk data includes multiple dimensions. Sample data; using the GBDT algorithm to sample the sample disk data to obtain a disk prediction model composed of multiple decision trees; after receiving the disk data of the disk to be tested, using a plurality of decision trees The disk prediction model processes the disk data of the disk to be determined to determine whether the disk to be tested is a failed disk.

可選地，圖11是根據本發明實施例的一種電腦終端的結構框圖。如圖11所示，該電腦終端A可以包括：一個或多個(圖中僅示出一個)處理器111、記憶體113、以及傳輸裝置115。 Optionally, FIG. 11 is a structural block diagram of a computer terminal according to an embodiment of the present invention. As shown in FIG. 11, the computer terminal A may include one or more (only one shown in the figure) processor 111, memory 113, and transmission device 115.

The memory can be used to store software programs and modules, such as the program instructions/modules corresponding to the disk failure prediction method and apparatus in the embodiments of the present invention. By running the software programs and modules stored in the memory, the processor executes various functional applications and data processing, that is, implements the above disk failure prediction method. The memory may include high-speed random access memory, and may also include non-volatile memory such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory located remotely from the processor, and such remote memory can be connected to terminal A via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

處理器可以透過傳輸裝置調用記憶體儲存的資訊及應用程式，以執行下述步驟：樣本磁碟資料為SMART磁碟資料，其中，樣本磁碟資料至少包括如下四個維度上的樣本資料：原始值、標準值、最差值和累積值。 The processor can call the information and application stored in the memory through the transmission device to perform the following steps: the sample disk data is SMART disk data, wherein the sample disk data includes at least the sample data in the following four dimensions: original Value, standard value, worst value, and cumulative value.

可選的，上述處理器還可以執行如下步驟的程式碼：對每個維度上的樣本資料進行如下任意一種或多種運算：差分運算、平方運算和分佈求和運算，使得任意一個維度上的樣本資料被擴展出新的維度上的樣本資料。 Optionally, the processor may further execute the following code: performing any one or more of the following operations on the sample data in each dimension: a differential operation, a square operation, and a distributed summation operation, so that samples in any one dimension The data is expanded to sample data in a new dimension.

Optionally, the processor may further execute program code for the following steps: using the sample disk data of all disks as training data, and initializing the classification model parameters of the training data with preset values; extracting a plurality of feature data from the training data, creating a plurality of decision trees with each feature data as a root node, and using the feature value corresponding to each feature data as a leaf node of the corresponding decision tree; and calculating the best division of all current leaf nodes and its gain, and splitting at the leaf node with the largest gain and the corresponding division point, so that the sample disk data is divided into child nodes.

可選的，上述處理器還可以執行如下步驟的程式碼：讀取任意一個特徵資料對應的閾值；將任意一個特徵資料的特徵值與閾值進行比較，並根據比較結果得到兩個分支的熵；根據兩個分支的熵確定兩個新節點作為任意一個特徵資料的兩個葉子節點；採用上述步驟對每一個特徵資料進行處理，直到每個特徵資料得到預定的兩個唯一的葉子節點。 Optionally, the processor may further execute the following steps: reading a threshold corresponding to any one of the feature data; comparing the feature value of any one of the feature data with a threshold, and obtaining an entropy of the two branches according to the comparison result; Two new nodes are determined as two leaf nodes of any one of the feature data according to the entropy of the two branches; each feature data is processed by the above steps until each feature data obtains two predetermined unique leaf nodes.

可選的，上述處理器還可以執行如下步驟的程式碼：在得到由多個決策樹組成的磁碟預測模型之後，方法還包括：對分類模型參數進行調整，其中，在分類模型參數包括故障磁碟樣本和非故障磁碟樣本的情況下，如果要確定待測磁碟是否為故障磁碟，則將分類模型參數中的故障磁碟樣本的比例調高。 Optionally, the processor may further execute the following code: after obtaining the disk prediction model composed of multiple decision trees, the method further includes: adjusting the classification model parameters, wherein the classification model parameters include faults In the case of a disk sample and a non-faulty disk sample, if it is determined whether the disk to be tested is a failed disk, the proportion of the failed disk sample in the classification model parameter is increased.

Optionally, the processor may further execute program code for the following steps: after the disk data of the disk to be tested is received, assigning an initial value to the disk data of the disk to be tested; traversing each decision tree according to the initial value of the disk to be tested, calculating the prediction result determined by the first decision tree and a first residual, and assigning the first residual to the initial value to obtain an updated initial value; and calculating, using the updated initial value, the prediction result determined by the second decision tree and a second residual, assigning the second residual to the updated initial value, and traversing all the decision trees in this way to obtain a result predicting whether the disk to be tested is a failed disk.

A person of ordinary skill in the art can understand that the structure shown in FIG. 11 is only illustrative. The computer terminal may also be a terminal device such as a smartphone (for example an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile Internet device (Mobile Internet Devices, MID) or a PAD. FIG. 11 does not limit the structure of the above electronic device. For example, computer terminal A may include more or fewer components than shown in FIG. 11 (such as a network interface or a display device), or have a configuration different from that shown in FIG. 11.

A person of ordinary skill in the art can understand that all or some of the steps of the methods in the above embodiments can be completed by a program instructing hardware related to the terminal device. The program can be stored in a computer-readable storage medium, and the storage medium may include a flash drive, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), a magnetic disk, an optical disc, and the like.

Embodiment 4

本發明的實施例還提供了一種儲存媒體。可選地，在本實施例中，上述儲存媒體可以用於保存上述實施例一所提供的磁碟的故障預測方法所執行的程式碼。 Embodiments of the present invention also provide a storage medium. Optionally, in the embodiment, the storage medium may be used to save the code executed by the fault prediction method of the disk provided in the first embodiment.

可選地，在本實施例中，上述儲存媒體可以位於電腦網路中電腦終端群中的任意一個電腦終端中，或者位於移動終端群中的任意一個移動終端中。 Optionally, in this embodiment, the storage medium may be located in any one of the computer terminal groups in the computer network, or in any one of the mobile terminal groups.

可選地，在本實施例中，儲存媒體被設置為儲存用於執行以下步驟的程式碼：透過磁碟監控技術獲取磁碟的樣本磁碟資料，其中，樣本磁碟資料包括多個維度上的樣本資料；採用GBDT演算法對樣本磁碟資料進行樣本訓練，得到由多個決策樹組成的磁碟預測模型；在接收到待測磁碟的磁碟資料之後，使用由多個決策樹組成的磁碟預測模型對待測磁碟的磁碟資料進行處理，確定待測磁碟是否為故障磁碟。 Optionally, in this embodiment, the storage medium is configured to store a code for performing the following steps: acquiring the magnetic disk data of the magnetic disk by using a disk monitoring technology, wherein the sample disk data includes multiple dimensions. Sample data; sample training of sample disk data using GBDT algorithm to obtain a disk prediction model composed of multiple decision trees; after receiving the disk data of the disk to be tested, using a plurality of decision trees The disk prediction model processes the disk data of the disk to be determined to determine whether the disk to be tested is a failed disk.

可選地，上述儲存媒體還被設置為儲存用於執行以下步驟的程式碼：對每個維度上的樣本資料進行如下任意一種或多種運算：差分運算、平方運算和分佈求和運算，使得任意一個維度上的樣本資料被擴展出新的維度上的樣本資料。 Optionally, the storage medium is further configured to store a code for performing the following steps: performing one or more of the following operations on the sample data in each dimension: a difference operation, a square operation, and a distribution sum operation, so that Sample data in one dimension is extended to sample data in a new dimension.

Optionally, the storage medium is further configured to store program code for performing the following steps: using the sample disk data of all disks as training data, and initializing the classification model parameters of the training data with preset values; extracting a plurality of feature data from the training data, creating a plurality of decision trees with each feature data as a root node, and using the feature value corresponding to each feature data as a leaf node of the corresponding decision tree; and calculating the best division of all current leaf nodes and its gain, and splitting at the leaf node with the largest gain and the corresponding division point, so that the sample disk data is divided into child nodes.

Optionally, the storage medium is further configured to store program code for performing the following steps: reading a threshold corresponding to any one feature data; comparing the feature value of that feature data with the threshold, and obtaining the entropy of the two branches according to the comparison result; determining two new nodes as the two leaf nodes of that feature data according to the entropy of the two branches; and processing every feature data using the above steps until each feature data obtains its two predetermined unique leaf nodes.

Optionally, the storage medium is further configured to store program code for performing the following step: after the disk prediction model composed of a plurality of decision trees is obtained, adjusting the classification model parameters, wherein, in the case where the classification model parameters include failed disk samples and non-failed disk samples, if it is to be determined whether the disk to be tested is a failed disk, the proportion of failed disk samples in the classification model parameters is increased.
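
The source does not say how the proportion of failed disk samples is raised. One common approach, shown here purely as an assumed illustration, is to duplicate failed samples until they reach a target share of the sample set; the function `oversample_failed`, the `target_ratio` parameter, and the sample layout are all hypothetical.

```python
import math


def oversample_failed(samples, target_ratio):
    """Duplicate failed samples until they make up at least `target_ratio`.

    Assumption: each sample is a (features, label) pair with label 1 for a
    failed disk and 0 for a non-failed disk.
    """
    failed = [s for s in samples if s[1] == 1]
    healthy = [s for s in samples if s[1] == 0]
    if not failed or not 0 < target_ratio < 1:
        return list(samples)
    # Smallest failed count n with n / (n + len(healthy)) >= target_ratio.
    needed = math.ceil(target_ratio * len(healthy) / (1 - target_ratio))
    extra = max(0, needed - len(failed))
    # Cycle through the existing failed samples to create the duplicates.
    return list(samples) + [failed[i % len(failed)] for i in range(extra)]


# One failed disk among ten samples, raised to a 25% share.
samples = [({"id": i}, 0) for i in range(9)] + [({"id": 9}, 1)]
balanced = oversample_failed(samples, target_ratio=0.25)
failed_count = sum(label for _, label in balanced)
```

Per-class weights in the training loss would be an alternative way to achieve a similar effect without duplicating data.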

Optionally, the storage medium is further configured to store program code for performing the following steps: after the disk data of the disk to be tested is received, assigning an initial value to the disk data of the disk to be tested; traversing each decision tree according to the initial value of the disk to be tested, calculating the prediction result and the first residual determined by the first decision tree, and assigning the first residual to the initial value to obtain an updated initial value; and calculating, with the updated initial value, the prediction result and the second residual determined by the second decision tree, and assigning the second residual to the updated initial value, and so on until all the decision trees have been traversed, to obtain the result predicting whether the disk to be tested is a failed disk.
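
The traversal above can be approximated by the standard additive formulation of gradient-boosted tree inference, in which a running score starts from an initial value and each tree adds its output in turn. This is a sketch under that assumption and may differ in detail from the residual update wording of the text; the nested-dict tree layout, the feature names `reallocated` and `pending`, the sigmoid mapping, and the 0.5 cutoff are all hypothetical.

```python
import math


def tree_output(tree, features):
    """Walk one decision tree from its root to a leaf and return the leaf value."""
    node = tree
    while "value" not in node:
        go_left = features[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["value"]


def predict_failure(trees, features, initial_value=0.0, cutoff=0.5):
    """Accumulate every tree's output onto the initial value, then classify."""
    score = initial_value
    for tree in trees:
        # Each tree refines the running score left by the previous trees.
        score += tree_output(tree, features)
    probability = 1.0 / (1.0 + math.exp(-score))
    return probability >= cutoff, probability


# Two hypothetical single-split trees over made-up feature names.
trees = [
    {"feature": "reallocated", "threshold": 50,
     "left": {"value": -1.0}, "right": {"value": 1.5}},
    {"feature": "pending", "threshold": 10,
     "left": {"value": -0.5}, "right": {"value": 2.0}},
]
suspect_failed, suspect_prob = predict_failure(trees, {"reallocated": 120, "pending": 0})
healthy_failed, healthy_prob = predict_failure(trees, {"reallocated": 3, "pending": 0})
```

The first disk accumulates a score of 1.0 and is classified as failed; the second accumulates -1.5 and is classified as non-failed.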

The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.

In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For parts not detailed in one embodiment, reference may be made to the related descriptions of the other embodiments.

In the several embodiments provided by the present application, it should be understood that the disclosed technical content may be implemented in other manners. The device embodiments described above are merely illustrative. For example, the division of the units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or take other forms.

The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the various embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as a standalone product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic disk, an optical disc, and other media capable of storing program code.

The above describes only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and these improvements and refinements shall also be regarded as falling within the scope of protection of the present invention.

Claims

A disk failure prediction method, comprising: acquiring sample disk data of a disk by using a disk monitoring technology, wherein the sample disk data includes sample data in a plurality of dimensions; performing sample training on the sample disk data by using a GBDT algorithm to obtain a disk prediction model composed of a plurality of decision trees; and after receiving disk data of a disk to be tested, processing the disk data of the disk to be tested by using the disk prediction model composed of the plurality of decision trees to determine whether the disk to be tested is a failed disk.

The method of claim 1, wherein the sample disk data is SMART disk data, and the sample disk data includes at least sample data in the following four dimensions: original value, standard value, worst value, and cumulative value.

The method of claim 2, wherein after acquiring the sample disk data of the disk by using the disk monitoring technology, the method further comprises: performing any one or more of the following operations on the sample data in each dimension: a difference operation, a square operation, and a distribution summation operation, so that the sample data in any one dimension is extended into sample data in new dimensions.

The method according to any one of claims 1 to 3, wherein performing sample training on the sample disk data by using the GBDT algorithm to obtain the disk prediction model composed of a plurality of decision trees comprises: using the sample disk data of all disks as training data, and initializing classification model parameters of the training data with preset values; extracting a plurality of feature data from the training data, creating the plurality of decision trees with each feature data as a root node, and using the feature value corresponding to each feature data as a leaf node of the corresponding decision tree; and calculating the optimal partition of all current leaf nodes and the gain thereof, and splitting at the leaf node with the largest gain and its corresponding partition point, so that the sample disk data is divided into child nodes.

The method of claim 4, wherein extracting a plurality of feature data from the training data, creating the plurality of decision trees with each feature data as a root node, and using the feature value corresponding to each feature data as a leaf node of the corresponding decision tree comprises: reading a threshold corresponding to any one feature data; comparing the feature value of that feature data with the threshold, and obtaining the entropy of two branches according to the comparison result; determining, according to the entropy of the two branches, two new nodes as the two leaf nodes of that feature data; and processing each feature data by the above steps until each feature data has obtained its two predetermined unique leaf nodes.

The method of claim 4, wherein after obtaining the disk prediction model composed of a plurality of decision trees, the method further comprises: adjusting the classification model parameters, wherein, in the case where the classification model parameters include failed disk samples and non-failed disk samples, if it is to be determined whether the disk to be tested is a failed disk, the proportion of failed disk samples in the classification model parameters is increased.

The method of claim 1, wherein processing the disk data of the disk to be tested by using the disk prediction model composed of a plurality of decision trees to determine whether the disk to be tested is a failed disk comprises: after receiving the disk data of the disk to be tested, assigning an initial value to the disk data of the disk to be tested; traversing each decision tree according to the initial value of the disk to be tested, calculating the prediction result and the first residual determined by the first decision tree, and assigning the first residual to the initial value to obtain an updated initial value; and calculating, with the updated initial value, the prediction result and the second residual determined by the second decision tree, and assigning the second residual to the updated initial value, and so on until all the decision trees have been traversed, to obtain the result predicting whether the disk to be tested is a failed disk.

A disk failure prediction device, comprising: an acquisition module, configured to acquire sample disk data of a disk by using a disk monitoring technology, wherein the sample disk data includes sample data in a plurality of dimensions; a training module, configured to perform sample training on the sample disk data by using a GBDT algorithm to obtain a disk prediction model composed of a plurality of decision trees; and a processing module, configured to, after disk data of a disk to be tested is received, process the disk data of the disk to be tested by using the disk prediction model composed of the plurality of decision trees to determine whether the disk to be tested is a failed disk.

The device of claim 8, wherein the sample disk data is SMART disk data, and the sample disk data includes at least sample data in the following four dimensions: original value, standard value, worst value, and cumulative value.

The device of claim 9, wherein the device further comprises: an operation module, configured to perform any one or more of the following operations on the sample data in each dimension: a difference operation, a square operation, and a distribution summation operation, so that the sample data in any one dimension is extended into sample data in new dimensions.

The device of any one of claims 8 to 10, wherein the training module further comprises: an initialization module, configured to use the sample disk data of all disks as training data and to initialize classification model parameters of the training data with preset values; an extraction module, configured to extract a plurality of feature data from the training data, create the plurality of decision trees with each feature data as a root node, and use the feature value corresponding to each feature data as a leaf node of the corresponding decision tree; and a first computing module, configured to calculate the optimal partition of all current leaf nodes and the gain thereof, and to split at the leaf node with the largest gain and its corresponding partition point, so that the sample disk data is divided into child nodes.

The device of claim 11, wherein the extraction module comprises: a reading module, configured to read a threshold corresponding to any one feature data; a comparison module, configured to compare the feature value of that feature data with the threshold and to obtain the entropy of two branches according to the comparison result; a determining module, configured to determine, according to the entropy of the two branches, two new nodes as the two leaf nodes of that feature data; and a processing sub-module, configured to process each feature data by the above steps until each feature data has obtained its two predetermined unique leaf nodes.

The device of claim 11, wherein the device is further configured to, after the disk prediction model composed of a plurality of decision trees is obtained, adjust the classification model parameters, wherein, in the case where the classification model parameters include failed disk samples and non-failed disk samples, if it is to be determined whether the disk to be tested is a failed disk, the proportion of failed disk samples in the classification model parameters is increased.

The device of claim 8, wherein the processing module comprises: a receiving module, configured to assign an initial value to the disk data of the disk to be tested after the disk data of the disk to be tested is received; a second computing module, configured to traverse each decision tree according to the initial value of the disk to be tested, calculate the prediction result and the first residual determined by the first decision tree, and assign the first residual to the initial value to obtain an updated initial value; and a traversal module, configured to calculate, with the updated initial value, the prediction result and the second residual determined by the second decision tree, and assign the second residual to the updated initial value, and so on until all the decision trees have been traversed, to obtain the result predicting whether the disk to be tested is a failed disk.