WO2016107402A1 - Disk failure prediction method and apparatus based on a prediction model - Google Patents

Disk failure prediction method and apparatus based on a prediction model

Info

Publication number
WO2016107402A1
Authority
WO
WIPO (PCT)
Prior art keywords
disk
information
fault
failure
prediction
Prior art date
Application number
PCT/CN2015/097409
Other languages
English (en)
French (fr)
Inventor
何东杰
张凌毅
Original Assignee
中国银联股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国银联股份有限公司
Publication of WO2016107402A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring

Definitions

  • The present invention relates to a disk failure prediction method and apparatus, and more particularly to a disk failure prediction method and apparatus based on a prediction model.
  • The existing technical solutions have the following problems: limited redundant backup can ensure the continuity of data processing and system operation only when a small number of disks are damaged; when more disks fail, it is difficult to ensure stable system operation and data processing efficiency. In addition, because the number of failed disks cannot be estimated, it is difficult to prepare the right number of disks for emergency and backup use, which increases maintenance costs.
  • The present invention proposes a prediction-model-based disk failure prediction method and apparatus capable of predicting the probability and the number of disk failures.
  • A disk failure prediction method based on a prediction model includes the following steps:
  • (A1) Collect basic information, historical operation information, and fault information of a large number of disks;
  • (A2) Analyze the collected basic information, historical operation information, and fault information to determine the factors strongly correlated with disk failure, and construct a prediction model based on the determined factors;
  • (A3) Predict, based on the prediction model, the probability and the number of failures of each running disk.
  • Preferably, the prediction model is a linear prediction model.
  • Preferably, the method further comprises: periodically detecting and recording the operating state information of each running disk and, when a disk fails, recording the failure time, the failure cause, and the operating state of the disk at the time of failure.
  • Preferably, the basic information includes the disk type, the disk manufacturer, and disk factory information; the historical operation information includes the running time, the operating environment, and time-series-based operating state information; and the fault information includes the failure time and the failure cause.
  • Preferably, step (A2) further comprises: analyzing the correlation among the basic information, the historical operation information, and the fault information of the disks to determine the factors strongly correlated with disk failure, and constructing the linear prediction model using the determined strongly correlated factors as sampling points.
  • Preferably, step (A3) further comprises: acquiring the basic information and current operation information of each disk, calculating and predicting, based on the linear prediction model, the probability that each disk fails within the next day or week, and determining from the predicted probabilities the number of disks that may fail within the next day or week.
  • A disk failure prediction apparatus based on a prediction model comprises:
  • an information collection unit, which collects basic information, historical operation information, and fault information of a large number of disks and transmits the collected information to the model building unit;
  • a model building unit, which analyzes the basic information, historical operation information, and fault information of the disks to determine the factors strongly correlated with disk failure, constructs a prediction model based on the determined factors, and provides the prediction model to the failure prediction unit;
  • a failure prediction unit, which predicts, based on the prediction model, the probability and the number of failures of each running disk.
  • Preferably, the prediction model is a linear prediction model.
  • Preferably, the apparatus further includes an information recording unit that periodically detects and records the operating state information of each running disk and, when a disk fails, records the failure time, the failure cause, and the operating state of the disk at the time of failure.
  • Preferably, the basic information includes the disk type, the disk manufacturer, and disk factory information; the historical operation information includes the running time, the operating environment, and time-series-based operating state information; and the fault information includes the failure time and the failure cause.
  • Preferably, the model building unit constructs the linear prediction model as follows: it analyzes the correlation among the basic information, the historical operation information, and the fault information of the disks to determine the factors strongly correlated with disk failure, and constructs the linear prediction model using the determined strongly correlated factors as sampling points.
  • Preferably, the failure prediction unit predicts the probability and the number of failures of each running disk as follows: it acquires the basic information and current operation information of each disk, calculates and predicts, based on the linear prediction model, the probability that each disk fails within the next day or week, and determines from the predicted probabilities the number of disks that may fail within the next day or week.
  • The prediction-model-based disk failure prediction method and apparatus disclosed by the invention have the following advantages: the probability and the number of disk failures can be predicted, which helps ensure stable system operation and data processing efficiency, and the number of failing disks can be estimated, so that disks for emergency and backup use can be prepared more accurately, making maintenance easier and less expensive.
  • FIG. 1 is a flow chart of a disk failure prediction method based on a prediction model according to an embodiment of the present invention
  • FIG. 2 is a schematic structural diagram of a disk fault prediction apparatus based on a prediction model according to an embodiment of the present invention.
  • As shown in FIG. 1, the prediction-model-based disk failure prediction method disclosed in the present invention includes the following steps: (A1) collecting basic information, historical operation information, and fault information of a large number of disks; (A2) analyzing the collected basic information, historical operation information, and fault information to determine the factors strongly correlated with disk failure, and constructing a prediction model based on the determined factors; (A3) predicting, based on the prediction model, the probability and the number of failures of each running disk.
  • Preferably, the prediction model is a linear prediction model.
  • Preferably, the method further comprises: periodically (for example, every 10 seconds or every minute) detecting and recording the operating state information of each running disk and, when a disk fails, recording the failure time, the failure cause, and the operating state of the disk at the time of failure.
  • Preferably, the basic information includes, but is not limited to, the disk type, the disk manufacturer, and disk factory information; the historical operation information includes, but is not limited to, the running time, the operating environment (that is, the usage scenario: a mass-data processing environment involves a large number of read and write operations, whereas a virtualized environment involves only a small number), and time-series-based operating state information; and the fault information includes, but is not limited to, the failure time and the failure cause.
  • Preferably, step (A2) further comprises: analyzing the correlation among the basic information, the historical operation information, and the fault information of the disks to determine the factors strongly correlated with disk failure (such as running time and access frequency), and constructing the linear prediction model using the determined strongly correlated factors as sampling points (for example, the time unit of the linear prediction model is "day").
  • Preferably, step (A3) further comprises: acquiring the basic information and current operation information of each disk, calculating and predicting, based on the linear prediction model, the probability that each disk fails within the next day or week, and determining from the predicted probabilities the number of disks that may fail within the next day or week (for example, the number of disks whose failure probability is greater than a predetermined threshold).
  • The prediction-model-based disk failure prediction method disclosed by the present invention has the following advantages: it can predict the probability and the number of disk failures, which helps ensure stable system operation and data processing efficiency, and it can estimate the number of failing disks, so that disks for emergency and backup use can be prepared more accurately, making maintenance easier and less expensive.
  • As shown in FIG. 2, the prediction-model-based disk failure prediction apparatus includes an information collection unit 1, a model building unit 2, and a failure prediction unit 3.
  • The information collection unit 1 collects basic information, historical operation information, and fault information of a large number of disks and transmits the collected information to the model building unit 2.
  • The model building unit 2 analyzes the basic information, historical operation information, and fault information of the disks to determine the factors strongly correlated with disk failure, constructs a prediction model based on the determined factors, and provides the prediction model to the failure prediction unit 3.
  • The failure prediction unit 3 predicts, based on the prediction model, the probability and the number of failures of each running disk.
  • Preferably, the prediction model is a linear prediction model.
  • Preferably, the apparatus further includes an information recording unit 4 that periodically (for example, every 10 seconds or every minute) detects and records the operating state information of each running disk and, when a disk fails, records the failure time, the failure cause, and the operating state of the disk at the time of failure.
  • Preferably, the basic information includes, but is not limited to, the disk type, the disk manufacturer, and disk factory information; the historical operation information includes, but is not limited to, the running time, the operating environment (that is, the usage scenario: a mass-data processing environment involves a large number of read and write operations, whereas a virtualized environment involves only a small number), and time-series-based operating state information; and the fault information includes, but is not limited to, the failure time and the failure cause.
  • Preferably, the model building unit 2 constructs the linear prediction model as follows: it analyzes the correlation among the basic information, the historical operation information, and the fault information of the disks to determine the factors strongly correlated with disk failure (such as running time and access frequency), and constructs the linear prediction model using the determined strongly correlated factors as sampling points (for example, the time unit of the linear prediction model is "day").
  • Preferably, the failure prediction unit 3 predicts the probability and the number of failures of each running disk as follows: it acquires the basic information and current operation information of each disk, calculates and predicts, based on the linear prediction model, the probability that each disk fails within the next day or week, and determines from the predicted probabilities the number of disks that may fail within the next day or week (for example, the number of disks whose failure probability is greater than a predetermined threshold).
  • The prediction-model-based disk failure prediction apparatus disclosed by the present invention has the following advantages: it can predict the probability and the number of disk failures, which helps ensure stable system operation and data processing efficiency, and it can estimate the number of failing disks, so that disks for emergency and backup use can be prepared more accurately, making maintenance easier and less expensive.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Debugging And Monitoring (AREA)

Abstract

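A disk failure prediction method and apparatus based on a prediction model, wherein the method comprises: collecting basic information, historical operation information, and fault information of a large number of disks (A1); analyzing the collected basic information, historical operation information, and fault information to determine the factors strongly correlated with disk failure, and constructing a prediction model based on the determined factors (A2); and predicting, based on the prediction model, the probability and the number of failures of each running disk (A3). The prediction-model-based disk failure prediction method and apparatus can predict the probability and the number of disk failures.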

Description

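Disk failure prediction method and apparatus based on a prediction model
Technical Field
The present invention relates to a disk failure prediction method and apparatus, and more particularly to a disk failure prediction method and apparatus based on a prediction model.
Background
As computer and network applications become ever more widespread and the types of services in different fields grow richer, cloud computing technology based on mass data storage is being applied more and more widely. As the number of mechanical hard disks (for example, SAS and SATA hard disks) used to store massive data grows significantly, the probability of their failure and the resulting adverse effects also increase, so handling and maintaining disk failures is becoming increasingly important.
In existing technical solutions, limited redundant backup is typically used to ensure the continuity of data processing and system operation when a disk fails.
However, the existing technical solutions have the following problems: limited redundant backup can ensure the continuity of data processing and system operation only when a small number of disks are damaged; when more disks fail, it is difficult to ensure stable system operation and data processing efficiency. In addition, because the number of failed disks cannot be estimated, it is difficult to prepare the right number of disks for emergency and backup use, which increases maintenance costs.
Therefore, there is a need for a prediction-model-based disk failure prediction method and apparatus capable of predicting the probability and the number of disk failures.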
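Summary of the Invention
To solve the problems of the existing technical solutions described above, the present invention proposes a prediction-model-based disk failure prediction method and apparatus capable of predicting the probability and the number of disk failures.
The object of the present invention is achieved by the following technical solution:
A disk failure prediction method based on a prediction model, the method including the following steps:
(A1) collecting basic information, historical operation information, and fault information of a large number of disks;
(A2) analyzing the collected basic information, historical operation information, and fault information to determine the factors strongly correlated with disk failure, and constructing a prediction model based on the determined factors;
(A3) predicting, based on the prediction model, the probability and the number of failures of each running disk.
In the solution disclosed above, preferably, the prediction model is a linear prediction model.
In the solution disclosed above, preferably, the method further comprises: periodically detecting and recording the operating state information of each running disk and, when a disk fails, recording the failure time, the failure cause, and the operating state of the disk at the time of failure.
In the solution disclosed above, preferably, the basic information includes the disk type, the disk manufacturer, and disk factory information; the historical operation information includes the running time, the operating environment, and time-series-based operating state information; and the fault information includes the failure time and the failure cause.
In the solution disclosed above, preferably, step (A2) further comprises: analyzing the correlation among the basic information, the historical operation information, and the fault information of the disks to determine the factors strongly correlated with disk failure, and constructing the linear prediction model using the determined strongly correlated factors as sampling points.
In the solution disclosed above, preferably, step (A3) further comprises: acquiring the basic information and current operation information of each disk, calculating and predicting, based on the linear prediction model, the probability that each disk fails within the next day or week, and determining from the predicted probabilities the number of disks that may fail within the next day or week.
The object of the present invention can also be achieved by the following technical solution:
A disk failure prediction apparatus based on a prediction model, the apparatus comprising:
an information collection unit, which collects basic information, historical operation information, and fault information of a large number of disks and transmits the collected information to the model building unit;
a model building unit, which analyzes the basic information, historical operation information, and fault information of the disks to determine the factors strongly correlated with disk failure, constructs a prediction model based on the determined factors, and provides the prediction model to the failure prediction unit;
a failure prediction unit, which predicts, based on the prediction model, the probability and the number of failures of each running disk.
In the solution disclosed above, preferably, the prediction model is a linear prediction model.
In the solution disclosed above, preferably, the apparatus further includes an information recording unit that periodically detects and records the operating state information of each running disk and, when a disk fails, records the failure time, the failure cause, and the operating state of the disk at the time of failure.
In the solution disclosed above, preferably, the basic information includes the disk type, the disk manufacturer, and disk factory information; the historical operation information includes the running time, the operating environment, and time-series-based operating state information; and the fault information includes the failure time and the failure cause.
In the solution disclosed above, preferably, the model building unit constructs the linear prediction model as follows: it analyzes the correlation among the basic information, the historical operation information, and the fault information of the disks to determine the factors strongly correlated with disk failure, and constructs the linear prediction model using the determined strongly correlated factors as sampling points.
In the solution disclosed above, preferably, the failure prediction unit predicts the probability and the number of failures of each running disk as follows: it acquires the basic information and current operation information of each disk, calculates and predicts, based on the linear prediction model, the probability that each disk fails within the next day or week, and determines from the predicted probabilities the number of disks that may fail within the next day or week.
The prediction-model-based disk failure prediction method and apparatus disclosed by the present invention have the following advantages: the probability and the number of disk failures can be predicted, which helps ensure stable system operation and data processing efficiency, and the number of failing disks can be estimated, so that disks for emergency and backup use can be prepared more accurately, making maintenance easier and less expensive.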
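Brief Description of the Drawings
The technical features and advantages of the present invention will be better understood by those skilled in the art with reference to the accompanying drawings, in which:
FIG. 1 is a flowchart of a prediction-model-based disk failure prediction method according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a prediction-model-based disk failure prediction apparatus according to an embodiment of the present invention.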
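Detailed Description
FIG. 1 is a flowchart of a prediction-model-based disk failure prediction method according to an embodiment of the present invention. As shown in FIG. 1, the method disclosed by the present invention includes the following steps: (A1) collecting basic information, historical operation information, and fault information of a large number of disks; (A2) analyzing the collected basic information, historical operation information, and fault information to determine the factors strongly correlated with disk failure, and constructing a prediction model based on the determined factors; (A3) predicting, based on the prediction model, the probability and the number of failures of each running disk.
Preferably, in the prediction-model-based disk failure prediction method disclosed by the present invention, the prediction model is a linear prediction model.
Preferably, the prediction-model-based disk failure prediction method disclosed by the present invention further comprises: periodically (for example, every 10 seconds or every minute) detecting and recording the operating state information of each running disk and, when a disk fails, recording the failure time, the failure cause, and the operating state of the disk at the time of failure.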
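The periodic recording step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `StatusSample`, `FailureRecord`, `StatusRecorder`, and the `read_status` callback are hypothetical, and the status fields (such as `temperature`) are placeholders, since the source does not say which operating state values are recorded.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class StatusSample:
    disk_id: str
    timestamp: float          # seconds since an arbitrary epoch
    status: Dict[str, float]  # illustrative fields, e.g. {"temperature": 41.0}


@dataclass
class FailureRecord:
    disk_id: str
    failure_time: float
    cause: str
    status_at_failure: Optional[Dict[str, float]]


@dataclass
class StatusRecorder:
    """Hypothetical recorder; poll() is meant to be called once per detection period."""
    read_status: Callable[[str], Dict[str, float]]
    samples: List[StatusSample] = field(default_factory=list)
    failures: List[FailureRecord] = field(default_factory=list)

    def poll(self, disk_ids: List[str], now: float) -> None:
        # One detection cycle: record the operating state of every running disk.
        for disk_id in disk_ids:
            self.samples.append(StatusSample(disk_id, now, self.read_status(disk_id)))

    def record_failure(self, disk_id: str, now: float, cause: str) -> FailureRecord:
        # Attach the most recently recorded state of the failed disk, if there is one.
        last = next((s.status for s in reversed(self.samples) if s.disk_id == disk_id), None)
        rec = FailureRecord(disk_id, now, cause, last)
        self.failures.append(rec)
        return rec
```

A scheduler (not shown) would call `poll` at the chosen period, for example every 10 seconds, and call `record_failure` when a disk is reported as failed.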
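Preferably, in the prediction-model-based disk failure prediction method disclosed by the present invention, the basic information includes, but is not limited to, the disk type, the disk manufacturer, and disk factory information; the historical operation information includes, but is not limited to, the running time, the operating environment (that is, the usage scenario: a mass-data processing environment involves a large number of read and write operations, whereas a virtualized environment involves only a small number), and time-series-based operating state information; and the fault information includes, but is not limited to, the failure time and the failure cause.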
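One way to picture the three categories of collected information is as plain record types. The sketch below is an assumption about layout only: the class and field names are hypothetical, and the source does not define the contents of the factory information or of the operating state series.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class BasicInfo:
    disk_type: str        # e.g. "SAS" or "SATA"
    manufacturer: str
    factory_info: str     # free-form here; the source does not define its contents


@dataclass
class HistoricalOperationInfo:
    running_time_days: float
    operating_environment: str   # usage scenario, e.g. "mass-data" or "virtualized"
    # Time-series-based operating state as (timestamp, value) pairs; illustrative only.
    status_series: List[Tuple[float, float]] = field(default_factory=list)


@dataclass
class FaultInfo:
    failure_time: float
    cause: str


@dataclass
class DiskRecord:
    disk_id: str
    basic: BasicInfo
    history: HistoricalOperationInfo
    fault: Optional[FaultInfo] = None   # None while the disk has not failed

    @property
    def has_failed(self) -> bool:
        return self.fault is not None
```

Keeping the fault information optional lets the same record type describe both healthy disks and failed ones, which is convenient when the records later serve as labeled training data.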
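Preferably, in the prediction-model-based disk failure prediction method disclosed by the present invention, step (A2) further comprises: analyzing the correlation among the basic information, the historical operation information, and the fault information of the disks to determine the factors strongly correlated with disk failure (such as running time and access frequency), and constructing the linear prediction model using the determined strongly correlated factors as sampling points (for example, the time unit of the linear prediction model is "day").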
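Step (A2) can be illustrated with a small sketch. The source does not name a correlation measure or a fitting procedure, so two assumptions are made here and should be read as stand-ins: Pearson correlation with a cutoff `min_abs_r` selects the "strongly correlated" factors, and ordinary least squares fits the linear model. All function names are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / sqrt(vx * vy)


def select_factors(columns: Dict[str, List[float]], failed: List[float],
                   min_abs_r: float = 0.5) -> List[str]:
    """Keep factors whose |r| against the 0/1 failure label reaches the cutoff (assumed)."""
    return [name for name, col in columns.items()
            if abs(pearson(col, failed)) >= min_abs_r]


def fit_linear(rows: List[List[float]], ys: List[float]) -> List[float]:
    """Least-squares fit of y ~ w0 + w1*x1 + ...; returns [w0, w1, ...].

    Raises ZeroDivisionError if the factors are linearly dependent.
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Normal equations (X^T X) w = X^T y, stored as an augmented matrix.
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, ys))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]
```

In use, `select_factors` would be applied to the collected columns first, and only the surviving factors would be passed as `rows` to `fit_linear`.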
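Preferably, in the prediction-model-based disk failure prediction method disclosed by the present invention, step (A3) further comprises: acquiring the basic information and current operation information of each disk, calculating and predicting, based on the linear prediction model, the probability that each disk fails within the next day or week, and determining from the predicted probabilities the number of disks that may fail within the next day or week (for example, the number of disks whose failure probability is greater than a predetermined threshold).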
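The thresholding described in step (A3) can be sketched as below. The weights are assumed to come from some previously fitted linear model, and clipping the linear output to the range [0, 1] is an assumption added here so that the value can be read as a probability; the source does not say how the model output is normalized. The function names and the example threshold are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def failure_probability(weights: Sequence[float], factors: Sequence[float]) -> float:
    """Linear model output w0 + sum(wi * xi), clipped to [0, 1] (clipping is assumed)."""
    raw = weights[0] + sum(w * x for w, x in zip(weights[1:], factors))
    return min(1.0, max(0.0, raw))


def predict_fleet(weights: Sequence[float],
                  disks: Dict[str, Sequence[float]],
                  threshold: float = 0.5) -> Tuple[Dict[str, float], List[str]]:
    """Return per-disk probabilities and the disks whose probability exceeds the threshold."""
    probs = {disk_id: failure_probability(weights, f) for disk_id, f in disks.items()}
    at_risk = sorted(d for d, p in probs.items() if p > threshold)
    return probs, at_risk
```

The predicted number of failing disks is then `len(at_risk)`, which is the figure an operator would use to size the pool of spare disks for the coming day or week.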
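As can be seen from the above, the prediction-model-based disk failure prediction method disclosed by the present invention has the following advantages: it can predict the probability and the number of disk failures, which helps ensure stable system operation and data processing efficiency, and it can estimate the number of failing disks, so that disks for emergency and backup use can be prepared more accurately, making maintenance easier and less expensive.
FIG. 2 is a schematic structural diagram of a prediction-model-based disk failure prediction apparatus according to an embodiment of the present invention. As shown in FIG. 2, the apparatus disclosed by the present invention includes an information collection unit 1, a model building unit 2, and a failure prediction unit 3. The information collection unit 1 collects basic information, historical operation information, and fault information of a large number of disks and transmits the collected information to the model building unit 2. The model building unit 2 analyzes the basic information, historical operation information, and fault information of the disks to determine the factors strongly correlated with disk failure, constructs a prediction model based on the determined factors, and provides the prediction model to the failure prediction unit 3. The failure prediction unit 3 predicts, based on the prediction model, the probability and the number of failures of each running disk.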
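The data flow between units 1, 2, and 3 can be sketched as three small classes. This is only an illustration of how the units hand data to one another: the class names, the single `running_days` factor, the closed-form one-variable least-squares fit, and the clipping to [0, 1] are all assumptions, not details given in the source.

```python
from typing import Callable, Dict, List, Tuple

Record = Tuple[float, float]          # (running_days, failed as 0.0 or 1.0); illustrative
Model = Callable[[float], float]      # maps running_days to a failure probability


class InformationCollectionUnit:      # unit 1
    def __init__(self, records: List[Record]):
        self._records = records

    def collect(self) -> List[Record]:
        return list(self._records)


class ModelBuildingUnit:              # unit 2
    def build(self, records: List[Record]) -> Model:
        # One-variable least squares; assumes the running times are not all equal.
        n = len(records)
        mx = sum(x for x, _ in records) / n
        my = sum(y for _, y in records) / n
        sxx = sum((x - mx) ** 2 for x, _ in records)
        slope = sum((x - mx) * (y - my) for x, y in records) / sxx
        intercept = my - slope * mx
        return lambda x: min(1.0, max(0.0, intercept + slope * x))


class FailurePredictionUnit:          # unit 3
    def __init__(self, model: Model, threshold: float = 0.5):
        self.model = model
        self.threshold = threshold

    def predict(self, running: Dict[str, float]) -> Tuple[Dict[str, float], int]:
        # Per-disk probabilities plus the count of disks above the threshold.
        probs = {disk_id: self.model(x) for disk_id, x in running.items()}
        return probs, sum(p > self.threshold for p in probs.values())
```

Wiring the units together follows the figure: `FailurePredictionUnit(ModelBuildingUnit().build(collector.collect()))`, where `collector` is an `InformationCollectionUnit` holding the historical records.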
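Preferably, in the prediction-model-based disk failure prediction apparatus disclosed by the present invention, the prediction model is a linear prediction model.
Preferably, the prediction-model-based disk failure prediction apparatus disclosed by the present invention further includes an information recording unit 4 that periodically (for example, every 10 seconds or every minute) detects and records the operating state information of each running disk and, when a disk fails, records the failure time, the failure cause, and the operating state of the disk at the time of failure.
Preferably, in the prediction-model-based disk failure prediction apparatus disclosed by the present invention, the basic information includes, but is not limited to, the disk type, the disk manufacturer, and disk factory information; the historical operation information includes, but is not limited to, the running time, the operating environment (that is, the usage scenario: a mass-data processing environment involves a large number of read and write operations, whereas a virtualized environment involves only a small number), and time-series-based operating state information; and the fault information includes, but is not limited to, the failure time and the failure cause.
Preferably, in the prediction-model-based disk failure prediction apparatus disclosed by the present invention, the model building unit 2 constructs the linear prediction model as follows: it analyzes the correlation among the basic information, the historical operation information, and the fault information of the disks to determine the factors strongly correlated with disk failure (such as running time and access frequency), and constructs the linear prediction model using the determined strongly correlated factors as sampling points (for example, the time unit of the linear prediction model is "day").
Preferably, in the prediction-model-based disk failure prediction apparatus disclosed by the present invention, the failure prediction unit 3 predicts the probability and the number of failures of each running disk as follows: it acquires the basic information and current operation information of each disk, calculates and predicts, based on the linear prediction model, the probability that each disk fails within the next day or week, and determines from the predicted probabilities the number of disks that may fail within the next day or week (for example, the number of disks whose failure probability is greater than a predetermined threshold).
As can be seen from the above, the prediction-model-based disk failure prediction apparatus disclosed by the present invention has the following advantages: it can predict the probability and the number of disk failures, which helps ensure stable system operation and data processing efficiency, and it can estimate the number of failing disks, so that disks for emergency and backup use can be prepared more accurately, making maintenance easier and less expensive.
Although the present invention has been described by way of the preferred embodiments above, its implementation is not limited to those embodiments. It should be recognized that those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope.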

Claims (12)

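  1. A disk failure prediction method based on a prediction model, the method including the following steps:
    (A1) collecting basic information, historical operation information, and fault information of a large number of disks;
    (A2) analyzing the collected basic information, historical operation information, and fault information to determine the factors strongly correlated with disk failure, and constructing a prediction model based on the determined factors;
    (A3) predicting, based on the prediction model, the probability and the number of failures of each running disk.
  2. The prediction-model-based disk failure prediction method according to claim 1, characterized in that the prediction model is a linear prediction model.
  3. The prediction-model-based disk failure prediction method according to claim 2, characterized in that the method further comprises: periodically detecting and recording the operating state information of each running disk and, when a disk fails, recording the failure time, the failure cause, and the operating state of the disk at the time of failure.
  4. The prediction-model-based disk failure prediction method according to claim 3, characterized in that the basic information includes the disk type, the disk manufacturer, and disk factory information; the historical operation information includes the running time, the operating environment, and time-series-based operating state information; and the fault information includes the failure time and the failure cause.
  5. The prediction-model-based disk failure prediction method according to claim 4, characterized in that step (A2) further comprises: analyzing the correlation among the basic information, the historical operation information, and the fault information of the disks to determine the factors strongly correlated with disk failure, and constructing the linear prediction model using the determined strongly correlated factors as sampling points.
  6. The prediction-model-based disk failure prediction method according to claim 5, characterized in that step (A3) further comprises: acquiring the basic information and current operation information of each disk, calculating and predicting, based on the linear prediction model, the probability that each disk fails within the next day or week, and determining from the predicted probabilities the number of disks that may fail within the next day or week.
  7. A disk failure prediction apparatus based on a prediction model, the apparatus comprising:
    an information collection unit, which collects basic information, historical operation information, and fault information of a large number of disks and transmits the collected information to the model building unit;
    a model building unit, which analyzes the basic information, historical operation information, and fault information of the disks to determine the factors strongly correlated with disk failure, constructs a prediction model based on the determined factors, and provides the prediction model to the failure prediction unit;
    a failure prediction unit, which predicts, based on the prediction model, the probability and the number of failures of each running disk.
  8. The prediction-model-based disk failure prediction apparatus according to claim 7, characterized in that the prediction model is a linear prediction model.
  9. The prediction-model-based disk failure prediction apparatus according to claim 8, characterized in that the apparatus further includes an information recording unit that periodically detects and records the operating state information of each running disk and, when a disk fails, records the failure time, the failure cause, and the operating state of the disk at the time of failure.
  10. The prediction-model-based disk failure prediction apparatus according to claim 9, characterized in that the basic information includes the disk type, the disk manufacturer, and disk factory information; the historical operation information includes the running time, the operating environment, and time-series-based operating state information; and the fault information includes the failure time and the failure cause.
  11. The prediction-model-based disk failure prediction apparatus according to claim 10, characterized in that the model building unit constructs the linear prediction model as follows: it analyzes the correlation among the basic information, the historical operation information, and the fault information of the disks to determine the factors strongly correlated with disk failure, and constructs the linear prediction model using the determined strongly correlated factors as sampling points.
  12. The prediction-model-based disk failure prediction apparatus according to claim 11, characterized in that the failure prediction unit predicts the probability and the number of failures of each running disk as follows: it acquires the basic information and current operation information of each disk, calculates and predicts, based on the linear prediction model, the probability that each disk fails within the next day or week, and determines from the predicted probabilities the number of disks that may fail within the next day or week.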
PCT/CN2015/097409 2014-12-31 2015-12-15 Disk failure prediction method and apparatus based on a prediction model WO2016107402A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410845353.1 2014-12-31
CN201410845353.1A CN105589795A (zh) 2014-12-31 2014-12-31 Disk failure prediction method and apparatus based on a prediction model

Publications (1)

Publication Number Publication Date
WO2016107402A1 (zh) 2016-07-07

Family

ID=55929393

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/097409 WO2016107402A1 (zh) 2014-12-31 2015-12-15 Disk failure prediction method and apparatus based on a prediction model

Country Status (2)

Country Link
CN (1) CN105589795A (zh)
WO (1) WO2016107402A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111381990A (zh) * 2020-03-16 2020-07-07 上海威固信息技术股份有限公司 一种基于流特征的磁盘故障预测方法及装置
CN114697203A (zh) * 2022-03-31 2022-07-01 浙江省通信产业服务有限公司 一种网络故障的预判方法、装置、电子设备及存储介质
CN116126593A (zh) * 2023-01-10 2023-05-16 华南高科(广东)股份有限公司 一种云平台环境下的数据备份系统及方法

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107728929A (zh) * 2016-08-10 2018-02-23 先智云端数据股份有限公司 用于云端服务系统中数据保护的方法
CN106598791B (zh) * 2016-09-12 2020-08-21 湖南微软创新中心有限公司 一种基于机器学习的工业设备故障预防性识别方法
CN106598486B (zh) * 2016-11-11 2019-08-02 工业和信息化部电信研究院 一种云服务数据存储持久性评估装置和方法
CN107239388A (zh) * 2017-05-27 2017-10-10 郑州云海信息技术有限公司 一种监测告警方法及系统
CN107479836A (zh) * 2017-08-29 2017-12-15 郑州云海信息技术有限公司 磁盘故障监控方法、装置以及存储系统
CN107957935B (zh) * 2017-11-23 2021-01-12 上海联影医疗科技股份有限公司 设备的控制方法和装置、计算机可读存储介质
CN109962857B (zh) * 2017-12-26 2022-08-30 中国电信股份有限公司 流量控制方法、装置及计算机可读存储介质
CN108376553B (zh) * 2018-02-28 2020-11-03 北京奇艺世纪科技有限公司 一种视频服务器的磁盘的监控方法及系统
CN108763002A (zh) * 2018-05-25 2018-11-06 郑州云海信息技术有限公司 基于机器学习预测cpu故障的方法及系统
CN109032891A (zh) * 2018-07-23 2018-12-18 郑州云海信息技术有限公司 一种云计算服务器硬盘故障预测方法及装置
CN109240867A (zh) * 2018-09-18 2019-01-18 鸿秦(北京)科技有限公司 硬盘故障预测方法
CN109474474B (zh) * 2018-12-07 2021-08-27 天津津航计算技术研究所 基于泊松分布故障模型的无线传感器网络故障检测系统
CN109495313B (zh) * 2018-12-07 2021-08-27 天津津航计算技术研究所 基于泊松分布故障模型的无线传感器网络故障检测方法
CN111258788B (zh) * 2020-01-17 2024-04-12 上海商汤智能科技有限公司 磁盘故障预测方法、装置及计算机可读存储介质
CN111752775B (zh) * 2020-05-28 2022-11-18 苏州浪潮智能科技有限公司 一种磁盘故障预测方法和系统
CN113447290B (zh) * 2021-06-25 2022-11-29 上海三一重机股份有限公司 一种工程机械故障预警方法、装置及工程机械

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6574754B1 (en) * 2000-02-14 2003-06-03 International Business Machines Corporation Self-monitoring storage device using neural networks
CN102129397A (zh) * 2010-12-29 2011-07-20 深圳市永达电子股份有限公司 一种自适应磁盘阵列故障预测方法及系统
CN102981930A (zh) * 2012-11-15 2013-03-20 浪潮电子信息产业股份有限公司 一种磁盘阵列多级数据自动修复的方法
CN103646114A (zh) * 2013-12-26 2014-03-19 北京百度网讯科技有限公司 硬盘smart数据中特征数据提取方法和装置

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103116531A (zh) * 2013-01-25 2013-05-22 浪潮(北京)电子信息产业有限公司 存储系统故障预测方法和装置

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111381990A (zh) * 2020-03-16 2020-07-07 上海威固信息技术股份有限公司 一种基于流特征的磁盘故障预测方法及装置
CN111381990B (zh) * 2020-03-16 2023-10-20 上海威固信息技术股份有限公司 一种基于流特征的磁盘故障预测方法及装置
CN114697203A (zh) * 2022-03-31 2022-07-01 浙江省通信产业服务有限公司 一种网络故障的预判方法、装置、电子设备及存储介质
CN114697203B (zh) * 2022-03-31 2023-07-25 浙江省通信产业服务有限公司 一种网络故障的预判方法、装置、电子设备及存储介质
CN116126593A (zh) * 2023-01-10 2023-05-16 华南高科(广东)股份有限公司 一种云平台环境下的数据备份系统及方法
CN116126593B (zh) * 2023-01-10 2023-09-08 华南高科(广东)股份有限公司 一种云平台环境下的数据备份系统及方法

Also Published As

Publication number Publication date
CN105589795A (zh) 2016-05-18

Similar Documents

Publication Publication Date Title
WO2016107402A1 (zh) Disk failure prediction method and apparatus based on a prediction model
US10452983B2 (en) Determining an anomalous state of a system at a future point in time
US11558272B2 (en) Methods and systems for predicting time of server failure using server logs and time-series data
US20190324430A1 (en) Computer System and Method for Creating a Supervised Failure Model
US9424157B2 (en) Early detection of failing computers
CN102436376B (zh) 用于分布式应用确认的模型检查
JP6896432B2 (ja) 故障予知方法、故障予知装置および故障予知プログラム
US20170075744A1 (en) Identifying root causes of failures in a deployed distributed application using historical fine grained machine state data
Ganguly et al. A practical approach to hard disk failure prediction in cloud platforms: Big data model for failure management in datacenters
BR102016003934A2 (pt) sistema e método de detecção de anomalia
Hoffmann et al. Advanced failure prediction in complex software systems
US9043652B2 (en) User-coordinated resource recovery
US10996861B2 (en) Method, device and computer product for predicting disk failure
US20200257581A1 (en) Fault prediction and detection using time-based distributed data
US20190311297A1 (en) Anomaly detection and processing for seasonal data
Pannu et al. A self-evolving anomaly detection framework for developing highly dependable utility clouds
US11449376B2 (en) Method of determining potential anomaly of memory device
CN111061581B (zh) 一种故障检测方法、装置及设备
JP6252309B2 (ja) 監視漏れ特定処理プログラム,監視漏れ特定処理方法及び監視漏れ特定処理装置
US11099219B2 (en) Estimating the remaining useful life of a power transformer based on real-time sensor data and periodic dissolved gas analyses
Raj et al. Cloud infrastructure fault monitoring and prediction system using LSTM based predictive maintenance
Bhattacharyya et al. Online phase detection and characterization of cloud applications
CN111324516A (zh) 自动记录异常事件的方法及装置、存储介质、电子设备
JP6896380B2 (ja) 故障予兆判定方法、故障予兆判定装置および故障予兆判定プログラム
CN114298533A (zh) 性能指标处理方法、装置、设备和存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15875080

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 02.11.2017)

122 Ep: pct application non-entry in european phase

Ref document number: 15875080

Country of ref document: EP

Kind code of ref document: A1