CN110750455A - Intelligent online self-updating fault diagnosis method and system based on system log analysis - Google Patents

Intelligent online self-updating fault diagnosis method and system based on system log analysis

Info

Publication number
CN110750455A
Authority
CN
China
Prior art keywords
log
fault diagnosis
updating
online
template
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910993251.7A
Other languages
Chinese (zh)
Other versions
CN110750455B (en)
Inventor
贾统
李影
张齐勋
吴中海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University
Priority to CN201910993251.7A
Publication of CN110750455A
Application granted
Publication of CN110750455B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/362 Software debugging
    • G06F11/366 Software debugging using diagnostics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G06F11/3072 Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting

Abstract

The invention discloses an intelligent online self-updating fault diagnosis method and system based on system log analysis. An online log stream is taken as input and log data are converted into log templates; a control flow graph fault diagnosis model is trained and updated online while, at the same time, the model is used to find anomalies in the online log stream and diagnose system faults. The intelligent online self-updating fault diagnosis system comprises an online log template mining and conversion module, an online fault diagnosis model training and updating module, and an online fault diagnosis module. With the technical scheme of the invention, log data are analyzed online according to the dynamic control flow graph fault diagnosis model to diagnose system faults, realizing a real-time fault diagnosis mode in which updating and diagnosis both take place online, so that system anomalies and system faults can be identified online from a large amount of log data. At the same time, the fault diagnosis model updates itself online and adapts intelligently to rapid upgrades and updates of the software system.

Description

Intelligent online self-updating fault diagnosis method and system based on system log analysis
Technical Field
The invention belongs to the field of intelligent operation and maintenance of distributed software systems, and particularly relates to an intelligent online self-updating fault diagnosis method and system based on system log analysis.
Background
With the development of Artificial Intelligence (AI), intelligent operation and maintenance (AIOps) was first proposed by Gartner in 2016: large-scale data from various operation and maintenance tools and equipment are analyzed by algorithms such as machine learning, and problems occurring in the system are automatically discovered and responded to in real time, thereby improving information technology (IT) operation and maintenance capability and the degree of automation. Under the AIOps trend, automatic and intelligent fault diagnosis centered on the analysis of system log data has become an important component and development direction of fault diagnosis technology for distributed software systems.
However, fault diagnosis technology based on system log analysis still faces two key problems. First, the complex and highly correlated component relationships and execution logic in distributed software systems such as Hadoop, Spark and OpenStack make accurate, fine-grained fault diagnosis extremely difficult. When a large number of requests access the system concurrently, the complexity of the overall system operation logic increases exponentially. Meanwhile, abnormal system behaviors are correlated and evolve over time, and system faults propagate quickly and affect a wide range, which poses a major challenge to accurate, fine-grained fault diagnosis. Second, how to construct a fault diagnosis model that can be trained and updated quickly online, so as to adapt to today's DevOps environment of rapid development iteration, has become a key problem. In such an environment, system updates are very frequent, and the logs are consequently updated and changed frequently as well. Such frequent system updates require the fault diagnosis model to be capable of rapid online training, updating and adaptation, so as to ensure its availability and effectiveness. Existing fault diagnosis techniques adopt an offline-training, online-diagnosis mode: a fault diagnosis model is first trained using the system's historical log data and fault data, and then online logs, or the log data of several time windows preceding a system fault, are input into the trained model to obtain a fault diagnosis result online. This mode supports only offline retraining, which is slow and inefficient, and cannot accommodate frequent system updates.
Therefore, in order to improve the availability and reliability of a distributed software system, an intelligent fault diagnosis technology having fine-grained fault diagnosis capability of accurately describing component relationships and request execution logic and rapid online self-training and updating capability is urgently needed.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides an intelligent online self-updating fault diagnosis method and system based on system log analysis.
The main innovation points of the invention are as follows. First, the invention provides a fault diagnosis mode based on system log analysis in which updating and diagnosis both take place online in real time. Second, the invention provides a method for constructing and updating a dynamic control flow graph fault diagnosis model, which constructs the dynamic control flow graph model at the current moment in real time by maintaining a transition probability matrix among all log templates.
The core of the invention is as follows: the intelligent online self-updating fault diagnosis method based on system log analysis requires only system log data and places no limit on the type of fault to be diagnosed. It adopts an online self-updating mode, that is, the construction, updating and use of the fault diagnosis model are all completed online at the same time, so that system faults are diagnosed while the model is being trained and updated.
The technical scheme provided by the invention is as follows:
An intelligent online self-updating fault diagnosis method based on system log analysis first takes an online log stream as input and converts log data into log templates, then trains and updates a control flow graph fault diagnosis model online while, at the same time, using the fault diagnosis model to find system anomalies in the online log stream and diagnose system faults, and provides the system fault diagnosis results to system operation and maintenance personnel. The method specifically comprises the following steps:
1) sequentially converting the logs in the online log stream into log templates;
2) after the conversion of each log in step 1) is completed, constructing, training and updating the control flow graph fault diagnosis model at the current moment in real time;
3) after step 2) is executed, diagnosing system faults online for the log data at the current moment by using the updated control flow graph fault diagnosis model at the current moment.
In the intelligent online self-updating fault diagnosis method based on system log analysis, step 2) and step 3) are executed in sequence for each log in the online log stream.
Aiming at the intelligent online self-updating fault diagnosis method based on system log analysis, further, an online log template mining algorithm is adopted in the step 1), online log streams are processed in real time, and logs in the log streams are sequentially converted into log templates. By log template is meant a log type abstraction that is identified by a constant portion in the log. The log is converted into a log template by keeping constant parts in the log and identifying variable parts in the log by placeholders. That is, the log template corresponding to the log includes the constant part and the placeholder in the log.
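The conversion of a log into a template can be illustrated with a minimal Python sketch. The text does not name a specific template mining algorithm, so the regular expressions, the placeholder string `<*>` and the function name `to_template` below are all assumptions chosen for illustration; a real online miner would learn which tokens are variable rather than rely on fixed patterns.

```python
import re

# Hypothetical masking rules (not from the patent): each pattern marks a
# variable part of a log line that is replaced by a placeholder.
VARIABLE_PATTERNS = [
    re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?\b"),  # IPv4 address, optional port
    re.compile(r"\b0x[0-9a-fA-F]+\b"),                    # hexadecimal value
    re.compile(r"\b\d+\b"),                               # plain integer
]
PLACEHOLDER = "<*>"  # assumed placeholder token


def to_template(log_line: str) -> str:
    """Keep the constant parts of a log line and mask the variable parts."""
    template = log_line.strip()
    for pattern in VARIABLE_PATTERNS:
        template = pattern.sub(PLACEHOLDER, template)
    return template


if __name__ == "__main__":
    # Two different logs of the same type map to one template (many-to-one).
    print(to_template("Received block 4821 from 10.0.0.7:50010"))
    print(to_template("Received block 17 from 10.0.0.9:50010"))
```

Both example lines reduce to the same template, which is the many-to-one relationship between logs and log templates that the later steps rely on.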
For the above intelligent online self-updating fault diagnosis method based on system log analysis, further, step 2) constructs, trains and updates the control flow graph fault diagnosis model at the current moment in real time, specifically as follows:
A transition probability function parameter matrix among all log templates is maintained. If the transition probability function parameter between two log templates is larger than a threshold β, a directed edge is added between them; otherwise the two log templates are independent. In this way a dynamic control flow graph fault diagnosis model can be constructed at any moment. During training and updating, the transition probability function parameters are updated by a gradient descent method.
21) Calculating the update gradient of the transition probability between any two log templates;
Denote the online log stream as $L = \{l_1, l_2, l_3, \ldots, l_i, \ldots\}$, where $l_i$ is a log; let the timestamp of $l_i$ be $t_i$ and let the log template corresponding to $l_i$ be $T_i$. For each log in the time period of length $w$ preceding $t_i$, calculate the update gradient of the transition probability between that log's template and $T_i$.
Let $L_w = \{l_j, l_{j+1}, \ldots, l_i\}$ satisfy $t_i - t_j < w$ and $t_i - t_{j-1} \ge w$, and let $l_k \in L_w$. If $T_i$ occurs for the first time, the transition probability parameter update gradient between $T_k$ and $T_i$,
Figure BDA0002238946900000031
is expressed as:
Figure BDA0002238946900000032
where $\delta$ is a control parameter. If $T_i$ does not occur for the first time, the update gradient is expressed as:
Figure BDA0002238946900000034
where
Figure BDA0002238946900000035
is the transition probability function parameter between log template $T_x$ and log template $T_i$ in the current transition probability function parameter matrix.
22) Updating the transition probability function parameters;
After the transition probability function parameter update gradient
Figure BDA0002238946900000036
is obtained, the transition probability function parameter is updated:
Figure BDA0002238946900000037
Figure BDA0002238946900000038
where $\sigma$ is the update step size,
Figure BDA0002238946900000039
represents the updated transition probability function parameter between $T_k$ and $T_i$, and
Figure BDA00022389469000000310
represents the transition probability function parameter between $T_k$ and $T_i$ before the update.
23) At intervals, all elements in the transition probability function parameter matrix undergo decay;
Each time a period of time, e.g. ten minutes, has elapsed, all elements in the transition probability function parameter matrix undergo a decay, i.e.
Figure BDA0002238946900000041
where $\gamma$ is the decay step size,
Figure BDA0002238946900000042
is the transition probability function parameter before the update, and
Figure BDA0002238946900000043
is the transition probability function parameter after the update.
24) Constructing the control flow graph fault diagnosis model at the current moment.
If the transition probability function parameter between two log templates is larger than the threshold β, a directed edge is added between them; otherwise the two log templates are independent. In this way the control flow graph fault diagnosis model (a DAG graph model) at the current moment is constructed.
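Steps 21) to 24) can be sketched in Python as a small class that keeps the parameter matrix as a dictionary keyed by template pairs. The patent's gradient, update and decay expressions are given only as formula images, so everything numeric here is an assumption made for illustration: `example_gradient` is a hypothetical stand-in for the gradient expressions, the update is assumed to be additive with step `sigma`, and the decay is assumed to subtract `gamma` with a floor at zero. Only the overall flow (per-log update over the window, periodic decay, thresholding by β into directed edges) follows the text.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]  # (earlier template, later template)
GradientFn = Callable[[Dict[Edge, float], str, str, bool], float]


def example_gradient(alpha: Dict[Edge, float], prev: str, cur: str,
                     first_occurrence: bool, delta: float = 1.0) -> float:
    """Hypothetical stand-in for the patent's gradient expressions."""
    if first_occurrence:
        return delta
    return delta * (1.0 - alpha.get((prev, cur), 0.0))


class ControlFlowGraphModel:
    def __init__(self, gradient_fn: GradientFn = example_gradient,
                 sigma: float = 0.5, gamma: float = 0.2, beta: float = 0.6):
        self.alpha: Dict[Edge, float] = {}  # transition parameter matrix
        self.seen: Set[str] = set()         # templates observed so far
        self.gradient_fn = gradient_fn
        self.sigma = sigma                  # update step size
        self.gamma = gamma                  # decay step size
        self.beta = beta                    # edge threshold

    def update(self, window_templates: Iterable[str], cur: str) -> None:
        """Steps 21-22: update parameters from each windowed template to cur."""
        first = cur not in self.seen
        for prev in window_templates:
            grad = self.gradient_fn(self.alpha, prev, cur, first)
            # Assumed additive update; the sign convention is not recoverable
            # from the text.
            self.alpha[(prev, cur)] = self.alpha.get((prev, cur), 0.0) + self.sigma * grad
        self.seen.add(cur)

    def decay(self) -> None:
        """Step 23: assumed subtractive decay, floored at zero."""
        for edge in self.alpha:
            self.alpha[edge] = max(0.0, self.alpha[edge] - self.gamma)

    def edges(self) -> Set[Edge]:
        """Step 24: directed edges whose parameter exceeds beta."""
        return {edge for edge, value in self.alpha.items() if value > self.beta}
```

In use, `update` would be called once per converted log with the templates of the preceding window, `decay` would be called by a timer (the text suggests every ten minutes as an example), and `edges` yields the current graph. Because decay lowers every parameter, transitions that stop appearing after a system update eventually drop below β and disappear from the graph.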
For the above intelligent online self-updating fault diagnosis method based on system log analysis, further, step 3) uses the latest fault diagnosis model to perform online fault diagnosis for online log streams, and the specific operations are as follows:
Denote the online log stream as $L = \{l_1, l_2, l_3, \ldots, l_i, \ldots\}$, where $l_i$ is a log; let the timestamp of $l_i$ be $t_i$ and let the log template corresponding to $l_i$ be $T_i$. For each log in the time period of length $w$ preceding $t_i$, calculate the transition probability between that log's template and $T_i$.
Let $L_w = \{l_j, l_{j+1}, \ldots, l_i\}$ satisfy $t_i - t_j < w$ and $t_i - t_{j-1} \ge w$, and let $l_k \in L_w$. The transition probability between $T_k$ and $T_i$,
Figure BDA0002238946900000044
is calculated by the following formula:
Figure BDA0002238946900000045
If every transition probability
Figure BDA0002238946900000046
is smaller than the set fault threshold β (whose value range is (0, 1) in the specific implementation), a system fault is determined and this segment of the log stream is returned, thereby realizing intelligent online self-updating fault diagnosis based on system log analysis.
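The diagnosis rule of step 3) can be sketched as a sliding time window plus a threshold test. The formula that maps a parameter to a transition probability is an image in the source, so `clamp_probability` below is a hypothetical stand-in that simply clamps the stored parameter to [0, 1]; the class and method names are likewise illustrative. As a simplification, the sketch compares the current template only with the logs that precede it in the window.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Optional, Tuple

Edge = Tuple[str, str]
Entry = Tuple[float, str]  # (timestamp, template)


def clamp_probability(alpha: Dict[Edge, float], prev: str, cur: str) -> float:
    """Hypothetical probability function: the stored parameter clamped to [0, 1]."""
    return min(1.0, max(0.0, alpha.get((prev, cur), 0.0)))


class OnlineDiagnoser:
    def __init__(self, w: float, beta: float,
                 prob_fn: Callable[[Dict[Edge, float], str, str], float] = clamp_probability):
        self.w = w              # window length in the timestamp unit
        self.beta = beta        # fault threshold in (0, 1)
        self.prob_fn = prob_fn
        self.window: Deque[Entry] = deque()

    def observe(self, timestamp: float, template: str,
                alpha: Dict[Edge, float]) -> Optional[List[Entry]]:
        """Return the suspicious log segment if a fault is detected, else None."""
        # Keep only logs l_j with t_i - t_j < w.
        while self.window and timestamp - self.window[0][0] >= self.w:
            self.window.popleft()
        preceding = list(self.window)
        self.window.append((timestamp, template))
        if not preceding:
            return None  # nothing to compare against yet
        probs = [self.prob_fn(alpha, prev, template) for _, prev in preceding]
        if all(p < self.beta for p in probs):
            return preceding + [(timestamp, template)]
        return None
```

A template that no recent template is expected to lead to is flagged, and the whole window is returned so that operation and maintenance personnel can inspect the surrounding log segment.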
The invention also provides an intelligent online self-updating fault diagnosis system based on system log analysis, which comprises an online log template mining and converting module, an online fault diagnosis model training and updating module and an online fault diagnosis module.
The online log template mining and conversion module is used for preprocessing the online log stream; it mines log templates from the log stream and finds the log template corresponding to each piece of log data.
The online fault diagnosis model training and updating module is used for acquiring, from the online log template mining and conversion module, the log template stream corresponding to the log stream, and using the log template stream to train and update the control flow graph fault diagnosis model online.
The online fault diagnosis module is used for obtaining, from the online log template mining and conversion module, the log template stream corresponding to the log stream, obtaining the fault diagnosis model from the online fault diagnosis model training and updating module, analyzing and diagnosing the log template stream using the fault diagnosis model, and finally returning a diagnosis result.
For the intelligent online self-updating fault diagnosis system based on system log analysis, further, the online fault diagnosis model training and updating module comprises a fault diagnosis model training and updating device and a fault diagnosis model memory. The fault diagnosis model training and updating device obtains the on-line log template flow, and constructs or updates the dynamic control flow graph model at the current moment by maintaining the transition probability matrix among all log templates. The fault diagnosis model memory is responsible for storing the latest control flow graph model after each model update.
For the intelligent online self-updating fault diagnosis system based on system log analysis, further, the online fault diagnosis module comprises a log analysis and fault diagnosis device and a fault diagnosis result displayer. The log analysis and fault diagnosis device first obtains the latest fault diagnosis model from the fault diagnosis model memory, then analyzes and diagnoses the online log template stream using the fault diagnosis method, and returns the diagnosis result to the fault diagnosis result displayer. The fault diagnosis result displayer is responsible for displaying the fault diagnosis results, including fault time, fault location and fault log segments, to system operation and maintenance personnel.
Compared with the prior art, the invention has the beneficial effects that:
The invention provides an intelligent online self-updating fault diagnosis system based on system log analysis, which realizes online log stream analysis, online log-based construction and updating of the fault diagnosis model, and online system fault diagnosis. Its main features are as follows:
First, the system and method provided by the invention realize a real-time fault diagnosis mode in which updating and diagnosis both take place online.
Second, the system and method provided by the invention construct and update the control flow graph fault diagnosis model in real time on the basis of log data.
Third, the system and method provided by the invention analyze log data online according to the dynamic control flow graph fault diagnosis model so as to diagnose system faults.
With the technical scheme of the invention, system anomalies can be identified and system faults diagnosed online from a large amount of log data; at the same time, the fault diagnosis model updates itself online and adapts intelligently to upgrades and updates of the software system, which makes the scheme particularly suitable for DevOps production environments.
Drawings
Fig. 1 is a flow chart of an intelligent online self-updating fault diagnosis method based on system log analysis according to the present invention.
Fig. 2 is a block diagram of the intelligent online self-updating fault diagnosis system based on system log analysis according to the present invention.
Detailed Description
The invention will be further described by way of examples, without in any way limiting the scope of the invention, with reference to the accompanying drawings.
Fig. 1 is a flow chart of an intelligent online self-updating fault diagnosis method based on system log analysis according to the present invention. First, the online log template mining and conversion module mines log templates from the online log stream and converts each log into its corresponding log template. Then, the training and updating of the fault diagnosis model proceed simultaneously with the fault diagnosis process. The online fault diagnosis model training and updating module builds a control flow graph fault diagnosis model from the log template stream, updates the transition probability parameters among the log templates in the control flow graph, and then stores the updated fault diagnosis model. The online fault diagnosis module queries the latest fault diagnosis model from the online fault diagnosis model training and updating module, analyzes and diagnoses the online log stream using that model, and finally displays the system fault diagnosis result.
Fig. 2 is a block diagram of the intelligent online self-updating fault diagnosis system based on system log analysis according to the present invention. The system takes online log stream data as input and comprises an online log template mining and conversion module, an online fault diagnosis model training and updating module, and an online fault diagnosis module. Each module is described in detail below.
S1) online log template mining and conversion module
The online log template mining and conversion module is used for mining log templates from the online log stream and converting each log into its corresponding log template. The log template set mined by the module is $Templates = \{T_1, T_2, \ldots, T_n\}$. Logs and log templates are in a many-to-one relationship: each log $l_i$ in the log stream $L = \{l_1, l_2, l_3, \ldots, l_k, \ldots\}$ is converted into $T_i$, where $T_i \in Templates$.
S2) on-line fault diagnosis model training and updating module
The function of the online fault diagnosis model training and updating module is to construct and update a fault diagnosis model of the control flow graph according to the log flow and the log template corresponding to the log. The module contains two sub-modules:
S21) Fault diagnosis model training and updating device
The control flow graph fault diagnosis model is a directed graph model $G = \{Nodes, Edges\}$, where the node set Nodes is the log template set $Templates = \{T_1, T_2, \ldots, T_n\}$ and the edge set Edges represents the transition relations between log templates. The fault diagnosis model training and updating device maintains a temporary log template set Templates and a log template transition probability parameter matrix
Figure BDA0002238946900000061
Each time a period of time passes, the fault diagnosis model training and updating device passes Templates and the matrix (α) to the fault diagnosis model memory.
S22) Fault diagnosis model memory
The fault diagnosis model memory maintains a stable log template set Templates and a log template transition probability parameter matrix (α). It obtains the latest model information from the fault diagnosis model training and updating device and serves queries of the matrix (α) to external callers.
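The split between a temporary model in the training and updating device and a stable copy in the model memory can be sketched as a small store that accepts periodic snapshots and answers queries. This is a minimal single-process sketch: the class name `ModelMemory`, its method names and the use of a `threading.Lock` are assumptions for illustration and are not described in the text.

```python
import threading
from typing import Dict, FrozenSet, Iterable, Tuple

Edge = Tuple[str, str]


class ModelMemory:
    """Holds the stable template set and parameter matrix for readers."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._templates: FrozenSet[str] = frozenset()
        self._alpha: Dict[Edge, float] = {}

    def publish(self, templates: Iterable[str], alpha: Dict[Edge, float]) -> None:
        """Called periodically by the trainer; stores copies, not references."""
        with self._lock:
            self._templates = frozenset(templates)
            self._alpha = dict(alpha)

    def query(self, prev: str, cur: str, default: float = 0.0) -> float:
        """Return the stored parameter for the transition prev -> cur."""
        with self._lock:
            return self._alpha.get((prev, cur), default)

    def snapshot(self) -> Tuple[FrozenSet[str], Dict[Edge, float]]:
        """Return a copy of the whole stable model."""
        with self._lock:
            return self._templates, dict(self._alpha)
```

Copying on `publish` means that later changes to the trainer's temporary matrix do not leak into the stable model until the next hand-over, which matches the temporary/stable distinction drawn above.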
S3) online fault diagnosis module
The function of the online fault diagnosis module is to analyze the log stream online to find system abnormality and diagnose system fault. The module comprises two sub-modules:
S31) Log analysis and fault diagnosis device
The log analysis and fault diagnosis device first queries the latest fault diagnosis model parameter matrix (α) from the fault diagnosis model memory, then calculates the transition probabilities between log templates according to the fault diagnosis method and compares them with the transition relations observed in the log stream, thereby discovering system anomalies, and passes the anomaly results to the fault diagnosis result displayer.
S32) fault diagnosis result displayer
The fault diagnosis result displayer aims to display system abnormalities and faults discovered by the log analysis and fault diagnosis device, and specifically comprises fault time, fault log fragments and a fault control flow graph link.
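The three pieces of information handed to the displayer can be captured in a simple record type. The sketch below is illustrative only: the field names, the representation of the control flow graph link as a list of template pairs, and the text layout produced by `render` are assumptions, since the text does not fix a data format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DiagnosisResult:
    fault_time: float                   # timestamp of the log that triggered the fault
    log_fragment: List[str]             # raw log lines of the suspicious segment
    graph_link: List[Tuple[str, str]]   # template transitions involved in the fault


def render(result: DiagnosisResult) -> str:
    """Format a diagnosis result as plain text for operators."""
    lines = [f"fault time: {result.fault_time}", "fault log fragment:"]
    lines.extend(f"  {line}" for line in result.log_fragment)
    lines.append("control flow graph link:")
    lines.extend(f"  {src} -> {dst}" for src, dst in result.graph_link)
    return "\n".join(lines)
```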
It is noted that the disclosed embodiments are intended to aid in further understanding of the invention, but those skilled in the art will appreciate that: various substitutions and modifications are possible without departing from the spirit and scope of the invention and appended claims. Therefore, the invention should not be limited to the embodiments disclosed, but the scope of the invention is defined by the appended claims.

Claims (10)

1. An intelligent online self-updating fault diagnosis method based on system log analysis comprises the steps of firstly inputting online log flow, converting log data into a log template, then training and updating a control flow graph fault diagnosis model on line, and meanwhile, finding system abnormity from the online log flow and diagnosing system faults by using the control flow graph fault diagnosis model; the method comprises the following steps:
1) sequentially converting the logs in the online log stream into log templates, wherein the log templates comprise constant parts and placeholders in the corresponding logs;
2) constructing, training and updating a fault diagnosis model of the control flow graph at the current moment in real time after the conversion of one log in the step 1) is completed; the method comprises the following steps:
calculating the updating gradient of the transition probability between any two log templates;
updating the transition probability function parameters: in the process of training and updating a dynamic control flow graph fault diagnosis model, updating a transition probability function parameter by using a gradient descent method;
at intervals all elements in the transition probability function parameter matrix undergo decay: reducing transfer probability function parameters among log templates through a decay mechanism, so that the control flow graph fault diagnosis model can be evolved and degraded in real time;
constructing a fault diagnosis model of the control flow graph at the current moment, namely maintaining a transition probability function parameter matrix among all log templates, adding a directed edge among the log templates when the transition probability function parameter among the log templates is larger than a threshold value β, otherwise, all the log templates are independent log templates, and constructing a fault diagnosis model of the dynamic control flow graph at any moment, wherein the fault diagnosis model of the dynamic control flow graph is a DAG graph model;
3) after the step 2) is executed, aiming at the log data at the current moment, utilizing the updated fault diagnosis model of the control flow graph at the current moment to diagnose the system fault on line; the method comprises the following operations:
denoting the online log stream as $L = \{l_1, l_2, l_3, \ldots, l_i, \ldots\}$, where $l_i$ is a log; letting the timestamp of $l_i$ be $t_i$ and the log template corresponding to $l_i$ be $T_i$; and calculating, for each log in the time period of length $w$ preceding $t_i$, the transition probability between that log's template and $T_i$;
letting $L_w = \{l_j, l_{j+1}, \ldots, l_i\}$ satisfy $t_i - t_j < w$ and $t_i - t_{j-1} \ge w$, and letting $l_k \in L_w$, the transition probability between $T_k$ and $T_i$,
Figure FDA0002238946890000011
being calculated by the following formula:
if every
Figure FDA0002238946890000013
is smaller than the set fault threshold β, judging that the system has a fault, and returning this segment of the log stream;
therefore, intelligent online self-updating fault diagnosis based on system log analysis is realized.
2. The intelligent online self-updating fault diagnosis method based on system log analysis as claimed in claim 1, wherein step 2) and step 3) are performed sequentially for each log in the online log stream.
3. The intelligent online self-updating fault diagnosis method based on system log analysis as claimed in claim 1, wherein step 1) specifically adopts an online log template mining algorithm to process online log streams in real time and sequentially convert logs in the log streams into log templates.
4. The intelligent online self-updating fault diagnosis method based on system log analysis as claimed in claim 3, wherein the log template is a log type identified by a constant part in the log, and the log is converted into the log template by keeping the constant part in the log and identifying a variable part in the log by a placeholder.
5. The intelligent online self-updating fault diagnosis method based on system log analysis as claimed in claim 1, wherein the specific method for dynamically constructing and updating the control flow graph fault diagnosis model in step 2) is as follows:
21) calculating the updating gradient of the transition probability between any two log templates;
denoting the online log stream as $L = \{l_1, l_2, l_3, \ldots, l_i, \ldots\}$, where $l_i$ is a log; letting the timestamp of $l_i$ be $t_i$ and the log template corresponding to $l_i$ be $T_i$; and calculating, for each log in the time period of length $w$ preceding $t_i$, the update gradient of the transition probability between that log's template and $T_i$;
letting $L_w = \{l_j, l_{j+1}, \ldots, l_i\}$ satisfy $t_i - t_j < w$ and $t_i - t_{j-1} \ge w$, and letting $l_k \in L_w$; if $T_i$ occurs for the first time, the transition probability parameter update gradient between $T_k$ and $T_i$,
Figure FDA0002238946890000021
is expressed as:
Figure FDA0002238946890000022
wherein $\delta$ is a control parameter; if $T_i$ does not occur for the first time, then
Figure FDA0002238946890000023
is expressed as:
Figure FDA0002238946890000024
wherein
Figure FDA0002238946890000025
is the transition probability function parameter between log template $T_x$ and log template $T_i$ in the current transition probability function parameter matrix;
22) updating the transition probability function parameters;
after the transition probability function parameter update gradient is obtained, the transition probability function parameter is updated:
Figure FDA0002238946890000027
wherein $\sigma$ is the update step size,
Figure FDA0002238946890000029
represents the updated transition probability function parameter between $T_k$ and $T_i$, and
Figure FDA00022389468900000210
represents the transition probability function parameter between $T_k$ and $T_i$ before the update;
23) at intervals, all elements in the transition probability function parameter matrix undergo decay;
each time a period of time passes, all elements in the transition probability function parameter matrix undergo a decay, i.e.
Figure FDA0002238946890000032
wherein $\gamma$ is the decay step size,
Figure FDA0002238946890000033
is the transition probability function parameter before the update, and
Figure FDA0002238946890000034
is the transition probability function parameter after the update;
24) constructing a fault diagnosis model of a control flow graph at the current moment;
and if the transition probability function parameter between the two log templates is larger than a threshold value β, adding a directed edge between the log templates, otherwise, the two log templates are independent, and thus constructing and obtaining the fault diagnosis model of the control flow graph at the current moment.
6. The intelligent online self-updating fault diagnosis method based on system log analysis as claimed in claim 1, wherein the value range of the threshold β is (0, 1).
7. An intelligent online self-updating fault diagnosis system based on system log analysis, comprising an online log template mining and conversion module, an online fault diagnosis model training and updating module, and an online fault diagnosis module;
the online log template mining and conversion module is used for preprocessing the online log stream, mining log templates from the log stream, and finding the log template corresponding to each piece of log data;
the online fault diagnosis model training and updating module is used for acquiring the log template stream corresponding to the log stream from the online log template mining and conversion module, and using the log template stream to train and update the control flow graph fault diagnosis model online;
the online fault diagnosis module is used for obtaining the log template stream corresponding to the log stream from the online log template mining and conversion module, obtaining the fault diagnosis model from the online fault diagnosis model training and updating module, analyzing and diagnosing the log template stream with the fault diagnosis model, and returning a fault diagnosis result.
8. The intelligent online self-updating fault diagnosis system based on system log analysis as claimed in claim 7, wherein the online fault diagnosis model training and updating module comprises a fault diagnosis model training and updating device and a fault diagnosis model memory; the fault diagnosis model training and updating device is used for acquiring the online log template stream, and constructing or updating the dynamic control flow graph model at the current moment by maintaining the transition probability matrix among all log templates; the fault diagnosis model memory is used for storing the latest control flow graph model after each model update.
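The division of labor in claim 8 between the training and updating device and the model memory can be sketched as below. This is an illustrative sketch under stated assumptions: the class names `Trainer` and `ModelStore`, the default values of `beta` and `step`, and in particular the reinforcement rule that moves a parameter toward 1 on each observed transition are all hypothetical, since the claim does not spell out the update formula in text.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

Edge = Tuple[str, str]  # (source template ID, target template ID)


@dataclass
class ModelStore:
    """Model memory: holds only the most recent control flow graph model."""
    latest: Optional[FrozenSet[Edge]] = None

    def save(self, model: FrozenSet[Edge]) -> None:
        self.latest = model


@dataclass
class Trainer:
    """Training and updating device: maintains the transition parameters."""
    store: ModelStore
    beta: float = 0.5   # edge threshold (assumed value)
    step: float = 0.3   # hypothetical reinforcement step
    params: Dict[Edge, float] = field(default_factory=dict)

    def consume(self, templates: Iterable[str]) -> None:
        prev = None
        for cur in templates:
            if prev is not None:
                p = self.params.get((prev, cur), 0.0)
                # ASSUMPTION: each observed transition moves p toward 1.
                self.params[(prev, cur)] = p + self.step * (1.0 - p)
            prev = cur
        # Rebuild the graph and store it as the latest model.
        model = frozenset(e for e, p in self.params.items() if p > self.beta)
        self.store.save(model)


store = ModelStore()
trainer = Trainer(store)
trainer.consume(["T1", "T2", "T1", "T2", "T1", "T2"])
```

Keeping the stored model as an immutable `frozenset` means a reader of `store.latest` never sees a graph that is half way through an update.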
9. The intelligent online self-updating fault diagnosis system based on system log analysis as claimed in claim 7, wherein the online fault diagnosis module comprises a log analysis and fault diagnosis device and a fault diagnosis result displayer.
10. The intelligent online self-updating fault diagnosis system based on system log analysis as claimed in claim 9, wherein the log analysis and fault diagnosis device first obtains the latest fault diagnosis model from the fault diagnosis model memory, then analyzes and diagnoses the online log template stream, and returns the diagnosis result to the fault diagnosis result displayer; the fault diagnosis result displayer is used for displaying fault diagnosis results, including fault time, fault location and fault log fragments, to system operation and maintenance personnel.
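A diagnosis result carrying the three fields named in claim 10 (fault time, fault location, fault log fragment) might be produced as in the sketch below. The detection rule used here, flagging any template transition that has no edge in the control flow graph, is an assumption made for illustration; the record types `LogEvent` and `Diagnosis`, the `context` parameter, and the sample log lines are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence, Tuple

Edge = Tuple[str, str]  # (source template ID, target template ID)


@dataclass(frozen=True)
class LogEvent:
    timestamp: str  # time the log line was emitted
    source: str     # hypothetical fault location, e.g. a host or component
    template: str   # ID of the log template the line was mapped to
    raw: str        # original log text


@dataclass(frozen=True)
class Diagnosis:
    fault_time: str
    fault_location: str
    fault_fragment: Tuple[str, ...]


def diagnose(model: FrozenSet[Edge], events: Sequence[LogEvent],
             context: int = 2) -> List[Diagnosis]:
    """Flag each transition with no edge in the model (ASSUMED rule)."""
    results = []
    for i in range(1, len(events)):
        edge = (events[i - 1].template, events[i].template)
        if edge not in model:
            start = max(0, i - context)
            fragment = tuple(e.raw for e in events[start:i + 1])
            results.append(
                Diagnosis(events[i].timestamp, events[i].source, fragment))
    return results


model = frozenset({("T1", "T2"), ("T2", "T3")})
events = [
    LogEvent("10:00:00", "node-a", "T1", "open session"),
    LogEvent("10:00:01", "node-a", "T2", "read block"),
    LogEvent("10:00:02", "node-b", "T1", "open session"),
]
results = diagnose(model, events)
```

Here the T2 to T1 transition is absent from the model, so a single diagnosis is returned for the third event, with the preceding log lines attached as the fault fragment that a displayer could show to operators.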
Application Number: CN201910993251.7A
Priority Date: 2019-10-18
Filing Date: 2019-10-18
Title: Intelligent online self-updating fault diagnosis method and system based on system log analysis
Status: Active
Publication: CN110750455B (en)

Publications (2)

Publication Number Publication Date
CN110750455A (en) 2020-02-04
CN110750455B (en) 2021-04-30

Family ID: 69278923

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant