CN107301119A - Method and device for IT fault root cause analysis using temporal correlation


Info

Publication number
CN107301119A
Authority
CN
China
Prior art keywords
key field
causality
time series
series data
system log
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710508423.8A
Other languages
Chinese (zh)
Other versions
CN107301119B (en
Inventor
饶琛琳
梁玫娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING YOUTEJIE INFORMATION TECHNOLOGY Co Ltd
Original Assignee
BEIJING YOUTEJIE INFORMATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING YOUTEJIE INFORMATION TECHNOLOGY Co Ltd filed Critical BEIJING YOUTEJIE INFORMATION TECHNOLOGY Co Ltd
Priority to CN201710508423.8A priority Critical patent/CN107301119B/en
Publication of CN107301119A publication Critical patent/CN107301119A/en
Application granted granted Critical
Publication of CN107301119B publication Critical patent/CN107301119B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/34Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466Performance evaluation by tracing or monitoring
    • G06F11/3476Data logging
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3668Software testing
    • G06F11/3672Test management
    • G06F11/3692Test management for test results analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Abstract

The present invention relates to a method and device for IT fault root cause analysis using temporal correlation. The method comprises: obtaining system logs; extracting key fields from the system logs and counting the key fields to obtain time series data of the system logs; automatically extracting relevant features of the time series data based on quantitative hypothesis testing; and, when an IT fault occurs, testing the relevant features of the time series data by means of Granger causality, wherein the magnitude of the causality value between the relevant features of the time series data serves as the basis for evaluating the cause of the IT fault. The beneficial effects of the invention are as follows: the fault root cause analysis process is completed automatically by means of machine learning, which helps users find the cause of a fault quickly, reduces the mean time to diagnose (MTTD), and restores the system to normal as quickly as possible.

Description

Method and device for IT fault root cause analysis using temporal correlation
Technical field
Embodiments of the present invention relate to the technical field of information processing, and in particular to a method and device for IT fault root cause analysis using temporal correlation.
Background
Log data underpins many enterprise applications, such as troubleshooting, monitoring, security, compliance, and electronic forensics, and it has enormous value for mining. With the arrival of the big data era, data is generated faster and in greater volume, and humans cannot keep up with the rate at which even a single machine produces data. Most of the content in log data cannot be handed over for direct manual inspection either. As logs keep growing in volume and variety, their content exceeds human cognitive capacity, and analyzing log data to track down potential problems becomes increasingly difficult. In particular, once correlation analysis across many logs is needed, experienced operations personnel are required to trace the chain of events, filter out noise, and finally diagnose the root cause of the problem. Root cause analysis of service faults has always relied heavily on the experience and trial and error of operations and maintenance personnel; only the simplest hardware faults allow some degree of alert convergence through simple parent (origin, parent element) settings. No method or device has yet appeared that solves these problems well.
Summary of the invention
To overcome the technical problems in the related art, the present invention provides a method and device for IT fault root cause analysis using temporal correlation, so that after a fault occurs its cause can be analyzed and the fault eliminated in a timely manner.
In a first aspect, an embodiment of the present invention provides a method for IT fault root cause analysis using temporal correlation, whose feasible technical solution is as follows:
A method for IT fault root cause analysis using temporal correlation, the method comprising:
obtaining system logs;
extracting key fields from the system logs, and counting the key fields to obtain time series data of the system logs;
automatically extracting relevant features of the time series data based on quantitative hypothesis testing;
when an IT fault occurs, testing the relevant features of the time series data by means of Granger causality, wherein the magnitude of the causality value between the relevant features of the time series data serves as the basis for evaluating the cause of the IT fault.
With reference to the foregoing aspect, in one possible implementation of that aspect, extracting key fields from the system logs and counting the key fields to obtain time series data of the system logs comprises:
extracting the key fields of the system logs;
counting the key metric parameters of the system logs to obtain the time series data of the system logs;
wherein the key metric parameters include one of, or a combination of two or more of, access count, permission changes, and error messages.
With reference to the foregoing aspect, in one possible implementation of that aspect, extracting key fields from the system logs and counting the key fields to obtain time series data of the system logs further comprises:
parameterizing the key fields;
building a parameter interaction graph for the parameterized key fields obtained from the system logs;
and testing the relevant features of the time series data by means of Granger causality comprises: testing the parameterized key fields by means of Granger causality.
With reference to the foregoing aspect, in one possible implementation of that aspect, using the magnitude of the causality value between the relevant features of the time series data as the basis for evaluating the cause of the IT fault comprises:
testing the parameterized key fields by means of Granger causality to obtain the causality values of the parameterized key fields;
building a quantitative causality graph of the IT fault from the causality values.
With reference to the foregoing aspect, in one possible implementation of that aspect, using the magnitude of the causality value between the relevant features of the time series data as the basis for evaluating the cause of the IT fault further comprises:
determining that the maximum path in the quantitative causality graph is the IT fault propagation path.
In a second aspect, an embodiment of the present invention further provides a device for IT fault root cause analysis using temporal correlation, whose feasible technical solution is as follows:
The device comprises:
an acquisition module, configured to obtain system logs;
a statistics module, configured to extract key fields from the system logs and count the key fields to obtain time series data of the system logs;
an automatic extraction module, configured to automatically extract relevant features of the time series data based on quantitative hypothesis testing;
a fault determination module, configured to test, when an IT fault occurs, the relevant features of the time series data by means of Granger causality, wherein the magnitude of the causality value between the relevant features of the time series data serves as the basis for evaluating the cause of the IT fault.
In the above device, the statistics module comprises:
an extraction submodule, configured to extract the key fields of the system logs;
a statistics submodule, configured to count the key metric parameters of the system logs to obtain the time series data of the system logs;
wherein the key metric parameters include one of, or a combination of two or more of, access count, permission changes, and error messages.
In the above device, the statistics module further comprises:
a parameterization module, configured to parameterize the key fields;
a parameter graph building module, configured to build a parameter interaction graph for the parameterized key fields obtained from the system logs;
and the fault determination module is further configured to test the parameterized key fields by means of Granger causality.
In the above device, the fault determination module is further configured to:
test the parameterized key fields by means of Granger causality to obtain the causality values of the parameterized key fields;
build a quantitative causality graph of the IT fault from the causality values.
In the above device, the fault determination module further comprises:
a path determination submodule, configured to determine that the maximum path in the quantitative causality graph is the IT fault propagation path.
The present invention extracts key fields from the system logs and counts the key fields to obtain time series data of the system logs. After the Granger causality graph is built, the cause of the fault is determined by calculating the causality values between the relevant features of the time series data in the graph, and new parameters can be added to the Granger causality graph continuously. The fault root cause analysis process is thus completed automatically by means of machine learning, which helps users find the cause of a fault quickly, reduces the mean time to diagnose (MTTD), and restores the system to normal as quickly as possible.
It should be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and do not limit the present invention.
Brief description of the drawings
The accompanying drawings are incorporated in and constitute a part of this specification. They show embodiments consistent with the present invention and, together with the specification, serve to explain the principles of the invention.
Fig. 1 is a schematic flowchart of a method for IT fault root cause analysis using temporal correlation according to an exemplary embodiment.
Fig. 2 is a schematic flowchart of a method for IT fault root cause analysis using temporal correlation according to an exemplary embodiment.
Fig. 3 is a schematic flowchart of a method for IT fault root cause analysis using temporal correlation according to an exemplary embodiment.
Fig. 4 is a block diagram of a device for IT fault root cause analysis using temporal correlation according to an exemplary embodiment.
Fig. 5 is a block diagram of a device for IT fault root cause analysis using temporal correlation according to an exemplary embodiment.
Detailed description
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.
Before the exemplary embodiments are discussed in more detail, it should be mentioned that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart describes the steps as sequential, many of the steps can be performed in parallel, concurrently, or simultaneously. In addition, the order of the steps can be rearranged. A process may be terminated when its operations are completed, but it may also have additional steps not included in the drawings. A process may correspond to a method, function, procedure, subroutine, subprogram, and so on.
The present invention relates to a method for IT fault root cause analysis using temporal correlation and to a corresponding device. It is mainly applied to scenarios in which a fault in an enterprise IT system must be eliminated promptly after it occurs. The basic idea is as follows: extract key fields from the system logs as time series data and automatically extract relevant features from the time series data; when the IT system fails, test the temporal relevant features using Granger causality; take the path with the largest causality value in the graph as the fault propagation path and use it as the basis for a solution to the fault, or use it to match the best solution in a fault knowledge database. In this way the cause of the fault can be found quickly, the mean time to diagnose (MTTD) is reduced, and the system is restored to normal as quickly as possible.
This embodiment is applicable to rapid fault elimination in an IT terminal equipped with a machine learning module. The method can be performed by the machine learning module, which can be implemented in software and/or hardware and can be applied in an application such as Rizhiyi (日志易). Fig. 1 is a schematic flowchart provided by embodiment one of the present invention; the method specifically comprises the following steps:
In step 110, system logs are obtained.
The system logs include the system logs, application logs, security logs, and so on produced while the operating system of a device is running. In one feasible embodiment, the system logs can be received by entering eventvwr.msc in the Run dialog of the operating system to open the system's Event Viewer.
In step 120, key fields are extracted from the system logs, and the key fields are counted to obtain time series data of the system logs.
A key field of the system logs can be information representing a certain type.
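Step 120 can be illustrated with a minimal sketch. The log format, the choice of the log level as the key field, and the one-minute bucket are all assumptions made for illustration; the text above does not specify them.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical log lines in the assumed format "<ISO timestamp> <LEVEL> <message>".
SAMPLE_LOG = [
    "2017-06-28T10:00:05 INFO access /index",
    "2017-06-28T10:00:40 ERROR db timeout",
    "2017-06-28T10:01:10 INFO access /login",
    "2017-06-28T10:01:30 ERROR db timeout",
    "2017-06-28T10:01:45 ERROR db timeout",
]

def extract_key_field(line):
    """Return (minute bucket, key field); here the key field is the log level."""
    timestamp, level, _message = line.split(" ", 2)
    minute = datetime.fromisoformat(timestamp).replace(second=0)
    return minute, level

def to_time_series(lines):
    """Count each key field per minute, giving one aligned series per field."""
    counts = defaultdict(Counter)
    for line in lines:
        minute, field = extract_key_field(line)
        counts[field][minute] += 1
    buckets = sorted({m for c in counts.values() for m in c})
    # A Counter returns 0 for a bucket in which the field never appeared.
    return {field: [c[m] for m in buckets] for field, c in counts.items()}

series = to_time_series(SAMPLE_LOG)
print(series)
```

Each key field becomes one count series over the same ordered time buckets, which is the shape the later correlation and causality steps need.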
In step 130, relevant features of the time series data are automatically extracted based on quantitative hypothesis testing.
In one implementation scenario of an exemplary embodiment of the present invention, feature extraction can be carried out by calculating the correlation between each feature and the response variable, by training a pre-selected model that can score the features, by selecting features through deep learning, and so on. The relevant features of the time series data can then be autocorrelation parameters, partial correlation parameters, lag period parameters, and the like.
In the quantitative hypothesis test, a percentage is generally used as the detection level. Relevant features that exceed the detection level are extracted, while features that fall below it are treated as irrelevant and filtered out.
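One way to read step 130 is as a significance filter over candidate features. The sketch below is an assumption, not the patent's stated procedure: it uses the sample autocorrelation at each lag as the candidate feature and keeps a lag only if it exceeds the usual large-sample white-noise bound z / sqrt(n). The function names and the `alpha` value are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    mean = fmean(series)
    denom = sum((v - mean) ** 2 for v in series)
    if denom == 0:
        return 0.0  # a constant series carries no correlation information
    num = sum((series[t] - mean) * (series[t - lag] - mean)
              for t in range(lag, len(series)))
    return num / denom

def significant_lags(series, max_lag, alpha):
    """Keep lags whose autocorrelation rejects the white-noise hypothesis.

    Uses the large-sample bound |r_k| > z / sqrt(n). `alpha` is an assumed
    significance level, not a value taken from the patent text.
    """
    bound = NormalDist().inv_cdf(1 - alpha / 2) / sqrt(len(series))
    kept = {}
    for lag in range(1, max_lag + 1):
        r = autocorrelation(series, lag)
        if abs(r) > bound:
            kept[lag] = r
    return kept

periodic = [1, 5, 1, 5] * 10  # alternating counts with period 2
features = significant_lags(periodic, max_lag=3, alpha=0.05)
flat = significant_lags([3] * 20, max_lag=3, alpha=0.05)
```

Features that pass the bound are retained as relevant; everything else is filtered out, mirroring the extract-or-filter rule described above.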
In step 140, when an IT fault occurs, the relevant features of the time series data are tested by means of Granger causality, wherein the magnitude of the causality value between the relevant features of the time series data serves as the basis for evaluating the cause of the IT fault.
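The test in step 140 can be sketched as follows. The patent does not say how the causality value is computed, so this example assumes a single-lag Granger test and uses the F statistic as the causality value: it compares a model that predicts y from its own past with one that also uses the past of x. The helper names and the synthetic data are hypothetical.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def rss(rows, target):
    """Residual sum of squares of the least-squares fit of target on rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((y - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, y in zip(rows, target))

def granger_f(x, y):
    """F statistic for 'x Granger-causes y' with one lag (assumed lag order)."""
    target = y[1:]
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]
    unrestricted = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
    rss_r, rss_u = rss(restricted, target), rss(unrestricted, target)
    dof = len(target) - 3  # observations minus unrestricted parameters
    return (rss_r - rss_u) / (rss_u / dof)

# Synthetic series in which x drives y with a one-step delay.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 200)]
f_xy = granger_f(x, y)  # large: past x improves the prediction of y
f_yx = granger_f(y, x)  # small: past y does not help predict x
```

In practice a statistics library would normally replace the hand-written least squares; the pure-Python version is used here only to keep the sketch self-contained.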
In another implementation scenario of an exemplary embodiment of the present invention, as shown in Fig. 2, extracting key fields from the system logs and counting the key fields to obtain time series data of the system logs further comprises:
In step 121, the key fields are parameterized.
In step 122, a parameter interaction graph is built for the parameterized key fields obtained from the system logs.
Testing the relevant features of the time series data by means of Granger causality comprises step 123: testing the parameterized key fields by means of Granger causality.
In another implementation scenario of an exemplary embodiment of the present invention, as shown in Fig. 3, using the magnitude of the causality value between the relevant features of the time series data as the basis for evaluating the cause of the IT fault comprises:
In step 131, the parameterized key fields are tested by means of Granger causality to obtain the causality values of the parameterized key fields.
When the key fields in the fault knowledge database fail to match the system logs, the new fault phenomenon, new fault cause, and new fault solution of the new fault are added to the fault knowledge database according to the solution the user feeds back for the new fault. That is, through machine learning, the key fields extracted from the system logs are used as key metric parameters, a parameter interaction graph is built for the key metric parameters of the system logs, and step 140 is then carried out. The key metric parameters include one of, or a combination of two or more of, access count, permission changes, and error messages.
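The fallback described above, in which an unmatched fault is added to the knowledge base from user feedback, can be sketched as a small lookup structure. The class name, its methods, and the sample entry are hypothetical; the patent does not describe how the fault knowledge database is stored or matched.

```python
class FaultKnowledgeBase:
    """Maps a set of key fields to a known fault phenomenon, cause, and solution."""

    def __init__(self):
        self._entries = {}

    def match(self, key_fields):
        """Return the stored entry for these key fields, or None if unknown."""
        return self._entries.get(frozenset(key_fields))

    def learn(self, key_fields, phenomenon, cause, solution):
        """Record a new fault from the solution the user feeds back."""
        self._entries[frozenset(key_fields)] = {
            "phenomenon": phenomenon,
            "cause": cause,
            "solution": solution,
        }

kb = FaultKnowledgeBase()
observed = ["error_count", "permission_change"]

first_lookup = kb.match(observed)  # nothing is known yet
if first_lookup is None:
    # Hypothetical user feedback for the previously unseen fault.
    kb.learn(observed,
             phenomenon="login errors spike",
             cause="permission change",
             solution="roll back the permission change")

second_lookup = kb.match(reversed(observed))  # field order does not matter
```

Exact set matching is the simplest possible policy; a real knowledge base would likely need fuzzy or partial matching.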
In step 132, a quantitative causality graph of the IT fault is built from the causality values.
In one implementation scenario of an exemplary embodiment of the present invention, the key metric parameters can be tested pairwise with the Granger causality test, and a quantitative causality graph of the fault is built from the calculated causality values. The path with the largest causality value is regarded as the fault propagation path, which also indicates the cause of the IT fault.
In another implementation scenario of an exemplary embodiment of the present invention, using the magnitude of the causality value between the relevant features of the time series data as the basis for evaluating the cause of the IT fault further comprises:
determining that the maximum path in the quantitative causality graph is the IT fault propagation path.
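The graph step can be sketched as a weighted directed graph whose edges carry pairwise causality values. The text does not define "maximum path" precisely; this example assumes it means the simple path with the largest summed causality value, found by exhaustive depth-first search. The metric names and edge weights are hypothetical.

```python
def max_weight_path(edges):
    """Return (total weight, path) of the heaviest simple path.

    `edges` maps (cause, effect) to a causality value. The search is
    exhaustive, so it is only suitable for small graphs.
    """
    graph = {}
    for (src, dst), weight in edges.items():
        graph.setdefault(src, []).append((dst, weight))
        graph.setdefault(dst, [])

    best = (0.0, [])

    def walk(node, path, total):
        nonlocal best
        if len(path) > 1 and total > best[0]:
            best = (total, path)
        for nxt, weight in graph[node]:
            if nxt not in path:  # keep the path simple (no repeated node)
                walk(nxt, path + [nxt], total + weight)

    for start in graph:
        walk(start, [start], 0.0)
    return best

# Hypothetical causality values between key metric parameters.
edges = {
    ("permission_change", "error_count"): 9.0,
    ("error_count", "access_count"): 6.5,
    ("permission_change", "access_count"): 2.0,
    ("access_count", "error_count"): 0.5,
}
weight, path = max_weight_path(edges)
root_cause_candidate = path[0]
```

Under this reading, the first node of the heaviest path is the candidate root cause and the rest of the path shows how the fault propagated.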
The method of the present invention extracts key fields from the system logs and counts the key fields to obtain time series data of the system logs. After the Granger causality graph is built, the cause of the fault is determined by calculating the causality values between the relevant features of the time series data in the graph, and new parameters can be added to the Granger causality graph continuously. The fault root cause analysis process is thus completed automatically by means of machine learning, which helps users find the cause of a fault quickly, reduces the mean time to diagnose (MTTD), and restores the system to normal as quickly as possible.
Fig. 4 is a schematic structural diagram of a device for IT fault root cause analysis using temporal correlation provided by embodiment five of the present invention. The device can be implemented in software and/or hardware, is generally integrated into a machine learning module, and can be realized by means of the method for IT fault root cause analysis using temporal correlation. As shown in the figure, this embodiment, which can be based on the embodiments above, provides a device for IT fault root cause analysis using temporal correlation that mainly comprises an acquisition module 410, a statistics module 420, an automatic extraction module 430, and a fault determination module 440.
The acquisition module 410 is configured to obtain system logs.
The statistics module 420 is configured to extract key fields from the system logs and count the key fields to obtain time series data of the system logs.
The automatic extraction module 430 is configured to automatically extract relevant features of the time series data based on quantitative hypothesis testing.
The fault determination module 440 is configured to test, when an IT fault occurs, the relevant features of the time series data by means of Granger causality, wherein the magnitude of the causality value between the relevant features of the time series data serves as the basis for evaluating the cause of the IT fault.
In another implementation scenario of an exemplary embodiment of the present invention, as shown in Fig. 5, the statistics module 420 comprises:
an extraction submodule 421, configured to extract the key fields of the system logs;
a statistics submodule 422, configured to count the key metric parameters of the system logs to obtain the time series data of the system logs;
wherein the key metric parameters include one of, or a combination of two or more of, access count, permission changes, and error messages.
In another implementation scenario of an exemplary embodiment of the present invention, the statistics module further comprises:
a parameterization module, configured to parameterize the key fields;
a parameter graph building module, configured to build a parameter interaction graph for the parameterized key fields obtained from the system logs;
and the fault determination module is further configured to test the parameterized key fields by means of Granger causality.
In the above device, the fault determination module 440 is further configured to:
test the parameterized key fields by means of Granger causality to obtain the causality values of the parameterized key fields;
build a quantitative causality graph of the IT fault from the causality values.
In the above device, the fault determination module 440 further comprises:
a path determination submodule, configured to determine that the maximum path in the quantitative causality graph is the IT fault propagation path.
The device for IT fault root cause analysis using temporal correlation provided in the above embodiments can perform the method for IT fault root cause analysis using temporal correlation provided in any embodiment of the present invention, and it has the functional modules and beneficial effects corresponding to that method. For technical details not described in the above embodiments, refer to the method for IT fault root cause analysis using temporal correlation provided in any embodiment of the present invention.
It will be appreciated that the present invention also extends to computer programs adapted to put the invention into practice, particularly computer programs on or in a carrier. The program may be in the form of source code, object code, a code intermediate between source and object code such as a partially compiled form, or in any other form suitable for use in implementing the method according to the invention. It will also be noted that such a program may have many different architectural designs. For example, program code implementing the functionality of the method or system according to the invention may be subdivided into one or more subroutines.
Many different ways of distributing the functionality among these subroutines will be apparent to the skilled person. The subroutines may be stored together in one executable file to form a self-contained program. Such an executable file may comprise computer-executable instructions, for example processor instructions and/or interpreter instructions (for example, Java interpreter instructions). Alternatively, one or more or all of the subroutines may be stored in at least one external library file and linked with a main program either statically or dynamically (for example, at run time). The main program contains at least one call to at least one of the subroutines. The subroutines may also contain function calls to one another. An embodiment relating to a computer program product comprises computer-executable instructions corresponding to each processing step of at least one of the methods set forth herein. These instructions may be subdivided into subroutines and/or stored in one or more files that may be linked statically or dynamically.
Another embodiment relating to a computer program product comprises computer-executable instructions corresponding to each means of at least one of the systems and/or products set forth herein. These instructions may be subdivided into subroutines and/or stored in one or more files that may be linked statically or dynamically.
The carrier of a computer program may be any entity or device capable of carrying the program. For example, the carrier may include a storage medium, such as a ROM (for example, a CD-ROM or a semiconductor ROM) or a magnetic recording medium (for example, a floppy disk or a hard disk). Furthermore, the carrier may be a transmissible carrier such as an electrical or optical signal, which may be conveyed via electrical or optical cable, or by radio or other means. When the program is embodied in such a signal, the carrier may be constituted by such a cable or other device or means. Alternatively, the carrier may be an integrated circuit in which the program is embedded, the integrated circuit being adapted to perform, or to be used in the performance of, the relevant method.
It should be noted that the embodiments mentioned above illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. Use of the verb "comprise" and its conjugations does not exclude the presence of elements or steps other than those stated in a claim. The article "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a device claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
If desired, the different functions discussed herein may be performed in a different order and/or concurrently with one another. Furthermore, if desired, one or more of the functions described above may be optional or may be combined.
If desired, the steps discussed above are not limited to the order of execution in each embodiment; different steps may be performed in a different order and/or concurrently with one another. Furthermore, in other embodiments, one or more of the steps described above may be optional or may be combined.
Although various aspects of the invention are set out in the independent claims, other aspects of the invention comprise combinations of features from the described embodiments and/or the dependent claims with the features of the independent claims, and not solely the combinations explicitly set out in the claims.
It is noted here that, although the above describes example embodiments of the invention, these descriptions should not be understood in a limiting sense. Rather, several variations and modifications may be made without departing from the scope of the present invention as defined in the appended claims.
Those skilled in the art will appreciate that each module in the device of the embodiments of the present invention can be implemented with a general-purpose computing device, and that the modules can be concentrated in a single computing device or in a network of computing devices. The device in the embodiments of the present invention corresponds to the method in the foregoing embodiments; it can be implemented by executable program code or by a combination of integrated circuits, so the present invention is not limited to any specific hardware, software, or combination thereof.
Those skilled in the art will appreciate that each module in the device of the embodiments of the present invention can be implemented with a general-purpose mobile terminal, and that the modules can be concentrated in a single mobile terminal or in a combination of devices made up of mobile terminals. The device in the embodiments of the present invention corresponds to the method in the foregoing embodiments; it can be implemented by editing executable program code or by a combination of integrated circuits, so the present invention is not limited to any specific hardware, software, or combination thereof.
Note that the above describes only exemplary embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made without departing from the scope of protection of the present invention. It is neither necessary nor possible to list all embodiments exhaustively. Therefore, although the present invention has been described in further detail through the above embodiments, it is not limited to them and may include other equivalent embodiments without departing from the inventive concept. Any obvious change or variation derived within the spirit and principles of the present invention remains within the scope of protection of the claims of the present invention.

Claims (10)

1. A method for IT fault root cause analysis using time series correlation, characterized in that the method comprises:
obtaining system logs;
extracting key fields of the system logs, and counting the key fields to obtain time series data of the system logs;
automatically extracting correlated features of the time series data based on quantitative hypothesis testing;
when an IT fault occurs, testing the correlated features of the time series data by Granger causality, wherein the magnitude of the causality value between the correlated features of the time series data serves as the basis for evaluating the cause of the IT fault.
2. The method according to claim 1, characterized in that extracting the key fields of the system logs and counting the key fields to obtain the time series data of the system logs comprises:
extracting the key fields of the system logs;
counting key metric parameters of the system logs to obtain the time series data of the system logs;
wherein the key metric parameters include one of, or a combination of two or more of, access count, permission changes, and error messages.
3. The method according to claim 1 or 2, characterized in that extracting the key fields of the system logs and counting the key fields to obtain the time series data of the system logs further comprises:
parameterizing the key fields;
building a parameter interaction graph for the parameterized key fields obtained from the system logs;
and in that testing the correlated features of the time series data by Granger causality comprises: testing the parameterized key fields by Granger causality.
4. The method according to claim 3, characterized in that using the magnitude of the causality value between the correlated features of the time series data as the basis for evaluating the cause of the IT fault comprises:
testing the parameterized key fields by Granger causality to obtain the causality values of the parameterized key fields;
building a quantitative causality graph of the IT fault from the causality values.
5. The method according to claim 4, characterized in that using the magnitude of the causality value between the correlated features of the time series data as the basis for evaluating the cause of the IT fault further comprises:
determining the maximum path in the quantitative causality graph to be the IT fault propagation path.
6. A device for IT fault root cause analysis using time series correlation, characterized in that the device comprises:
an acquisition module, for obtaining system logs;
a statistics module, for extracting key fields of the system logs and counting the key fields to obtain time series data of the system logs;
an automatic extraction module, for automatically extracting correlated features of the time series data based on quantitative hypothesis testing;
a fault determination module, for testing, when an IT fault occurs, the correlated features of the time series data by Granger causality, wherein the magnitude of the causality value between the correlated features of the time series data serves as the basis for evaluating the cause of the IT fault.
7. The device according to claim 6, characterized in that the statistics module comprises:
an extraction submodule, for extracting the key fields of the system logs;
a statistics submodule, for counting key metric parameters of the system logs to obtain the time series data of the system logs;
wherein the key metric parameters include one of, or a combination of two or more of, access count, permission changes, and error messages.
8. The device according to claim 6 or 7, characterized in that the statistics module further comprises:
a parameterization setting module, for parameterizing the key fields;
a parameter graph building module, for building a parameter interaction graph for the parameterized key fields obtained from the system logs;
and in that the fault determination module is further configured to test the parameterized key fields by Granger causality.
9. The device according to claim 8, characterized in that the fault determination module is further configured to:
test the parameterized key fields by Granger causality to obtain the causality values of the parameterized key fields;
build a quantitative causality graph of the IT fault from the causality values.
10. The device according to claim 9, characterized in that the fault determination module further comprises:
a path determination submodule, for determining the maximum path in the quantitative causality graph to be the IT fault propagation path.
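
As a rough illustration of the steps recited in the claims above (counting key fields into time series, Granger causality testing, building a quantitative causality graph, and selecting the maximum path), a minimal Python sketch follows. Everything specific in it is an assumption for illustration and is not taken from the patent: the log line format `<bucket> <message>`, the keyword names, the use of the lag-1 Granger F statistic as the "causality value", the edge threshold of 20, and the reading of "maximum path" as the simple path with the largest sum of edge values. The hypothesis-testing feature extraction step is not modeled separately.

```python
import random
from collections import Counter
from itertools import permutations


def count_series(log_lines, keyword, n_buckets):
    """Count log lines containing `keyword` per time bucket.
    Assumed (hypothetical) line format: "<bucket index> <message>"."""
    counts = Counter()
    for line in log_lines:
        bucket, _, message = line.partition(" ")
        if keyword in message:
            counts[int(bucket)] += 1
    return [counts[b] for b in range(n_buckets)]


def ols_rss(rows, y):
    """Residual sum of squares of a least-squares fit (normal equations,
    Gauss-Jordan elimination). Assumes the regressors are not collinear."""
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))  # partial pivoting
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [u - f * v for u, v in zip(a[r], a[c])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, y))


def granger_f(x, y, lag=1):
    """F statistic for 'x Granger-causes y': does adding lagged x to an
    autoregression of y reduce the residual error?"""
    n = len(y)
    target = y[lag:]
    restricted = [[1.0] + [y[t - i] for i in range(1, lag + 1)]
                  for t in range(lag, n)]
    full = [r + [x[t - i] for i in range(1, lag + 1)]
            for r, t in zip(restricted, range(lag, n))]
    rss_r, rss_u = ols_rss(restricted, target), ols_rss(full, target)
    dof = len(target) - (2 * lag + 1)
    return ((rss_r - rss_u) / lag) / (rss_u / dof)


def causality_graph(series, threshold):
    """Directed edges (cause, effect) -> F value, kept above a threshold."""
    edges = {}
    for a, b in permutations(series, 2):
        f = granger_f(series[a], series[b])
        if f > threshold:
            edges[(a, b)] = f
    return edges


def max_path(edges, nodes):
    """Simple path with the largest summed edge value (brute force, so
    only suitable for a handful of nodes)."""
    best, best_score = (), 0.0
    for r in range(2, len(nodes) + 1):
        for path in permutations(nodes, r):
            hops = list(zip(path, path[1:]))
            if all(h in edges for h in hops):
                score = sum(edges[h] for h in hops)
                if score > best_score:
                    best, best_score = path, score
    return best, best_score


# Synthetic logs: permission changes drive errors one bucket later,
# and errors drive access counts one bucket after that.
random.seed(0)
n = 200
perm = [random.randint(0, 9) for _ in range(n)]
err = [0] + [perm[t - 1] + random.randint(0, 1) for t in range(1, n)]
acc = [0] + [err[t - 1] + random.randint(0, 1) for t in range(1, n)]
logs = []
for t in range(n):
    logs += ([f"{t} permission_change"] * perm[t] + [f"{t} error"] * err[t]
             + [f"{t} access"] * acc[t])

names = ["permission_change", "error", "access"]
series = {k: count_series(logs, k, n) for k in names}
graph = causality_graph(series, threshold=20.0)
path, score = max_path(graph, names)
print(path, round(score, 1))
```

In practice a statistics library would normally replace the hand-written regression, and the lag order and threshold would need to be chosen for the data at hand; the brute-force path search is only workable for small graphs.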
CN201710508423.8A 2017-06-28 2017-06-28 Method and device for analyzing IT fault root cause by utilizing time sequence correlation Active CN107301119B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710508423.8A CN107301119B (en) 2017-06-28 2017-06-28 Method and device for analyzing IT fault root cause by utilizing time sequence correlation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710508423.8A CN107301119B (en) 2017-06-28 2017-06-28 Method and device for analyzing IT fault root cause by utilizing time sequence correlation

Publications (2)

Publication Number Publication Date
CN107301119A true CN107301119A (en) 2017-10-27
CN107301119B CN107301119B (en) 2020-07-14

Family

ID=60136065

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710508423.8A Active CN107301119B (en) 2017-06-28 2017-06-28 Method and device for analyzing IT fault root cause by utilizing time sequence correlation

Country Status (1)

Country Link
CN (1) CN107301119B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080201397A1 (en) * 2007-02-20 2008-08-21 Wei Peng Semi-automatic system with an iterative learning method for uncovering the leading indicators in business processes
CN104483958A (en) * 2014-10-31 2015-04-01 中国石油大学(北京) Adaptive data driving fault diagnosis method and device in complex refining process
CN106502815A (en) * 2016-10-20 2017-03-15 北京蓝海讯通科技股份有限公司 A kind of abnormal cause localization method, device and computing device

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108009040B (en) * 2017-12-12 2021-05-04 杭州时趣信息技术有限公司 Method, system and computer readable storage medium for determining fault root cause
CN108009040A (en) * 2017-12-12 2018-05-08 杭州时趣信息技术有限公司 Method, system and computer readable storage medium for determining fault root cause
CN108446184A (en) * 2018-02-23 2018-08-24 北京天元创新科技有限公司 Method and system for analyzing fault root cause
CN108446184B (en) * 2018-02-23 2021-09-07 北京天元创新科技有限公司 Method and system for analyzing fault root cause
CN108897633A (en) * 2018-06-06 2018-11-27 山东超越数控电子股份有限公司 Fault diagnosis method and device based on machine data
CN109034368A (en) * 2018-06-22 2018-12-18 北京航空航天大学 DNN-based multiple fault diagnosis method for complex equipment
CN109190709A (en) * 2018-09-12 2019-01-11 北京工业大学 Feature selection method for pollutant prediction
CN109271319A (en) * 2018-09-18 2019-01-25 北京航空航天大学 Software fault prediction method based on panel data analysis
CN109460362A (en) * 2018-11-06 2019-03-12 北京京航计算通讯研究所 System interface timing knowledge analysis system based on fine-grained feature semantic network
CN111555895B (en) * 2019-02-12 2023-02-21 北京数安鑫云信息技术有限公司 Method, device, storage medium and computer equipment for analyzing website faults
CN111555895A (en) * 2019-02-12 2020-08-18 北京数安鑫云信息技术有限公司 Method, device, storage medium and computer equipment for analyzing website faults
US11568281B2 (en) 2019-11-13 2023-01-31 International Business Machines Corporation Causal reasoning for explanation of model predictions
CN110855502A (en) * 2019-11-22 2020-02-28 叶晓斌 Fault cause determination method and system based on time-space analysis log
WO2021116857A1 (en) * 2019-12-11 2021-06-17 International Business Machines Corporation Root cause analysis using granger causality
GB2606918A (en) * 2019-12-11 2022-11-23 Ibm Root cause analysis using granger causality
US11816178B2 (en) 2019-12-11 2023-11-14 International Business Machines Corporation Root cause analysis using granger causality
US11238129B2 (en) 2019-12-11 2022-02-01 International Business Machines Corporation Root cause analysis using Granger causality
CN113051307A (en) * 2019-12-27 2021-06-29 深信服科技股份有限公司 Alarm signal analysis method, equipment, storage medium and device
CN113127528A (en) * 2019-12-30 2021-07-16 中移信息技术有限公司 System root cause positioning method, device, equipment and computer storage medium
CN112052151B (en) * 2020-10-09 2022-02-18 腾讯科技(深圳)有限公司 Fault root cause analysis method, device, equipment and storage medium
CN112052151A (en) * 2020-10-09 2020-12-08 腾讯科技(深圳)有限公司 Fault root cause analysis method, device, equipment and storage medium
CN112363891A (en) * 2020-11-18 2021-02-12 西安交通大学 Exception reason obtaining method based on fine-grained event and KPIs analysis
CN112363891B (en) * 2020-11-18 2022-10-25 西安交通大学 Method for obtaining abnormal reasons based on fine-grained events and KPIs (Key Performance indicators) analysis
CN113676360B (en) * 2021-09-26 2022-10-21 平安科技(深圳)有限公司 Link diagram repairing method based on Granger causality test and graph similarity technology
CN113676360A (en) * 2021-09-26 2021-11-19 平安科技(深圳)有限公司 Link diagram repairing method based on Granger causality test and graph similarity technology
CN115118580A (en) * 2022-05-20 2022-09-27 阿里巴巴(中国)有限公司 Alarm analysis method and device
CN115118580B (en) * 2022-05-20 2023-10-31 阿里巴巴(中国)有限公司 Alarm analysis method and device

Also Published As

Publication number Publication date
CN107301119B (en) 2020-07-14

Similar Documents

Publication Publication Date Title
CN107301119A (en) Method and device for analyzing IT fault root cause by utilizing time sequence correlation
CN109034660B (en) Method and related device for determining risk control strategy based on prediction model
CN107341068A (en) Method and apparatus for O&M troubleshooting by natural language processing
CN105095048B (en) Monitoring system alarm correlation processing method based on business rules
Becker et al. Fraud detection in telecommunications: History and lessons learned
JP2018536956A (en) Method, device and storage medium for preventing illegal acts related to advertisement
CN111143097B (en) GNSS positioning service-oriented fault management system and method
CN106789844B (en) Malicious user identification method and device
CN110351150A (en) Fault root cause determination method and device, electronic equipment and readable storage medium
CN106254145B (en) Network request tracking processing method and device
US11966319B2 (en) Identifying anomalies in a data center using composite metrics and/or machine learning
CN109684052B (en) Transaction analysis method, device, equipment and storage medium
EP0894378A1 (en) Signature based fraud detection system
CN103220164A (en) Data integrity scoring and visualization for network and customer experience monitoring
CN107465667A (en) Power grid industrial control security collaborative monitoring method and device based on deep protocol parsing
CN109993189A (en) Network fault early warning method, device and medium
CN110457175A (en) Business data processing method, device, electronic equipment and medium
CN110162422A (en) Decision-tree-based problem localization method and device
CN105279196B (en) Test script generation method and device
CN108989581A (en) User risk identification method, apparatus and system
CN115514619B (en) Alarm convergence method and system
CN108959048A (en) Performance analysis method and device for modular environment, and storage medium
CN111782524A (en) Application testing method and device, storage medium and electronic device
CN110187992A (en) Failure analysis method and device
CN107317708A (en) Monitoring method and device for a court business application system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20171027

Assignee: Zhongguancun Technology Leasing Co.,Ltd.

Assignor: BEIJING YOUTEJIE INFORMATION TECHNOLOGY Co.,Ltd.

Contract record no.: X2022980003471

Denomination of invention: Method and device for IT fault root cause analysis using time series correlation

Granted publication date: 20200714

License type: Exclusive License

Record date: 20220328

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Method and device for IT fault root cause analysis using time series correlation

Effective date of registration: 20220329

Granted publication date: 20200714

Pledgee: Zhongguancun Technology Leasing Co.,Ltd.

Pledgor: BEIJING YOUTEJIE INFORMATION TECHNOLOGY Co.,Ltd.

Registration number: Y2022980003498

EC01 Cancellation of recordation of patent licensing contract

Assignee: Zhongguancun Technology Leasing Co.,Ltd.

Assignor: BEIJING YOUTEJIE INFORMATION TECHNOLOGY Co.,Ltd.

Contract record no.: X2022980003471

Date of cancellation: 20231010

PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20231010

Granted publication date: 20200714

Pledgee: Zhongguancun Technology Leasing Co.,Ltd.

Pledgor: BEIJING YOUTEJIE INFORMATION TECHNOLOGY Co.,Ltd.

Registration number: Y2022980003498
