CN110166264A - Fault locating method, apparatus, and electronic device

Publication number
CN110166264A
Authority
CN
China
Prior art keywords
node
destination node
performance indicator
link
value
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810142390.4A
Other languages
Chinese (zh)
Other versions
CN110166264B (en)
Inventor
陈涛
刘宏伟
郭永强
王文浩
刘庆文
龚炎
崔大壮
秦强强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Application filed by Beijing Sankuai Online Technology Co Ltd
Priority to CN201810142390.4A
Publication of CN110166264A
Application granted
Publication of CN110166264B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/0677: Localisation of faults

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The embodiments of the present application provide a fault locating method, apparatus, and electronic device. The method comprises: obtaining target indicator values of the performance indicators corresponding to each node in a business system; calculating an anomaly assessment score for each node based on the obtained target indicator values and the baseline data corresponding to each performance indicator; and, when it is determined from the anomaly assessment scores that at least one target node exists in the business system, determining the root-cause node of the fault from the at least one target node based on the target node link on which the at least one target node is located. A target node is a node whose anomaly assessment score satisfies a preset anomaly condition. This scheme can quickly and effectively locate the root-cause node of a fault in a business system.

Description

Fault locating method, apparatus, and electronic device
Technical field
The present application relates to the field of fault diagnosis, and in particular to a fault locating method, apparatus, and electronic device.
Background
A business system that is deployed in a distributed manner and offers many functions, such as a food-delivery (take-out) system, typically has complex business relationships, numerous subsystems, and long call chains between systems. Specifically, a business system contains a large number of nodes, some of which call one another. A node may be an interface (a specific code method), a service (code that implements a certain function and comprises multiple interfaces or methods), a database, and so on.
When a business system fails, the prior art locates the root-cause node of the fault by manual inspection. During a failure, however, alarm messages surge and a large amount of useful information is buried, so manual fault locating takes a long time and is difficult, which degrades MTTR (Mean Time To Restoration, i.e., mean time to recovery) to the order of minutes or even hours.
How to quickly and effectively locate the root-cause node of a fault in a business system is therefore a problem that urgently needs to be solved.
Summary of the invention
In view of this, the embodiments of the present application provide a fault locating method, apparatus, and electronic device, so as to quickly and effectively locate the root-cause node of a fault in a business system.
Specifically, the embodiments of the present application are implemented through the following technical solutions:
In a first aspect, an embodiment of the present application provides a fault locating method, comprising:
obtaining target indicator values of the performance indicators corresponding to each node in a business system;
calculating an anomaly assessment score for each node based on the obtained target indicator values and the baseline data corresponding to each performance indicator;
when it is determined from the anomaly assessment scores that at least one target node exists in the business system, determining the root-cause node of the fault from the at least one target node based on the target node link on which the at least one target node is located;
wherein a target node is a node whose anomaly assessment score satisfies a preset anomaly condition.
Optionally, the step of obtaining target indicator values of the performance indicators corresponding to each node in the business system comprises:
periodically obtaining the target indicator values of the performance indicators corresponding to each node in the business system.
Optionally, the step of calculating the anomaly assessment score of each node based on the obtained target indicator values and the baseline data corresponding to each performance indicator comprises:
for each performance indicator, calculating the anomaly assessment score of that performance indicator based on its target indicator value and corresponding baseline data;
for each node, calculating the anomaly assessment score of that node based on the anomaly assessment scores of the performance indicators corresponding to the node.
Optionally, the step of calculating, for each performance indicator, the anomaly assessment score of that performance indicator based on its target indicator value and corresponding baseline data comprises:
for each performance indicator, using a preset anomaly assessment model to calculate the anomaly assessment score of that performance indicator based on its target indicator value and corresponding baseline data;
Wherein, the anomaly assessment model are as follows:
Gi=Fi*fi
Wherein, Fi=| Ci-Mi|/Mi, GiFor the anomaly assessment score of performance indicator i, fiIt is pre- corresponding to performance indicator i If basic score, CiFor the target indicator value of performance indicator i, MiFor base-line data corresponding to performance indicator i.
Optionally, the step of determining the root-cause node of the fault from the at least one target node based on the target node link on which the at least one target node is located comprises:
determining the target node link on which the at least one target node is located;
for each target node link, when one target node exists on the target node link, determining that target node as the root-cause node of the fault; when at least two target nodes exist on the target node link, determining the most downstream of those target nodes as the root-cause node of the fault.
Optionally, the step of determining the root-cause node of the fault from the at least one target node based on the target node link on which the at least one target node is located comprises:
determining the target node link on which the at least one target node is located;
outputting a link diagram corresponding to the target node link, so as to guide an administrator in determining the root-cause node of the fault from the at least one target node; wherein the target nodes are highlighted in the link diagram.
Optionally, the baseline data corresponding to any performance indicator is calculated by:
obtaining historical indicator values corresponding to the performance indicator;
removing abnormal indicator values from the historical indicator values to obtain valid indicator values;
calculating the baseline data corresponding to the performance indicator from the valid indicator values, according to the baseline calculation algorithm corresponding to the performance indicator.
In a second aspect, an embodiment of the present application provides a fault locating apparatus, comprising:
a target indicator value obtaining unit, configured to obtain target indicator values of the performance indicators corresponding to each node in a business system;
a node score calculation unit, configured to calculate an anomaly assessment score for each node based on the obtained target indicator values and the baseline data corresponding to each performance indicator;
a fault locating unit, configured to, when it is determined from the anomaly assessment scores that at least one target node exists in the business system, determine the root-cause node of the fault from the at least one target node based on the target node link on which the at least one target node is located;
wherein a target node is a node whose anomaly assessment score satisfies a preset anomaly condition.
Optionally, the target indicator value obtaining unit comprises:
an indicator value obtaining subunit, configured to periodically obtain the target indicator values of the performance indicators corresponding to each node in the business system.
Optionally, the node score calculation unit comprises:
an indicator score calculation subunit, configured to, for each performance indicator, calculate the anomaly assessment score of that performance indicator based on its target indicator value and corresponding baseline data;
a node score calculation subunit, configured to, for each node, calculate the anomaly assessment score of that node based on the anomaly assessment scores of the performance indicators corresponding to the node.
Optionally, the fault locating unit comprises:
a link determining subunit, configured to determine the target node link on which the at least one target node is located;
a first locating subunit, configured to, for each target node link, determine the target node as the root-cause node of the fault when one target node exists on the target node link, and determine the most downstream of the target nodes as the root-cause node of the fault when at least two target nodes exist on the target node link.
Optionally, the fault locating unit comprises:
a link determining subunit, configured to determine the target node link on which the at least one target node is located;
a second locating subunit, configured to output a link diagram corresponding to the target node link, so as to guide an administrator in determining the root-cause node of the fault from the at least one target node; wherein the target nodes are highlighted in the link diagram.
In a third aspect, an embodiment of the present application provides an electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the fault locating method provided in the first aspect when executing the program.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program, the computer program being used to execute the fault locating method provided in the first aspect.
In the method provided by the embodiments of the present application, each node is scored for anomalies based on the target indicator values and baseline data of its performance indicators, yielding at least one abnormal target node; then, taking the call relationships between nodes into account, the root-cause node of the fault is determined from the at least one target node based on the target node link on which it is located. This scheme can therefore quickly and effectively locate the root-cause node of a fault in a business system.
Brief description of the drawings
Fig. 1 is a flowchart of a fault locating method provided by an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a fault locating apparatus provided by an embodiment of the present application;
Fig. 3 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
Detailed description
Exemplary embodiments are described in detail here, and examples of them are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present application as detailed in the appended claims.
The terminology used in the present application is for the purpose of describing particular embodiments only and is not intended to limit the application. The singular forms "a", "said", and "the" used in the present application and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in the present application to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the present application, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
To quickly and effectively locate the root-cause node of a fault in a business system, the embodiments of the present application provide a fault locating method, apparatus, and electronic device.
The fault locating method provided by the embodiments of the present application is introduced first.
It should be noted that the method may be executed by a fault locating apparatus. The fault locating apparatus may run in an electronic device, and in a specific application the electronic device may be a terminal device or a server; either is reasonable.
In addition, based on the call relationships among the nodes in the business system, multiple node links of the business system can be constructed in advance, where the nodes on any node link have call relationships with one another. A node may be an interface, a service, a database, and so on; node links may be constructed manually, although this is not the only option. Specifically, an interface is a specific code method, and a service is code that implements a certain function and may comprise multiple interfaces or methods.
It should also be emphasized that the business system involved in the embodiments of the present application is not limited to a food-delivery system. The fault locating method provided by the embodiments of the present application can be applied to any business system that has multiple nodes with call relationships among them.
As shown in Fig. 1, a fault locating method provided by an embodiment of the present application may include the following steps:
S101: obtain target indicator values of the performance indicators corresponding to each node in the business system.
The indicator values of a node's performance indicators are an important factor in judging whether the node has failed; that is, when a fault occurs, the indicator values of the node's performance indicators change. Therefore, in the embodiments of the present application, the fault locating apparatus can obtain the target indicator values of the performance indicators corresponding to each node in the business system and then perform subsequent processing based on the obtained values.
It can be understood that any node may correspond to one or more performance indicators. The number and specific types of performance indicators may be the same or different for nodes of different types, and likewise may be the same or different for nodes of the same type. In a specific application, the performance indicators for each type of node may be set according to the actual situation. For example:
The performance indicators of an interface may be one or more of TP99, QPM (Queries Per Minute), failure rate, abnormal count, and so on;
The performance indicators of a database may be one or more of abnormal count, failure rate, AVG (Average, the average time taken), and QPM;
The performance indicators of a service may be one or more of abnormal count, failure rate, AVG, QPM, and so on;
Here, TP99 is the minimum time needed to satisfy 99 percent of network requests; failure rate is the number of failed calls divided by the total number of calls; and abnormal count is the number of abnormal calls.
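The two indicator definitions above can be sketched in a few lines. This is a minimal illustration only: the function names `tp99` and `failure_rate` are hypothetical, and the nearest-rank rule used for TP99 is one common reading of "the minimum time needed to satisfy 99 percent of requests", not a rule stated in the source.

```python
def tp99(latencies_ms):
    """Smallest observed latency that covers at least 99% of requests (nearest-rank)."""
    if not latencies_ms:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(latencies_ms)
    # Integer form of ceil(0.99 * n), giving a 1-based rank without float rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def failure_rate(failed_calls, total_calls):
    """Failed calls divided by total calls; 0.0 when no calls were made (assumption)."""
    return failed_calls / total_calls if total_calls else 0.0
```

For 100 samples with latencies 1 to 100 ms, `tp99` returns 99, since 99 of the 100 requests finish within that time.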
It should also be noted that, in one specific implementation, the fault locating apparatus may obtain the target indicator values of the performance indicators corresponding to each node in the business system when it receives an alarm message about the business system. In other words, receiving an alarm message about the business system is the trigger condition for the step of obtaining the target indicator values.
To further improve fault locating efficiency, in another specific implementation the fault locating apparatus may obtain the target indicator values periodically; that is, the apparatus monitors the business system. If the business system fails, the apparatus can then complete fault locating before or at the same time as the alarm message about the business system is received, which greatly improves fault locating efficiency and also avoids fault locating runs triggered by false alarms. In a specific application, the apparatus may perform the obtaining step every predetermined number of minutes (minute-level periodicity) or every predetermined number of hours (hour-level periodicity), and so on.
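Periodic collection can be sketched as a simple loop. The helper below is hypothetical (the source does not describe a scheduler); `collect` stands for whatever routine reads the target indicator values, and the `sleep` parameter is injectable only so that the loop can be exercised without waiting.

```python
import time


def run_periodically(collect, interval_seconds, rounds, sleep=time.sleep):
    """Call collect() `rounds` times, pausing interval_seconds between calls."""
    snapshots = []
    for i in range(rounds):
        snapshots.append(collect())
        if i < rounds - 1:  # no pause needed after the final round
            sleep(interval_seconds)
    return snapshots
```

A minute-level period corresponds to `interval_seconds=60`, and an hour-level period to `interval_seconds=3600`.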
The fault locating apparatus can obtain the target indicator values in several ways. It may collect the values from each node itself, for example by reading the relevant log information of each node. Alternatively, it may obtain them through a web crawler: the crawler collects the indicator values of the performance indicators from each node in real time and stores them, and the fault locating apparatus reads the target indicator values from the stored data when fault locating is needed, and so on.
S102: calculate the anomaly assessment score of each node based on the obtained target indicator values and the baseline data corresponding to each performance indicator.
After obtaining the target indicator values of the performance indicators corresponding to each node, the fault locating apparatus can calculate the anomaly assessment score of each node based on the obtained target indicator values and the baseline data corresponding to each performance indicator. The baseline data corresponding to a performance indicator characterizes the normal level of that indicator; the actual value of any performance indicator may be above or below the baseline data by some margin.
Optionally, the step of calculating the anomaly assessment score of each node based on the obtained target indicator values and the baseline data corresponding to each performance indicator may include:
for each performance indicator, calculating the anomaly assessment score of that performance indicator based on its target indicator value and corresponding baseline data;
for each node, calculating the anomaly assessment score of that node based on the anomaly assessment scores of the performance indicators corresponding to the node.
When calculating the anomaly assessment score of a node, the anomaly assessment scores of the performance indicators corresponding to the node may be added together to obtain the node's score, or they may be combined as a weighted sum. The weights used in the weighted sum may be set based on empirical values, which is not limited here.
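The two aggregation options (plain sum and weighted sum) might look like the sketch below. The function name `node_score`, the dictionary layout, and the default weight of 1.0 for indicators without an explicit weight are assumptions made for illustration.

```python
def node_score(indicator_scores, weights=None):
    """Combine per-indicator anomaly scores into one node score.

    indicator_scores: dict mapping indicator name -> anomaly assessment score.
    weights: optional dict mapping indicator name -> weight (empirical values).
    """
    if weights is None:
        return sum(indicator_scores.values())  # plain sum
    # Weighted sum; indicators missing from `weights` default to 1.0 (assumption).
    return sum(score * weights.get(name, 1.0)
               for name, score in indicator_scores.items())
```

For example, scores of 2.0 for TP99 and 3.0 for failure rate give a node score of 5.0 when summed, or 7.0 with weights of 0.5 and 2.0 respectively.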
It can be understood that the more a target indicator value deviates from the baseline data, the more severely abnormal the performance indicator is, and hence the more severely abnormal the node is; that is, the degree to which the target indicator value deviates from the baseline data is linearly related to the degree of abnormality of the performance indicator and of the node. To express the degree of abnormality as a score, the embodiments of the present application first preset a basic score for each performance indicator, and then calculate the anomaly assessment scores of the performance indicator and the node from two elements: the basic score and the degree to which the target indicator value deviates from the baseline data. Specifically, the step of calculating, for each performance indicator, the anomaly assessment score of that performance indicator based on its target indicator value and corresponding baseline data may include:
for each performance indicator, using a preset anomaly assessment model to calculate the anomaly assessment score of that performance indicator based on its target indicator value and corresponding baseline data;
wherein the anomaly assessment model is:
$G_i = F_i \cdot f_i$
where $F_i = |C_i - M_i| / M_i$, $G_i$ is the anomaly assessment score of performance indicator $i$, $f_i$ is the preset basic score corresponding to performance indicator $i$, $C_i$ is the target indicator value of performance indicator $i$, and $M_i$ is the baseline data corresponding to performance indicator $i$.
It can be understood that the preset basic score corresponding to any performance indicator may be set according to the specific situation, which is not limited by the present application.
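The anomaly assessment model can be written directly as a function. This is a minimal sketch: the function name `indicator_score` is hypothetical, and rejecting a zero baseline is an added guard, since the source does not say how a baseline of zero should be handled.

```python
def indicator_score(target_value, baseline, base_score):
    """Anomaly assessment score G_i = F_i * f_i, with F_i = |C_i - M_i| / M_i."""
    if baseline == 0:
        # Assumption: a zero baseline makes the relative deviation undefined.
        raise ValueError("baseline must be non-zero")
    deviation = abs(target_value - baseline) / baseline  # F_i
    return deviation * base_score                        # G_i
```

With a baseline of 100, a target indicator value of 150, and a basic score of 10, the relative deviation is 0.5 and the score is 5.0; a value equal to the baseline scores 0.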
It should be noted that the baseline data corresponding to any performance indicator may be fixed or may be updated regularly. The baseline data may be set based on empirical values, or it may be calculated by the fault locating apparatus from historical data. Specifically, the baseline data corresponding to any performance indicator may be calculated as follows:
obtaining historical indicator values corresponding to the performance indicator;
removing abnormal indicator values from the historical indicator values to obtain valid indicator values;
calculating the baseline data corresponding to the performance indicator from the valid indicator values, according to the baseline calculation algorithm corresponding to the performance indicator.
It should be emphasized that the abnormal-value rejection algorithm may differ between performance indicators, and similarly the baseline calculation algorithm may differ between performance indicators. For a performance indicator without a periodic pattern, the MAD (Median Absolute Deviation, described here as mean absolute deviation) algorithm may be used to reject abnormal indicator values, and a quartile algorithm may be used to calculate the baseline data, although neither choice is limiting. For a performance indicator with a periodic pattern, the MAD algorithm may be used to reject abnormal indicator values, and the Kσ deviation algorithm may be used to calculate the baseline data, again without limitation. For example, the indicator values of TP99 follow no specific pattern, so abnormal values can be rejected with the MAD algorithm and the baseline data calculated with the quartile method; the indicator values of QPM are distributed periodically, so abnormal values can be rejected with the MAD algorithm and the baseline data calculated with the Kσ deviation algorithm.
MAD is also called the average deviation. Specifically, for a population of N units with variable values $X_1, X_2, X_3, \ldots, X_{N-1}, X_N$, the mean absolute deviation is defined as the average of the absolute values of the deviations of each value from the population mean, that is, $\frac{1}{N}\sum_{i=1}^{N} |X_i - \bar{X}|$, where $\bar{X}$ is the population mean.
The quartile algorithm, also known as the box plot method, uses statistics such as the first quartile, the median, and the third quartile to describe the overall distribution of the data, and calculates upper and lower boundary values of the data from these statistics to serve as the baseline values.
For the Kσ deviation algorithm, K can be set according to the scenario. Taking K=3 as an example: a set of measured data is first assumed to follow a normal or approximately normal distribution, its mean and standard deviation are calculated, and the baseline values are determined from the distribution properties of the normal distribution.
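The three algorithms described above can be sketched with Python's standard `statistics` module. Several details below are assumptions rather than anything the source specifies: the function names, the cutoff of `k` mean absolute deviations for rejection, the conventional box-plot whisker factor of 1.5, and the use of the population standard deviation.

```python
import statistics


def reject_outliers(values, k=3.0):
    """Keep values within k mean-absolute-deviations of the mean (k is an assumption)."""
    values = list(values)
    mean = statistics.fmean(values)
    mad = statistics.fmean([abs(v - mean) for v in values])  # mean absolute deviation
    if mad == 0:
        return values  # all values identical, nothing to reject
    return [v for v in values if abs(v - mean) <= k * mad]


def quartile_baseline(values, whisker=1.5):
    """Box-plot style (lower, upper) bounds: Q1 - w*IQR and Q3 + w*IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - whisker * iqr, q3 + whisker * iqr


def k_sigma_baseline(values, k=3.0):
    """(lower, upper) bounds at mean -/+ k standard deviations."""
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return mean - k * sigma, mean + k * sigma
```

A baseline for an aperiodic indicator such as TP99 would then be `quartile_baseline(reject_outliers(history))`, and for a periodic indicator such as QPM `k_sigma_baseline(reject_outliers(history))`, where `history` holds the historical indicator values.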
It can further be understood that whether to set the baseline data based on empirical values or to calculate it from historical data can be decided according to how the indicator values of the performance indicator are distributed.
It should be emphasized that the specific implementation given above for calculating the anomaly assessment score of each node, based on the obtained target indicator values and the baseline data corresponding to each performance indicator, is merely an example and should not be construed as limiting the embodiments of the present application.
S103: when it is determined from the anomaly assessment scores that at least one target node exists in the business system, determine the root-cause node of the fault from the at least one target node based on the target node link on which the at least one target node is located.
A target node is a node whose anomaly assessment score satisfies the preset anomaly condition.
After the anomaly assessment score of each node has been calculated, each score can be checked against the preset anomaly condition to obtain the target nodes that satisfy it. Because a target node satisfies the preset anomaly condition, it is a node with a fault. When it is determined from the anomaly assessment scores that at least one target node exists in the business system, the business system has failed, so the fault locating apparatus determines the root-cause node of the fault from the at least one target node based on the target node link on which the at least one target node is located. The preset anomaly condition may be, for example, that the score is higher than a predetermined score threshold, although it is not limited to this; the predetermined score threshold may be 0 or a value greater than 0 and may be set according to the specific situation.
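Selecting target nodes under the example condition "score higher than a predetermined threshold" is a one-line filter. The function name `find_target_nodes` and the dictionary input are illustrative assumptions.

```python
def find_target_nodes(node_scores, threshold=0.0):
    """Return the names of nodes whose anomaly score is strictly above the threshold."""
    return [name for name, score in node_scores.items() if score > threshold]
```

An empty result means no node satisfies the anomaly condition, so no root-cause search is needed for that round.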
Optionally, in one specific implementation, the step of determining the root-cause node of the fault from the at least one target node based on the target node link on which the at least one target node is located may include:
determining the target node link on which the at least one target node is located;
for each target node link, when one target node exists on the target node link, determining that target node as the root-cause node of the fault; when at least two target nodes exist on the target node link, determining the most downstream of those target nodes as the root-cause node of the fault.
It should be emphasized that, for a node link, the most downstream node among multiple nodes is the node whose own changes can affect the other nodes, for example the node at the bottom of the call hierarchy. For example, given nodes A, B, C, and D, if A is called by B, C, and D, then A is the most downstream node among them.
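The most-downstream rule can be sketched as follows. The representation is an assumption: each node link is modeled as a list ordered from the most upstream caller to the most downstream callee, and the function name `locate_root_causes` is hypothetical.

```python
def locate_root_causes(links, target_nodes):
    """Pick, for each link containing target nodes, the most downstream target node.

    links: dict mapping link name -> node names ordered upstream to downstream.
    target_nodes: iterable of abnormal node names.
    """
    targets = set(target_nodes)
    root_causes = {}
    for link_name, nodes in links.items():
        abnormal = [node for node in nodes if node in targets]
        if abnormal:
            # The last abnormal node in upstream-to-downstream order is most downstream.
            root_causes[link_name] = abnormal[-1]
    return root_causes
```

With a link ordered D, C, B, A (A being called by the others) and target nodes A and B, the function selects A, which matches the example in the text. A link with a single target node yields that node.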
In addition, since a node can be located in a plurality of node link, determining at least one destination node When the destination node link at place, the destination node chain for belonging to core link where at least one destination node can be determined Road, wherein so-called core link is the link or more important link in operation system that administrative staff more pay close attention to, Wherein core link can manually be set, and setting can also be voluntarily analyzed with system.
Optionally, in another specific implementation, the step of determining, from the at least one target node, the root-cause node that generated the fault, based on the target node link where the at least one target node is located, may include:
Determining the target node link where the at least one target node is located;
Guiding an administrator to determine, from the at least one target node, the root-cause node that generated the fault, by outputting the link graph corresponding to the target node link; wherein the target nodes are highlighted in the link graph.
Wherein it is possible to highlight destination node by color, it is not limited thereto certainly.Also, works as and pass through color Come when highlighting destination node, under the premise of being different from the color of icon of other normal nodes, each destination node Icon can have same color, it is possible to have different colours.It is understood that in a particular application, in order to reach It is preferable to distinguish effect and recognize intensity of anomaly, the destination node of predetermined value is higher than for anomaly assessment score, icon Color can be set to red, and be lower than the destination node of predetermined value for anomaly assessment score, and the color of icon can be set For yellow, so that administrative staff can recognize the intensity of anomaly of destination node by the eye-catching degree of color.
In addition, the fault locating apparatus may output the link graph directly, or may output clickable link information corresponding to the link graph, although it is not limited to these options. When link information is output, an administrator can click the link information to enter the display interface of the link graph and view the link graph there.
In the method provided by the embodiments of the present application, each node is scored for abnormality based on the target indicator values of its performance indicators and the corresponding baseline data, yielding at least one abnormal target node. Then, taking the call relationships between nodes into account, the root-cause node that generated the fault is determined from the at least one target node based on the target node link where the at least one target node is located. This scheme can therefore quickly and effectively locate the root-cause node of a fault in the business system.
Corresponding to the above method embodiment, the embodiments of the present application also provide a fault locating apparatus. As shown in Fig. 2, the fault locating apparatus may include:
A target indicator value obtaining unit 210, configured to obtain target indicator values of the performance indicators corresponding to each node in the business system;
A node score calculating unit 220, configured to calculate the anomaly assessment score of each node based on the obtained target indicator values and the baseline data corresponding to each performance indicator;
A fault locating unit 230, configured to, when it is determined based on the anomaly assessment scores that at least one target node exists in the business system, determine, from the at least one target node, the root-cause node that generated the fault, based on the target node link where the at least one target node is located;
Wherein a target node is a node whose anomaly assessment score meets the preset abnormal condition.
The apparatus provided by the embodiments of the present application scores each node for abnormality based on the target indicator values of its performance indicators and the corresponding baseline data, yielding at least one abnormal target node. Then, taking the call relationships between nodes into account, it determines the root-cause node that generated the fault from the at least one target node, based on the target node link where the at least one target node is located. This scheme can therefore quickly and effectively locate the root-cause node of a fault in the business system.
Optionally, the target indicator value obtaining unit 210 may include:
An indicator value obtaining subunit, configured to periodically obtain the target indicator values of the performance indicators corresponding to each node in the business system.
Optionally, the node score calculating unit 220 may include:
An indicator score calculating subunit, configured to, for each performance indicator, calculate the anomaly assessment score of the performance indicator based on the target indicator value of the performance indicator and the corresponding baseline data;
A node score calculating subunit, configured to, for each node, calculate the anomaly assessment score of the node based on the anomaly assessment scores of the performance indicators corresponding to the node.
Optionally, the indicator score calculating subunit is specifically configured to:
For each performance indicator, use a preset anomaly assessment model to calculate the anomaly assessment score of the performance indicator based on the target indicator value of the performance indicator and the corresponding baseline data;
Wherein the anomaly assessment model is:
$G_i = F_i \cdot f_i$
where $F_i = |C_i - M_i| / M_i$, $G_i$ is the anomaly assessment score of performance indicator $i$, $f_i$ is the preset base score corresponding to performance indicator $i$, $C_i$ is the target indicator value of performance indicator $i$, and $M_i$ is the baseline data corresponding to performance indicator $i$.
Optionally, the fault locating unit 230 may include:
A link determining subunit, configured to determine the target node link where the at least one target node is located;
A first locating subunit, configured to, for each target node link: when one target node exists in the target node link, determine that target node to be the root-cause node that generated the fault; when at least two target nodes exist in the target node link, determine the most downstream of those target nodes to be the root-cause node that generated the fault.
Optionally, the fault locating unit 230 may include:
A link determining subunit, configured to determine the target node link where the at least one target node is located;
A second locating subunit, configured to guide an administrator to determine, from the at least one target node, the root-cause node that generated the fault, by outputting the link graph corresponding to the target node link; wherein the target nodes are highlighted in the link graph.
In addition, corresponding to the above fault locating method, the embodiments of the present application also propose an electronic device. Referring to Fig. 3, at the hardware level the electronic device includes a processor 310, an internal bus 320, a network interface 330, a memory 340 and a non-volatile memory 350, and may also include hardware required by other services. The processor 310 reads the corresponding computer program from the non-volatile memory 350 into the memory 340 and then runs it to execute the fault locating method provided herein, forming a fault locating apparatus at the logical level. Besides a software implementation, the present application does not exclude other implementations, such as logic devices or a combination of software and hardware. That is, the execution subject of the processing flow is not limited to the logic units and may also be hardware or a logic device.
In addition, the embodiments of the present application also provide a computer-readable storage medium storing a computer program, the computer program being used to execute the above fault locating method.
For the functions of the units in the above apparatus and how they are implemented, see the implementation of the corresponding steps in the above method; details are not repeated here.
Because the apparatus embodiment basically corresponds to the method embodiment, the relevant parts can be found in the description of the method embodiment. The apparatus embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the present application. Those of ordinary skill in the art can understand and implement it without creative effort.
The above are merely preferred embodiments of the present application and are not intended to limit it. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present application shall fall within its scope of protection.

Claims (14)

1. A fault locating method, characterized by comprising:
Obtaining target indicator values of the performance indicators corresponding to each node in a business system;
Calculating an anomaly assessment score of each node based on the obtained target indicator values and the baseline data corresponding to each performance indicator;
When it is determined based on the anomaly assessment scores that at least one target node exists in the business system, determining, from the at least one target node, a root-cause node that generated the fault, based on the target node link where the at least one target node is located;
Wherein a target node is a node whose anomaly assessment score meets a preset abnormal condition.
2. The method according to claim 1, characterized in that the step of obtaining target indicator values of the performance indicators corresponding to each node in the business system comprises:
Periodically obtaining the target indicator values of the performance indicators corresponding to each node in the business system.
3. The method according to claim 1 or 2, characterized in that the step of calculating the anomaly assessment score of each node based on the obtained target indicator values and the baseline data corresponding to each performance indicator comprises:
For each performance indicator, calculating the anomaly assessment score of the performance indicator based on the target indicator value of the performance indicator and the corresponding baseline data;
For each node, calculating the anomaly assessment score of the node based on the anomaly assessment scores of the performance indicators corresponding to the node.
4. The method according to claim 3, characterized in that the step of, for each performance indicator, calculating the anomaly assessment score of the performance indicator based on the target indicator value of the performance indicator and the corresponding baseline data comprises:
For each performance indicator, using a preset anomaly assessment model to calculate the anomaly assessment score of the performance indicator based on the target indicator value of the performance indicator and the corresponding baseline data;
Wherein the anomaly assessment model is:
$G_i = F_i \cdot f_i$
where $F_i = |C_i - M_i| / M_i$, $G_i$ is the anomaly assessment score of performance indicator $i$, $f_i$ is the preset base score corresponding to performance indicator $i$, $C_i$ is the target indicator value of performance indicator $i$, and $M_i$ is the baseline data corresponding to performance indicator $i$.
5. The method according to claim 1 or 2, characterized in that the step of determining, from the at least one target node, the root-cause node that generated the fault, based on the target node link where the at least one target node is located, comprises:
Determining the target node link where the at least one target node is located;
For each target node link: when one target node exists in the target node link, determining that target node to be the root-cause node that generated the fault; when at least two target nodes exist in the target node link, determining the most downstream of those target nodes to be the root-cause node that generated the fault.
6. The method according to claim 1 or 2, characterized in that the step of determining, from the at least one target node, the root-cause node that generated the fault, based on the target node link where the at least one target node is located, comprises:
Determining the target node link where the at least one target node is located;
Guiding an administrator to determine, from the at least one target node, the root-cause node that generated the fault, by outputting the link graph corresponding to the target node link; wherein the target nodes are highlighted in the link graph.
7. The method according to claim 1 or 2, characterized in that the baseline data corresponding to any performance indicator is calculated by:
Obtaining the historical indicator values corresponding to the performance indicator;
Removing the abnormal indicator values from the historical indicator values to obtain valid indicator values;
Calculating the baseline data corresponding to the performance indicator from the valid indicator values, according to the baseline calculation algorithm corresponding to the performance indicator.
8. A fault locating apparatus, characterized by comprising:
A target indicator value obtaining unit, configured to obtain target indicator values of the performance indicators corresponding to each node in a business system;
A node score calculating unit, configured to calculate an anomaly assessment score of each node based on the obtained target indicator values and the baseline data corresponding to each performance indicator;
A fault locating unit, configured to, when it is determined based on the anomaly assessment scores that at least one target node exists in the business system, determine, from the at least one target node, a root-cause node that generated the fault, based on the target node link where the at least one target node is located;
Wherein a target node is a node whose anomaly assessment score meets a preset abnormal condition.
9. The apparatus according to claim 8, characterized in that the target indicator value obtaining unit comprises:
An indicator value obtaining subunit, configured to periodically obtain the target indicator values of the performance indicators corresponding to each node in the business system.
10. The apparatus according to claim 8 or 9, characterized in that the node score calculating unit comprises:
An indicator score calculating subunit, configured to, for each performance indicator, calculate the anomaly assessment score of the performance indicator based on the target indicator value of the performance indicator and the corresponding baseline data;
A node score calculating subunit, configured to, for each node, calculate the anomaly assessment score of the node based on the anomaly assessment scores of the performance indicators corresponding to the node.
11. The apparatus according to claim 8 or 9, characterized in that the fault locating unit comprises:
A link determining subunit, configured to determine the target node link where the at least one target node is located;
A first locating subunit, configured to, for each target node link: when one target node exists in the target node link, determine that target node to be the root-cause node that generated the fault; when at least two target nodes exist in the target node link, determine the most downstream of those target nodes to be the root-cause node that generated the fault.
12. The apparatus according to claim 8 or 9, characterized in that the fault locating unit comprises:
A link determining subunit, configured to determine the target node link where the at least one target node is located;
A second locating subunit, configured to guide an administrator to determine, from the at least one target node, the root-cause node that generated the fault, by outputting the link graph corresponding to the target node link; wherein the target nodes are highlighted in the link graph.
13. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the fault locating method according to any one of claims 1-7 when executing the program.
14. A computer-readable storage medium, characterized in that the storage medium stores a computer program, the computer program being used to execute the fault locating method according to any one of claims 1-7.
CN201810142390.4A 2018-02-11 2018-02-11 Fault positioning method and device and electronic equipment Active CN110166264B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810142390.4A CN110166264B (en) 2018-02-11 2018-02-11 Fault positioning method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810142390.4A CN110166264B (en) 2018-02-11 2018-02-11 Fault positioning method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN110166264A true CN110166264A (en) 2019-08-23
CN110166264B CN110166264B (en) 2022-03-08

Family

ID=67635085

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810142390.4A Active CN110166264B (en) 2018-02-11 2018-02-11 Fault positioning method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN110166264B (en)

Also Published As

Publication number Publication date
CN110166264B (en) 2022-03-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant