CN104734871A

CN104734871A - Method and device for positioning failures

Info

Publication number: CN104734871A
Application number: CN201310711392.8A
Authority: CN
Inventors: 郭宪杰; 申山宏; 刘淑霞; 尚尔刚
Original assignee: ZTE Corp
Current assignee: ZTE Corp
Priority date: 2013-12-20
Filing date: 2013-12-20
Publication date: 2015-06-24
Also published as: WO2015090098A1; CN105659528B; CN105659528A

Abstract

The invention discloses a method and device for positioning failures. The method includes the steps of obtaining current failure information, establishing a conduction chain set of all monitoring objects for all failure types in a preset time window at different time points according to the obtained current failure information, analyzing the relevance of conduction chains in the conduction chain set to obtain the failure object conduction chains of all the monitoring objects for different failure types, and positioning the failure objects and the failure types according to the failure object conduction chains. By means of the method, root failure positioning and efficient order sending can be rapidly and accurately achieved, and the efficiency of daily network maintenance and failure order sending is improved.

Description

Method and device for fault location

Technical field

The present invention relates to network management technology, and in particular to a method and device for fault location.

Background art

An existing network management system manages each monitored object. The parameters of the monitored objects, including their name identifiers and connection relationships, usually have to be configured through a network configuration function. For example, the monitored objects may be one switch and four computers, with the switch connected to all four computers. Once this configuration data exists, the management system knows every object it manages, and it normally identifies monitored objects by their identifier names, such as Switcher100, Computer100, Computer101, Computer102 and Computer103.

Maintenance personnel are usually notified once a monitoring result for a monitored object reaches a fault threshold. For example, an alarm is required when CPU utilization reaches 96% or more. At that point the monitored object sends a message to its supervisor (the network management system), and the message contains information such as the object type, the object identifier, the monitored indicator, the current indicator value and the alarm name, for example: Computer, ID=100, CPU, 98%, Computer CPU utilization too high. From the point of view of the network management system, all such alarm data is reported by the monitored objects, and the message type can be user-defined.

After alarm data is reported by a monitored object, the message type, message object and object identifier can be extracted according to the interface definition. For example, on receiving the message "Computer, ID=100, CPU, 98%, Computer CPU utilization too high" described above, the system knows that Computer100 is in an abnormal state.
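The extraction step described above can be sketched in Python. This is only an illustration: the description states that the message type is user-defined, so the five-field comma-separated layout, the `Alarm` class and the `parse_alarm` function are assumptions made for this example rather than an interface defined by the invention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    object_type: str   # e.g. "Computer"
    object_id: int     # e.g. 100
    indicator: str     # monitored indicator, e.g. "CPU"
    value: float       # current indicator value in percent
    alarm_name: str    # human-readable alarm name

    @property
    def object_name(self) -> str:
        # Identifier name used by the management system, e.g. "Computer100".
        return f"{self.object_type}{self.object_id}"


def parse_alarm(message: str) -> Alarm:
    # Split into at most five fields so that commas inside the alarm name survive.
    fields = [part.strip() for part in message.split(",", 4)]
    if len(fields) != 5:
        raise ValueError(f"expected 5 comma-separated fields, got {len(fields)}")
    object_type, id_field, indicator, value_field, alarm_name = fields
    key, _, raw_id = id_field.partition("=")
    if key.strip() != "ID":
        raise ValueError(f"expected an 'ID=<number>' field, got {id_field!r}")
    return Alarm(object_type, int(raw_id), indicator,
                 float(value_field.rstrip("%")), alarm_name)


alarm = parse_alarm("Computer, ID=100, CPU, 98%, Computer CPU utilization too high")
print(alarm.object_name, alarm.indicator, alarm.value)
```

Combining the object type and identifier into `object_name` mirrors the way the description refers to the affected object as Computer100.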

In a complex real-world network, a single fault can cause many monitored objects to fail. Typical cases are a power failure, after which none of the affected monitored objects may work normally, and an interrupted transmission line, which blocks communication across a whole area. Hundreds of alarm messages may be reported within one or two minutes. If the root alarm among them can be located quickly and repaired first, the other alarms may clear automatically. How to locate the root alarm quickly is therefore the focus of prior-art analysis. The usual approach relies on the connection relationships between monitored network objects (for example, Switcher100 is connected to the four computers Computer100 and so on) and on the causal relationships between services (for example, a power failure and low voltage have a sequential or causal relationship). These connection and causal relationships are summarized into an alarm knowledge base or alarm empirical rules, which are then used to locate and analyze faults from the alarm data.

Locating and analyzing faults from alarm data with an existing alarm knowledge base or alarm empirical rules is the main method of network maintenance today. However, monitoring an entire network produces massive amounts of alarm data, and alarm correlation analysis across network devices and across management systems is very difficult. In particular, periodic network construction and routine maintenance keep the network in a state of constant dynamic change, and such dynamic configuration changes make a priori alarm empirical rules highly inaccurate. As a result, the root fault cannot be located quickly and accurately, and the efficiency of routine network maintenance and of fault work-order dispatch cannot be improved.

Summary of the invention

To solve the above technical problem, the present invention provides a method and device for fault location that can locate the root fault quickly and accurately and improve the efficiency of routine network maintenance and fault work-order dispatch.

To achieve the above object, the invention discloses a method for fault location, comprising:

Obtaining current fault information, the current fault information comprising at least a monitored object, a fault type and time information;

Establishing, according to the obtained current fault information, a conduction chain set of all monitored objects for different fault types within a predetermined time window at different time points;

Analyzing the correlation between the conduction chains in the established conduction chain set to obtain the fault object conduction chain of all monitored objects for different fault types;

Locating the current fault object and fault type according to the obtained fault object conduction chain.

Preferably, the above method may further have the following feature: before the current fault information is obtained, the method further comprises establishing a fault metadata library according to obtained historical fault information.

Preferably, the above method may further have the following feature: before the conduction chain set is established, the method further comprises determining whether the current fault information is present in the historical fault information.

Preferably, the above method may further have the following feature: establishing the conduction chain set of all monitored objects for different fault types within a predetermined time window at different time points comprises:

Obtaining the conduction chain of the monitored object for the current fault type within the predetermined time window at the current time point;

Establishing, according to the historical fault information, the conduction chain set of the current monitored object for the current fault type within the predetermined time window at different time points.

Preferably, the above method may further have the following feature: analyzing the correlation between the conduction chains in the conduction chain set to obtain the fault object conduction chain of all monitored objects for different fault types comprises:

Obtaining, for each monitored object in the conduction chain set, the number of times each fault type occurs; calculating the ratio of the number of times each monitored object exhibits that fault to the total number of fault occurrences across all monitored objects; and taking the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.

Preferably, the above method may further have the following feature: when it is determined that the current fault information is not present in the historical fault information, the method further comprises:

Analyzing the correlation between the conduction chains in the conduction chain set to obtain the fault object conduction chain of all monitored objects for different fault types comprises:

Obtaining, for each monitored object in the current conduction chain set, the number of times each fault type occurs; calculating the ratio of the number of times each monitored object exhibits that fault to the total number of fault occurrences across all monitored objects in the current conduction chain; and taking the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.

Preferably, the above method may further have the following feature: after the fault object conduction chain of all monitored objects for different fault types is obtained, the method further comprises:

Obtaining, according to the fault object conduction chain, the fault conduction chain for different monitored objects, and locating the fault object and fault type according to the fault conduction chain of the different monitored objects; or,

Obtaining, according to the fault object conduction chain, the object conduction chain for different fault types, and locating the fault object and fault type according to the object conduction chain of the different fault types.

The invention also discloses a device for fault location, comprising:

A receiving module, configured to obtain current fault information, the current fault information comprising at least a monitored object, a fault type and time information;

A first establishing module, configured to establish, according to the obtained current fault information, the conduction chain set of all monitored objects for different fault types within a predetermined time window at different time points, and to output it to a second establishing module;

A second establishing module, configured to analyze the correlation between the conduction chains in the conduction chain set established by the first establishing module, obtain the fault object conduction chain of all monitored objects for all fault types, and output it to a locating module;

A locating module, configured to locate the fault object and fault type according to the fault object conduction chain from the second establishing module.

Preferably, the above device may further have the following feature: the device further comprises a fault metadata establishing module, configured to establish a fault metadata library according to obtained fault information and to pass the fault metadata library information to the first processing module.

Preferably, the above device may further have the following feature: the first establishing module is further configured to determine whether the current fault information is present in the historical fault information;

When it is determined that the current fault information is present in the historical fault information, to obtain the conduction chain of the monitored object for the current fault type within the predetermined time window at the current time point; to establish, according to the fault information, the conduction chain set of the current monitored object for the current fault type within the predetermined time window at different time points; and to send a first notification to the second establishing module.

Preferably, the above device may further have the following feature: the second establishing module is specifically configured to:

Receive the first notification from the first establishing module; obtain the number of times each monitored object in the conduction chain set fails; calculate the ratio of the number of times each monitored object fails to the total number of fault occurrences across all monitored objects; and take the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.

Preferably, the above device may further have the following feature: the first establishing module is further configured to send a second notification to the second establishing module when it determines that no historical fault information exists before the current fault information is obtained;

The second establishing module is further configured to receive the second notification from the first establishing module; obtain the number of times each monitored object in the current conduction chain set fails; calculate the ratio of the number of times each monitored object fails to the total number of fault occurrences across all monitored objects in the current conduction chain; and take the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.

Preferably, the above device may further have the following feature: the locating module is further configured to:

Obtain, according to the fault object conduction chain, the fault conduction chain for different monitored objects, and locate the fault object and fault type according to the obtained fault conduction chain of the different monitored objects;

Or obtain, according to the fault object conduction chain, the object conduction chain for different fault types, and locate the fault object and fault type according to the object conduction chain of the different fault types.

The technical solution comprises: obtaining current fault information, which comprises a monitored object, a fault type and time information; establishing, according to the obtained current fault information, the conduction chain set of all monitored objects for different fault types within a predetermined time window at different time points; analyzing the correlation between the conduction chains in the established conduction chain set to obtain the fault object conduction chain of all monitored objects for all fault types; and locating the fault object and fault type according to the obtained fault object conduction chain. The technical solution of the present application does not need to work out, one by one, the connection relationships between monitored objects and the causal relationships between fault types, which avoids a high time cost and meets real-time requirements. It does not insist on logical causality but instead judges strong correlation, which accommodates the uncertainty introduced by possible changes. The handling priority is decided by the degree of correlation, in line with the monitoring and maintenance capability, so that faults are located by more flexible means.

Brief description of the drawings

The accompanying drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the invention and their description serve to explain the invention and do not unduly limit it. In the drawings:

Fig. 1 is a flow chart of the method for fault location according to the present invention;

Fig. 2 is a flow chart of an embodiment of the method for fault location according to the present invention;

Fig. 3 is a schematic structural diagram of the device for fault location according to the present invention.

Detailed description of the embodiments

The present invention is described in detail below with reference to the drawings and specific embodiments.

Fig. 1 is a flow chart of the method for fault location according to the present invention, which comprises the following steps:

Step 101: obtain current fault information.

The current fault information comprises a monitored object, a fault type and time information.

Preferably, before the current fault information is obtained, the method may further comprise:

Establishing a fault metadata library according to historical fault information.

Specifically: first, identify the monitored objects and fault types of minimum granularity from the existing fault information of the whole network; then establish a basic fault metadata library from these minimum-granularity monitored objects and fault types.

By way of illustration, monitored objects are the main focus of network management. A monitored object can be repaired after a minor fault but can only be replaced after a serious one. Each monitored object usually consists of several different components, and from a maintenance point of view the monitored object of minimum granularity is the smallest unit that can be replaced. Take a switch as an example. If it is a small, highly integrated switch whose individual ports cannot be replaced, a serious fault on any port requires the whole switch to be replaced, so the minimum granularity of this monitored object is the switch itself. If it is a larger switch in which the component for each port can be replaced, the minimum granularity is each port of the switch, and the port component can be replaced when that port fails. In that case the monitored object of minimum granularity is identified by the port number under the switch.

The fault metadata library keeps growing as the network of monitored objects expands and as fault types become richer. Because the amount of fault metadata is limited, entries are only added and never deleted, which ensures that historical faults remain usable as monitoring continues.
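The add-only behaviour of the fault metadata library can be sketched as follows. The class name `FaultMetadataLibrary`, its methods and the `Switcher100/port3` naming of a port-level object are hypothetical; the description only requires that minimum-granularity monitored objects and fault types are recorded and never removed.

```python
class FaultMetadataLibrary:
    """Append-only registry of minimum-granularity objects and fault types."""

    def __init__(self):
        self._objects = set()
        self._fault_types = set()

    def register(self, monitored_object: str, fault_type: str) -> bool:
        """Add both entries; return True if either one was not known before."""
        is_new = (monitored_object not in self._objects
                  or fault_type not in self._fault_types)
        self._objects.add(monitored_object)
        self._fault_types.add(fault_type)
        return is_new

    def contains(self, monitored_object: str, fault_type: str) -> bool:
        return (monitored_object in self._objects
                and fault_type in self._fault_types)

    # Deliberately no remove() method: entries are only ever added.


library = FaultMetadataLibrary()
first_seen = library.register("Switcher100/port3", "link down")
seen_again = library.register("Switcher100/port3", "link down")
```

Returning whether an entry was new is a design choice of this sketch; it gives the caller a simple way to recognise objects or fault types that have no history yet.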

Step 102: according to the fault metadata library, establish the conduction chain set of all monitored objects for different fault types within a predetermined time window at different time points.

Specifically:

First, obtain the conduction chain of the current monitored object for the current fault type within the predetermined time window at the current time point.

Second, if historical fault information exists before the current fault information is obtained, establish, according to the historical fault information, the conduction chain set of the current monitored object for the current fault type within the predetermined time window at different time points; if no historical fault information exists before the current fault information is obtained, proceed to step 103.

Preferably, a conduction chain is defined as the sequence of object faults that a given object fault can affect after it occurs.
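A minimal sketch of building one conduction chain from a stream of fault reports is given below. The `FaultEvent` record, the function name and the sample objects are assumptions for illustration; the 3-minute default window follows the empirical value given later in the description, and treating faults reported at exactly the trigger time as part of the chain is a choice made here, not something the description specifies.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class FaultEvent(NamedTuple):
    obj: str
    fault_type: str
    time: datetime


def conduction_chain(trigger, events, window=timedelta(minutes=3)):
    """Return the faults reported within `window` after `trigger`, in time order."""
    end = trigger.time + window
    followers = [e for e in events
                 if e != trigger and trigger.time <= e.time <= end]
    return sorted(followers, key=lambda e: e.time)


day = datetime(2013, 12, 20)  # arbitrary date chosen for the example
trigger = FaultEvent("Generator1", "output voltage low",
                     day.replace(hour=20, minute=3))
events = [
    trigger,
    FaultEvent("Switcher100", "power lost", day.replace(hour=20, minute=4)),
    FaultEvent("Computer100", "unreachable",
               day.replace(hour=20, minute=5, second=30)),
    # Reported seven minutes later, so it falls outside the 3-minute window.
    FaultEvent("Computer101", "CPU too high", day.replace(hour=20, minute=10)),
]
chain = conduction_chain(trigger, events)
```

Repeating this for every past occurrence of the same fault on the same object yields the conduction chain set used in the following step.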

Step 103: analyze the correlation between the conduction chains in the established conduction chain set to obtain the fault object conduction chain of all monitored objects for different fault types.

Specifically:

If historical fault information exists before the current fault information is obtained, obtain, for each monitored object in the conduction chain set, the number of times each fault type occurs; calculate the ratio of the number of times each monitored object exhibits that fault to the total number of fault occurrences across all monitored objects; and take the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain. Or

If no historical fault information exists before the current fault information is obtained, obtain the number of times each monitored object in the current conduction chain set fails; calculate the ratio of the number of times each monitored object fails to the total number of fault occurrences across all monitored objects in the current conduction chain; and take the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.
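The ratio test in this step can be sketched as below. The sketch takes the wording literally, dividing each object's occurrence count by the total number of fault occurrences over all objects; the description is not fully explicit about the denominator, so this reading, the function name and the 0.3 threshold are assumptions.

```python
from collections import Counter


def fault_object_chain(chains, threshold):
    """Return objects whose share of all fault occurrences exceeds `threshold`."""
    counts = Counter(obj for chain in chains for obj in chain)
    total = sum(counts.values())
    if total == 0:
        return []
    return sorted(obj for obj, n in counts.items() if n / total > threshold)


# Three past conduction chains observed after the same trigger fault.
history = [
    ["Switcher100", "Computer100", "Computer101"],
    ["Switcher100", "Computer100"],
    ["Switcher100", "Computer100", "Computer103"],
]
strong = fault_object_chain(history, threshold=0.3)
```

Here Switcher100 and Computer100 each account for 3 of the 8 recorded occurrences and pass the threshold, while the objects that appeared only once do not.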

Preferably, the above method further comprises:

Obtaining, according to the fault object conduction chain, the fault conduction chain for different monitored objects, and locating the fault object and fault type according to the fault conduction chain. Or,

Obtaining, according to the fault object conduction chain, the object conduction chain for different fault types, and locating the fault object and fault type according to the object conduction chain.

The initially reported current fault information comprises basic information such as the monitored object, fault type and time. It serves as the basis for the basic correlation judgment, and the data comes from the network element objects being monitored. If the initial historical data is empty, every correlation is provisionally set to a 100% strong correlation; because the sample count is only 1, however, its confidence and priority are lowered. As historical data accumulates, the correlation can be calculated with increasing reliability.

First, the predetermined threshold can be adjusted in practice.

Second, the fault object conduction chain is defined as the set of strongly correlated object faults affected by a given fault type of a monitored object.

Third, the fault conduction chain is defined as the limited set of faults strongly correlated with a given fault, that is, the other fault types on the chain (possibly on different objects) that this fault readily causes when it occurs.

Finally, the object conduction chain is defined as the limited set of objects strongly correlated with a given object, that is, the other objects on the chain (possibly with different faults) that are readily affected when any fault occurs on this object.

Step 104: locate the fault object and fault type according to the obtained fault object conduction chain.

When the network management system monitors every monitored object and fault type across the whole network, the above method abandons the existing corpus-based analysis method. Instead, working from real-time, dynamic fault information, it finds the strong correlations in the spatial and temporal distribution of monitored objects and fault types in the network and, with reference to the correlations of object chains in the historical fault information (including but not limited to monitored object, connection, fault time and fault type), judges the strong correlation between fault objects.

The present invention does not insist on logical causality but instead judges strong correlation, which accommodates the uncertainty introduced by possible changes. The handling priority is decided by the degree of correlation, in line with the monitoring and maintenance capability, so that fault location is achieved by more flexible means.

Fig. 2 is a detailed flow chart of the method for fault location according to the present invention, which comprises the following steps:

Step 201: obtain current fault information, which comprises basic information such as the monitored object, fault type and time.

Step 202: establish a fault metadata library according to historical data; the established fault metadata library comprises the monitored objects and fault types of minimum granularity.

Specifically:

Without any prior knowledge, identify the minimum-granularity monitored objects O_n and fault types F_m from the existing fault information of the whole network, and establish a basic fault metadata library from them.

The fault metadata library keeps growing as the network of monitored objects is expanded and as fault types become richer.

The initially reported current fault information comprises basic information such as the monitored object, fault type and time. It serves as the basis for the basic correlation judgment, and the data comes from the network element objects being monitored. If the initial historical data is empty, every correlation is provisionally set to a 100% strong correlation; because the sample count is only 1, however, its confidence and priority are lowered. As historical data accumulates, the correlation can be calculated with increasing reliability.

A newly added fault type, or a changed fault type, cannot be found in the fault metadata library; it is treated as initial fault information and calculated as a strong correlation. Likewise, a newly added monitored object, or a monitored object whose identifier has changed, cannot be found in the fault metadata library; it is treated as initial fault information and calculated as a strong correlation.

For a monitored object whose identifier has changed, its correlations will eventually be the same as those computed for the original monitored object.
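One way to picture the handling of relations without history is the sketch below: a relation seen once gets a correlation of 100% but is ranked behind relations backed by more observations. The description does not give a formula for the lowered priority, so the `Relation` record, the `min_observations` cut-off of 2 and the sort order are all assumptions of this example.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    source: str
    target: str
    co_occurrences: int   # times the target fault followed the source fault
    observations: int     # times the source fault was observed at all

    @property
    def correlation(self) -> float:
        return self.co_occurrences / self.observations


def prioritize(relations, min_observations=2):
    """Well-supported relations first, then by descending correlation."""
    return sorted(
        relations,
        key=lambda r: (r.observations >= min_observations, r.correlation),
        reverse=True,
    )


new_relation = Relation("ObjectA", "ObjectB", co_occurrences=1, observations=1)
old_relation = Relation("ObjectC", "ObjectD", co_occurrences=9, observations=10)
ranked = prioritize([new_relation, old_relation])
```

With these inputs the established relation (correlation 0.9 over ten observations) is ranked ahead of the new one (correlation 1.0 over a single observation).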

Step 203: obtain the conduction chain set L_ij0 within the time window T₀ at the current time point.

Specifically: obtain the conduction chain set L_ij0 of the current monitored object for the current fault within the predetermined time window T₀ at the current time point.

Here the conduction chain set L_ij0 denotes the set of conduction chains formed, in time order, by the monitored objects and fault types that appear as time passes after a given fault occurs.

Step 204: determine whether historical data exists. If it does, proceed to step 205; if not, proceed to step 206.

Step 205: according to the historical data, establish the conduction chain set L_ijk for time point T_k.

Specifically:

First, establish, according to the historical fault information, the conduction chain set of the current monitored object for the current fault type within the predetermined time window at different time points.

Finally, for each fault type F_j of each monitored object O_i, establish the conduction chain set at time point T_k.

The conduction chain L_ijk is defined as the time-ordered set of object faults that occur within the time T₀ following time point T_k, at which fault type F_j of object O_i occurred.

By way of illustration, suppose the low-output-voltage fault F_j of generator O_i occurs at 20:03 one evening. The time-ordered set of all fault objects that appear within the following time T₀ can be regarded as the nodes on this fault object's conduction chain at that time point. T₀ is an empirical value, typically 3 minutes or 5 minutes.

Step 206: analyze the strong correlation between the conduction chains to obtain the fault object conduction chain L_ij of all monitored objects for all fault types.

The correlation between the conduction chains is determined as follows:

If historical fault information exists before the current fault information is obtained, obtain, for each monitored object in the conduction chain set, the number of times each fault type occurs; calculate the ratio of the number of times each monitored object exhibits that fault to the total number of fault occurrences across all monitored objects; and take the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain. Or

If no historical fault information exists before the current fault information is obtained, obtain, for each monitored object in the current conduction chain set, the number of times each fault type occurs; calculate the ratio of the number of times each monitored object exhibits that fault to the total number of fault occurrences across all monitored objects in the current conduction chain; and take the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.

The predetermined threshold can be adjusted in practice.

By way of illustration, first suppose that fault type F_j of monitored object O_i occurs, and let the set of all fault objects within its conduction time T₀ be L_ijk = F(O_i, F_j, T_k). Analysis of the historical data shows that fault type F_j of this monitored object O_i has occurred K-1 times before, so K fault conduction chains have accumulated in total.

Then, the K fault conduction chains contain M_k fault objects in total. Count the number of times each of these monitored objects appears in the K-1 historical conduction chains, ∑ C_km = Count(L_ijk, O_m) (k = 1, 2, ..., K-1), to obtain the occurrence count of each of the M_k monitored objects. For normalization, the frequency of occurrence can be calculated, that is, the occurrence count as a percentage of the total.

Finally, a fault object whose frequency is 100% has the highest degree of correlation and is in a strong causal relationship. In a real production environment, however, the fault object chain changes as the network changes, so an empirical cut-off of 90% or more can be used for the frequency, or the priority order of fault objects can be decided by frequency from high to low. The fault object conduction chain L_ij is defined as the set of strongly correlated object faults affected by fault type F_j of object O_i.
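The counting and normalization in this illustration can be written out as follows. This is one reading of the text rather than a formula taken from it: Count is assumed to be 1 when object O_m appears in chain L_ijk and 0 otherwise, and the "total" used for normalization is assumed to be the number of historical chains, K-1, which is the reading under which an object present in every chain reaches 100%.

```latex
C_m = \sum_{k=1}^{K-1} \mathrm{Count}(L_{ijk}, O_m),
\qquad
f_m = \frac{C_m}{K-1},
\qquad
L_{ij} = \{\, O_m : f_m \ge 0.9 \,\}
```

The cut-off 0.9 corresponds to the empirical value of 90% mentioned above and would be tuned in practice.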

By way of illustration, a complex communication network contains network subsystems such as the wireless base station network, the backbone transmission network, the IT monitoring network, and the power and environment monitoring network. To simplify the network model, suppose the network has three monitored nodes: power supply P₁, transmission T₁ and base station S₁. These three objects are causally related: after a power interruption the transmission loses power and the base station also stops providing service; when the power supply is normal, an abnormal transmission interruption prevents the base station from providing service. That is: P₁-->(T₁-->S₁).

After a transmission T₁ interruption fault occurs, many faults are found to be reported within its T₀ time period. Among them, the base station S₁ interruption appears after it in the time sequence, and other faults are also produced around the same time. Correlation analysis against the conduction chains in the historical data shows that (T₁-->S₁) occurs very frequently; ideally it occurs every time, reaching 100%, whereas other, randomly occurring faults have a lower occurrence frequency and hence a lower degree of correlation.

Likewise, after a power-failure fault of power supply P₁ occurs, T₁ and S₁ are found to appear afterwards in the time sequence on its conduction chain, with a very high degree of correlation. (P₁-->T₁) and (P₁-->S₁) are the conduction chains of power supply P₁, and P₁-->(T₁-->S₁) is a larger conduction chain.

However, when network expansion or a maintenance change means that transmission T₁ is no longer connected to base station S₁ but to S₂, the relation (T₁-->S₁) no longer occurs and (T₁-->S₂) becomes the new conduction relation. Because no historical data exists when this conduction relation first appears, it is treated as a strong association that has occurred only once (initially, anything that has occurred once is treated as a 100% strong association, but with lowered priority). (P₁-->T₁) and (P₁-->S₂) are then the conduction chains of power supply P₁, and once the relation has occurred a second time or more, its priority is raised.
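The stable-topology part of this example, in which S₁ follows every T₁ interruption while unrelated faults appear only occasionally, can be sketched as below. The four historical chains and the placeholder objects X and Y are invented for illustration, and the per-chain frequency is one reading of how the occurrence frequency is normalized.

```python
from collections import Counter


def follower_frequency(chains):
    """Map each object to the fraction of chains in which it appears."""
    if not chains:
        return {}
    counts = Counter(obj for chain in chains for obj in set(chain))
    return {obj: n / len(chains) for obj, n in counts.items()}


# Objects that failed within T0 after each of four past T1 interruptions.
t1_history = [
    {"S1"},
    {"S1", "X"},   # X and Y stand for unrelated faults that happened to coincide
    {"S1"},
    {"S1", "Y"},
]
frequency = follower_frequency(t1_history)
strong = sorted(obj for obj, f in frequency.items() if f >= 0.9)
```

Only S1 reaches the 90% cut-off, so (T1-->S1) is the relation kept on the conduction chain. The sketch does not model the rewiring to S₂.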

Step 207: according to the fault object conduction chain L_ij, find the root fault on the fault object conduction chain and locate the monitored object and fault type.

The above method can generate a strong-association spanning tree based on monitored objects and fault types. After a fault occurs, all alarms monitored along the time axis can be presented automatically, grouped by strong association according to the object conduction chain L_ij. This presentation helps users analyze and locate faults, makes it easier to issue a single unified work order for a class of site problems, and, combined with historical data, simplifies troubleshooting and improves efficiency.

Step 208: on the basis of step 206, the above method may further comprise:

According to the fault object conduction chain L_ij, obtain the object conduction chain L_i for different fault types, and locate the fault object and fault type according to the object conduction chain L_i, where:

The object conduction chain L_i is defined as the limited set of objects strongly correlated with object O_i, that is, the other objects on the chain (possibly with different faults) that are readily affected when any fault occurs on this object.

The object conduction chain L_i is determined as follows:

Several fault types may be detected on one object O_i, and for each fault type F_j a conduction chain L_ij (j = 1, ..., m) can be calculated; a conduction chain contains the affected monitored objects and the faults detected on them. Across the object fault sets of these conduction chains, the frequency with which each object fault appears in the sets is calculated to judge the correlation between the conduction chains, in the same way as in the determination method described above.
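Deriving an object conduction chain L_i from the per-fault-type chains of a single object can be sketched like this. The rack, board and fan names, the fault type labels and the 0.9 threshold are hypothetical; the sketch only keeps objects that appear in nearly all of the object's chains, whatever the fault type.

```python
from collections import Counter


def object_conduction_chain(chains_by_fault_type, threshold=0.9):
    """Objects present in more than `threshold` of one object's per-fault chains."""
    n = len(chains_by_fault_type)
    if n == 0:
        return []
    counts = Counter(obj for chain in chains_by_fault_type.values()
                     for obj in set(chain))
    return sorted(obj for obj, c in counts.items() if c / n > threshold)


# Hypothetical chains L_ij for one rack, keyed by the fault type F_j detected on it.
rack_chains = {
    "communication failure": {"Board1", "Board2"},
    "power fault": {"Board1", "Board2", "Fan1"},
    "clock loss": {"Board1", "Board2"},
}
l_i = object_conduction_chain(rack_chains)
```

Board1 and Board2 are affected regardless of which fault the rack reports, which is the fault-type-independent relationship that the object conduction chain is meant to expose.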

By way of illustration, consider the boards in a rack: a serious communication fault detected on the rack affects the communication capability of every board. A relationship like this, which has little to do with the fault type and instead reflects a parent-child relationship between objects, can be discovered and mined by means of the object conduction chain, and during fault recovery the parent fault node at the root of the conduction chain can be investigated first.

Objects with strong correlation can be aggregated into one large object package, and the faults in an object package can be assigned to a single on-site fault team; among the strongly correlated faults in the package, the fault node at the root of the conduction chain can be investigated first. Or

Step 209: according to the above fault object conduction chain L_ij, obtain the fault conduction chain L_j for different monitored objects, and locate the fault object and fault type according to the fault conduction chain L_j, wherein:

The fault conduction chain L_j is defined as the finite set of faults strongly correlated with fault F_j; that is, when this fault occurs it readily causes the other fault types on the chain, possibly on different monitored objects.

The fault conduction chain L_j is determined as follows: a fault F_j can be detected on multiple objects, so for each fault type F_j the conduction chain L_ij (i = 1 ... n) can likewise be calculated for each object O_i on which it occurs; each conduction chain contains the affected objects and the faults detected on them. Across the object fault sets of the multiple conduction chains, the frequency with which each object fault occurs in the sets is calculated to judge the correlation between the conduction chains, in the same way as the determination method described above.

For example, in the layered communication of a protocol stack, lower-layer communication often affects upper-layer communication. If protocol stacks at different layers are monitored, a fault in a lower-layer protocol stack affects the function of the upper-layer protocol stack. Such a relationship depends little on the objects themselves and arises from a strong logical association between objects; it can be discovered and mined by means of the fault conduction chain, and during fault recovery the fault node at the root of the conduction chain can be investigated first.
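Investigating the root of a conduction chain first can be sketched as follows. The representation of a chain as (cause, effect) pairs is an assumption made only for illustration, and the function name `chain_roots` and the fault names are hypothetical; the description does not prescribe a data structure.

```python
def chain_roots(edges):
    """Return the nodes that cause other faults but are caused by none.

    edges is an assumed representation of a conduction chain:
    a list of (cause, effect) pairs.
    """
    causes = {cause for cause, _effect in edges}
    effects = {effect for _cause, effect in edges}
    # A root appears as a cause but never as an effect.
    return sorted(causes - effects)

# Hypothetical protocol-stack chain: each lower-layer fault leads to an upper-layer one.
stack_chain = [
    ("link_layer_down", "ip_unreachable"),
    ("ip_unreachable", "tcp_timeout"),
    ("tcp_timeout", "http_error"),
]
roots = chain_roots(stack_chain)
```

Here the only root is the lowest-layer fault, which is therefore the first candidate to investigate during recovery.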

Fig. 3 is a structural schematic diagram of a fault locating device according to an embodiment of the present invention, comprising: a receiving module (30), a fault metadata base establishing module (31), a first establishing module (32), a second establishing module (33) and a locating module (34).

The first establishing module is configured to establish, according to the obtained current fault information, the conduction chain set of all monitored objects for different fault types within the predetermined time window at different time points, and to output it to the second establishing module.

The first establishing module is further configured to judge whether the current fault information is present in the historical fault information; when it is, to obtain the conduction chain of the monitored object for the current fault type within the predetermined time window at the current time point, to establish, according to the fault information, the conduction chain set of the current monitored object for the current fault type within the predetermined time window at different time points, and to send a first notification to the second establishing module.

Preferably, the first establishing module is further configured to send a second notification to the second establishing module when it judges that the obtained current fault information is not present in the historical fault information.
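The two branches of the first establishing module can be sketched as below. This is a hedged illustration under stated assumptions: fault records are taken to be (timestamp, object, fault_type) tuples, "present in the historical fault information" is interpreted as a matching (object, fault type) pair, and the names `window_chain` and `build_chain_set` are hypothetical.

```python
from datetime import datetime, timedelta

def window_chain(records, start, window):
    """Faults observed in [start, start + window), in time order."""
    return [(o, f) for t, o, f in sorted(records) if start <= t < start + window]

def build_chain_set(history, current, window):
    """Return (notification, conduction chain set) for the current fault."""
    now, obj, fault = current
    # Earlier time points at which the same object reported the same fault type.
    past_starts = [t for t, o, f in history if (o, f) == (obj, fault)]
    records = history + [current]
    if past_starts:
        # Known fault: one chain per time point, including the current one.
        starts = past_starts + [now]
        return "first", [window_chain(records, s, window) for s in starts]
    # Unknown fault: only the current window is available.
    return "second", [window_chain(records, now, window)]

t0 = datetime(2013, 12, 20, 10, 0, 0)
history = [
    (t0, "Switcher100", "link_down"),
    (t0 + timedelta(minutes=1), "Computer100", "unreachable"),
    (t0 + timedelta(minutes=2), "Computer101", "unreachable"),
]
window = timedelta(minutes=5)
known = build_chain_set(history, (t0 + timedelta(hours=1), "Switcher100", "link_down"), window)
unknown = build_chain_set(history, (t0 + timedelta(hours=1), "Computer103", "disk_full"), window)
```

The returned label only indicates which notification the second establishing module would receive; how the notification is actually delivered is outside this sketch.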

The second establishing module is configured to analyze the correlation between the conduction chains in the conduction chain set established by the first establishing module, obtain the fault object conduction chains of all monitored objects for all fault types, and output them to the locating module.

Further, the second establishing module is specifically configured to: receive the first notification from the first establishing module, obtain the number of times each monitored object fails in the conduction chain set, calculate the ratio of the number of failures of each monitored object to the total number of failures of all monitored objects, and take the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.

Preferably, the second establishing module is further configured to: receive the second notification from the first establishing module, obtain the number of times each monitored object fails in the current conduction chain set, calculate the ratio of the number of failures of each monitored object to the total number of failures of all monitored objects in the current conduction chain, and take the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.
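The ratio test performed by the second establishing module can be sketched as follows. The function name, the list-of-lists layout of the conduction chain set, and the threshold value are assumptions made for illustration; the object names follow the naming example used earlier in the description.

```python
from collections import Counter

def fault_object_conduction_chain(chain_set, threshold):
    """Select monitored objects whose share of all failures exceeds threshold.

    chain_set is assumed to be a list of conduction chains, each a list of
    the monitored objects that failed within one time window.
    """
    counts = Counter(obj for chain in chain_set for obj in chain)
    total = sum(counts.values())
    if total == 0:
        return []
    # ratio = failures of this object / failures of all objects
    return sorted(obj for obj, n in counts.items() if n / total > threshold)

chain_set = [
    ["Switcher100", "Computer100", "Computer101"],
    ["Switcher100", "Computer100"],
    ["Switcher100", "Computer102"],
]
chain = fault_object_conduction_chain(chain_set, threshold=0.2)
```

With these hypothetical windows the totals are 3, 2, 1 and 1 failures out of 7, so only Switcher100 (3/7) and Computer100 (2/7) exceed the 0.2 threshold.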

Further, the locating module is also configured to:

According to the fault object conduction chain, obtain the fault conduction chain for different monitored objects, and locate the fault object and fault type according to the obtained fault conduction chain; or, according to the fault object conduction chain, obtain the object conduction chain for different fault types, and locate the fault object and fault type according to the obtained object conduction chain.

Finally, the above device further comprises a fault metadata establishing module, configured to establish a fault metadata base according to the obtained fault information and to pass the fault metadata base information to the first processing module.

The above are only preferred embodiments of the present invention and are not intended to limit its scope of protection. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall be included within its scope of protection.

Claims

1. A method for fault location, characterized by comprising: obtaining current fault information, the current fault information comprising at least a monitored object, a fault type and time information;

2. The method according to claim 1, characterized in that, before obtaining the current fault information, the method further comprises: establishing a fault metadata base according to obtained historical fault information.

3. The method according to claim 2, characterized in that, before establishing the conduction chain set, the method further comprises: judging whether the current fault information is present in the historical fault information;

When it is judged that the current fault information is present in the historical fault information, establishing the conduction chain set of all monitored objects for different fault types within the predetermined time window at different time points comprises:

4. The method according to claim 3, characterized in that analyzing the correlation between the conduction chains in the conduction chain set to obtain the fault object conduction chains of all monitored objects for different fault types comprises:

5. The method according to claim 1 or 2, characterized in that, when it is judged that the current fault information is not present in the historical fault information, the method further comprises:

6. The method according to claim 1 or 2, characterized in that, after obtaining the fault object conduction chains of all monitored objects for different fault types, the method further comprises:

7. A device for fault location, characterized by comprising:

8. The device according to claim 7, characterized in that the device further comprises: a fault metadata establishing module, configured to establish a fault metadata base according to the obtained fault information and to pass the fault metadata base information to the first processing module.

9. The device according to claim 8, characterized in that the first establishing module is further configured to judge whether the current fault information is present in the historical fault information;

10. The device according to claim 9, characterized in that the second establishing module is specifically configured to:

11. The device according to claim 7 or 8, characterized in that the first establishing module is further configured to send a second notification to the second establishing module when it judges that the obtained current fault information is not present in the historical fault information;

12. The device according to claim 7 or 8, characterized in that the locating module is further configured to: