CN104734871A - Method and device for positioning failures - Google Patents
Method and device for positioning failures
- Publication number
- CN104734871A (application number CN201310711392.8A)
- Authority
- CN
- China
- Prior art keywords
- fault
- conduction chain
- monitored object
- chain
- failure information
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0677—Localisation of faults
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/28—Routing or path finding of packets in data switching networks using route fault recovery
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Locating Faults (AREA)
- Telephonic Communication Services (AREA)
Abstract
The invention discloses a method and device for positioning failures. The method includes the steps of obtaining current failure information, establishing a conduction chain set of all monitoring objects for all failure types in a preset time window at different time points according to the obtained current failure information, analyzing the relevance of conduction chains in the conduction chain set to obtain the failure object conduction chains of all the monitoring objects for different failure types, and positioning the failure objects and the failure types according to the failure object conduction chains. By means of the method, root failure positioning and efficient order sending can be rapidly and accurately achieved, and the efficiency of daily network maintenance and failure order sending is improved.
Description
Technical field
The present invention relates to network management technology, and in particular to a method and device for fault location.
Background technology
An existing network management system manages a set of monitored objects. The parameters of each monitored object, including its identification name and connection relationships, are usually set through a network configuration function. For example, the monitored objects may be one switch and four computers, with the switch connected to all four computers. Once this configuration data exists, the management system knows every object, and monitored objects are normally identified by their identification names, such as Switcher100, Computer100, Computer101, Computer102 and Computer103.
When a monitored result reaches a fault threshold, it is usually reported to maintenance staff. For example, an alarm is required when CPU utilization reaches 96% or more. The monitored object then sends a message to its manager (the network management system) containing the object type, the object identifier, the monitored indicator, the current indicator value, the alarm name and similar information, for example: Computer, ID=100, CPU, 98%, Computer CPU utilization too high. From the network management system's point of view, all such alarm data is reported by the monitored objects, and the message type can be user-defined.
After a monitored object reports alarm data, the message type, message object and object identifier can be extracted according to the interface definition. On receiving the message "Computer, ID=100, CPU, 98%, Computer CPU utilization too high" described above, the system knows that Computer100 is in an abnormal state.
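The extraction step described above can be sketched as follows. This is a minimal illustration only: the comma-separated layout is taken from the example message, while the `Alarm` class, its field names and the `parse_alarm` function are hypothetical, since the text states that the message type is user-defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    object_type: str   # e.g. "Computer"
    object_id: str     # e.g. "100"
    indicator: str     # monitored indicator, e.g. "CPU"
    value: str         # current indicator value, e.g. "98%"
    name: str          # alarm name

    @property
    def object_name(self) -> str:
        # Identification name as used in the text, e.g. "Computer100".
        return f"{self.object_type}{self.object_id}"


def parse_alarm(message: str) -> Alarm:
    """Parse 'type, ID=n, indicator, value, alarm name' (assumed layout)."""
    # Split on the first four commas only, so the alarm name may contain commas.
    parts = [p.strip() for p in message.split(",", 4)]
    if len(parts) != 5 or not parts[1].startswith("ID="):
        raise ValueError(f"unexpected alarm format: {message!r}")
    object_type, id_field, indicator, value, name = parts
    return Alarm(object_type, id_field[len("ID="):], indicator, value, name)


alarm = parse_alarm("Computer, ID=100, CPU, 98%, Computer CPU utilization too high")
print(alarm.object_name)  # Computer100
```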
In a complex real network, a single fault can cause many monitored objects to fail. Typically, after a power failure none of the affected monitored objects can work normally, and an interrupted transmission line blocks communication across a whole area. Hundreds of alarms may be reported within one or two minutes. If the root alarm among them can be located quickly and repaired first, the other alarms may clear automatically. How to locate the root alarm quickly is therefore the focus of prior-art analysis. The usual approach takes the connection relationships between monitored network objects (for example, Switcher100 is connected to four computers including Computer100) and the causal relationships between services (for example, power failure and low voltage occur in sequence or are causally related), summarizes them into an alarm knowledge base or a set of empirical rules, and then uses that knowledge base or those rules to locate and analyze faults in the alarm data.
Fault location and analysis based on an existing alarm knowledge base or empirical alarm rules is the main method of network maintenance today. However, monitoring an entire network produces massive amounts of alarm data, and alarm correlation analysis across network devices and across management systems is very difficult. In particular, periodic network construction and routine maintenance keep the network changing continuously, and these dynamic configuration changes make a priori empirical alarm rules highly inaccurate. As a result, the root fault cannot be located quickly and accurately, and the efficiency of routine network maintenance and fault order dispatching cannot be improved.
Summary of the invention
To solve the above technical problem, the invention provides a method and device for fault location that can locate the root fault quickly and accurately and improve the efficiency of routine network maintenance and fault order dispatching.
To achieve this object, the invention discloses a method for fault location, comprising:
Obtaining current failure information, which includes at least a monitored object, a fault type and time information;
Establishing, according to the obtained current failure information, a conduction chain set of all monitored objects for the different fault types within a preset time window at different time points;
Analyzing the correlation between the conduction chains in the established conduction chain set to obtain the fault object conduction chains of all monitored objects for the different fault types;
Locating the current fault object and fault type according to the obtained fault object conduction chains.
Preferably, before obtaining the current failure information, the method further comprises: establishing a fault metadata store according to obtained historical failure information.
Preferably, before establishing the conduction chain set, the method further comprises: determining whether the current failure information is present in the historical failure information;
Preferably, establishing the conduction chain set of all monitored objects for the different fault types within a preset time window at different time points comprises:
Obtaining the conduction chain of the monitored object for the current fault type within the preset time window at the current time point;
Establishing, according to the historical failure information, the conduction chain set of the current monitored object for the current fault type within the preset time window at different time points.
Preferably, analyzing the correlation between the conduction chains in the conduction chain set to obtain the fault object conduction chains of all monitored objects for the different fault types comprises:
Obtaining, for each monitored object in the conduction chain set, the number of times each type of fault occurred; calculating the ratio of the number of times each monitored object had that fault to the total number of faults of all monitored objects; and taking the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.
Preferably, when it is determined that the current failure information is not present in the historical failure information:
Analyzing the correlation between the conduction chains in the conduction chain set to obtain the fault object conduction chains of all monitored objects for the different fault types comprises:
Obtaining, for each monitored object in the current conduction chain set, the number of times each type of fault occurred; calculating the ratio of the number of times each monitored object had that fault to the total number of faults of all monitored objects in the current conduction chain; and taking the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.
Preferably, after obtaining the fault object conduction chains of all monitored objects for the different fault types, the method further comprises:
Obtaining, from the fault object conduction chains, the fault conduction chain across different monitored objects, and locating the fault object and fault type according to that fault conduction chain; or,
Obtaining, from the fault object conduction chains, the object conduction chain across different fault types, and locating the fault object and fault type according to that object conduction chain.
The invention also discloses a device for fault location, comprising:
A receiving module, configured to obtain current failure information, which includes at least a monitored object, a fault type and time information;
A first establishing module, configured to establish, according to the obtained current failure information, a conduction chain set of all monitored objects for the different fault types within a preset time window at different time points, and to output it to a second establishing module;
A second establishing module, configured to analyze the correlation between the conduction chains in the conduction chain set established by the first establishing module, obtain the fault object conduction chains of all monitored objects for all fault types, and output them to a locating module;
A locating module, configured to locate the fault object and fault type according to the fault object conduction chains from the second establishing module.
Preferably, the device further comprises a fault metadata establishing module, configured to establish a fault metadata store according to obtained failure information and to pass the fault metadata store information to the first processing module.
Preferably, the first establishing module is further configured to determine whether the current failure information is present in the historical failure information;
When it is determined that the current failure information is present in the historical failure information, to obtain the conduction chain of the monitored object for the current fault type within the preset time window at the current time point; to establish, according to the failure information, the conduction chain set of the current monitored object for the current fault type within the preset time window at different time points; and to send a first notification to the second establishing module.
Preferably, the second establishing module is specifically configured to:
Receive the first notification from the first establishing module; obtain the number of times each monitored object in the conduction chain set failed; calculate the ratio of the number of times each monitored object failed to the total number of faults of all monitored objects; and take the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.
Preferably, the first establishing module is further configured to send a second notification to the second establishing module when it determines that no historical failure information existed before the current failure information was obtained;
The second establishing module is further configured to receive the second notification from the first establishing module; obtain the number of times each monitored object in the current conduction chain set failed; calculate the ratio of the number of times each monitored object failed to the total number of faults of all monitored objects in the current conduction chain; and take the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.
Preferably, the locating module is further configured to:
Obtain, from the fault object conduction chains, the fault conduction chain across different monitored objects, and locate the fault object and fault type according to the obtained fault conduction chain;
Or obtain, from the fault object conduction chains, the object conduction chain across different fault types, and locate the fault object and fault type according to that object conduction chain.
The technical solution comprises: obtaining current failure information, which includes a monitored object, a fault type and time information; establishing, according to the obtained current failure information, a conduction chain set of all monitored objects for the different fault types within a preset time window at different time points; analyzing the correlation between the conduction chains in the established conduction chain set to obtain the fault object conduction chains of all monitored objects for all fault types; and locating the fault object and fault type according to the obtained fault object conduction chains. The technical solution of this application does not need to work out, one by one, the connection relationships between monitored objects and the causal relationships between fault types, which avoids a high time cost and meets real-time requirements. Instead of insisting on logical causality, it judges strong correlation, which tolerates the uncertainty that network changes may introduce. Depending on the capability of monitoring and maintenance, processing priority is decided by the degree of correlation, so fault location is achieved by more flexible means.
Accompanying drawing explanation
The accompanying drawings described here provide a further understanding of the invention and form part of this application. The illustrative embodiments of the invention and their description explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of the fault location method of the invention;
Fig. 2 is a flowchart of an embodiment of the fault location method of the invention;
Fig. 3 is a schematic structural diagram of the fault location device of the invention.
Embodiment
The invention is described in detail below with reference to the drawings and specific embodiments.
Fig. 1 is a flowchart of the fault location method of the invention, which comprises the following steps:
Step 101: obtain current failure information.
The current failure information includes a monitored object, a fault type and time information.
Preferably, before the current failure information is obtained, the method may further comprise:
Establishing a fault metadata store according to historical failure information.
Specifically: first, identify the minimum-granularity monitored objects and fault categories from the existing failure information of the whole network; then establish a basic fault metadata store from the minimum-granularity monitored objects and fault types.
For example, monitored objects are the main focus of network management. A monitored object with a minor fault can be repaired, whereas one with a severe fault can only be replaced. Each monitored object usually consists of several parts. From a maintenance point of view, the minimum-granularity monitored object is the smallest unit that can be replaced. Take a switch. If it is a small, highly integrated switch whose ports cannot be replaced individually, a severe fault on any port requires replacing the whole switch, so the minimum granularity of this monitored object is the switch itself. If it is a larger switch whose ports are replaceable parts, the minimum granularity is each port under the switch: when a port fails, that port part can be replaced, so the minimum-granularity monitored object is identified by the port number under the switch.
The fault metadata store keeps growing as the monitored network expands and fault types multiply. Because the amount of fault metadata is limited, entries are only added and never deleted, which keeps historical faults available for ongoing monitoring.
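An add-only store of this kind might look like the sketch below. It is an illustration under assumptions: the class name `FaultMetadataStore`, its methods and the identifier "Switcher100/port3" (a port-level, minimum-granularity object) are hypothetical and do not come from the patent text.

```python
class FaultMetadataStore:
    """Add-only registry of minimum-granularity monitored objects and fault types."""

    def __init__(self):
        self._objects = set()
        self._fault_types = set()

    def register(self, obj, fault_type):
        """Record an object and fault type; return True if either one was new."""
        is_new = obj not in self._objects or fault_type not in self._fault_types
        self._objects.add(obj)
        self._fault_types.add(fault_type)
        return is_new

    def knows(self, obj, fault_type):
        """Return True if both the object and the fault type were seen before."""
        return obj in self._objects and fault_type in self._fault_types

    # Deliberately no delete method: entries are only ever added.


store = FaultMetadataStore()
print(store.register("Switcher100/port3", "link down"))  # True
print(store.register("Switcher100/port3", "link down"))  # False
```

The boolean returned by `register` gives a caller a simple way to notice that a fault has no history yet.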
Step 102: according to the fault metadata store, establish a conduction chain set of all monitored objects for the different fault types within a preset time window at different time points.
Specifically:
First, obtain the conduction chain of the current monitored object for the current fault type within the preset time window at the current time point.
Second, if historical failure information existed before the current failure information was obtained, establish, according to the historical failure information, the conduction chain set of the current monitored object for the current fault type within the preset time window at different time points; if no historical failure information existed, proceed to step 103.
Preferably, a conduction chain is defined as the sequence of object faults that a given object fault can affect after it occurs.
Step 103: analyze the correlation between the conduction chains in the established conduction chain set to obtain the fault object conduction chains of all monitored objects for the different fault types.
Specifically:
If historical failure information existed before the current failure information was obtained, obtain, for each monitored object in the conduction chain set, the number of times each type of fault occurred; calculate the ratio of the number of times each monitored object had that fault to the total number of faults of all monitored objects; and take the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain. Or
If no historical failure information existed before the current failure information was obtained, obtain the number of times each monitored object in the current conduction chain set failed; calculate the ratio of the number of times each monitored object failed to the total number of faults of all monitored objects in the current conduction chain; and take the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.
Preferably, the method further comprises:
According to the fault object conduction chains, obtain the fault conduction chain across different monitored objects, and locate the fault object and fault type according to the fault conduction chain. Or,
According to the fault object conduction chains, obtain the object conduction chain across different fault types, and locate the fault object and fault type according to the object conduction chain.
The initially reported current failure information includes basic information such as the monitored object, the fault type and the time. It serves as the basic evidence for judging correlation and comes from the network element of the monitored object. If the initial historical data is empty, every correlation is provisionally set to 100% (strong correlation), but because the sample count is only 1, its confidence and priority are lowered. As historical data accumulates, the correlation can be calculated more and more reliably.
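One way to carry the lowered confidence alongside the provisional 100% value is to return the sample count's verdict together with the ratio. The sketch below is an assumption-laden illustration: the function name `correlation_estimate` and the `min_samples` parameter are hypothetical, and the default of 2 is only a reading of the statement that a single observation has reduced confidence.

```python
def correlation_estimate(co_occurrences, observations, min_samples=2):
    """Return (correlation, trusted) for one candidate object on a chain.

    co_occurrences: chains in which the candidate appeared
    observations:   chains observed so far (1 when there is no history)
    trusted:        False while too few chains exist to rely on the ratio
    """
    if observations < 1 or not 0 <= co_occurrences <= observations:
        raise ValueError("need observations >= 1 and 0 <= co_occurrences <= observations")
    correlation = co_occurrences / observations
    trusted = observations >= min_samples
    return correlation, trusted


print(correlation_estimate(1, 1))   # (1.0, False): first sighting, 100% but low priority
print(correlation_estimate(9, 10))  # (0.9, True)
```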
First, the predetermined threshold can be adjusted in practical applications.
Second, the fault object conduction chain is defined as the set of strongly correlated object faults affected by a given fault type of a given monitored object.
Third, the fault conduction chain is defined as the finite set of faults strongly correlated with a given fault; that is, when this fault occurs, it readily causes the other fault types on the chain (possibly on different objects).
Finally, the object conduction chain is defined as the finite set of objects strongly correlated with a given object; that is, any fault of this object readily affects the other objects on the chain (possibly with different faults).
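The relationship between the three kinds of chain can be sketched as two projections of the fault object conduction chains. This is one possible reading, not the patent's stated procedure: the dictionary layout, the function names, the frame and board identifiers and the illustrative threshold of 0.9 are all assumptions.

```python
from collections import Counter


def _frequent(groups, threshold):
    """Items that appear in more than `threshold` of the given groups."""
    counts = Counter()
    for group in groups:
        counts.update(set(group))
    return {item for item, c in counts.items() if c / len(groups) > threshold}


def object_chain(l_ij, obj, threshold=0.9):
    """Objects strongly tied to `obj` regardless of which of its faults occurred."""
    groups = [{o for o, _ in members}
              for (src, _), members in l_ij.items() if src == obj]
    return _frequent(groups, threshold)


def fault_chain(l_ij, fault_type, threshold=0.9):
    """Fault types strongly tied to `fault_type` regardless of the source object."""
    groups = [{f for _, f in members}
              for (_, src_fault), members in l_ij.items() if src_fault == fault_type]
    return _frequent(groups, threshold)


# Keys: (object, fault type); values: the affected (object, fault type) pairs.
l_ij = {
    ("Frame1", "communication failure"): {("Board1", "communication lost"),
                                          ("Board2", "communication lost")},
    ("Frame1", "power fault"): {("Board1", "offline"), ("Board2", "offline"),
                                ("Fan1", "stopped")},
}
print(sorted(object_chain(l_ij, "Frame1")))  # ['Board1', 'Board2']
```

Here Fan1 appears on only one of Frame1's two chains, so it falls below the threshold and is left out of the object conduction chain.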
Step 104: locate the fault object and fault type according to the obtained fault object conduction chains.
When the network management system monitors every monitored object and fault type across the whole network, the above method abandons the existing statistics-based analysis approach. Instead, it works on real-time, dynamic failure information: it finds the strong correlations in the spatial and temporal distribution of monitored objects and fault types in the network and, with reference to the correlation of object chains in the historical failure information (including but not limited to monitored object, connection, fault time and fault type), judges the strong correlation between fault objects.
The invention does not insist on logical causality but judges strong correlation, which tolerates the uncertainty that changes may introduce. Depending on the capability of monitoring and maintenance, processing priority is decided by the degree of correlation, so fault location is achieved by more flexible means.
Fig. 2 is a detailed flowchart of the fault location method of the invention, which comprises the following steps:
Step 201: obtain current failure information, including basic information such as the monitored object, the fault type and the time.
Step 202: establish a fault metadata store according to historical data. The established fault metadata store includes the minimum-granularity monitored objects and fault categories;
Specifically:
Without any prior knowledge, identify the minimum-granularity monitored objects O_n and fault types F_m from the existing failure information of the whole network, and establish a basic fault metadata store from the minimum-granularity monitored objects O_n and fault types F_m.
The fault metadata store keeps growing as the monitored network expands and fault types multiply.
The initially reported current failure information includes basic information such as the monitored object, the fault type and the time. It serves as the basic evidence for judging correlation and comes from the network element of the monitored object. If the initial historical data is empty, every correlation is provisionally set to 100% (strong correlation), but because the sample count is only 1, its confidence and priority are lowered. As historical data accumulates, the correlation can be calculated more and more reliably.
A newly added or changed fault type that cannot be found in the fault metadata store is treated as initial failure information and calculated as a strong correlation. Likewise, a newly added monitored object, or one whose identifier has changed, that cannot be found in the fault metadata store is treated as initial failure information and calculated as a strong correlation.
For a monitored object whose identifier has changed, the correlation eventually computed should still match the result for the original monitored object.
Step 203: obtain the set of conduction chains L_ij0 within the time window T_0 at the current time point.
Specifically: obtain the set of conduction chains L_ij0 of the current monitored object for the current fault within the preset time window T_0 at the current time point.
Here, the set of conduction chains L_ij0 is the conduction chain set formed, in time order, by the monitored objects and their fault types that appear as time passes after a given fault occurs.
Step 204: determine whether historical data exists. If it does, proceed to step 205; if not, proceed to step 206.
Step 205: according to the historical data, establish the set of conduction chains L_ijk at each time point T_k.
Specifically:
First, establish, according to the historical failure information, the conduction chain set of the current monitored object for the current fault type within the preset time window at different time points.
Finally, analyze fault type F_j of each monitored object O_i and establish its conduction chain set at time point T_k.
Here, the conduction chain L_ijk is defined as the time-ordered set of object faults that occur within the time T_0 after the time point T_k at which fault type F_j of object O_i occurred.
For example, suppose the low-output-voltage fault F_j of generator O_i occurs at 20:03 one evening. The time-ordered set of all fault objects that occur within the following time T_0 can be regarded as the nodes on the fault conduction chain of this fault object at this time point. T_0 is an empirical value, generally 3 minutes or 5 minutes.
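Building one conduction chain L_ijk from a stream of reported faults can be sketched as a time-window filter. The sketch assumes a 3-minute T_0 from the example above; the `FaultEvent` type, the function name, the calendar date and the object names other than the generator scenario are hypothetical.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class FaultEvent(NamedTuple):
    time: datetime
    obj: str         # monitored object identifier
    fault_type: str


def conduction_chain(events, trigger, window=timedelta(minutes=3)):
    """Return the events that follow `trigger` within `window`, in time order."""
    end = trigger.time + window
    followers = [e for e in events
                 if e != trigger and trigger.time <= e.time <= end]
    return sorted(followers, key=lambda e: e.time)


t_k = datetime(2013, 12, 20, 20, 3)  # 20:03; the date itself is made up
trigger = FaultEvent(t_k, "Generator1", "output voltage low")
events = [
    trigger,
    FaultEvent(t_k + timedelta(seconds=40), "Transmission1", "link interrupted"),
    FaultEvent(t_k + timedelta(minutes=2), "BaseStation1", "service interrupted"),
    FaultEvent(t_k + timedelta(minutes=9), "Computer100", "CPU too high"),
]
chain = conduction_chain(events, trigger)
print([e.obj for e in chain])  # ['Transmission1', 'BaseStation1']
```

The fault reported nine minutes later falls outside T_0 and is therefore not a node on this chain.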
Step 206: analyze the strong correlation between the conduction chains to obtain the fault object conduction chain L_ij of all monitored objects for all fault types.
The correlation between the conduction chains is judged as follows:
If historical failure information existed before the current failure information was obtained, obtain, for each monitored object in the conduction chain set, the number of times each type of fault occurred; calculate the ratio of the number of times each monitored object had that fault to the total number of faults of all monitored objects; and take the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain. Or
If no historical failure information existed before the current failure information was obtained, obtain, for each monitored object in the current conduction chain set, the number of times each type of fault occurred; calculate the ratio of the number of times each monitored object had that fault to the total number of faults of all monitored objects in the current conduction chain; and take the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.
The predetermined threshold can be adjusted in practical applications.
For example, first suppose that fault type F_j of monitored object O_i occurs, and the set of all fault objects within its conduction time T_0 is L_ijk = F(O_i, F_j, T_k). Analysis of the historical data shows that fault type F_j of monitored object O_i has occurred K-1 times before, so K fault conduction chains have accumulated in total.
Then, the set of these K fault conduction chains contains M_k fault objects in total. Count how many times each of these monitored objects appears in the K-1 historical conduction chain sets, ΣC_km = Count(L_ijk, O_m) (k = 1, 2, ..., K-1), which gives the number of occurrences of each of the M_k monitored objects. For normalization, the occurrence frequency of each object can be calculated, that is, its number of occurrences as a percentage of the total.
Finally, a fault object whose frequency is 100% has the highest degree of correlation and is in a strong causal correlation. However, because fault object chains in a real production environment change as the network changes, an empirical frequency threshold of more than 90% can be used, or the priority order of fault objects can be determined by frequency from high to low. The fault object conduction chain L_ij is defined as the set of strongly correlated object faults affected by fault type F_j of object O_i;
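The counting and normalization described above might be sketched as follows. It assumes one plausible reading of the text: the frequency of an object is the share of accumulated conduction chains in which it appears, and an object is kept when that share is strictly greater than the threshold. The function name and the sample object names are hypothetical.

```python
from collections import Counter


def fault_object_chain(chains, threshold=0.9):
    """Rank objects by the share of chains they appear in; keep those above threshold.

    chains: list of conduction chains, each an iterable of object identifiers,
            all triggered by the same fault type of the same monitored object.
    Returns (object, frequency) pairs, highest frequency first.
    """
    if not chains:
        return []
    counts = Counter()
    for chain in chains:
        counts.update(set(chain))  # count each object at most once per chain
    total = len(chains)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(obj, n / total) for obj, n in ranked if n / total > threshold]


# Five accumulated chains for the same trigger; X7 and X9 are unrelated faults.
chains = [["S1"], ["S1", "X7"], ["S1"], ["S1", "X9"], ["S1"]]
print(fault_object_chain(chains))  # [('S1', 1.0)]
```

Lowering `threshold` turns the same function into the priority ordering mentioned in the text, since the result is already sorted by frequency from high to low.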
For example, a complex communication network includes network subsystems such as the wireless base station network, the backbone transmission network, the IT monitoring network, and the power and environment monitoring network. To simplify the network model, suppose the network has three monitored nodes: power supply P_1, transmission T_1 and base station S_1. The three objects are causally related: when the power supply is interrupted, transmission loses power and the base station also stops providing service; when the power supply is normal, an abnormal transmission interruption prevents the base station from providing service. That is: P_1 --> (T_1 --> S_1).
After a transmission T_1 interruption fault occurs, many faults are reported within its time period T_0. Among them, the base station S_1 interruption occurs later in the time sequence, and other faults are also produced around the same time. Correlation analysis against the conduction chains in the historical data will find that (T_1 --> S_1) occurs very frequently, ideally co-occurring 100% of the time, whereas other faults occur at random and therefore have a lower occurrence frequency and degree of correlation.
Similarly, after a power-down fault of power supply P_1 occurs, calculation shows that T_1 and S_1 on its conduction chain also appear later in the time sequence, with a very high degree of correlation. (P_1 --> T_1) and (P_1 --> S_1) are conduction chains of power supply P_1, and P_1 --> (T_1 --> S_1) is a larger conduction chain.
However, when network expansion or a maintenance change means that transmission T_1 is no longer connected to base station S_1 but to S_2, the relation (T_1 --> S_1) no longer occurs and (T_1 --> S_2) becomes the new conduction relation. Because this conduction relation has no historical data at first, it is treated as a strong association that has occurred only once (in the initial case, every relation that has occurred once is treated as a 100% strong association, but with lowered priority). (P_1 --> T_1) and (P_1 --> S_2) are conduction chains of power supply P_1, and once a relation has occurred a second time or more, its priority can be raised.
Step 207: according to the fault object conduction chain L_ij, find the root fault on the fault object conduction chain and locate the monitored object and fault type.
The above method can generate a strong-association spanning tree based on monitored objects and fault types. After a fault occurs, all alarms monitored along the time axis can be presented automatically by strong association according to the conduction chain L_ij. This presentation helps users analyze and locate faults better, makes it easier to dispatch a single unified order for a class of site problems, and, combined with historical data, simplifies troubleshooting and improves efficiency.
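Finding the root on a set of conduction chains can be sketched as removing every active alarm that is already explained by another active alarm. This is a simplified illustration, not the patent's stated algorithm: it tracks objects only and ignores fault types, the function name `locate_roots` is hypothetical, and the chain contents reuse the P_1, T_1, S_1 example.

```python
def locate_roots(active, chains):
    """Return active objects that are not on the conduction chain of another active object.

    active: objects currently reporting faults
    chains: mapping from an object to the set of objects strongly affected by it
    """
    explained = set()
    for obj in active:
        explained |= chains.get(obj, set()) - {obj}
    return sorted(set(active) - explained)


# Strong conduction chains learned from history: P1 --> (T1 --> S1).
chains = {"P1": {"T1", "S1"}, "T1": {"S1"}}

print(locate_roots({"P1", "T1", "S1"}, chains))  # ['P1']
print(locate_roots({"T1", "S1"}, chains))        # ['T1']
```

If two active objects each sit on the other's chain, this sketch reports no root for them, so a real system would need an extra tie-breaking rule such as the time order of the alarms.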
Step 208, in step, on the basis of 206, said method can also comprise:
According to the above fault object conduction chain L_ij, obtain the object conduction chain L_i across the different fault types, and locate the fault object and fault type according to the object conduction chain L_i, wherein:
The object conduction chain L_i is defined as the finite set of objects strongly correlated with object O_i; that is, any fault occurring on this object readily affects the other objects on the chain, possibly through different faults.
The object conduction chain L_i is determined as follows:
Multiple fault types can be detected on one object O_i, and a conduction chain L_ij (j = 1 ... m) can be calculated for each fault type F_j; each conduction chain contains the affected monitored objects and the faults detected on them. Across the object fault sets of these conduction chains, the frequency with which each object fault appears in the sets is calculated to judge the correlation between the chains, in the same way as the determination method described above.
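A minimal sketch of deriving L_i from the per-fault-type chains L_ij is given below. The function name `object_chain`, the default threshold of 0.5 and the rack and board data are assumptions; the patent only states that the frequency of object faults across the sets is used.

```python
from collections import Counter

def object_chain(chains_by_fault_type, threshold=0.5):
    """Objects that appear in more than `threshold` of the per-fault-type chains."""
    counts = Counter()
    for chain in chains_by_fault_type.values():
        counts.update({obj for obj, _fault in chain})  # once per chain
    total = len(chains_by_fault_type)
    return sorted(obj for obj, n in counts.items() if n / total > threshold)

# Hypothetical chains L_ij for one object O_i (a rack), keyed by fault type F_j.
chains_for_rack = {
    "comm_failure": {("B1", "link_down"), ("B2", "link_down")},
    "power_alarm": {("B1", "reset"), ("B2", "reset"), ("B3", "reset")},
}
l_i = object_chain(chains_for_rack)
```

Boards B1 and B2 are affected whichever fault type occurs on the rack, so they form the object conduction chain, while B3 appears in only half of the chains and is left out at this threshold.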
For example, a serious communication failure detected on a machine frame affects the communication capability of each of the boards in that frame. An association of this kind depends little on the fault type and arises from the parent-child relationship between objects; it can be discovered and mined by means of the object conduction chain, and during fault recovery the parent fault node at the root of the conduction chain can be investigated first.
Objects with strong correlation can be extended and grouped into one large object bag; the faults in an object bag can be assigned to a single on-site fault team, and for strongly correlated faults within the bag the fault node at the root of the conduction chain can be investigated first. Or
Step 209: according to the above fault object conduction chain L_ij, obtain the fault conduction chain L_j across the different monitored objects, and locate the fault object and fault type according to the fault conduction chain L_j, wherein:
The fault conduction chain L_j is defined as the finite set of faults strongly correlated with fault F_j; that is, when this fault occurs it readily causes the other fault types on the chain, possibly on different monitored objects.
The fault conduction chain L_j is determined as follows: a fault F_j can be detected on multiple objects, and for each fault type F_j a conduction chain L_ij (i = 1 ... n) can likewise be calculated for each object O_i on which it occurs; each conduction chain contains the affected objects and the faults detected on them. Across the object fault sets of these conduction chains, the frequency with which each object fault appears in the sets is calculated to judge the correlation between the chains, in the same way as the determination method described above.
For example, in the layered communication of a protocol stack, lower-layer communication often affects upper-layer communication. When protocol stacks at different layers are monitored, a fault in a lower-layer protocol stack can impair the function of the upper-layer protocol stack. An association of this kind depends little on the object itself and arises from a strong logical association between objects; it can be discovered and mined by means of the fault conduction chain, and during fault recovery the fault node at the root of the conduction chain can be investigated first.
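The fault conduction chain L_j can be sketched by projecting the chains L_ij onto fault types instead of objects. The function name `fault_chain`, the threshold and the router and host data are hypothetical and serve only to illustrate the protocol stack example.

```python
from collections import Counter

def fault_chain(chains_by_object, threshold=0.5):
    """Fault types that follow the fault on more than `threshold` of the objects."""
    counts = Counter()
    for chain in chains_by_object.values():
        counts.update({fault for _obj, fault in chain})  # once per chain
    total = len(chains_by_object)
    return sorted(f for f, n in counts.items() if n / total > threshold)

# Hypothetical chains L_ij for one lower-layer fault F_j ("link_down"),
# keyed by the object O_i on which the fault was detected.
chains_for_link_down = {
    "router_A": {("router_A", "ip_unreachable"), ("host_1", "tcp_timeout")},
    "router_B": {("router_B", "ip_unreachable"), ("host_2", "tcp_timeout")},
    "router_C": {("router_C", "ip_unreachable"), ("router_C", "fan_alarm")},
}
l_j = fault_chain(chains_for_link_down)
```

The upper-layer faults follow the link fault regardless of which object reports it, so they stay on the chain, whereas the unrelated fan alarm seen on one object is dropped.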
Fig. 3 is a schematic structural diagram of a fault locating device according to one embodiment of the invention, comprising: a receiver module (30), a fault metadata library establishing module (31), a first establishing module (32), a second establishing module (33) and a locating module (34).
The receiver module is configured to obtain current fault information, which comprises at least the monitored object, the fault type and time information.
The first establishing module is configured to establish, according to the obtained current fault information, the conduction chain set of all monitored objects for the different fault types within the predetermined time window at different time points, and to output it to the second establishing module.
The first establishing module is further configured to judge whether the current fault information is present in the historical fault information; when it is, to obtain the conduction chain of the monitored object for the current fault type within the predetermined time window at the current time point, to establish, according to the fault information, the conduction chain set of the current monitored object for the current fault type within the predetermined time window at different time points, and to send a first notice to the second establishing module.
Preferably, the first establishing module is further configured to send a second notice to the second establishing module when it judges that the obtained current fault information is not present in the historical fault information.
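The behaviour of the first establishing module can be sketched as follows. The `Fault` record, the helper names and the 30-second window are assumptions; the returned flag stands in for the choice between the first notice (fault known from history) and the second notice (fault not in history).

```python
from typing import NamedTuple

class Fault(NamedTuple):
    obj: str         # monitored object
    fault_type: str
    time: float      # seconds since an arbitrary reference point

def key(fault):
    return (fault.obj, fault.fault_type)

def chain_in_window(trigger, records, window):
    """Other faults that start within `window` seconds after `trigger`."""
    return {key(r) for r in records
            if 0 <= r.time - trigger.time <= window and key(r) != key(trigger)}

def build_chain_set(current, live, history, window):
    """Conduction chains for the current fault at the current and past time points."""
    past = [h for h in history if key(h) == key(current)]
    chains = [chain_in_window(current, live, window)]
    chains += [chain_in_window(p, history, window) for p in past]
    return chains, bool(past)  # True: first notice, False: second notice

history = [
    Fault("T1", "interrupt", 0), Fault("S1", "out_of_service", 5),
    Fault("T1", "interrupt", 1000), Fault("S1", "out_of_service", 1004),
    Fault("X7", "fan_alarm", 1010),
]
live = [Fault("T1", "interrupt", 5000), Fault("S1", "out_of_service", 5003)]
chains, known = build_chain_set(live[0], live, history, window=30)
```

When the fault has no match in the history, the set contains only the chain from the current window, which corresponds to the second-notice case.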
The second establishing module is configured to analyze the correlation between the conduction chains in the conduction chain set established by the first establishing module, to obtain the fault object conduction chains of all monitored objects for all fault types, and to output them to the locating module.
Further, the second establishing module is specifically configured to: receive the first notice from the first establishing module, obtain the number of times each monitored object fails in the conduction chain set, calculate the ratio of the number of times each monitored object fails to the total number of times all monitored objects fail, and take the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.
Preferably, the second establishing module is further configured to: receive the second notice from the first establishing module, obtain the number of times each monitored object fails in the current conduction chain set, calculate the ratio of the number of times each monitored object fails to the total number of times all monitored objects fail in the current conduction chain, and take the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.
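The ratio and threshold test performed by the second establishing module can be sketched as follows. The function name `fault_object_chain`, the sample chain set and the threshold value 0.5 are assumptions; the text only calls the threshold "predetermined".

```python
from collections import Counter

def fault_object_chain(chain_set, threshold):
    """Monitored objects whose share of all recorded failures exceeds `threshold`."""
    counts = Counter(obj for chain in chain_set for obj in chain)
    total = sum(counts.values())
    if total == 0:
        return []
    return sorted(obj for obj, n in counts.items() if n / total > threshold)

# Hypothetical conduction chain set: objects that failed in each time window.
chain_set = [["S1"], ["S1", "X7"], ["S1"], ["S1", "Y2"]]
chain = fault_object_chain(chain_set, threshold=0.5)
```

Because the ratio is a share of all failures in the set, it shrinks as more objects appear on a chain, so the threshold would need to be chosen with the expected chain length in mind.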
The locating module is configured to locate the fault object and fault type according to the fault object conduction chain from the second establishing module.
Further, the locating module is also configured to:
obtain, according to the fault object conduction chain, the fault conduction chain for the different monitored objects, and locate the fault object and fault type according to the obtained fault conduction chain; or obtain, according to the fault object conduction chain, the object conduction chain for the different fault types, and locate the fault object and fault type according to the obtained object conduction chain.
Finally, the above device also comprises a fault metadata establishing module, configured to establish a fault metadata library according to the obtained fault information and to pass the fault metadata library information to the first processing module.
The above are only preferred embodiments of the invention and are not intended to limit its scope of protection; for a person skilled in the art, the invention may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the invention shall fall within its scope of protection.
Claims (12)
1. A method for realizing fault location, characterized by comprising: obtaining current fault information, the current fault information comprising at least a monitored object, a fault type and time information;
establishing, according to the obtained current fault information, the conduction chain set of all monitored objects for different fault types within a predetermined time window at different time points;
analyzing the correlation between the conduction chains in the established conduction chain set to obtain the fault object conduction chains of all monitored objects for different fault types;
locating the current fault object and fault type according to the obtained fault object conduction chain.
2. The method according to claim 1, characterized in that, before the obtaining of current fault information, the method further comprises: establishing a fault metadata library according to obtained historical fault information.
3. The method according to claim 2, characterized in that, before the establishing of the conduction chain set, the method further comprises: judging whether the current fault information is present in the historical fault information;
when the current fault information is judged to be present in the historical fault information, the establishing of the conduction chain set of all monitored objects for different fault types within a predetermined time window at different time points comprises:
obtaining the conduction chain of the monitored object for the current fault type within the predetermined time window at the current time point;
establishing, according to the historical fault information, the conduction chain set of the current monitored object for the current fault type within the predetermined time window at different time points.
4. The method according to claim 3, characterized in that the analyzing of the correlation between the conduction chains in the conduction chain set to obtain the fault object conduction chains of all monitored objects for different fault types comprises:
obtaining, for each monitored object in the conduction chain set, the number of times each kind of fault occurs; calculating the ratio of the number of times this fault occurs on each monitored object to the total number of times all monitored objects fail; and taking the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.
5. The method according to claim 1 or 2, characterized in that, when the current fault information is judged not to be present in the historical fault information,
the analyzing of the correlation between the conduction chains in the conduction chain set to obtain the fault object conduction chains of all monitored objects for different fault types comprises:
obtaining, for each monitored object in the current conduction chain set, the number of times each kind of fault occurs; calculating the ratio of the number of times this fault occurs on each monitored object to the total number of times all monitored objects fail in the current conduction chain; and taking the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.
6. The method according to claim 1 or 2, characterized in that, after the obtaining of the fault object conduction chains of all monitored objects for different fault types, the method further comprises:
obtaining, according to the fault object conduction chain, the fault conduction chain for different monitored objects, and locating the fault object and fault type according to the fault conduction chain of the different monitored objects; or,
obtaining, according to the fault object conduction chain, the object conduction chain for different fault types, and locating the fault object and fault type according to the object conduction chain of the different fault types.
7. A device for realizing fault location, characterized by comprising:
a receiver module, configured to obtain current fault information, the current fault information comprising at least a monitored object, a fault type and time information;
a first establishing module, configured to establish, according to the obtained current fault information, the conduction chain set of all monitored objects for different fault types within a predetermined time window at different time points, and to output it to a second establishing module;
the second establishing module, configured to analyze the correlation between the conduction chains in the conduction chain set established by the first establishing module, to obtain the fault object conduction chains of all monitored objects for all fault types, and to output them to a locating module;
the locating module, configured to locate the fault object and fault type according to the fault object conduction chain from the second establishing module.
8. The device according to claim 7, characterized in that the device further comprises a fault metadata establishing module, configured to establish a fault metadata library according to the obtained fault information and to pass the fault metadata library information to the first processing module.
9. The device according to claim 8, characterized in that the first establishing module is further configured to judge whether the current fault information is present in the historical fault information;
when the current fault information is judged to be present in the historical fault information, to obtain the conduction chain of the monitored object for the current fault type within the predetermined time window at the current time point; to establish, according to the fault information, the conduction chain set of the current monitored object for the current fault type within the predetermined time window at different time points; and to send a first notice to the second establishing module.
10. The device according to claim 9, characterized in that the second establishing module is specifically configured to:
receive the first notice from the first establishing module, obtain the number of times each monitored object fails in the conduction chain set, calculate the ratio of the number of times each monitored object fails to the total number of times all monitored objects fail, and take the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.
11. The device according to claim 7 or 8, characterized in that the first establishing module is further configured to send a second notice to the second establishing module when it judges that the obtained current fault information is not present in the historical fault information;
the second establishing module is further configured to receive the second notice from the first establishing module, obtain the number of times each monitored object fails in the current conduction chain set, calculate the ratio of the number of times each monitored object fails to the total number of times all monitored objects fail in the current conduction chain, and take the list of monitored objects whose ratio is greater than a predetermined threshold as the fault object conduction chain.
12. The device according to claim 7 or 8, characterized in that the locating module is further configured to:
obtain, according to the fault object conduction chain, the fault conduction chain for different monitored objects, and locate the fault object and fault type according to the obtained fault conduction chain of the different monitored objects;
or obtain, according to the fault object conduction chain, the object conduction chain for different fault types, and locate the fault object and fault type according to the object conduction chain of the different fault types.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310711392.8A CN104734871A (en) | 2013-12-20 | 2013-12-20 | Method and device for positioning failures |
CN201480057055.4A CN105659528B (en) | 2013-12-20 | 2014-09-24 | A kind of method and device for realizing fault location |
PCT/CN2014/087332 WO2015090098A1 (en) | 2013-12-20 | 2014-09-24 | Method and apparatus for realizing fault location |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310711392.8A CN104734871A (en) | 2013-12-20 | 2013-12-20 | Method and device for positioning failures |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104734871A true CN104734871A (en) | 2015-06-24 |
Family
ID=53402074
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310711392.8A Withdrawn CN104734871A (en) | 2013-12-20 | 2013-12-20 | Method and device for positioning failures |
CN201480057055.4A Active CN105659528B (en) | 2013-12-20 | 2014-09-24 | A kind of method and device for realizing fault location |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201480057055.4A Active CN105659528B (en) | 2013-12-20 | 2014-09-24 | A kind of method and device for realizing fault location |
Country Status (2)
Country | Link |
---|---|
CN (2) | CN104734871A (en) |
WO (1) | WO2015090098A1 (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106294076A (en) * | 2016-08-24 | 2017-01-04 | 浪潮(北京)电子信息产业有限公司 | A kind of server relevant fault Forecasting Methodology and system thereof |
WO2018010176A1 (en) * | 2016-07-15 | 2018-01-18 | 华为技术有限公司 | Method and device for acquiring fault information |
CN107690676A (en) * | 2017-07-04 | 2018-02-13 | 深圳怡化电脑股份有限公司 | Financial self-service equipment maintenance distribute leaflets generation method, handheld terminal and electronic equipment |
CN108229613A (en) * | 2017-12-30 | 2018-06-29 | 武汉凌科通光电科技有限公司 | Opto-electronic device Fault Locating Method and system |
CN108351814A (en) * | 2015-10-27 | 2018-07-31 | 甲骨文国际公司 | For the system and method to supporting packet to be prioritized |
CN108880838A (en) * | 2017-05-10 | 2018-11-23 | 阿里巴巴集团控股有限公司 | Monitoring method and device, the computer equipment and readable medium of traffic failure |
CN109936470A (en) * | 2017-12-18 | 2019-06-25 | 中国电子科技集团公司第十五研究所 | A kind of method for detecting abnormality |
CN110611604A (en) * | 2019-09-19 | 2019-12-24 | 国家电网有限公司 | Local area network equipment evaluation processing method and device |
CN110635960A (en) * | 2019-11-11 | 2019-12-31 | 国家电网有限公司 | Upgrading method and device of communication equipment |
CN111143101A (en) * | 2019-12-12 | 2020-05-12 | 东软集团股份有限公司 | Method and device for determining fault source, storage medium and electronic equipment |
CN111739188A (en) * | 2019-10-11 | 2020-10-02 | 北京京东尚科信息技术有限公司 | AGV fault growth rate determination method and apparatus |
CN113839804A (en) * | 2020-06-24 | 2021-12-24 | 华为技术有限公司 | Network fault determination method and network equipment |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108306747B (en) * | 2017-01-11 | 2021-07-23 | 阿里巴巴集团控股有限公司 | Cloud security detection method and device and electronic equipment |
CN111327443B (en) * | 2018-12-17 | 2022-11-22 | 中国移动通信集团北京有限公司 | Fault root index determination method and device |
CN115988551B (en) * | 2022-12-19 | 2023-09-08 | 南京濠暻通讯科技有限公司 | O-RAN wireless unit fault management method based on ZYNQ |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101252477A (en) * | 2008-03-27 | 2008-08-27 | 杭州华三通信技术有限公司 | Determining method and analyzing apparatus of network fault root |
CN101442762A (en) * | 2008-12-29 | 2009-05-27 | 中国移动通信集团北京有限公司 | Method and apparatus for analyzing network performance and locating network fault |
CN101854277A (en) * | 2010-06-12 | 2010-10-06 | 河北全通通信有限公司 | Method for monitoring mobile communication operation analysis system |
CN102158360A (en) * | 2011-04-01 | 2011-08-17 | 华中科技大学 | Network fault self-diagnosis method based on causal relationship positioning of time factors |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100375435C (en) * | 2004-06-22 | 2008-03-12 | 中兴通讯股份有限公司 | Alarm correlation analysis of light synchronous transmitting net |
US8156377B2 (en) * | 2010-07-02 | 2012-04-10 | Oracle International Corporation | Method and apparatus for determining ranked causal paths for faults in a complex multi-host system with probabilistic inference in a time series |
CN103001811B (en) * | 2012-12-31 | 2016-01-06 | 北京启明星辰信息技术股份有限公司 | Fault locating method and device |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108351814B (en) * | 2015-10-27 | 2021-08-17 | 甲骨文国际公司 | System and method for prioritizing support packets |
CN108351814A (en) * | 2015-10-27 | 2018-07-31 | 甲骨文国际公司 | For the system and method to supporting packet to be prioritized |
WO2018010176A1 (en) * | 2016-07-15 | 2018-01-18 | 华为技术有限公司 | Method and device for acquiring fault information |
CN106294076A (en) * | 2016-08-24 | 2017-01-04 | 浪潮(北京)电子信息产业有限公司 | A kind of server relevant fault Forecasting Methodology and system thereof |
CN106294076B (en) * | 2016-08-24 | 2019-03-15 | 浪潮(北京)电子信息产业有限公司 | A kind of server relevant fault prediction technique and its system |
CN108880838A (en) * | 2017-05-10 | 2018-11-23 | 阿里巴巴集团控股有限公司 | Monitoring method and device, the computer equipment and readable medium of traffic failure |
CN108880838B (en) * | 2017-05-10 | 2021-11-09 | 阿里巴巴集团控股有限公司 | Service fault monitoring method and device, computer equipment and readable medium |
CN107690676A (en) * | 2017-07-04 | 2018-02-13 | 深圳怡化电脑股份有限公司 | Financial self-service equipment maintenance distribute leaflets generation method, handheld terminal and electronic equipment |
CN109936470A (en) * | 2017-12-18 | 2019-06-25 | 中国电子科技集团公司第十五研究所 | A kind of method for detecting abnormality |
CN108229613A (en) * | 2017-12-30 | 2018-06-29 | 武汉凌科通光电科技有限公司 | Opto-electronic device Fault Locating Method and system |
CN110611604A (en) * | 2019-09-19 | 2019-12-24 | 国家电网有限公司 | Local area network equipment evaluation processing method and device |
CN111739188A (en) * | 2019-10-11 | 2020-10-02 | 北京京东尚科信息技术有限公司 | AGV fault growth rate determination method and apparatus |
CN110635960A (en) * | 2019-11-11 | 2019-12-31 | 国家电网有限公司 | Upgrading method and device of communication equipment |
CN111143101A (en) * | 2019-12-12 | 2020-05-12 | 东软集团股份有限公司 | Method and device for determining fault source, storage medium and electronic equipment |
CN111143101B (en) * | 2019-12-12 | 2023-07-07 | 东软集团股份有限公司 | Method, device, storage medium and electronic equipment for determining fault source |
CN113839804A (en) * | 2020-06-24 | 2021-12-24 | 华为技术有限公司 | Network fault determination method and network equipment |
Also Published As
Publication number | Publication date |
---|---|
WO2015090098A1 (en) | 2015-06-25 |
CN105659528B (en) | 2019-10-08 |
CN105659528A (en) | 2016-06-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication |
Application publication date: 20150624 |