CN104243192B - Fault handling method and system - Google Patents
- Publication number
- CN104243192B (application CN201310237951.6A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention discloses a fault handling method and system and relates to the technical field of fault analysis. In the fault association method of the embodiments, the working condition of the equipment is monitored in real time. When a primary fault A is detected, the time window T before and the time window T after the occurrence time of A are searched for an occurrence of fault B; if one is found, an association is established between the two faults. Establishing fault associations lets operation and maintenance personnel handle associated faults together, which improves fault handling efficiency. Further, the embodiments establish an in-memory cache queue in which faults are cached in queue form, so that determining a fault association only requires querying the cache queue to find out quickly whether an associated fault has occurred. This avoids analyzing a large sample and further improves fault handling efficiency.
Description
Technical field
The present invention relates to the technical field of fault analysis, and in particular to a fault handling method and system.
Background
In routine equipment maintenance, monitoring is typically carried out by monitoring personnel: when a fault is found, it is submitted to maintenance personnel, who investigate and handle it so that normal operation is restored in time.
However, with this approach the reported faults reach the maintenance personnel in no particular order and without any discernible pattern, so investigating and handling them is inefficient. An effective fault handling solution is therefore urgently needed to improve fault handling efficiency.
Summary of the invention
In view of the above problems, embodiments of the present invention provide a fault handling method and system that report faults in an orderly manner and thereby enable fast, efficient fault handling.
The embodiments of the present invention adopt the following technical solutions:
One embodiment of the present invention provides a fault handling method, the method comprising:
when a first fault is detected, searching the predetermined time window before and after the first fault for an occurrence of a second fault;
if an occurrence of the second fault is found, taking the first fault and the second fault as associated faults and reporting the associated faults; if no occurrence of the second fault is found, reporting the first fault;
for a reported fault, if it is an associated fault, handling it in a merged manner; if it is the first fault, handling it on its own.
The method further comprises:
establishing a relation template for recording the associations between faults; and
establishing an in-memory cache queue and, if a fault occurs during monitoring, caching the fault in memory in queue form;
in which case the step of searching the predetermined time window before and after the first fault for an occurrence of a second fault specifically comprises:
when the first fault is detected, querying the relation template to determine whether the first fault has an associated fault, and if it has none, reporting the first fault;
if it has an associated fault, querying the in-memory cache queue for an associated second fault within the predetermined time window before the first fault occurred, and continuing to monitor for an occurrence of the second fault within the predetermined time window after the first fault occurred.
The method further comprises: monitoring and managing the in-memory cache queue, and removing faults that occurred earlier than the predetermined time window before the fault currently being handled;
in which case the step of searching the predetermined time window before and after the first fault for an occurrence of a second fault specifically comprises:
when the first fault is detected, querying the relation template to determine whether the first fault has an associated fault, and if it has none, reporting the first fault;
if it has an associated fault, querying the in-memory cache queue for an associated second fault, and continuing to monitor the in-memory cache queue for an occurrence of the second fault within the predetermined time window after the first fault occurred.
If an associated fault exists, the method further comprises:
establishing, according to the fault associations in the relation template, a group in the in-memory cache queue that is identified by a unique network element identifier and an association rule ID, so that when a fault is detected, the detected fault is cached in the group identified by its corresponding unique network element identifier and association rule ID.
There may be one or more second faults.
The first fault and the second fault that are associated with each other are related as follows:
the first fault is a primary fault and the second fault is a secondary fault; or
the first fault is a secondary fault and the second fault is a primary fault.
An embodiment of the present invention also provides a fault handling system, the system comprising:
an associated fault lookup module, configured to search, when a first fault is detected, the predetermined time window before and after the first fault for an occurrence of a second fault;
a reporting module, configured to take the first fault and the second fault as associated faults and report the associated faults if the associated fault lookup module finds an occurrence of the second fault, and to report the first fault if the associated fault lookup module finds no occurrence of the second fault;
a processing module, configured to handle a reported fault in a merged manner if it is an associated fault, and to handle it on its own if it is the first fault.
The system further comprises:
a relation template module, configured to establish a relation template that records the associations between faults;
a cache module, configured to establish an in-memory cache queue and, if a fault occurs during monitoring, to cache the fault in memory in queue form;
in which case the associated fault lookup module specifically comprises:
a fault type judging unit, configured to query the relation template when the first fault is detected and determine whether the first fault has an associated fault;
a searching unit, configured, when the fault type judging unit determines that an associated fault exists, to query the in-memory cache queue for an associated second fault within the predetermined time window before the first fault occurred, and to continue monitoring for an occurrence of the second fault within the predetermined time window after the first fault occurred;
a report trigger unit, configured to trigger the reporting module to report the first fault when the fault type judging unit determines that no associated fault exists, and to trigger the reporting module to report faults according to the lookup result of the searching unit.
The system further comprises:
an in-memory cache queue management module, configured to monitor and manage the in-memory cache queue and remove faults that occurred earlier than the predetermined time window before the fault currently being handled;
in which case the associated fault lookup module specifically comprises:
a fault type judging unit, configured to query the relation template when the first fault is detected and determine whether the first fault has an associated fault;
a searching unit, configured, when the fault type judging unit determines that an associated fault exists, to query the in-memory cache queue for an associated second fault, and to continue monitoring the in-memory cache queue for an occurrence of the second fault within the predetermined time window after the first fault occurred;
a report trigger unit, configured to trigger the reporting module to report the first fault when the fault type judging unit determines that no associated fault exists, and to trigger the reporting module to report faults according to the lookup result of the searching unit.
The cache module further comprises:
a grouping unit, configured, when an associated fault exists, to establish, according to the fault associations in the relation template, a group in the in-memory cache queue that is identified by a unique network element identifier and an association rule ID, so that when a fault is detected, the detected fault is cached in the group identified by its corresponding unique network element identifier and association rule ID.
There may be one or more second faults.
The first fault and the second fault that are associated with each other are related as follows:
the first fault is a primary fault and the second fault is a secondary fault; or
the first fault is a secondary fault and the second fault is a primary fault.
In the fault handling method and system provided by the embodiments of the present invention, the working condition of the equipment is monitored in real time. When a primary fault A is detected, the time window T before and the time window T after the occurrence time of A are searched for an occurrence of fault B; if one is found, an association is established. Establishing fault associations lets operation and maintenance personnel handle associated faults together, which improves fault handling efficiency.
Further, the embodiments establish an in-memory cache queue in which faults are cached in queue form, so that determining a fault association only requires querying the cache queue to find out quickly whether an associated fault has occurred. This avoids analyzing a large sample and further improves fault handling efficiency.
Brief description of the drawings
Fig. 1 is a flowchart of a fault handling method provided by one embodiment of the present invention;
Fig. 2 is a flowchart of a fault handling method provided by another embodiment of the present invention;
Fig. 3 is a flowchart of a specific example of the fault handling method provided by an embodiment of the present invention;
Fig. 4 is a block diagram of a fault handling system provided by one embodiment of the present invention.
Detailed description
To make the objects, technical solutions and advantages of the present invention clearer, embodiments of the invention are described in further detail below with reference to the accompanying drawings.
In routine equipment maintenance, monitoring personnel summarize the patterns in which faults occur through continuous observation and analysis. Generally, if several faults usually occur together, they are said to influence one another and are called associated faults. For example, if fault B usually also occurs within the 10 minutes before or after fault A occurs, alarm A and alarm B are considered to have an influence relationship. Depending on the application scenario, associated faults stand in a primary-secondary relationship; in the association above, for example, A is the primary fault and B is the secondary fault.
In this fault association method, the working condition of the equipment is monitored in real time. When primary fault A is detected, the time window T before and the time window T after the occurrence time of A are searched for an occurrence of fault B; if one is found, an association is established. Establishing fault associations lets maintenance personnel handle associated faults together, which improves fault handling efficiency.
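The window test described above can be illustrated with a minimal Python sketch. The function name `within_window`, the 10-minute value of T and the inclusive boundary are assumptions made for illustration; the patent only states that fault B is searched for within time window T before and after fault A.

```python
from datetime import datetime, timedelta

# Assumed window length T; the text gives 10 minutes as one example.
WINDOW = timedelta(minutes=10)

def within_window(primary_time, other_time, window=WINDOW):
    """Return True if other_time lies in [primary_time - T, primary_time + T].

    Treating the boundary as inclusive is an assumption of this sketch.
    """
    return abs(other_time - primary_time) <= window

a_time = datetime(2013, 6, 17, 12, 0)     # primary fault A
b_before = datetime(2013, 6, 17, 11, 53)  # fault B, 7 minutes before A
b_late = datetime(2013, 6, 17, 12, 15)    # fault B, 15 minutes after A

print(within_window(a_time, b_before))
print(within_window(a_time, b_late))
```

Only the first fault B falls inside the window, so only it would be associated with A under these assumptions.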
Specifically, referring to Fig. 1, a fault handling method provided by an embodiment of the present invention comprises the following steps:
S101: Monitor for faults.
S102: When a first fault is detected, search the predetermined time window before and after the first fault for an occurrence of a second fault.
The length of the predetermined time window can be set differently for different application scenarios. For example, when maintaining communication equipment in the telecommunications industry, the length of the predetermined time window can be set to 10 minutes.
S103: If an occurrence of the second fault is found, take the first fault and the second fault as associated faults and report the associated faults; if no occurrence of the second fault is found, report the first fault.
The first fault and the second fault are associated faults, that is, the two normally occur together. In practice, if faults are analyzed for association before they are reported and are then reported together, maintenance personnel can handle the associated faults in a merged manner, which greatly improves fault handling efficiency.
It should be noted that there may be one or more second faults. That is, if the first fault is fault A, the second fault may be fault B alone, or faults B, C and D, and so on; no limitation is imposed here.
It should further be noted that the first fault and the second fault that are associated with each other may be related as follows:
the first fault is a primary fault and the second fault is a secondary fault; or the first fault is a secondary fault and the second fault is a primary fault. For example, if a fault of a certain base station is the primary fault, a downlink signal transmission fault can be set as the secondary fault.
S104: For a reported fault, if it is an associated fault, handle it in a merged manner; if it is the first fault, handle it on its own.
In this embodiment, the time T before primary fault A occurred (that is, the preceding time window T) is searched for secondary fault B, and an association is established with any secondary fault B that is found. Secondary faults B then continue to be received and associated. Once time window T has elapsed after primary fault A, primary fault A is no longer associated with any secondary fault B.
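Steps S102 and S103 can be sketched as follows. This is an illustrative sketch rather than the patented implementation: the `Fault` class and `report_for` function are hypothetical names, and the sketch assumes that every fault in the window is already known, whereas the method described above keeps monitoring during the window after the first fault.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class Fault:
    name: str
    time: datetime

def report_for(first, observed, secondary_names, window):
    """Decide what to report for `first` (S103).

    Returns ("associated", [first, second, ...]) if any candidate second
    fault lies within `window` before or after `first`, otherwise
    ("single", [first]).
    """
    seconds = [f for f in observed
               if f.name in secondary_names
               and abs(f.time - first.time) <= window]
    if seconds:
        return ("associated", [first] + seconds)
    return ("single", [first])

t0 = datetime(2013, 6, 17, 12, 0)
fault_a = Fault("A", t0)
fault_b = Fault("B", t0 - timedelta(minutes=4))   # inside the window
fault_c = Fault("C", t0 + timedelta(minutes=30))  # outside the window

kind, faults = report_for(fault_a, [fault_b, fault_c], {"B", "C"},
                          timedelta(minutes=10))
print(kind, [f.name for f in faults])
```

Because the second fault may be one or several, the sketch returns every match in the window rather than stopping at the first one.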
It can be seen that, in the fault handling method and system provided by the embodiments of the present invention, the working condition of the equipment is monitored in real time. When primary fault A is detected, the time window T before and the time window T after the occurrence time of A are searched for an occurrence of fault B; if one is found, an association is established. Establishing fault associations lets operation and maintenance personnel handle associated faults together, which improves fault handling efficiency.
Preferably, referring to Fig. 2, another embodiment of the present invention provides another fault handling method. This embodiment additionally establishes an in-memory cache queue in which faults are cached in queue form, so that determining a fault association only requires querying the cache queue to find out quickly whether an associated fault has occurred. This avoids analyzing big-sample data and can further improve fault handling efficiency.
The specific steps are as follows:
S201: Establish a relation template for recording the associations between faults.
S202: Establish an in-memory cache queue; if a fault occurs during monitoring, cache the fault in memory in queue form.
In practice, when fault B is received, its timeout time is calculated (that is, the fault's occurrence time plus the time window of T minutes), and the fault is stored in the in-memory cache in queue form. If the fault currently being handled is fault A, the in-memory cache queue is checked for a cached fault B within the T minutes before fault A. Adding the in-memory cache queue thus removes the step of analyzing big-sample data; only the in-memory cache queue needs to be queried.
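A minimal sketch of the cached entry and the backward query might look like this. The field names, the use of `collections.deque` and the helper names `cache_fault` and `cached_before` are assumptions; the text only specifies that the timeout time is the occurrence time plus T minutes and that faults are cached in memory in queue form.

```python
from collections import deque
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)  # assumed value of T
cache = deque()                 # faults are appended in arrival order

def cache_fault(name, occurred_at):
    """Cache a fault together with its timeout time (occurrence time + T)."""
    cache.append({
        "name": name,
        "occurred_at": occurred_at,
        "expires_at": occurred_at + WINDOW,
    })

def cached_before(name, first_time):
    """Return cached faults called `name` within T before `first_time`."""
    return [entry for entry in cache
            if entry["name"] == name
            and first_time - WINDOW <= entry["occurred_at"] <= first_time]

t0 = datetime(2013, 6, 17, 12, 0)
cache_fault("B", t0 - timedelta(minutes=25))  # too old to match
cache_fault("B", t0 - timedelta(minutes=3))   # inside the window before A
print(len(cached_before("B", t0)))
```

The query touches only the in-memory queue, which is the point of the cache: no historical fault store has to be scanned.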
S203: When a first fault is detected, query the relation template to determine whether the first fault has an associated fault. If it has none, perform step S204; if it has one, perform step S205.
S204: Report the first fault, then perform step S208.
S205: Query the in-memory cache queue for an associated second fault within the predetermined time window before the first fault occurred, and continue monitoring for an occurrence of the second fault within the predetermined time window after the first fault occurred. If an occurrence of the second fault is found, perform step S206; if not, perform step S204.
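The relation template lookup in S203 and the branch to S204 or S205 can be sketched as below. The dictionary layout, the rule ID `RULE-1` and the fault type names are hypothetical; the patent does not specify how the relation template is stored. The base station example is taken from the description.

```python
# Hypothetical relation template: association rule ID -> rule definition.
RELATION_TEMPLATE = {
    "RULE-1": {
        "primary": "BASE_STATION_DOWN",
        "secondary": "DOWNLINK_SEND_FAIL",
        "window_minutes": 10,
    },
}

def partners_of(fault_type, template=RELATION_TEMPLATE):
    """Return [(rule_id, partner_type)] for rules that involve fault_type."""
    result = []
    for rule_id, rule in template.items():
        if fault_type == rule["primary"]:
            result.append((rule_id, rule["secondary"]))
        elif fault_type == rule["secondary"]:
            result.append((rule_id, rule["primary"]))
    return result

def next_step(fault_type):
    """S203: go to S205 if an associated fault is defined, else to S204."""
    return "S205" if partners_of(fault_type) else "S204"

print(next_step("BASE_STATION_DOWN"))  # has a partner in RULE-1
print(next_step("FAN_FAILURE"))        # not covered by any rule
```

Looking up both roles reflects the statement that the first fault may be either the primary or the secondary fault of a rule.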
Preferably, this embodiment further comprises the following step: monitor and manage the in-memory cache queue, removing faults that occurred earlier than the predetermined time window before the fault currently being handled.
For this step, when fault B is received, its timeout time is calculated (that is, the fault's occurrence time plus the time window of T minutes), and the fault is stored in the in-memory cache in queue form. If the fault currently being handled is fault A, the in-memory cache queue is checked for a cached fault B preceding fault A (the queue holds only faults from the T minutes before the fault currently being handled, here fault A), and within the predetermined time window after the first fault (fault A) occurs, the in-memory cache queue continues to be monitored for an occurrence of the second fault (fault B). Adding the in-memory cache queue thus removes the step of analyzing big-sample data; only the in-memory cache queue needs to be queried.
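The queue management step can be sketched as a front-of-queue eviction. This sketch assumes that faults are appended in order of occurrence, so expired entries are always at the front; timestamps are plain numbers of minutes to keep the example short, and `evict_expired` is a hypothetical name.

```python
from collections import deque

def evict_expired(queue, current_time, window):
    """Drop cached faults that occurred more than `window` before current_time.

    Entries are (fault_name, occurred_at) tuples in arrival order.
    Returns the number of entries removed.
    """
    removed = 0
    while queue and queue[0][1] < current_time - window:
        queue.popleft()
        removed += 1
    return removed

queue = deque([("B", 0), ("C", 5), ("B", 18)])
removed = evict_expired(queue, current_time=20, window=10)
print(removed, list(queue))
```

After eviction, every entry left in the queue lies within T of the fault being handled, which is why the later lookup needs no time comparison of its own in this embodiment.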
Further, if an associated fault exists, the method of this embodiment also comprises:
establishing, according to the fault associations in the relation template, a group in the in-memory cache queue that is identified by "unique network element identifier + association rule ID", so that when a fault is detected, the detected fault is cached in the group identified by its corresponding unique network element identifier and association rule ID.
Here, the unique network element identifier uniquely identifies a particular device in the network.
The association rule ID identifies an association rule, for example the rule that fault A is the primary fault and fault B is the secondary fault.
Accordingly, the step of querying the in-memory cache queue for an associated fault is carried out specifically within the group of the in-memory cache queue identified by "unique network element identifier + association rule ID". This further reduces the data sample to be processed and further improves fault handling efficiency.
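Grouping by "unique network element identifier + association rule ID" can be sketched with a dictionary of queues. The key layout and the helper names `cache_in_group` and `lookup` are assumptions for illustration; timestamps are again plain minutes.

```python
from collections import defaultdict, deque

# (ne_id, rule_id) -> queue of (fault_name, occurred_at) entries.
groups = defaultdict(deque)

def cache_in_group(ne_id, rule_id, fault_name, occurred_at):
    """Cache a fault in the group of its network element and rule."""
    groups[(ne_id, rule_id)].append((fault_name, occurred_at))

def lookup(ne_id, rule_id, fault_name):
    """Return the occurrence times of fault_name cached in one group only."""
    # .get() avoids creating an empty group as a side effect of a lookup.
    return [t for name, t in groups.get((ne_id, rule_id), ())
            if name == fault_name]

cache_in_group("NE-1", "R1", "B", 3)
cache_in_group("NE-2", "R1", "B", 4)
print(lookup("NE-1", "R1", "B"))
```

A fault B on a different network element never matches fault A on `NE-1`, and the lookup scans only one small group instead of the whole cache.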
S206: Take the first fault and the second fault as associated faults and report the associated faults.
S207: Handle the associated faults in a merged manner; end.
S208: Handle the first fault.
The first fault and the second fault are associated faults, that is, the two normally occur together. In practice, if faults are analyzed for association before they are reported and are then reported together, maintenance personnel can handle the associated faults in a merged manner, which greatly improves fault handling efficiency.
It should be noted that there may be one or more second faults. That is, if the first fault is fault A, the second fault may be fault B alone, or faults B, C and D, and so on; no limitation is imposed here.
It should further be noted that the first fault and the second fault that are associated with each other may be related as follows:
the first fault is a primary fault and the second fault is a secondary fault; or the first fault is a secondary fault and the second fault is a primary fault. For example, if a fault of a certain base station is the primary fault, a downlink signal transmission fault can be set as the secondary fault.
In this embodiment, the in-memory cache queue is searched for secondary fault B in the time T before primary fault A occurred (that is, the preceding time window T), and an association is established with any secondary fault B that is found. Secondary faults B then continue to be received and associated. Once time window T has elapsed after primary fault A, primary fault A is no longer associated with any secondary fault B.
It can be seen that, in the fault handling method and system provided by the embodiments of the present invention, the working condition of the equipment is monitored in real time. When primary fault A is detected, the time window T before and the time window T after the occurrence time of A are searched for an occurrence of fault B; if one is found, an association is established. Establishing fault associations lets operation and maintenance personnel handle associated faults together, which improves fault handling efficiency.
Further, the embodiments establish an in-memory cache queue in which faults are cached in queue form, so that determining a fault association only requires querying the cache queue to find out quickly whether an associated fault has occurred. This avoids analyzing a large sample and further improves fault handling efficiency.
Referring to Fig. 3, a specific example of the fault handling method provided by an embodiment of the present invention is as follows. The overall idea: first, cache the faults that need to be associated; then group the faults according to fault location information; finally, when both the primary fault and the secondary fault of a faulty device have occurred, associate the faults. The implementation is outlined below; the specific sub-steps are shown in Fig. 3 and are not repeated here.
S301: Define the association rule.
For the same device, an occurrence of fault A is the primary alarm and an occurrence of fault B is the child alarm. The time window length is defined as T minutes.
S302: Receive active alarms.
i. Fault A (or fault B) is received: establish a group identified by "unique network element identifier + association rule ID", calculate the fault's timeout time (that is, the fault's occurrence time plus the time window of T minutes), and store the fault in the in-memory cache in queue form.
ii. Fault B (or fault A) is received: look up whether the group "unique network element identifier + association rule ID" contains data that has not timed out. If it does, and the two faults are each other's primary and secondary alarms, associate fault A with fault B.
S303: Discard on timeout.
Scan the queue and delete from the grouped queue any alarm that has exceeded time window T; it is no longer used for association.
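Steps S301 to S303 can be combined in one small sketch. The `Correlator` class, its method names and the use of plain minutes as timestamps are all assumptions for illustration, and the sketch handles a single association rule and assumes alarms arrive in time order.

```python
from collections import defaultdict, deque

class Correlator:
    """Sketch of S301-S303 for one association rule (hypothetical API)."""

    def __init__(self, rule_id, primary, secondary, window):
        # S301: define the association rule and the window length T.
        self.rule_id = rule_id
        self.primary = primary
        self.secondary = secondary
        self.window = window
        # (ne_id, rule_id) -> queue of (fault_type, occurred_at).
        self.groups = defaultdict(deque)

    def discard_timed_out(self, now):
        # S303: drop alarms whose timeout time (occurred_at + T) has passed.
        for key in list(self.groups):
            queue = self.groups[key]
            while queue and queue[0][1] + self.window < now:
                queue.popleft()
            if not queue:
                del self.groups[key]

    def receive(self, ne_id, fault_type, occurred_at):
        # S302: return the cached partner alarms this alarm associates with.
        if fault_type not in (self.primary, self.secondary):
            return []
        self.discard_timed_out(occurred_at)
        partner = self.secondary if fault_type == self.primary else self.primary
        key = (ne_id, self.rule_id)
        matches = [(ft, t) for ft, t in self.groups.get(key, ())
                   if ft == partner]
        self.groups[key].append((fault_type, occurred_at))
        return matches

correlator = Correlator("R1", primary="A", secondary="B", window=10)
print(correlator.receive("NE-1", "B", 0))  # nothing cached yet
print(correlator.receive("NE-1", "A", 7))  # B at minute 0 is still valid
```

Because each alarm is cached on arrival and matched against its cached partner, the backward search is a queue lookup and the forward search happens naturally when the partner arrives later.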
It can be seen that the beneficial effect of this example is that complex queries over big-data samples are greatly reduced and fault association is accelerated, which substantially improves fault handling efficiency.
Referring to Fig. 4, an embodiment of the present invention provides a fault handling system, comprising:
an associated fault lookup module 401, configured to search, when a first fault is detected, the predetermined time window before and after the first fault for an occurrence of a second fault;
a reporting module 402, configured to take the first fault and the second fault as associated faults and report the associated faults if the associated fault lookup module 401 finds an occurrence of the second fault, and to report the first fault if the associated fault lookup module 401 finds no occurrence of the second fault;
a processing module 403, configured to handle a reported fault in a merged manner if it is an associated fault, and to handle it on its own if it is the first fault.
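The hand-off from the reporting module to the processing module can be sketched as follows. The patent does not define what merged handling consists of; this sketch assumes, purely for illustration, that it means opening one work ticket for a whole group of associated faults. The function name `handle_reports` and the ticket layout are hypothetical.

```python
def handle_reports(reports):
    """Turn reports into work tickets, one ticket per report.

    Each report is (kind, faults), where kind is "associated" or "single".
    Associated faults share a single ticket, which is this sketch's
    assumed reading of merged handling.
    """
    tickets = []
    for kind, faults in reports:
        tickets.append({"faults": list(faults),
                        "merged": kind == "associated"})
    return tickets

reports = [
    ("associated", ["A", "B"]),  # first fault plus associated second fault
    ("single", ["C"]),           # first fault with no associated fault
]
for ticket in handle_reports(reports):
    print(ticket)
```

Under this reading, three faults produce two tickets instead of three, which is where the efficiency gain claimed for merged handling would come from.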
Further, the fault handling system provided by this embodiment also comprises:
a relation template module 404, configured to establish a relation template that records the associations between faults;
a cache module 405, configured to establish an in-memory cache queue and, if a fault occurs during monitoring, to cache the fault in memory in queue form.
In this case the associated fault lookup module 401 specifically comprises:
a fault type judging unit, configured to query the relation template when the first fault is detected and determine whether the first fault has an associated fault;
a searching unit, configured, when the fault type judging unit determines that an associated fault exists, to query the in-memory cache queue for an associated second fault within the predetermined time window before the first fault occurred, and to continue monitoring for an occurrence of the second fault within the predetermined time window after the first fault occurred;
and a report trigger unit, configured to trigger the reporting module to report the first fault when the fault type judging unit determines that no associated fault exists, and to trigger the reporting module to report faults according to the lookup result of the searching unit.
Further, the fault handling system provided by this embodiment also comprises:
an in-memory cache queue management module 406, configured to monitor and manage the in-memory cache queue and remove faults that occurred earlier than the predetermined time window before the fault currently being handled.
In this case the associated fault lookup module 401 specifically comprises:
a fault type judging unit, configured to query the relation template when the first fault is detected and determine whether the first fault has an associated fault;
a searching unit, configured, when the fault type judging unit determines that an associated fault exists, to query the in-memory cache queue for an associated second fault, and to continue monitoring the in-memory cache queue for an occurrence of the second fault within the predetermined time window after the first fault occurred;
and a report trigger unit, configured to trigger the reporting module to report the first fault when the fault type judging unit determines that no associated fault exists, and to trigger the reporting module to report faults according to the lookup result of the searching unit.
Preferably, the cache module further comprises:
a grouping unit, configured, when an associated fault exists, to establish, according to the fault associations in the relation template, a group in the in-memory cache queue that is identified by a unique network element identifier and an association rule ID, so that when a fault is detected, the detected fault is cached in the group identified by its corresponding unique network element identifier and association rule ID.
It should be noted that there may be one or more second faults. That is, if the first fault is fault A, the second fault may be fault B alone, or faults B, C and D, and so on; no limitation is imposed here.
It should further be noted that the first fault and the second fault that are associated with each other may be related as follows:
the first fault is a primary fault and the second fault is a secondary fault; or the first fault is a secondary fault and the second fault is a primary fault. For example, if a fault of a certain base station is the primary fault, a downlink signal transmission fault can be set as the secondary fault.
It should be noted that, for the working principle and processing of each module or unit in the system embodiment of the present invention, reference may be made to the related description of the method embodiments shown in Fig. 1, Fig. 2 and Fig. 3, which is not repeated here.
It can be seen that, in the fault handling method and system provided by the embodiments of the present invention, the working condition of the equipment is monitored in real time. When primary fault A is detected, the time window T before and the time window T after the occurrence time of A are searched for an occurrence of fault B; if one is found, an association is established. Establishing fault associations lets operation and maintenance personnel handle associated faults together, which improves fault handling efficiency.
Further, the embodiments establish an in-memory cache queue in which faults are cached in queue form, so that determining a fault association only requires querying the cache queue to find out quickly whether an associated fault has occurred. This avoids analyzing a large sample and further improves fault handling efficiency.
To describe the technical solutions of the embodiments clearly, the words "first" and "second" are used in the embodiments to distinguish identical or similar items whose functions and effects are essentially the same. Those skilled in the art will understand that these words limit neither quantity nor order of execution.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit its scope of protection. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention falls within its scope of protection.
Claims (9)
1. A fault handling method, characterized in that the method comprises:
establishing a relation template for recording the associations between faults; and
establishing an in-memory cache queue and, if a fault occurs during monitoring, caching the fault in memory in queue form;
when a first fault is detected, searching the predetermined time window before and after the first fault for an occurrence of a second fault;
if an occurrence of the second fault is found, taking the first fault and the second fault as associated faults and reporting the associated faults; if no occurrence of the second fault is found, reporting the first fault;
for a reported fault, if it is an associated fault, handling it in a merged manner; if it is the first fault, handling it on its own;
wherein the step of searching the predetermined time window before and after the first fault for an occurrence of a second fault specifically comprises:
when the first fault is detected, querying the relation template to determine whether the first fault has an associated fault, and if it has none, reporting the first fault;
if it has an associated fault, querying the in-memory cache queue for an associated second fault within the predetermined time window before the first fault occurred, and continuing to monitor for an occurrence of the second fault within the predetermined time window after the first fault occurred.
2. The fault handling method according to claim 1, characterized in that the method further comprises: monitoring and managing the in-memory cache queue, and removing faults that occurred earlier than the predetermined time window before the fault currently being processed;
in which case the step of, when occurrence of the first fault is detected, searching both forward and backward within the predetermined time window for a second fault specifically comprises:
when occurrence of the first fault is detected, querying the relation template to determine whether the first fault has an associated fault, and if no associated fault exists, reporting the first fault;
if an associated fault exists, querying the in-memory cache queue for an associated second fault, and, within the predetermined time window after the first fault occurred, continuing to monitor whether a second fault occurs in the in-memory cache queue.
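As a rough illustration outside the claims, the queue management of claim 2 can be pictured as evicting expired entries from the head of a queue before each lookup, so that a plain scan of the queue is already limited to the time window. The class and method names below (`FaultCache`, `add`, `evict`, `find`) are hypothetical, and the sketch assumes faults arrive in time order.

```python
from collections import deque


class FaultCache:
    """In-memory cache queue that keeps only faults inside the time window."""

    def __init__(self, window):
        self.window = window   # predetermined time window, in seconds
        self.queue = deque()   # (time, kind) tuples, oldest first

    def add(self, time, kind):
        """Cache a newly monitored fault (assumed to arrive in time order)."""
        self.queue.append((time, kind))

    def evict(self, current_time):
        """Remove faults older than the window before the current fault."""
        while self.queue and current_time - self.queue[0][0] > self.window:
            self.queue.popleft()

    def find(self, current_time, related_kinds):
        """Return cached faults whose type is associated with the current fault."""
        self.evict(current_time)
        return [(t, k) for (t, k) in self.queue if k in related_kinds]
```

Because eviction happens first, `find` needs no time comparison of its own, which matches the claim's simplification of the backward search to a plain query of the queue.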
3. The fault handling method according to claim 1 or 2, characterized in that, if an associated fault exists, the method further comprises:
according to the fault association relationships in the relation template, establishing in the in-memory cache queue a group identified by a network element unique identifier and an association rule ID, and then, when occurrence of a fault is detected, caching the currently detected fault in the group identified by its corresponding network element unique identifier and association rule ID.
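One possible reading of the grouping in claim 3, offered only as a sketch, keys the cache by the pair (network element identifier, association rule ID), so a lookup touches only the faults that could be associated under one rule on one network element. The rule table `RULES`, the class `GroupedCache`, and the fault type names are assumptions made for this example.

```python
from collections import defaultdict, deque

# Hypothetical relation template: rule ID -> fault types covered by the rule.
RULES = {
    "R1": {"LINK_DOWN", "PORT_ERROR"},
    "R2": {"POWER_LOSS", "FAN_FAIL"},
}


class GroupedCache:
    """Cache queue partitioned into groups keyed by (ne_id, rule_id)."""

    def __init__(self, rules):
        self.rules = rules
        self.groups = defaultdict(deque)  # (ne_id, rule_id) -> queue of faults

    def add(self, ne_id, kind, time):
        """Cache the fault in every group whose rule covers its type.

        Returns the list of group keys the fault was stored under.
        """
        keys = []
        for rule_id, kinds in self.rules.items():
            if kind in kinds:
                key = (ne_id, rule_id)
                self.groups[key].append((time, kind))
                keys.append(key)
        return keys

    def candidates(self, ne_id, rule_id):
        """Faults cached for one rule on one network element, oldest first."""
        return list(self.groups.get((ne_id, rule_id), ()))
```

A fault whose type appears in no rule is not cached at all in this sketch, which is one way to keep unrelated faults out of the association lookup.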
4. The fault handling method according to any one of claims 1-2, characterized in that there are one or more second faults.
5. The fault handling method according to any one of claims 1-2, characterized in that the relationship between a first fault and a second fault that are associated with each other is as follows:
the first fault is a primary fault and the second fault is a secondary fault; or
the first fault is a secondary fault and the second fault is a primary fault.
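Since either the primary or the secondary fault may be detected first, a relation template has to answer lookups in both directions. The following sketch, which is not taken from the patent, expands one-directional primary-to-secondary rules into such a lookup; `PRIMARY_TO_SECONDARY`, `build_template`, and the fault type names are hypothetical.

```python
# Hypothetical rules: primary fault type -> secondary fault types.
PRIMARY_TO_SECONDARY = {
    "LINK_DOWN": {"PORT_ERROR", "LOSS_OF_SIGNAL"},
}


def build_template(primary_to_secondary):
    """Expand primary->secondary rules into a lookup usable in both directions."""
    template = {}
    for primary, secondaries in primary_to_secondary.items():
        # A primary fault detected first looks for its secondary faults.
        template.setdefault(primary, set()).update(secondaries)
        for secondary in secondaries:
            # A secondary fault detected first looks for its primary fault.
            template.setdefault(secondary, set()).add(primary)
    return template
```

With this expansion, whichever fault arrives first can be used as the key when checking the template for associated faults.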
6. A fault handling system, characterized in that the system comprises:
a relation template module, configured to establish a relation template that records the association relationships between faults;
a cache module, configured to establish an in-memory cache queue and, if a fault occurs during monitoring, cache the fault in memory in the form of a queue;
an associated-fault search module, configured to, when occurrence of a first fault is detected, search both forward and backward within a predetermined time window for a second fault;
a reporting module, configured to: if the associated-fault search module finds that a second fault has occurred, treat the first fault and the second fault as associated faults and report the associated faults; and if the associated-fault search module does not find that a second fault has occurred, report the first fault;
a processing module, configured to, for a reported fault, perform merged processing on it if it is an associated fault, and process it if it is the first fault;
wherein the associated-fault search module specifically comprises:
a fault type judging unit, configured to, when occurrence of the first fault is detected, query the relation template and determine whether the first fault has an associated fault;
a search unit, configured to, when the judgment result of the fault type judging unit is that an associated fault exists, query the in-memory cache queue for whether an associated second fault occurred within the predetermined time window before the first fault occurred, and, within the predetermined time window after the first fault occurred, continue to monitor whether a second fault occurs.
7. The fault handling system according to claim 6, characterized by further comprising:
a report trigger unit, configured to trigger the reporting module to report the first fault when the judgment result of the fault type judging unit is that no associated fault exists, and to trigger the reporting module to report faults according to the search result of the search unit.
8. The fault handling system according to claim 7, characterized in that the system further comprises:
an in-memory cache queue management module, configured to monitor and manage the in-memory cache queue and remove faults that occurred earlier than the predetermined time window before the fault currently being processed;
in which case the associated-fault search module specifically comprises:
a fault type judging unit, configured to, when occurrence of the first fault is detected, query the relation template and determine whether the first fault has an associated fault;
a search unit, configured to, when the judgment result of the fault type judging unit is that an associated fault exists, query the in-memory cache queue for an associated second fault, and, within the predetermined time window after the first fault occurred, continue to monitor whether a second fault occurs in the in-memory cache queue;
a report trigger unit, configured to trigger the reporting module to report the first fault when the judgment result of the fault type judging unit is that no associated fault exists, and to trigger the reporting module to report faults according to the search result of the search unit.
9. The fault handling system according to claim 7 or 8, characterized in that the cache module further comprises:
a grouping unit, configured to, when an associated fault exists, establish in the in-memory cache queue, according to the fault association relationships in the relation template, a group identified by a network element unique identifier and an association rule ID, and then, when occurrence of a fault is detected, cache the currently detected fault in the group identified by its corresponding network element unique identifier and association rule ID;
there are one or more second faults;
the relationship between a first fault and a second fault that are associated with each other is as follows:
the first fault is a primary fault and the second fault is a secondary fault; or
the first fault is a secondary fault and the second fault is a primary fault.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310237951.6A CN104243192B (en) | 2013-06-17 | 2013-06-17 | Fault handling method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310237951.6A CN104243192B (en) | 2013-06-17 | 2013-06-17 | Fault handling method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104243192A (en) | 2014-12-24 |
CN104243192B (en) | 2017-11-10 |
Family ID: 52230593
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310237951.6A Active CN104243192B (en) | 2013-06-17 | 2013-06-17 | Fault handling method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104243192B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106411615A (en) * | 2016-11-22 | 2017-02-15 | 北京奇虎科技有限公司 | Device used for cloud remediation of system application and method |
CN108234189B (en) * | 2016-12-22 | 2021-10-08 | 北京神州泰岳软件股份有限公司 | Alarm data processing method and device |
CN109659936A (en) * | 2018-12-29 | 2019-04-19 | 国电南瑞科技股份有限公司 | A kind of smart grid Dispatching Control System failure method of disposal and system |
CN111240871B (en) * | 2019-12-30 | 2023-07-18 | 潍柴动力股份有限公司 | Method and device for reporting engine fault |
CN113515078A (en) * | 2021-05-20 | 2021-10-19 | 湖南湘船重工有限公司 | Intelligent ship information monitoring and alarm processing method and system |
CN115056228B (en) * | 2022-07-06 | 2023-07-04 | 中迪机器人(盐城)有限公司 | Abnormality monitoring and processing system and method for robot |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6239699B1 (en) * | 1999-03-03 | 2001-05-29 | Lucent Technologies Inc. | Intelligent alarm filtering in a telecommunications network |
CN1492624A (en) * | 2002-10-22 | 2004-04-28 | 华为技术有限公司 | Processing method of communication network warning and relatively analysis management device |
CN101360013A (en) * | 2008-09-25 | 2009-02-04 | 烽火通信科技股份有限公司 | General fast fault locating method for transmission network based on correlativity analysis |
CN102014020A (en) * | 2010-11-12 | 2011-04-13 | 百度在线网络技术(北京)有限公司 | Equipment for performing network monitoring on network equipment and method thereof |
CN102098175A (en) * | 2011-01-26 | 2011-06-15 | 浪潮通信信息系统有限公司 | Alarm association rule obtaining method of mobile internet |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5423427B2 (en) * | 2010-01-26 | 2014-02-19 | 富士通株式会社 | Information management program, information management apparatus, and information management method |
Also Published As
Publication number | Publication date |
---|---|
CN104243192A (en) | 2014-12-24 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN104243192B (en) | Fault handling method and system | |
CN103220173B (en) | A kind of alarm monitoring method and supervisory control system | |
CN101808351B (en) | Method and system for business impact analysis | |
CN106385334B (en) | Call center system and its abnormality detection and self-recovery method | |
CN112737800B (en) | Service node fault positioning method, call chain generating method and server | |
CN105549508A (en) | Alarm method based on information combination and apparatus thereof | |
CN111786986B (en) | Numerical control system network intrusion prevention system and method | |
CN103634166B (en) | Equipment survival detection method and equipment survival detection device | |
CN102111788A (en) | Alarm processing method and alarm management system | |
Roblee et al. | Implementing large-scale autonomic server monitoring using process query systems | |
TWI448975B (en) | Dispersing-type algorithm system applicable to image monitoring platform | |
CN102195791A (en) | Alarm analysis method, device and system | |
CN105281824A (en) | Method and device for detecting constant light-emitting optical network unit (ONU) and network management equipment | |
CN110381082B (en) | Mininet-based attack detection method and device for power communication network | |
CN111614630A (en) | Network security monitoring method and device and cloud WEB application firewall | |
KR101973728B1 (en) | Integration security anomaly symptom monitoring system | |
WO2014040470A1 (en) | Alarm message processing method and device | |
CN115701889A (en) | Oil field industrial control safety supervision method based on SOAR | |
CN114385438A (en) | Service operation risk early warning method, system and storage medium | |
CN111274089B (en) | Server abnormal behavior perception system based on bypass technology | |
CN108023741A (en) | One kind monitoring resource using method and server | |
CN114006719A (en) | AI verification method, device and system based on situation awareness | |
CN113254313A (en) | Monitoring index abnormality detection method and device, electronic equipment and storage medium | |
CN107615708A (en) | Alarm information reporting method and device | |
CN111918233A (en) | Anomaly detection method suitable for wireless aviation network |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CP02 | Change in the address of a patent holder | Address after: Room 818, 8/F, 34 Haidian Street, Haidian District, Beijing 100080. Patentee after: BEIJING ULTRAPOWER SOFTWARE Co.,Ltd. Address before: 100089 Room 601, Block A, 6th Floor, Wanliu New Building, No. 28 Wanquanzhuang Road, Haidian District, Beijing. Patentee before: BEIJING ULTRAPOWER SOFTWARE Co.,Ltd. |