Summary of the invention
In view of one or more problems in the related art, an object of the present invention is to provide an alarm information processing method and system that solve at least one of the above problems.
To achieve the above object, according to one aspect of the present invention, an alarm information processing method is provided, comprising:
analyzing a cluster system to obtain one or more actual business rules, and establishing a rule correlation model according to the one or more actual business rules;
enabling a user to define business rules for display according to the rule correlation model, wherein the business rules for display correspond to, and are kept synchronized with, the actual business rules;
acquiring the alarm information to be analyzed and the required auxiliary information, and using an alarm correlation rule engine to correlate the alarm information to be analyzed according to that alarm information, the actual business rules, and the required auxiliary information, thereby obtaining all root alarm information together with its associated alarm information and the quantity thereof; and
presenting the resulting root alarm information, its associated alarm information and quantity, and the business rules for display to the user, who can view a topology graph of the alarm information associated with a given root alarm.
Preferably, the step of establishing the rule correlation model comprises: analyzing the characteristics of the cluster system to obtain the one or more actual business rules; and establishing a business rule correlation model, according to the one or more actual business rules, in a form that the alarm correlation rule engine can recognize.
Preferably, the business rule correlation model comprises one or more of the following: same component, applicable to alarms on the same component that have a causal relationship; same component type, applicable to alarms of the same component type that have a causal relationship; same device, applicable to alarms on the same device that have a causal relationship; same device type, applicable to alarms of the same device type that have a causal relationship; and topology relation, applicable where the cause alarm and the result alarm are topologically related and have a causal relationship, the cause alarm is a switch alarm, the result alarm is a device alarm, and the number of occurrences of the result alarm is greater than a set value.
Preferably, in the step in which the user defines the business rules for display, the business rules for display are stored in a database, and the actual business rules are stored in a specified directory.
Preferably, the required auxiliary information comprises one or more of the following: device-to-rack correspondence information, topology information, node grouping information, and software-to-hardware correspondence information.
Preferably, after each actual business rule is executed, the alarm correlation rule engine automatically checks the state of the actual business rules and performs the corresponding operation on them according to that state.
Preferably, the alarm correlation rule engine defines create, modify, and/or delete operations within the rules.
To achieve the above object, according to another aspect of the present invention, an alarm information processing system is provided, comprising:
a model establishing device, configured to analyze a cluster system to obtain one or more actual business rules, and to establish a rule correlation model according to the one or more actual business rules;
a rule definition device, configured to enable a user to define business rules for display according to the rule correlation model, wherein the rule definition unit keeps the business rules for display corresponding to, and synchronized with, the actual business rules;
an alarm information correlation device, configured to acquire the alarm information to be analyzed and the required auxiliary information, and to correlate the alarm information to be analyzed according to that alarm information, the actual business rules, and the required auxiliary information, thereby obtaining all root alarm information together with its associated alarm information and the quantity thereof; and
a correlation result presentation device, configured to present the resulting root alarm information, its associated alarm information and quantity, and the business rules for display to the user.
Preferably, the model establishing device comprises: an analysis unit, configured to analyze the characteristics of the cluster system to obtain the one or more actual business rules; and an establishing unit, configured to establish a business rule correlation model, according to the one or more actual business rules, in a form that the alarm information correlation unit can recognize.
Preferably, the business rule correlation model comprises one or more of the following:
same component, applicable to alarms on the same component that have a causal relationship;
same component type, applicable to alarms of the same component type that have a causal relationship;
same device, applicable to alarms on the same device that have a causal relationship;
same device type, applicable to alarms of the same device type that have a causal relationship; and
topology relation, applicable where the cause alarm and the result alarm are topologically related and have a causal relationship, the cause alarm is a switch alarm, the result alarm is a device alarm, and the number of occurrences of the result alarm is greater than a set value.
Preferably, the rule definition unit stores the business rules for display in a database and stores the actual business rules in a specified directory.
Preferably, the required auxiliary information comprises one or more of the following: device-to-rack correspondence information, topology information, node grouping information, and software-to-hardware correspondence information.
Preferably, the alarm correlation rule engine defines create, modify, and/or delete operations within the rules.
With at least one of the above technical solutions of the present invention, root alarms are found by analyzing the correlations among massive alarm information and are presented to the administrator. This reduces the number of alarms, can greatly lighten the administrator's workload, and achieves timeliness and stability of the alarm monitoring system.
Embodiment
Functional overview
In view of the one or more problems in the related art, the present invention proposes an alarm information processing method and system. Root alarms are found by analyzing the correlations among massive alarm information and are presented to the administrator. This reduces the number of alarms, can greatly lighten the administrator's workload, and achieves timeliness and stability of the alarm monitoring system.
Fig. 1 is a flow chart of the alarm information processing method according to the present invention. As shown in Fig. 1, the alarm information processing method of the present invention comprises the following steps:
Step 102: analyze a cluster system to obtain one or more actual business rules, and establish a rule correlation model according to the one or more actual business rules;
Step 104: enable a user to define business rules for display according to the rule correlation model, wherein the business rules for display correspond to, and are kept synchronized with, the actual business rules;
Step 106: acquire the alarm information to be analyzed and the required auxiliary information, and use the alarm correlation rule engine to correlate the alarm information to be analyzed according to that alarm information, the actual business rules, and the required auxiliary information, thereby obtaining all root alarm information together with its associated alarm information and the quantity thereof;
Step 108: present the resulting root alarm information, its associated alarm information and quantity, and the business rules for display to the user, who can view a topology graph of the alarm information associated with a given root alarm.
Step 102 comprises: analyzing the characteristics of the cluster system to obtain the one or more actual business rules; and establishing a business rule correlation model, according to the one or more actual business rules, in a form that the alarm correlation rule engine can recognize.
The business rule correlation model comprises one or more of the following: same component, applicable to alarms on the same component that have a causal relationship; same component type, applicable to alarms of the same component type that have a causal relationship; same device, applicable to alarms on the same device that have a causal relationship; same device type, applicable to alarms of the same device type that have a causal relationship; and topology relation, applicable where the cause alarm and the result alarm are topologically related and have a causal relationship, the cause alarm is a switch alarm, the result alarm is a device alarm, and the number of occurrences of the result alarm is greater than a set value.
In step 104, the business rules for display are stored in a database, and the actual business rules are stored in a specified directory.
The required auxiliary information comprises one or more of the following: device-to-rack correspondence information, topology information, node grouping information, and software-to-hardware correspondence information.
After each actual business rule is executed, the alarm correlation rule engine automatically checks the state of the actual business rules and performs the corresponding operation on them according to that state. The alarm correlation rule engine defines create, modify, and/or delete operations within the rules.
Fig. 2 is a block diagram of the alarm information processing system according to the present invention. As shown in Fig. 2, the alarm information processing system of the present invention comprises:
a model establishing device 202, configured to analyze a cluster system to obtain one or more actual business rules, and to establish a rule correlation model according to the one or more actual business rules. The model establishing device 202 comprises: an analysis unit 202-2, configured to analyze the characteristics of the cluster system to obtain the one or more actual business rules; and an establishing unit 202-4, configured to establish a business rule correlation model, according to the one or more actual business rules, in a form that the alarm information correlation unit can recognize.
a rule definition device 204, configured to enable a user to define business rules for display according to the rule correlation model, wherein the rule definition unit keeps the business rules for display corresponding to, and synchronized with, the actual business rules.
an alarm information correlation device 206, configured to acquire the alarm information to be analyzed and the required auxiliary information, and to correlate the alarm information to be analyzed according to that alarm information, the actual business rules, and the required auxiliary information, thereby obtaining all root alarm information together with its associated alarm information and the quantity thereof.
a correlation result presentation device 208, configured to present the resulting root alarm information, its associated alarm information and quantity, and the business rules for display to the user.
The business rule correlation model comprises one or more of the following: same component, applicable to alarms on the same component that have a causal relationship; same component type, applicable to alarms of the same component type that have a causal relationship; same device, applicable to alarms on the same device that have a causal relationship; same device type, applicable to alarms of the same device type that have a causal relationship; and topology relation, applicable where the cause alarm and the result alarm are topologically related and have a causal relationship, the cause alarm is a switch alarm, the result alarm is a device alarm, and the number of occurrences of the result alarm is greater than a set value.
The rule definition unit stores the business rules for display in a database and stores the actual business rules in a specified directory.
The required auxiliary information comprises one or more of the following: device-to-rack correspondence information, topology information, node grouping information, and software-to-hardware correspondence information. The alarm correlation rule engine defines create, modify, and/or delete operations within the rules.
A more specific implementation of the present invention is described in detail below.
Specifically, the present invention is implemented with a rule-based correlation method. Its development involves the following key points:
I. Establishment of the rule correlation model
Establishing the rule correlation model mainly involves two steps:
Step one: analyze the characteristics of the cluster system, identify its regularities, and summarize them into a general business rule model.
The following table lists several examples of business rule models.
Correlation model name | Description
Same component | This model applies to alarms on the same component that have a causal relationship. For example, a CPU over-voltage alarm and a CPU over-temperature alarm have a causal relationship, and that relationship is confined to the same CPU.
Same component type | This model applies to alarms of the same component type that have a causal relationship.
Same device | This model applies to alarms on the same device that have a causal relationship. For example, a switch high-memory-utilization alarm and a high port input packet-loss-rate alarm have a causal relationship, and that relationship is confined to the same device.
Same device type | This model applies to alarms of the same device type that have a causal relationship.
Topology relation (switch-device) | This model applies where the cause alarm and the result alarm are topologically related and have a causal relationship, the cause alarm is a switch alarm, the result alarm is a device alarm, and the number of occurrences of the result alarm is greater than a set value. For example, a switch self-state (unreachable) alarm and a server self-state (unreachable) alarm have a causal relationship, and that relationship is topological. When a switch raises an unreachable alarm and more than 3 of the servers connected to that switch also raise unreachable alarms, the switch's unreachable alarm is considered to have caused the servers' unreachable alarms.
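As an illustration only, the threshold check of the topology relation model in the table above might be sketched in Java as follows. The class name TopologyRelationSketch, the method switchCausesServerAlarms, and the map-based representation of the topology are hypothetical and are not taken from the described system; the sketch only shows the comparison "number of affected servers greater than a set value".

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of the topology relation (switch-device) model.
final class TopologyRelationSketch {

    /**
     * Returns true when the switch alarm is treated as the cause of the server alarms,
     * i.e. when the number of unreachable servers attached to the switch is strictly
     * greater than the set value (3 in the example from the table).
     *
     * @param switchId           identifier of the switch that raised the unreachable alarm
     * @param topology           assumed topology data: switch identifier -> attached servers
     * @param unreachableServers servers that currently have an unreachable alarm
     * @param threshold          the set value from the correlation model
     */
    static boolean switchCausesServerAlarms(String switchId,
                                            Map<String, List<String>> topology,
                                            Set<String> unreachableServers,
                                            int threshold) {
        List<String> attached = topology.getOrDefault(switchId, List.of());
        long affected = attached.stream()
                .filter(unreachableServers::contains)
                .count();
        return affected > threshold;
    }
}
```

With a threshold of 3, four unreachable servers behind the same switch would be attributed to the switch alarm, whereas exactly three would not.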
Step two: convert these business rule models into a form that the rule engine in the system can recognize, that is, express the business logic in a particular rule syntax.
The logic of the same-component correlation model is used below as an example.
 1 package rules.correlation_${templateName}_${causeAlarmValueID}_${resultAlarmValueID}
 2
 3 import com.dawning.gridview.alarmSystem.generic.type.database.AlarmInfo;
 4 import com.dawning.gridview.alarmSystem.generic.type.correlation.EquipToRack;
 5 import com.dawning.gridview.alarmSystem.generic.type.correlation.NodeGroup;
 6 import com.dawning.gridview.alarmSystem.generic.type.correlation.Topo;
 7 import com.dawning.gridview.alarmSystem.alarmcorrelation.AlarmAnalyze;
 8
 9 global com.dawning.gridview.alarmSystem.alarmcorrelation.AlarmAnalyze aiAnalyze;
10
11 rule "${ruleName}"
12   when
13     $cause : AlarmInfo(alarmValueID == "${causeAlarmValueID}")
14     $result : AlarmInfo(alarmValueID == "${resultAlarmValueID}",
15         name_type == $cause.name_type,
16         name_typeName == $cause.name_typeName,
17         name_subtype == $cause.name_subtype,
18         name_subtypeName == $cause.name_subtypeName,
19         alarmTime >= $cause.alarmTime)
20   then
21     aiAnalyze.addEdge($cause, $result, "${databaseRuleName}", "${templateName}");
22 end
The syntax of the code above is explained as follows:
(1) Line 1: package package-name
The package name is mandatory. As with packages in Java, the package name is simply a namespace name and is unrelated to any file or directory name.
Here, ${templateName}, ${causeAlarmValueID}, and ${resultAlarmValueID} are configurable placeholders that are replaced by the user-specified values when the actual rule is generated.
(2) Lines 3 to 7: import
import has the same meaning as in Java. For any object used in a rule, the full path and type name must be specified. The rule engine automatically imports classes from the Java package of the same name.
(3) Line 9: global
global declares a global variable. Global variables are commonly used to return data, such as a record of an action, or to provide data or services for rules to use.
A global variable is declared and used in the rule file and is assigned its value in Java code. Here, aiAnalyze is a global variable holding an instance of the alarm analysis class.
(4) Line 11: rule "name"
The rule name. Here, "name" is a configurable placeholder that is replaced with the actual value when the rule is generated.
(5) Lines 12 to 19: when
The condition part of the rule. In the alarm correlation module, different correlation models have different conditions.
(6) Lines 20 to 21: then
The action part of the rule. It allows a block of Java code. Here, aiAnalyze.addEdge() calls the alarm analysis class's method for adding an edge.
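The replacement of configurable placeholders such as ${templateName} and ${causeAlarmValueID} when a rule is generated could, for example, be sketched in Java as follows. This is a minimal sketch under stated assumptions: the class name RuleTemplateSketch and the method fill are hypothetical, and the description does not specify how the system actually performs the substitution.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of filling a rule template with user-specified values.
final class RuleTemplateSketch {

    // Matches ${name}; Drools bindings such as $cause have no braces and are left untouched.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    /** Replaces every ${name} in the template with values.get(name). */
    static String fill(String template, Map<String, String> values) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String value = values.get(matcher.group(1));
            if (value == null) {
                // Fail loudly rather than emit a rule file with an unresolved placeholder.
                throw new IllegalArgumentException("No value for placeholder: " + matcher.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the value from being read as group references.
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Rejecting a missing value, rather than leaving the placeholder in place, is a design choice of this sketch: it prevents a syntactically invalid rule file from reaching the rule engine.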
II. Management of business rules
Business rule models are the prerequisite for user-defined rules. The system predefines some business rules; in addition, users can create business rules according to actual conditions and can modify, delete, and view rules that have already been defined. Users work in the user interface and see information they can understand, whereas the actual rule files that run underneath are in coded form. The two therefore need to be converted into each other; that is, the rules for display are kept separate from the actual rule files. In the system's implementation, the rules for display are stored in a database, the actual rule files are stored in a specified directory, and the two are kept synchronized.
As shown in Fig. 3, the user interface operates on rules by interacting with the database, and the actual rule files are generated from the contents of the database. When the user performs an alarm correlation analysis, the Drools rule engine uses the actual rule files rather than the user-defined rules in the database.
Fig. 4 shows the flow in which a user adds a rule according to an embodiment of the present invention. As shown in Fig. 4, the flow comprises the following steps:
Steps 402-404: validate the values set by the user by checking whether any field that must not be empty is empty.
The rule name, the correlation model, the alarmValueID (alarm value ID) of the cause alarm, and the alarmValueID of the result alarm must not be empty; if any of them is empty, the user is prompted.
Step 406: check whether the alarmValueIDs of the cause alarm and the result alarm set by the user exist.
If an alarmValueID does not exist, a corresponding message is shown.
Steps 408-410: check whether the rule name already exists.
The new rule name is compared with the Name field in the AlarmCorrelationRule table; if it is a duplicate, the user is told that the rule name is duplicated and that a different rule name must be entered.
Step 412: check whether the rule to be added already exists.
The combination of the new rule's correlation model, cause alarm alarmValueID, and result alarm alarmValueID is checked against the AlarmCorrelationRule table; if it already exists, the user is told that the rule already exists.
Step 414: check whether adding the rule would form a cycle.
All rules are stored as a graph whose vertices represent alarmValueIDs: every alarmValueID in the alarm cause table is stored as a vertex, and the graph is stored as an adjacency matrix. When the user adds a rule, an edge is added between the vertices corresponding to the alarmValueIDs of the rule's cause alarm and result alarm, and the graph is checked for cycles; if a cycle exists, the rule is invalid.
Step 416: check whether the selected cause alarm and result alarm match the correlation model.
Step 418: if they match, add a record to the database table.
Step 420: generate the corresponding rule file by calling the rule file generation sub-flow.
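The cycle check of step 414 on an adjacency-matrix graph could be sketched in Java as follows. This is an illustrative sketch only: the class name RuleGraphSketch and the use of integer vertex indexes in place of alarmValueIDs are assumptions. The sketch also tests reachability before inserting the edge, rather than inserting the edge and then searching for a cycle as the text describes; the two have the same effect, because the edge cause -> result closes a cycle exactly when cause is already reachable from result.

```java
// Hypothetical illustration of the step 414 cycle check on an adjacency matrix.
final class RuleGraphSketch {

    // adjacent[i][j] is true when a rule exists with cause vertex i and result vertex j.
    private final boolean[][] adjacent;

    RuleGraphSketch(int vertexCount) {
        adjacent = new boolean[vertexCount][vertexCount];
    }

    /**
     * Adds the edge cause -> result unless doing so would form a cycle.
     * Returns true if the edge was added, false if the rule is invalid.
     */
    boolean addRule(int cause, int result) {
        if (cause == result || reachable(result, cause, new boolean[adjacent.length])) {
            return false;
        }
        adjacent[cause][result] = true;
        return true;
    }

    // Depth-first search: is target reachable from the vertex 'from' along existing edges?
    private boolean reachable(int from, int target, boolean[] visited) {
        if (from == target) {
            return true;
        }
        visited[from] = true;
        for (int next = 0; next < adjacent.length; next++) {
            if (adjacent[from][next] && !visited[next] && reachable(next, target, visited)) {
                return true;
            }
        }
        return false;
    }
}
```

For example, after the rules 0 -> 1 and 1 -> 2 have been accepted, a rule 2 -> 0 would be rejected, while 0 -> 2 would still be accepted.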
III. Processing flow of alarm correlation analysis
Alarm correlation analysis requires rules and the data to be analyzed; in addition, the analysis needs some auxiliary information. The data to be analyzed is mainly alarm information, and the auxiliary information includes device-to-rack correspondence information, topology information, node grouping information, and so on.
Fig. 5 is a simplified logic diagram of alarm correlation analysis according to an embodiment of the present invention. As shown in Fig. 5, the overall flow of alarm correlation analysis is divided into the following steps:
Step 1: acquire the alarm information to be analyzed; the user sets the query conditions through the UI.
Step 2: send the alarm information to the alarm correlation analysis logic.
Step 3: the alarm correlation analysis logic acquires the auxiliary information and the rule base information used for analysis.
The auxiliary information used for analysis includes topology information, device-to-rack correspondence information, node grouping information, hardware-to-software correspondence information, and so on.
The rule base information used for analysis consists of all rules currently in effect on the current large-scale machine.
Step 4: alarm correlation analysis.
The rule engine inserts the rule information, alarm information, and auxiliary information into working memory, then correlates the alarm information and performs subsequent processing.
Step 5: return the results to the UI.
After the analysis, all root alarms together with their associated alarm information and quantities are obtained and returned to the UI as a list. The alarm information associated with a root alarm is obtained and displayed to the user as a graph. Fig. 6 is a schematic diagram of an alarm correlation analysis result according to an embodiment of the present invention.
Fig. 7 is a detailed flow chart of alarm correlation analysis according to an embodiment of the present invention.
As shown in Fig. 7, the detailed process of alarm correlation analysis can be divided into the following steps:
Steps 702-704: if the list of alarm information to be analyzed is empty, throw an exception and inform the user that there is currently no data to analyze;
Steps 706-708: check whether the rule files were generated correctly; if not, regenerate them;
Steps 710-714: if no rule file exists, throw an exception and inform the user that no rules are currently defined and the analysis cannot proceed;
Step 716: read the rules and create the rule engine's working memory;
Step 718: set the rule global variable aiAnalyze to this, that is, the instance of the alarm analysis class created when the Struts Action is invoked;
Step 720: acquire the auxiliary information required for alarm analysis, including the topology information list, the node grouping information list, the device-to-rack correspondence information list, and so on;
Step 722: insert all of the auxiliary information and the alarm information into working memory;
Step 724: fire the rules to perform data matching;
Step 726: release the working memory;
Step 728: perform subsequent processing on the data. The post-processing sub-flow mainly handles the information filtered by the rule engine. The information is first stored in a directed graph in which vertices represent alarm information and directed edges represent rule information; the start vertex of an edge represents the cause alarm and the end vertex represents the result alarm. The graph is then traversed to obtain the root alarms, the alarm information associated with each root alarm and its quantity, and the information on all alarms caused by each root alarm.
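The post-processing of step 728 could be sketched in Java as follows. This is a minimal sketch under stated assumptions: the class name RootAlarmSketch and its methods are hypothetical, alarms are reduced to string identifiers, and a root alarm is assumed to be a vertex with no incoming edge (an alarm that is not the result of any other alarm). The text itself says only that root alarms are obtained by traversing the graph.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of finding root alarms and their associated alarms.
final class RootAlarmSketch {

    // Directed graph: cause alarm -> result alarms matched by the rule engine.
    private final Map<String, List<String>> edges = new LinkedHashMap<>();
    // Alarms that appear as the result of at least one edge.
    private final Set<String> hasCause = new HashSet<>();

    /** Records that 'cause' led to 'result', mirroring the role of addEdge in the rule action. */
    void addEdge(String cause, String result) {
        edges.computeIfAbsent(cause, key -> new ArrayList<>()).add(result);
        edges.computeIfAbsent(result, key -> new ArrayList<>());
        hasCause.add(result);
    }

    /** Maps each root alarm to the set of alarms reachable from it. */
    Map<String, Set<String>> rootsWithAssociated() {
        Map<String, Set<String>> roots = new LinkedHashMap<>();
        for (String vertex : edges.keySet()) {
            if (hasCause.contains(vertex)) {
                continue; // caused by another alarm, so not a root (assumption of this sketch)
            }
            Set<String> seen = new LinkedHashSet<>();
            Deque<String> pending = new ArrayDeque<>(edges.get(vertex));
            while (!pending.isEmpty()) {
                String current = pending.pop();
                if (seen.add(current)) {
                    pending.addAll(edges.get(current));
                }
            }
            roots.put(vertex, seen);
        }
        return roots;
    }
}
```

The size of each returned set gives the quantity of alarms associated with the corresponding root alarm. Alarms that were never matched by any rule do not appear in this graph and would have to be reported separately.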
IV. Implementation of the alarm correlation rule engine
The system uses the open-source Drools rule engine as the alarm correlation rule engine to perform data matching and verification. The objects the Drools rule engine needs include the rules, the data to be analyzed, and other auxiliary information. The Drools rule engine conforms to the JSR-94 standard and provides an API through which external programs use and control the rule engine; therefore, rules can be loaded into the system and used simply by calling this API. The steps for loading rules into the system are as follows:
Step one: create the rule engine object, which is generated dynamically from configuration information.
First, generate the configuration information.
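// Requires: import java.util.Properties;
// Values are stored as strings so that Properties.getProperty() can read them back.
Properties baseProp = new Properties();
baseProp.setProperty("newInstance", "true");
baseProp.setProperty("poll", "10");
baseProp.setProperty("dir", this.getClass().getResource("/").toURI().getPath()
        + this.RULE_PATH + "/" + hpcID);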
Then, create the rule engine object from the configuration information.
RuleAgent ruleAgent = RuleAgent.newRuleAgent(baseProp);
Step two: obtain the rule packages related to alarm correlation from the rule base and load them into the rule engine.
StatefulSession workingMemory = ruleAgent.getRuleBase(hpcID).newStatefulSession();
Step three: pass the business objects to be processed into the rule engine. The objects passed in are the user's own objects, such as alarm objects, topology information objects, and node grouping objects. In this example, the objects processed by the rule engine are assumed to be user-defined alarm objects and topology information objects. The engine matches the attribute values of all inserted objects against the rules in the currently loaded rule packages and places the rules that match on the Agenda.
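// Insert every alarm object into working memory.
for (int i = 0; i < lsAi.size(); i++) {
    workingMemory.insert(lsAi.get(i));
}
// Insert every topology information object into working memory.
for (int i = 0; i < lsTopo.size(); i++) {
    workingMemory.insert(lsTopo.get(i));
}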
Step four: fire the rules. While the rule engine executes rules, the following operations may occur:
the attribute values of some objects are modified (for example, an alarm level is changed);
some new objects are created (for example, alarm analysis causes alarms of new types to be created);
some objects are deleted (for example, by alarm filtering).
After each rule is executed, the rule engine automatically performs the following check: whether the rules waiting on the Agenda still satisfy their conditions under the current state, removing any waiting rules that no longer do; and whether any rules in the rule package that were not on the Agenda now match the current state, adding them to the Agenda if so. The engine eventually empties the Agenda.
workingMemory.fireAllRules();
The rule engine defines all operations such as creating, modifying, and deleting objects within the rules, which ensures the stability of the program. After alarm filtering or alarm correlation rules change, only the changed rule packages need to be reloaded into the engine. The engine is responsible only for the correctness of rule execution (such as guaranteeing mutual exclusion between rules and the execution order of rules) and is not concerned with the specific content of the rules.
The present invention uses the open-source Drools rule engine. The rule engine could also be developed independently, or a commercial rule engine product could be used instead.
With the above alarm information processing method and system, the present invention uses the currently mainstream rule-based approach to solve the alarm correlation analysis problem that urgently needs solving in cluster monitoring. The method separates business rule logic from the program, making it convenient for users to manage and flexibly define business rules. After alarm correlation analysis, the number of alarms is reduced, the system administrator's workload is lightened, and timeliness and stability of the alarm monitoring system are achieved.
Obviously, those skilled in the art should understand that the above modules or steps of the present invention can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented with program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they can each be made into a separate integrated circuit module; or multiple modules or steps among them can be made into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.