A method for alarm management under the SNMP protocol using an acknowledgment mechanism
Technical field
The present invention relates to a method for alarm management under the SNMP protocol using an acknowledgment mechanism, and in particular to a method by which the agent entity in a network element of an access system, in the field of communication network management, manages alarms under the SNMP (V2) protocol.
Background art
SNMP (Simple Network Management Protocol) was originally an interim solution created in response to the need for a standard network management protocol. Three major versions have appeared in its evolution: SNMPv1, SNMPv2 and SNMPv3. SNMPv2 is further divided into several sub-versions, of which SNMPv2c is the most widely used in network management in the communications field.
SNMP operates on a manager-agent management model and provides a direct, basic method of exchanging management information between the manager entity and the agent entity. Several kinds of management information are exchanged between them, and the message types differ between versions. For SNMPv2c, one important message type is the Trap, which the agent entity generates only when it needs to report an important event to the manager entity on its own initiative. Unlike the case in which the manager entity sends a request and the agent entity responds, the manager entity does not respond to a Trap it receives. A Trap is therefore an unacknowledged transfer from the agent entity to the manager entity.
SNMP is generally designed to run over the connectionless UDP. When a Trap is sent over UDP/IP, as is typical, the agent entity therefore cannot be sure that a critical message has reached the manager entity.
In an access system, the network element agent entity running on a network device exchanges information with the manager entity of the network management system through the SNMPv2c protocol. The Traps sent by the network element agent reflect the current operating condition of the managed network device, so that the user can take appropriate measures based on them. However, because a Trap is an unacknowledged transfer sent over UDP/IP, there is no guarantee that the manager entity obtains the Traps sent by the agent entity in a timely and accurate manner, and hence no guarantee that the network device is managed in a timely and effective manner.
Moreover, the network element agent entity of an access system using SNMPv2c does not retain the Traps it generates. When the network management system goes offline for some reason and later returns, it has no way of learning what changes occurred on the network device, or what faults arose, while it was away.
Summary of the invention
The technical problem to be solved by the present invention is to overcome the shortcoming that, because transfers in the SNMP protocol are unacknowledged, alarm information on a managed network device cannot be delivered to the manager in a timely and correct manner. To this end, a method for alarm management using an acknowledgment mechanism is proposed. The method ensures that Traps sent over UDP/IP by the network element agent on the network device are delivered to the manager entity in a timely and accurate manner, and at the same time provides the manager entity with a query function.
The present invention is specifically achieved as follows:
A method for alarm management under the SNMP protocol using an acknowledgment mechanism, characterized in that the method comprises the following processing:
An alarm management module provides a unified interface to the alarm sources that produce alarm messages;
The different types of alarm message, together with the other information needed to assemble MIB object-value pairs, are registered in the registry of the alarm management module;
The different types of alarm message received from the alarm sources are processed;
The unified interface is called to send the processed alarm messages of the different types to the network management system;
For an alarm message that requires acknowledgment, the module waits for an acknowledgment message from the network management system; if the acknowledgment is received, acknowledgment processing is performed, and if it is not received, timeout processing is performed.
The processing performed on the different types of alarm message comprises:
Creating the storage area;
For an alarm that is produced, obtaining an alarm identification code, forming an alarm stored record and storing it;
For a recovery or alarm cancellation message that is produced, modifying the mark of the stored alarm record;
According to the type of the alarm message, reading from the registry the MIB objects to be bound, and binding the MIB object identifier and MIB object value pairs;
For a stored record marked as alarm or recovery, setting the resend time count and the resend number count.
The acknowledgment processing performed after the network management system sends an acknowledgment message comprises:
Searching the storage area according to the alarm identification code that the network management system has set on the MIB object representing acknowledgment, and finding the corresponding alarm stored record;
Performing the acknowledgment operation according to the storage type represented by the mark of the stored record;
Cancelling the resend time count and the resend number count set in the stored record.
The timeout processing comprises:
Traversing the storage area and, for each stored record of an unacknowledged alarm or recovery, checking the resend time count and the resend number count;
If the resend time has arrived and the resend number count satisfies the condition (the allowed resends are used up), performing the acknowledgment operation on the unacknowledged alarm or recovery;
If the resend time has not arrived and the resend number count satisfies the condition, decrementing the remaining wait time and continuing to wait;
If the resend time has arrived and the resend number count does not satisfy the condition, resending the unacknowledged alarm or recovery to the network management system, decrementing the remaining number of resends, and setting the resend time again;
If the resend time has not arrived and the resend number count does not satisfy the condition, decrementing the remaining wait time and continuing to wait.
After receiving an alarm message, the alarm management module judges, according to the content parameters of the alarm message, whether a duplicate alarm message already exists in the storage area;
If one exists, no processing is performed;
If none exists, an alarm identification code is obtained, an alarm stored record is formed and saved, and the alarm is sent to the network management system;
After sending, the resend time count and the resend number count are set in the stored record.
After receiving a recovery message, the alarm management module judges, according to the content parameters of the recovery message, whether a corresponding alarm message exists in the storage area;
If none exists, no processing is performed;
If one exists, the mark of the alarm stored record is changed to unacknowledged recovery, and the recovery is sent to the network management system;
After sending, the resend time count and the resend number count are set anew for this recovery.
After receiving an alarm cancellation message, the alarm management module searches the storage area for alarm messages matching the content parameters of the cancellation;
If an alarm stored record does not satisfy the condition, no processing is performed on it;
If the record represents an alarm produced by the alarm source identified by the content parameters, the mark of the alarm stored record is changed to unacknowledged recovery, and the recovery is sent to the network management system;
After sending, the resend time count and the resend number count are set anew for this recovery.
The alarm management module further processes unacknowledged alarm and recovery messages as follows:
A periodic timer scans the storage area and, according to the resend time count and resend number count kept in each alarm stored record, decides whether to send the alarm to the network management system or to perform acknowledgment processing;
When the timer fires, if the resend number count has reached the maximum and the resend time count is 0, an unacknowledged recovery is removed and an unacknowledged alarm is marked as acknowledged;
Before the maximum number of sends is reached, the module judges whether the resend time has arrived; if it has, the unacknowledged alarm or recovery is resent; if it has not, nothing is done and the module waits for the resend time to arrive; if an acknowledgment from the network management system is received before the maximum number of sends is reached, normal acknowledgment processing is performed.
The network management system acknowledges an alarm message by setting its alarm identification code on the MIB object that represents alarm acknowledgment;
After receiving an alarm acknowledgment, the alarm management module searches the storage area for the alarm record identified by this alarm identification code; if it exists, the record mark is changed to acknowledged alarm, and the resend time count and resend number count are then cleared;
If it does not exist, nothing is done.
The network management system acknowledges a recovery by setting its alarm identification code on the MIB object that represents recovery acknowledgment;
After receiving a recovery acknowledgment, the alarm management module searches the storage area for the stored record identified by this alarm identification code; if it exists, the stored record is deleted;
If it does not exist, nothing is done.
The alarm management module also handles alarm synchronization:
In the first step, the alarm identification codes of all saved alarm records are obtained: the network management system performs a read operation on the MIB object that represents obtaining the alarm identification codes; after receiving the read operation, the alarm management module collects the alarm identification codes of all unacknowledged and acknowledged alarms saved in the storage area and assigns them to this MIB object;
In the second step, the sending of the alarm records identified by a specified group of alarm identification codes is requested: after completing the first step, the network management system compares, one by one, the alarm identification codes it holds with those contained in the group of values obtained from the MIB object;
If an alarm identification code held by the network management system does not exist in the group of MIB object values, the network management system considers the alarm identified by that code to have recovered and handles it as a recovery; otherwise it is handled as an alarm;
If an alarm identification code in the group of MIB object values does not exist in the group of alarm identification codes held by the network management system, the second step of alarm synchronization is carried out for it; otherwise it is handled as an alarm;
In the third step, the network management system sets the group of alarm identification codes on the MIB object that represents a request for specified alarms, thereby requesting those alarms;
After receiving the group of alarm identification codes, the alarm management module takes the codes from the group one by one and, for each code, checks whether the corresponding alarm stored record exists; if it exists, the alarm is sent to the network management system and processing continues with the next code; if it does not exist, processing simply continues with the next code.
The alarm management module also handles alarm refresh:
The network management system performs a set operation on the MIB object that represents a request to refresh alarms, thereby requesting the alarms;
After receiving the alarm refresh request, the alarm management module begins searching the storage area; if a stored record is an alarm record, the alarm is sent to the network management system;
If it is not an alarm record, nothing is done;
The operation continues until the whole storage area has been searched.
Compared with the prior art, the method of the invention adopts an acknowledgment mechanism and can therefore ensure that, under the SNMP protocol, the manager entity receives the Traps sent by the agent entity in a timely and accurate manner, so that the operating condition of the equipment is known promptly. In addition, because the agent entity stores the Traps it sends, the manager can query them in real time, which improves the system's ability to monitor the network device.
Description of drawings
Fig. 1 illustrates the format and content of the Trap PDU (protocol data unit);
Fig. 2 illustrates the format and content of a registry element;
Fig. 3 illustrates the format and content of a stored record kept in the storage area;
Fig. 4 illustrates how the alarm message type is converted between the generation, storage and network management stages;
Fig. 5 illustrates the composition of the MIB object identifier and value pairs of a notification Trap sent to the network management system;
Fig. 6 illustrates the composition of the MIB object identifier and value pairs of an alarm Trap sent to the network management system;
Fig. 7 illustrates the composition of the MIB object identifier and value pairs of a recovery Trap sent to the network management system;
Fig. 8 is the overall flow of alarm management using the acknowledgment mechanism;
Fig. 9 is the notification message handling flow;
Fig. 10 is the alarm message handling flow;
Fig. 11 is the recovery message handling flow;
Fig. 12 is the alarm cancellation message handling flow;
Fig. 13 is the resend function handling flow;
Fig. 14 is the alarm acknowledgment flow;
Fig. 15 is the recovery acknowledgment flow;
Fig. 16 is the flow in which the network management system obtains the alarm identification codes of all saved alarm records;
Fig. 17 is the flow in which the network management system synchronizes the alarm records of specified alarm identification codes;
Fig. 18 is the flow in which the network management system requests an alarm refresh.
Embodiment
In the method of the invention, the alarm management module provides a unified interface to the network devices (alarm sources) that produce alarm messages. An alarm source first registers, in the registry of the alarm management module, the kinds of alarm message and the other information needed to assemble MIB (Management Information Base) object-value pairs, and then calls the interface provided by the alarm management module to send alarm messages of the various types. Before sending a Trap, the alarm management module manages the alarm messages produced by the alarm sources differently according to their category. After sending, for those Traps that require acknowledgment, it waits for the acknowledgment message from the network management system. If the acknowledgment arrives, acknowledgment processing is performed; if it does not, timeout processing is performed.
The management and sending of the various alarm messages by the alarm management module comprise:
1. Requesting and initializing the storage area, requesting a semaphore, and starting the management task of the alarm management module, ready to receive the messages sent by the alarm sources.
2. Performing storage processing according to the type of alarm message received:
An alarm requires an alarm identification code to be obtained and a stored record to be formed and saved.
A recovery or alarm cancellation requires the mark of the stored alarm record to be modified.
A notification is not stored at all.
3. According to the alarm message type, reading from the registry the MIB objects to be bound, binding the MIB object identifier and MIB object value pairs, and sending the Trap to the network management system over UDP/IP.
4. For a saved stored record marked as alarm or recovery, setting the resend time count and the resend number count.
5. For a Trap that requires acknowledgment, waiting for the acknowledgment from the network management system.
The acknowledgment processing of the alarm management module comprises:
1. Searching the storage area according to the alarm identification code that the network management system has set on the MIB object representing acknowledgment, and finding the corresponding stored record.
2. Performing the acknowledgment operation according to the storage type represented by the mark of the stored record.
3. Cancelling the resend time count and the resend number count set in the stored record.
The timeout processing of the alarm management module, that is, resending, comprises:
Traversing the storage area and, for each saved unacknowledged alarm or recovery, checking the resend time count and the resend number count. The following conditions are judged and handled separately.
1) The resend time has arrived and the resend number count satisfies the condition (the allowed resends are used up): the acknowledgment operation is performed on the unacknowledged alarm or recovery.
2) The resend time has not arrived and the resend number count satisfies the condition: the remaining wait time is decremented and waiting continues.
3) The resend time has arrived and the resend number count does not satisfy the condition: the unacknowledged alarm or recovery is resent to the network management system, the remaining number of resends is decremented, and the resend time is set again.
4) The resend time has not arrived and the resend number count does not satisfy the condition: the remaining wait time is decremented and waiting continues.
The implementation of the technical solution is described in further detail below with reference to the drawings, broadly following their order:
Fig. 1 shows the format and content of the Trap PDU in the SNMP protocol (V2 and later). The alarm management module is chiefly concerned with the list of MIB object name and value pairs, which carries the information conveyed by the Trap and reflects the current operating condition of the managed device.
Fig. 2 shows the format and content of each element in the Trap registry. Before an alarm source sends an alarm message, the alarm kind and the object names corresponding to the list of MIB object identifiers in the Trap PDU must be entered in the registry. The fields are:
1. Usage flag: indicates whether the Trap is registered. After registration is complete it must be set to the in-use state.
2. Trap identifier: has the same meaning as the Trap object identifier and corresponds to it one to one; it stands in for the Trap object identifier so that storage and comparison are convenient.
3. Trap object identifier: indicates which kind of alarm the Trap represents.
4. Length of the Trap object identifier: the number of elements in the Trap object identifier.
5. Number of MIB objects in the Trap: the number of MIB objects, in the list of MIB object identifier and value pairs of Fig. 1, that represent the alarm content.
6. List of object names in the Trap: the names of the MIB objects, in the list of MIB object identifier and value pairs of Fig. 1, that represent the alarm content.
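The registry element described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name `TrapRegistryEntry`, the functions `register_trap` and `bind_varbinds`, and the field names are hypothetical and do not come from the original text; the two length fields are derived rather than stored.

```python
from dataclasses import dataclass, field


@dataclass
class TrapRegistryEntry:
    in_use: bool        # 1. usage flag, set once registration is complete
    trap_id: int        # 2. compact stand-in for the Trap object identifier
    trap_oid: tuple     # 3. Trap object identifier (sequence of integers)
    object_names: list = field(default_factory=list)  # 6. MIB object names

    @property
    def oid_length(self):
        # 4. number of elements in the Trap object identifier
        return len(self.trap_oid)

    @property
    def object_count(self):
        # 5. number of MIB objects carried in the Trap
        return len(self.object_names)


registry = {}  # hypothetical registry: trap_id -> TrapRegistryEntry


def register_trap(trap_id, trap_oid, object_names):
    """Fill in a registry element before any alarm of this kind is sent."""
    entry = TrapRegistryEntry(True, trap_id, tuple(trap_oid), list(object_names))
    registry[trap_id] = entry
    return entry


def bind_varbinds(trap_id, values):
    """Pair each registered MIB object name with its value for the Trap."""
    entry = registry[trap_id]
    if not entry.in_use or len(values) != entry.object_count:
        raise ValueError("values do not match the registered MIB objects")
    return list(zip(entry.object_names, values))
```

In this sketch an alarm source would call `register_trap` once per alarm kind and the module would call `bind_varbinds` each time it assembles a Trap. Any object identifier passed in is the caller's own; none is defined by the source text.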
Fig. 3 shows the format and content of a stored record kept in the storage area. The size of the storage area is limited by the memory of the managed network device and can be set according to the overall performance of the device, the number of alarm kinds, and the frequency at which the alarm sources produce alarm messages. A stored record holds the important information carried by a Trap, such as the position of the alarm source that produced the alarm message at a given moment and the content parameter values contained in the alarm message. The fields are:
1. Trap identifier: same meaning as explained for Fig. 2.
2. Alarm identification code: used mainly in the interaction between the alarm management module and the network management system; it is the means by which the two exchange information and uniquely identifies a particular alarm.
3. Type mark of the stored record: there are three main kinds, identified in Fig. 3. It indicates whether the Trap represented by the stored record has been acknowledged.
4. Content parameter values of the stored record: the content parameter values of the alarm message produced by the alarm source.
5. Location parameter count of the stored record: the number of location parameters describing the alarm source that produced the alarm message.
6. Location parameter values of the stored record: describe the position of the alarm source that produced the alarm message.
7. Resend time count: controls the interval at which an unacknowledged alarm or recovery is resent.
8. Resend number count: controls the number of times an unacknowledged alarm or recovery is resent.
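A stored record with the eight fields listed above might be modelled as follows. This is a sketch only: the names `StoredRecord` and `RecordState` are hypothetical, and the three state values are inferred from the handling described elsewhere in the text (unacknowledged alarm, acknowledged alarm, unacknowledged recovery), since Fig. 3 itself is not reproduced here.

```python
from dataclasses import dataclass
from enum import Enum


class RecordState(Enum):
    # Assumed reading of the three type marks of Fig. 3
    ALARM_UNCONFIRMED = 1
    ALARM_CONFIRMED = 2
    RECOVERY_UNCONFIRMED = 3


@dataclass
class StoredRecord:
    trap_id: int             # 1. Trap identifier (see the registry)
    alarm_code: int          # 2. alarm identification code
    state: RecordState       # 3. type mark of the stored record
    content_params: tuple    # 4. content parameter values
    location_params: tuple   # 6. location parameter values
    resend_timer: int = 0    # 7. resend time count
    resend_count: int = 0    # 8. resend number count

    @property
    def location_param_count(self):
        # 5. location parameter count, derived from the values
        return len(self.location_params)

    def awaiting_ack(self):
        """True while the record still needs an acknowledgment."""
        return self.state is not RecordState.ALARM_CONFIRMED
```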
Fig. 4 shows the alarm message type at the time the message is produced, the type of the stored record in the storage area, the type of the Trap finally sent to the network management system, and the conversions between the types at these different stages.
Fig. 5 shows the composition of the list of MIB object identifier and value pairs carried by a notification Trap sent to the network management system.
Fig. 6 shows the composition of the list of MIB object identifier and value pairs carried by an alarm Trap sent to the network management system.
Fig. 7 shows the composition of the list of MIB object identifier and value pairs carried by a recovery Trap sent to the network management system.
Fig. 8 shows the overall flow of alarm management using the acknowledgment mechanism.
During start-up of the system equipment, the alarm management module completes the following initialization operations:
1. Creating the storage area, setting its size, and initializing it.
2. Creating a semaphore to control concurrent access to the storage area.
3. Creating a global alarm identification code resource and initializing it.
4. Creating the alarm management module processing task.
5. After system control has detected that the system is working normally, notifying the network management system of the cold start.
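The five initialization steps can be sketched roughly as below. The class `AlarmManager` and its members are hypothetical names chosen for illustration. A `threading.Lock` stands in for the semaphore, a simple counter stands in for the global alarm identification code resource, and a queue stands in for the inbox of the processing task; the original text prescribes none of these choices.

```python
import itertools
import queue
import threading


class AlarmManager:
    def __init__(self, capacity):
        self.capacity = capacity           # 1. size of the storage area
        self.records = {}                  # 1. storage area, initially empty
        self.lock = threading.Lock()       # 2. guards concurrent access
        self._codes = itertools.count(1)   # 3. global alarm ID resource
        self.inbox = queue.Queue()         # 4. messages for the processing task

    def next_alarm_code(self):
        """Hand out the next alarm identification code (never reused)."""
        with self.lock:
            return next(self._codes)

    def start(self, notify_cold_start):
        # 5. once the system is known to work normally, tell the
        #    network management system about the cold start
        notify_cold_start()
```

Because the counter is never rewound, codes are not reclaimed after a recovery; it simply starts again from 1 when the equipment restarts.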
The alarm management module processing task receives the different types of alarm message from the alarm sources as well as its own timeout messages.
At the same time, the network management system uses specific MIB objects to ask the alarm management module to perform operations such as acknowledgment, alarm synchronization and alarm refresh.
The main acknowledgment operations performed by the alarm management module are the acknowledgment of alarms and the acknowledgment of recoveries.
To make use of limited resources while keeping communication as close to real time as possible, the alarm management module is designed to resend unacknowledged alarms periodically and to remove unacknowledged recoveries periodically.
To interact with the network management system, the alarm management module uses the alarm identification code to denote a unique alarm; each alarm produced by an alarm source is assigned an alarm identification code. When an alarm source produces a recovery, the identification code need not be reclaimed provided the pool of codes is defined large enough: a sufficiently large pool lasts far longer than the period for which the system equipment runs without a restart, so a restart of the equipment avoids any shortage of codes.
Fig. 9 shows the handling flow for alarm messages of the notification type. A notification-type alarm message produced by an alarm source merely informs the network management system that some device has begun starting up, has finished starting up, has changed link state, and so on. It closely resembles the unacknowledged Trap of SNMP. The alarm management module does not put the notification content into the storage area but sends it directly to the network management system, and the network management system does not need to acknowledge a notification Trap it receives.
Fig. 10 shows the handling flow for alarm messages of the alarm type. An alarm informs the network management system of an important fault in the equipment or a major problem in the system. After receiving an alarm, the alarm management module first judges, according to the content parameters of the alarm, whether a duplicate alarm exists in the storage area; if one exists, no processing is performed. If none exists, an alarm identification code is obtained, a stored record is formed and saved, and the alarm is then sent to the network management system. Every alarm produced by an alarm source is initially unacknowledged, so after sending, the resend time count and resend number count must be set in the stored record. When the network management system receives the alarm Trap, it acknowledges the alarm.
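The alarm handling flow of Fig. 10 can be sketched as follows. The function `handle_alarm`, the dictionary-based record layout, and the constants `RESEND_INTERVAL` and `MAX_RESENDS` are hypothetical. The sketch compares content parameters only against records currently marked as alarms, which is one possible reading; the text does not say how a pending recovery with the same content should be treated.

```python
import itertools

RESEND_INTERVAL = 3   # hypothetical: timer ticks between resends
MAX_RESENDS = 2       # hypothetical: resends allowed before giving up

_codes = itertools.count(1)   # stand-in for the alarm ID resource


def handle_alarm(store, content, send):
    """Store and send a new alarm; return its code, or None if duplicate."""
    for rec in store.values():
        if rec["content"] == content and rec["state"].startswith("alarm"):
            return None                      # duplicate: do nothing
    code = next(_codes)
    store[code] = {"content": content, "state": "alarm_unconfirmed"}
    send("alarm", code, content)             # Trap to the network manager
    # After sending, arm the resend counters for this unacknowledged alarm
    store[code]["timer"] = RESEND_INTERVAL
    store[code]["resends_left"] = MAX_RESENDS
    return code
```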
Fig. 11 shows the handling flow for alarm messages of the recovery type. A recovery corresponds one to one with an alarm; it indicates that the fault that occurred has been repaired and that the equipment is now running normally. After receiving a recovery, the alarm management module first judges, according to the content parameters of the recovery, whether a corresponding alarm exists in the storage area; if none exists, no processing is performed. If one exists, the mark of the stored record is changed to unacknowledged recovery, the recovery is sent to the network management system, and the resend time count and resend number count are set anew for this recovery. When the network management system receives the recovery Trap, it acknowledges the recovery.
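The recovery flow of Fig. 11 might look like this in outline. As before, `handle_recovery`, the record layout and the two constants are assumptions made for the sketch, not names from the source.

```python
RESEND_INTERVAL = 3   # hypothetical: timer ticks between resends
MAX_RESENDS = 2       # hypothetical: resends allowed before giving up


def handle_recovery(store, content, send):
    """Turn the matching alarm record into an unacknowledged recovery.

    Returns the alarm identification code, or None if no alarm matches.
    """
    for code, rec in store.items():
        if rec["content"] == content and rec["state"].startswith("alarm"):
            rec["state"] = "recovery_unconfirmed"
            send("recovery", code, content)   # Trap to the network manager
            # Re-arm the resend counters for the recovery
            rec["timer"] = RESEND_INTERVAL
            rec["resends_left"] = MAX_RESENDS
            return code
    return None   # no corresponding alarm: do nothing
```

The record is kept, not deleted, at this point: deletion only happens once the recovery itself has been acknowledged or has timed out.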
Fig. 12 shows the handling flow for alarm messages of the alarm cancellation type. An alarm cancellation indicates that an alarm source has been lost, or that some of the alarms produced by an alarm source have recovered. When an alarm source is lost, all alarms produced by that source must be recovered. After receiving an alarm cancellation, the alarm management module first searches the storage area for alarms matching the content parameters of the cancellation; if an alarm stored record does not satisfy the condition, no processing is performed on it. If the record represents an alarm produced by the alarm source identified by the content parameters, the mark of the alarm stored record is changed to unacknowledged recovery, the recovery is sent to the network management system, and the resend time count and resend number count are set anew for this recovery. When the network management system receives the recovery Trap, it acknowledges the recovery.
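A minimal sketch of the cancellation flow of Fig. 12 follows, assuming that the content parameters of a cancellation reduce to an alarm source identifier stored in each record under a hypothetical `source` key. The function name and constants are likewise illustrative only.

```python
RESEND_INTERVAL = 3   # hypothetical: timer ticks between resends
MAX_RESENDS = 2       # hypothetical: resends allowed before giving up


def handle_cancel(store, source, send):
    """Recover every stored alarm raised by the given alarm source.

    Returns the list of alarm identification codes that were recovered.
    """
    recovered = []
    for code, rec in store.items():
        if rec["source"] != source or not rec["state"].startswith("alarm"):
            continue                          # record does not match
        rec["state"] = "recovery_unconfirmed"
        send("recovery", code)                # one recovery Trap per alarm
        rec["timer"] = RESEND_INTERVAL
        rec["resends_left"] = MAX_RESENDS
        recovered.append(code)
    return recovered
```

Unlike a plain recovery, a single cancellation may therefore produce several recovery Traps, one for each alarm that the lost source had raised.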
Fig. 13 shows how unacknowledged alarms and recoveries in the storage area are processed, that is, the resend function. While the system equipment is running, abnormal communication between the network management system and the managed network device may prevent some acknowledgments from reaching the alarm management module in time, so that some unacknowledged alarms and recoveries remain in the storage area. Although these unacknowledged alarms can eventually be acknowledged through the alarm refresh and alarm synchronization of the network management system, such acknowledgment does not reflect the real-time nature of communication equipment alarms. Unacknowledged recoveries, for their part, would remain as permanent data and waste the limited storage area of the system. Therefore, to better reflect the real-time nature of alarms and to remove the data that occupies resources, the alarm management module contains a periodic timer that scans the storage area once per period and, according to the resend time count and resend number count kept in each stored record, decides whether to send a Trap to the network management system or to perform acknowledgment processing.
When the timer fires, if the resend number count has reached the maximum and the resend time count is 0, the unacknowledged alarm or recovery has been sent n times (n >= 2) within the set time without an acknowledgment from the network management system being received, because of abnormal communication or for other reasons; the unacknowledged recovery is therefore removed and the unacknowledged alarm is marked as acknowledged. Otherwise, before the n-th send is reached, the module judges whether the resend time has arrived. If it has, the unacknowledged alarm or recovery is resent. If it has not, nothing is done and the module waits for the resend time to arrive. If an acknowledgment from the network management system is received before the maximum number of sends is reached, it is handled by the normal acknowledgment flow. The periodic processing handles only unacknowledged alarms and recoveries; acknowledged alarms and recoveries are not processed.
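One pass of the periodic scan can be sketched as below. The function `scan_once` and the record keys are hypothetical. The sketch counts both values downwards (`timer` ticks remaining until the next resend, `resends_left` resends still allowed); this is an assumption, since the text describes the resend number count both as reaching a maximum and as being reduced.

```python
def scan_once(store, send, interval=3):
    """Run one timer period over every stored record."""
    for code in list(store):          # copy keys: records may be deleted
        rec = store[code]
        if rec["state"] == "alarm_confirmed":
            continue                  # acknowledged records are skipped
        if rec["timer"] > 0:
            rec["timer"] -= 1         # resend time not reached: keep waiting
            continue
        if rec["resends_left"] == 0:
            # Resend time reached and all resends used up: self-acknowledge
            if rec["state"] == "recovery_unconfirmed":
                del store[code]       # unacknowledged recovery is removed
            else:
                rec["state"] = "alarm_confirmed"
            continue
        # Resend time reached and resends remain: send again and re-arm
        send(rec["state"], code)
        rec["resends_left"] -= 1
        rec["timer"] = interval
```

A real module would call this from its periodic timer while holding the semaphore that protects the storage area, so that acknowledgments arriving from the network management system cannot interleave with the scan.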
Fig. 14 shows how an alarm acknowledgment from the network management system is handled. The network management system indicates which alarm it is acknowledging by setting the alarm identification code on the MIB object that represents alarm acknowledgment. After receiving the alarm acknowledgment, the alarm management module searches the storage area for the alarm record identified by this code. If it exists, the record mark is changed to acknowledged alarm, and the resend time count and resend number count are then cleared. If it does not exist, nothing is done.
Fig. 15 shows how a recovery acknowledgment from the network management system is handled. The network management system indicates which recovery it is acknowledging by setting the alarm identification code on the MIB object that represents recovery acknowledgment. After receiving the recovery acknowledgment, the alarm management module searches the storage area for the stored record identified by this code. If it exists, the stored record is deleted. If it does not exist, nothing is done.
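The two acknowledgment flows of Fig. 14 and Fig. 15 reduce to a lookup by alarm identification code. In this sketch the function names and record layout are hypothetical, and the check that an alarm acknowledgment only applies to a record still marked as an alarm is a defensive assumption that the text does not state.

```python
def confirm_alarm(store, code):
    """Handle a set on the alarm-acknowledgment MIB object."""
    rec = store.get(code)
    if rec is None or not rec["state"].startswith("alarm"):
        return False                  # no such alarm record: do nothing
    rec["state"] = "alarm_confirmed"
    rec["timer"] = 0                  # clear the resend time count
    rec["resends_left"] = 0           # clear the resend number count
    return True


def confirm_recovery(store, code):
    """Handle a set on the recovery-acknowledgment MIB object."""
    # The record is deleted if present; otherwise nothing happens
    return store.pop(code, None) is not None
```

An acknowledged alarm stays in the storage area so that it can still be recovered, synchronized or refreshed later, whereas an acknowledged recovery has nothing left to report and frees its slot.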
Fig. 16 shows the first step of the alarm synchronization operation of the network management system: obtaining the alarm identification codes of all saved alarm records. The network management system performs a read operation on the MIB object that represents obtaining the alarm identification codes. After receiving the read operation, the alarm management module collects the alarm identification codes of all unacknowledged and acknowledged alarms saved in the storage area and assigns them to this MIB object.
Fig. 17 shows the second step of the alarm synchronization operation of the network management system: requesting the sending of the alarm records identified by a specified group of alarm identification codes. After the system equipment has run for some time, abnormal communication, limits on system traffic or similar causes may prevent the network management system from receiving alarms or recoveries that were sent. Once the system is running normally again, the network management system initiates the alarm synchronization operation to ensure that alarms are displayed promptly. After completing the first step, the network management system compares, one by one, the alarm identification codes it holds with those contained in the group of values obtained from the MIB object. If an alarm identification code held by the network management system does not exist in the group of MIB object values, the network management system considers the alarm identified by that code to have recovered and handles it as a recovery. Otherwise it is handled as an alarm. If an alarm identification code in the group of MIB object values does not exist in the group of alarm identification codes held by the network management system, the second step of alarm synchronization is carried out for it. Otherwise it is handled as an alarm.
The network management system sets the group of alarm identification codes on the MIB object that represents a request for specified alarms, thereby requesting those alarms. After receiving the group of alarm identification codes, the alarm management module takes the codes from the group one by one and does the following for each: it checks whether the alarm stored record corresponding to the code exists; if it exists, the alarm is sent to the network management system and processing continues with the next code. If it does not exist, processing simply continues with the next code.
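The synchronization exchange can be summarized as a collection step on the agent, a set comparison on the network management system, and a selective resend on the agent. The three functions below are hypothetical names for those parts and use plain dictionaries and sets; the way the codes are encoded into the MIB object is not specified by the text and is left out.

```python
def collect_alarm_codes(store):
    """Agent side: codes of all saved alarms, acknowledged or not."""
    return sorted(code for code, rec in store.items()
                  if rec["state"] in ("alarm_unconfirmed", "alarm_confirmed"))


def compare_codes(manager_codes, agent_codes):
    """Manager side: split the codes by which side knows them.

    Returns (recovered, missing): codes only the manager holds are
    treated as recovered; codes only the agent holds must be requested.
    """
    manager, agent = set(manager_codes), set(agent_codes)
    return manager - agent, agent - manager


def send_requested(store, requested, send):
    """Agent side: send each requested alarm that is still stored."""
    sent = []
    for code in requested:
        if code in store:             # unknown codes are silently skipped
            send(code, store[code])
            sent.append(code)
    return sent
```

Codes present on both sides need no transfer at all, which keeps the exchange small after a short outage.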
Fig. 18 shows how an alarm refresh request from the network management system is handled. The network management system performs a set operation on the MIB object that represents a request to refresh alarms, thereby requesting the alarms. After receiving the alarm refresh request, the alarm management module begins searching the storage area. If a stored record is an alarm record, the alarm is sent to the network management system. If it is not an alarm record, nothing is done. The operation continues in this way until the whole storage area has been searched.
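The refresh flow is a single pass over the storage area. In the sketch below, `refresh_alarms` and the record layout are hypothetical, and treating both acknowledged and unacknowledged alarms as "alarm records" is an assumption, since the text does not distinguish them here.

```python
def refresh_alarms(store, send):
    """Send every stored alarm record; return how many were sent."""
    count = 0
    for code, rec in store.items():
        if rec["state"].startswith("alarm"):
            send(code, rec)           # alarm record: send it again
            count += 1
        # non-alarm records (pending recoveries) are skipped
    return count
```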