CN101150430B - A method for realizing network interface board switching based heartbeat mechanism - Google Patents

Publication number
CN101150430B
Authority
CN
China
Prior art keywords
network interface
interface board
main
standby
operating state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2007100771822A
Other languages
Chinese (zh)
Other versions
CN101150430A (en)
Inventor
李智强
李旭瑜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to CN2007100771822A
Publication of CN101150430A
Application granted
Publication of CN101150430B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)
  • Hardware Redundancy (AREA)

Abstract

This invention discloses a method for switching network interface boards by means of a heartbeat mechanism. When a first network interface board starts up, it is set to standby by default and asks the system control board to send configuration information, and the control board sends that information to the first board. The first and second boards each send heartbeat messages containing their own working state to the control board, which decides the active/standby relationship of the two boards according to a switching strategy and the working states of the two boards. The two boards also report to each other their local working state and their communication state with the control board. When the active board judges that its communication with the control board has failed, the two boards redetermine their active/standby relationship between themselves according to the switching strategy. Thus, when a fault occurs, the boards can recover by themselves and service is not interrupted.

Description

A method for realizing network interface board switching by a heartbeat mechanism
Technical field
The present invention relates to fault recovery in the communications field, and in particular to a method for realizing network interface board switching by a heartbeat mechanism.
Background technology
For a communication network, the most basic requirement is uninterrupted service. Equipment is therefore generally designed with active and standby network interface boards, so that if the active board fails, a switchover can be carried out quickly and reliably, giving the network high reliability.
The basic framework of a softswitch system (SoftSwitch, abbreviated SS) based on an all-IP network is shown in Fig. 1. The framework comprises a system control board (SC, System Control) 101, a system protocol processing board (SPC, System Protocol Card) 102, a network interface board (NIC, Network Interface Card) 103 and a system switching network (SSN, System Switch Net) 104. SSN104 provides communication among SC101, SPC102 and NIC103. NIC103 connects through an external Ethernet switch to the wide area network and handles message distribution: it distributes messages arriving from the wide area network to the corresponding SPC102, which implements services such as calls; after processing, SPC102 sends messages to NIC103, which forwards them to the wide area network. NIC103 therefore plays a critical role: a NIC103 fault, combined with an unreliable active/standby switchover, directly causes service interruption.
According to telecommunication service requirements, a switchover of NIC103 must not interrupt service, and a failure during the switchover, or a switchover fault, must be recoverable by self-healing within a very short time, so that the loss caused by a network interruption is minimized. In the prior art, the active/standby mechanism of the NICs, shown in Fig. 2, fixes the physical positions of the two mutually redundant NIC boards at hardware design time, and is realized mainly by the following steps.
(a) NIC201 and NIC202 start up and determine by competition which is active and which is standby;
(b) once the active or standby working state is determined, the active NIC201 receives and sends data through its Ethernet port, periodically checks the state of its local Ethernet port, and replies to the heartbeat handshake messages sent by the standby NIC202; the standby NIC202 periodically sends heartbeat handshake messages to the active NIC201 and neither sends nor receives any data;
(c) when the active NIC201 detects that a port on the local board has failed, it sends a switchover request to the standby NIC202 and at the same time changes its own state from active to standby;
(d) when the standby NIC202 receives a switchover request, or does not receive a correct reply from the active NIC201 within a set time, it changes from the standby state to the active state;
(e) the active NIC201 receives messages from wide area network 204 through Ethernet switch 203 and distributes them to the service processing boards; the service boards process the messages and send them through the active NIC201 to Ethernet switch 203 and on to wide area network 204.
Because the active/standby roles of the NICs are fixed by physical position, they cannot be set by data configuration. When communication between the active NIC201 and the standby NIC202 is interrupted, both boards may end up active or both standby. When a switchover is required after NIC201 fails and is to be carried out by the standby NIC202, a switchover failure on either the active side or the standby side can likewise leave the pair both active or both standby. In the prior art this condition either requires manual intervention to recover or cannot be self-healed by the NICs themselves; if not handled in time, it can in serious cases cause large-scale service interruption and heavy losses.
Therefore, the prior-art NIC switching technique cannot set the active/standby relationship of the NICs by data configuration, which can seriously affect existing services when the system is expanded; and a NIC switchover fault or an interruption of communication between the active and standby NICs can leave both boards active or both standby, causing service interruption that is both hard to maintain and costly.
The prior art is therefore deficient and needs improvement.
Summary of the invention
The object of the present invention is to provide a method for realizing network interface board switching by a heartbeat mechanism, in which the active/standby relationship of the NICs is set by data configuration and the NICs can self-heal when a fault occurs, thereby avoiding service interruption.
Technical scheme of the present invention is as follows:
A method for realizing network interface board switching by a heartbeat mechanism, comprising the following steps:
A1. When a first network interface board starts up, it is set to standby by default and requests the system control board to send a configuration message; the configuration message comprises information about a second network interface board that is in an active/standby relationship with the first network interface board, and a switching strategy.
A2. The system control board sends the configuration message to the first network interface board.
A3. The first network interface board and the second network interface board each send heartbeat messages containing the local board's working state to the system control board.
A4. The system control board makes a decision according to the switching strategy and the working states of the two network interface boards, and determines the active/standby relationship of the two boards.
A5. The active network interface board and the standby network interface board report to each other, by heartbeat, the local board's working state and the local board's communication state with the system control board.
A6. The active network interface board, according to the communication state between the local board and the system control board and the communication state between the standby network interface board and the system control board, judges whether the local board's communication with the system control board has failed; if so, the active and standby network interface boards make a decision between themselves according to the switching strategy and redetermine the active/standby relationship of the two boards.
The method, wherein the working state comprises at least one of: network cable connection state, gateway state, port traffic, CPU utilization, memory utilization, communication state with the system control board, current active/standby relationship, and task running state.
The method, further comprising step A7: the standby network interface board, according to the communication state of the local board with the system control board and with the active network interface board, judges that the local board's communication with the system control board has failed and that the active network interface board has failed, and then sets itself as the active network interface board.
The method, further comprising, before step A1, step A0: the user performs data configuration; the data configuration comprises configuring at least one of: the active/standby relationship of each network interface board, the heartbeat interval, and the switching strategy.
The method, wherein step A4 further comprises: the system control board periodically reports the active/standby information of each network interface board to the operation and maintenance console.
The method, wherein step A5 further comprises step A51: when a network interface board judges that the working state of the active or standby board has changed, it sends alarm information to the operation and maintenance console; or, when a network interface board judges that the working state of the active or standby network interface board has changed and at least one condition of the switching strategy is satisfied, it sends alarm information to the operation and maintenance console.
The method, wherein step A5 further comprises step A52: each network interface board writes the changed working state of the active and standby network interface boards to its log file.
The method, wherein the switching strategy is to decide whether to carry out an active/standby switchover of the network interface boards according to the working state of the network interface boards and a comparison of the working state of the standby network interface board with that of the active network interface board.
The method, wherein comparing the working state of the standby network interface board with that of the active network interface board comprises the following steps: B1, setting up a scoring system for each item of the working state of a network interface board; B2, calculating separately the score of all working-state items of the standby network interface board and the score of all working-state items of the active network interface board.
The method, wherein the conditions of the switching strategy comprise at least one of: CPU utilization exceeds a CPU utilization preset value, memory utilization exceeds a memory utilization preset value, port traffic exceeds a port traffic preset value, the network cable link is down, the gateway is unreachable, communication with the system control board is interrupted, and a task is suspended.
With the above scheme, the system control board can use the heartbeat mechanism to select the better of the two network interface boards as the active board, which greatly benefits the stability and reliability of data transmission and allows a fast switchover when a fault occurs. The active/standby relationship of specific network interface boards is set by data configuration, and, as a supplementary function, the two mutually redundant boards report heartbeat messages to each other, so that when the system control board is working abnormally the active and standby boards can decide the active/standby relationship between themselves. Through the decision strategy between the system control board and the network interface boards, the system can self-heal even when a fault occurs, with the advantages of uninterrupted service and high network reliability. In addition, a softswitch system can be configured with multiple pairs of active/standby network interface boards, so capacity expansion does not affect existing services; and the system control board can choose the active and standby network interface boards by optimal selection, so a switchover can take place before a fault occurs.
Description of drawings
Fig. 1 is a structural diagram of a prior-art softswitch system;
Fig. 2 is a schematic diagram of the prior-art active/standby network interface boards;
Fig. 3 is a schematic diagram of the network interface boards of one embodiment of the present invention;
Fig. 4 is a start-up flow chart of the network interface board of one embodiment of the present invention;
Fig. 5 is a flow chart of the method of the present invention.
Embodiment
Preferred embodiments of the present invention are described in detail below.
As shown in Fig. 5, the invention provides a method for realizing NIC switching by a heartbeat mechanism. Each NIC uses heartbeats to report its working state and active/standby relationship to the SC, and the SC decides the active/standby relationship of the NICs according to the configured strategy. To guard against SC failure, the active and standby NICs additionally report heartbeats to each other, so that even if the SC fails the system can self-heal in the shortest possible time. A switchover can thus be made quickly when a fault occurs, with as little effect on service as possible, giving the network high reliability. The method comprises the following steps.
A1. When a first NIC starts up, it is set to standby by default and requests the SC to send a configuration message; the configuration message comprises information about a second NIC that is in an active/standby relationship with the first NIC, and a switching strategy. That is, a NIC that has just started defaults to being the standby board. After the NIC powers up, it first sends a request to the SC board to obtain its configuration information, which includes which NIC the local board is paired with in an active/standby relationship and the decision strategy data for switching.
Before this, the method may also include step A0: the user performs data configuration; the data configuration comprises configuring at least one of: the active/standby relationship of each NIC, the heartbeat interval, and the switching strategy.
The user can log in to the maintenance console (GUI) to configure the data, which can then be conveniently queried, modified and maintained. The active/standby relationship of the NICs can thus be changed dynamically by data configuration: it can be configured flexibly according to service requirements, and can also be cancelled during testing. The configuration is flexible and can cover the active/standby relationship, the heartbeat interval, the switching strategy and so on.
The heartbeat interval can be set by data configuration. If the configured value is invalid, a default value is generally used; for example, the default is to report a heartbeat every 1 second.
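The fallback to a default heartbeat interval can be sketched as follows. This is a minimal illustration in Python under stated assumptions: the function name `effective_heartbeat_interval` is hypothetical, and the text does not say what makes a configured value invalid, so the checks here (not a number, not finite, not positive) are assumptions. Only the 1-second default comes from the text.

```python
import math

DEFAULT_HEARTBEAT_SECONDS = 1.0   # default value given in the text


def effective_heartbeat_interval(configured):
    """Return the configured interval in seconds, or the default if invalid.

    What counts as "invalid" is an assumption: anything that is not a
    finite, positive number.
    """
    is_number = isinstance(configured, (int, float)) and not isinstance(configured, bool)
    if not is_number or not math.isfinite(configured) or configured <= 0:
        return DEFAULT_HEARTBEAT_SECONDS
    return float(configured)


print(effective_heartbeat_interval(None))   # falls back to the default
print(effective_heartbeat_interval(0.5))    # a valid configured value is kept
```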
The switching strategy is to decide whether to carry out an active/standby switchover of the NICs according to the working state of the NICs and a comparison of the working state of the standby NIC with that of the active NIC.
The switching strategy can comprise several conditions, and a strategy is generated from a combination of them. The conditions of the switching strategy comprise at least one of: CPU utilization exceeds a CPU utilization preset value, memory utilization exceeds a memory utilization preset value, port traffic exceeds a port traffic preset value, the network cable link is down (Link Down), the gateway is unreachable (Ping fails), communication with the SC is interrupted, and a task is suspended. The user can change the CPU utilization preset value, the memory utilization preset value, the port traffic preset value and so on; for example, the CPU utilization preset value is set to 90%, the memory utilization preset value to 80%, and the port traffic preset value to 85%.
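One way to evaluate these conditions against a reported working state is sketched below. The type and field names (`WorkingState`, `Thresholds`, `triggered_conditions`) and the condition labels are assumptions made for illustration; only the seven conditions and the example preset values of 90%, 80% and 85% come from the text.

```python
from dataclasses import dataclass


@dataclass
class WorkingState:
    # Hypothetical representation of one NIC's reported working state.
    cpu_percent: float
    memory_percent: float
    port_traffic_percent: float
    link_up: bool
    gateway_reachable: bool
    sc_reachable: bool
    task_suspended: bool


@dataclass
class Thresholds:
    # Example preset values from the text; the user may change them.
    cpu_percent: float = 90.0
    memory_percent: float = 80.0
    port_traffic_percent: float = 85.0


def triggered_conditions(state, limits=None):
    """Return the labels of all switching-strategy conditions the state meets."""
    limits = limits or Thresholds()
    checks = [
        ("cpu_over_preset", state.cpu_percent > limits.cpu_percent),
        ("memory_over_preset", state.memory_percent > limits.memory_percent),
        ("port_traffic_over_preset",
         state.port_traffic_percent > limits.port_traffic_percent),
        ("link_down", not state.link_up),
        ("gateway_unreachable", not state.gateway_reachable),
        ("sc_comm_interrupted", not state.sc_reachable),
        ("task_suspended", state.task_suspended),
    ]
    return [label for label, hit in checks if hit]


healthy = WorkingState(40, 50, 30, True, True, True, False)
degraded = WorkingState(95, 50, 30, False, True, True, False)
print(triggered_conditions(healthy))
print(triggered_conditions(degraded))
```

Keeping the preset values in their own object reflects the point that the user can change them without touching the checks themselves.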
In this way the switching strategy can be changed dynamically. For example, the conditions of the switching strategy can be relaxed so that a switchover takes place before the active NIC actually fails, preventing the adverse effects that follow a fault. The strategy conditions are generally also realized by data configuration, which is more flexible: conditions can be added as required, such as whether the NIC's network cable is LINK DOWN, whether the gateway is reachable, whether network traffic is above a certain threshold, CPU utilization, memory utilization, and whether the standby NIC's condition is much better than the active NIC's, to decide whether to switch in advance.
Generally, to improve efficiency, the working states of the standby and active NICs need to be compared only when both NICs are usable. Specifically, comparing the working state of the standby NIC with that of the active NIC comprises the following steps: B1, setting up a scoring system for each item of the working state of a NIC; for example, the full score of every item is 100 points; or the items are given different weights, for example a full score of 100 points for the network cable connection state, 90 points for the gateway state, 95 points for port traffic, 120 points for CPU utilization, and so on. B2, calculating separately the score of all working-state items of the standby NIC and the score of all working-state items of the active NIC. In this case the switching strategy may be to select the NIC with the higher score as the new active NIC. For example, when the total score of all working-state items of the standby NIC is greater than that of the active NIC, the former standby NIC is selected as the new active NIC and the former active NIC becomes the new standby NIC; that is, an active/standby switchover is carried out.
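Steps B1 and B2 can be sketched as follows, using the weighted full marks from the example (100, 90, 95 and 120 points). The text does not say how an individual item earns its marks, so the per-item rule here is an assumption: up/down items earn full marks or zero, and utilization items earn full marks scaled by the unused share. The function names are hypothetical.

```python
# B1: full marks per working-state item, from the weighted example in the text.
FULL_MARKS = {"link": 100, "gateway": 90, "port_traffic": 95, "cpu": 120}


def total_score(link_up, gateway_reachable, port_traffic_percent, cpu_percent):
    """B2: total score of one NIC (per-item scoring rule is an assumption)."""
    score = 0.0
    score += FULL_MARKS["link"] if link_up else 0
    score += FULL_MARKS["gateway"] if gateway_reachable else 0
    score += FULL_MARKS["port_traffic"] * (100 - port_traffic_percent) / 100
    score += FULL_MARKS["cpu"] * (100 - cpu_percent) / 100
    return score


def should_switch(active_score, standby_score):
    """Switch only when the standby's total is strictly greater."""
    return standby_score > active_score


active = total_score(True, True, port_traffic_percent=80, cpu_percent=90)
standby = total_score(True, True, port_traffic_percent=10, cpu_percent=20)
print(active, standby, should_switch(active, standby))
```

With these made-up inputs the busy active board scores lower than the idle standby, so the sketch would select the former standby as the new active NIC.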
A2. The SC sends the configuration message to the first NIC.
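The start-up exchange of steps A1 and A2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `Role`, `ConfigMessage`, `Nic` and `fake_sc_request`, the field layout of the configuration message, and the strategy values are all assumptions introduced here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Role(Enum):
    ACTIVE = "active"
    STANDBY = "standby"


@dataclass
class ConfigMessage:
    # Hypothetical layout: the text only says the message identifies the
    # peer NIC and carries the switching strategy.
    peer_nic_id: str
    switching_strategy: dict


@dataclass
class Nic:
    nic_id: str
    role: Role = Role.STANDBY            # A1: a NIC that has just started is standby
    config: Optional[ConfigMessage] = None

    def start(self, request_config: Callable[[str], ConfigMessage]) -> None:
        """Come up as standby, then ask the SC for the configuration message."""
        self.role = Role.STANDBY
        self.config = request_config(self.nic_id)   # A2: the SC replies


def fake_sc_request(nic_id: str) -> ConfigMessage:
    # Stand-in for the SC (assumption); pairing and strategy values are made up.
    pairs = {"NIC302": "NIC303", "NIC303": "NIC302"}
    return ConfigMessage(peer_nic_id=pairs[nic_id],
                         switching_strategy={"cpu_percent_max": 90})


nic = Nic("NIC302")
nic.start(fake_sc_request)
print(nic.role.value, nic.config.peer_nic_id)
```

The NIC stays in the standby role after start-up; in the method described here it becomes active only through a later decision by the SC or between the two boards.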
A3. The first NIC and the second NIC each send heartbeat messages containing the local board's working state to the SC. That is, through the heartbeat mechanism, the first NIC and the second NIC each send heartbeat messages that report their respective working states to the SC. The working state comprises at least one of: the local board's network cable connection state, gateway ping state, port traffic (network port traffic), CPU utilization, memory utilization, communication state with the SC, current active/standby relationship, and task running state.
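A heartbeat message carrying these eight items might be represented as below. The text does not specify a wire format, so the `Heartbeat` field names and the JSON encoding are assumptions chosen only to make the sketch self-contained.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Heartbeat:
    # One field per working-state item listed in the text, plus the sender id.
    nic_id: str
    link_up: bool
    gateway_reachable: bool
    port_traffic_percent: float
    cpu_percent: float
    memory_percent: float
    sc_reachable: bool
    role: str                 # current active/standby relationship
    tasks_ok: bool            # task running state


def encode(hb):
    """Serialize a heartbeat to bytes (JSON is an assumed format)."""
    return json.dumps(asdict(hb)).encode("utf-8")


def decode(raw):
    """Rebuild a heartbeat from the bytes produced by encode()."""
    return Heartbeat(**json.loads(raw.decode("utf-8")))


hb = Heartbeat("NIC302", True, True, 35.0, 20.5, 41.0, True, "active", True)
print(decode(encode(hb)) == hb)
```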
A4. The SC makes a decision according to the switching strategy and the working states of the two NICs, and determines their active/standby relationship. The SC receives the heartbeat messages of the active and standby NICs, parses their content, compares them according to the switching strategy, and decides whether to switch. In other words, the SC decides the active/standby relationship from the information reported in the heartbeats of the two mutually redundant NICs.
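The SC's decision in step A4 can be sketched as a function over the two reported states. The rule used here (switch only when the active NIC meets a strategy condition and the standby does not) is one possible strategy chosen for illustration, and `decide_active` and its parameters are hypothetical names.

```python
def decide_active(current_active, reports, violates):
    """Return the id of the NIC that should be active after this round.

    reports  : dict mapping each of the two NIC ids to its reported state
    violates : callable(state) -> bool, True when the state meets a
               switching-strategy condition
    """
    (standby,) = [nic_id for nic_id in reports if nic_id != current_active]
    if violates(reports[current_active]) and not violates(reports[standby]):
        return standby          # switch: the standby is healthy, the active is not
    return current_active       # otherwise keep the current relationship


def cpu_over_90(state):
    # Toy condition for the example; real strategies combine several conditions.
    return state["cpu"] > 90


reports = {"NIC302": {"cpu": 95}, "NIC303": {"cpu": 20}}
print(decide_active("NIC302", reports, cpu_over_90))
```

Keeping the current relationship when both boards are degraded avoids switching back and forth between two unhealthy boards, which is a design choice of this sketch rather than something the text prescribes.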
The active NIC not only receives the heartbeat messages of the standby NIC; its main work is to distribute external messages to the corresponding SPC for processing, realizing call services. The standby NIC receives the heartbeat messages of the active NIC and synchronizes the working-state information described above, including the network cable connection state, gateway state, port traffic, CPU utilization, memory utilization, communication state with the SC, current active/standby relationship, task running state and so on. The standby NIC does not receive or process service messages.
Step A4 may further comprise: the SC periodically reports the active/standby information of each NIC to the operation and maintenance console, so that engineering maintenance staff can analyze how the equipment is running in good time. For example, if the SC does not receive a heartbeat message from an active or standby NIC within a certain time, it sends an alarm to the maintenance terminal; if heartbeats are missing for a long time, it also writes a more detailed log as a basis for later analysis by maintenance staff, and decides on a switchover of the active and standby NICs.
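Detecting a missing heartbeat on the SC side can be sketched as a comparison of timestamps. The function name `overdue_nics` and the 3-second timeout are assumptions; the text only speaks of "a certain time".

```python
def overdue_nics(last_seen, now, timeout_seconds=3.0):
    """Return, sorted, the NIC ids whose last heartbeat is older than the timeout.

    last_seen : dict mapping NIC id -> time (in seconds) of its last heartbeat
    now       : current time on the same clock
    """
    return sorted(nic_id for nic_id, seen in last_seen.items()
                  if now - seen > timeout_seconds)


# Made-up timestamps: NIC303 has been silent for 4.5 seconds.
last_seen = {"NIC302": 100.0, "NIC303": 96.0}
for nic_id in overdue_nics(last_seen, now=100.5):
    print("alarm: no heartbeat from", nic_id)
```

In a running system the timestamps would come from a monotonic clock such as `time.monotonic()`, so that wall-clock adjustments do not trigger false alarms.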
A5. The active NIC and the standby NIC report to each other, by heartbeat, the local board's working state and the local board's communication state with the SC. If communication with the SC fails, the active/standby relationship is decided between the active and standby NICs. Communication failure is diagnosed by prior-art detection mechanisms: for example, a communication break is reported through an interrupt, which has the highest priority, and the local board only needs to check the interface provided for it. Alternatively, when the number of consecutive failed heartbeat sends reaches a certain count, communication with the SC is considered to have failed, for example when the SC board restarts suddenly or goes offline.
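The second detection method, counting consecutive failed heartbeat sends, can be sketched as a small counter. The class name `ScLinkMonitor` and the limit of 3 failures are assumptions; the text says only "a certain number of times".

```python
class ScLinkMonitor:
    """Tracks consecutive failed heartbeat sends from a NIC to the SC."""

    def __init__(self, max_failures=3):      # the limit is an assumed value
        self.max_failures = max_failures
        self.consecutive_failures = 0

    def record_send(self, succeeded):
        """Record the outcome of one heartbeat send."""
        if succeeded:
            self.consecutive_failures = 0    # any success resets the run
        else:
            self.consecutive_failures += 1

    @property
    def sc_link_failed(self):
        """True once the run of failures reaches the limit."""
        return self.consecutive_failures >= self.max_failures


monitor = ScLinkMonitor()
for ok in [False, False, True, False, False, False]:
    monitor.record_send(ok)
print(monitor.sc_link_failed)
```

Resetting the counter on every success means isolated losses do not accumulate into a false failure verdict.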
In this way each NIC knows its own working state and its communication state with the SC, and also knows the working state of the peer NIC and the peer's communication state with the SC; even when the SC is working abnormally, active/standby competition can still be carried out.
Step A5 may further comprise step A51: when a NIC judges that the working state of the active or standby board has changed, it sends alarm information to the operation and maintenance console. Or, when a NIC judges that the working state of the active or standby board has changed and at least one condition of the switching strategy is satisfied, it sends alarm information to the operation and maintenance console. The conditions of the switching strategy comprise at least one of: CPU utilization exceeds a CPU utilization preset value, memory utilization exceeds a memory utilization preset value, port traffic exceeds a port traffic preset value, the network cable link is down, the gateway is unreachable, communication with the SC is interrupted, and a task is suspended. That is, different alarm levels are raised as the local board's CPU utilization, network port traffic, memory utilization and so on exceed different thresholds, together with alarms for network cable LINK DOWN, gateway ping failure, loss of communication with the SC, task suspension and so on. When a NIC fails, a switchover is decided. Equally, a switchover can be decided in advance according to the thresholds, without waiting for a board to fail: for example, when the active NIC's network port traffic is above a certain threshold and at the same time its CPU utilization is above a certain threshold, the standby NIC's condition is compared to decide whether to switch in advance.
Step A5 also comprises step A52: each NIC writes the changed working state of the active and standby boards to its log file.
In this way, according to the switching strategy in the data configuration, a NIC sends alarm information to the maintenance console, as the configured strategy requires, whenever the active or standby working state changes. Preferably, alarm information is given different levels so that alarms can be distinguished by level; for example, interruption of communication with the SC and an unreachable gateway are set to level 1, CPU utilization above 80% is set to level 3, and so on. More detailed information can also be written to a log file for maintenance staff to analyze the cause of the fault, so that the problem can be located faster and more accurately. Generally, every board fault is logged, and a recovery strategy must be considered for the failed board; this can be done by a prior-art diagnostic task so that the user's site is restored promptly, for example through operations such as a reset.
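The levelled alarms can be sketched as a lookup table. The three entries and their levels come from the example in the text; the condition labels, the function name `alarms_for`, and the reading that a lower number is more severe are assumptions.

```python
# Levels taken from the examples in the text. Treating a lower number as
# more severe is an assumption; the text does not state the ordering.
ALARM_LEVELS = {
    "sc_comm_interrupted": 1,
    "gateway_unreachable": 1,
    "cpu_above_80_percent": 3,
}


def alarms_for(conditions):
    """Return (level, condition) pairs for known conditions, lowest level first."""
    known = [(ALARM_LEVELS[c], c) for c in conditions if c in ALARM_LEVELS]
    return sorted(known)


for level, condition in alarms_for(["cpu_above_80_percent", "sc_comm_interrupted"]):
    print(f"level {level}: {condition}")
```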
Maintenance staff can also analyze alarms and other information to decide whether to switch manually. A manual switchover is carried out by a man-machine command on the operation and maintenance console; the switchover command is sent to the SC, which performs the switchover. The operation is simple and does not affect the NIC's normal processing flow.
A6. The active NIC, according to the communication state between the local board and the SC and the communication state between the standby board and the SC, judges whether the local board's communication with the SC has failed; if so, the active and standby NICs make a decision between themselves according to the switching strategy and redetermine the active/standby relationship of the two NICs.
In this way, through the three-way heartbeat messages among the active NIC, the standby NIC and the SC, each party knows accurately the working condition of the local end and of the peer end. If a switchover fault occurs, self-healing can be achieved quickly in this way, so that service is not interrupted.
The method may further comprise step A7: the standby NIC, according to the communication state of the local board with the SC and with the active board, judges that the local board's communication with the SC has failed and that the active board has failed, and then sets itself as the active NIC.
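The choice a NIC faces in steps A6 and A7 can be sketched as a three-way decision. The `Action` names and the function `next_action` are hypothetical, and collapsing the inputs to three booleans is a simplification made for illustration.

```python
from enum import Enum


class Action(Enum):
    FOLLOW_SC = "follow_sc"                      # SC reachable: the SC decides
    NEGOTIATE_WITH_PEER = "negotiate_with_peer"  # A6: decide between the two NICs
    SELF_PROMOTE = "self_promote"                # A7: standby makes itself active


def next_action(is_active, sc_reachable, peer_faulty):
    """Pick what the local NIC does, given its own view of the system."""
    if sc_reachable:
        return Action.FOLLOW_SC
    if not is_active and peer_faulty:
        # A7: the standby has lost the SC and the active board has failed.
        return Action.SELF_PROMOTE
    # A6: the SC is unreachable, so the two boards apply the switching
    # strategy between themselves.
    return Action.NEGOTIATE_WITH_PEER


print(next_action(is_active=False, sc_reachable=False, peer_faulty=True).value)
```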
In practice, when a NIC board receives a heartbeat message from the peer NIC of its active/standby pair, it compares the peer's working state with its own according to the switching strategy and decides whether to switch. In general the working state includes whether the board is in a fault state, for example CPU utilization reaching 100%, network port traffic exceeding the threshold, network cable LINK DOWN, or abnormal task running (which includes task suspension and similar cases); if the faulty board is the active board, a switchover is decided.
A specific implementation of the present invention is further described below with reference to Fig. 3.
System control board 301 controls the state of the whole equipment and monitors how all boards are running, including obtaining data configuration information. It first obtains the configuration information of the network interface boards (NICs) from the data terminal; if the data changes, the system control board (SC) notifies the corresponding NIC to synchronize the information. It also periodically receives the heartbeat messages of NIC302 and NIC303 and, according to the switchover decision strategy in the data configuration, decides whether to switch the mutually redundant NIC302 and NIC303.
If system control board 301 does not receive a heartbeat from NIC302 or NIC303 within a certain time, it sends an alarm to the maintenance terminal; if heartbeats are missing for a long time, it also writes a more detailed log as a basis for later analysis by maintenance staff, and decides on a switchover of NIC302 and NIC303.
Ethernet switch 304 mainly connects NIC302 and NIC303 to the external wide area network 305; wide area network 305 mainly connects user terminals, maintenance terminals and so on.
After starting up, NIC302 first obtains the active/standby configuration information and the switchover decision strategy from SC301, and while running it periodically reports the local board's working state to SC301 by heartbeat. The working state mainly comprises the task running condition; resource utilization, such as the processor (CPU) working state and memory usage; the working state of the connected Ethernet; the result of the local board's periodic check of whether it can communicate with the service gateway; the state of communication between the local board and SC301; whether the local board is in a fault state; and so on.
NIC302 also periodically sends the local board's working state to the standby NIC303 by heartbeat. This is a supplementary function for the case where SC301 fails: NIC302 and NIC303 can then use the heartbeat messages between them to switch according to the configured switching strategy and rebalance the active/standby relationship between NIC302 and NIC303.
NIC302 mainly handles the transmission of messages such as service calls: it receives messages from wide area network 305 through Ethernet switch 304 and distributes them to the specific protocol processing boards, which implement the service calls, and it sends messages from the protocol processing boards to wide area network 305.
NIC303, as the standby network interface board, likewise obtains the active/standby configuration information and the switchover strategy from SC301 after starting up, and while running periodically reports the local board's working state to SC301 by heartbeat; SC301 decides the active/standby relationship according to the heartbeat messages of NIC302 and NIC303.
The NIC303 message such as calling of not managing business receive only main heartbeat message with network interface board 302, also regularly detect with Service Gateway whether can the intercommunication situation, and this plate and SC301 signal intelligence.NIC302 and NIC303, external then insert identical Ethernet switch 304 simultaneously by independent ethernet line separately, link to each other with wide area network 305 through router again.
With the heartbeat mechanism described above, SC301, which makes the active/standby decision, knows the operating states of NIC302 and NIC303 in real time. Using the configured switching policy, it can decide which of NIC302 and NIC303 is active and which is standby, and can switch over to the standby network interface board when that board is in the better condition. The active NIC302 knows in real time the operating state of the standby NIC303 and the state of communication between NIC303 and SC301; the standby NIC303 knows in real time the operating state of the active NIC302 and the state of communication between NIC302 and SC301. The main advantage is that whichever of SC301, NIC302 and NIC303 fails, the active/standby relationship can be settled in time without interrupting service, and a switchover can be made early, which guards against the unpredictable problems of switching only after a fault. This greatly improves the reliability of the network between the active network interface board and wide area network 305. It thereby remedies the shortcomings of existing active/standby switchover for network interface boards and minimizes the impact of a switchover on service; the system recovers by itself after a fault and switchover, so that unattended operation is achieved.
For example, under normal circumstances NIC302 and NIC303 periodically send their own state to SC301 by heartbeat. SC301 analyzes the operating states of NIC302 and NIC303 against the configured switching policy and decides whether to switch over in time, even though the active network interface board has not failed. As another example, suppose the active network interface board fails and a switchover is attempted, but the switchover itself goes wrong: the active NIC does not switch to standby while the standby switches to active, so both NICs of the pair are active. The system control board can then decide the active/standby relationship and settle which board is active. As a further example, suppose the system control board is not working normally, for instance because it has reset or failed, and a switchover between the paired NICs goes wrong so that both are active or both are standby. In this case the heartbeat messages between the paired NICs are relied on: the relationship is settled on a best-choice principle, with the board in the better operating state forced to active and the other forced to standby.
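The second and third examples both end with one board forced to active and the other forced to standby. A minimal sketch of that arbitration follows, assuming each board's operating state has already been reduced to a single numeric score; the function name `arbitrate`, the role strings and the tie-break rule are illustrative assumptions, not part of the patent.

```python
from typing import Dict


def arbitrate(roles: Dict[str, str], scores: Dict[str, float]) -> Dict[str, str]:
    """Return a role map in which exactly one board of the pair is active.

    roles  maps board id -> "active" or "standby" as currently reported.
    scores maps board id -> operating-state score (higher is better).
    """
    boards = sorted(roles)
    actives = [b for b in boards if roles[b] == "active"]
    if len(actives) == 1:
        # Normal case: the pair is already consistent, so nothing changes.
        return dict(roles)
    # Both active or both standby: force the better board to active.
    # On a tie, max() keeps the first board in sorted order (assumed rule).
    best = max(boards, key=lambda b: scores[b])
    return {b: ("active" if b == best else "standby") for b in boards}
```

The same function could run on the system control board in the second example, or on each NIC using the peer's heartbeat in the third, provided both sides apply the same deterministic tie-break.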
Figure 4 shows the complete startup workflow of a network interface board in this embodiment. Because the stability of NIC operation directly affects services such as calls, switchover reliability is the focus of the present invention, and it is described further below.
401, the network interface board is powered on and started. A restart may occur in situations such as a software upgrade, a board fault, or a capacity expansion that adds boards; all of these restart the board.
402, when the board software has finished starting, the board first sets itself to standby by default. This way the newly started board cannot disturb a board that is working normally: if the other board of the pair is already working normally as active, the fault of both boards being active at once is avoided.
403, the network interface board requests data configuration information from the system control board, including the board with which it forms an active/standby pair and the switching policy. A board that has just started must obtain this configuration from the system control board before it may enter the normal operating state.
404, the network interface board reports its active/standby state and operating state to the system control board by heartbeat, and at the same time reports its operating state to the peer network interface board by heartbeat. The state is reported to the system control board so that the active/standby relationship is decided preferentially by the system control board; it is also reported to the peer so that, if the system control board is not working normally, the paired boards can decide between themselves and settle the active/standby relationship.
405, the system control board decides whether to switch over, based on the information reported by the two paired network interface boards and the configured switching policy.
406, the network interface board also monitors its communication with the system control board. If that communication is interrupted, it checks whether the peer network interface board is working normally, and a switchover is performed if it is not.
407, the active network interface board not only receives heartbeat messages from the standby board but, most importantly, distributes service messages; the standby network interface board receives only the active board's heartbeat messages and synchronization messages, and does not process service messages.
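Steps 402 to 407 can be sketched as a small per-board state holder. This is an illustration under stated assumptions: the `Board` class, its method names and the `on_tick` inputs are hypothetical, and step 406 is read here in the sense of claim 3 (a board that has lost the system control board takes over only when the peer is not working normally).

```python
from typing import Optional


class Board:
    """Hypothetical model of one network interface board's role handling."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.role = "standby"            # step 402: default to standby after start
        self.peer: Optional[str] = None  # filled in by configuration
        self.policy: Optional[dict] = None

    def apply_config(self, peer: str, policy: dict) -> None:
        """Step 403: store the paired board and switching policy from the SC."""
        self.peer = peer
        self.policy = policy

    def on_tick(self, sc_reachable: bool, peer_healthy: bool,
                sc_decision: Optional[str] = None) -> str:
        """Steps 404 to 406, evaluated once per heartbeat period."""
        if self.peer is None:
            return self.role             # not configured yet, stay standby
        if sc_reachable:
            if sc_decision is not None:
                self.role = sc_decision  # step 405: the SC decides with priority
            return self.role
        if not peer_healthy:
            self.role = "active"         # step 406: SC lost and peer abnormal
        return self.role

    def handles_service_traffic(self) -> bool:
        """Step 407: only the active board distributes service messages."""
        return self.role == "active"
```

For instance, a freshly started board stays standby until it is configured, follows the system control board's decision while that board is reachable, and promotes itself only when both the system control board and the peer are unavailable.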
In the prior art, switchover is implemented by a handshake between the active and standby network interface boards. If the switchover procedure fails, both boards can easily end up active or both standby at the same time; once this happens the software cannot recover by itself and manual intervention is needed. The present invention improves on this. The active/standby assignment is set by data configuration, and a switching policy can also be configured. The policy conditions can be relaxed so that a switchover takes place before a fault occurs. Because the switching decision policy is implemented through data configuration, neither an upgrade nor a capacity expansion need affect existing services. Because the role of the NIC is so important, the data interaction with the interface and the active/standby decision mechanism are all handled by the SC and synchronized to the NIC, which keeps the NIC implementation simple and reliable. After starting, a NIC first obtains its own active/standby relationship data from the SC; the NIC then periodically reports its operating state and active/standby relationship to the SC by heartbeat, and the SC decides the active/standby relationship of the two NICs. Following the configured switching policy, the SC can also choose the best NIC as active on the basis of the states reported in the two heartbeats. The system control board knows the situation of the active and standby NICs in real time; if a NIC switchover fails, the SC can decide the active/standby relationship and settle it in time. As a supplementary function, the two paired NICs also report heartbeat messages to each other, so that if the system control board is working abnormally the active and standby NICs can still decide the relationship between themselves. If the SC is not working normally, the relationship can thus be settled through the heartbeat messages between the NICs, which keeps the method compatible with the prior art. In this way, even if a fault occurs, the decision strategy shared between the SC and the NICs allows self-recovery and meets the requirement that service not be interrupted.
The advantages of the present invention are as follows. The active or standby role of a NIC is not fixed to a physical location by the hardware design but can be configured flexibly through data. The SS system can be configured with multiple active/standby NIC pairs, so existing services are not affected when a telecom operator expands capacity. Each NIC periodically reports all of its operating state to the SC by heartbeat, including its current active/standby state and its available resources, so the SC can decide the active and standby NICs by choosing the best board and can switch over before a fault occurs. The active and standby NICs also report heartbeat messages to each other, so that when the system control board fails the NICs can still decide the active/standby relationship between themselves.
In summary, with the method of the invention the active/standby network interface boards are set through data configuration, and this flexibility means existing services are not affected during capacity expansion. The active/standby decision is made preferentially by the system control board, which keeps processing on the network interface boards relatively simple. If the system control board is not working normally, the paired network interface boards can still make the decision through heartbeat messages, and this mechanism is compatible with the prior art. Through the heartbeat mechanism the system control board can select the better of the two network interface boards as active, which contributes greatly to the stability and reliability of data transmission, and a fast switchover is possible if a fault occurs. The method therefore resolves the situation in which, after a fault, a switchover leaves both boards active or both standby; the mechanism recovers by itself promptly after a fault, achieving unattended operation, uninterrupted service and high network reliability.
It should be understood that those of ordinary skill in the art may make improvements or variations based on the above description, and all such improvements and variations shall fall within the scope of protection of the appended claims of the present invention.

Claims (10)

1. A method for realizing network interface board switching by a heartbeat mechanism, comprising the following steps:
A1, when a first network interface board starts, it is set by default as the standby network interface board and requests the system control board to send a configuration message, the configuration message comprising information on a second network interface board that forms an active/standby pair with the first network interface board, and a switching policy;
A2, the system control board sends the configuration message to the first network interface board;
A3, the first network interface board and the second network interface board each send heartbeat messages containing their own operating state to the system control board;
A4, the system control board makes a decision according to the switching policy and the operating states of the two network interface boards, and determines the active/standby relationship of the two network interface boards;
A5, the active network interface board and the standby network interface board report to each other by heartbeat their own operating state and their own communication state with the system control board;
A6, when the active network interface board judges, from its own communication state with the system control board and the standby network interface board's communication state with the system control board, that it has a communication failure with the system control board, the active network interface board and the standby network interface board make a decision between themselves according to the switching policy and redetermine the active/standby relationship of the two network interface boards.
2. The method according to claim 1, characterized in that the operating state comprises at least one of: network cable connection state, gateway state, port traffic, CPU utilization, memory utilization, communication state with the system control board, current active/standby relationship, and task running state.
3. The method according to claim 1, characterized in that it further comprises step A7: when the standby network interface board judges, from its communication state with the system control board and with the active network interface board, that it has a communication failure with the system control board and that the active network interface board has failed, it sets itself as the active network interface board.
4. The method according to claim 1, characterized in that step A0 precedes step A1: a user performs data configuration, the data configuration comprising at least one of: the active/standby relationship of each network interface board, the heartbeat interval, and the switching policy.
5. The method according to claim 1, characterized in that step A4 further comprises: the system control board periodically reports the active/standby information of each network interface board to the operation and maintenance platform.
6. The method according to claim 1, characterized in that step A5 further comprises step A51: each network interface board, on judging that the operating state of the active network interface board and the standby network interface board has changed, or at least that the change in that operating state satisfies a condition of the switching policy, sends an alarm message to the operation and maintenance platform.
7. The method according to claim 6, characterized in that step A5 further comprises step A52: each network interface board writes the changed operating state of the active network interface board and the standby network interface board to its log file.
8. The method according to any one of claims 1 to 7, characterized in that the switching policy decides whether to perform an active/standby switchover of the network interface boards according to the operating state of a network interface board and a comparison of the operating state of the standby network interface board with that of the active network interface board.
9. The method according to claim 8, characterized in that comparing the operating state of the standby network interface board with that of the active network interface board comprises the following steps:
B1, a scoring scheme is set for each item of the operating state of a network interface board;
B2, the total scores of all operating-state items of the standby network interface board and of the active network interface board are calculated respectively.
10. The method according to claim 8, characterized in that the conditions of the switching policy comprise at least one of: CPU utilization exceeding a preset CPU utilization value, memory utilization exceeding a preset memory utilization value, port traffic exceeding a preset port traffic value, network cable disconnected, gateway unreachable, communication with the system control board interrupted, and a task hung.
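The scoring of claim 9 and the conditions of claim 10 can be combined in a short sketch. Everything concrete here is an assumption chosen for illustration: the preset values in `LIMITS`, the two points awarded per healthy flag, the headroom formula, and the rule that a switchover needs both a triggered condition and a better-scoring standby board. The claims themselves do not fix any of these.

```python
from typing import Dict, List, Union

State = Dict[str, Union[float, bool]]

# Hypothetical preset values for the numeric conditions of the policy.
LIMITS = {"cpu_percent": 90.0, "mem_percent": 90.0, "port_mbps": 900.0}
# Boolean items: cable connected, gateway reachable, SC reachable, tasks running.
FLAGS = ("cable_up", "gateway_up", "sc_reachable", "tasks_ok")


def triggered_conditions(state: State) -> List[str]:
    """List the policy conditions that this board's state currently meets."""
    reasons = [k for k, limit in LIMITS.items() if state[k] > limit]
    reasons += [flag for flag in FLAGS if not state[flag]]
    return reasons


def score(state: State) -> float:
    """Total score over all items: remaining headroom plus points per healthy flag."""
    headroom = sum(max(0.0, 1.0 - state[k] / limit) for k, limit in LIMITS.items())
    return headroom + sum(2.0 for flag in FLAGS if state[flag])


def should_switch(active: State, standby: State) -> bool:
    """Switch only if the active board meets a condition and the standby scores higher."""
    return bool(triggered_conditions(active)) and score(standby) > score(active)
```

Requiring the standby board to score strictly higher avoids switching to a board that is in no better condition than the one it replaces.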
CN2007100771822A 2007-09-17 2007-09-17 A method for realizing network interface board switching based heartbeat mechanism Expired - Fee Related CN101150430B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2007100771822A CN101150430B (en) 2007-09-17 2007-09-17 A method for realizing network interface board switching based heartbeat mechanism


Publications (2)

Publication Number Publication Date
CN101150430A CN101150430A (en) 2008-03-26
CN101150430B true CN101150430B (en) 2010-09-01

Family

ID=39250782

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2007100771822A Expired - Fee Related CN101150430B (en) 2007-09-17 2007-09-17 A method for realizing network interface board switching based heartbeat mechanism

Country Status (1)

Country Link
CN (1) CN101150430B (en)

Also Published As

Publication number Publication date
CN101150430A (en) 2008-03-26

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20100901

Termination date: 20170917
