The object of the present invention is to provide an active/standby switchover method, and a device implementing it, for an ATM switch. With the method of the invention, when any board fails, operation can be switched to the standby board quickly, safely and reliably, so that the switch keeps operating normally and the fault information is reported in time through the network management system. While the switch performs all of its functions, the hardware structure remains simple and well conceived, which reduces system cost and gives high operating reliability.
Another object of the present invention is to provide an active/standby switchover method, and a device implementing it, that can be used in various telephone exchanges.
The active/standby switchover method of the present invention is achieved as follows. The master control board and the network board of the switch are each given 1+1 redundant backup. The main and standby master control boards keep in real-time communication, and the standby board holds a copy of the same data as the main board, which realizes hot standby. Switchover of the master control board uses mutual monitoring and control between the two boards: each board sends its own "heartbeat" signal and at the same time monitors the heartbeat of the other. When one side fails, the other side decides, according to its own state (main or standby), whether to switch over, and reports to the network management system. When the main board finds that the heartbeat of the standby board has been absent for a certain period, it considers the standby board faulty and notifies the network management system to handle it. When the standby board finds that the heartbeat of the main board has been absent for a certain period, it considers the main board faulty, starts the active/standby switchover, promotes itself to main board, and disables the faulty former main board. The main and standby network boards run synchronously, but the data of the standby network board is normally not output; as soon as the main network board has a problem, the standby network board takes over, which realizes hot standby. The network boards are monitored and switched by the master control board: the master control board polls the status register of the network boards in real time, and as soon as it finds a problem with the main network board it switches over and reports to the network management system. After each operation, both the master control board and the network board check the result, to guard against loss of control caused by interference or other faults, and decide the next operation according to the result of the check.
The above "heartbeat" signal is a pulse signal, output at a moderate frequency, that is specially designed to reflect whether a circuit board is in its normal operating state.
The above "heartbeat" signal may be a pulse signal obtained by writing data, under software control, to a particular hardware register.
The above device for monitoring whether a specific "heartbeat" pulse occurs within a preset time interval resets the CPU if it does not receive the pulse sent by the CPU within that interval, so that the CPU does not remain in a fault state for a long time.
The active/standby switchover device of the present invention is achieved as follows. Two identical master control boards used for active/standby switchover are inserted in two adjacent slots of the backplane of the switch, and the control lines between the two boards are interconnected through the backplane. Each master control board is provided with a memory unit dedicated to transferring data between the main board and the standby board, namely a communication buffer (also called a "mailbox"), an active/standby status register (M/S Status Reg), and a device for monitoring whether a specific "heartbeat" pulse signal occurs within a preset time interval (commonly called a watchdog). Three pairs of control lines are also provided for the switchover: a status signal OE indicating whether the board itself is on or off, a control signal (Disable OE) for turning off the other board (the main board), and a control signal for resetting the other master control board.
The above communication buffer (the "mailbox") may be implemented with a dual-port static random access memory (SRAM). In operation only the mailbox on the standby master control board is used; the main master control board communicates with that mailbox by remote access, and by regularly checking the mailbox interrupt signal a board can judge whether the other master control board is working normally.
The above mailbox communication interrupt signal, the "heartbeat" signal by which the main and standby master control boards monitor each other, and the "feed the watchdog" signal of the master control board's reset circuit are combined into a single signal, which simplifies the design and improves the reliability of system operation.
The above active/standby status register (M/S Status Reg) stores the slot position of the board, the OE status signal indicating whether this board is on or off, and the OE status signal indicating whether the other master control board is on or off; at system start-up the main/standby competition is resolved according to these status signals.
The characteristics of the present invention are as follows. Hot standby is applied to both the master control board and the network board of the ATM switch, which improves the stability of switchover and keeps the switch running stably over the long term. To ensure correct operation, the signal that turns off the main board is not only filtered but also subjected to strict conditions, which improves the interference immunity of the active/standby system. After each operation the result is checked and the next operation is decided according to that result, which guards against loss of control caused by interference or other faults and greatly improves the reliability of switchover. Switchover of the network board also takes device delay times into account, so that signal conflicts between devices are eliminated and system stability is improved. With the scheme of the invention, the system switches over safely and reliably under the various combinations of hot-plug states and the various hang states, and does not lock up in an erroneous state. In addition, the invention combines the communication interrupt signal, the "heartbeat" signal and the "feed the watchdog" signal of the board reset circuit, which simplifies the system design and reduces system cost while all functions are still performed. In software, the main/standby competition, the heartbeat mechanism and the data backup are all newly designed.
The method and device of the present invention are described in detail below with reference to the drawings and embodiments:
Referring to Figure 1: two identical master control boards used for active/standby switchover, a main master control board 1 and a standby master control board 2, are inserted in two adjacent slots of the backplane of the switch, and the main lines between the two master control boards 1, 2 are interconnected through the backplane. These include three pairs of control lines used for the switchover: a status signal OE (output enable) indicating whether the board itself is on or off, a control signal Disable OE (disable output enable) for turning off the other board (the main board), and a control signal for resetting the other master control board. Each master control board 1, 2 is provided with a memory unit dedicated to transferring data between the main board and the standby board, namely a communication buffer 11, 21 (also called a "mailbox", as shown in the figure), an active/standby status register 12, 22 (M/S Status Reg), and a device for monitoring whether a specific "heartbeat" pulse signal occurs within a preset time interval (commonly called a watchdog, not shown in the figure). The communication buffer 11, 21 (the mailbox) may be implemented with a dual-port static random access memory (SRAM). In operation only the mailbox on the standby master control board is used, and the main master control board communicates with it by remote access. When the main master control board has delivered data to the remote mailbox, it notifies the standby master control board by an interrupt, and the standby master control board then reads the data from the mailbox, so the mailbox does not need to be polled periodically. At the same time, the mailbox interrupt signal can be reused as the heartbeat signal of the master control board: by regularly checking the mailbox interrupt signal, a board can judge whether the other master control board is working normally. The active/standby status register (M/S Status Reg) stores information such as the slot position of the board, the OE state of this board and the OE state of the other master control board; at system start-up the main/standby competition is decided according to these signals.
Mutual control between the two master control boards is exercised through the Disable OE signal. Only the standby board can turn off the OE status signal of the main board; likewise, the standby board can turn itself on and enter service only on condition that the OE signal of the main board has been turned off. The invention controls the OE of the two master control boards through interlock logic implemented in an FPGA (field-programmable gate array) and filters the input signals, so that false operation caused by accidental interference is reduced to a minimum. Data communication and heartbeat monitoring between the main and standby boards go through the mailbox on the standby board; the mailbox on the main board is not used.
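The interlock rule described above can be illustrated with a minimal software sketch. This is not the FPGA logic of the invention: the function names `next_oe` and `filtered`, the parameter names, and the three-sample glitch filter are all assumptions chosen for illustration.

```python
def filtered(samples, n=3):
    """Glitch filter (assumed form): accept a control input only if the
    last n samples are all asserted."""
    return len(samples) >= n and all(samples[-n:])


def next_oe(my_oe, peer_oe, peer_disables_me, request_on):
    """Return this board's next OE state under the interlock rule."""
    if peer_disables_me:
        return False              # the other board has asserted Disable OE
    if my_oe:
        return True               # already on and not disabled: stay on
    return request_on and not peer_oe   # may turn on only if the peer is off
```

With this rule a board that asks to turn on while the other board's OE is still asserted stays off, so at most one board is on at a time.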
Because the "heartbeat" signal is a pulse that occurs within a preset time, and the mailbox interrupt signal under normal conditions also occurs within a certain time, the presence or absence of the interrupt reflects well whether the master control board is in its normal operating state, so it can serve as the heartbeat signal. Letting the mailbox communication interrupt signal double as the heartbeat signal simplifies both the technical implementation of switchover and the software. The invention combines the heartbeat signal, the mailbox communication interrupt signal and the feed-the-watchdog signal, which both simplifies the implementation and improves the reliability of switchover. When the main master control board 1 has delivered data to the remote mailbox, it notifies the standby master control board 2 by an interrupt, and the standby master control board 2 then reads the data from the mailbox without polling it periodically. Because the mailbox interrupt signal is also reused as the heartbeat signal, the standby board can judge whether the main master control board 1 is working normally by regularly checking the mailbox interrupt.
The operating principle of the switchover is summarized as follows. When the ATM switch is working normally, the main and standby master control boards 1, 2 each send their own heartbeat signal and monitor the heartbeat signal of the other. When the main board 1 finds that the heartbeat of the standby board 2 has been absent for a certain period, it considers the standby board 2 faulty and promptly notifies the network management system to handle it. When the standby board 2 finds that the heartbeat of the main board 1 has been absent for a certain period, it considers the main board 1 faulty, starts the active/standby switchover, promotes itself to main board, and disables the faulty former main board 1. Three important concepts need to be introduced here:
1) Heartbeat signal: a pulse signal, output at a moderate frequency, specially designed to reflect whether a circuit board is in its normal operating state. The invention uses a pulse obtained by writing data, under software control, to a particular hardware register. The software control is described later.
2) Mailbox: the communication buffer, a memory unit set up for transferring data between the main board and the standby board. The invention uses a dual-port static random access memory (SRAM).
3) Watchdog: a device that monitors whether a specific pulse occurs within a preset time interval, generally used in CPU hardware designs. If the watchdog does not receive the pulse sent by the CPU within the preset interval, it resets the CPU, which prevents the CPU from remaining in a fault state for a long time. Watchdog technology is used in many products and, after long-term validation, has proved simple and dependable.
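The behaviour of the monitoring device in item 3) can be modelled in a few lines. The following is only a software model under assumed names (`Watchdog`, `kick`, `tick`) and an assumed timeout in seconds; the invention uses a hardware device.

```python
class Watchdog:
    """Software model of a device that resets the CPU when no pulse
    arrives within a preset interval. All names are illustrative."""

    def __init__(self, timeout, now=0.0):
        self.timeout = timeout
        self.last_kick = now
        self.resets = 0

    def kick(self, now):
        """Record a pulse from the CPU (the heartbeat / feed signal)."""
        self.last_kick = now

    def tick(self, now):
        """Return True if the interval expired and a reset was issued."""
        if now - self.last_kick > self.timeout:
            self.resets += 1
            self.last_kick = now   # restart the window after the reset
            return True
        return False
```

Because the invention combines the heartbeat, the mailbox interrupt and the feed signal into one signal, a single call such as `kick` would stand for all three in this model.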
The hardware of the invention provides the communication channel and environment; the actual data backup and switchover operations are carried out by software.
The software flow and implementation of the invention are shown in Figures 2 and 3.
Referring first to Figure 2: in software, the whole system first goes through the main/standby competition at start-up and then enters the stage in which the boards monitor each other's heartbeat. In the monitoring stage a problem on either board is detected, and if it is the main board that fails, the system enters the active/standby switchover process. Once the fault on the failed board has been cleared and the board has been reloaded, the system returns to the mutual monitoring stage.
Referring to the flow chart of the master control board power-on start-up process in Figure 3: at initial power-up the board OE is not turned on and system interrupts are disabled. After the main/standby competition is finished, if this board is the main board, the board OE, system interrupts and so on are turned on during hardware initialization. Whether application data is loaded is decided according to the main/standby state of the board. If this board is the standby board, it sends a "standby online" notification frame to the main board at the end of its start-up, and the main board performs a resynchronization on receiving this frame. The main purpose of resynchronization is to back up the complete current configuration of the main board to the standby board that has just come online. To do so, the backup task on the main board sets the bit table of every backup data set afresh according to the current configuration and backs everything up to the standby board in one pass.
The purpose of data backup is to keep the configured data on the two master control boards highly consistent, so that when a switchover is needed and the standby board is promoted to main board, the system remains consistent and continuous and the handover is smooth. Data backup is carried out by a backup task. The backup task has a timer that periodically (once every few seconds) checks the bit table of each backup control table. If any bits are set, it calls the data-unit get function of that backup control table to collect the data to be backed up, packs the data together with the backup control table ID and related fields into a data frame, writes the frame to the remote mailbox on the standby board, and then raises a mailbox interrupt to notify the standby board. On receiving the mailbox interrupt, the standby board reads the data from the mailbox and, according to the control table ID, bit table index, operation and so on that were transmitted with it, calls the data-unit set function to restore the data to the designated location.
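The backup cycle described above can be sketched as follows. This is a simplified illustration only: the class `BackupTable`, the frame layout (a dictionary with `table_id`, `index`, `op`, `data`) and the use of a Python set in place of the bit table are assumptions, and the mailbox and interrupt are replaced by a direct function call.

```python
from dataclasses import dataclass, field


@dataclass
class BackupTable:
    table_id: int
    units: list
    dirty: set = field(default_factory=set)   # stands in for the bit table

    def update(self, index, value):
        """Change one data unit and mark its bit."""
        self.units[index] = value
        self.dirty.add(index)


def collect_frames(table):
    """Main board side: build one frame per marked unit, then clear the marks."""
    frames = [
        {"table_id": table.table_id, "index": i, "op": "set", "data": table.units[i]}
        for i in sorted(table.dirty)
    ]
    table.dirty.clear()
    return frames


def apply_frame(tables, frame):
    """Standby board side: restore one unit at the location named in the frame."""
    tables[frame["table_id"]].units[frame["index"]] = frame["data"]
```

Only units whose bits are marked are transferred, so a timer pass with nothing changed produces no frames.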
To ensure the reliability of the data that a master control board reads from the mailbox, the sender may write a cyclic redundancy check (CRC) code of the transmitted data into the mailbox together with the data. After receiving the data, the receiver performs the CRC check, packs the result into a verification frame and returns it to the sender. When the sender receives this reply frame, it retransmits the data frame if the check result indicates an error; if the result is correct, it proceeds to the next data collection.
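A minimal sketch of the check-and-retransmit exchange follows. The text does not say which CRC is used, so CRC-32 from Python's `zlib` is an assumption, as are the function names, the 4-byte trailer layout, and the retry limit `max_tries`; the verification frame is reduced to a boolean returned by the `mailbox` callable.

```python
import zlib


def make_frame(payload: bytes) -> bytes:
    """Append a CRC-32 of the payload (assumed CRC variant and layout)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def verify_frame(frame: bytes) -> bool:
    """Receiver side: recompute the CRC and compare it with the trailer."""
    payload, crc = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == crc


def send_with_retry(payload, mailbox, max_tries=3):
    """Sender side: mailbox(frame) delivers the frame and returns the
    receiver's verification result. Returns the number of attempts used,
    or None if every attempt failed."""
    frame = make_frame(payload)
    for attempt in range(1, max_tries + 1):
        if mailbox(frame):
            return attempt
    return None
```

The sender moves on to the next data collection only after a positive verification result, which matches the order of steps in the paragraph above.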
The data backup scheme is briefly as follows. The higher-layer backup software combines an incremental bit-table mode with a function mode. Most processes back up their data in the incremental bit-table mode: when an item of a process's important data (regarded as an array) changes, its index is recorded in the bit table, and the bit table then determines which data are backed up. Some processes need to back up only some fields of a larger data structure, and these use the function mode: the process supplies a function that maps the data structure to a smaller structure, which saves space and speeds up the backup.
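The function mode can be pictured with a small example. Everything here is hypothetical: the record `PortRecord`, its fields, and the pair of mapping functions are invented for illustration and do not come from the invention's software.

```python
from dataclasses import dataclass


@dataclass
class PortRecord:            # hypothetical larger data structure
    port_id: int
    admin_state: str
    rx_cells: int            # runtime counter, assumed not to need backup
    tx_cells: int            # runtime counter, assumed not to need backup


def to_backup(rec):
    """Map the full record to the smaller structure that is backed up."""
    return (rec.port_id, rec.admin_state)


def from_backup(small):
    """Rebuild a record on the standby board from the smaller structure."""
    port_id, admin_state = small
    return PortRecord(port_id, admin_state, rx_cells=0, tx_cells=0)
```

In this sketch only two of the four fields cross the mailbox, which is the space and speed saving that the function mode is meant to provide.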
The lower layer communicates through the mailbox. The mailbox approach greatly improves the reliability and speed of transmission and simplifies processing. A mailbox is provided on both the main board and the standby board; in normal operation the main board places the backup data in the mailbox of the standby board, and the standby board reads the data from the mailbox and processes it.
A key point in active/standby switchover is how the standby master control board detects a fault of the main board in time. The mechanism adopted by the invention is heartbeat monitoring. While the working master control board is running, a heartbeat is generated whenever the system delivers a message to any task. The standby master control board on the other side checks the heartbeat count at regular intervals; if the count has not increased, it considers that the main board has failed and starts the switchover, promoting the standby board to main board. To prevent the system from staying in the IDLE state for a long time and producing no heartbeat, the backup task starts a dedicated timer whose periodic expiry causes the system to deliver a message to the backup task, which keeps the heartbeat uninterrupted.
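The counting check on the standby side can be sketched as below. The class and method names are assumptions, and the periodic timer that would call `check` is left out.

```python
class HeartbeatMonitor:
    """Standby-side view of the main board's heartbeat (illustrative names)."""

    def __init__(self):
        self.count = 0        # incremented on every heartbeat received
        self._last_seen = 0   # value of count at the previous check

    def beat(self):
        self.count += 1

    def check(self):
        """Called at regular intervals. Returns True if the count has not
        increased since the previous check, i.e. the main board looks failed."""
        failed = self.count == self._last_seen
        self._last_seen = self.count
        return failed
```

The dedicated timer mentioned above matters for this check: without it, an idle but healthy main board would produce no heartbeats and `check` would report a failure.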
The main master control board sends the heartbeat signal to the standby master control board through the interrupt of the backup mailbox; writing a heartbeat is in fact writing an interrupt to the backup mailbox. To distinguish a heartbeat interrupt from a backup-data interrupt, one word (WORD) in the mailbox data area is reserved to hold an interrupt flag that identifies the interrupt as a heartbeat interrupt or a backup interrupt. On receiving a mailbox interrupt, the standby master control board first determines the interrupt type from the flag: for a backup interrupt it restores the backup data and also increments the heartbeat count by 1; for a heartbeat interrupt it simply increments the heartbeat count by 1.
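The dispatch on the interrupt flag might look like the following sketch. The flag values `HEARTBEAT` and `BACKUP`, the dictionary layout of the mailbox and the `state` record are assumptions made for illustration.

```python
HEARTBEAT, BACKUP = 1, 2     # hypothetical values of the interrupt flag word


def on_mailbox_interrupt(mailbox, state):
    """Standby-side handler: read the flag word, then act on the type."""
    flag = mailbox["flag"]
    if flag == BACKUP:
        state["restored"].append(mailbox["data"])   # restore the backup data
        state["heartbeats"] += 1                    # a backup also counts as a heartbeat
    elif flag == HEARTBEAT:
        state["heartbeats"] += 1
```

Counting backup interrupts as heartbeats means that a main board busy sending backup data never appears silent to the standby board.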
As soon as the standby master control board finds that the main master control board has failed, it begins the active/standby switchover, whose flow is shown in Figure 4. Referring to Figure 4: after the switchover begins, it is first determined whether the standby board is able to switch over; if it is not, the system is restarted. If it is, the standby board turns off the faulty main board and turns on its own output. It then resets the former main board and sets its own state to main board. Whether the system runs normally and stably after the switchover depends greatly on how well each part is smoothed during the switchover. To ensure that the allocation of resources matches reality, the approach of the invention is first to clear the dynamic resources and then to reallocate the data according to the configuration. Smoothing of each part's data is generally the responsibility of the respective module; the main smoothing functions are slot-information smoothing, hardware smoothing, resource smoothing and system smoothing.
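The order of steps in the Figure 4 flow can be written out as a short sketch. The board records are plain dictionaries with assumed keys (`oe`, `reset`, `role`), the condition for being able to switch is passed in as `can_switch` because the text does not define it, and the smoothing functions are omitted.

```python
def switch_over(standby, main, can_switch):
    """Follow the switchover steps in order; returns 'restart' or 'switched'."""
    if not can_switch:
        return "restart"          # the standby board cannot switch: restart the system
    main["oe"] = False            # turn off the faulty main board (Disable OE)
    standby["oe"] = True          # turn on the standby board's own output
    main["reset"] = True          # reset the former main board
    standby["role"] = "main"      # the standby board becomes the main board
    return "switched"
```

The main board's output is turned off before the standby board's output is turned on, which keeps the sketch consistent with the rule that at most one board is on at a time.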
To prevent accidental interference from coupling into the switchover control signals and causing the main board to be turned off by mistake, the invention not only filters the control signal by which the standby board turns off the main board but also subjects it to strict conditions; only a signal that satisfies these conditions takes effect. In addition, interlock logic controls the main and standby boards so that at most one board can be on at any moment. In software, the states of both boards are checked after each operation, which avoids the illegal state of both boards being on at once that could otherwise arise during the main/standby competition and greatly improves the interference immunity of the active/standby system.
Backup of the network board is simpler than that of the master control board. Under the control of the master control board, only one network board is in service at any time and the other is on standby. The master control board continually polls the operating status register of the network boards; when it detects an incorrect state on the main network board it switches to the standby network board and raises an alarm, and when the incorrect state is on the standby network board it only raises an alarm.
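One polling pass over the network boards can be sketched as follows. The board labels "A" and "B", the boolean status values and the alarm strings are assumptions; the case in which both boards are faulty is not covered by the text, and the sketch simply applies the first rule in that case.

```python
def poll(status, active):
    """status maps board label to True (normal) or False (incorrect state).
    Returns the board in service after this pass and any alarms raised."""
    standby = "B" if active == "A" else "A"
    if not status[active]:
        # fault on the board in service: switch to the standby board and alarm
        return standby, [f"alarm: board {active} fault, switched to {standby}"]
    if not status[standby]:
        # fault on the standby board: alarm only
        return active, [f"alarm: standby board {standby} fault"]
    return active, []
```

A real implementation would read the status registers over the backplane and forward the alarms to the network management system; here both are reduced to plain values.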
The invention has been tested on an ATM switch developed by the applicant and achieved its intended purpose: under the various combinations of hot-plug states and the various hang states, the machine switched over safely and reliably, did not lock up in an erroneous state, and kept the switch operating normally.