WO2017215441A1 - Self-recovery method and apparatus for board configuration in distributed system - Google Patents

Self-recovery method and apparatus for board configuration in distributed system

Publication number
WO2017215441A1
Authority
WO
WIPO (PCT)
Prior art keywords
board
main control
state information
rack
previous
Prior art date
Application number
PCT/CN2017/086396
Other languages
French (fr)
Chinese (zh)
Inventor
程寒杰
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2017215441A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0659 Management of faults, events, alarms or notifications using network fault recovery by isolating or reconfiguring faulty entities
    • H04L41/0661 Management of faults, events, alarms or notifications using network fault recovery by isolating or reconfiguring faulty entities by reconfiguring faulty entities

Definitions

  • the present disclosure relates to the field of communications technologies, and in particular, to a method and an apparatus for self-recovery of a single board in a distributed system.
  • a packet transport network (PTN) device in a communication system is usually a distributed system that includes a main control board, a backplane, and a plurality of boards.
  • the boards include virtual boards and physical boards.
  • a virtual board does not occupy a physical slot; its agent process runs on the central processing unit (CPU) of the main control board.
  • a physical board is connected to the main control board through the backplane and runs its programs independently. In a specific application, the management plane and control plane programs of the main control board send configuration packets to configure the parameters and services of each board.
  • a board may be unplugged and inserted into another slot or even another device, and a board usually does not persist the configuration information it received from the main control board. The configuration information sent by the main control board is therefore lost once the board powers off.
  • when a board loses its configuration because it was reset for hardware or software reasons, or because its application restarted, the main control board must detect the fault and re-deliver the original configuration information of the board, so that the board's services return to normal.
  • analysis shows that current communication devices mainly detect the running state of a board in the following two ways:
  • (1) a dedicated hardware circuit is designed on the backplane to detect the running state of the board;
  • (2) a handshake mechanism is used: a maintenance program on the main control board periodically sends a handshake packet to the board, and if the board program does not return a response packet within a preset time or number of periods, the board is judged to be faulty.
  • with both methods, the main control board has the following shortcomings in sensing that a board has been reset for hardware or software reasons or that its application has restarted:
  • a dedicated hardware circuit lacks flexibility. When an application process on the board restarts abnormally, the fixed hardware circuit may fail to detect the restart, yet the configuration information of the board has already been lost, so the service cannot return to normal.
  • the handshake mechanism usually requires a timeout condition: it must wait several periods before judging that the board is faulty, to avoid misjudgment caused by short-term packet congestion. If the board fails and recovers within the timeout period, the system cannot detect that the board was restarted or reset. As a result, the board configuration is lost and the service cannot be restored.
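The blind spot of a timeout-based handshake can be illustrated with a small simulation. This is only a sketch under assumptions: the function name `handshake_detects_fault`, the one-probe-per-period model, and the `missed_limit` threshold are hypothetical and are not taken from the disclosure.

```python
def handshake_detects_fault(responses, missed_limit=3):
    """Return True if a handshake monitor would declare the board faulty.

    responses    -- one boolean per handshake period: True if the board
                    answered the probe in that period (hypothetical model)
    missed_limit -- consecutive missed replies required before a fault is
                    declared (the timeout condition; the value is assumed)
    """
    missed = 0
    for answered in responses:
        if answered:
            missed = 0              # any reply clears the timeout counter
        else:
            missed += 1
            if missed >= missed_limit:
                return True         # timed out: fault detected
    return False


# The board restarts, misses two probes, then answers again.
quick_restart = [True, True, False, False, True, True]
# The board stays silent for three periods in a row.
long_outage = [True, False, False, False, True]

print(handshake_detects_fault(quick_restart))
print(handshake_detects_fault(long_outage))
```

In the first trace the board has restarted and lost its configuration, but the monitor never reaches its timeout, which is the case the handshake mechanism cannot see.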
  • the technical problem to be solved by the technical solution provided by the embodiments of the present disclosure is how to self-recover the configuration information of a board when that information is lost because the board was reset by a hardware or software fault or because an application process restarted.
  • the main control board sends the configuration information of each board to the corresponding board, and caches the configuration information of the corresponding board.
  • the main control board determines, according to the previous state information and the current state information of the board, whether the board has previously failed.
  • if it is determined that the board has previously failed, the previously cached configuration information is sent to the board so that the board can restore its services.
  • before the main control board determines, according to the previous state information and the current state information of the board, whether the board has previously failed, the method further includes:
  • the main control board receives the rack image packet from the board, parses the rack image packet to obtain the current state information of the board, and retrieves the previous state information obtained and saved by parsing the previous rack image packet of the board.
  • the step of determining, by the main control board, whether the board has previously failed according to the previous state information and the current state information of the board includes:
  • the main control board uses the rack image serial number in the current state information to update the rack image serial number in the previous state information saved by the main control board.
  • the method further includes: if the type of the board has changed, the main control board does not send the previously cached configuration information to the board.
  • a storage medium stores a program for implementing a single board configuration self-recovery method in the distributed system described above.
  • the packet forwarding module of the main control board is configured to send configuration information of each board to the corresponding board, and cache configuration information of the corresponding board.
  • the main control board control module is configured to determine, according to the previous state information and the current state information of the board, whether the board has previously failed.
  • the packet forwarding module of the main control board sends the previously cached configuration information to the board so that the board can restore its services.
  • the main control board control module is further configured to parse the rack image packet received from the board to obtain the current state information of the board, and to retrieve the previous state information obtained and saved by parsing the previous rack image packet of the board.
  • the main control board control module determines, according to the board type in the current state information, whether the board type has changed; if the board type has not changed, it compares the rack image serial number in the current state information with the rack image serial number in the previous state information; if the rack image serial number in the current state information is smaller than that in the previous state information and is not equal to zero, it determines that the board has previously failed.
  • after determining, according to the previous state information and the current state information of the board, whether the board has previously failed, the main control board control module uses the rack image serial number in the current state information to update the rack image serial number in the previous state information saved by the main control board.
  • the main control board control module does not send the previously cached configuration information to the board when the board type has changed.
  • a storage medium comprising a stored program, wherein the program is executed to perform the method of any of the above.
  • a processor for running a program wherein the program is executed to perform the method of any of the above.
  • the main control board can accurately detect the loss of a board's configuration and then re-deliver the pre-cached configuration information to the board, ensuring that the board's services return to normal.
  • FIG. 1 is a block diagram of a self-recovery method for a single board configuration in a distributed system according to an embodiment of the present disclosure
  • FIG. 2 is a block diagram of a self-recovering device for a single board configuration in a distributed system according to an embodiment of the present disclosure
  • FIG. 3 is a system architecture diagram of a distributed system according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic diagram of a state machine with which a main control board program determines the in-position state of a board according to an embodiment of the present disclosure
  • FIG. 5 is a working flowchart of a packet forwarding module of a main control board according to an embodiment of the present disclosure
  • FIG. 6 is a schematic diagram of a self-recovery process of the S-port configuration packets of a board according to an embodiment of the present disclosure.
  • FIG. 1 is a block diagram of a method for self-recovery of a board configuration in a distributed system according to an embodiment of the present disclosure. As shown in FIG. 1 , the steps include:
  • Step S101 The main control board sends the configuration information of each board to the corresponding board, and caches the configuration information of the corresponding board.
  • the configuration information of the board includes parameter configuration information and service configuration information.
  • Step S102 The main control board determines, according to the previous state information and the current state information of the board, whether the board has previously failed.
  • the main control board receives the rack image packet from the board, parses the rack image packet to obtain the current state information of the board, and retrieves the previous state information obtained and saved by parsing the previous rack image packet of the board.
  • the main control board determines whether the type of the board has changed according to the board type in the current state information. If the type has changed, the previously cached configuration information is not sent to the board. If the type has not changed, the current state information of the board is compared with the previous state information to determine whether the board has previously failed.
  • the current state information is then used to update the previous state information saved by the main control board, for example by replacing the rack image serial number in the saved previous state information with the rack image serial number in the current state information, so that the main control board always holds the latest state information.
  • Step S103 If it is determined that the board has previously failed, the previously cached configuration information is sent to the board so that the board can restore its services.
  • the device has multiple slots, and the board to be inserted in each slot is pre-configured.
  • the main control board saves, for each slot, the corresponding board type and the configuration information of the corresponding board.
  • the rack image packet is parsed to obtain the board type and the rack image serial number.
  • the main control board matches the reported board type against the board type saved for the slot. If they are inconsistent, the board type has changed, and configuration information is not sent until the board inserted in the slot is restored to the specified board.
  • when the main control board determines that the board inserted in the slot has changed from the board-type-inconsistent state to the normal running state, it sends the previously cached configuration information to the board.
  • the reported board type is matched against the board type saved for the slot. If they match, the board inserted in the slot has not changed, and the rack image serial number in the state information is used to determine whether the board has previously failed.
  • for example, suppose board 1 is to be inserted in slot 1. The main control board saves the configuration information of board 1, periodically receives the rack image packets of board 1 inserted in slot 1, and saves the latest rack image serial number.
  • 1. When the main control board determines, according to the board type in the currently received rack image packet, that the board inserted in slot 1 is not board 1, it determines that the board type is inconsistent. When it later determines, according to the board type in a subsequent rack image packet, that board 1 has been reinserted in slot 1, it sends the previously cached configuration information of board 1 to board 1.
  • 2. When the main control board determines that the board type in the currently received rack image packet matches the board type saved for the slot, it compares the rack image serial number in the currently received rack image packet with the latest saved rack image serial number and determines from the comparison whether board 1 has previously failed. If so, the previously cached configuration information of board 1 is sent to board 1.
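The slot 1 example can be traced with a short sketch. The names `SlotRecord` and `on_rack_image`, the field layout, and the sample values are assumptions made for illustration; whether the saved serial number is refreshed while a wrong board is inserted is not stated in the text, and this sketch leaves it untouched in that case.

```python
from dataclasses import dataclass, field


@dataclass
class SlotRecord:
    """What the main control board keeps per slot (field names are assumed)."""
    expected_type: str                       # board type pre-configured for the slot
    cached_config: list = field(default_factory=list)
    last_seq: int = 0                        # latest saved rack image serial number
    type_mismatch: bool = False              # True while a wrong board is inserted


def on_rack_image(slot, reported_type, seq):
    """Process one rack image packet; return the configuration to re-deliver."""
    if reported_type != slot.expected_type:
        # Case 1: a different board sits in the slot, so nothing is sent.
        slot.type_mismatch = True
        return []
    if slot.type_mismatch:
        # Case 1, continued: the specified board is back in the slot.
        slot.type_mismatch = False
        resend = list(slot.cached_config)
    elif seq < slot.last_seq and seq != 0:
        # Case 2: the serial number went backwards, so the board restarted.
        resend = list(slot.cached_config)
    else:
        resend = []
    slot.last_seq = seq
    return resend


slot1 = SlotRecord(expected_type="board-1", cached_config=["cfg-a", "cfg-b"], last_seq=7)
reports = [("board-1", 8), ("board-2", 1), ("board-1", 1), ("board-1", 2), ("board-1", 1)]
for board_type, seq in reports:
    print(board_type, seq, on_rack_image(slot1, board_type, seq))
```

The third report exercises case 1 (the specified board returns to the slot) and the fifth exercises case 2 (the serial number drops from 2 to 1); both return the cached configuration.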
  • the embodiment of the present disclosure can use software to accurately detect running faults that cause the loss of the board configuration packets (that is, the configuration information), such as an overall reset of the board or a restart of the board application, and deliver the pre-cached configuration information to the board that has failed. That is, the embodiment of the present disclosure implements board fault detection and self-recovery of configuration information in a distributed system.
  • when a fault such as an overall reset of the board or a restart of the board application causes the configuration information to be lost, the configuration information of the board saved in advance can be sent to the board to ensure that the service can be restored.
  • step S101 to step S103 are included.
  • the storage medium may be a ROM/RAM, a magnetic disk, an optical disk, or the like.
  • FIG. 2 is a block diagram of a self-recovery device for board configuration in a distributed system according to an embodiment of the present disclosure. As shown in FIG. 2, the device includes a main control board control module 10 and a main control board packet forwarding module 20.
  • the main control board packet forwarding module 20 is configured to send the configuration information of each board to the corresponding board, and cache the configuration information of the corresponding board.
  • the main control board control module 10 is configured to determine, according to the previous state information and the current state information of the board, whether the board has previously failed. Specifically, the main control board control module 10 parses the rack image packet received from the board to obtain the current state information of the board, and retrieves the previous state information obtained and saved by parsing the previous rack image packet of the board. The main control board control module 10 determines whether the board type has changed according to the board type in the current state information. If the board type has changed, the pre-cached configuration information is not sent to the board. If the board type has not changed, the current state information of the board is compared with the previous state information to determine whether the board has previously failed.
  • if the rack image serial number in the current state information is smaller than that in the previous state information and is not equal to zero, it is determined that the board has previously failed. The rack image serial number in the current state information is then used to update the rack image serial number in the previous state information saved by the main control board.
  • the main control board packet forwarding module 20 sends the pre-cached configuration information to the board so that the board can restore its services.
  • the system includes a main control board, a backplane, and a plurality of boards (for example, board A, board B, and board C).
  • the main control board and the board are connected through the backplane.
  • the control plane and the management plane of the main control board send configuration packets to the corresponding boards through the packet forwarding module (equivalent to the main control board packet forwarding module 20 in FIG. 2) to configure the parameters and services of each board.
  • the board runs the related board application under the configuration of the main control board.
  • the control program of the main control board (which implements the function of the main control board control module 10 in FIG. 2) sends configuration packets to the designated board over the S interface through the packet forwarding module.
  • the forwarding module (that is, the packet forwarding module) caches the parameter configuration and service configuration packets sent to each board, using the board address as the key, and can merge configuration packets that have the same command code.
  • the packet forwarding module is a process or a task running on the main control board.
  • the packet forwarding module provides an interface for re-delivering the cached configuration packets of a specified board to that board.
  • the in-position running state of a board describes how the board is running and takes one of three values: normal operation, abnormal operation, and inconsistent board type. If the state changes from abnormal operation to normal operation, or from inconsistent board type to normal operation, the packet forwarding module must be notified to re-deliver the board's configuration packets.
  • FIG. 4 is a schematic diagram of a state machine in which the main control board program determines the in-position state of the board in the embodiment of the present disclosure.
  • the main control board sets the initial state of the board to normal operation and then waits for the rack image S-port packet newly reported by the board.
  • the S-port packet (that is, the rack image packet) is parsed to obtain the board type and the rack image serial number. If the board type has changed, the board is in the board-type-inconsistent state, and the main control board waits for the next reported rack image S-port packet. Otherwise, it determines whether the rack image serial number has increased. If it has, the board is running normally, and the main control board waits for the next packet. Otherwise, it determines whether the serial number has decreased and is not 0. If so, the board is running abnormally, and the main control board waits for the next packet. Otherwise, the serial number has decreased and is 0, the board is running normally, and the main control board waits for the next packet.
  • the board application enables the timer to periodically send the rack map S port message to the main control board.
  • the control program of the main control board parses out the board type and the rack image serial number and compares them with the last saved values. If the board type has changed, the board is in the board-type-inconsistent state. If the current value (that is, the current rack image serial number) is greater than the last saved value, the board is in the normal state. If the current value is smaller than the last saved value and equals the overflow value 0, the board is in the normal state. If the current value is smaller than the last saved value and is not equal to the overflow value 0, the board is in the abnormal state. After this determination, the rack image serial number saved on the main control board is updated to the value most recently reported by the board.
  • if the determination shows that the in-position state of the board has changed and the current state is normal, the board has been abnormally restarted and has lost its configuration information. The packet forwarding module is immediately notified to re-deliver the configuration of the board, so that the configuration information required by the board is self-recovered.
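The state machine of FIG. 4 and the re-delivery trigger can be sketched as follows. The names `BoardState`, `classify`, and `BoardMonitor` are hypothetical, and an unchanged serial number is not covered by the text, so this sketch simply treats it as normal.

```python
from enum import Enum


class BoardState(Enum):
    NORMAL = "normal operation"
    ABNORMAL = "abnormal operation"
    TYPE_INCONSISTENT = "inconsistent board type"


def classify(saved_type, saved_seq, cur_type, cur_seq):
    """Map one rack image report to an in-position running state."""
    if cur_type != saved_type:
        return BoardState.TYPE_INCONSISTENT
    if cur_seq > saved_seq:
        return BoardState.NORMAL
    if cur_seq < saved_seq and cur_seq != 0:
        return BoardState.ABNORMAL
    # Smaller and equal to the overflow value 0: the counter wrapped around.
    # (An unchanged value also lands here; that case is an assumption.)
    return BoardState.NORMAL


class BoardMonitor:
    """Tracks one board and asks for re-delivery when it returns to normal."""

    def __init__(self, board_type, first_seq, notify_resend):
        self.board_type = board_type
        self.saved_seq = first_seq
        self.state = BoardState.NORMAL       # initial state is normal operation
        self.notify_resend = notify_resend   # callable with no arguments

    def on_report(self, cur_type, cur_seq):
        new_state = classify(self.board_type, self.saved_seq, cur_type, cur_seq)
        if new_state is BoardState.NORMAL and self.state is not BoardState.NORMAL:
            self.notify_resend()             # state changed and is now normal
        self.state = new_state
        self.saved_seq = cur_seq             # always keep the latest reported value
        return new_state


resends = []
monitor = BoardMonitor("type-A", 1, lambda: resends.append("resend"))
for report in [("type-A", 2), ("type-A", 1), ("type-A", 2), ("type-B", 3), ("type-A", 4), ("type-A", 0)]:
    print(report, monitor.on_report(*report).value, len(resends))
```

In this trace the drop from 2 to 1 marks the board abnormal, and the next increasing report moves it back to normal and triggers the first re-delivery; the detour through the inconsistent-type state triggers the second.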
  • an S-port packet is a type of communication packet used to exchange messages inside the main control board, inside a board, and between the main control board and a board.
  • the embodiments of the present disclosure can solve the problem of configuration loss caused by, for example, an overall reset of the board or a restart of the board application.
  • FIG. 5 is a flowchart of a packet forwarding module of a main control board according to an embodiment of the present disclosure. As shown in FIG. 5, the steps include:
  • Step S201 The message forwarding module waits for the control program configuration message of the main control board.
  • Step S202 The packet forwarding module receives the S interface configuration packet (that is, the configuration packet).
  • Step S203 The packet forwarding module sends the S interface configuration packet to the specified board.
  • Step S204 Determine whether the packet is a registered packet. If so, step S205 is performed; otherwise step S201 is performed.
  • a configuration packet whose board, command code, and operator have been registered is a registered packet. For example, if the configuration packets of board A have been registered, then in step S204 a configuration packet sent to board A is determined to be a registered packet.
  • Step S205 Determine whether the packet is a modification operation. If so, step S206 is performed; otherwise step S207 is performed.
  • Step S206 Perform the merge processing on the S interface configuration packet.
  • the S interface configuration packet is merged with the previously saved S interface configuration packet, so that the parameter value in the new S interface configuration packet is 1000.
  • Step S207 Cache the S interface configuration message.
  • the processing of the packet forwarding module of the main control board is: a packet forwarding module is specially designed on the main control board, and the module is an intermediate hub for the S-port packet interaction between the main control board control program and each single board program.
  • the packet forwarding module can forward the S-port configuration packet.
  • an interface is also provided for registering a specified board, a specified command code, and a specified operator; only registered packets are cached. For packets whose operator is a modification, a packet merging function updates the cached configuration information to the latest value.
  • the packet forwarding module first forwards the S-port packet to the board, and then judges the configuration packet according to the command code and the operator in the S-port packet.
  • the packet forwarding module must provide an interface through which all S-port configuration packets cached for a board are re-delivered to that board.
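The forwarding flow of FIG. 5 (forward first, then cache registered packets, merging modifications) might look like the sketch below. The class `PacketForwarder`, the packet shape `(command_code, operator, params)`, the operator names `"add"` and `"modify"`, and the choice to re-deliver a merged entry under its original operator are all assumptions for illustration; the disclosure does not specify these details.

```python
class PacketForwarder:
    """Caches registered configuration packets per board address."""

    def __init__(self, send):
        self._send = send              # callable(board_addr, packet): assumed transport
        self._registered = set()       # (board_addr, command_code, operator)
        self._cache = {}               # board_addr -> {command_code: cached entry}

    def register(self, board_addr, command_code, operator):
        self._registered.add((board_addr, command_code, operator))

    def forward(self, board_addr, command_code, operator, params):
        # S203: the packet is always forwarded to the board first.
        self._send(board_addr, (command_code, operator, dict(params)))
        # S204: only registered packets are cached.
        if (board_addr, command_code, operator) not in self._registered:
            return
        per_board = self._cache.setdefault(board_addr, {})
        if operator == "modify" and command_code in per_board:
            # S206: merge so the cache holds the latest parameter values.
            per_board[command_code]["params"].update(params)
        else:
            # S207: cache the packet as it is.
            per_board[command_code] = {"operator": operator, "params": dict(params)}

    def resend(self, board_addr):
        """Re-deliver every cached configuration packet of one board."""
        for command_code, entry in self._cache.get(board_addr, {}).items():
            self._send(board_addr, (command_code, entry["operator"], dict(entry["params"])))


sent = []
fwd = PacketForwarder(lambda addr, pkt: sent.append((addr, pkt)))
fwd.register("board-A", "port", "add")
fwd.register("board-A", "port", "modify")
fwd.forward("board-A", "port", "add", {"rate": 100, "mtu": 1500})   # sample values
fwd.forward("board-A", "port", "modify", {"rate": 1000})            # merged into the cache
fwd.forward("board-A", "debug", "add", {"level": 1})                # not registered, not cached
sent.clear()
fwd.resend("board-A")
print(sent)
```

Keying the cache by board address and then by command code keeps one merged entry per command, so a re-delivery sends the latest values rather than replaying every historical packet.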
  • FIG. 6 is a schematic diagram of a self-recovery process of a S-port configuration packet of a board according to an embodiment of the present disclosure. As shown in FIG. 6, the steps include:
  • Step S301 The main control board control program waits for the rack diagram message of the board.
  • Step S302 Receive and parse the rack map message.
  • Step S303 Determine whether the rack image packet of the board is being received for the first time. If so, step S304 is performed; otherwise step S305 is performed.
  • the board address is used to query the board software entries. If no board software entry is found, the rack image packet of the board is being received for the first time.
  • Step S304 Create a board software entry of the board.
  • Step S305 Determine whether the type of the board has changed. If so, step S301 is performed; otherwise step S306 is performed.
  • Step S306 Determine whether the current rack image serial number is smaller than the previous value. If so, step S307 is performed; otherwise step S301 is performed.
  • Step S307 Determine whether the current rack image serial number is the overflow value 0. If so, step S301 is performed; otherwise step S308 is performed.
  • Step S308 Notifying the packet forwarding module of the main control board to re-issue the S interface configuration packet of the board.
  • the control program of the main control board is processed as follows:
  • the control program creates a software entry, keyed by the board address, to record the board information.
  • the initial value of the board type in the entry is the value carried when the board reports the rack image packet for the first time.
  • the in-position running state of the board is initialized to normal operation, and the rack image serial number is initialized to 1.
  • the entry is queried by using the board address as the key. If the query fails, the corresponding board entry has not been created yet, and it can be created in the above manner.
  • if the board software entry has already been created, the reported information is compared with the entry as follows. If the current board type differs from the previously saved board type, the board is in the board-type-inconsistent state. If the current serial number is greater than the last saved value, the board is in the normal state. If the current serial number is smaller than the last saved value, the board is in the abnormal state. In this embodiment, the width of the serial number ensures that a running device does not overflow for hundreds of years, so serial number overflow need not be considered. After the comparison, any change in the board state has been captured, and the rack image serial number saved by the main control board is then updated to the most recently reported value.
  • in this way the control program of the main control board learns of changes in the running state of the board. If the determination shows that the in-position state of the board has changed and the current state is normal, the board has been abnormally restarted and has lost its configuration information, and the packet forwarding module must be notified to re-deliver the board's configuration packets, so that the configuration information required by the board is self-recovered.
  • the processing of the board application is as follows: after the board is powered on, the board application starts a timer (for example, a 5 s timer) to periodically send a rack image S-port packet to the main control board, where the S-port packet header carries the address of the main control board, the address of the board, and the serial number of the packet.
  • the board address is used as the key of the board.
  • the packet serial number is a 32-bit unsigned number, and may also be set to 16-bit, 32-bit or 64-bit data according to the needs of the application.
  • the serial number is initialized to 1, so counting starts from 1, and it is incremented by 1 each time a packet is reported.
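The board-side counter and the time it takes to wrap can be checked with a short sketch. `RackImageReporter`, `seconds_until_wrap`, and the dictionary packet layout are hypothetical; a real board application would call `next_packet` from its periodic timer, which is omitted here.

```python
class RackImageReporter:
    """Builds rack image packets with a wrapping unsigned serial number."""

    def __init__(self, board_addr, board_type, bits=32):
        self.board_addr = board_addr
        self.board_type = board_type
        self._modulus = 1 << bits      # an unsigned counter wraps to 0 at 2**bits
        self._seq = 1                  # counting starts from 1 after power-on

    def next_packet(self):
        packet = {
            "board_addr": self.board_addr,
            "board_type": self.board_type,
            "seq": self._seq,
        }
        self._seq = (self._seq + 1) % self._modulus
        return packet


def seconds_until_wrap(bits, period_s):
    """Approximate time before a counter of the given width wraps to 0."""
    return (1 << bits) * period_s


SECONDS_PER_YEAR = 365.25 * 24 * 3600

# A deliberately tiny 2-bit counter makes the wrap to 0 visible.
tiny = RackImageReporter("slot-3", "type-X", bits=2)
print([tiny.next_packet()["seq"] for _ in range(5)])

# Lifetime of a 32-bit and a 16-bit counter with a 5 s reporting period.
print(seconds_until_wrap(32, 5) / SECONDS_PER_YEAR, "years")
print(seconds_until_wrap(16, 5) / 86400, "days")
```

With a 5 s period, a 32-bit counter takes roughly 680 years to wrap, whereas a 16-bit counter wraps in under four days, so the overflow value 0 has to be handled whenever a narrow counter is chosen.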
  • Embodiments of the present invention also provide a storage medium including a stored program, wherein the program described above executes the method of any of the above.
  • the foregoing storage medium may include, but is not limited to, a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and other media that can store program code.
  • Embodiments of the present invention also provide a processor for running a program, wherein the program is executed to perform the steps of any of the above methods.
  • the main control board of the embodiment of the present disclosure can accurately detect the running fault of the board, and then re-send the previously cached board configuration information to the corresponding board to ensure that the board service can be restored to normal.
  • re-delivering the board configuration information ensures that the board configuration is restored after the board is reset or the board application is restarted.
  • the method and device for self-recovery of board configuration in a distributed system have the following beneficial effects: the main control board can accurately detect the loss of a board's configuration and then re-deliver the pre-cached configuration information to the board, restoring the configuration information that was lost on the board.

Abstract

The present invention relates to the technical field of communications. Disclosed in embodiments of the present disclosure are a self-recovery method and apparatus for board configuration in a distributed system, the method comprising: a main control board sends configuration information of each board to a corresponding board, and caches the configuration information of the corresponding board; the main control board determines whether the board has failed previously according to a previous state and a current state of the board; and if it is determined that the board has failed previously, the main control board sends the previously cached configuration information to the board so as to recover a board service. Upon accurately detecting a board failure, the configuration information of the previously cached board is sent to the corresponding board to ensure that the board service returns to normal.

Description

Self-recovery method and apparatus for board configuration in distributed system
Technical field
The present disclosure relates to the field of communications technologies, and in particular to a method and an apparatus for self-recovery of board configuration in a distributed system.
Background
A packet transport network (PTN) device in a communication system is usually a distributed system that includes a main control board, a backplane, and a plurality of boards. The boards include virtual boards and physical boards. A virtual board does not occupy a physical slot, and its agent process runs on the central processing unit (CPU) of the main control board. A physical board is connected to the main control board through the backplane and runs its programs independently. In a specific application, the management plane and control plane programs of the main control board send configuration packets to configure the parameters and services of each board.
A board may be unplugged and inserted into another slot or even another device, and a board usually does not persist the configuration information it received from the main control board, so the configuration information sent by the main control board is lost once the board powers off. When a board loses its configuration because it was reset for hardware or software reasons, or because its application restarted, the main control board must detect the fault and re-deliver the original configuration information of the board, so that the board's services return to normal.
Analysis shows that communication equipment currently detects the running state of a board mainly in the following two ways:
(1) designing a dedicated hardware circuit on the backplane to detect the running state of the board;
(2) using a handshake mechanism, in which a maintenance program on the main control board periodically sends handshake messages to the board; if the board program does not return a response message within a preset time or number of periods, the board is judged to have failed.
With both methods, the main control board has the following shortcomings in sensing that a board has been reset for software or hardware reasons or that an application has restarted:
The problems with the hardware circuit are as follows:
(1) it occupies additional hardware resources;
(2) a dedicated hardware circuit lacks flexibility: when an application process on the board restarts abnormally, the fixed hardware circuit may fail to detect the restart, yet the board's configuration information has already been lost, so services cannot return to normal.
The problems with having the main control program periodically send handshake messages to the board are as follows:
(1) the exchange of handshake messages consumes CPU resources; when the timer interval is short and there are many boards, it consumes excessive system resources;
(2) a handshake mechanism usually needs a timeout condition, that is, it must wait several periods before judging that a board has failed, in order to avoid misjudgment caused by short-term message congestion. If the board recovers within the timeout window, the mechanism cannot detect the board application restart or reset, so the board configuration is lost and services cannot return to normal.
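The missed-detection case in (2) can be illustrated with a minimal sketch. The function name `handshake_detects_fault`, the per-period reply list, and the threshold of three missed periods are illustrative assumptions, not values taken from the disclosure.

```python
def handshake_detects_fault(replies, max_missed_periods=3):
    """Return True if a handshake monitor would declare a board fault.

    replies: one boolean per handshake period (True = board answered).
    A fault is declared only after max_missed_periods consecutive misses,
    which models the timeout condition used to tolerate short congestion.
    """
    missed = 0
    for answered in replies:
        if answered:
            missed = 0
        else:
            missed += 1
            if missed >= max_missed_periods:
                return True
    return False


# The board restarts and is silent for two periods, then answers again.
# Its configuration is gone, but the monitor never reaches the threshold.
quick_restart = [True, True, False, False, True, True]

# A board that stays silent for three periods is detected.
long_outage = [True, False, False, False, True]
```

With these inputs the quick restart goes unnoticed while the long outage is reported, which is the gap the sequence-number approach described later is meant to close.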
Summary
The technical problem solved by the technical solution provided by embodiments of the present disclosure is to achieve self-recovery of board configuration information when the board configuration is lost because the board is reset by a software or hardware fault, an application process restarts, or the like.
A self-recovery method for board configuration in a distributed system according to an embodiment of the present disclosure includes:
a main control board sends the configuration information of each board to the corresponding board and caches the configuration information of the corresponding board;
the main control board determines, according to previous state information and current state information of a board, whether the board has previously failed;
if it is determined that the board has previously failed, the previously cached configuration information is sent to the board so that the board can restore its services.
Preferably, before the step in which the main control board determines, according to the previous state information and the current state information of the board, whether the board has previously failed, the method further includes:
the main control board receives a rack map message from the board and parses the rack map message to obtain the current state information of the board;
the main control board obtains the previous state information, which was obtained and saved earlier by parsing the board's preceding rack map message.
Preferably, the step in which the main control board determines, according to the previous state information and the current state information of the board, whether the board has previously failed includes:
the main control board determines, according to the board type in the current state information, whether the board type has changed;
if the board type has not changed, the rack map sequence number in the current state information is compared with the rack map sequence number in the previous state information;
if the rack map sequence number in the current state information is smaller than the rack map sequence number in the previous state information and is not equal to zero, it is determined that the board has previously failed.
Preferably, after the step in which the main control board determines, according to the previous state information and the current state information of the board, whether the board has previously failed, the rack map sequence number in the previous state information saved by the main control board is updated with the rack map sequence number in the current state information.
Preferably, the method further includes: if the board type has changed, the main control board does not send the previously cached configuration information to the board.
A storage medium according to an embodiment of the present disclosure stores a program for implementing the above self-recovery method for board configuration in a distributed system.
A self-recovery apparatus for board configuration in a distributed system according to an embodiment of the present disclosure includes:
a main control board message forwarding module, configured to send the configuration information of each board to the corresponding board and to cache the configuration information of the corresponding board;
a main control board control module, configured to determine, according to previous state information and current state information of a board, whether the board has previously failed;
wherein, if the main control board control module determines that the board has previously failed, the main control board message forwarding module sends the previously cached configuration information to the board so that the board can restore its services.
Preferably, the main control board control module is further configured to parse a received rack map message from the board to obtain the current state information of the board, and to obtain the previous state information that was obtained and saved earlier by parsing the board's preceding rack map message.
Preferably, the main control board control module determines, according to the board type in the current state information, whether the board type has changed; if the board type has not changed, it compares the rack map sequence number in the current state information with the rack map sequence number in the previous state information; and if the rack map sequence number in the current state information is smaller than the rack map sequence number in the previous state information and is not equal to zero, it determines that the board has previously failed.
Preferably, after determining, according to the previous state information and the current state information of the board, whether the board has previously failed, the main control board control module updates the rack map sequence number in the previous state information saved by the main control board with the rack map sequence number in the current state information.
Preferably, when the board type has changed, the main control board control module does not send the previously cached configuration information to the board.
According to yet another embodiment of the present disclosure, a storage medium is further provided, the storage medium including a stored program, wherein the program, when run, performs the method of any one of the above.
According to yet another embodiment of the present disclosure, a processor is further provided, the processor being configured to run a program, wherein the program, when run, performs the method of any one of the above.
The technical solution provided by embodiments of the present disclosure has the following beneficial effects:
The main control board can accurately detect loss of board configuration and then re-deliver the pre-cached configuration information to the board, thereby achieving self-recovery of the configuration information lost on the board and ensuring that the board's services return to normal.
Brief Description of the Drawings
FIG. 1 is a block diagram of a self-recovery method for board configuration in a distributed system according to an embodiment of the present disclosure;
FIG. 2 is a block diagram of a self-recovery apparatus for board configuration in a distributed system according to an embodiment of the present disclosure;
FIG. 3 is a system architecture diagram of a distributed system according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of the state machine by which the main control board program judges the in-position state of a board according to an embodiment of the present disclosure;
FIG. 5 is a flowchart of the operation of the main control board message forwarding module according to an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of the self-recovery process for a board's S-port configuration messages according to an embodiment of the present disclosure.
Detailed Description
Preferred embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. It should be understood that the preferred embodiments described below are only used to illustrate and explain the present disclosure and are not intended to limit it.
FIG. 1 is a block diagram of a self-recovery method for board configuration in a distributed system according to an embodiment of the present disclosure. As shown in FIG. 1, the steps include:
Step S101: The main control board sends the configuration information of each board to the corresponding board and caches the configuration information of the corresponding board.
The configuration information of a board includes parameter configuration information and service configuration information.
Step S102: The main control board determines, according to the previous state information and the current state information of a board, whether the board has previously failed.
The main control board receives a rack map message from the board and parses it to obtain the current state information of the board, and obtains the previous state information that was obtained and saved earlier by parsing the board's preceding rack map message. The main control board determines, according to the board type in the current state information, whether the board type has changed. If the board type has changed, the previously cached configuration information is not sent to the board. If the board type has not changed, the main control board compares the current state information of the board with the previous state information to determine whether the board has previously failed. Specifically, if the rack map sequence number in a board's current state information is smaller than the rack map sequence number in its previous state information and is not equal to zero, the in-position state of the board is abnormal, and it is determined that the board has previously failed. After determining whether the board has previously failed, the main control board uses the current state information to update the previous state information it has saved, for example by replacing the rack map sequence number in the saved previous state information with the rack map sequence number in the current state information, so that the main control board always holds the latest state information.
Step S103: If it is determined that the board has previously failed, the previously cached configuration information is sent to the board so that the board can restore its services.
The backplane has multiple slots. The board to be inserted in each slot is preset, and the main control board saves, for each slot, the corresponding board type information and the configuration information of the corresponding board. When the main control board receives a rack map message from the board inserted in a slot, it parses the message to obtain the board type and the rack map sequence number. The main control board matches this board type against the board type it has saved for that slot. If they do not match, the board inserted in the slot has changed, that is, the board type has changed; the main control board then sends no configuration information to that board until the board inserted in the slot is again the preset board, at which point the main control board determines that the board in the slot has gone from the board-type-inconsistent state to the running-normally state and sends the previously cached configuration information to the preset board. If they match, the board inserted in the slot has not changed, that is, the board type has not changed, and the main control board then uses the rack map sequence number together with the rack map sequence number in the previous state information to determine whether the board has previously failed.
For example, suppose board 1 should be inserted in slot 1 and the main control board saves the configuration information of board 1. The main control board periodically receives rack map messages from board 1 in slot 1 and saves the latest rack map sequence number.
1. When the main control board determines, from the board type in the currently received rack map message, that board 2 is inserted in slot 1, it determines that the board type is inconsistent and does not send configuration messages to board 2. Only when the main control board determines, from the board type in a subsequently received rack map message, that board 1 has been reinserted in slot 1 does it send the previously cached configuration information of board 1 to board 1.
2. When the main control board determines, from the board type in the currently received rack map message, that board 1 is inserted in slot 1, it determines that the board type is consistent. It then compares the rack map sequence number in the currently received rack map message with the latest rack map sequence number it has saved and, according to the comparison result, determines whether board 1 has previously failed. If board 1 has previously failed, the main control board sends the previously cached configuration information of board 1 to board 1.
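The per-slot decision in the slot 1 example can be sketched as follows. This is a minimal illustration under stated assumptions: the class `SlotRecord`, the function `on_rack_map_message`, the string board types, and the use of a flag for the type-inconsistent state are all hypothetical and not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class SlotRecord:
    """What the main control board keeps for one slot (hypothetical layout)."""
    expected_type: str           # board type preset for this slot
    cached_config: list          # configuration previously sent to the board
    last_seq: int = 0            # latest rack map sequence number saved
    type_mismatch: bool = False  # True while a different board type is inserted


def on_rack_map_message(slot, board_type, seq):
    """Return the cached configuration to resend, or None if nothing is sent."""
    if board_type != slot.expected_type:
        # Board type changed: do not send the cached configuration.
        slot.type_mismatch = True
        return None
    if slot.type_mismatch:
        # The preset board is back after a mismatch: resend its configuration.
        resend = True
        slot.type_mismatch = False
    else:
        # Same board type: a smaller, non-zero sequence number means the
        # board failed earlier (zero is not treated as a failure).
        resend = seq < slot.last_seq and seq != 0
    slot.last_seq = seq  # keep the latest reported sequence number
    return list(slot.cached_config) if resend else None


slot1 = SlotRecord(expected_type="board1", cached_config=["cfg-a", "cfg-b"])
```

Feeding `slot1` a message whose type is `"board2"` returns `None`, and the next `"board1"` message returns the cached list, matching case 1; a `"board1"` message whose sequence number drops to a non-zero value also returns the cached list, matching case 2.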
Embodiments of the present disclosure can use software to accurately detect running faults, such as a whole-board reset or a board application restart, that cause board configuration messages (that is, configuration information) to be lost, and can deliver the pre-cached configuration information to the board that experienced the running fault. That is, embodiments of the present disclosure achieve board fault detection and self-recovery of configuration information in a distributed system.
After detecting a board fault, the main control board of embodiments of the present disclosure can re-deliver the board configuration information saved in advance to the board, ensuring that services can return to normal. This has the following advantages:
1. Board running faults can be detected without relying on a dedicated hardware circuit, which reduces cost and improves flexibility and accuracy;
2. The CPU consumption and missed fault detection that a common handshake mechanism may cause are avoided;
3. Configuration information that a board loses because of a fault can be restored automatically.
Those of ordinary skill in the art will understand that all or some of the steps of the method of the above embodiment can be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, includes steps S101 to S103. The storage medium may be a ROM/RAM, a magnetic disk, an optical disc, or the like.
FIG. 2 is a block diagram of a self-recovery apparatus for board configuration in a distributed system according to an embodiment of the present disclosure. As shown in FIG. 2, the apparatus includes a main control board control module 10 and a main control board message forwarding module 20.
The main control board message forwarding module 20 is configured to send the configuration information of each board to the corresponding board and to cache the configuration information of the corresponding board.
The main control board control module 10 is configured to determine, according to the previous state information and the current state information of a board, whether the board has previously failed. Specifically, the main control board control module 10 parses a received rack map message from the board to obtain the current state information of the board, and obtains the previous state information that was obtained and saved earlier by parsing the board's preceding rack map message. The main control board control module 10 determines, according to the board type in the current state information, whether the board type has changed. If the board type has changed, the previously cached configuration information is not sent to the board. If the board type has not changed, the module compares the current state information of the board with the previous state information to determine whether the board has previously failed. More specifically, if the rack map sequence number in the current state information is smaller than the rack map sequence number in the previous state information and is not equal to zero, the module determines that the board has previously failed; it then updates the rack map sequence number in the previous state information saved by the main control board with the rack map sequence number in the current state information.
If the main control board control module 10 determines that the board has previously failed, the main control board message forwarding module 20 sends the previously cached configuration information to the board so that the board can restore its services.
FIG. 3 is a system architecture diagram of a distributed system according to an embodiment of the present disclosure. As shown in FIG. 3, the system includes a main control board, a backplane, and a number of boards (for example, board A, board B, and board C). The main control board is connected to the boards through the backplane. The control-plane and management-plane programs of the main control board send configuration messages to the corresponding boards through the main control board's message forwarding module (equivalent to the main control board message forwarding module 20 in FIG. 2), thereby performing parameter configuration and service configuration on each board. Each board runs its board applications under the configuration of the main control board.
The present disclosure adopts the following technical solution:
The control program of the main control board (which implements the function of the main control board control module 10 in FIG. 2) sends configuration messages through the S port to the specified board via the message forwarding module. The forwarding module (that is, the message forwarding module) needs to cache, keyed by board address, the parameter configuration and service configuration messages sent to each board, and must be able to merge configuration messages that have the same command code.
The message forwarding module is a process or task running on the main control board. It provides an interface for resending the cached configuration messages of a specified board to that board when the board's running state information (that is, state information) changes.
The board running state information indicates the running state of the board, which is one of three states: running normally, running abnormally, and board type inconsistent. When the running state of a board changes from running abnormally to running normally, or from board type inconsistent to running normally, the message forwarding module must be notified to re-deliver the board configuration messages.
FIG. 4 is a schematic diagram of the state machine by which the main control board program judges the in-position state of a board according to an embodiment of the present disclosure. As shown in FIG. 4, the main control board sets the initial state of the board to running normally and then waits for a newly reported rack map S-port message (that is, a rack map message) from the board, which it parses to obtain the board type and the rack map sequence number. It judges whether the board type has changed. If the board type has changed, the running state of the board is board type inconsistent, and the main control board waits for the next reported rack map S-port message; otherwise it judges whether the rack map sequence number has increased. If the rack map sequence number has increased, the running state of the board is running normally, and the main control board waits for the next reported rack map S-port message; otherwise it judges whether the rack map sequence number has become smaller and is not 0. If the rack map sequence number has become smaller and is not 0, the running state of the board is running abnormally, and the main control board waits for the next reported rack map S-port message; otherwise the rack map sequence number has become smaller and is 0, in which case the running state of the board is running normally, and the main control board waits for the next reported rack map S-port message.
That is, the board application uses a timer to periodically send rack map S-port messages to the main control board. After receiving a rack map message from a board, the control program of the main control board parses out the board type and the rack map sequence number and compares them with the values saved last time. If the board type has changed, the in-position state of the board is judged to be board type inconsistent. If the current value (that is, the current rack map sequence number) is greater than the previous value, the in-position state of the board is judged to be normal. If the current value is smaller than the value saved last time and equals the overflow value 0, the in-position state of the board is judged to be normal. If the current value is smaller than the value saved last time and does not equal the overflow value 0, the in-position state of the board is judged to be abnormal. Each time the above judgment is completed, the board rack map index number (that is, the rack map sequence number) saved by the main control board is updated to the value most recently reported by the board.
Through the above judgment, if the main control board detects that the in-position state (that is, the running state) of a board has changed and the current state is normal, the board program can be considered to have restarted abnormally and the board configuration information to have been lost. The message forwarding module must then be notified immediately to re-deliver the board configuration messages, ensuring that the configuration information the board needs in order to run can be restored automatically.
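The judgment rules of FIG. 4 and the resend trigger can be sketched as below. The names `RunState`, `classify`, `needs_resend`, and `replay` are hypothetical, as is the starting saved sequence number of 0; the handling of an unchanged sequence number is an assumption, since the text does not cover that case.

```python
from enum import Enum


class RunState(Enum):
    NORMAL = "running normally"
    ABNORMAL = "running abnormally"
    TYPE_MISMATCH = "board type inconsistent"


def classify(saved_type, saved_seq, cur_type, cur_seq):
    """Classify one rack map message against the saved values."""
    if cur_type != saved_type:
        return RunState.TYPE_MISMATCH
    if cur_seq > saved_seq:
        return RunState.NORMAL
    if cur_seq < saved_seq and cur_seq != 0:
        return RunState.ABNORMAL  # sequence number went backwards
    # Smaller and equal to the overflow value 0. An unchanged number also
    # lands here; treating it as normal is an assumption.
    return RunState.NORMAL


def needs_resend(old_state, new_state):
    """Resend cached configuration when the state changes to NORMAL."""
    return new_state is RunState.NORMAL and old_state is not RunState.NORMAL


def replay(saved_type, messages):
    """Feed (board_type, seq) messages through the state machine.

    Returns one (state, resend) pair per message.
    """
    state, saved_seq, out = RunState.NORMAL, 0, []
    for cur_type, cur_seq in messages:
        new_state = classify(saved_type, saved_seq, cur_type, cur_seq)
        out.append((new_state, needs_resend(state, new_state)))
        state, saved_seq = new_state, cur_seq  # save the latest reported value
    return out
```

In this sketch a restart shows up as one abnormal judgment (the sequence number drops to a non-zero value) followed by a normal one, and the resend is triggered on that second message.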
An S-port message is a communication message used for message exchange between processes within the main control board, within a board, and between the main control board and a board.
Embodiments of the present disclosure can solve the problem of board configuration messages being lost because of a whole-board reset, a board application restart, or similar causes.
The detailed process by which the main control board control program detects the running state of a board and resends the board configuration messages is described below with reference to typical embodiments.
FIG. 5 is a flowchart of the operation of the message forwarding module of the main control board according to an embodiment of the present disclosure. As shown in FIG. 5, the steps include:
Step S201: The message forwarding module waits for a configuration message from the control program of the main control board.
Step S202: The message forwarding module receives an S-port configuration message (that is, a configuration message).
Step S203: The message forwarding module sends the S-port configuration message to the specified board.
Step S204: Judge whether the message is a registered message; if so, perform step S205; otherwise, perform step S201.
In the initialization phase, configuration messages for certain boards, command codes, and operators can be designated as registered messages. For example, if all configuration messages of board A are designated as registered messages, then in step S204 a configuration message sent to board A is a registered message.
Step S205: Judge whether the operation is a modify operation; if so, perform step S206; otherwise, perform step S207.
Step S206: Merge the S-port configuration message.
For example, when the value of a configuration parameter is changed from 2000 to 1000, the S-port configuration message is merged with the previously saved S-port configuration message so that the parameter value in the new S-port configuration message is 1000.
Step S207: Cache the S-port configuration message.
That is, the main control board message forwarding module works as follows. A dedicated message forwarding module is designed on the main control board; it is the intermediate hub through which the main control board control program and the program on each board exchange S-port messages. The message forwarding module forwards S-port configuration messages and also provides an interface for registering the configuration messages of a specified board, specified command code, and specified operator; only registered messages are cached. For messages whose operator is modify, it provides a message merging function, that is, the configuration information is updated to the latest value. After the main control board control program sends board configuration information to the message forwarding module, the module first forwards the S-port message to the board and then judges, from the command code and operator in the S-port message, whether the configuration message has been registered. If it has been registered and the operator is modify, the message is merged and cached; if it has been registered and the operator is add, the message is cached directly; if it has not been registered, it is skipped. In addition, the message forwarding module must provide an interface for re-delivering to a specified board all of the S-port configuration messages already cached for it.
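The forward, register, merge, and resend behaviour described above can be sketched as follows. The class `MessageForwarder`, its method names, the dictionary message layout, and the operator strings `"add"` and `"modify"` are assumptions for illustration; the disclosure does not define the S-port message format.

```python
class MessageForwarder:
    """Sketch of a forwarding module: forward first, then cache if registered."""

    def __init__(self, send):
        self.send = send         # callable(board_addr, message), assumed transport
        self.registered = set()  # (board_addr, command_code, operator)
        self.cache = {}          # board_addr -> list of cached messages

    def register(self, board_addr, command_code, operator):
        self.registered.add((board_addr, command_code, operator))

    def forward(self, board_addr, command_code, operator, params):
        # S203: every message is forwarded to the board.
        self.send(board_addr,
                  {"cmd": command_code, "op": operator, "params": dict(params)})
        # S204: only registered messages are cached.
        if (board_addr, command_code, operator) not in self.registered:
            return
        cached = self.cache.setdefault(board_addr, [])
        if operator == "modify":
            # S205/S206: merge into the cached message with the same command code.
            for old in cached:
                if old["cmd"] == command_code:
                    old["params"].update(params)  # keep only the latest values
                    return
        # S207: cache a private copy of the message.
        cached.append({"cmd": command_code, "op": operator, "params": dict(params)})

    def resend_all(self, board_addr):
        """Re-deliver every cached message for one board."""
        for msg in self.cache.get(board_addr, []):
            self.send(board_addr, {**msg, "params": dict(msg["params"])})
```

With the 2000 to 1000 example, an add carrying 2000 followed by a modify carrying 1000 leaves a single cached message holding 1000, so a later `resend_all` delivers only the latest value.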
FIG. 6 is a schematic diagram of the self-recovery process for a board's S-port configuration messages according to an embodiment of the present disclosure. As shown in FIG. 6, the steps include:
Step S301: The main control board control program waits for a rack diagram message from a board.
Step S302: Receive and parse the rack diagram message.
Step S303: Determine whether this is the first rack diagram message received from the board. If so, go to step S304; otherwise, go to step S305.
The board software table is queried using the board address as the key. If no entry is found for the board, this is the first rack diagram message received from that board.
Step S304: Create the board software table entry for the board.
Step S305: Determine whether the board type has changed. If so, return to step S301; otherwise, go to step S306.
Step S306: Determine whether the current rack diagram sequence number is smaller than the previously saved value. If so, go to step S307; otherwise, return to step S301.
Step S307: Determine whether the current rack diagram sequence number is the overflow value 0. If so, return to step S301; otherwise, go to step S308.
Step S308: Notify the message forwarding module of the main control board to redeliver the S-port configuration messages of the board.
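The decision flow of steps S303 to S308 can be sketched as a single function. This is an illustrative sketch only: the names `BoardEntry` and `handle_rack_message` are hypothetical, the table is modelled as a plain dictionary keyed by board address, and two details are assumptions because the text does not spell them out: after an entry is created (S304) the function simply returns to waiting, and the saved entry is left untouched when the board type has changed.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class BoardEntry:
    """Hypothetical board software table entry kept by the main control board."""
    board_type: int
    seq: int  # last rack diagram sequence number seen


def handle_rack_message(table: Dict[int, BoardEntry],
                        board_addr: int, board_type: int, seq: int) -> bool:
    """Return True when the cached configuration should be redelivered (S308)."""
    entry = table.get(board_addr)
    if entry is None:
        # S303/S304: first message from this board, create its entry.
        table[board_addr] = BoardEntry(board_type, seq)
        return False
    if board_type != entry.board_type:
        # S305: board type changed, go back to waiting (entry kept as-is,
        # which is an assumption of this sketch).
        return False
    # S306: sequence number went backwards; S307: but is not the overflow value 0.
    restarted = seq < entry.seq and seq != 0
    # Keep the saved sequence number in step with the latest report.
    entry.seq = seq
    return restarted
```

A caller would invoke `handle_rack_message` for every parsed rack diagram message and, when it returns `True`, ask the message forwarding module to redeliver the cached S-port configuration messages for that board.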
In other words, the control program of the main control board works as follows. When the control program receives a rack diagram message from a board for the first time, it creates a software table entry keyed by the board address to record the board's information. In the entry, the board type is initialized to the value carried in the board's first rack diagram report, the online running state is initialized to normal, and the rack diagram sequence number is initialized to 1. When the control program later receives another rack diagram message from a board, it queries the table using the board address as the key. If the query fails, the entry for that board has not yet been created, and the corresponding board software table entry is created as described above. If the query succeeds, the entry already exists, and the current board information is compared with the saved entry as follows. If the current board type differs from the saved board type, the board's running state is judged to be "board type inconsistent". If the current sequence number is greater than the previously saved value, the board's presence state is judged to be normal; if the current sequence number is smaller than the previously saved value, the board's presence state is judged to be abnormal. In this embodiment, the range of the sequence number ensures that it will not overflow even if the equipment runs for hundreds of years, so sequence number overflow does not need to be considered. Once this comparison is complete, any change in the board's running state has been captured. The rack diagram sequence number saved by the main control board is then updated to the most recently reported value. After these checks, the control program knows how the board's running state has changed. If the result is that the board's presence state has changed and its current value is normal, the board program can be considered to have restarted abnormally and the board's configuration information to have been lost. The message forwarding module must then be notified immediately to redeliver the board configuration messages, so that the configuration information the board needs to run recovers automatically.
The board application works as follows. After the board powers on, the board application starts a timer (for example, a 5 s timer) and periodically sends a rack diagram S-port message to the main control board. The S-port message header contains the main control board address, the board address, the message sequence number, and other information; the board address serves as the key that identifies the board. In this embodiment, the message index number is a 32-bit unsigned number; depending on the needs of the application, it may also be set to 16-bit, 32-bit, or 64-bit data. The sequence number is initialized to 1 and counts up from 1, increasing by 1 each time a report is sent.
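The board-side counter described above, together with a rough check of how long a counter of a given width lasts at a given reporting interval, can be sketched as follows. The names `RackReporter` and `years_until_overflow`, the dictionary message layout, and the 365-day year are assumptions made for illustration; the real timer and the S-port header format are not modelled.

```python
SEQ_BITS = 32          # width used in the embodiment
REPORT_INTERVAL_S = 5  # example timer period from the text


class RackReporter:
    """Sketch of the board application's sequence-numbered reporting."""

    def __init__(self, master_addr: int, board_addr: int,
                 bits: int = SEQ_BITS) -> None:
        self.master_addr = master_addr
        self.board_addr = board_addr
        self._mask = (1 << bits) - 1
        self._seq = 1  # starts at 1 after power-on or restart

    def next_message(self) -> dict:
        """Build the next report; intended to be called on each timer tick."""
        msg = {"master": self.master_addr,
               "board": self.board_addr,
               "seq": self._seq}
        # Unsigned increment; wraps to 0 once the counter width is exhausted.
        self._seq = (self._seq + 1) & self._mask
        return msg


def years_until_overflow(bits: int = SEQ_BITS,
                         interval_s: float = REPORT_INTERVAL_S) -> float:
    """Approximate time before the counter wraps, in 365-day years."""
    return (2 ** bits) * interval_s / (365 * 24 * 3600)
```

Under these assumptions a 32-bit counter incremented every 5 s wraps after roughly 680 years, which is consistent with the statement that overflow need not be considered in this embodiment, whereas a 16-bit counter at the same interval would wrap after about 3.8 days.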
Embodiments of the present invention also provide a storage medium that includes a stored program, wherein the program, when run, performs the method of any one of the above.
Optionally, in this embodiment, the storage medium may include, but is not limited to, a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, an optical disc, or any other medium that can store program code.
Embodiments of the present invention also provide a processor configured to run a program, wherein the program, when run, performs the steps of any one of the above methods.
In summary, the embodiments of the present disclosure have the following technical effects:
The main control board of the embodiments of the present disclosure can accurately detect a board operating fault and then resend the previously cached board configuration information to the affected board, so that the board's service returns to normal. That is, the main control board caches the configuration information of specified boards and, when it accurately detects a fault that causes board configuration messages to be lost, such as a board reset or a board application restart, redelivers the board configuration information so that the board's service returns to normal.
Although the present disclosure has been described in detail above, it is not limited to this description, and those skilled in the art may make various modifications in accordance with the principles of the present disclosure. Modifications made in accordance with the principles of the present disclosure should therefore be understood as falling within its scope of protection.
Industrial Applicability
As described above, the method and apparatus for self-recovery of board configuration in a distributed system provided by the embodiments of the present invention have the following beneficial effects: the main control board can accurately detect the loss of a board's configuration and then redeliver the pre-cached configuration information to the board, so that the configuration information lost on the board recovers automatically and the board's service returns to normal.

Claims (11)

  1. A method for self-recovery of board configuration in a distributed system, comprising:
    sending, by a main control board, configuration information of each board to the corresponding board, and caching the configuration information of the corresponding board;
    determining, by the main control board, according to previous state information and current state information of a board, whether the board has previously failed; and
    if it is determined that the board has previously failed, sending the previously cached configuration information to the board, so that the board restores its service.
  2. The method according to claim 1, further comprising, before the step in which the main control board determines, according to the previous state information and the current state information of the board, whether the board has previously failed:
    receiving, by the main control board, a rack diagram message from the board, and parsing the rack diagram message to obtain the current state information of the board; and
    obtaining the previous state information that was obtained and saved earlier by parsing the previous rack diagram message of the board.
  3. The method according to claim 1 or 2, wherein the step in which the main control board determines, according to the previous state information and the current state information of the board, whether the board has previously failed comprises:
    determining, by the main control board, according to the board type in the current state information, whether the board type has changed;
    if the board type has not changed, comparing the rack diagram sequence number in the current state information with the rack diagram sequence number in the previous state information; and
    if the rack diagram sequence number in the current state information is smaller than the rack diagram sequence number in the previous state information, and the rack diagram sequence number in the current state information is not equal to zero, determining that the board has previously failed.
  4. The method according to claim 3, wherein, after the step in which the main control board determines, according to the previous state information and the current state information of the board, whether the board has previously failed, the rack diagram sequence number in the previous state information saved by the main control board is updated with the rack diagram sequence number in the current state information.
  5. The method according to claim 3, further comprising:
    if the board type has changed, the main control board does not send the previously cached configuration information to the board.
  6. An apparatus for self-recovery of board configuration in a distributed system, comprising:
    a main control board message forwarding module, configured to send configuration information of each board to the corresponding board and to cache the configuration information of the corresponding board; and
    a main control board control module, configured to determine, according to previous state information and current state information of a board, whether the board has previously failed;
    wherein, if the main control board control module determines that the board has previously failed, the main control board message forwarding module sends the previously cached configuration information to the board, so that the board restores its service.
  7. The apparatus according to claim 6, wherein the main control board control module is further configured to parse a received rack diagram message from the board to obtain the current state information of the board, and to obtain the previous state information that was obtained and saved earlier by parsing the previous rack diagram message of the board.
  8. The apparatus according to claim 6 or 7, wherein the main control board control module determines, according to the board type in the current state information, whether the board type has changed; if the board type has not changed, compares the rack diagram sequence number in the current state information with the rack diagram sequence number in the previous state information; and, if the rack diagram sequence number in the current state information is smaller than the rack diagram sequence number in the previous state information and the rack diagram sequence number in the current state information is not equal to zero, determines that the board has previously failed.
  9. The apparatus according to claim 8, wherein the main control board control module, after determining, according to the previous state information and the current state information of the board, whether the board has previously failed, updates the rack diagram sequence number in the previous state information saved by the main control board with the rack diagram sequence number in the current state information.
  10. The apparatus according to claim 8, wherein the main control board control module does not send the previously cached configuration information to the board when the board type has changed.
  11. A storage medium comprising a stored program, wherein the program, when run, performs the method according to any one of claims 1 to 5.
PCT/CN2017/086396 2016-06-15 2017-05-27 Self-recovery method and apparatus for board configuration in distributed system WO2017215441A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610421458.3 2016-06-15
CN201610421458.3A CN107517110B (en) 2016-06-15 2016-06-15 Single board configuration self-recovery method and device in distributed system

Publications (1)

Publication Number Publication Date
WO2017215441A1 (en) 2017-12-21

Family

ID=60662968

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/086396 WO2017215441A1 (en) 2016-06-15 2017-05-27 Self-recovery method and apparatus for board configuration in distributed system

Country Status (2)

Country Link
CN (1) CN107517110B (en)
WO (1) WO2017215441A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113206698A (en) * 2021-03-22 2021-08-03 深圳震有科技股份有限公司 Satellite media resource redundancy protection method, intelligent terminal and storage medium
CN113824631A (en) * 2021-09-10 2021-12-21 烽火通信科技股份有限公司 Message forwarding method and device, communication equipment and storage medium

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109062839A (en) * 2018-07-18 2018-12-21 郑州云海信息技术有限公司 A kind of method, apparatus and computer readable storage medium detecting HBA card
CN109462502B (en) * 2018-10-30 2022-02-11 新华三技术有限公司合肥分公司 Control method and device for configuration information storage instruction and SDN controller
CN109639509B (en) * 2019-01-21 2021-12-07 新华三技术有限公司合肥分公司 Network equipment configuration method and device
CN110177372B (en) * 2019-04-16 2021-12-14 中信科移动通信技术股份有限公司 Base station copyright permission verification method and device
CN110519098B (en) * 2019-08-30 2022-06-21 新华三信息安全技术有限公司 Method and device for processing abnormal single board
CN111432085B (en) * 2020-03-13 2021-05-25 深圳震有科技股份有限公司 Method for controlling user account registration, storage medium and voice gateway

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1744514A (en) * 2004-08-30 2006-03-08 华为技术有限公司 Method and device for obtaining information of operation and failure state for network equipment
CN101621396A (en) * 2008-07-01 2010-01-06 中兴通讯股份有限公司 Single board automatic management device and method
CN101883013A (en) * 2010-07-09 2010-11-10 中兴通讯股份有限公司 Automatic configuration method and system for single board with alternative mode
CN102355368A (en) * 2011-10-08 2012-02-15 大连环宇移动科技有限公司 Fault processing method of network equipment and system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20080038754A (en) * 2006-10-31 2008-05-07 한국전력공사 Distribution automation system
US7783813B2 (en) * 2007-06-14 2010-08-24 International Business Machines Corporation Multi-node configuration of processor cards connected via processor fabrics
CN103618618B (en) * 2013-11-13 2017-05-24 福建星网锐捷网络有限公司 Line card fault recovery method and related device based on distributed PCIE system
CN105357023A (en) * 2014-08-22 2016-02-24 中兴通讯股份有限公司 Rack diagram display method and apparatus
CN104618136B (en) * 2014-12-25 2018-11-06 曙光信息产业(北京)有限公司 Configuring management method and device for blade server

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113206698A (en) * 2021-03-22 2021-08-03 深圳震有科技股份有限公司 Satellite media resource redundancy protection method, intelligent terminal and storage medium
CN113824631A (en) * 2021-09-10 2021-12-21 烽火通信科技股份有限公司 Message forwarding method and device, communication equipment and storage medium
CN113824631B (en) * 2021-09-10 2023-04-07 烽火通信科技股份有限公司 Message forwarding method and device, communication equipment and storage medium

Also Published As

Publication number Publication date
CN107517110B (en) 2022-07-12
CN107517110A (en) 2017-12-26

Similar Documents

Publication Publication Date Title
WO2017215441A1 (en) Self-recovery method and apparatus for board configuration in distributed system
US10425316B2 (en) Heart beat monitoring for broadband access devices and enterprise devices
US7017082B1 (en) Method and system for a process manager
US11743097B2 (en) Method and system for sharing state between network elements
US7353259B1 (en) Method and apparatus for exchanging configuration information between nodes operating in a master-slave configuration
US10838752B2 (en) Network notification loss detection for virtual machine migration
US10361992B2 (en) Method for synchronizing virtual machine location information between data center gateways, gateway, and system
CN103220160B (en) The method and apparatus that the management overall situation is transmitted in distributed switch
CN110535692B (en) Fault processing method and device, computer equipment, storage medium and storage system
CN106933659B (en) Method and device for managing processes
CN112506702B (en) Disaster recovery method, device, equipment and storage medium for data center
CN104486125A (en) Backup method and device of configuration files
US20150071091A1 (en) Apparatus And Method For Monitoring Network Performance
US10530634B1 (en) Two-channel-based high-availability
CN111083049B (en) User table item recovery method and device, electronic equipment and storage medium
US10778571B2 (en) Flow entry timing processing method and apparatus
CN110213176B (en) Message processing method, device, equipment and medium of switch
WO2017071430A1 (en) Message processing method, network card, system, information update method, and server
CN109428814B (en) Multicast traffic transmission method, related equipment and computer readable storage medium
US11411829B1 (en) Provisioning managed network nodes and/or managing network nodes
CN109361781B (en) Message forwarding method, device, server, system and storage medium
CN112217718A (en) Service processing method, device, equipment and storage medium
CN110912837A (en) VSM system-based main/standby switching method and device
CN114143253B (en) VRRP (virtual router redundancy protocol) back-cut method and device
CN114513398B (en) Network equipment alarm processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17812552

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17812552

Country of ref document: EP

Kind code of ref document: A1