WO2016101825A1 - Method and device for hot backup of a controller in distributed protection - Google Patents

Method and device for hot backup of a controller in distributed protection

Info

Publication number
WO2016101825A1
Authority
WO
WIPO (PCT)
Prior art keywords
controller
protocol
standby
sent
packet
Prior art date
Application number
PCT/CN2015/097578
Other languages
English (en)
French (fr)
Inventor
李必锴
卢刚
李国仿
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Priority to EP15871893.2A (patent EP3247055A4)
Priority to US15/539,598 (patent US20180269963A1)
Publication of WO2016101825A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0663 Performing the actions predefined by failover planning, e.g. switching to standby network elements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B10/00 Transmission systems employing electromagnetic waves other than radio-waves, e.g. infrared, visible or ultraviolet light, or employing corpuscular radiation, e.g. quantum communication
    • H04B10/03 Arrangements for fault recovery
    • H04B10/032 Arrangements for fault recovery using working and protection systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04Q SELECTING
    • H04Q11/00 Selecting arrangements for multiplex systems
    • H04Q11/0001 Selecting arrangements for multiplex systems using optical switching
    • H04Q11/0005 Switch and router aspects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04Q SELECTING
    • H04Q11/00 Selecting arrangements for multiplex systems
    • H04Q11/0001 Selecting arrangements for multiplex systems using optical switching
    • H04Q11/0062 Network aspects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04Q SELECTING
    • H04Q11/00 Selecting arrangements for multiplex systems
    • H04Q11/0001 Selecting arrangements for multiplex systems using optical switching
    • H04Q11/0005 Switch and router aspects
    • H04Q2011/0037 Operation
    • H04Q2011/0039 Electrical control
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04Q SELECTING
    • H04Q11/00 Selecting arrangements for multiplex systems
    • H04Q11/0001 Selecting arrangements for multiplex systems using optical switching
    • H04Q11/0005 Switch and router aspects
    • H04Q2011/0037 Operation
    • H04Q2011/0043 Fault tolerance
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04Q SELECTING
    • H04Q11/00 Selecting arrangements for multiplex systems
    • H04Q11/0001 Selecting arrangements for multiplex systems using optical switching
    • H04Q11/0005 Switch and router aspects
    • H04Q2011/0037 Operation
    • H04Q2011/0045 Synchronisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04Q SELECTING
    • H04Q11/00 Selecting arrangements for multiplex systems
    • H04Q11/0001 Selecting arrangements for multiplex systems using optical switching
    • H04Q11/0062 Network aspects
    • H04Q2011/0079 Operation or maintenance aspects
    • H04Q2011/0081 Fault tolerance; Redundancy; Recovery; Reconfigurability

Definitions

  • The application relates to, but is not limited to, hot backup technology for an optical transmission network.
  • APS: automatic protection switching.
  • The controller, which is the core part of APS, receives the input of the detector and the protocol transmitter, performs the protection protocol calculation on that input, issues a switching command to the actuator, and, during bidirectional switching, signals the opposite end through the protocol transmitter.
  • The detector is responsible for fault detection on the working or protection path and reports the result to the controller as an input.
  • The actuator receives the switching command output by the controller and controls the switching of the service.
  • The protocol transmitter is used in the bidirectional switching configuration: the originating controller notifies its protocol transmitter to send the PCC signaling defined in the standard, and the receiving protocol transmitter notifies its controller after it receives the signaling of the peer.
  • The APS controller usually runs on a separate CPU in a network element, such as the main control board, while the detector, the actuator, and the protocol transmitter are generally on the service boards. All protection groups of the network element are controlled by one unified controller.
  • The biggest advantage of this centralized mode is that the protection function is fast and the protocol algorithm can be adjusted quickly, but its shortcomings are also obvious: because the processing capacity of the centralized CPU is limited, the switching time is affected when the entire network has many protection services.
  • If the CPU load is shared among multiple CPUs, the disadvantages of the centralized mode can be overcome: the switching time is not affected, and the loss of the service protection function for the whole network element caused by a controller failure can also be avoided. In this scenario, the controller sinks to the service board.
  • The controller is the core component of automatic protection switching. If the controller hardware or software fails, switching cannot be performed and the service is interrupted accordingly. To improve the survivability of the optical transmission network, the controller must be protected by redundant backup.
  • In a centralized protection system, the controller usually runs on the main control board.
  • The main control board itself adopts the 1+1/1:1 hot backup mode, that is, two main control boards are used in a network element with the automatic protection switching function.
  • Traditional 1:1 hot backup is commonly implemented in the following two ways:
  • Method 1: both the primary protocol controller and the standby protocol controller run normally and receive the input of the detector at the same time. After the protocol is run, only the primary protocol controller sends a switching command to the actuator; the standby protocol controller does not. The primary protocol controller and the standby protocol controller perform real-time and periodic protocol synchronization, and if their protocol states are inconsistent, the primary prevails. When the main control board hosting the primary protocol controller is restarted, unplugged, or damaged, the standby protocol controller can detect this and switches to become the new primary protocol controller, and the previous primary becomes the standby. The main control boards need to maintain the active/standby state.
  • Method 2: the standby protocol controller does not run the protocol. Either only the primary protocol controller receives the input of the detector, or both controllers receive it but the standby protocol controller filters it out. The primary protocol controller sends the switching command to the actuator and performs real-time and periodic protocol synchronization toward the standby protocol controller.
  • The active/standby switchover process is the same as in method 1.
  • Active/standby switchover is harder to support in distributed protection than on the main control board.
  • Main control boards usually have dedicated channels for maintaining the active/standby state, whereas distributed controllers may only be able to communicate through temporary board-to-board communication.
  • In addition, the distributed protection switching time cannot be optimized with a traditional fixed primary/standby pair: for a given protection group, the standby controller may execute more efficiently than the primary, that is, with a shorter switching time.
  • This document provides a method and device for hot backup of a controller in distributed protection.
  • A method for hot backup of a controller in distributed protection, comprising:
  • The executor receives the packets sent by each controller, determines from the sequence number of each packet which packets were triggered by the same protocol trigger, maintains the active/standby relationship of the controllers through the packets triggered by the same protocol trigger, and executes the packet sent by the primary controller.
  • When the protocol is in steady state and the executor determines that the switching condition is reached, the executor performs the active/standby switchover.
  • the message sent by each controller includes: protocol control information, protocol status information, and a sequence number.
  • Maintaining the active/standby relationship of the controllers through the packets triggered by the same protocol trigger includes: the controller whose packet is received first for the first protocol trigger is taken as the primary controller, and the other controllers are standby controllers.
  • Each time the protocol is triggered, the controller whose packet is received first is recorded, and the packet count of that controller in the active/standby relationship information is incremented by one.
  • The method further includes: setting a first timeout period T1; when the executor receives the packet sent by a standby controller first, if the packet sent by the primary controller has not been received within the first timeout period T1, the executor performs the protection switching operation according to the packet sent by the standby controller.
  • The steady state of the protocol is defined as follows: a second timeout period T2 is set, and if, after receiving a packet sent by a controller, the executor receives no further packet from the controllers within the second timeout period T2, the protection protocol of the protection group is considered to be in steady state.
  • The switching condition is: with the protocol in steady state, the current active/standby relationship information contains a standby controller whose packet count has reached the first threshold N, and the packet count of the primary controller is less than N.
  • Performing the active/standby switchover includes: the executor compares the packet counts of the standby controllers in the active/standby relationship information; the standby controller with the highest count becomes the primary controller, the original primary controller becomes a standby controller, and the packet count of every controller in the active/standby relationship information is cleared to 0.
  • the method further includes: the executor checks the protocol control information and the protocol state information of each controller, and if there is any inconsistency, performs protocol synchronization or protocol reset.
  • The check works as follows: each time the protocol reaches steady state, the executor takes the protocol control information and the protocol status information in the last packet received from each controller, and groups the controllers whose protocol control information and protocol status information are identical into one class.
  • If one class contains the largest number of controllers, the executor sends a multicast packet to every controller. The multicast packet contains a protocol synchronization command and the protocol control and status information of that class. Each controller that receives the multicast packet checks its own protocol control information and protocol status information against it; if they are inconsistent, it synchronizes its protection protocol, and if they are consistent, it ignores the packet.
  • If no class contains the largest number of controllers, the multicast packet sent to the controllers contains only a protocol reset command. After a controller resets its protection protocol, it re-queries the detector for network faults and executes the protocol again.
  • a device for hot backup of a controller in a distributed protection comprising: a receiving module, a determining module, a maintenance module, a message execution module, and an active/standby switching module;
  • the receiving module is configured to: receive a message sent by each controller;
  • the determining module is configured to: determine, according to the serial number of each packet, a packet triggered by the same protocol;
  • the maintenance module is configured to maintain the master/slave relationship of each controller through packets triggered by the same protocol.
  • the message execution module is configured to: execute a message sent by the main controller;
  • the active/standby switching module is configured to: perform the active/standby switchover when the protocol is in steady state and the switching condition is determined to be reached.
  • the message sent by each controller includes: protocol control information, protocol status information, and sequence number.
  • the maintenance module is configured to: take the controller whose packet is received first for the first protocol trigger as the primary controller and the other controllers as standby controllers; and, each time the protocol is triggered, record the controller whose packet is received first and add 1 to the packet count of that controller in the active/standby relationship information.
  • the message execution module is further configured to: when the packet sent by the standby controller is received first, if the packet sent by the primary controller has not been received within the first timeout period T1, The protection switching operation is performed according to the packet sent by the standby controller.
  • the switching condition is: with the protocol in steady state, the current active/standby relationship information contains a standby controller whose packet count has reached the first threshold N, and the packet count of the primary controller is less than N;
  • the active/standby switching module is configured to: when the switching condition is determined to be reached, compare the packet counts of the standby controllers in the active/standby relationship information, turn the standby controller with the highest count into the primary controller and the original primary controller into a standby controller, and clear the packet count of every controller in the active/standby relationship information to 0.
  • the device further includes: a verification module, configured to: verify protocol control information and protocol status information of each controller, and if there is any inconsistency, perform protocol synchronization or protocol reset.
  • the verification module is configured to: each time the protocol reaches steady state, take the protocol control information and the protocol status information in the last packet received from each controller, and group the controllers whose protocol control information and protocol status information are identical into one class.
  • If one class contains the largest number of controllers, a multicast packet is sent to every controller. The multicast packet contains a protocol synchronization command and the protocol control and status information of that class, so that each controller that receives the multicast packet checks its own protocol control information and protocol status information, synchronizes its protection protocol if they are inconsistent, and ignores the packet if they are consistent.
  • If no class contains the largest number of controllers, the multicast packet sent to the controllers contains only a protocol reset command, so that each controller resets its protection protocol and re-queries the detector for network faults to execute the protocol again.
  • An actuator comprising the above-described device for hot backup of a controller in distributed protection.
  • a system for hot backup of a controller in distributed protection comprising: an actuator, a detector, and a plurality of controllers;
  • the detector is configured to: after performing fault detection, notify each controller of the network failure by sending a multicast packet;
  • the controller is configured to: execute the protection protocol separately after receiving the multicast packet, and send a packet to the executor after the protection protocol is executed, where the packet sent to the executor includes protocol control information, protocol status information, and a sequence number;
  • the executor is configured to: receive the packets sent by each controller, determine from the sequence number of each packet which packets were triggered by the same protocol trigger, maintain the active/standby relationship of the controllers through the packets triggered by the same protocol trigger, execute the packet sent by the primary controller, and perform the active/standby switchover when the protocol is in steady state and the switching condition is determined to be reached.
  • An embodiment of the present invention provides a method and a device for hot backup of a controller in a distributed protection.
  • The executor receives the packet sent by each controller, determines from the sequence number of each packet which packets were triggered by the same protocol trigger, maintains the active/standby relationship of the controllers through the packets triggered by the same protocol trigger, and executes the packet sent by the primary controller. When the protocol is in steady state and the switching condition is determined to be reached, the executor performs the active/standby switchover.
  • As a result, every controller runs the protocol normally, the controllers do not need to maintain the active/standby relationship or the active/standby state among themselves, and the actuator can synchronize the protection protocol, so primary/standby protocol synchronization between the controllers is not required.
  • FIG. 1 is a schematic diagram of a system architecture in which a main protocol controller and a standby protocol controller are both normally operated to implement 1:1 hot backup;
  • FIG. 2 is a schematic diagram of a system architecture in which the primary protocol controller is in normal operation, and the standby protocol controller does not run to implement 1:1 hot backup;
  • FIG. 3 is a schematic flowchart of a method for hot backup of a controller in distributed protection according to an embodiment of the present invention;
  • FIG. 4 is a schematic structural diagram of an apparatus for hot backup of a controller in distributed protection according to an embodiment of the present invention;
  • FIG. 5 is a schematic structural diagram of a system for hot backup of a controller in distributed protection according to an embodiment of the present invention.
  • In the embodiments of the present invention, the executor receives the packet sent by each controller, determines from the sequence number of each packet which packets were triggered by the same protocol trigger, maintains the active/standby relationship of the controllers through the packets triggered by the same protocol trigger, and executes the packet sent by the primary controller. When the protocol is in steady state and the switching condition is determined to be reached, the actuator performs the active/standby switchover.
  • the embodiment of the invention implements a method for hot backup of a controller in distributed protection. As shown in FIG. 3, the method includes the following steps:
  • Step 301: The executor receives the packet sent by each controller.
  • Before this step, after performing fault detection, the detector sends a multicast packet to notify each controller of the network fault. After receiving the multicast packet, each controller separately executes the protection protocol and, after executing it, sends a packet to the executor.
  • The multicast packet sent by the detector includes the network fault information and a sequence number that uniquely identifies the fault information triggering this protocol execution. The sequence number can be generated as a timestamp or as a value incremented by one each time.
  • The packet sent by a controller to the executor includes: protocol control information, protocol status information, and a sequence number. The protocol control information is the command, sent by the controller to the executor, that actually switches the cross-connect or switch of the service board.
  • The protocol status information is a set of key protocol variable values. The sequence number is taken directly from the multicast packet sent by the detector.
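The two packet types described above can be sketched as plain data records. This is a minimal illustration only: the class names, field names, fault strings, and the placeholder `run_protocol` logic are assumptions made for the example, not part of the patent text. The sequence number uses the "incremented by one each time" option.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class FaultNotice:
    """Multicast packet from the detector (hypothetical field names)."""
    seq: int     # uniquely identifies the fault report that triggers the protocol
    fault: str   # network fault information


@dataclass(frozen=True)
class ControllerMessage:
    """Packet a controller sends to the executor after running the protocol."""
    controller_id: int
    seq: int         # copied unchanged from the FaultNotice
    control: str     # protocol control information (the switch command)
    state: tuple     # protocol status information (key protocol variables)


_seq_source = count(1)  # the "incremented by one each time" option


def report_fault(fault: str) -> FaultNotice:
    """Detector side: stamp each fault report with a fresh sequence number."""
    return FaultNotice(seq=next(_seq_source), fault=fault)


def run_protocol(controller_id: int, notice: FaultNotice) -> ControllerMessage:
    """Controller side: stand-in for the real protection protocol calculation."""
    control = "switch-to-protection" if notice.fault == "SF-working" else "no-request"
    return ControllerMessage(controller_id, notice.seq, control, (notice.fault,))


notice = report_fault("SF-working")
messages = [run_protocol(cid, notice) for cid in (1, 2, 3)]
# All three messages carry the same sequence number, so the executor can
# recognise them as answers to the same protocol trigger.
same_trigger = len({m.seq for m in messages}) == 1
```

Because each controller copies the detector's sequence number instead of generating its own, the executor needs no clock agreement between controllers to match up their answers.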
  • Step 302: The executor determines from the sequence number of each packet which packets were triggered by the same protocol trigger, and maintains the active/standby relationship of the controllers through the packets triggered by the same protocol trigger.
  • The controller whose packet is received first for the first protocol trigger is taken as the primary controller, and the other controllers are standby controllers.
  • Each time the protocol is triggered, the controller whose packet is received first is recorded, and the packet count of that controller in the active/standby relationship information is incremented by one.
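The bookkeeping in step 302 can be sketched as a small table keyed by controller. The class and method names below are hypothetical, and the sketch assumes the executor may remember every sequence number it has already attributed (a real implementation would presumably age these out).

```python
class RelationTable:
    """Executor-side active/standby relationship information (illustrative)."""

    def __init__(self, controller_ids):
        self.first_count = {cid: 0 for cid in controller_ids}
        self.primary = None   # unset until the first protocol trigger
        self._seen = set()    # sequence numbers already attributed

    def on_message(self, controller_id, seq):
        """Record a packet; return True if it is the first one for `seq`."""
        if seq in self._seen:
            return False      # a later answer to the same protocol trigger
        self._seen.add(seq)
        self.first_count[controller_id] += 1
        if self.primary is None:
            # The first responder to the first trigger becomes the primary.
            self.primary = controller_id
        return True

    def role(self, controller_id):
        """'M' for the primary controller, 'S' for a standby controller."""
        return "M" if controller_id == self.primary else "S"


table = RelationTable([1, 2, 3])
table.on_message(1, seq=100)   # controller 1 answers trigger 100 first
table.on_message(2, seq=100)   # same trigger, arrives later: not counted
table.on_message(2, seq=101)   # controller 2 answers trigger 101 first
```

Note that the primary is fixed by the first trigger only; later first arrivals change the counts but not the roles until a switchover is performed.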
  • Step 303: The executor executes the packet sent by the primary controller.
  • The actuator executes only the packet sent by the primary controller and does not execute the packets sent by the standby controllers. A first timeout period T1 is preset: when the actuator receives the packet sent by a standby controller first, and the packet sent by the primary controller has not been received within the first timeout period T1, the protection switching operation is performed according to the packet sent by the standby controller.
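The T1 fallback in step 303 can be sketched with explicit timestamps instead of real timers, which keeps the example deterministic. The class name, the millisecond units, and the polling method `on_tick` are assumptions for illustration; the source only specifies the rule itself.

```python
class SwitchExecutor:
    """Executes the primary's command, or a standby's after T1 (illustrative)."""

    def __init__(self, primary, t1_ms):
        self.primary = primary
        self.t1_ms = t1_ms
        self._pending = {}   # seq -> (deadline_ms, command held from a standby)
        self._done = set()   # sequence numbers already executed

    def on_message(self, controller_id, seq, command, now_ms):
        """Return the command to execute now, or None if nothing is executed."""
        if seq in self._done:
            return None
        if controller_id == self.primary:
            self._pending.pop(seq, None)
            self._done.add(seq)
            return command
        # Standby packet: hold the first one and start the T1 window.
        self._pending.setdefault(seq, (now_ms + self.t1_ms, command))
        return None

    def on_tick(self, now_ms):
        """Return held standby commands whose T1 window has expired."""
        expired = [s for s, (deadline, _) in self._pending.items() if now_ms >= deadline]
        commands = []
        for seq in expired:
            _, command = self._pending.pop(seq)
            self._done.add(seq)
            commands.append(command)
        return commands


ex = SwitchExecutor(primary=1, t1_ms=50)
held = ex.on_message(2, seq=7, command="switch", now_ms=0)          # standby first: held
from_primary = ex.on_message(1, seq=7, command="switch", now_ms=20) # primary within T1
ex.on_message(3, seq=8, command="switch", now_ms=1000)              # standby first again
fallback = ex.on_tick(now_ms=1050)                                  # primary silent for T1
```

Marking a sequence number as done after either path means a late packet from the primary cannot cause the same switch to be executed twice.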
  • Step 304: When the protocol is in steady state and the executor determines that the switching condition is reached, the executor performs the active/standby switchover.
  • The steady state of the protocol is defined as follows: a second timeout period T2 is preset, and if, after receiving a packet sent by a controller, the executor receives no further packet from the controllers within the second timeout period T2, the protection protocol of the protection group is considered to be in steady state.
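The T2 rule amounts to a quiet-period check on the time of the last received packet. A minimal sketch, assuming millisecond timestamps and treating "no packet received yet" as not steady (the source does not say what applies before the first packet):

```python
class SteadyStateMonitor:
    """Tracks whether the protection protocol has been quiet for T2 (illustrative)."""

    def __init__(self, t2_ms):
        self.t2_ms = t2_ms
        self.last_rx_ms = None

    def on_message(self, now_ms):
        """Call for every packet received from any controller."""
        self.last_rx_ms = now_ms

    def is_steady(self, now_ms):
        """True once T2 has elapsed since the last packet from any controller."""
        if self.last_rx_ms is None:
            return False  # assumption: undefined in the source before any packet
        return now_ms - self.last_rx_ms >= self.t2_ms


monitor = SteadyStateMonitor(t2_ms=200)
monitor.on_message(now_ms=0)
monitor.on_message(now_ms=150)   # a further packet restarts the quiet period
```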
  • The switching condition is: with the protocol in steady state, the current active/standby relationship information contains a standby controller whose packet count has reached the first threshold N, and the packet count of the primary controller is less than N.
  • Performing the active/standby switchover includes: the executor compares the packet counts of the standby controllers in the active/standby relationship information; the standby controller with the highest count becomes the primary controller, the original primary controller becomes a standby controller, and the packet count of every controller in the active/standby relationship information is cleared to 0.
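The switching condition and the switchover itself reduce to a comparison over the per-controller counts. The function below is a sketch under stated assumptions: the name `maybe_switch` is hypothetical, the caller is expected to invoke it only in protocol steady state, and ties between standby controllers are broken by dictionary order because the source does not specify a tie rule.

```python
def maybe_switch(primary, counts, n):
    """Return (new_primary, new_counts) after applying the switching rule.

    counts maps controller id -> number of protocol triggers for which that
    controller's packet arrived first. Intended to be called only while the
    protocol is in steady state.
    """
    standby = {cid: k for cid, k in counts.items() if cid != primary}
    condition = counts[primary] < n and any(k >= n for k in standby.values())
    if not condition:
        return primary, dict(counts)
    # The standby with the highest count becomes primary (ties: first in dict order).
    new_primary = max(standby, key=standby.get)
    # Every controller's count is cleared so counting starts again.
    return new_primary, {cid: 0 for cid in counts}


switched = maybe_switch(primary=1, counts={1: 2, 2: 5, 3: 1}, n=5)
unchanged = maybe_switch(primary=1, counts={1: 0, 2: 4, 3: 1}, n=5)
```

With N = 5, the first call hands the primary role to controller 2 and resets all counts, while the second leaves controller 1 in place because no standby has reached the threshold.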
  • The method further includes: the executor verifies the protocol control information and the protocol status information of each controller and, if there is any inconsistency, performs protocol synchronization or protocol reset.
  • Each time the protocol reaches steady state, the actuator takes the protocol control information and the protocol status information in the last packet received from each controller, and groups the controllers whose protocol control information and protocol status information are identical into one class.
  • If one class contains the largest number of controllers, a multicast packet is sent to every controller. The multicast packet contains a protocol synchronization command and the protocol control and status information of that class. After receiving the multicast packet, each controller checks its own protocol control information and protocol status information against it; if they are inconsistent, it synchronizes its protection protocol, and if they are consistent, it ignores the packet.
  • If no class contains the largest number of controllers, the multicast packet sent to the controllers contains only a protocol reset command. Each controller then resets its protection protocol and re-queries the detector for network faults to execute the protocol again.
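The classification step can be sketched with a counter over the last reported (control, state) pair of each controller. The function name, the dictionary-shaped multicast packet, and the "SYNC"/"RESET" command strings are illustrative assumptions. The sketch also reads "a class with the largest number of controllers" as a strictly largest class, so a tie leads to a reset.

```python
from collections import Counter


def verify(last_reports):
    """Decide which multicast packet, if any, the executor should send.

    last_reports maps controller id -> (control, state) from the last packet
    received from that controller; both parts must be hashable.
    """
    classes = Counter(last_reports.values())
    if len(classes) <= 1:
        return None  # all controllers agree: nothing to do
    ranked = classes.most_common()
    if ranked[0][1] > ranked[1][1]:
        # One class is strictly the largest: synchronize everyone to it.
        control, state = ranked[0][0]
        return {"cmd": "SYNC", "control": control, "state": state}
    # No strictly largest class: ask every controller to reset its protocol.
    return {"cmd": "RESET"}


def needs_sync(own_report, packet):
    """Controller side: True if this controller must synchronize its protocol."""
    return packet["cmd"] == "SYNC" and own_report != (packet["control"], packet["state"])


reports = {
    1: ("switch", ("SF",)),
    2: ("switch", ("SF",)),
    3: ("idle", ("NR",)),
}
packet = verify(reports)
```

Here controllers 1 and 2 form the largest class, so the packet carries their information; controller 3 finds a mismatch and synchronizes, while controllers 1 and 2 ignore the packet.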
  • The embodiment of the present invention further provides a device for hot backup of a controller in distributed protection. The device is disposed on an actuator and, as shown in FIG. 4, includes: a receiving module 41, a determining module 42, a maintenance module 43, a message execution module 44, and an active/standby switching module 45;
  • the receiving module 41 can be implemented by an interface of the executor, and configured to: receive a message sent by each controller;
  • the determining module 42 may be implemented by a processor of the executor, and configured to: determine, according to a sequence number of each packet, a message triggered by the same protocol;
  • the maintenance module 43 can be implemented by the memory of the executor, and is configured to maintain the active/standby relationship of each controller by using the packet triggered by the same protocol.
  • the message execution module 44 can be implemented by a processor of the executor, and configured to: execute a message sent by the main controller;
  • the active/standby switching module 45 can be implemented by the processor of the executor, and is configured to determine that the switching condition is reached and perform the active/standby switching when the protocol is in steady state.
  • The packet sent by each controller includes: protocol control information, protocol status information, and a sequence number. The protocol control information is the command, sent by the controller to the executor, that actually switches the cross-connect or switch of the service board.
  • The protocol status information is a set of key protocol variable values, and the sequence number is taken directly from the multicast packet sent by the detector.
  • The maintenance module 43 takes the controller whose packet is received first for the first protocol trigger as the primary controller and the other controllers as standby controllers. Each time the protocol is triggered, it records the controller whose packet is received first and adds 1 to the packet count of that controller in the active/standby relationship information.
  • The message execution module 44 executes only the packet sent by the primary controller and does not execute the packets sent by the standby controllers. A first timeout period T1 is preset: when the actuator receives the packet sent by a standby controller first, and the packet sent by the primary controller has not been received within the first timeout period T1, the protection switching operation is performed according to the packet sent by the standby controller.
  • The steady state of the protocol is defined as follows: a second timeout period T2 is preset, and if, after receiving a packet sent by a controller, the executor receives no further packet from the controllers within the second timeout period T2, the protection protocol of the protection group is considered to be in steady state.
  • The switching condition is: with the protocol in steady state, the current active/standby relationship information contains a standby controller whose packet count has reached the first threshold N, and the packet count of the primary controller is less than N.
  • The active/standby switching module 45 compares the packet counts of the standby controllers in the active/standby relationship information; the standby controller with the highest count becomes the primary controller, the original primary controller becomes a standby controller, and the packet count of every controller in the active/standby relationship information is cleared to 0.
  • the device further includes: a verification module 46, configured to: verify protocol control information and protocol status information of each controller, and if there is any inconsistency, perform protocol synchronization or protocol reset;
  • Each time the protocol reaches steady state, the verification module 46 takes the protocol control information and the protocol status information in the last packet received from each controller, and groups the controllers whose protocol control information and protocol status information are identical into one class.
  • If one class contains the largest number of controllers, a multicast packet is sent to every controller. The multicast packet contains a protocol synchronization command and the protocol control and status information of that class, so that each controller that receives the multicast packet checks its own protocol control information and protocol status information, synchronizes its protection protocol if they are inconsistent, and ignores the packet if they are consistent.
  • If no class contains the largest number of controllers, the multicast packet sent includes only a protocol reset command, so that each controller resets its protection protocol and re-queries the detector for network faults to execute the protocol again.
  • An embodiment of the present invention further provides an actuator, which includes the device for hot backup of the controller in distributed protection shown in FIG. 4.
  • the embodiment of the present invention further provides a system for hot backup of a controller in distributed protection.
  • The system includes: an executor 51, a detector 53, and a plurality of controllers 52 (including controller 52-1, controller 52-2, ... controller 52-n); wherein
  • the detector 53 is configured to: after performing fault detection, notify each controller 52 of the network failure by sending a multicast message;
  • the controller 52 is configured to: execute the protection protocol separately after receiving the multicast message, and send a message to the executor 51 after the protection protocol is executed, where the message sent to the executor 51 includes protocol control information, protocol status information, and a sequence number;
  • the executor 51 is configured to: receive the message sent by each controller 52, determine from the sequence number of each message which messages were triggered by the same protocol trigger, maintain the active/standby relationship of the controllers 52 through the messages triggered by the same protocol trigger, execute the message sent by the primary controller, and perform the active/standby switchover when the protocol is in steady state and the switching condition is determined to be reached.
  • The actuator 51 includes the device for hot backup of the controller in distributed protection shown in FIG. 4.
  • This embodiment is a master/standby controller switchover under 1:2 hot backup:
  • The network management system creates a new protection group for the distributed system and sends the protection group configuration to the controllers, the executor, and the detector. There are three controllers in total, each a hot backup of the others.
  • After receiving the configuration, the executor maintains the master/standby relation information of the three controllers in memory. This embodiment presents the master/standby relation information as a table.
  • After the detector reports a network fault to each controller for the first time and triggers protocol execution, each controller sends a switching control command message to the executor. Having received three identical control command messages, the executor finds that the one from controller 1 arrived first. It provisionally treats controller 1 as master, sets the master/standby relation field of controller 1 to M and that of the other controllers to S, and increments the message count of controller 1 by 1, as shown in Table 1.
  • Although controller 1 was chosen as master when the protection group was first established, after some time the executor selects controller 2 as master on merit, which improves switching efficiency.
  • The executor performs the master/standby switchover of the controllers:
  • the master/standby relation field of controller 2 is set to M;
  • that of controller 1 is set to S;
  • the message count of every controller is cleared to 0 and counting starts again.
  • This embodiment is the execution flow when the controllers' protection protocols are inconsistent under 1:2 hot backup.
  • The detector reports a working SF network fault, which triggers controllers 1 to 3 to execute the protocol separately; controller 1 is the master, and protocol synchronization is performed once the protocol reaches steady state.
  • WSF, RR, and similar terms are abbreviations of protocol state information.
  • The protocol control information in the message sent by controller 1 is Switch, and the protocol state information is WSF;
  • the protocol control information in the message sent by controller 2 is Switch, and the protocol state information is WSF;
  • the protocol control information in the message sent by controller 3 is Idle, and the protocol state information is RR.
  • There are two classes of protocol control and state information, Switch/WSF and Idle/RR: controllers 1 and 2 are in the Switch/WSF class and controller 3 is in the Idle/RR class. The Switch/WSF class therefore holds the majority.
  • This protocol execution takes Switch/WSF as the reference.
  • The executor sends a multicast packet to every controller.
  • The packet contains Switch/WSF and a protocol synchronization command.
  • After receiving the multicast packet, controllers 1 and 2 compare it with their own protocol control and state, find them consistent, and do nothing. Controller 3, on receiving it, compares it with its own protocol control and state Idle/RR and finds an inconsistency; it then synchronizes the protocol and sends its protocol control information and protocol state information to the executor again. The executor repeats the check. If the result is still inconsistent, either the software or hardware of controller 3 is faulty or controllers 1 and 2 both executed the protocol incorrectly, and an alarm or event is reported to the network management system for manual handling.
  • The detector reports a protection SD network fault, which triggers controllers 1 to 3 to execute the protocol separately; controller 1 is the master. Then:
  • the protocol control information in the message sent by controller 1 is Idle, and the protocol state information is RR;
  • the protocol control information in the message sent by controller 2 is Switch, and the protocol state information is RR;
  • the protocol control information in the message sent by controller 3 is Idle, and the protocol state information is PSD.
  • There are three classes of protocol control and state information, Idle/RR, Switch/RR, and Idle/PSD: controller 1 is in the Idle/RR class, controller 2 in the Switch/RR class, and controller 3 in the Idle/PSD class.
  • This protocol check cannot determine which class to take as the reference.
  • The executor sends a multicast packet to the controllers.
  • The packet contains only a protocol reset command.
  • After all three controllers have reset the protocol, they must re-query the detector's network fault information; once the protocol has been triggered and run, they send their protocol control information and state information to the executor again.
  • Optionally, all or some of the steps of the above embodiments may also be implemented using integrated circuits: the steps may each be made into a separate integrated circuit module, or several of the modules or steps may be made into a single integrated circuit module.
  • The apparatuses/function modules/functional units in the above embodiments may be implemented by general-purpose computing devices; they may be concentrated on a single computing device or distributed over a network of multiple computing devices.
  • When the apparatuses/function modules/functional units in the above embodiments are implemented as software function modules and sold or used as stand-alone products, they may be stored in a computer-readable storage medium.
  • The computer-readable storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
  • In protocol steady state, the executor can synchronize the protection protocol, so master/standby protocol synchronization between the controllers is not required.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Electromagnetism (AREA)
  • Hardware Redundancy (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Maintenance And Management Of Digital Transmission (AREA)
  • Safety Devices In Control Systems (AREA)

Abstract

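A method and apparatus for hot backup of controllers in distributed protection. The method includes: an executor receives the message sent by each controller (301); determines from the sequence number of each message which messages belong to the same protocol trigger, and maintains the master/standby relation of each controller using the messages of the same protocol trigger (302); executes the message sent by the master controller (303); and, when the protocol is in steady state and the switching condition is determined to be met, performs a master/standby switchover (304).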

Description

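Method and apparatus for hot backup of controllers in distributed protection
Technical Field
This application relates to, but is not limited to, hot backup technology for optical transmission networks.
Background
Automatic protection switching (APS) technology is widely used in optical networks. Within a device, the protection system is usually divided logically into four parts: the controller, the detector, the executor, and the protocol transmitter.
The controller is the core of APS. It receives input from the detector and the protocol transmitter, runs the protection protocol computation on that input, and issues switching commands to the executor; in bidirectional switching it also sends signaling to the peer end through the protocol transmitter.
The detector performs fault detection on the working or protection path and reports the result to the controller as input.
The executor receives the switching commands output by the controller and controls the switching of services.
The protocol transmitter is used in bidirectional switching configurations: at the sending end, the controller instructs the protocol transmitter to send the PCC signaling defined in the standard; at the receiving end, the protocol transmitter notifies the controller after receiving the peer's signaling.
In a centralized protection control system, the APS controller usually runs on an independent CPU within a network element, such as the main control board, while the detector, executor, and protocol transmitter are generally on the service boards. All protection groups of the network element are controlled by a single controller. The main advantage of this model is that the protection function is quick to implement and the protocol algorithm can be adjusted quickly. Its disadvantage is equally clear: because the processing capacity of the central CPU is limited, switching time suffers when the network carries many protected services.
In a distributed protection control system, the load is shared among multiple CPUs, which overcomes the disadvantage of the centralized approach: switching time is unaffected when many services need protection, and the failure of one controller no longer disables service protection for the whole network element. In this scenario the controller is moved down onto the service boards.
Whether centralized or distributed, the controller is the core component of automatic protection switching. If the controller hardware or software fails, switching cannot take place and services are interrupted accordingly. To improve the survivability of the optical transmission network, the controller needs redundant backup protection.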
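In a centralized protection system, the controller usually runs on the main control board, and the main control board itself uses 1+1/1:1 hot backup; that is, a network element with automatic protection switching uses two controllers, a master protocol controller and a standby protocol controller. To implement automatic protection switching with hot backup, the traditional 1:1 hot backup approach, shown in FIG. 1 and FIG. 2, commonly uses one of the following two methods:
1) Both the master and the standby protocol controller run normally and both receive the detector's input. After running the protocol, only the master protocol controller issues the switching command to the executor; the standby does not. The master and standby protocol controllers perform real-time and periodic protocol synchronization; if the protocols disagree, the master usually prevails. When the main control board hosting the master protocol controller is restarted, unplugged, or damaged and the standby protocol controller detects this, the standby becomes the new master and the former master becomes the standby. The main control boards must maintain the master/standby state.
2) Only the master protocol controller works normally; the standby protocol controller does not run. Either only the master receives the detector's input, or both receive it but the standby filters it out. The master computes the protocol and issues the switching command to the executor, and it performs real-time and periodic protocol synchronization toward the standby. The master/standby switchover procedure is the same as in method 1.
In a distributed protection system of the related art, if the network includes multiple interconnected nodes and the interconnected nodes act as controllers, multiple controllers run and master/standby hot backup is likewise needed to protect them. Applying the traditional hot backup methods of centralized protection has the following drawbacks:
(1) Distributed protection must maintain multiple master/standby controller relations and run multiple protocol synchronization processes, which places high demands on communication resources, whereas in centralized protection the master/standby state and all synchronization need to be maintained only between the master and standby main control boards.
(2) Master/standby switchover in distributed protection is not as timely as on main control boards. Master and standby main control boards generally have a dedicated channel for maintaining the master/standby state, whereas the service boards hosting distributed controllers can maintain it only through ad hoc inter-board communication.
(3) As networking grows more complex, distributed protection may involve multiple controllers acting as hot backups, for example when multiple nodes are interconnected. If there are more than two interconnected nodes and they act as controllers, the arrangement becomes 1:N hot backup, which the traditional 1+1/1:1 hot backup of centralized protection cannot support.
(4) Distributed protection switching time cannot reach its optimum. With the traditional master/standby arrangement, a standby controller may, for certain protection groups, execute more efficiently and issue switching commands faster, which would give a shorter protection switching time.
Therefore, in distributed protection systems, another method is needed to solve the problem of controller hot backup.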
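Summary
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of protection of the claims.
This document provides a method and apparatus for hot backup of controllers in distributed protection.
A method for hot backup of controllers in distributed protection, the method including:
an executor receives the message sent by each controller, determines from the sequence number of each message which messages belong to the same protocol trigger, maintains the master/standby relation of each controller using the messages of the same protocol trigger, and executes the message sent by the master controller; when the protocol is in steady state and the switching condition is determined to be met, the executor performs a master/standby switchover.
In the above solution, the message sent by each controller includes protocol control information, protocol state information, and a sequence number.
In the above solution, maintaining the master/standby relation of each controller using the messages of the same protocol trigger includes: taking the controller whose message is received first on the first protocol trigger as the master controller and the other controllers as standby controllers; and, at each protocol trigger, recording the controller whose message is received first and incrementing that controller's message count in the master/standby relation information by 1.
In the above solution, the method further includes: setting a first timeout T1; when the executor actually receives a message from a standby controller first, and no message from the master controller has been received within the first timeout T1, the protection switching operation is executed according to the message sent by the standby controller.
In the above solution, the protocol steady state is defined as follows: a second timeout T2 is set; after the executor receives a message from a controller, if it receives no message from any controller within the second timeout T2, the protection protocol of the protection group is considered to be in steady state;
the switching condition is: in protocol steady state, the current master/standby relation information contains a standby controller whose message count has reached a first threshold N, and the message count of the master controller is less than N.
In the above solution, performing the master/standby switchover includes: the executor compares the message counts of the standby controllers in the master/standby relation information; the standby controller with the highest count becomes the master controller, the former master controller becomes a standby controller, and the message count of every controller in the master/standby relation information is cleared to 0.
In the above solution, the method further includes: the executor verifies the protocol control information and protocol state information of each controller and, if there is any inconsistency, performs protocol synchronization or a protocol reset.
In the above solution, the verification by the executor includes: in protocol steady state, the executor takes the protocol control information and protocol state information from the message most recently received from each controller and groups controllers whose messages carry identical protocol control information and protocol state information into one class. If one class contains the largest number of controllers, the executor sends a multicast packet to every controller; the packet contains a protocol synchronization command and the protocol control and state information of that class. On receiving the multicast packet, each controller checks its own protocol control information and protocol state information against it, synchronizes the protection protocol if they are inconsistent, and ignores the packet if they are consistent. If no single class contains the largest number of controllers, the multicast packet sent to the controllers contains only a protocol reset command, and each controller resets the protection protocol and then re-queries the detector for the network fault in order to execute the protocol.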
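An apparatus for hot backup of controllers in distributed protection, the apparatus including a receiving module, a determining module, a maintenance module, a message execution module, and a master/standby switching module, wherein:
the receiving module is configured to receive the message sent by each controller;
the determining module is configured to determine from the sequence number of each message which messages belong to the same protocol trigger;
the maintenance module is configured to maintain the master/standby relation of each controller using the messages of the same protocol trigger;
the message execution module is configured to execute the message sent by the master controller;
the master/standby switching module is configured to perform a master/standby switchover when the protocol is in steady state and the switching condition is determined to be met.
In the above solution, the message sent by each controller includes protocol control information, protocol state information, and a sequence number.
In the above solution, the maintenance module is configured to: take the controller whose message is received first on the first protocol trigger as the master controller and the other controllers as standby controllers; and, at each protocol trigger, record the controller whose message is received first and increment that controller's message count in the master/standby relation information by 1.
In the above solution, the message execution module is further configured to: when a message from a standby controller is actually received first, and no message from the master controller has been received within the first timeout T1, execute the protection switching operation according to the message sent by the standby controller.
In the above solution, the switching condition is: in protocol steady state, the current master/standby relation information contains a standby controller whose message count has reached a first threshold N, and the message count of the master controller is less than N;
the master/standby switching module is configured to: when the switching condition is determined to be met, compare the message counts of the standby controllers in the master/standby relation information, make the standby controller with the highest count the master controller and the former master controller a standby controller, and clear the message count of every controller in the master/standby relation information to 0.
In the above solution, the apparatus further includes a verification module configured to verify the protocol control information and protocol state information of each controller and, if there is any inconsistency, perform protocol synchronization or a protocol reset.
In the above solution, the verification module is configured to: in protocol steady state, take the protocol control information and protocol state information from the message most recently received from each controller and group controllers whose messages carry identical protocol control information and protocol state information into one class; if one class contains the largest number of controllers, send a multicast packet to every controller, the packet containing a protocol synchronization command and the protocol control and state information of that class, so that each controller, on receiving the multicast packet, checks its own protocol control information and protocol state information against it, synchronizes the protection protocol if they are inconsistent, and ignores the packet if they are consistent; and, if no single class contains the largest number of controllers, send the controllers a multicast packet containing only a protocol reset command, so that each controller resets the protection protocol and then re-queries the detector for the network fault in order to execute the protocol.
An executor, including the above apparatus for hot backup of controllers in distributed protection.
A system for hot backup of controllers in distributed protection, the system including an executor, a detector, and a plurality of controllers, wherein:
the detector is configured to: after performing fault detection, notify each controller of the network fault by sending a multicast packet;
each controller is configured to: execute the protection protocol independently after receiving the multicast packet, and send a message to the executor once the protection protocol has been executed, the message sent to the executor including protocol control information, protocol state information, and a sequence number;
the executor is configured to: receive the message sent by each controller, determine from the sequence number of each message which messages belong to the same protocol trigger, maintain the master/standby relation of each controller using the messages of the same protocol trigger, execute the message sent by the master controller, and, when the protocol is in steady state and the switching condition is determined to be met, perform a master/standby switchover.
The embodiments of the present invention provide a method and apparatus for hot backup of controllers in distributed protection: the executor receives the message sent by each controller, determines from the sequence number of each message which messages belong to the same protocol trigger, maintains the master/standby relation of each controller using the messages of the same protocol trigger, and executes the message sent by the master controller; when the protocol is in steady state and the switching condition is determined to be met, the executor performs a master/standby switchover. In this way, every controller runs the protocol normally; the controllers themselves are not divided into master and standby and need not maintain master/standby state. Instead, the executor of each protection group dynamically maintains the master/standby relation of the controllers, which supports 1:N (N>=1) controller hot backup under distributed protection. In addition, in protocol steady state the executor can synchronize the protection protocol, so master/standby protocol synchronization between the controllers is not required.
Other aspects will become apparent upon reading and understanding the drawings and the detailed description.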
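Brief Description of the Drawings
FIG. 1 is a schematic diagram of a system architecture that implements 1:1 hot backup with both the master protocol controller and the standby protocol controller running normally;
FIG. 2 is a schematic diagram of a system architecture that implements 1:1 hot backup with the master protocol controller running normally and the standby protocol controller not running;
FIG. 3 is a schematic flowchart of the method for hot backup of controllers in distributed protection provided by an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of the apparatus for hot backup of controllers in distributed protection provided by an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of the system for hot backup of controllers in distributed protection provided by an embodiment of the present invention.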
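Detailed Description
In the embodiments of the present invention, the executor receives the message sent by each controller, determines from the sequence number of each message which messages belong to the same protocol trigger, maintains the master/standby relation of each controller using the messages of the same protocol trigger, and executes the message sent by the master controller; when the protocol is in steady state and the switching condition is determined to be met, the executor performs a master/standby switchover.
The embodiments of the present invention are described below with reference to the drawings.
An embodiment of the present invention implements a method for hot backup of controllers in distributed protection. As shown in FIG. 3, the method includes the following steps:
Step 301: The executor receives the message sent by each controller.
Before this step, the detector performs fault detection and notifies each controller of the network fault by sending a multicast packet; after receiving the multicast packet, each controller executes the protection protocol independently and then sends a message to the executor.
Here, the controllers of each protection group must join a multicast group. The multicast packet sent by the detector includes the network fault information and a sequence number that uniquely identifies the protocol execution triggered by this fault information; the sequence number may be generated from a timestamp or from a series incremented by 1 each time.
The message sent by a controller to the executor includes protocol control information, protocol state information, and a sequence number. The protocol control information is the command the controller sends to the executor for the actual cross-connect or switch changeover on the service board; the protocol state information is a set of key protocol variable values; the sequence number is taken directly from the multicast packet sent by the detector.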
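The message layout described in step 301 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the class names `FaultNotice` and `ControllerMessage`, the field names, and the `run_protocol` stand-in (which simply maps a fault string to a fixed control/state pair) are all assumptions introduced here. The only behavior taken from the text is that the detector stamps each notice with a unique sequence number (here the "incremented by 1" option) and that every controller copies that number into its own message.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class FaultNotice:
    """Multicast notice from the detector (hypothetical layout)."""
    fault: str  # network fault information, e.g. "working SF"
    seq: int    # uniquely identifies this protocol trigger


@dataclass(frozen=True)
class ControllerMessage:
    """Message from a controller to the executor (hypothetical layout)."""
    controller_id: int
    control: str  # protocol control information, e.g. "Switch" or "Idle"
    state: str    # protocol state information, e.g. "WSF" or "RR"
    seq: int      # copied unchanged from the detector's notice


# Sequence numbers drawn from a series incremented by 1; a timestamp is the
# other option the text mentions.
_next_seq = count(1)


def make_notice(fault: str) -> FaultNotice:
    """Detector side: wrap a fault report with a fresh sequence number."""
    return FaultNotice(fault=fault, seq=next(_next_seq))


def run_protocol(controller_id: int, notice: FaultNotice) -> ControllerMessage:
    """Controller side: placeholder for the real protection protocol (assumption)."""
    if notice.fault == "working SF":
        control, state = "Switch", "WSF"
    else:
        control, state = "Idle", "RR"
    return ControllerMessage(controller_id, control, state, notice.seq)
```

Because every controller echoes the detector's sequence number, the executor can group the messages of one trigger without the controllers coordinating with each other.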
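Step 302: The executor determines from the sequence number of each message which messages belong to the same protocol trigger, and maintains the master/standby relation of each controller using the messages of the same protocol trigger.
Specifically, the executor determines from the sequence number of each message which messages belong to the same protocol trigger. It takes the controller whose message is received first on the first protocol trigger as the master controller and the other controllers as standby controllers. At each protocol trigger, it records the controller whose message is received first and increments that controller's message count in the master/standby relation information by 1.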
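The bookkeeping of step 302 can be sketched as a small table keyed by controller. This is a hedged illustration: the class name `RelationTable`, its attribute names, and the choice to start every controller as "S" until the first message arrives are assumptions; the source only specifies that the first sender on the first trigger becomes master and that the first sender of each trigger has its count incremented.

```python
class RelationTable:
    """Executor-side master/standby relation information (hypothetical structure)."""

    def __init__(self, controller_ids):
        # Assumption: no master exists until the first trigger is observed.
        self.roles = {cid: "S" for cid in controller_ids}
        self.first_counts = {cid: 0 for cid in controller_ids}
        # Sequence numbers already seen; a real system would bound this set.
        self._seen = set()

    @property
    def master(self):
        """Return the current master's id, or None before the first trigger."""
        return next((c for c, r in self.roles.items() if r == "M"), None)

    def on_message(self, controller_id, seq):
        """Record a message; return True if it was the first one for its trigger."""
        if seq in self._seen:
            return False  # a later arrival for an already counted trigger
        self._seen.add(seq)
        if self.master is None:
            # First message of the very first trigger decides the initial master.
            self.roles[controller_id] = "M"
        self.first_counts[controller_id] += 1
        return True
```

Feeding it three messages with the same sequence number, controller 1 first, reproduces the state of Table 1 in Embodiment 1: roles M/S/S and counts 1/0/0.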
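Step 303: The executor executes the message sent by the master controller.
Each time, the executor actually executes only the message sent by the master controller and does not execute the messages sent by the standby controllers. A first timeout T1 is set in advance: when the executor actually receives a message from a standby controller first, and no message from the master controller has been received within the first timeout T1, the protection switching operation is executed according to the message sent by the standby controller.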
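The T1 rule of step 303 can be sketched as a pure function over recorded arrival times, which avoids real timers. The function name `choose_message`, the `(time, controller_id)` tuple format, and the treatment of an arrival exactly at T1 as "within T1" are assumptions made for this illustration.

```python
def choose_message(arrivals, master, t1):
    """Pick whose message the executor acts on for one protocol trigger.

    arrivals: list of (arrival_time_seconds, controller_id) for one sequence
              number, in any order (hypothetical format).
    master:   id of the current master controller.
    t1:       first timeout T1 in seconds.
    Returns the controller id whose message is executed, or None if no
    message arrived.
    """
    ordered = sorted(arrivals)
    if not ordered:
        return None
    first_time, first_id = ordered[0]
    if first_id == master:
        return master
    # A standby arrived first: wait up to T1 for the master's message.
    for arrival_time, controller_id in ordered:
        if controller_id == master and arrival_time - first_time <= t1:
            return master
    # The master stayed silent for T1, so fall back to the first standby.
    return first_id
```

With controller 1 as master and T1 = 5 ms, a master message arriving 2 ms after a standby's is still the one executed, whereas a master message arriving after 10 ms (or never) leaves the standby's message in effect.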
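Step 304: When the protocol is in steady state and the executor determines that the switching condition is met, it performs a master/standby switchover.
In this step, the protocol steady state is defined as follows: a second timeout T2 is set in advance; after the executor receives a message from a controller, if it receives no message from any controller within the second timeout T2, the protection protocol of the protection group is considered to be in steady state.
The switching condition is: in protocol steady state, the current master/standby relation information contains a standby controller whose message count has reached a first threshold N, and the message count of the master controller is less than N.
Performing the master/standby switchover includes: the executor compares the message counts of the standby controllers in the master/standby relation information; the standby controller with the highest count becomes the master controller, the former master controller becomes a standby controller, and the message count of every controller in the master/standby relation information is cleared to 0.
Here, in protocol steady state, if the message count of the master controller reaches N first, no master/standby switchover is performed, and the message count of every controller in the master/standby relation information is cleared to 0.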
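The steady-state test and the switchover rule of step 304 can be sketched as two functions. The names `is_steady` and `maybe_switch`, the dictionary layout, and the tie-break between equally ranked standbys (the first in dictionary order wins) are assumptions; the counts in the usage note are hypothetical, since the source gives only controller 2's count of 10 and N=10.

```python
def is_steady(last_rx, now, t2):
    """Steady state: no controller message received for at least T2 seconds."""
    return now - last_rx >= t2


def maybe_switch(roles, counts, n):
    """Apply the switchover rule once the protocol is in steady state.

    roles:  {controller_id: "M" or "S"}, exactly one "M" (mutated in place).
    counts: {controller_id: times its message arrived first} (mutated in place).
    n:      first threshold N.
    Returns the id of the master after the rule has been applied.
    """
    master = next(c for c, r in roles.items() if r == "M")
    standbys = [c for c in roles if c != master]

    if counts[master] >= n:
        # The master reached N first: keep it and restart the count.
        for c in counts:
            counts[c] = 0
        return master

    if any(counts[c] >= n for c in standbys):
        # Promote the standby with the highest count (ties: first listed).
        best = max(standbys, key=lambda c: counts[c])
        roles[master], roles[best] = "S", "M"
        for c in counts:
            counts[c] = 0
        return best

    return master  # threshold not reached; nothing changes
```

For example, with controller 1 as master, hypothetical counts `{1: 4, 2: 10, 3: 2}`, and N=10, the call promotes controller 2 and zeroes all counts, which matches the end state shown in Table 3 of Embodiment 1.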
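The above method further includes: the executor verifies the protocol control information and protocol state information of each controller and, if there is any inconsistency, performs protocol synchronization or a protocol reset.
Specifically, in protocol steady state, the executor takes the protocol control information and protocol state information from the message most recently received from each controller and groups controllers whose messages carry identical protocol control information and protocol state information into one class. If one class contains the largest number of controllers, the executor sends a multicast packet to every controller; the packet contains a protocol synchronization command and the protocol control and state information of that class. On receiving the multicast packet, each controller checks its own protocol control information and protocol state information against it, synchronizes the protection protocol if they are inconsistent, and ignores the packet if they are consistent. If no single class contains the largest number of controllers, the multicast packet sent to the controllers contains only a protocol reset command, and each controller resets the protection protocol and then re-queries the detector for the network fault in order to execute the protocol.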
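The executor's verification step amounts to a plurality vote over (control, state) pairs. The sketch below is illustrative only: the function name `verify`, the `("sync", pair)` / `("reset", None)` return convention, and the input dictionary format are assumptions. It reads "one class contains the largest number of controllers" as "there is a unique largest class", so a tie for first place leads to a reset.

```python
from collections import Counter


def verify(latest):
    """Decide which multicast command the executor should send.

    latest: {controller_id: (control, state)} taken from the most recent
            message of each controller (hypothetical format, non-empty).
    Returns ("sync", (control, state)) when one class is strictly the
    largest, otherwise ("reset", None).
    """
    if not latest:
        raise ValueError("no controller messages to verify")
    # Group controllers by identical (control, state) and rank the classes.
    ranked = Counter(latest.values()).most_common()
    top_pair, top_size = ranked[0]
    if len(ranked) > 1 and ranked[1][1] == top_size:
        # No unique largest class: the executor cannot pick a reference.
        return ("reset", None)
    return ("sync", top_pair)
```

Applied to the two scenarios of Embodiment 2, the Switch/WSF, Switch/WSF, Idle/RR case yields a sync command carrying Switch/WSF, and the Idle/RR, Switch/RR, Idle/PSD case yields a reset.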
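To implement the above method, an embodiment of the present invention further provides an apparatus for hot backup of controllers in distributed protection. The apparatus is located on the executor and, as shown in FIG. 4, includes a receiving module 41, a determining module 42, a maintenance module 43, a message execution module 44, and a master/standby switching module 45, wherein:
the receiving module 41, which may be implemented by an interface of the executor, is configured to receive the message sent by each controller;
the determining module 42, which may be implemented by a processor of the executor, is configured to determine from the sequence number of each message which messages belong to the same protocol trigger;
the maintenance module 43, which may be implemented by a memory of the executor, is configured to maintain the master/standby relation of each controller using the messages of the same protocol trigger;
the message execution module 44, which may be implemented by a processor of the executor, is configured to execute the message sent by the master controller;
the master/standby switching module 45, which may be implemented by a processor of the executor, is configured to perform a master/standby switchover when the protocol is in steady state and the switching condition is determined to be met.
The message sent by each controller includes protocol control information, protocol state information, and a sequence number. The protocol control information is the command the controller sends to the executor for the actual cross-connect or switch changeover on the service board; the protocol state information is a set of key protocol variable values; the sequence number is taken directly from the multicast packet sent by the detector.
The maintenance module 43 takes the controller whose message is received first on the first protocol trigger as the master controller and the other controllers as standby controllers; at each protocol trigger, it records the controller whose message is received first and increments that controller's message count in the master/standby relation information by 1.
Each time, the message execution module 44 actually executes only the message sent by the master controller and does not execute the messages sent by the standby controllers. A first timeout T1 is set in advance: when the executor actually receives a message from a standby controller first, and no message from the master controller has been received within the first timeout T1, the protection switching operation is executed according to the message sent by the standby controller.
The protocol steady state is defined as follows: a second timeout T2 is set in advance; after the executor receives a message from a controller, if it receives no message from any controller within the second timeout T2, the protection protocol of the protection group is considered to be in steady state.
The switching condition is: in protocol steady state, the current master/standby relation information contains a standby controller whose message count has reached a first threshold N, and the message count of the master controller is less than N.
When it determines that the switching condition is met, the master/standby switching module 45 compares the message counts of the standby controllers in the master/standby relation information, makes the standby controller with the highest count the master controller and the former master controller a standby controller, and clears the message count of every controller in the master/standby relation information to 0.
The apparatus further includes a verification module 46 configured to verify the protocol control information and protocol state information of each controller and, if there is any inconsistency, perform protocol synchronization or a protocol reset.
In protocol steady state, the verification module takes the protocol control information and protocol state information from the message most recently received from each controller and groups controllers whose messages carry identical protocol control information and protocol state information into one class. If one class contains the largest number of controllers, it sends a multicast packet to every controller; the packet contains a protocol synchronization command and the protocol control and state information of that class, so that each controller, on receiving the multicast packet, checks its own protocol control information and protocol state information against it, synchronizes the protection protocol if they are inconsistent, and ignores the packet if they are consistent. If no single class contains the largest number of controllers, the multicast packet sent to the controllers contains only a protocol reset command, so that each controller resets the protection protocol and then re-queries the detector for the network fault in order to execute the protocol.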
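Based on the above apparatus, an embodiment of the present invention further provides an executor, which includes the apparatus for hot backup of controllers in distributed protection shown in FIG. 4.
Based on the above executor, an embodiment of the present invention further provides a system for hot backup of controllers in distributed protection. As shown in FIG. 5, the system includes an executor 51, a detector 53, and a plurality of controllers 52 (controller 52-1, controller 52-2, ..., controller 52-n), wherein:
the detector 53 is configured to: after performing fault detection, notify each controller 52 of the network fault by sending a multicast packet;
the controller 52 is configured to: execute the protection protocol independently after receiving the multicast packet, and send a message to the executor 51 once the protection protocol has been executed, the message sent to the executor 51 including protocol control information, protocol state information, and a sequence number;
the executor 51 is configured to: receive the message sent by each controller 52, determine from the sequence number of each message which messages belong to the same protocol trigger, maintain the master/standby relation of each controller 52 using the messages of the same protocol trigger, execute the message sent by the master controller, and, when the protocol is in steady state and the switching condition is determined to be met, perform a master/standby switchover;
the executor 51 includes the apparatus for hot backup of controllers in distributed protection shown in FIG. 4.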
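Embodiment 1
This embodiment is a master/standby controller switchover under 1:2 hot backup:
The network management system creates a new protection group for the distributed system and sends the protection group configuration to the controllers, the executor, and the detector. There are three controllers in total, each a hot backup of the others. After receiving the configuration, the executor maintains the master/standby relation information of the three controllers in memory; this embodiment presents the master/standby relation information as a table.
After the detector reports a network fault to each controller for the first time and triggers protocol execution, each controller sends a switching control command message to the executor. Having received three identical control command messages, the executor finds that the one from controller 1 arrived first. It provisionally treats controller 1 as master, sets the master/standby relation field of controller 1 to M and that of the other controllers to S, and increments the message count of controller 1 by 1, as shown in Table 1.
Controller No.   Master/standby relation   Message count
1                M                         1
2                S                         0
3                S                         0
Table 1
After the detector has reported network fault information many times, the executor has also received many control command messages. As shown in Table 2, the recorded number of times the message from controller 2 was received first has reached 10; this embodiment assumes N=10, so the condition for a master/standby switchover is met. The most likely reasons that the message from controller 2 was most often received first are that the communication channel between the executor and controller 2 is the most efficient or that this protection group executes most efficiently on controller 2. Although controller 1 was chosen as master when the protection group was first established, after some time the executor selects controller 2 as master on merit, which improves switching efficiency.
[Table 2 appears in the source only as images PCTCN2015097578-appb-000001 and PCTCN2015097578-appb-000002; its contents are not reproduced here.]
Table 2
As shown in Table 3, the executor has performed the master/standby switchover of the controllers: in the table, the master/standby relation field of controller 2 is updated to M and that of controller 1 to S, and the message count of every controller is cleared to 0 so that counting starts again.
Controller No.   Master/standby relation   Message count
1                S                         0
2                M                         0
3                S                         0
Table 3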
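Embodiment 2
This embodiment is the execution flow when the controllers' protection protocols are inconsistent under 1:2 hot backup.
This embodiment takes as an example a scenario in which the protocol control information and protocol state information become inconsistent under 1:2 controller hot backup. The detector reports a working SF network fault, which triggers controllers 1 to 3 to execute the protocol separately; controller 1 is the master, and protocol synchronization is performed once the protocol reaches steady state.
WSF, RR, and similar terms are abbreviations of protocol state information.
1) The protocol control information in the message sent by controller 1 is Switch, and the protocol state information is WSF;
2) the protocol control information in the message sent by controller 2 is Switch, and the protocol state information is WSF;
3) the protocol control information in the message sent by controller 3 is Idle, and the protocol state information is RR.
There are two classes of protocol control and state information, Switch/WSF and Idle/RR: controllers 1 and 2 are in the Switch/WSF class and controller 3 is in the Idle/RR class. The Switch/WSF class therefore holds the majority, and this protocol execution takes Switch/WSF as the reference. The executor sends a multicast packet to every controller; the packet contains Switch/WSF and a protocol synchronization command.
After receiving the multicast packet, controllers 1 and 2 compare it with their own protocol control and state, find them consistent, and do nothing. Controller 3, on receiving it, compares it with its own protocol control and state Idle/RR and finds an inconsistency; it then synchronizes the protocol and sends its protocol control information and protocol state information to the executor again. The executor repeats the check. If the result is still inconsistent, either the software or hardware of controller 3 is faulty or controllers 1 and 2 both executed the protocol incorrectly, and an alarm or event is reported to the network management system for manual handling.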
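The controller-side reaction to a synchronization packet, and the executor's second check, can be sketched as below. The source does not say how a controller "synchronizes the protocol"; this sketch assumes it simply adopts the reference pair carried in the packet. The function names `on_sync` and `recheck` and the alarm text are likewise hypothetical.

```python
def on_sync(own, reference):
    """Controller side: handle a multicast packet carrying a sync command.

    own, reference: (control, state) pairs.
    Returns (pair_after_handling, resend), where resend tells whether the
    controller must report its control/state to the executor again.
    """
    if own == reference:
        return own, False      # consistent: ignore the packet
    # Assumption: synchronizing means adopting the reference pair.
    return reference, True


def recheck(controller_id, reported, reference):
    """Executor side: second verification after a controller has re-reported.

    Returns None when the controller now agrees, otherwise an alarm string
    to be raised to the network management system for manual handling.
    """
    if reported == reference:
        return None
    return f"alarm: controller {controller_id} still inconsistent after sync"
```

In the scenario above, controllers 1 and 2 get `resend == False`, controller 3 adopts Switch/WSF and re-reports, and `recheck` returns an alarm only if that re-report still differs from the reference.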
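The detector reports a protection SD network fault, which triggers controllers 1 to 3 to execute the protocol separately; controller 1 is the master. Then:
1) the protocol control information in the message sent by controller 1 is Idle, and the protocol state information is RR;
2) the protocol control information in the message sent by controller 2 is Switch, and the protocol state information is RR;
3) the protocol control information in the message sent by controller 3 is Idle, and the protocol state information is PSD.
There are three classes of protocol control and state information, Idle/RR, Switch/RR, and Idle/PSD: controller 1 is in the Idle/RR class, controller 2 in the Switch/RR class, and controller 3 in the Idle/PSD class. This protocol check cannot determine which class to take as the reference, so the executor sends a multicast packet to the controllers that contains only a protocol reset command.
After all three controllers have reset the protocol, they must re-query the detector's network fault information; once the protocol has been triggered and run, they send their protocol control information and state information to the executor again.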
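A person of ordinary skill in the art will understand that all or some of the steps of the above embodiments may be implemented as a computer program flow. The computer program may be stored in a computer-readable storage medium and executed on a corresponding hardware platform (such as a system, device, apparatus, or component); when executed, it includes one of the steps of the method embodiments or a combination thereof.
Optionally, all or some of the steps of the above embodiments may also be implemented using integrated circuits: the steps may each be made into a separate integrated circuit module, or several of the modules or steps may be made into a single integrated circuit module.
The apparatuses/function modules/functional units in the above embodiments may be implemented by general-purpose computing devices; they may be concentrated on a single computing device or distributed over a network of multiple computing devices.
When the apparatuses/function modules/functional units in the above embodiments are implemented as software function modules and sold or used as stand-alone products, they may be stored in a computer-readable storage medium. The computer-readable storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
Industrial Applicability
With the embodiments of the present invention, every controller runs the protocol normally; the controllers themselves are not divided into master and standby and need not maintain master/standby state. Instead, the executor of each protection group dynamically maintains the master/standby relation of the controllers, which supports 1:N (N>=1) controller hot backup under distributed protection. In addition, in protocol steady state the executor can synchronize the protection protocol, so master/standby protocol synchronization between the controllers is not required.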

Claims (15)

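  1. A method for hot backup of controllers in distributed protection, the method comprising:
    an executor receiving the message sent by each controller, determining from the sequence number of each message which messages belong to the same protocol trigger, maintaining the master/standby relation of each controller using the messages of the same protocol trigger, and executing the message sent by the master controller; and, when the protocol is in steady state and the switching condition is determined to be met, the executor performing a master/standby switchover.
  2. The method according to claim 1, wherein the message sent by each controller comprises protocol control information, protocol state information, and a sequence number.
  3. The method according to claim 2, wherein maintaining the master/standby relation of each controller using the messages of the same protocol trigger comprises: taking the controller whose message is received first on the first protocol trigger as the master controller and the other controllers as standby controllers; and, at each protocol trigger, recording the controller whose message is received first and incrementing that controller's message count in the master/standby relation information by 1.
  4. The method according to claim 1, further comprising: setting a first timeout T1; when the executor actually receives a message from a standby controller first, and no message from the master controller has been received within the first timeout T1, executing the protection switching operation according to the message sent by the standby controller.
  5. The method according to claim 1, wherein the protocol steady state is: a second timeout T2 is set; after the executor receives a message from a controller, if it receives no message from any controller within the second timeout T2, the protection protocol of the protection group is considered to be in steady state;
    the switching condition is: in protocol steady state, the current master/standby relation information contains a standby controller whose message count has reached a first threshold N, and the message count of the master controller is less than N.
  6. The method according to claim 5, wherein performing the master/standby switchover comprises: the executor comparing the message counts of the standby controllers in the master/standby relation information, the standby controller with the highest count becoming the master controller, the former master controller becoming a standby controller, and the message count of every controller in the master/standby relation information being cleared to 0.
  7. The method according to claim 2, further comprising: the executor verifying the protocol control information and protocol state information of each controller and, if there is any inconsistency, performing protocol synchronization or a protocol reset.
  8. The method according to claim 7, wherein the executor verifying the protocol control information and protocol state information of each controller and, if there is any inconsistency, performing protocol synchronization or a protocol reset comprises: in protocol steady state, the executor taking the protocol control information and protocol state information from the message most recently received from each controller and grouping controllers whose messages carry identical protocol control information and protocol state information into one class; if one class contains the largest number of controllers, sending a multicast packet to every controller, the multicast packet containing a protocol synchronization command and the protocol control and state information of that class, each controller, on receiving the multicast packet, checking its own protocol control information and protocol state information against it, synchronizing the protection protocol if they are inconsistent, and ignoring the packet if they are consistent; and, if no single class contains the largest number of controllers, the multicast packet sent to the controllers containing only a protocol reset command, each controller resetting the protection protocol and then re-querying the detector for the network fault in order to execute the protocol.
  9. An apparatus for hot backup of controllers in distributed protection, the apparatus comprising a receiving module, a determining module, a maintenance module, a message execution module, and a master/standby switching module, wherein:
    the receiving module is configured to receive the message sent by each controller;
    the determining module is configured to determine from the sequence number of each message which messages belong to the same protocol trigger;
    the maintenance module is configured to maintain the master/standby relation of each controller using the messages of the same protocol trigger;
    the message execution module is configured to execute the message sent by the master controller;
    the master/standby switching module is configured to perform a master/standby switchover when the protocol is in steady state and the switching condition is determined to be met.
  10. The apparatus according to claim 9, wherein the message sent by each controller comprises protocol control information, protocol state information, and a sequence number.
  11. The apparatus according to claim 10, wherein the maintenance module is configured to: take the controller whose message is received first on the first protocol trigger as the master controller and the other controllers as standby controllers; and, at each protocol trigger, record the controller whose message is received first and increment that controller's message count in the master/standby relation information by 1.
  12. The apparatus according to claim 10, further comprising a verification module configured to verify the protocol control information and protocol state information of each controller and, if there is any inconsistency, perform protocol synchronization or a protocol reset.
  13. An executor, comprising the apparatus for hot backup of controllers in distributed protection according to any one of claims 9 to 15.
  14. A system for hot backup of controllers in distributed protection, the system comprising an executor, a detector, and a plurality of controllers, wherein:
    the detector is configured to: after performing fault detection, notify each controller of the network fault by sending a multicast packet;
    each controller is configured to: execute the protection protocol independently after receiving the multicast packet, and send a message to the executor once the protection protocol has been executed, the message sent to the executor comprising protocol control information, protocol state information, and a sequence number;
    the executor is configured to: receive the message sent by each controller, determine from the sequence number of each message which messages belong to the same protocol trigger, maintain the master/standby relation of each controller using the messages of the same protocol trigger, execute the message sent by the master controller, and, when the protocol is in steady state and the switching condition is determined to be met, perform a master/standby switchover.
  15. A computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being used to perform the method according to any one of claims 1 to 8.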
PCT/CN2015/097578 2014-12-25 2015-12-16 Method and apparatus for hot backup of controllers in distributed protection WO2016101825A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP15871893.2A EP3247055A4 (en) 2014-12-25 2015-12-16 Method and apparatus for hot standby of controllers in distributed protection
US15/539,598 US20180269963A1 (en) 2014-12-25 2015-12-16 Method and apparatus for hot standby of controllers in distributed protection

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410820899.1 2014-12-25
CN201410820899.1A CN105790825B (zh) 2014-12-25 2014-12-25 Method and apparatus for hot backup of controllers in distributed protection

Publications (1)

Publication Number Publication Date
WO2016101825A1 (zh) 2016-06-30

Family

ID=56149247

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/097578 WO2016101825A1 (zh) 2014-12-25 2015-12-16 Method and apparatus for hot backup of controllers in distributed protection

Country Status (4)

Country Link
US (1) US20180269963A1 (zh)
EP (1) EP3247055A4 (zh)
CN (1) CN105790825B (zh)
WO (1) WO2016101825A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108390781A (zh) * 2018-02-12 2018-08-10 王磊 Method and system for automatic hot backup of a host

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3627247B1 (en) * 2018-09-18 2023-04-05 KNORR-BREMSE Systeme für Nutzfahrzeuge GmbH Control architecture for a vehicle
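CN110789569B (zh) * 2019-10-17 2022-04-22 北京全路通信信号研究设计院集团有限公司 Train control DMI data redundancy control method and system
US12120226B2 (en) * 2020-11-13 2024-10-15 Citrix Systems, Inc. Preventing HTTP cookie stealing using cookie morphing
CN112235150B (zh) * 2020-12-16 2021-03-02 北京宇信科技集团股份有限公司 Method and system for automatic takeover between master and standby machines
CN112671575B (zh) * 2020-12-21 2022-04-29 苏州盛科通信股份有限公司 Working link switching method and apparatus, storage medium, and electronic device
CN115190577B (zh) * 2022-05-11 2023-10-13 四川创智联恒科技有限公司 Timing synchronization mutual backup method for an ORAN system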

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
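CN1801735A (zh) * 2005-01-01 2006-07-12 华为技术有限公司 Hot backup apparatus for multiplex section protection switching and method thereof
CN101350679A (zh) * 2007-07-18 2009-01-21 华为技术有限公司 Protection switching method, system, and device based on Ethernet passive optical network
WO2012097611A1 (zh) * 2011-01-19 2012-07-26 中兴通讯股份有限公司 Method and apparatus for automatic protection switching in an optical network
CN103944974A (zh) * 2014-04-02 2014-07-23 华为技术有限公司 Protocol message processing method, controller fault handling method, and related device
CN104125079A (zh) * 2013-04-23 2014-10-29 华为技术有限公司 Method and apparatus for determining dual-machine hot backup configuration information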

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE60204940T2 (de) * 2002-03-27 2006-04-20 Lightmaze Solutions Ag Intelligent optical network element
JP4745387B2 (ja) * 2005-04-25 2011-08-10 トムソン ライセンシング Multicast routing protocol in a mesh network
US8305877B2 (en) * 2009-09-10 2012-11-06 Tyco Electronics Subsea Communications Llc System and method for distributed fault sensing and recovery
US8913482B2 (en) * 2012-06-01 2014-12-16 Telefonaktiebolaget L M Ericsson (Publ) Enhancements to PIM fast re-route with upstream activation packets
CN104168193B (zh) * 2014-08-12 2017-12-15 华为技术有限公司 Method for Virtual Router Redundancy Protocol fault detection and routing device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3247055A4 *

Also Published As

Publication number Publication date
EP3247055A1 (en) 2017-11-22
CN105790825B (zh) 2020-08-14
CN105790825A (zh) 2016-07-20
US20180269963A1 (en) 2018-09-20
EP3247055A4 (en) 2018-11-14

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15871893

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 15539598

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

REEP Request for entry into the european phase

Ref document number: 2015871893

Country of ref document: EP