WO2012009914A1 - Protection switching method and system - Google Patents

Protection switching method and system

Info

Publication number
WO2012009914A1
WO2012009914A1 PCT/CN2010/079022 CN2010079022W WO2012009914A1 WO 2012009914 A1 WO2012009914 A1 WO 2012009914A1 CN 2010079022 W CN2010079022 W CN 2010079022W WO 2012009914 A1 WO2012009914 A1 WO 2012009914A1
Authority
WO
WIPO (PCT)
Prior art keywords
active
microwave node
node device
standby
protection switching
Prior art date
Application number
PCT/CN2010/079022
Other languages
French (fr)
Chinese (zh)
Inventor
任文杰
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2012009914A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B1/00 Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission
    • H04B1/74 Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission for increasing reliability, e.g. using redundant or spare channels or apparatus

Definitions

  • The present invention relates to the field of communications, and in particular to a protection switching method and system.
  • Background art
  • Microwave communication technology has existed for more than half a century. It is a wireless communication method that transmits information in the microwave band over a terrestrial line-of-sight path.
  • Microwave communication plays an important role in the communications field and is a fast means of communication. Microwave equipment is found throughout the mobile access network, the mobile metropolitan area network, and the core network; in emergency communication in particular, microwave is irreplaceable.
  • Digital microwave communication, optical fiber, and satellite are collectively referred to as the three pillars of modern communication transmission.
  • Common protection switching methods are hot backup, hot backup + space diversity, frequency diversity, and hot backup + frequency diversity. Their purpose is to ensure highly reliable transmission of services.
  • Microwave devices on the market generally support these protection switching methods. Because protection switching is a dynamic process, any problem that occurs during the process causes the switchover to fail, which makes the result of protection switching difficult to control.
  • Implementations of protection switching differ, and the effect they achieve may differ as well.
  • In the prior art, protection switching relies heavily on the trigger channel between the active and standby devices. If that channel fails, that is, if communication between the active and standby devices is abnormal, protection switching cannot be completed and the switching method is completely ineffective. In addition, some faults currently go undetected, so the service is not switched even though the communication link is faulty and the service is impaired.
  • The present invention provides a protection switching method and system that solve the prior-art problem that switching cannot be performed when communication between the active and standby devices is abnormal.
  • The present invention provides a protection switching method in which a proxy center is provided in each of the active and standby microwave node devices (a primary microwave node device and a standby microwave node device) that constitute a protection pair, and a management center is provided in the protection pair.
  • The method includes:
  • the active and standby microwave node device proxy centers, that is, the primary microwave node device proxy center and the standby microwave node device proxy center, determine according to a preset multi-point fault detection policy that one or more faults have occurred in the primary microwave node device or the standby microwave node device;
  • when communication between the active and standby microwave node devices is normal, the active and standby microwave node device proxy centers perform protection switching of the active and standby microwave node devices through the protection switching communication channel;
  • when communication between the active and standby microwave node devices is abnormal, the active and standby microwave node device proxy centers notify the management center, and the management center performs forced protection switching on the active and standby microwave node devices through the protection switching communication channel.
  • The invention also provides a protection switching system, comprising:
  • a primary microwave node device proxy center, configured to determine according to a preset multi-point fault detection policy that one or more faults have occurred in the primary microwave node device or the standby microwave node device, and, when communication between the active and standby microwave node devices is normal, to perform protection switching of the active and standby microwave node devices through the protection switching communication channel;
  • a standby microwave node device proxy center, configured to determine according to the preset multi-point fault detection policy that one or more faults have occurred in the primary microwave node device or the standby microwave node device, and, when communication between the active and standby microwave node devices is normal, to perform protection switching of the active and standby microwave node devices through the protection switching communication channel;
  • a management center, configured to receive a notification from the proxy centers of the active and standby microwave node devices when communication between the active and standby microwave node devices is abnormal, and to perform forced protection switching on the active and standby microwave node devices through the protection switching communication channel.
  • The multi-point detection control policy and the protection switching triggered by each fault are implemented in a manner that is relatively independent yet uniformly managed, which solves the prior-art problem that switching cannot be performed when communication between the active and standby devices is abnormal.
  • The high-speed communication channel between the active and standby devices addresses the long service interruption caused by a long switching time, and the multi-point detection control policy handles faults that currently cannot be detected, so that they do not lead to service failure or communication link failure.
  • FIG. 1 is a schematic diagram of the management center-agent center architecture and data flow direction according to an embodiment of the present invention.
  • FIG. 2 is a flowchart of a protection switching method according to an embodiment of the present invention.
  • FIG. 3 is a flowchart of the processing of protection switching triggered by fault information according to an embodiment of the present invention.
  • FIG. 4 is a flowchart of the processing of protection switching triggered by a communication interruption between the active and standby agents according to an embodiment of the present invention.
  • FIG. 5 is a flowchart of the processing of protection switching triggered by a power-down message according to an embodiment of the present invention.
  • FIG. 6 is a flowchart of the processing of protection switching triggered by a remote alarm according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of a protection switching system according to an embodiment of the present invention.
  • Detailed description
  • The present invention provides a protection switching method and system.
  • FIG. 1 is a schematic diagram of the management center-agent center architecture and data flow direction according to an embodiment of the present invention. As shown in FIG. 1, the management center (Manager)-agent center (Agent) architecture consists of three modules: the management center (Manager); the active agent center, that is, the primary microwave node device agent center (Master Agent); and the standby agent center, that is, the standby microwave node device agent center (Slave Agent). The messages exchanged are: Manager-Agent heartbeat messages (heartbeat message 101 and heartbeat message 103); Manager-Agent control messages (control message 102 and control message 104); an Agent-Agent heartbeat message (heartbeat message 105); an Agent-Agent control message (control message 106); and remote alarm (RDI) messages (remote alarm 107 and remote alarm 108).
  • The content of the Manager-Agent heartbeat message includes: the fault information that the Agent can detect, that is, its current fault state; and the current working state of the Agent, which is either the active state or the standby state.
  • The Manager-Agent control messages include: a forced switching message sent by the Manager to the Agent; a forced switching end message sent by the Agent to the Manager; a request-to-monitor protection switching message sent by the Agent to the Manager; and a cancel monitoring protection switching message sent by the Agent to the Manager.
  • The content of the Agent-Agent heartbeat message includes: the fault information that the Agent can detect, that is, its current fault state; and the current working state of the Agent, which is either the active state or the standby state.
  • The Agent-Agent control messages include: a request protection switching message sent by the Master Agent to the Slave Agent; and a protection switching end message sent by the original Slave Agent to the original Master Agent.
  • The remote alarm message is an extensible message that can indicate any fault observed by the peer end that is attributable to the local end. The peer end inserts the fault information into the microwave frame and feeds it back to the local end.
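  • As an illustration only, the heartbeat and control messages described above could be modeled as follows. This is a minimal sketch: the class names, field names, and string values are assumptions made for the example and do not come from the patent text.

```python
from dataclasses import dataclass
from enum import Enum


class WorkState(Enum):
    ACTIVE = "active"
    STANDBY = "standby"


class ControlKind(Enum):
    # Manager-Agent control messages
    FORCED_SWITCH = "forced_switch"            # Manager -> Agent
    FORCED_SWITCH_DONE = "forced_switch_done"  # Agent -> Manager
    REQUEST_MONITORING = "request_monitoring"  # Agent -> Manager
    CANCEL_MONITORING = "cancel_monitoring"    # Agent -> Manager
    # Agent-Agent control messages
    REQUEST_SWITCH = "request_switch"          # Master Agent -> Slave Agent
    SWITCH_DONE = "switch_done"                # original Slave -> original Master


MANAGER_AGENT_KINDS = frozenset({
    ControlKind.FORCED_SWITCH,
    ControlKind.FORCED_SWITCH_DONE,
    ControlKind.REQUEST_MONITORING,
    ControlKind.CANCEL_MONITORING,
})


@dataclass(frozen=True)
class Heartbeat:
    """Content shared by Manager-Agent and Agent-Agent heartbeats."""
    sender: str            # hypothetical identifier of the sending agent
    faulty: bool           # current fault state the agent can detect
    work_state: WorkState  # current working state of the agent


@dataclass(frozen=True)
class Control:
    sender: str
    receiver: str
    kind: ControlKind

    def involves_manager(self) -> bool:
        """True for Manager-Agent control messages, False for Agent-Agent ones."""
        return self.kind in MANAGER_AGENT_KINDS
```

  • Keeping the two heartbeat variants in one type reflects that the text gives them the same content; only the receiver differs.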
  • The Manager is the control and management center. Its main function is to manage heartbeat message 101 and heartbeat message 103 reported by the Master Agent and the Slave Agent.
  • The two heartbeat messages mainly carry the status information and fault information of the Master Agent and the Slave Agent.
  • Triggered reporting means that a heartbeat message is reported automatically whenever a change occurs. Its main purpose is to enable the management center to plan the correct transmission path when the fast channel fails.
  • Periodic (interval) reporting is used at the same time, mainly so that the management center can report the status to the page.
  • The management center monitors the agents based on the heartbeat messages.
  • After receiving heartbeat message 101 and heartbeat message 103 from the Master Agent and the Slave Agent, the Manager returns the current state of the two agent centers (Agents) to the user interface. If heartbeat message 101 or heartbeat message 103 is not received before the timeout expires, the working state of the corresponding device is displayed as not working, its fault state is set to serious fault, and an alarm is reported.
  • When the Manager receives a request-to-monitor protection switching message from the Master Agent or the Slave Agent, it starts planning the optimal transmission path for the transmission unit, sends a forced switching message to the Master Agent and the Slave Agent, and then monitors the fault states of the Master Agent and the Slave Agent of the transmission unit, keeping the transmission unit carrying services as normally as possible until the cancel monitoring protection switching message is received.
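  • The heartbeat timeout handling described above can be sketched as follows. The timeout value, the `AgentView` record, and the function name are assumptions for illustration; the text does not specify how the Manager stores agent state or how long it waits.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AgentView:
    """What the Manager remembers about one agent (hypothetical record)."""
    last_heartbeat: float        # time the last heartbeat arrived, in seconds
    working: bool = True
    serious_fault: bool = False


def check_timeouts(agents: Dict[str, AgentView], now: float, timeout: float) -> List[str]:
    """Mark agents whose heartbeat is overdue and return the names to alarm on."""
    alarms = []
    for name, view in agents.items():
        if now - view.last_heartbeat > timeout:
            view.working = False        # displayed as not working
            view.serious_fault = True   # fault state set to serious fault
            alarms.append(name)         # an alarm is reported for this agent
    return alarms
```

  • For example, with a 3-second timeout, an agent last heard from 10 seconds ago is marked as not working with a serious fault, while an agent heard from 1 second ago is left unchanged.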
  • The Agent is an agent center. Its main functions are to process the fault information detected by its detection module, to communicate with the other Agent through its communication module, and to control its execution module to complete the protection switching action.
  • The Master Agent can insert a remote defect indication (RDI) message into the microwave frame; the RDI message returns the fault information of the local end to the remote end.
  • The Manager of the embodiment of the present invention differs substantially from the protection switching controller in the prior art.
  • The protection switching controller in the prior art is the core of protection switching.
  • In the embodiment of the present invention, the Manager's role in controlling protection switching has been weakened. If the active and standby agents can communicate normally, the Manager does not participate in the protection switching process and only monitors the heartbeat messages of the active and standby agents.
  • The Manager acquires the protection switching control right only when communication between the active and standby agents is abnormal, and then performs unified protection switching planning and control.
  • The Agent in the embodiment of the present invention is no longer a simple proxy. When communication between the active and standby agents is normal, the Agent in the active state can participate in the protection switching process.
  • A protection switching method is provided, based on the Manager-Agent architecture described above. Each of the active and standby microwave node devices that form a protection pair is provided with its own Agent, and one Manager is provided in each protection pair.
  • FIG. 2 is a flowchart of a protection switching method according to an embodiment of the present invention. As shown in FIG. 2, the protection switching method includes the following processing:
  • Step 201: The active and standby agents determine according to the preset multi-point fault detection policy that one or more faults have occurred in the primary microwave node device or the standby microwave node device.
  • The multi-point fault detection policy includes at least one of the following: detecting heartbeat messages of the active and standby agents, detecting power-down messages, detecting fault information of the agent's own device, and detecting remote alarms. It should be noted that remote alarm detection is mainly used to indicate, through the alarm information of the peer end, faults that the local end cannot detect itself, and it is an extensible alarm.
  • Each of the active and standby agents includes a fault detection module for detecting device faults, a communication module for communication between the active and standby agents and between the Manager and the Agent, and an execution module for completing the protection switching action and writing the remote alarm information.
  • When the active and standby agents determine according to the preset multi-point fault detection policy that multiple faults have occurred, the priority of the faults needs to be determined. From high to low, the priority is: power-down fault, communication abnormality between the active and standby microwave node device proxy centers, fault of the active or standby microwave node device itself, and remote alarm fault. The corresponding protection switching operation is then performed according to the priority of the fault.
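  • The priority ordering above can be expressed as a small sketch. The enum names and the helper function are illustrative assumptions; only the ordering itself comes from the text.

```python
from enum import IntEnum
from typing import Iterable, Optional


class Fault(IntEnum):
    # Lower value means higher priority, following the order given in the text.
    POWER_DOWN = 0
    AGENT_COMM_ABNORMAL = 1
    DEVICE_FAULT = 2
    REMOTE_ALARM = 3


def highest_priority(faults: Iterable[Fault]) -> Optional[Fault]:
    """Return the fault whose switching flow should run, or None if there is none."""
    return min(faults, default=None)
```

  • When a power-down fault is present together with a communication abnormality and a remote alarm, only the power-down flow is selected, which avoids running several switching flows for one underlying cause.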
  • Step 202: When communication between the active and standby microwave node devices is normal, the active and standby agents perform protection switching of the active and standby microwave node devices through the protection switching communication channel.
  • The protection switching communication channel includes: a high-speed communication channel between the active and standby agents, a reliable communication channel between the active and standby Agents and the Manager, and a remote alarm communication channel. Specifically, the communication channel between the active and standby agents should be a high-speed channel, so that protection switching completes quickly and the protection switching time is shortened. The communication channel between the active and standby Agents and the Manager must be a reliable channel, to ensure high reliability of protection switching. The remote alarm communication channel inserts local state information into the microwave frame.
  • It should be noted that, when the high-speed communication channel is fault-free, the protection switching information is transmitted through the high-speed communication channel. When the high-speed communication channel is faulty, the monitoring information sent by the agent center to the management center and the forced switching information sent by the management center to the agent center use the reliable communication channel. Heartbeat messages are also transmitted through the reliable communication channel. The remote alarm communication channel is used to transmit the fault information returned by the peer end to the local end.
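  • The channel usage described above could be summarized as the following sketch. The enum members and the function are hypothetical names; returning `None` for a protection switching message when the high-speed channel is down is an assumption that stands for handing control to the Manager.

```python
from enum import Enum
from typing import Optional


class Channel(Enum):
    HIGH_SPEED = "agent-agent high-speed channel"
    RELIABLE = "agent-manager reliable channel"
    REMOTE_ALARM = "remote alarm channel (microwave frame)"


class Message(Enum):
    PROTECTION_SWITCH = 1  # exchanged between the agents
    HEARTBEAT = 2
    MONITORING = 3         # agent center -> management center
    FORCED_SWITCH = 4      # management center -> agent center
    REMOTE_ALARM = 5       # peer end -> local end


def select_channel(msg: Message, high_speed_ok: bool) -> Optional[Channel]:
    """Pick the channel for a message, or None if the agents cannot send it directly."""
    if msg is Message.REMOTE_ALARM:
        return Channel.REMOTE_ALARM
    if msg is Message.PROTECTION_SWITCH:
        # Without the high-speed channel the agents cannot switch by themselves;
        # None signals that control must be handed to the Manager.
        return Channel.HIGH_SPEED if high_speed_ok else None
    # Heartbeat, monitoring, and forced switching messages use the reliable channel.
    return Channel.RELIABLE
```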
  • Step 203: When communication between the active and standby microwave node devices is abnormal, the active and standby agents notify the Manager, and the Manager performs forced protection switching on the active and standby microwave node devices through the protection switching communication channel.
  • The central position of the Manager has been weakened, mainly to shorten the protection switching time.
  • The management function that is retained serves to complete protection switching when the active and standby agents cannot communicate with each other, which ensures switching reliability.
  • The following describes how protection switching or forced protection switching is performed for the different faults that can occur on the active and standby microwave node devices.
  • Case 1: An agent detects that its own device has failed.
  • The detection module of the agent detects that the device is faulty and notifies the agent.
  • If the agent is in the standby state, the standby microwave node device agent center sends a fault message to the primary microwave node device agent center (Master Agent), and the Master Agent updates the fault state information of the standby microwave node device according to that message. This fault state information is checked during protection switching to decide whether the switchover may be performed.
  • If the agent is in the active state, the Master Agent checks the fault state information of the standby microwave node device. If the standby microwave node device has a fault, an alarm is reported. If the standby microwave node device is normal, the Master Agent sends a protection switching message to the Slave Agent through the high-speed communication channel between the active and standby agents, performs protection switching of the active and standby microwave node devices, and changes its own working state information to the standby state.
  • After receiving the protection switching message, the Slave Agent performs protection switching of the active and standby microwave node devices, changes its own working state information to the active state, and sends a protection switching completion message to the original Master Agent.
  • If the original Master Agent does not receive the protection switching completion message, it notifies the Manager through the reliable communication channel to perform a forced protection switchover.
  • Specifically, the agent retries three times. If the protection switching completion message from the Slave Agent still has not been received, the Manager is notified to take over the protection switching control right, and the Manager completes the planning and decision-making of the protection switching.
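  • The retry-then-escalate behavior can be sketched as follows. The two callbacks are hypothetical stand-ins for sending the request over the high-speed channel and for notifying the Manager over the reliable channel. Counting one initial attempt plus three retries is an assumption, since the text only says that the agent retries three times.

```python
from typing import Callable

MAX_RETRIES = 3  # the text says the agent retries three times


def switch_with_escalation(
    request_switch: Callable[[], bool],
    notify_manager: Callable[[], None],
) -> str:
    """Try agent-to-agent switching; hand control to the Manager if it never completes.

    request_switch() sends the protection switching message and returns True
    if the completion message from the Slave Agent was received.
    """
    for _ in range(1 + MAX_RETRIES):  # initial attempt plus retries (assumption)
        if request_switch():
            return "switched_by_agents"
    notify_manager()  # the Manager takes over the protection switching control right
    return "manager_takeover"
```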
  • FIG. 3 is a flowchart of the processing of protection switching triggered by fault information according to an embodiment of the present invention. As shown in FIG. 3, the processing includes the following steps:
  • Step 301: The detection module of the agent detects that the transmission unit is faulty.
  • Step 302: The detection module of the agent reports the fault information to the agent.
  • Step 303: The agent determines whether its working state is the active state; if not, step 304 is performed; otherwise, step 305 is performed;
  • Step 304: If the agent is in the standby state, it notifies the Master Agent of the fault message, and the operation ends;
  • Step 305: If the agent is in the active state, it determines whether the standby unit has a fault; if so, step 306 is performed; otherwise, step 307 is performed;
  • Step 306: If the standby unit is faulty, the fault of the transmission unit is reported to the upper level, and the operation ends;
  • Step 307: If the standby unit is not faulty, the agent sends a request protection switching message to the Slave Agent, completes its own protection switching action, and changes its working state to the standby state;
  • Step 308: The Slave Agent receives the request protection switching message.
  • Step 309: The Slave Agent completes the protection switching action through its execution module and changes its working state to the active state;
  • The new Master Agent (the original Slave Agent) returns a protection switching end message to the original Master Agent.
  • The original Master Agent receives the protection switching end message, and the protection switching process ends.
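  • The decision logic of FIG. 3 can be sketched as follows. The `Agent` class, its fields, and the action strings are assumptions for illustration; message transport and the execution module are omitted.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class WorkState(Enum):
    ACTIVE = "active"
    STANDBY = "standby"


@dataclass
class Agent:
    name: str
    state: WorkState
    peer_faulty: bool = False  # fault state recorded for the other device

    def on_local_fault(self) -> List[str]:
        """Steps 303-307: react to a fault reported by the detection module."""
        if self.state is WorkState.STANDBY:
            return ["notify_master_of_fault"]        # step 304
        if self.peer_faulty:
            return ["report_fault_to_upper_level"]   # step 306
        self.state = WorkState.STANDBY               # step 307
        return ["send_request_protection_switching"]

    def on_switch_request(self) -> List[str]:
        """Steps 308-309 and the reply: the Slave Agent takes over."""
        self.state = WorkState.ACTIVE
        return ["send_protection_switching_end"]
```

  • In a fault-free pair, a fault on the active device moves the original Master Agent to the standby state and the original Slave Agent to the active state.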
  • Case 2: The active and standby agents detect through the heartbeat message that communication between them is abnormal.
  • When an agent fails to receive the heartbeat message from the other agent, or cannot parse a correct message, the active and standby agents cannot communicate with each other.
  • The Master Agent in the active state then notifies the Manager to take over protection switching control, and the Manager completes the planning and decision-making of the protection switching through the reliable communication channel.
  • Specifically, the Master Agent sends the Manager a message asking it to take over the protection switching control right.
  • After receiving that message, the Manager determines, according to the fault state information and working state information of the active and standby microwave node devices, whether the active and standby microwave node devices meet the conditions for transmitting services.
  • If they do not, the Manager sends a mandatory protection switching message to the active and standby agents, and protection switching of the active and standby microwave node devices is performed.
  • Specifically, the Manager plans the transmission path according to the fault state information, performs mandatory protection switching on the active and standby agents according to that path, and enters the monitoring state after receiving the forced switching completion messages returned by the active and standby agents.
  • FIG. 4 is a flowchart of the processing of protection switching triggered by a communication interruption between the active and standby agents according to an embodiment of the present invention. As shown in FIG. 4, the processing includes the following steps:
  • Step 401: The communication module detects that communication between the active and standby agents is abnormal.
  • Step 402: The Master Agent sends a request-to-monitor protection switching message to the Manager.
  • Step 403: After receiving the request-to-monitor protection switching message, the Manager makes its planning decision according to the working states and fault states of the active and standby agents.
  • Step 404: The Manager determines whether the transmission unit meets the requirements for normal service transmission; if not, step 405 is performed; otherwise, step 408 is performed;
  • Step 405: The Manager plans a path that can carry the service and sends a mandatory protection switching message to the active and standby agents.
  • Step 406: After the active and standby agents receive the mandatory protection switching message, their execution modules complete the protection switching action.
  • Step 407: The active and standby agents return a mandatory protection switching completion message to the Manager.
  • Step 408: After receiving the mandatory protection switching completion message, the Manager enters the monitoring state to ensure normal transmission of the service. If communication between the active and standby agents returns to normal, the Master Agent sends a cancel monitoring protection switching message to the Manager, and subsequent protection switching is again completed by the active and standby agents.
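  • A minimal sketch of the Manager-side decision in steps 404 to 408 follows. The `Unit` record, the action strings, and the rule "pick the first healthy unit" are assumptions; the text does not say how the Manager plans the transmission path.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Unit:
    """State the Manager holds for one microwave node device (hypothetical record)."""
    active: bool
    faulty: bool


def manager_takeover(units: Dict[str, Unit]) -> List[str]:
    """Decide whether a forced switch is needed, apply it, then enter monitoring."""
    actions: List[str] = []
    # Step 404: the service is carried normally if an active unit is fault-free.
    service_ok = any(u.active and not u.faulty for u in units.values())
    if not service_ok:
        healthy = [name for name, u in units.items() if not u.faulty]
        if healthy:
            target = healthy[0]                  # step 405: planned path (assumed rule)
            for name, u in units.items():
                u.active = (name == target)      # step 406: agents execute the switch
            actions.append("forced_switch_to:" + target)
    actions.append("enter_monitoring")           # step 408
    return actions
```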
  • Case 3: The active and standby agents confirm through the power-down message that the peer is powered off.
  • An agent receives the power-down message of the other party. If the Master Agent confirms through the power-down message that the standby microwave node device is powered off, the Master Agent takes no action. If the Slave Agent confirms through the power-down message that the primary microwave node device is powered off, the Slave Agent performs protection switching through the high-speed communication channel and sets its own working state information to the active state.
  • FIG. 5 is a flowchart of the processing of protection switching triggered by a power-down message according to an embodiment of the present invention. As shown in FIG. 5, the processing includes the following steps:
  • Step 501: An agent detects that the other party's agent is powered off, or receives a power-down message from the other party;
  • Step 502: The agent determines whether its own working state is the active state; if so, the operation ends; otherwise, step 503 is performed;
  • Step 503: The protection switching operation is completed by the execution module of the agent, and the working state is set to the active state.
  • Case 4: The active and standby agents detect a remote alarm.
  • The Master Agent in the active state detects the remote alarm through the remote alarm communication channel.
  • On detecting the remote alarm, the Master Agent checks whether its own device is faulty. If so, the device has a detectable fault, and the agent performs protection switching of the active and standby microwave node devices through the high-speed communication channel, that is, it enters the protection switching process triggered by fault information.
  • If not, the Master Agent checks the fault state information of the standby microwave node device. If the standby microwave node device has a fault, the Master Agent takes no action. If the standby microwave node device has no fault, the Master Agent performs protection switching through the high-speed communication channel.
  • After the protection switching, the original Slave Agent checks through the remote alarm communication channel whether a remote alarm is still present. If so, the original Master Agent reports an undetectable-fault alarm. If the remote alarm information persists, the peer device is faulty, and alarm information for the undetectable fault on the peer device needs to be reported.
  • FIG. 6 is a flowchart of the processing of protection switching triggered by a remote alarm according to an embodiment of the present invention. As shown in FIG. 6, the processing includes the following steps:
  • Step 601: The Master Agent detects that remote alarm information is present.
  • Step 602: The Master Agent first checks whether it has a fault itself; if so, step 603 is performed; otherwise, step 604 is performed;
  • Step 603: If the Master Agent is faulty, the protection switching process triggered by fault information is entered;
  • Step 604: If the Master Agent has no fault, it checks whether the standby unit has fault information; if so, step 605 is performed; otherwise, step 606 is performed;
  • Step 605: ... alarm information;
  • Step 606: If the standby unit has no fault, the Master Agent sends a request protection switching message to the Slave Agent.
  • Step 607: After receiving the request protection switching message, the Slave Agent completes the protection switching action and returns a protection switching completion message.
  • Step 608: Whether a remote alarm is still present is checked; if so, step 609 is performed; otherwise, step 605 is performed;
  • Step 609: When the local sending end or the opposite receiving end has an undetectable fault, alarm information is reported indicating the undetectable fault at the local sending end or the opposite receiving end.
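  • The branching in FIG. 6 can be sketched as follows. The names are illustrative assumptions. Because the text of step 605 is incomplete in the source, the sketch refers to that step only by its number and does not guess at its content.

```python
from enum import Enum


class Action(Enum):
    FAULT_TRIGGERED_SWITCH = "step 603"
    STEP_605 = "step 605"  # content incomplete in the source text
    REQUEST_SWITCH = "step 606"
    REPORT_UNDETECTABLE_FAULT = "step 609"


def on_remote_alarm(master_faulty: bool, standby_faulty: bool) -> Action:
    """Steps 602-606: decide what the Master Agent does on a remote alarm."""
    if master_faulty:
        return Action.FAULT_TRIGGERED_SWITCH
    if standby_faulty:
        return Action.STEP_605
    return Action.REQUEST_SWITCH


def after_switch(remote_alarm_present: bool) -> Action:
    """Step 608: re-check the remote alarm once switching has completed."""
    if remote_alarm_present:
        return Action.REPORT_UNDETECTABLE_FAULT
    return Action.STEP_605
```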
  • The processes above are relatively independent fault-handling flows obtained by decomposition; each is a protection switching flow selected according to a different trigger condition. They must be handled in a unified way; otherwise, repeated protection switching or no switching at all will occur.
  • The fault types above therefore need to be analyzed together: the faults are classified and prioritized according to how they are related, and the processing flow is selected according to the trigger condition. For example, a power-down fault necessarily also produces a communication abnormality between the active and standby agents and a remote alarm fault. Once the power-down message has been handled, those two faults are certain to appear as well. The power-down fault therefore has a higher priority than the communication abnormality between the active and standby agents and the remote alarm fault.
  • From high to low, the priority of the faults is: power-down fault, communication abnormality between the active and standby agents, fault of the active or standby device itself, and remote alarm fault.
  • The protection switching policy of the embodiment of the present invention uses multi-point protection switching to spread the risk of protection switching.
  • The speed of protection switching is improved as much as possible, and the protection switching time is shortened.
  • Because the protection switching is distributed, the processing flow for each kind of information is different, but all of the flows can be managed uniformly.
  • The embodiments of the present invention can be implemented as independent modules, which makes them easy to port to other products that require 1+1 protection, reducing development cost and shortening development time. Remote alarm information can also help locate equipment faults, which builds experience for future development and maintenance.
  • FIG. 7 is a schematic structural diagram of a protection switching system according to an embodiment of the present invention.
  • As shown in FIG. 7, the protection switching system according to an embodiment of the present invention includes: a primary microwave node device agent center (Master Agent) 70, a standby microwave node device agent center (Slave Agent) 72, and a management center (Manager) 74.
  • The modules of the embodiment of the present invention are described in detail below.
  • The Master Agent 70 is configured to determine, according to the preset multi-point fault detection policy, that one or more faults have occurred in the primary microwave node device or the standby microwave node device and, when communication between the active and standby microwave node devices is normal, to perform protection switching of the active and standby microwave node devices through the protection switching communication channel.
  • The multi-point fault detection policy includes at least one of the following: detecting the heartbeat messages of the active and standby Agents, detecting the power-down message, detecting the fault information of each Agent's own device, and detecting the remote alarm. Note that remote alarm detection mainly uses the peer end's alarm information to indicate that the local end has a fault it cannot detect itself; it is an extensible alarm.
  • the protection switching communication channel includes: a high-speed communication channel between the active and standby agents, a reliable communication channel between the active and standby Agents and the Manager, and a remote alarm communication channel.
  • The communication channel between the active and standby Agents should be a high-speed channel, so that protection switching completes quickly and the protection switching time is shortened.
  • The communication channel between the active and standby Agents and the Manager must be a reliable channel, to ensure high reliability of protection switching.
  • The remote alarm communication channel inserts local state information into the microwave frame. Note that when the high-speed communication channel is fault-free, protection switching information is transmitted over the high-speed communication channel. When the high-speed communication channel is faulty, the monitoring information sent by the agent centers to the management center and the forced switching information sent by the management center to the agent centers use the reliable communication channel; heartbeat messages are also transmitted over the reliable communication channel. The remote alarm communication channel carries the fault information that the peer end returns to the local end.
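As a rough illustration of the three channels described above, the following Python sketch maps a message to a channel. The message-kind strings are hypothetical labels, and routing a protection switching message onto the reliable path when the high-speed channel is down is a simplifying assumption that stands in for handing control to the Manager.

```python
from enum import Enum


class Channel(Enum):
    HIGH_SPEED = "Agent-to-Agent high-speed channel"
    RELIABLE = "Agent-to-Manager reliable channel"
    REMOTE_ALARM = "remote alarm channel carried in the microwave frame"


def select_channel(message_kind, high_speed_ok):
    """Pick the channel for a message (message_kind values are illustrative)."""
    if message_kind == "remote_alarm":
        return Channel.REMOTE_ALARM
    if message_kind == "protection_switching":
        # Assumption: with the high-speed channel down, switching is driven
        # through the Manager, so the reliable channel is used instead.
        return Channel.HIGH_SPEED if high_speed_ok else Channel.RELIABLE
    if message_kind in ("monitoring", "forced_switching", "heartbeat"):
        return Channel.RELIABLE
    raise ValueError(f"unknown message kind: {message_kind}")


print(select_channel("protection_switching", high_speed_ok=True).name)
```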
  • The Slave Agent 72 is configured to determine, according to the preset multi-point fault detection policy, that one or more faults have occurred in the primary microwave node device or the standby microwave node device and, when communication between the active and standby microwave node devices is normal, to perform protection switching of the active and standby microwave node devices through the protection switching communication channel.
  • Each of the active and standby Agents includes a fault detection module for detecting device faults, a communication module for communication between the active and standby Agents and between the Agent and the Manager, and an execution module for carrying out the protection switching action and writing the remote alarm information.
  • When the active and standby Agents determine, according to the preset multi-point fault detection policy, that multiple faults have occurred in the primary microwave node device or the standby microwave node device, the priority of those faults must be determined. From high to low, the priority is: power failure, abnormal communication between the active and standby microwave node device agent centers, fault of the active or standby microwave node device, and remote alarm fault. The corresponding protection switching operation is then performed according to the fault priority.
  • The Manager 74 is configured to perform forced protection switching on the active and standby microwave node devices through the protection switching communication channel when communication between the active and standby microwave node devices is abnormal.
  • The central role of the Manager has been weakened, mainly to shorten the protection switching time. The management function is retained so that protection switching can still be completed when the active and standby Agents cannot communicate, which safeguards the reliability of protection switching.
  • the following describes the process of performing protection switching or forced protection switching for different faults on the active and standby microwave node devices.
  • Case 1: The active or standby Agent detects that its own device has failed.
  • The Agent's detection module notifies the Agent that the device is faulty.
  • If the faulty device is the standby microwave node device, the Slave Agent sends a fault status message to the Master Agent, and the Master Agent updates the fault state information of the standby microwave node device according to that message. Note that what is modified here is the fault state information.
  • If the faulty device is the primary microwave node device, the Master Agent checks the fault state information of the standby microwave node device. If the standby microwave node device is also faulty, an alarm is reported. Otherwise, the Master Agent sends a protection switching message to the Slave Agent over the high-speed communication channel between the active and standby Agents, performs protection switching of the active and standby microwave node devices, and changes its own working state information to the standby state.
  • After receiving the protection switching message, the Slave Agent performs protection switching of the active and standby microwave node devices, changes its own working state information to the active state, and sends a protection switching completion message to the original Master Agent.
  • If the completion message does not arrive, the Agent retries three times. If the protection switching completion message from the Slave Agent has still not been received, the original Master Agent notifies the Manager over the reliable communication channel to take over protection switching control and perform forced protection switching; the Manager then completes the planning and decision making for protection switching.
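The fault-triggered flow above, as seen from the Master Agent when the primary device fails, can be sketched as follows. This is a minimal sketch under stated assumptions: the two callables are hypothetical stand-ins for the high-speed and reliable channels, and "retries three times" is read as one initial request followed by up to three resends.

```python
def switch_on_primary_fault(standby_faulty, send_switch_request, notify_manager,
                            max_retries=3):
    """Return the outcome of handling a primary-device fault.

    send_switch_request() -> bool: True once the Slave Agent confirms completion.
    notify_manager(): asks the Manager to take over protection switching control.
    Both callables are illustrative placeholders, not names from the patent.
    """
    if standby_faulty:
        return "alarm"  # no healthy device to switch to, so only report
    # One initial attempt plus max_retries resends (an assumed reading).
    for _ in range(1 + max_retries):
        if send_switch_request():
            return "switched"  # the Master Agent now becomes standby
    notify_manager()
    return "manager_takeover"


# Example: the Slave Agent never confirms, so control passes to the Manager.
print(switch_on_primary_fault(False, lambda: False, lambda: None))
```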
  • Case 2: The active and standby Agents detect, through the heartbeat message, that communication between them is abnormal.
  • If an Agent fails to receive the heartbeat message from the other Agent, or cannot parse a correct message from it, the active and standby Agents are unable to communicate with each other.
  • The Master Agent in the active state then notifies the Manager to take over protection switching control, and the Manager completes the planning and decision making for protection switching through the reliable communication channel.
  • The Master Agent sends the Manager a message asking it to take over protection switching control. After receiving this message, the Manager determines, according to the fault state information and working state information of the active and standby microwave node devices, whether the devices meet the condition for transmitting services. When it determines that the condition is met, the Manager sends a forced protection switching message to the active and standby Agents to perform protection switching of the active and standby microwave node devices.
  • The Manager plans a transmission path according to the fault state information, performs forced protection switching on the active and standby Agents according to that path, receives the forced switching completion message returned by the active and standby Agents, and enters the monitoring state;
  • the Manager subsequently releases protection switching control, so that fast protection switching is preserved.
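The Manager-side decision after it takes over control can be sketched in Python. The patent does not spell out the "transmission service condition", so this sketch assumes the simplest reading, namely that a device qualifies when it reports no fault; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DeviceState:
    name: str
    faulty: bool   # fault state information reported by the Agent
    active: bool   # working state information: True means active state


def plan_forced_switch(primary, standby):
    """Return the name of the device that should carry traffic, or None.

    Assumption: a device meets the service condition when it is fault-free.
    """
    candidates = [d for d in (primary, standby) if not d.faulty]
    if not candidates:
        return None
    # Prefer the device that is already active to avoid a needless switch.
    candidates.sort(key=lambda d: not d.active)
    return candidates[0].name


print(plan_forced_switch(DeviceState("primary", True, True),
                         DeviceState("standby", False, False)))  # standby
```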
  • Case 3: The active and standby Agents detect the power-down message and confirm that the peer has powered off.
  • An Agent receives the power-down message from the other side. If the Master Agent confirms, by detecting the power-down message, that the standby microwave node device has powered off, the Master Agent takes no action. If the Slave Agent confirms, by detecting the power-down message, that the primary microwave node device has powered off, the Slave Agent performs protection switching through the high-speed communication channel and sets its own working state information to the active state.
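The power-down handling above is effectively a two-row decision table. A minimal Python sketch, with purely illustrative role and action strings, is:

```python
def on_peer_power_down(my_role):
    """Return (action, new_role) for an Agent whose peer has powered off.

    my_role is "master" or "slave"; all strings here are illustrative labels.
    """
    if my_role == "master":
        return ("none", "master")    # standby lost power: keep carrying traffic
    if my_role == "slave":
        return ("switch", "master")  # primary lost power: take over as active
    raise ValueError(f"unknown role: {my_role}")


print(on_peer_power_down("slave"))  # ('switch', 'master')
```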
  • Case 4: The active and standby Agents detect the remote alarm.
  • The Master Agent in the active state detects the remote alarm through the remote alarm communication channel.
  • On the basis of the remote alarm, the Master Agent checks whether its own device has a detectable fault. If it does, the Master Agent performs protection switching of the active and standby microwave node devices through the high-speed communication channel, that is, it enters the protection switching flow triggered by fault information. If it does not, the Master Agent checks the fault state information of the standby microwave node device: if the standby microwave node device is faulty, the Master Agent takes no action; if the standby microwave node device has no fault, the Master Agent performs protection switching through the high-speed communication channel.
  • After the protection switching, the original Slave Agent checks through the remote alarm communication channel whether the remote alarm has cleared. If it has, the original Master Agent reports an undetectable-fault alarm. If the remote alarm information persists, the peer device is faulty, and an undetectable-fault alarm must be reported for the peer device.
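The Master Agent's decision on receiving a remote alarm can be written as a small function. The sketch below covers only the decision before switching; the return strings are hypothetical labels rather than terms from the patent.

```python
def on_remote_alarm(local_detectable_fault, standby_faulty):
    """Decide what the active Master Agent does when a remote alarm arrives."""
    if local_detectable_fault:
        # The local device has a fault it can see: use the fault-triggered flow.
        return "switch via fault-information flow"
    if standby_faulty:
        # Switching to a faulty standby would not help, so do nothing.
        return "no action"
    # No visible local fault but the peer complains: try the standby device.
    return "switch"


print(on_remote_alarm(local_detectable_fault=False, standby_faulty=False))
```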
  • The flows above are relatively independent fault handling flows obtained by decomposition; each is a protection switching flow selected by a different trigger condition. They must nevertheless be handled in a unified way, otherwise repeated protection switching or failure to switch can occur.
  • This requires analyzing the fault types above, classifying and prioritizing them according to how they relate to one another, and selecting a processing flow according to the trigger condition. For example, a power failure also produces an active/standby Agent communication fault and a remote alarm fault; whenever a power-down message is being handled, those two faults are certain to be present as well. The power failure therefore has a higher priority than the active/standby Agent communication fault and the remote alarm fault.
  • The fault priorities, from highest to lowest, are: power failure, abnormal communication between the active and standby Agents, fault of the active or standby device, and remote alarm fault.
  • Through a distributed detection control strategy and a relatively independent yet unified implementation of the protection switching triggered by each fault, the embodiments of the present invention solve the prior-art problems of long service interruption caused by long switching times, service breakdown caused by not switching on currently undetectable faults, service breakdown caused by not switching when a communication link fails, and protection switching caused by false detection.
  • The protection switching action is kept fully under control, which makes protection switching more reliable and safe and reduces false switching and failure to switch; the protection switching time is also shortened and device performance improved.
  • The embodiments of the present invention can be implemented as independent modules, which makes them easy to port to other products that require 1+1 protection, reducing development cost and shortening development time. The remote alarm information also helps locate device faults, which builds experience for later development and maintenance.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Maintenance And Management Of Digital Transmission (AREA)

Abstract

The invention discloses a protection switching method and system, in order to solve the problem that the protection switching can not be achieved when the communication between a master device and a slave device is abnormal. The protection switching method includes the following steps: a master and a slave microwave node device agent center determine that one or more failures occur on the master microwave node device or the slave microwave node device according to a multi-point failure detection policy; when the communication between the master and the slave microwave node device is normal, the master and the slave microwave node device agent center perform the protection switching through a protection switching communication path; and when the communication between the master and the slave microwave node device is abnormal, the master and the slave microwave node device agent center inform the management center and the management center performs the protection switching on the master and the slave microwave node device through the protection switching communication path. With the use of the invention, the protection switching is more reliable, the occurrence of false-switching and no-switching is reduced, and the time for protection switching is reduced.

Description

Protection switching method and system

Technical Field

The present invention relates to the field of communications, and in particular to a protection switching method and system.

Background
Microwave communication technology has existed for more than half a century. It is a wireless communication method that transmits information in the microwave band over terrestrial line of sight. Microwave communication plays a pivotal role in the communications field and is a fast means of communication. Microwave equipment can be found everywhere, in mobile access networks as well as in mobile metropolitan area networks and core networks; in emergency communication in particular, microwave is irreplaceable. Digital microwave communication, optical fiber, and satellite are together known as the three pillars of modern communication transmission.
A single microwave device must work stably and reliably in order to support a stable and reliable microwave network. However, hardware and software both have their own defects, so failures are inevitable. In the related art, to improve the reliability of a single microwave device, a 1+1 protection scheme is generally chosen: one microwave node device is the primary subunit and another microwave node device is the standby subunit, and together they form a protection pair for a given transmission direction. When the primary subunit fails, traffic is switched immediately to the standby subunit, which keeps the services in that transmission direction running normally.
Common protection switching modes currently include hot standby, hot standby + space diversity, frequency diversity, and hot standby + frequency diversity. All of them aim to ensure highly reliable transmission of services, and microwave devices on the market generally support these modes. Because protection switching is a dynamic process, any problem that occurs during switching causes the switching to fail, and the outcome of the switching is then hard to control.
In practice, different methods of implementing protection switching may produce different results. In the current related art, however, protection switching depends heavily on the trigger channel between the active and standby devices. If that channel fails, that is, if communication between the active and standby devices becomes abnormal, protection switching cannot be completed and the switching method fails entirely. The resulting problems include service breakdown because a currently undetectable fault does not trigger switching, service breakdown because a communication link fault does not trigger switching, and protection switching triggered by false detection.

Summary of the Invention
The present invention provides a protection switching method and system to solve the prior-art problem that switching cannot be carried out when communication between the active and standby devices is abnormal.
The present invention provides a protection switching method in which each of the active and standby microwave node devices forming a protection pair (the primary microwave node device and the standby microwave node device) is provided with its own agent center, and the protection pair is provided with one management center. The method includes:
the active and standby microwave node device agent centers, that is, the primary microwave node device agent center and the standby microwave node device agent center, determine according to a preset multi-point fault detection policy that one or more faults have occurred in the primary microwave node device or the standby microwave node device;
when communication between the active and standby microwave node devices is normal, the active and standby microwave node device agent centers perform protection switching of the active and standby microwave node devices through a protection switching communication channel;
when communication between the active and standby microwave node devices is abnormal, the active and standby microwave node device agent centers notify the management center, and the management center performs forced protection switching on the active and standby microwave node devices through the protection switching communication channel.
The present invention also provides a protection switching system, including:
a primary microwave node device agent center, configured to determine according to a preset multi-point fault detection policy that one or more faults have occurred in the primary microwave node device or the standby microwave node device and, when communication between the active and standby microwave node devices is normal, to perform protection switching of the active and standby microwave node devices through a protection switching communication channel;
a standby microwave node device agent center, configured to determine according to the preset multi-point fault detection policy that one or more faults have occurred in the primary microwave node device or the standby microwave node device and, when communication between the active and standby microwave node devices is normal, to perform protection switching of the active and standby microwave node devices through the protection switching communication channel;
a management center, configured to receive, when communication between the active and standby microwave node devices is abnormal, the notification from the active and standby microwave node device agent centers and to perform forced protection switching on the active and standby microwave node devices through the protection switching communication channel.
The beneficial effects of the present invention are as follows:
Through a multi-point detection control strategy and a relatively independent yet unified implementation of the protection switching triggered by each fault, the embodiments of the present invention solve the prior-art problem that switching cannot be carried out when communication between the active and standby devices is abnormal. In addition, the high-speed communication channel between the active and standby devices addresses the long service interruption caused by long switching times, and the multi-point detection control strategy addresses service breakdown caused by not switching on currently undetectable faults, service breakdown caused by not switching when a communication link fails, and protection switching caused by false detection. The protection switching action is kept fully under control, which makes protection switching more reliable and safe and reduces false switching and failure to switch; the protection switching time is also shortened and device performance improved.

Brief Description of the Drawings
FIG. 1 is a schematic diagram of the management center-agent center architecture and data flow according to an embodiment of the present invention;
FIG. 2 is a flowchart of the protection switching method according to an embodiment of the present invention;
FIG. 3 is a flowchart of protection switching triggered by fault information according to an embodiment of the present invention;
FIG. 4 is a flowchart of protection switching triggered by interruption of communication between the active and standby Agents according to an embodiment of the present invention;
FIG. 5 is a flowchart of protection switching triggered by a power-down message according to an embodiment of the present invention;
FIG. 6 is a flowchart of protection switching triggered by a remote alarm according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of the protection switching system according to an embodiment of the present invention.

Detailed Description
To solve the prior-art problems of long service interruption caused by long switching times, service breakdown caused by not switching on currently undetectable faults, service breakdown caused by not switching when a communication link fails, and protection switching caused by false detection, the present invention provides a protection switching method and system.
Before the embodiments of the present invention are described, the network architecture of the embodiments is first described in detail. FIG. 1 is a schematic diagram of the management center-agent center architecture and data flow according to an embodiment of the present invention. As shown in FIG. 1, the management center (Manager)-agent center (Agent) architecture includes three modules: the management center (Manager); the active agent center, that is, the primary microwave node device agent center (Master Agent); and the standby agent center, that is, the standby microwave node device agent center (Slave Agent). It also includes six types of messages: Manager-Agent heartbeat messages (heartbeat message 101 and heartbeat message 103); Manager-Agent control messages (control message 102 and control message 104); the Agent-Agent heartbeat message (heartbeat message 105); the Agent-Agent control message (control message 106); and remote alarm (RDI) messages (remote alarm 107 and remote alarm 108).
The Manager-Agent heartbeat message contains the fault information that the Agent can detect, that is, the current fault state, and the Agent's current working state, which is either the active state or the standby state. The Manager-Agent control messages are: the forced switching message sent by the Manager to an Agent; the forced-switching-finished message sent by an Agent to the Manager; the request-monitoring-of-protection-switching message sent by an Agent to the Manager; and the cancel-monitoring-of-protection-switching message sent by an Agent to the Manager. The Agent-Agent heartbeat message contains the fault information that the Agent can detect, that is, the current fault state, and the Agent's current working state, which is either the active state or the standby state. The Agent-Agent control messages are: the request-protection-switching message sent by the Master Agent to the Slave Agent, and the protection-switching-finished message sent by the original Slave Agent to the original Master Agent. The remote alarm message is an extensible message that can indicate any fault in which a peer-end fault leads to a local-end fault; the peer end inserts the fault information into the microwave frame and feeds it back to the local end.
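The message contents listed above can be modelled with a few Python types. This is a sketch only: the class names, field names, and string values are hypothetical, and the patent does not define a wire format.

```python
from dataclasses import dataclass
from enum import Enum


class WorkState(Enum):
    ACTIVE = "active"
    STANDBY = "standby"


@dataclass(frozen=True)
class Heartbeat:
    """Content shared by the Manager-Agent and Agent-Agent heartbeats."""
    sender: str
    fault_state: frozenset  # faults the Agent can currently detect
    work_state: WorkState


class ManagerAgentControl(Enum):
    FORCED_SWITCH = "Manager -> Agent: forced switching"
    FORCED_SWITCH_DONE = "Agent -> Manager: forced switching finished"
    REQUEST_MONITORING = "Agent -> Manager: request monitoring of protection switching"
    CANCEL_MONITORING = "Agent -> Manager: cancel monitoring of protection switching"


class AgentAgentControl(Enum):
    REQUEST_SWITCH = "Master -> Slave: request protection switching"
    SWITCH_DONE = "original Slave -> original Master: protection switching finished"


hb = Heartbeat("master", frozenset({"device_fault"}), WorkState.ACTIVE)
print(hb.work_state.value, sorted(hb.fault_state))
```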
The Manager is a control and management center. Its main function is to manage heartbeat message 101 and heartbeat message 103 reported by the Master Agent and the Slave Agent. These two heartbeat messages mainly carry the state information and fault information of the Master Agent and the Slave Agent, and they are reported in two ways: triggered reporting, and periodic reporting at equal time intervals. Triggered reporting means that the information is reported as soon as it changes; its main purpose is to let the management center plan the correct transmission path when the fast channel fails. Periodic reporting is mainly used so that the management center can feed the working states of the Master Agent and the Slave Agent back to the page and, when the fast channel has a problem, monitor on the basis of the heartbeat messages.
After receiving heartbeat message 101 and heartbeat message 103 reported by the Master Agent and the Slave Agent, the Manager returns the current states of the two agent centers (Agents) to the user interface. If heartbeat message 101 or heartbeat message 103 is not received before the timeout, the Manager displays the working state of the corresponding device as not working, sets its fault state to severe fault, and reports an alarm. When the Manager receives a request-monitoring-of-protection-switching message from the Master Agent or the Slave Agent, it starts planning the optimal transmission path for the transmission unit and sends forced switching messages to the Master Agent and the Slave Agent. It then keeps monitoring the fault states of the Master Agent and the Slave Agent of that transmission unit, ensuring as far as possible that the unit can carry services normally, until it receives the cancel-monitoring-of-protection-switching message.
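The Manager's heartbeat timeout handling described above can be sketched as a small monitor class. The timeout value, the field names, and the injected clock are assumptions made for illustration; the patent gives no concrete timing.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat per Agent and flags Agents that time out."""

    def __init__(self, timeout_s, now=time.monotonic):
        self.timeout_s = timeout_s  # assumed value; not specified in the text
        self.now = now              # injectable clock, useful for testing
        self.last_seen = {}

    def record(self, agent):
        """Call whenever a heartbeat from `agent` is received."""
        self.last_seen[agent] = self.now()

    def status(self, agent):
        """Return what the Manager would show on the user interface."""
        seen = self.last_seen.get(agent)
        if seen is None or self.now() - seen > self.timeout_s:
            return {"work_state": "not working",
                    "fault_state": "severe fault",
                    "alarm": True}
        return {"work_state": "reporting",
                "fault_state": "as reported",
                "alarm": False}


monitor = HeartbeatMonitor(timeout_s=3.0)
monitor.record("master")
print(monitor.status("master")["alarm"], monitor.status("slave")["alarm"])
```

Passing the clock in as a parameter is a design choice that lets the timeout path be exercised without real waiting.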
The Agent is an agent center. Its main function is to process the fault information detected by the Agent's detection module, to communicate with the other Agent through the Agent's communication module, and to control the Agent's execution module to complete the protection switching action. The Master Agent can insert the local end's Remote Defect Indication (RDI) message into the microwave frame; an RDI message is the fault information that the peer end returns to the local end.
Note that the Manager of the embodiments of the present invention differs fundamentally from the protection switching controller of the prior art. In the prior art, the protection switching controller is the core of protection switching, whereas in the embodiments of the present invention the Manager's role in controlling protection switching has been weakened: if the active and standby Agents can communicate normally, the Manager does not take part in the protection switching flow and only monitors the heartbeat messages of the active and standby Agents. Only when it is informed that communication between the active and standby Agents is abnormal does it obtain protection switching control and carry out unified planning and control of protection switching. Likewise, the Agent of the embodiments of the present invention no longer performs just a simple proxy function: when communication between the active and standby Agents is normal, the Agent in the active state takes part in the protection switching flow.
Now that the Manager-Agent architecture and data flow of the embodiments of the present invention have been described in detail, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein merely explain the present invention and do not limit it.
Method embodiment
According to an embodiment of the present invention, a protection switching method is provided. The method is based on the Manager-Agent architecture described above: each of the active and standby microwave node devices that form a protection pair is provided with its own Agent, and each protection pair is provided with one Manager. FIG. 2 is a flowchart of a protection switching method according to an embodiment of the present invention. As shown in FIG. 2, the method includes the following processing:
Step 201: The active and standby Agents determine, according to a preset multi-point fault detection policy, that one or more faults have occurred in the active microwave node device or the standby microwave node device. The multi-point fault detection policy includes at least one of the following: detecting the heartbeat messages of the active and standby Agents, detecting power-down messages, detecting the device fault information of the active and standby Agents themselves, and detecting remote alarms. It should be noted that remote alarm detection is mainly used to indicate, through the peer end's alarm information, that the local end has a fault it cannot detect itself; it is an extensible alarm.
Specifically, each of the active and standby Agents includes a fault detection module for detecting device faults, a communication module for communication between the active and standby Agents and between the Manager and the Agents, and an execution module for completing the protection switching action and writing remote alarm information.
It should be noted that, when the active and standby Agents determine according to the preset multi-point fault detection policy that multiple faults have occurred in the active microwave node device or the standby microwave node device, the priorities of the faults need to be determined. In descending order of priority, the faults are: power-down fault, communication abnormality between the agent centers of the active and standby microwave node devices, active or standby microwave node device fault, and remote alarm fault. The corresponding protection switching operation is then performed according to the fault priority.
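The priority ordering above can be illustrated with a short sketch. This is only a minimal illustration and not the patent's implementation: the names `Fault` and `highest_priority` are hypothetical, and the only thing taken from the text is the order of the four fault types.

```python
from enum import IntEnum


class Fault(IntEnum):
    """Fault types; a lower value means a higher priority (order from the text)."""
    POWER_DOWN = 0            # power-down fault
    AGENT_COMM_ABNORMAL = 1   # communication abnormality between the Agents
    DEVICE_FAULT = 2          # active or standby device fault
    REMOTE_ALARM = 3          # remote alarm fault


def highest_priority(faults):
    """Return the single fault to act on, or None if nothing was detected."""
    return min(faults, default=None)


# A power-down is typically accompanied by the two lower-priority symptoms;
# only the power-down fault is selected for handling.
detected = {Fault.REMOTE_ALARM, Fault.POWER_DOWN, Fault.AGENT_COMM_ABNORMAL}
print(highest_priority(detected).name)
```

Selecting one fault per detection round, rather than handling each symptom in turn, is one way to avoid switching more than once for a single underlying failure.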
Step 202: When communication between the active and standby microwave node devices is normal, the active and standby Agents perform protection switching of the active and standby microwave node devices through the protection switching communication channels. The protection switching communication channels include a high-speed communication channel between the active and standby Agents, a reliable communication channel between the active and standby Agents and the Manager, and a remote alarm communication channel. Specifically, the channel between the active and standby Agents is implemented as a high-speed channel so that protection switching completes quickly and the switching time is shortened; the channel between the Agents and the Manager is implemented as a reliable channel so that protection switching is highly reliable; and the remote alarm communication channel works by inserting local state information into the microwave frame. It should be noted that when the high-speed communication channel is fault-free, protection switching information is transmitted over it. When the high-speed communication channel is faulty, the monitoring information sent by the agent centers to the management center and the forced switching information sent by the management center to the agent centers use the reliable communication channel; heartbeat messages are also transmitted over the reliable communication channel. The remote alarm communication channel carries the fault information returned by the peer end to the local end.
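The channel usage described in step 202 can be summarized as a small routing function. The sketch below is an assumption-laden illustration: the enum members and `select_channel` are hypothetical names, and returning `None` to signal "the Agents cannot switch directly, escalate to the Manager" is a modelling choice, not something the text prescribes.

```python
from enum import Enum, auto


class Channel(Enum):
    HIGH_SPEED = auto()    # between the active and standby Agents
    RELIABLE = auto()      # between the Agents and the Manager
    REMOTE_ALARM = auto()  # state information carried in the microwave frame


class Msg(Enum):
    SWITCH_INFO = auto()    # Agent-to-Agent protection switching information
    HEARTBEAT = auto()
    MONITOR_INFO = auto()   # Agent -> Manager
    FORCED_SWITCH = auto()  # Manager -> Agent
    REMOTE_FAULT = auto()   # fault information returned by the peer end


def select_channel(msg, high_speed_ok):
    """Pick the channel for a message, following the description in step 202."""
    if msg is Msg.REMOTE_FAULT:
        return Channel.REMOTE_ALARM
    if msg in (Msg.HEARTBEAT, Msg.MONITOR_INFO, Msg.FORCED_SWITCH):
        return Channel.RELIABLE
    if high_speed_ok:
        return Channel.HIGH_SPEED
    # Assumption: with the high-speed channel down, the Agents do not switch
    # directly; the caller is expected to hand control to the Manager.
    return None
```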
Step 203: When communication between the active and standby microwave node devices is abnormal, the active and standby Agents notify the Manager, and the Manager performs forced protection switching on the active and standby microwave node devices through the protection switching communication channels.
As the above processing shows, the central role of the Manager has been reduced, mainly to shorten the protection switching time. The management function is retained so that the Manager can control and complete protection switching when a communication abnormality between the active and standby Agents prevents them from completing it, which improves protection switching reliability.
The following describes in detail how protection switching or forced protection switching is performed for the different faults that can occur in the active and standby microwave node devices.
Case 1: An active or standby Agent detects a fault in its own device.
First, after the Agent's detection module detects that the device has failed, it notifies the Agent of the fault. If the faulty device is the standby microwave node device, the standby microwave node device agent center (Slave Agent) sends a fault message to the active microwave node device agent center (Master Agent), and the Master Agent modifies the fault state information of the standby microwave node device according to the fault message. It should be noted that the fault state information is modified so that it can be queried at protection switching time to decide whether to switch. If the faulty device is the active microwave node device, the Master Agent checks the fault state information of the standby microwave node device. If the standby microwave node device already has a fault, an alarm is reported. If the standby microwave node device is normal, the Master Agent sends a protection switching message to the Slave Agent through the high-speed communication channel between the two Agents, performs protection switching of the active and standby microwave node devices, and changes its own working state information to the standby state. After receiving the protection switching message, the Slave Agent performs the protection switching, changes its own working state information to the active state, and sends a protection switching completion message to the original Master Agent. If the original Master Agent does not receive the protection switching completion message, it notifies the Manager through the reliable communication channel to perform forced protection switching. Preferably, if the original Master Agent does not receive the protection switching completion message within a specified time, it retries three times; if the completion message from the Slave Agent still has not been received, it notifies the Manager to take over protection switching control, and the Manager completes the protection switching planning and decision-making.
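The retry-then-escalate behaviour at the end of Case 1 can be sketched as follows. This is an illustrative sketch only: the callback names, the timeout value, and the reading of "retries three times" as one initial attempt plus three retries are all assumptions, since the text does not fix them.

```python
def switch_with_retry(send_request, wait_for_complete, notify_manager,
                      retries=3, timeout_s=0.05):
    """Original Master Agent side of a switch request.

    send_request()          sends the protection switching message to the Slave Agent
    wait_for_complete(t)    returns True if the completion message arrives within t seconds
    notify_manager()        asks the Manager to take over (reliable channel)
    All three callables are hypothetical stand-ins for the Agent's modules.
    """
    # Assumption: one initial attempt followed by `retries` further attempts.
    for _ in range(1 + retries):
        send_request()
        if wait_for_complete(timeout_s):
            return "switched"
    # No completion message after all attempts: hand control to the Manager.
    notify_manager()
    return "escalated"
```

A caller would wire `send_request` to the high-speed channel and `notify_manager` to the reliable channel; keeping them as callables makes the escalation path easy to exercise without real hardware.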
FIG. 3 is a flowchart of protection switching triggered by fault information according to an embodiment of the present invention. As shown in FIG. 3, the process includes the following steps:
Step 301: The Agent's detection module detects a fault in the local transmission unit;
Step 302: The Agent's detection module reports the fault information to the Agent;
Step 303: The Agent determines whether its own working state is the active state. If not, step 304 is performed; otherwise, step 305 is performed;
Step 304: Being in the standby state, the Agent notifies the Master Agent of the fault message, and the operation ends;
Step 305: Being in the active state, the Agent determines whether the standby unit has a fault. If so, step 306 is performed; otherwise, step 307 is performed;
Step 306: If the standby unit has a fault, a fault alarm for this transmission unit is reported to the upper level, and the operation ends;
Step 307: If the standby unit has no fault, the Agent sends a protection switching request message to the Slave Agent, completes its own protection switching action, and changes its working state to the standby state;
Step 308: The Slave Agent receives the protection switching request message;
Step 309: The Slave Agent completes the protection switching action through its execution module and changes its working state to the active state;
Step 310: The current Master Agent returns a protection switching end message to the original Master Agent;
Step 311: The original Master Agent receives the protection switching end message, and the protection switching process ends.
Case 2: The active and standby Agents detect through heartbeat messages that communication between them is abnormal.
When an active or standby Agent detects that it cannot receive the other Agent's heartbeat message, or cannot parse a correct message, communication between the active and standby Agents is considered abnormal, and the two Agents cannot complete the protection switching action on their own. The Master Agent, which is in the active state, notifies the Manager to take over protection switching control, and the Manager completes the protection switching planning and decision-making through the reliable communication channel. The Master Agent sends a take-over-protection-switching-control message to the Manager. The Manager receives this message and determines, according to the fault state information and working state information of the active and standby microwave node devices, whether the devices meet the service transmission conditions. If the conditions are met, the Manager sends a forced protection switching message to the active and standby Agents to perform protection switching of the active and standby microwave node devices. If the conditions are not met, the Manager plans a transmission path according to the fault state information, performs forced protection switching on the active and standby Agents according to that path, receives the forced switching completion message returned by the Agents, and enters the monitoring state. When communication between the active and standby Agents returns to normal, the Manager relinquishes protection switching control, which preserves the speed of protection switching.
FIG. 4 is a flowchart of protection switching triggered by a communication interruption between the active and standby Agents according to an embodiment of the present invention. As shown in FIG. 4, the process includes the following steps:
Step 401: The communication module detects that communication between the active and standby Agents is abnormal;
Step 402: The Master Agent sends the Manager a message requesting that it monitor protection switching;
Step 403: After receiving the monitoring request message, the Manager evaluates and plans according to the working states and fault states of the active and standby Agents;
Step 404: The Manager determines whether the transmission unit meets the requirements for normal service transmission. If not, step 405 is performed; otherwise, step 408 is performed;
Step 405: The Manager plans a path that can carry the service and sends a forced protection switching message to the active and standby Agents;
Step 406: After receiving the forced protection switching message, the active and standby Agents complete the protection switching action through their execution modules;
Step 407: The active and standby Agents return a forced protection switching completion message to the Manager;
Step 408: After receiving the forced protection switching completion message, the Manager enters the monitoring state and keeps service transmission normal as far as possible. If communication between the active and standby Agents returns to normal, the Master Agent sends the Manager a message cancelling the monitoring of protection switching, and subsequent protection switching is again completed by the active and standby Agents.
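Steps 403 to 408 can be sketched as a single decision function on the Manager side. Everything concrete here is an assumption made for illustration: the `UnitStatus` record, the rule that service is "normal" when exactly one unit is active and that unit is healthy, the choice of the first healthy unit as the planned path, and the alarm raised when no healthy unit exists (a case the text does not discuss).

```python
from dataclasses import dataclass


@dataclass
class UnitStatus:
    name: str
    active: bool   # working state reported by the unit's Agent
    faulty: bool   # fault state reported by the unit's Agent


def manager_takeover(units):
    """Return the list of actions the Manager takes after a takeover request."""
    actions = []
    active = [u for u in units if u.active]
    # Step 404 (assumed criterion): one active unit, and it is healthy.
    service_ok = len(active) == 1 and not active[0].faulty
    if not service_ok:
        healthy = [u for u in units if not u.faulty]
        if healthy:
            # Steps 405-407: plan a path and force the switch.
            target = healthy[0]
            for u in units:
                u.active = u is target
            actions.append(f"forced switch: {target.name} active")
        else:
            # Not covered by the text; added so the sketch is total.
            actions.append("no healthy unit: raise alarm")
    # Step 408: the Manager keeps monitoring until Agent communication recovers.
    actions.append("monitoring")
    return actions
```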
Case 3: An active or standby Agent confirms, by detecting a power-down message, that the peer has powered down.
An Agent receives the peer's power-down message. If the Master Agent confirms through the power-down message that the standby microwave node device has powered down, the Master Agent performs no operation. If the Slave Agent confirms through the power-down message that the active microwave node device has powered down, the Slave Agent performs protection switching through the high-speed communication channel and sets its own working state information to the active state.
FIG. 5 is a flowchart of protection switching triggered by a power-down message according to an embodiment of the present invention. As shown in FIG. 5, the process includes the following steps:
Step 501: An Agent detects that the peer Agent has powered down, or receives the peer's power-down message;
Step 502: The Agent determines whether its own working state is the active state. If so, the operation ends; otherwise, step 503 is performed;
Step 503: The Agent's execution module completes the protection switching action, and the working state is set to the active state.
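The three steps of FIG. 5 map onto a very small handler. The sketch below is illustrative: `Role`, `Agent`, and `on_peer_power_down` are hypothetical names, and the execution module is reduced to a counter so the behaviour can be checked.

```python
from enum import Enum


class Role(Enum):
    ACTIVE = "active"
    STANDBY = "standby"


class Agent:
    def __init__(self, role):
        self.role = role
        self.switch_count = 0  # stands in for the execution module's work

    def _execute_switch(self):
        # Placeholder for the real switching action performed by the execution module.
        self.switch_count += 1

    def on_peer_power_down(self):
        """Steps 501-503: return True if a protection switch was performed."""
        if self.role is Role.ACTIVE:
            # Step 502: the active side has nothing to do when the standby powers down.
            return False
        # Step 503: the standby side takes over and becomes active.
        self._execute_switch()
        self.role = Role.ACTIVE
        return True
```

Calling the handler a second time on the same Agent is a no-op, because the Agent is already active; this matches the intent that a power-down triggers at most one switch.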
Case 4: The active and standby Agents detect a remote alarm.
The Master Agent, which is in the active state, detects a remote alarm through the remote alarm communication channel and then checks whether its own device has a fault. If so, that is, if a detectable fault is found in its own device, the Master Agent performs protection switching of the active and standby microwave node devices through the high-speed communication channel, entering the protection switching process triggered by fault information. If not, the Master Agent checks the fault state information of the standby microwave node device. If the standby microwave node device has a fault, the Master Agent performs no operation; if the standby microwave node device has no fault, the Master Agent performs protection switching through the high-speed communication channel. After the protection switching, the original Slave Agent checks through the remote alarm communication channel whether a remote alarm is still present. If so, alarm information indicating that the original Master Agent has an undetectable fault is reported. If the remote alarm still persists, this indicates that the peer device has failed, and alarm information indicating an undetectable fault in the peer device must be reported.
FIG. 6 is a flowchart of protection switching triggered by a remote alarm according to an embodiment of the present invention. As shown in FIG. 6, the process includes the following steps:
Step 601: The Master Agent detects remote alarm information;
Step 602: The Master Agent first checks whether it has a fault itself. If so, step 603 is performed; otherwise, step 604 is performed;
Step 603: If the Master Agent has a fault, the protection switching process triggered by fault information is entered;
Step 604: If there is no fault, the Master Agent checks whether fault information exists for the standby unit. If so, step 605 is performed; otherwise, step 606 is performed;
Step 605: ... alarm information;
Step 606: If the standby unit has no fault, the Master Agent sends a protection switching request message to the Slave Agent;
Step 607: After receiving the protection switching request message, the Slave Agent completes the protection switching action and returns a protection switching completion message;
Step 608: It is checked whether a remote alarm is still present. If so, step 609 is performed; otherwise, step 605 is performed;
Step 609: When an undetectable fault exists at the local transmitting end or the peer receiving end, alarm information indicating an undetectable fault at the local transmitting end or the peer receiving end is reported.
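The branching in the remote-alarm flow can be sketched as below. This follows the prose description of Case 4 rather than the flowchart wherever the two differ, because the text of step 605 does not survive in this copy; the branch that leads to step 605 is therefore modelled as "no further action", which is an assumption. The function name and the action strings are hypothetical.

```python
def handle_remote_alarm(self_faulty, standby_faulty, alarm_after_switch):
    """Master Agent handling of a remote alarm; returns the actions taken.

    self_faulty         True if the Master Agent finds a detectable local fault
    standby_faulty      True if the standby unit's fault state shows a fault
    alarm_after_switch  callable, queried only after a switch, returning True
                        if the remote alarm is still present
    """
    if self_faulty:
        # Step 603: hand over to the fault-triggered switching flow.
        return ["enter fault-triggered switching flow"]
    if standby_faulty:
        # Per the Case 4 description the Master Agent does not switch here.
        # Assumption: nothing else is done (step 605 is unreadable).
        return []
    # Steps 606-607: switch to the standby unit.
    actions = ["request switch to standby"]
    # Steps 608-609: a persisting alarm points to a fault that cannot be detected locally.
    if alarm_after_switch():
        actions.append("report undetectable fault alarm")
    return actions
```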
The processing flows above are relatively independent fault handling flows obtained by decomposition; each is a protection switching flow selected by a different trigger condition. In practice they must be handled in a unified way, otherwise repeated switching or failure to switch can occur. To avoid this, the fault types are analyzed in the processing flow, classified and queued by priority according to how the faults are related, and a processing flow is selected according to the trigger condition. For example, a power-down fault is always accompanied by a communication abnormality between the active and standby Agents and by a remote alarm fault. If these two faults were handled after the power-down message had been handled, protection switching would flip back and forth. The power-down fault therefore has a higher priority than the Agent communication abnormality and the remote alarm fault, and in this situation only the power-down fault needs to be handled. In the embodiment of the present invention, the fault priorities in descending order are: power-down fault, communication abnormality between the active and standby Agents, active or standby device fault, and remote alarm fault.
As the above processing shows, the protection switching flow of the embodiment of the present invention uses a multi-point protection switching strategy that spreads the risk of protection switching. While satisfying the requirement for highly reliable protection switching, it makes switching as fast as possible and shortens the protection switching time. Because the protection switching flow is distributed, each kind of information is handled by a different flow, yet the flows can still be managed in a unified way.
In addition, the embodiment of the present invention can be built as an independent module that is easy to port to other products requiring 1+1 protection, which reduces development cost and shortens development time. The remote alarm information also helps to discover and locate device faults, building experience for later development and maintenance.
Device embodiment
According to an embodiment of the present invention, a protection switching system is provided. FIG. 7 is a schematic structural diagram of a protection switching system according to an embodiment of the present invention. As shown in FIG. 7, the system includes an active microwave node device agent center (Master Agent) 70, a standby microwave node device agent center (Slave Agent) 72, and a management center (Manager) 74. Each module of the embodiment of the present invention is described in detail below.
Specifically, the Master Agent 70 is configured to determine, according to a preset multi-point fault detection policy, that one or more faults have occurred in the active microwave node device or the standby microwave node device and, when communication between the active and standby microwave node devices is normal, to perform protection switching of the active and standby microwave node devices through the protection switching communication channels. The multi-point fault detection policy includes at least one of the following: detecting the heartbeat messages of the active and standby Agents, detecting power-down messages, detecting the device fault information of the active and standby Agents themselves, and detecting remote alarms. It should be noted that remote alarm detection is mainly used to indicate, through the peer end's alarm information, that the local end has a fault it cannot detect itself; it is an extensible alarm.
The protection switching communication channels include a high-speed communication channel between the active and standby Agents, a reliable communication channel between the active and standby Agents and the Manager, and a remote alarm communication channel. Specifically, the channel between the active and standby Agents is implemented as a high-speed channel so that protection switching completes quickly and the switching time is shortened; the channel between the Agents and the Manager is implemented as a reliable channel so that protection switching is highly reliable; and the remote alarm communication channel works by inserting local state information into the microwave frame. It should be noted that when the high-speed communication channel is fault-free, protection switching information is transmitted over it. When the high-speed communication channel is faulty, the monitoring information sent by the agent centers to the management center and the forced switching information sent by the management center to the agent centers use the reliable communication channel; heartbeat messages are also transmitted over the reliable communication channel. The remote alarm communication channel carries the fault information returned by the peer end to the local end.
The Slave Agent 72 is configured to determine, according to the preset multi-point fault detection policy, that one or more faults have occurred in the active microwave node device or the standby microwave node device and, when communication between the active and standby microwave node devices is normal, to perform protection switching of the active and standby microwave node devices through the protection switching communication channels.
Specifically, each of the active and standby Agents includes a fault detection module for detecting device faults, a communication module for communication between the active and standby Agents and between the Manager and the Agents, and an execution module for completing the protection switching action and writing remote alarm information.
It should be noted that, when the active and standby Agents determine according to the preset multi-point fault detection policy that multiple faults have occurred in the active microwave node device or the standby microwave node device, the priorities of the faults need to be determined. In descending order of priority, the faults are: power-down fault, communication abnormality between the agent centers of the active and standby microwave node devices, active or standby microwave node device fault, and remote alarm fault. The corresponding protection switching operation is then performed according to the fault priority.
The Manager 74 is configured to perform forced protection switching on the active and standby microwave node devices through the protection switching communication channels when communication between the active and standby microwave node devices is abnormal.
As the above processing shows, the central role of the Manager has been reduced, mainly to shorten the protection switching time. The management function is retained so that the Manager can control and complete protection switching when a communication abnormality between the active and standby Agents prevents them from completing it, which improves protection switching reliability.
The following describes in detail how protection switching or forced protection switching is performed for the different faults that can occur in the active and standby microwave node devices.
Case 1: An active or standby Agent detects a fault in its own device.
First, when an Agent's detection module detects a fault in the local device, it notifies the Agent of the fault. If the faulty device is the standby microwave node device, the standby microwave node device agent center (Slave Agent) sends a fault status message to the active microwave node device agent center (Master Agent), and the Master Agent updates the fault status information of the standby microwave node device according to that message. Note that the fault status information is updated so that it can be queried at protection switching time to decide whether to switch. If the faulty device is the active microwave node device, the Master Agent checks the fault status information of the standby microwave node device. If the standby microwave node device already has a fault, the Master Agent reports an alarm. If the standby microwave node device is normal, the Master Agent sends a protection switching message to the Slave Agent over the high-speed communication channel between the two Agents, performs protection switching of the active and standby microwave node devices, and changes its own working status information to the standby state. After receiving the protection switching message, the Slave Agent performs protection switching of the active and standby microwave node devices, changes its own working status information to the active state, and sends a protection switching completion message to the original Master Agent. If the original Master Agent does not receive the protection switching completion message, it notifies the Manager over the reliable communication channel to perform a forced protection switching. Preferably, if the original Master Agent does not receive the protection switching completion message within the specified time, it retries three times; if the completion message from the Slave Agent has still not arrived, it notifies the Manager to take over protection switching control, and the Manager completes the planning and decision-making for the protection switching.
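The Master Agent side of Case 1, for a fault in the active device, can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the function name, the callback parameters, and the returned labels are hypothetical, and "retries three times" is assumed here to mean three retries in addition to the first attempt.

```python
from typing import Callable

# Assumption: three retries on top of the initial attempt.
MAX_RETRIES = 3


def handle_active_fault(
    standby_faulty: bool,
    send_switch_and_wait: Callable[[], bool],
    notify_manager: Callable[[], None],
) -> str:
    """Decide what the Master Agent does when the active device has failed.

    send_switch_and_wait (hypothetical) sends the protection switching message
    over the high-speed channel and returns True if the completion message
    arrives within the specified time.
    notify_manager (hypothetical) asks the Manager, over the reliable channel,
    to take over protection switching control.
    """
    if standby_faulty:
        # Switching to a faulty standby device would not help: report an alarm.
        return "alarm"
    for _ in range(1 + MAX_RETRIES):
        if send_switch_and_wait():
            return "switched"
    # No completion message after all attempts: escalate to the Manager.
    notify_manager()
    return "escalated"
```

Passing the channel operations in as callbacks keeps the decision logic separate from the transport, which is one way to make the flow testable without real hardware.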
Case 2: through heartbeat messages, the master and slave Agents detect a communication anomaly between them. When either Agent detects that it cannot receive the peer Agent's heartbeat messages, or cannot parse a valid message from them, communication between the master and slave Agents is considered abnormal, and the two Agents alone can no longer complete the protection switching. The Master Agent in the active state notifies the Manager to take over protection switching control, and the Manager completes the planning and decision-making for the protection switching over the reliable communication channel. Specifically, the Master Agent sends a take-over-protection-switching-control message to the Manager. The Manager receives this message and, based on the fault status information and working status information of the active and standby microwave node devices, determines whether the devices satisfy the service transmission condition. If the condition is satisfied, the Manager sends a forced protection switching message to the master and slave Agents, and protection switching of the active and standby microwave node devices is performed. If the condition is not satisfied, the Manager plans a transmission path according to the fault status information, performs forced protection switching on the master and slave Agents according to that path, receives the forced switching completion messages returned by the Agents, and enters the monitoring state. Once communication between the master and slave Agents returns to normal, the Manager relinquishes protection switching control, which keeps protection switching fast.
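The heartbeat check that triggers Case 2 can be illustrated with a small monitor. This is a minimal sketch under stated assumptions: the text specifies neither the heartbeat format nor the timeout, so the `HEARTBEAT` payload, the `timeout_s` parameter, and the class name are hypothetical.

```python
# Hypothetical heartbeat payload; the source does not define the message format.
HEARTBEAT = b"HB"


class HeartbeatMonitor:
    """Tracks the last valid heartbeat received from the peer Agent."""

    def __init__(self, timeout_s: float, start: float = 0.0) -> None:
        self.timeout_s = timeout_s
        self.last_valid = start

    def on_message(self, raw: bytes, now: float) -> bool:
        """Record a message; return True only if it parses as a heartbeat."""
        if raw == HEARTBEAT:
            self.last_valid = now
            return True
        # An unparseable message does not refresh the timer.
        return False

    def is_abnormal(self, now: float) -> bool:
        """True when no valid heartbeat has arrived within the timeout."""
        return now - self.last_valid > self.timeout_s
```

Treating a malformed message the same as a missing one mirrors the text, where either condition marks the Agent-to-Agent communication as abnormal and leads the Master Agent to hand control to the Manager.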
Case 3: the master or slave Agent confirms, by detecting a power-down message, that the peer has lost power.
An Agent receives the peer's power-down message. If the Master Agent confirms from the power-down message that the standby microwave node device has lost power, the Master Agent takes no action. If the Slave Agent confirms from the power-down message that the active microwave node device has lost power, the Slave Agent performs protection switching over the high-speed communication channel and sets its own working status information to the active state.
Case 4: the master or slave Agent detects a remote alarm.
The Master Agent in the active state detects a remote alarm through the remote alarm communication channel and, in response, checks whether its own device has a fault. If so, that is, if a detectable fault is found in its own device, the Master Agent performs protection switching of the active and standby microwave node devices over the high-speed communication channel, entering the protection switching flow triggered by fault information. If not, the Master Agent checks the fault status information of the standby microwave node device: if the standby microwave node device has a fault, the Master Agent takes no action; if the standby microwave node device has no fault, the Master Agent performs protection switching over the high-speed communication channel. After the protection switching, the original Slave Agent checks through the remote alarm communication channel whether a remote alarm is still present; if so, it reports alarm information indicating that the original Master Agent's device has an undetectable fault. If the remote alarm continues to persist, this indicates that the peer device has a fault, and alarm information indicating an undetectable fault in the peer device must be reported.
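The Master Agent's decision on receiving a remote alarm reduces to two checks, which can be sketched as below. The function name and the returned labels are hypothetical, and the sketch deliberately covers only the decision to switch, not the alarm reporting that follows the switch.

```python
def on_remote_alarm(own_fault_detected: bool, standby_faulty: bool) -> str:
    """Return the Master Agent's action after a remote alarm (labels are illustrative)."""
    if own_fault_detected:
        # A detectable local fault: enter the fault-triggered switching flow.
        return "switch_on_own_fault"
    if standby_faulty:
        # The standby device cannot take over, so the Master Agent does nothing.
        return "no_action"
    # No local fault found and the standby device is healthy: switch anyway,
    # since the alarm may stem from a fault the local detection cannot see.
    return "switch_to_standby"
```

For example, `on_remote_alarm(False, True)` yields `"no_action"`, matching the branch in which the standby microwave node device already has a fault.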
The flows above are decomposed, relatively independent fault handling flows, each a protection switching flow selected by a different trigger condition. During processing, however, they must be handled in a unified way; otherwise protection switching may be repeated or may fail to occur. To avoid this, the processing flow must analyze the fault types above, classify and queue them by priority according to how the faults are related, and select the handling flow according to the trigger condition. For example, a power-down fault is always accompanied by a master/slave Agent communication anomaly and a remote alarm fault. If those two faults were handled after the power-down message had been processed, protection switching would flip back and forth. The power-down fault therefore has higher priority than the Agent communication anomaly and the remote alarm fault, and in this situation only the power-down fault needs to be handled. In this embodiment of the invention, the fault priorities from high to low are: power-down fault, master/slave Agent communication anomaly, active/standby device fault, and remote alarm fault.
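The priority ordering above can be expressed as an ordered enumeration from which only the highest-priority pending fault is handled. This is a sketch under assumptions: the enum member names and the `select_fault` helper are hypothetical, and lower numeric values are taken to mean higher priority.

```python
from enum import IntEnum
from typing import Iterable, Optional


class Fault(IntEnum):
    """Fault types in the order given in the text; lower value = higher priority."""
    POWER_DOWN = 0
    AGENT_COMM_ANOMALY = 1
    DEVICE_FAULT = 2
    REMOTE_ALARM = 3


def select_fault(pending: Iterable[Fault]) -> Optional[Fault]:
    """Return the single fault to handle, or None if nothing is pending."""
    return min(pending, default=None)
```

With a power-down event that also raises a communication anomaly and a remote alarm, `select_fault` returns `Fault.POWER_DOWN`, so the two secondary faults are not handled separately and do not cause a second switch.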
In summary, through a distributed detection and control strategy and an implementation in which the protection switching triggered by each fault is relatively independent yet unified, the embodiments of the present invention solve the following problems of the prior art: long switching times that cause long service interruptions; faults that go undetected, so that no switching occurs and service is paralyzed; communication link failures for which no switching occurs and service is paralyzed; and false detections that trigger protection switching. The protection switching action is kept fully under control, which makes protection switching more reliable and safe and reduces erroneous switching and failures to switch. The protection switching time is also shortened, and device performance is improved.
In addition, the embodiments of the present invention can be built as an independent module that is easy to port to other products requiring 1+1 protection, which reduces development cost and shortens development time. The remote alarm information also helps to discover and locate device faults, building experience for later development and maintenance.
Although preferred embodiments of the present invention have been disclosed for illustrative purposes, those skilled in the art will appreciate that various modifications, additions, and substitutions are possible. The scope of the present invention should therefore not be limited to the embodiments described above.

Claims

1. A protection switching method, characterized in that each of the active and standby microwave node devices forming a protection pair is provided with its own agent center, and the protection pair is provided with one management center, the method comprising:
the active and standby microwave node device agent centers determining, according to a preset multi-point fault detection policy, that one or more faults have occurred in the active microwave node device or the standby microwave node device;
when communication between the active and standby microwave node devices is normal, the active and standby microwave node device agent centers performing protection switching of the active and standby microwave node devices through a protection switching communication channel; and
when communication between the active and standby microwave node devices is abnormal, the active and standby microwave node device agent centers notifying the management center, and the management center performing forced protection switching on the active and standby microwave node devices through the protection switching communication channel.
2. The method according to claim 1, characterized in that the multi-point fault detection policy comprises at least one of: detecting heartbeat messages of the active and standby microwave node device agent centers, detecting power-down messages, detecting fault information of the active and standby microwave node device agent centers' own devices, and detecting remote alarms.
3. The method according to claim 2, characterized in that the protection switching communication channel comprises: a high-speed communication channel between the active and standby microwave node device agent centers, a reliable communication channel between the active and standby microwave node device agent centers and the management center, and a remote alarm communication channel.
4. The method according to claim 1, characterized in that, when the active and standby microwave node device agent centers determine, according to the preset multi-point fault detection policy, that multiple faults have occurred in the active microwave node device or the standby microwave node device, the method further comprises:
determining the priorities of the multiple faults, wherein the priorities from high to low are: power-down fault, communication anomaly between the active and standby microwave node device agent centers, active/standby microwave node device fault, and remote alarm fault; and performing the corresponding protection switching operation according to the fault priorities.
5. The method according to any one of claims 3 to 4, characterized in that, when an active or standby microwave node device agent center detects a fault in its own device, the active and standby microwave node device agent centers perform protection switching of the active and standby microwave node devices through the high-speed communication channel.
6. The method according to claim 5, characterized in that the active and standby microwave node device agent centers performing protection switching of the active and standby microwave node devices through the high-speed communication channel comprises:
when the faulty device is the standby microwave node device, the standby microwave node device agent center sending a fault message to the active microwave node device agent center, and the active microwave node device agent center updating the fault status information of the standby microwave node device according to the fault message;
when the faulty device is the active microwave node device, the active microwave node device agent center checking the fault status information of the standby microwave node device; if the standby microwave node device already has a fault, reporting an alarm; if the standby microwave node device is normal, sending a protection switching message to the standby microwave node device agent center, performing protection switching of the active and standby microwave node devices, and changing its own working status information to the standby state;
the standby microwave node device agent center, after receiving the protection switching message, performing protection switching of the active and standby microwave node devices, changing its own working status information to the active state, and sending a protection switching completion message to the original active microwave node device agent center; and
when the original active microwave node device agent center does not receive the protection switching completion message, the original active microwave node device agent center notifying the management center to perform forced protection switching.
7. The method according to any one of claims 3 to 4, characterized in that, when the active and standby microwave node device agent centers detect through heartbeat messages that a communication anomaly has occurred between them, the active and standby microwave node device agent centers notify the management center, and the management center performs forced protection switching on the active and standby microwave node devices through the reliable communication channel.
8. The method according to claim 7, characterized in that the management center performing forced protection switching on the active and standby microwave node devices through the reliable communication channel comprises:
the active microwave node device agent center sending a take-over-protection-switching-control message to the management center;
the management center receiving the take-over-protection-switching-control message and determining, according to the fault status information and working status information of the active and standby microwave node devices, whether the active and standby microwave node devices satisfy a service transmission condition;
when it is determined that the service transmission condition is satisfied, the management center sending a forced protection switching message to the active and standby microwave node device agent centers, and protection switching of the active and standby microwave node devices being performed;
when it is determined that the service transmission condition is not satisfied, the management center planning a transmission path according to the fault status information, performing forced protection switching on the active and standby microwave node device agent centers according to the transmission path, receiving the forced switching completion messages returned by the active and standby microwave node device agent centers, and entering a monitoring state; and
when communication between the active and standby microwave node device agent centers returns to normal, the management center relinquishing protection switching control.
9. The method according to any one of claims 3 to 4, characterized in that, when an active or standby microwave node device agent center confirms, by detecting a power-down message, that the peer has lost power, the active and standby microwave node device agent centers perform protection switching of the active and standby microwave node devices through the high-speed communication channel.
10. The method according to claim 9, characterized in that, when an active or standby microwave node device agent center confirms, by detecting a power-down message, that the peer has lost power, the active and standby microwave node device agent centers performing protection switching of the active and standby microwave node devices through the high-speed communication channel comprises:
if the active microwave node device agent center confirms, by detecting the power-down message, that the standby microwave node device has lost power, the active microwave node device agent center taking no action; and
if the standby microwave node device agent center confirms, by detecting the power-down message, that the active microwave node device has lost power, the standby microwave node device agent center performing protection switching and setting its own working status information to the active state.
11. The method according to any one of claims 3 to 4, characterized in that, when the active and standby microwave node device agent centers detect a remote alarm through the remote alarm communication channel, the active and standby microwave node device agent centers perform protection switching of the active and standby microwave node devices through the high-speed communication channel.
12. The method according to claim 11, characterized in that, when the active and standby microwave node device agent centers detect a remote alarm through the remote alarm communication channel, the active and standby microwave node device agent centers performing protection switching of the active and standby microwave node devices through the high-speed communication channel comprises:
the active microwave node device agent center checking, in response to the remote alarm, whether its own device has a fault; if so, the active microwave node device agent center performing protection switching of the active and standby microwave node devices; if not, the active microwave node device agent center checking the fault status information of the standby microwave node device; and
if it is determined that the standby microwave node device has a fault, the active microwave node device agent center taking no action; if it is determined that the standby microwave node device has no fault, the active microwave node device agent center performing protection switching.
13. The method according to claim 12, characterized in that, after the protection switching, the original standby microwave node device agent center checks whether the remote alarm is still present and, if so, reports alarm information indicating that the original active microwave node device has an undetectable fault.
14. A protection switching system, characterized by comprising:
an active microwave node device agent center, configured to determine, according to a preset multi-point fault detection policy, that one or more faults have occurred in the active microwave node device or the standby microwave node device, and, when communication between the active and standby microwave node devices is normal, to perform protection switching of the active and standby microwave node devices through a protection switching communication channel;
a standby microwave node device agent center, configured to determine, according to the preset multi-point fault detection policy, that one or more faults have occurred in the active microwave node device or the standby microwave node device, and, when communication between the active and standby microwave node devices is normal, to perform protection switching of the active and standby microwave node devices through the protection switching communication channel; and
a management center, configured to receive, when communication between the active and standby microwave node devices is abnormal, a notification from the active and standby microwave node device agent centers, and to perform forced protection switching on the active and standby microwave node devices through the protection switching communication channel.
15. The system according to claim 14, characterized in that:
the multi-point fault detection policy comprises at least one of: detecting heartbeat messages of the active and standby microwave node device agent centers, detecting power-down messages, detecting fault information of the active and standby microwave node device agent centers' own devices, and detecting remote alarms; and
the protection switching communication channel comprises: a high-speed communication channel between the active and standby microwave node device agent centers, a reliable communication channel between the active and standby microwave node device agent centers and the management center, and a remote alarm communication channel.
PCT/CN2010/079022 2010-07-21 2010-11-23 Protection switching method and system WO2012009914A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201010232516.0 2010-07-21
CN201010232516.0A CN102340407B (en) 2010-07-21 2010-07-21 Protection switching method and system

Publications (1)

Publication Number Publication Date
WO2012009914A1 true WO2012009914A1 (en) 2012-01-26

Family

ID=45496465

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2010/079022 WO2012009914A1 (en) 2010-07-21 2010-11-23 Protection switching method and system

Country Status (2)

Country Link
CN (1) CN102340407B (en)
WO (1) WO2012009914A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11249860B2 (en) 2017-11-21 2022-02-15 Beijing Kingsoft Cloud Network Technology, Co., Ltd. Node down recovery method and apparatus, electronic device, and storage medium

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103905114B (en) * 2012-12-25 2017-02-22 中国移动通信集团广西有限公司 Optical cable line failure point locating method, device and system
CN107688547B (en) * 2017-08-23 2020-06-16 苏州浪潮智能科技有限公司 Method and system for switching between main controller and standby controller

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040001449A1 (en) * 2002-06-28 2004-01-01 Rostron Andy E. System and method for supporting automatic protection switching between multiple node pairs using common agent architecture
CN1889373A (en) * 2005-06-30 2007-01-03 华为技术有限公司 Method for realizing master and spare conversion of distributing connection equipment
CN101237315A (en) * 2008-02-28 2008-08-06 浪潮电子信息产业股份有限公司 A synchronous detection and failure separation method for dual control high-availability system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1109416C (en) * 2000-04-25 2003-05-21 华为技术有限公司 Method and equipment for swapping active with standby switches
CN1251419C (en) * 2002-05-30 2006-04-12 华为技术有限公司 Method for realization of fast rearranging main spared device in communication devices

Also Published As

Publication number Publication date
CN102340407A (en) 2012-02-01
CN102340407B (en) 2015-07-22

Similar Documents

Publication Publication Date Title
US9237092B2 (en) Method, apparatus, and system for updating ring network topology information
KR101204130B1 (en) Fault processing method, system and exchanging device based on industry ethernet network
JP5513342B2 (en) Packet relay device
WO2008014639A1 (en) A distributed master and standby managing method and system based on the network element
WO2011143876A1 (en) Master/backup switching method and device for service nodes
CN105577444B (en) A kind of wireless controller management method and wireless controller
CN101267392B (en) A realizing method for notifying downstream device in case of switch of uplink link status
WO2015196676A1 (en) Networking protection method and device, and main convergence network element in networking
WO2011157149A2 (en) Method, communication device and system, and service request device for main/standby switch between communication devices
US20150288600A1 (en) Cross-device linear multiplex section protection method, gateway and controller
WO2014036724A1 (en) Fault recovery method of operation and maintenance channel and network management terminal
WO2013049981A1 (en) Hybrid ring network protection method and system based on shared path
WO2010121459A1 (en) Method and system for implementing protection and recovery in automatically switching optical network
EP2892180B1 (en) Service traffic protection method and apparatus
WO2012009914A1 (en) Protection switching method and system
WO2011150780A1 (en) Method for triggering route switching and service provider-end provider edge device
WO2009036676A1 (en) State transition method and network node equipment
CN101860888A (en) Method, system and equipment for transmitting data by wireless link
CN115408199A (en) Disaster tolerance processing method and device for edge computing node
CN102638369A (en) Method, device and system for arbitrating main/standby switch
WO2020124445A1 (en) Method for processing network anomaly, communication system, related processing unit
JP5786055B2 (en) Packet relay device
US20220141123A1 (en) Network device, network system, network connection method, and program
JP5475706B2 (en) Monitoring device, communication device, and network monitoring method
WO2021240629A1 (en) Communication system, communication path monitoring method, communication device, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10854944

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10854944

Country of ref document: EP

Kind code of ref document: A1