WO2016070530A1 - Method and system for processing operation of primary and standby devices - Google Patents

Info

Publication number
WO2016070530A1
Authority
WO
WIPO (PCT)
Prior art keywords
standby
primary
primary device
link
standby device
Prior art date
Application number
PCT/CN2015/073275
Other languages
English (en)
Chinese (zh)
Inventor
杨青海
毕忠良
杨骐
尹旺中
陈宗立
朱田
Original Assignee
中兴通讯股份有限公司
Priority date
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2016070530A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0663 Performing the actions predefined by failover planning, e.g. switching to standby network elements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0811 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking connectivity

Definitions

  • The present invention relates to the field of communications, and in particular to a method and system for processing the operation of active/standby devices.
  • The general bearer is usually configured as a primary device and a standby device.
  • The primary device performs the related functions, and the standby device exists as a backup of the primary device.
  • When the primary device goes down, the standby device is promoted to primary and takes over the work of the original primary device, keeping the service uninterrupted; when the standby device goes down, the primary device selects a new standby device.
  • Heartbeat messages are generally used for keep-alive between the primary device and the standby device of a cluster system.
  • When keep-alive between the primary device and the standby device fails, it cannot be determined whether the cause is a node fault or a link fault. As a result, the primary device and the standby device may evolve along the wrong path: if the peer's heartbeat packet is not received within the threshold, each side assumes that the other is abnormal, and the standby device is promoted to primary or a new standby device is selected.
  • The system may thus evolve into a "dual master" state (that is, the standby device has switched to primary while the original primary device is still running).
  • Recovery is generally performed by restarting the system, which reduces the stability of the system and degrades the user experience.
  • For the problem in the related art that, after the primary device and the standby device are disconnected, the standby device converts directly to the primary device, so that two primary devices operate in the system, reducing the stability of the system and degrading the user experience, no effective solution has yet been proposed.
  • To address this problem, the present invention provides a method and system for processing the operation of active/standby devices.
  • A method for processing the operation of active/standby devices includes: after determining that the primary device and the first standby device are disconnected, the primary device detects link connectivity between the primary device and other devices, and the first standby device detects link connectivity between the first standby device and other devices, where the other devices are the devices in the cluster system in which the primary device and the first standby device are located, other than the primary device and the standby device; the primary device processes the operation of the primary device and/or the first standby device according to the link connectivity detection result, and the first standby device processes the operation of the primary device and/or the first standby device according to the link connectivity detection result.
  • The primary device processing the operation of the primary device and/or the first standby device according to the link connectivity detection result includes: when the primary device detects that the link is connected, replacing the first standby device with a second standby device; when the primary device detects that the link is not connected, prohibiting the primary device from running for a second predetermined time period.
  • The first standby device processing the operation of the primary device and the first standby device according to the link connectivity detection result includes: when the first standby device detects that the link is connected, determining whether the primary device is running; when the primary device is not running, using the first standby device as the primary device; when the primary device is running, prohibiting the first standby device from running for a third predetermined time period.
  • Whether the primary device is running is determined in at least one of the following ways: notification by a third party other than the primary device and the first standby device; detection, by the first standby device, of specified information on the message transmission channel of the forwarding plane.
  • The first standby device processing the operation of the primary device and the first standby device according to the link connectivity detection result also includes: when the first standby device detects that the link is not connected, prohibiting the first standby device from running for a first predetermined time period.
  • Determining that the primary device and the first standby device are disconnected includes: when the primary device and/or the first standby device does not receive a keep-alive message, determining that the primary device and the first standby device are disconnected.
  • An operation processing system for active/standby devices includes: a primary device, configured to detect link connectivity between the primary device and other devices after determining that the primary device and the first standby device are disconnected, and to process the operation of the primary device and/or the first standby device according to the link connectivity detection result, where the other devices are the devices in the cluster system in which the primary device and the first standby device are located, other than the primary device and the standby device; and the first standby device, configured to detect link connectivity between the first standby device and the other devices while the primary device detects link connectivity between the primary device and the other devices, and to process the operation of the primary device and/or the first standby device according to the link connectivity detection result.
  • The primary device is further configured to: when the primary device detects that the link is connected, replace the first standby device with a second standby device; and when the primary device detects that the link is not connected, prohibit the primary device from running for the second predetermined time period.
  • The first standby device is further configured to: when the first standby device detects that the link is connected, determine whether the primary device is running; and when the primary device is not running, act as the primary device.
  • With the technical solution in which the primary device and the standby device each detect their own link connectivity at the same time and then process the operation of the primary device and/or the standby device according to the detected link connectivity, the present invention solves the problem in the related art that, after the primary device and the standby device are disconnected, the standby device converts directly to the primary device because a node failure cannot be distinguished from a link failure, so that two primary devices operate in the system, reducing the stability of the system and degrading the user experience. The solution thereby enhances the stability of the system and improves the user experience.
  • FIG. 1 is a flowchart of a method for processing an active/standby device according to an embodiment of the present invention
  • FIG. 2 is a block diagram showing the structure of a cluster system in accordance with a preferred embodiment of the present invention
  • FIG. 3 is a schematic diagram of processing after a master/slave device detects link connectivity according to a preferred embodiment of the present invention
  • FIG. 4 is a structural block diagram of an operation processing system of a master and backup device according to an embodiment of the present invention
  • FIG. 5 is another structural block diagram of an operation processing system of a master and backup device according to an embodiment of the present invention.
  • In the related art, when keep-alive between the primary device and the standby device fails, it cannot be confirmed whether a device or the link is faulty, so the primary device and the standby device may evolve along the wrong path: the standby device switches directly to primary, leaving two primary devices in the system. To address this problem, the following technical solutions are provided.
  • FIG. 1 is a flowchart of a method for processing an active/standby device according to an embodiment of the present invention. The process includes the following steps:
  • Step S102: after determining that the primary device and the first standby device are disconnected, the primary device detects link connectivity between the primary device and other devices, and the first standby device detects link connectivity between the first standby device and other devices, where the other devices are the devices in the cluster system in which the primary device and the first standby device are located, other than the primary device and the standby device;
  • Step S104: the primary device processes the operation of the primary device and/or the first standby device according to the link connectivity detection result, and the first standby device processes the operation of the primary device and/or the first standby device according to the link connectivity detection result.
  • Through the above steps, the primary device and the standby device each detect their own link connectivity at the same time and then process the operation of the primary device and/or the standby device according to the detected link connectivity. This solves the problem that, after the primary device and the standby device are disconnected, the standby device converts directly to the primary device because a node failure cannot be distinguished from a link failure, so that two primary devices operate in the system, reducing the stability of the system and degrading the user experience. The stability of the system is thereby enhanced and the user experience improved.
  • Step S104 can be implemented in the following ways:
  • The primary device processing the operation of the primary device and/or the first standby device according to the link connectivity detection result includes: when the primary device detects that the link is connected, replacing the first standby device with the second standby device; when the primary device detects that the link is not connected, prohibiting the primary device from running for a second predetermined time period.
  • The first standby device processing the operation of the primary device and the first standby device according to the link connectivity detection result includes: when the first standby device detects that the link is connected, determining whether the primary device is running; when the primary device is not running, using the first standby device as the primary device; when the primary device is running, prohibiting the first standby device from running for a third predetermined time period.
  • The first standby device processing the operation of the primary device and the first standby device according to the link connectivity detection result also includes: when the first standby device detects that the link is not connected, prohibiting the first standby device from running for the first predetermined time period.
  • The process in which the primary device processes the operation of the primary device and/or the standby device according to link connectivity, and the process in which the standby device processes the operation of the primary device and/or the standby device according to link connectivity, can be combined. The judgment process of the primary device and the judgment process of the standby device do not conflict, and the two can coexist.
  • That is, the primary device side and the standby device side detect link connectivity at the same time. When the primary device detects that the link is connected, it replaces the first standby device with the second standby device: the link has not failed, yet the primary device and the standby device have still lost contact, which indicates that the first standby device is faulty and needs to be replaced. When the primary device detects that the link is not connected, the primary device is prohibited from running for the second predetermined time period.
  • The standby device can determine whether the primary device is running in several ways. In an optional example of the embodiment, whether the primary device is running is determined in at least one of the following ways: notification by a third party other than the primary device and the first standby device; detection, by the first standby device, of specified information on the message transmission channel of the forwarding plane.
  • In step S102, the primary device and the first standby device may be determined to be disconnected as follows: when the primary device and/or the first standby device does not receive a keep-alive message, it is determined that the primary device and the first standby device are disconnected.
  • The primary device and the standby device each calculate link connectivity from their connectivity with the other devices in the system. The connectivity value is either TRUE (T), indicating that the link between the primary device and the standby device is connected, or FALSE (F), indicating that the link between the primary device and the standby device is not connected.
  • Two-way keep-alive and two-way detection are used between the primary device and the standby device, so that both ends of the link become aware of a state change of the keep-alive link at the same time. If keep-alive fails in either direction, the primary device and the standby device are determined to be disconnected.
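The two-way keep-alive rule can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the class name `KeepAliveMonitor`, the acknowledgement mechanism used to observe the outbound direction, and the `timeout_s` threshold are all hypothetical, since the source does not specify message formats or timer values.

```python
from dataclasses import dataclass


@dataclass
class KeepAliveMonitor:
    """Tracks keep-alive state in both directions of one primary/standby link.

    All names and the acknowledgement scheme are assumptions made for
    illustration; the source only states that keep-alive is two-way.
    """

    timeout_s: float      # hypothetical threshold; the source gives no value
    last_received: float  # time the peer's last heartbeat arrived (inbound)
    last_acked: float     # time the peer last confirmed our heartbeat (outbound)

    def on_heartbeat(self, now: float) -> None:
        """Record a heartbeat received from the peer."""
        self.last_received = now

    def on_ack(self, now: float) -> None:
        """Record the peer's confirmation of one of our own heartbeats."""
        self.last_acked = now

    def disconnected(self, now: float) -> bool:
        """A keep-alive failure in either direction counts as a disconnect."""
        inbound_lost = now - self.last_received > self.timeout_s
        outbound_lost = now - self.last_acked > self.timeout_s
        return inbound_lost or outbound_lost
```

Because each end evaluates both directions, a one-way failure is seen as a disconnect by both devices rather than by only one of them.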
  • Case TT: the connectivity detection of both the primary device and the standby device is TRUE (T). The primary device selects a new standby device. The standby device checks through a third-party mechanism whether the primary device is in place: if the primary device is running, the standby device is suspended for a preset period of time; if the primary device is not running, the standby device is switched to primary.
  • Case FT: the primary device connectivity detection is FALSE (F) and the standby device connectivity detection is TRUE (T). The primary device is suspended for a preset period of time. The standby device checks through the third-party mechanism whether the primary device is running: if the primary device is running, the standby device is suspended; if the primary device is not running, the standby device is switched to primary.
  • Case TF: the primary device connectivity detection is TRUE (T) and the standby device connectivity detection is FALSE (F). The primary device selects a new standby device; the standby device is suspended.
  • Case FF: the connectivity detection of both devices is FALSE (F). The primary device and the standby device are both suspended.
  • In summary, whichever of the primary device and the standby device detects connectivity as FALSE (F) is suspended for the predetermined time period. When connectivity is detected as TRUE (T), the primary device selects a new standby device, and the standby device checks through the third-party mechanism whether the primary device is running: if it is running, the standby device is suspended; if it is not running, the standby device is switched to primary.
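The per-device rules summarized above can be written as two small decision functions. The following Python sketch is illustrative only: the names `Action`, `primary_action` and `standby_action` are hypothetical, and the `primary_running` flag stands in for whatever third-party mechanism reports the primary device's state.

```python
from enum import Enum


class Action(Enum):
    """Outcomes named in the description; the identifiers are illustrative."""

    SELECT_NEW_STANDBY = "select new standby"
    SUSPEND = "suspend for predetermined period"
    PROMOTE_TO_PRIMARY = "switch to primary"


def primary_action(link_connected: bool) -> Action:
    """Decision taken by the primary device after keep-alive is lost."""
    # Link is up but the standby is silent: treat the standby as faulty.
    # Link is down: the primary itself is isolated, so it stops running.
    return Action.SELECT_NEW_STANDBY if link_connected else Action.SUSPEND


def standby_action(link_connected: bool, primary_running: bool) -> Action:
    """Decision taken by the standby device after keep-alive is lost.

    `primary_running` is assumed to come from a third-party check and is
    only consulted when the standby's own link is connected.
    """
    if not link_connected:
        return Action.SUSPEND
    return Action.SUSPEND if primary_running else Action.PROMOTE_TO_PRIMARY
```

Each device decides from its own detection result alone, so the four cases TT, FT, TF and FF need no coordination between the two sides. The standby is promoted only when its link is connected and the primary is reported as not running, which is what rules out the dual-master state.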
  • The cluster system of the preferred embodiment of the present invention is briefly described below. As shown in FIG. 2, the cluster system consists of several devices; for convenience of description, only three devices are shown in the figure. The primary device (master node) and the standby device (standby node) use two-way keep-alive and two-way detection between them. The primary device and the standby device each perform connectivity detection with the other devices (other nodes) in the system.
  • The connectivity detection that the primary device and the standby device perform with other devices in the system uses a message-based detection mechanism, which may be, but is not limited to, one of the following: communication link detection, such as a TCP link or a TIPC link; or asynchronous message keep-alive. These are technical means commonly used in the related art and are not described further here.
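As an illustration of communication link detection over TCP, the following Python sketch probes the other nodes of the cluster. It rests on assumptions the source does not state: the function names are hypothetical, the peers are assumed to listen on known TCP ports, and a device is treated as connected if at least one other node is reachable (the source does not say how results from several nodes are combined).

```python
import socket
from typing import Callable, Iterable, Tuple

Peer = Tuple[str, int]  # (host, port) of another node in the cluster


def tcp_probe(peer: Peer, timeout_s: float = 1.0) -> bool:
    """Return True if a TCP connection to the peer can be opened."""
    try:
        with socket.create_connection(peer, timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def link_connected(
    peers: Iterable[Peer],
    probe: Callable[[Peer], bool] = tcp_probe,
) -> bool:
    """Connectivity value for this device: True (T) or False (F).

    Assumed policy: connected if any other node answers the probe.
    """
    return any(probe(peer) for peer in peers)
```

The `probe` parameter is injectable so that another mechanism, such as asynchronous message keep-alive, could be substituted without changing the aggregation logic.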
  • FIG. 3 is a schematic diagram of processing after the active/standby devices detect link connectivity according to a preferred embodiment of the present invention.
  • The technical solution illustrated in FIG. 3 can be summarized as follows. The primary device and the standby device exchange heartbeat messages, and each receives and checks the messages from its peer.
  • If the connectivity detection is FALSE (F), the device restarts itself;
  • If the connectivity detection is TRUE (T): a primary device selects a new standby device; a standby device checks through the third-party mechanism whether the primary device is running, is suspended if the primary device is running, and is switched to primary if the primary device is not running.
  • FIG. 4 is a structural block diagram of an operation processing system for active/standby devices according to an embodiment of the present invention.
  • The primary device 40 is configured to detect link connectivity between the primary device 40 and the other devices 44 after determining that the primary device 40 and the first standby device 42 are disconnected, and to process the operation of the primary device 40 and/or the first standby device 42 according to the link connectivity detection result, where the other devices 44 are the devices in the cluster system in which the primary device 40 and the first standby device 42 are located, other than the primary device 40 and the first standby device 42;
  • The first standby device 42 is configured to detect link connectivity between the first standby device 42 and the other devices 44 while the primary device 40 detects link connectivity between the primary device 40 and the other devices 44, and to process the operation of the primary device 40 and/or the first standby device 42 according to the link connectivity detection result.
  • In this system, the primary device and the standby device each detect their own link connectivity at the same time and then process the operation of the primary device and/or the standby device according to the detected link connectivity. This solves the problem that, after the primary device and the standby device are disconnected, the standby device converts directly to the primary device because a node failure cannot be distinguished from a link failure, so that two primary devices operate in the system, reducing the stability of the system and degrading the user experience. The stability of the system is thereby enhanced and the user experience improved.
  • The primary device 40 is further configured to: when the primary device 40 detects that the link is connected, replace the first standby device 42 with the second standby device 46; and when the primary device 40 detects that the link is not connected, prohibit the primary device 40 from running for the second predetermined time period. The first standby device 42 is further configured to determine whether the primary device 40 is running when the first standby device 42 detects that the link is connected, and to act as the primary device when the primary device 40 is not running.
  • In summary, the embodiments of the present invention achieve the following technical effects: the "dual master" problem caused by direct switching of the standby device in the related art is solved; the actual condition of the keep-alive link is correctly detected; and the system evolves along a unified path rather than each device evolving separately, which prevents the dual-master phenomenon and improves the stability of the system.
  • The modules or steps of the present invention described above can be implemented by general-purpose computing devices; they can be centralized on a single computing device or distributed across a network of multiple computing devices. Alternatively, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; in some cases, the steps shown or described may be performed in an order different from the one given here. They may also be fabricated separately as individual integrated circuit modules, or several of the modules or steps may be fabricated as a single integrated circuit module.
  • The invention is thus not limited to any specific combination of hardware and software.
  • The technical solution provided by the present invention, in which the primary device and the standby device each detect their own link connectivity at the same time and then process the operation of the primary device and/or the standby device according to the detected link connectivity, solves the problem that, after the primary device and the standby device are disconnected, the standby device converts directly to the primary device because a node failure cannot be distinguished from a link failure, so that two primary devices run in the system, reducing the stability of the system and degrading the user experience. It thereby achieves the effect of enhancing the stability of the system and improving the user experience.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Hardware Redundancy (AREA)

Abstract

The present invention relates to a method and system for processing the operation of primary and standby devices. The method comprises: after determining that a primary device is out of contact with a first standby device, detecting, by the primary device, link connectivity between the primary device and other devices; meanwhile, detecting, by the first standby device, link connectivity between the first standby device and other devices, the other devices being devices, other than the primary device and the standby device, in the cluster system where the primary device and the first standby device are located; processing, by the primary device, the operation of the primary device and/or the first standby device according to a link connectivity detection result; and processing, by the first standby device, the operation of the primary device and/or the first standby device according to the link connectivity detection result. The technical solution solves the problem in the related art that system stability is reduced because a node failure cannot be distinguished from a link failure, thereby achieving the effect of improving system stability.
PCT/CN2015/073275 2014-11-04 2015-02-25 Method and system for processing operation of primary and standby devices WO2016070530A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410614104.1 2014-11-04
CN201410614104.1A CN105634779B (zh) 2014-11-04 2014-11-04 Operation processing method and device for active/standby devices

Publications (1)

Publication Number Publication Date
WO2016070530A1 (fr) 2016-05-12

Family

ID=55908461

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/073275 WO2016070530A1 (fr) 2014-11-04 2015-02-25 Method and system for processing operation of primary and standby devices

Country Status (2)

Country Link
CN (1) CN105634779B (fr)
WO (1) WO2016070530A1 (fr)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019036892A1 (fr) * 2017-08-22 2019-02-28 深圳瀚飞科技开发有限公司 Remote communication detection system and method for online monitoring platform
CN107688547B (zh) * 2017-08-23 2020-06-16 苏州浪潮智能科技有限公司 Method and system for controller active/standby switchover
CN107579860A (zh) * 2017-09-29 2018-01-12 新华三技术有限公司 Node election method and device
CN109728981A (zh) * 2019-03-19 2019-05-07 江苏汇智达信息科技有限公司 Cloud platform fault monitoring method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
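CN101674199A (zh) * 2009-09-22 2010-03-17 中兴通讯股份有限公司 Method and querier for implementing switchover upon network failure
CN101729290A (zh) * 2009-11-04 2010-06-09 中兴通讯股份有限公司 Method and device for implementing service system protection
CN102480423A (zh) * 2010-11-30 2012-05-30 中兴通讯股份有限公司 Protection method and system for an L2TP network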

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101207408B (zh) * 2006-12-22 2012-07-11 中兴通讯股份有限公司 Comprehensive fault detection device and method for active/standby switchover
US8094569B2 (en) * 2008-12-05 2012-01-10 Cisco Technology, Inc. Failover and failback of communication between a router and a network switch
US8244125B2 (en) * 2009-01-21 2012-08-14 Calix, Inc. Passive optical network protection switching
CN102742222B (zh) * 2011-06-29 2015-05-13 华为技术有限公司 Method and device for maintaining transmission line connectivity
US8675479B2 (en) * 2011-07-12 2014-03-18 Tellabs Operations, Inc. Methods and apparatus for improving network communication using ethernet switching protection
CN103931139B (zh) * 2013-03-19 2017-02-15 华为技术有限公司 Redundancy protection method, apparatus, device and system
CN103560955B (zh) * 2013-10-24 2016-09-28 华为技术有限公司 Redundant device switching method and apparatus

Also Published As

Publication number Publication date
CN105634779A (zh) 2016-06-01
CN105634779B (zh) 2019-09-03

Similar Documents

Publication Publication Date Title
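US8438307B2 (en) Method and device of load-sharing in IRF stack
US8825844B2 (en) Notifying network operator when virtual addresses do not match on network elements configured for interchassis redundancy
CN101588304B (zh) VRRP implementation method and device
CN106330475B (zh) Method and apparatus for managing active and standby nodes in a communication system, and high-availability cluster
CN109450666B (zh) Distributed system network management method and device
CN109525445B (zh) Link switching method, link redundancy backup network and computer-readable storage medium
WO2016070530A1 (fr) Method and system for processing operation of primary and standby devices
CN102917384B (zh) Hot standby method for the coordinator in a Zigbee network
CN110730125B (zh) Message forwarding method and apparatus, active-active system and communication device
CN101729426B (zh) Method and system for fast switchover between Virtual Router Redundancy Protocol master and backup devices
CN104486128B (zh) System and method for implementing redundant heartbeat between dual controller nodes
WO2016095344A1 (fr) Link switching method and device, and line card
CN109462533B (zh) Link switching method, link redundancy backup network and computer-readable storage medium
CN104901834A (zh) Method and system for automatic switching of network servers
CN104994173A (zh) Message processing method and system
WO2017036165A1 (fr) Link failure detection method and apparatus
US20150263884A1 (en) Fabric switchover for systems with control plane and fabric plane on same board
CN108259325B (zh) Route maintenance method and routing device
CN107872822B (zh) Service bearing method and bearing device
US10986015B2 (en) Micro server built-in switch uplink port backup mechanism
WO2017146718A1 (fr) Ring protection network division
CN106130783B (zh) Port fault handling method and device
WO2019000954A1 (fr) Method, device, and system for monitoring node survival state
CN108174417B (zh) Active/standby switching method and apparatus, related electronic device and readable storage medium
CN106559331B (zh) Message transmission method, apparatus and network system in an MSTP network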

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15856894

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15856894

Country of ref document: EP

Kind code of ref document: A1