WO2021136370A1 - Service recovery method and system for a distributed system - Google Patents

Service recovery method and system for a distributed system

Info

Publication number
WO2021136370A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
address
standby
standby node
faulty
Prior art date
Application number
PCT/CN2020/141371
Other languages
English (en)
Chinese (zh)
Inventor
董友球
杜铁军
Original Assignee
威创集团股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 威创集团股份有限公司
Publication of WO2021136370A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0663 Performing the actions predefined by failover planning, e.g. switching to standby network elements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/50 Address allocation
    • H04L61/5007 Internet protocol [IP] addresses

Definitions

  • The present invention relates to the technical field of distributed systems, and more specifically to a service recovery method and system for a distributed system.
  • Distributed splicing display systems and distributed network seat systems are commonly used equipment in control rooms.
  • A distributed splicing system usually includes multiple distributed splicing nodes. Each node is responsible for displaying a certain part of the picture on the splicing wall, and the nodes together form a splicing display system.
  • A distributed network seat system usually has a sending box and a receiving box. The sending box is connected to the business computer, and the receiving box is connected to the keyboard, mouse, and monitor. Audio, video, keyboard, and mouse information is transmitted over the network, which separates the operator from the machine and lets one machine drive multiple screens.
  • The nodes communicate with each other through IP addresses.
  • When a node fails, existing approaches generally first find the IP address of the faulty node, then connect the standby node to a computer, set the standby node's IP address to the IP address of the faulty node, modify the corresponding configuration according to business needs, and finally replace the faulty node with the standby node.
  • This series of operations takes at least a few minutes. If the control room serves an emergency command system, some key information may be missed within those few minutes, which can lead to decision-making errors.
  • The present invention aims to overcome at least one defect of the above prior art by providing a service recovery method and system for a distributed system that quickly restore system services.
  • The technical solution adopted by the present invention is a service recovery method for a distributed system, the method including:
  • re-establishing the network connection with the standby node and reallocating to the standby node the services allocated to the faulty node.
  • The method of the present invention records the IP address of the faulty node and has the faulty node physically replaced with the standby node. It then sends a device-discovery broadcast instruction so that the standby node, which is in the standby state, receives the instruction and reports its own status information. Based on the reported status information, the standby node can be determined to be an available standby node. A modification instruction is then sent to change the standby node's IP address to the IP address of the faulty node, and a network connection with the standby node is established, so that the standby node takes over the original services of the faulty node.
  • With the method of the present invention, after a node in the distributed system fails and is replaced with a new node, the configuration changes are completed automatically online without any manual configuration, so that the distributed system returns to its original working state.
  • The standby node is plug and play, which simplifies the operation steps, reduces the operation difficulty, and greatly shortens the fault repair time.
  • The present invention also provides a service recovery system for a distributed system, the system including:
  • a monitoring module, configured to monitor the network connections with all nodes and, when the network connection of a node is detected to be disconnected, to determine that the disconnected node is a faulty node and record the IP address of the faulty node;
  • a prompt message issuing module, configured to send a prompt message indicating that the faulty node is to be replaced with the standby node by physical replacement;
  • a broadcast module, configured to send a device-discovery broadcast instruction;
  • a receiving module, configured to receive the status information that the standby node reports after receiving the device-discovery broadcast instruction;
  • a standby node determining module, configured to determine, according to the status information reported by the standby node, that the standby node is an available standby node;
  • an IP address modification module, configured to send the standby node a modification instruction to change its IP address to the IP address of the faulty node;
  • a connection establishment module, configured to re-establish the network connection with the standby node and reallocate to the standby node the services allocated to the faulty node.
  • The system of the present invention uses the monitoring module to detect faulty nodes of the distributed system online. When a faulty node is detected, its IP address is recorded, and the prompt message issuing module sends a prompt message reminding the operator to replace the faulty node with the standby node on the physical connection. The broadcast module then sends the device-discovery broadcast instruction so that the standby node physically connected in place of the faulty node can receive it. The receiving module receives the status information reported by the standby node, and the standby node determining module determines from that status information that the standby node is an available standby node. The IP address modification module then sends a modification instruction to change the standby node's IP address to the IP address of the faulty node, and finally the connection establishment module establishes a network connection with the standby node, so that the standby node takes over the original services of the faulty node.
  • With the system of the present invention, after a node in the distributed system fails and is replaced with a new node, the configuration changes are completed automatically online without any manual configuration, so that the distributed system returns to its original working state.
  • The standby node is plug and play, which simplifies the operation steps, reduces the operation difficulty, and greatly shortens the fault repair time.
  • The beneficial effects of the present invention are as follows. The method and system of the present invention can be applied to a distributed system. When a node of the distributed system fails, services are restored without increasing the cost of redundant equipment. The standby node is plug and play and needs no manual configuration, so distributed system services can be restored quickly.
  • FIG. 1 is a flowchart of a method for restoring a service in a distributed system according to Embodiment 1 of the present invention.
  • FIG. 2 is a flowchart of a method for restoring a service in a distributed system according to Embodiment 2 of the present invention.
  • FIG. 3 is a framework diagram of a service recovery system of a distributed system according to Embodiment 3 of the present invention.
  • The service recovery method of the distributed system in this embodiment includes the following specific steps:
  • S102: Send a prompt message indicating that the faulty node is to be replaced with the standby node by physical replacement;
  • S105: Determine, according to the status information reported by the standby node, that the standby node is an available standby node;
  • S106: Send the standby node a modification instruction to change its IP address to the IP address of the faulty node;
  • The method of the present invention records the IP address of the faulty node and has the faulty node physically replaced with the standby node. It then sends a device-discovery broadcast instruction so that the standby node, which is in the standby state, receives the instruction and reports its own status information. Based on the reported status information, the standby node can be determined to be an available standby node. A modification instruction is then sent to change the standby node's IP address to the IP address of the faulty node, and a network connection with the standby node is established, so that the standby node takes over the original services of the faulty node.
  • With the method of the present invention, after a node in the distributed system fails and is replaced with a new node, the configuration changes are completed automatically online without any manual configuration, so that the distributed system returns to its original working state.
  • The standby node is plug and play, which simplifies the operation steps, reduces the operation difficulty, and greatly shortens the fault repair time.
  • The method of this embodiment can be applied to a control module of a distributed system, and the method is implemented through the control module.
  • The control module can be arranged on a server or on a node of the distributed system. Its purpose is to manage the logical relationships between the nodes and the states of the nodes.
  • The control module pre-establishes a network connection with all nodes of the distributed system that are in the main state. Once the connections are established, the control module executes steps S101-S107 above, so that when a node in the main state fails, a standby node in the standby state can be quickly plugged in and configured online, quickly restoring the services of the distributed system.
  • In step S101 of this embodiment, when the network connection with a node is detected to be disconnected, alarm information is also sent to the control end of the distributed system. The alarm information alerts the control end so that it can make corresponding decisions.
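  • The detection in step S101 can be sketched as follows. The source does not say how a disconnected state is detected, so this minimal sketch assumes a heartbeat timeout; the `Monitor` class, its fields, and the alarm text are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Monitor:
    """Hypothetical monitor: a node is treated as disconnected after a heartbeat timeout."""
    timeout_s: float
    last_seen: dict = field(default_factory=dict)   # IP -> time of last heartbeat
    faulty_ips: list = field(default_factory=list)  # recorded IPs of faulty nodes
    alarms: list = field(default_factory=list)      # alarm messages for the control end

    def heartbeat(self, ip: str, now: float) -> None:
        # Record that the node at this IP was heard from at time `now`.
        self.last_seen[ip] = now

    def check(self, now: float) -> list:
        # Return the IPs newly judged faulty; record each IP and raise one alarm per node.
        newly_faulty = []
        for ip, seen in self.last_seen.items():
            if now - seen > self.timeout_s and ip not in self.faulty_ips:
                self.faulty_ips.append(ip)
                self.alarms.append(f"node {ip} disconnected")
                newly_faulty.append(ip)
        return newly_faulty
```

  • Recording the IP address at detection time matters because the later modification instruction needs it after the faulty node has already been unplugged.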
  • In S102, the method of this embodiment sends a prompt message indicating that a node has failed, so that the faulty node can be physically replaced with the standby node.
  • Replacing the faulty node with the standby node by physical replacement means plugging the external physical wiring of the faulty node into the standby node. The physical wiring may include one or more of the faulty node's power cord, network cable, video cable, and USB cable.
  • The prompt message may include the IP address of the faulty node, so that the faulty node can be quickly located in practice and the standby node can quickly replace it on the physical connection.
  • The replacement in this step is only on the physical connection. The IP address and other related configuration information of the standby node have not yet been modified to match the faulty node, so the standby node still cannot take over from the faulty node. The subsequent steps must still be performed to make the configuration information of the standby node consistent with that of the faulty node.
  • The status information may include the IP address and MAC address of the standby node, and information about whether an end-to-end network connection has been established.
  • The IP address of the standby node is a preset initial value.
  • The preset IP address of the standby node enables it to receive broadcast instructions once it is physically connected and to report its status information. However, a standby node that has only its preset IP address has not established an end-to-end connection with the control module of the distributed system or with any other control module. The control module can therefore use whether a node has established an end-to-end connection to determine whether the node reporting status information is an available standby node.
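  • The standby node's side of this exchange might look like the sketch below. The source does not define a message format, so the JSON encoding, the `"discover"` and `"status"` type tags, the `PRESET_IP` value, and the function name are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict

PRESET_IP = "192.168.0.250"  # assumed factory-default address of a standby node


@dataclass
class StatusReport:
    ip: str          # current IP address of the standby node
    mac: str         # MAC address of the standby node
    connected: bool  # whether an end-to-end connection has been established


def handle_message(raw: bytes, ip: str, mac: str, connected: bool):
    """Return an encoded status report for a discovery broadcast, else None."""
    msg = json.loads(raw.decode("utf-8"))
    if msg.get("type") != "discover":
        return None  # not a device-discovery broadcast; ignore it
    report = StatusReport(ip=ip, mac=mac, connected=connected)
    return json.dumps({"type": "status", **asdict(report)}).encode("utf-8")
```

  • In a real deployment the broadcast would arrive on a network socket; the sketch keeps the handler as a pure function so the message contents are easy to inspect.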
  • Specifically, step S105 includes:
  • determining, according to the status information reported by the standby node, whether the IP address of the standby node is the preset initial value and whether the standby node has not established an end-to-end network connection; if the IP address is the preset initial value and no end-to-end network connection has been established, the standby node is determined to be an available standby node.
  • The control module evaluates the received status information of the standby node, and if the IP address of the standby node is the preset initial value and the standby node has not established an end-to-end network connection, the standby node is determined to be an available standby node.
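  • The two conditions checked in step S105 can be written as a single predicate. This is a minimal sketch; the `NodeStatus` type, its field names, and the `PRESET_IP` value are hypothetical and not taken from the source.

```python
from dataclasses import dataclass

PRESET_IP = "192.168.0.250"  # assumed preset initial IP address of standby nodes


@dataclass(frozen=True)
class NodeStatus:
    ip: str
    mac: str
    has_end_to_end_connection: bool


def is_available_standby(status: NodeStatus, preset_ip: str = PRESET_IP) -> bool:
    # Available only if the IP is still the preset value AND no end-to-end
    # connection has been established with any control module.
    return status.ip == preset_ip and not status.has_end_to_end_connection
```

  • A node that answers the broadcast with a non-preset IP address, or that is already connected, is treated as a working node rather than a standby node.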
  • By providing the control module to execute steps S101-S107 above and complete the interaction with the standby node, the distributed system can quickly restore services when one of its nodes fails.
  • The method of this embodiment is further described below in terms of the interaction between the control module and the standby node.
  • The control module pre-establishes a network connection with all nodes in the main state in the distributed system, and the standby node in the standby state is preset with an initial IP address;
  • The control module starts to monitor all nodes in the main state. When it detects that the network connection with a node is disconnected, it determines that the disconnected node is a faulty node, records the IP address of the faulty node, and actively sends alarm information to the control end of the distributed system;
  • The control module sends a prompt message indicating that the faulty node needs to be replaced with the standby node by physical replacement;
  • After sending the prompt message, the control module also sends the device-discovery broadcast instruction;
  • After the standby node replaces the faulty node on the physical connection, it can receive the broadcast instruction using its preset initial IP address. On receiving the broadcast instruction, the standby node actively reports its status information to the control module;
  • The control module receives the status information reported by the standby node and determines from it whether the IP address of the standby node is the preset initial value and whether the standby node has not established an end-to-end network connection with any control module. If so, the standby node is determined to be an available standby node;
  • The control module sends a modification instruction to the available standby node, instructing it to change its own IP address to the IP address of the faulty node;
  • On receiving the modification instruction, the available standby node immediately sets its own IP address according to the parameters in the modification instruction;
  • After the available standby node has changed its IP address, the control module re-establishes the network connection with the available standby node and reallocates to it the services allocated to the faulty node.
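  • The interaction above can be simulated in memory as follows. This is a sketch under stated assumptions, not the patented implementation: the classes, the method names, keying nodes by MAC address, and the `PRESET_IP` value are all hypothetical, and network transport is replaced by direct method calls.

```python
from dataclasses import dataclass

PRESET_IP = "192.168.0.250"  # assumed preset initial IP address of standby nodes


@dataclass
class StandbyNode:
    mac: str
    ip: str = PRESET_IP
    connected: bool = False

    def on_discover(self) -> dict:
        # Report own status in response to the device-discovery broadcast.
        return {"ip": self.ip, "mac": self.mac, "connected": self.connected}

    def on_modify_ip(self, new_ip: str) -> None:
        # Apply the IP address carried in the modification instruction.
        self.ip = new_ip


@dataclass
class ControlModule:
    services: dict   # node MAC -> services allocated to that node
    ip_by_mac: dict  # node MAC -> IP address of each connected node

    def recover(self, faulty_mac: str, reachable: list):
        faulty_ip = self.ip_by_mac.pop(faulty_mac)  # recorded IP of the faulty node
        for node in reachable:                      # nodes that hear the broadcast
            status = node.on_discover()
            if status["ip"] == PRESET_IP and not status["connected"]:
                node.on_modify_ip(faulty_ip)        # send the modification instruction
                node.connected = True               # re-establish the connection
                self.ip_by_mac[node.mac] = node.ip
                self.services[node.mac] = self.services.pop(faulty_mac)
                return node
        self.ip_by_mac[faulty_mac] = faulty_ip      # no standby yet; keep the record
        return None
```

  • For example, `ControlModule({"aa": ["tile-1"]}, {"aa": "192.168.1.11"}).recover("aa", [StandbyNode(mac="bb")])` hands the faulty node's IP address and services to the standby node with MAC `bb`.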
  • Through plug and play of the standby node, the method of the present invention quickly restores the services of the distributed system, simplifies the configuration of the standby node, and greatly shortens the fault repair time.
  • The service recovery method of this embodiment addresses how to repair multiple faulty nodes when more than one faulty node is detected in step S101 of Embodiment 1.
  • The specific steps of the service recovery method of this embodiment include:
  • S202: Number and sort the multiple faulty nodes according to a preset numbering rule;
  • S203: Perform the following steps S204-S209 cyclically, in the order of the faulty node numbers, until all faulty nodes have been replaced by available standby nodes;
  • S206: Receive the status information that the standby node reports after receiving the device-discovery broadcast instruction;
  • S207: Determine, according to the status information reported by the standby node, that the standby node is an available standby node;
  • S208: Send the standby node a modification instruction to change its IP address to the IP address of the faulty node;
  • The preset numbering rule may be to number the nodes according to their logical relationship in the distributed system and then to sort the faulty nodes by node number.
  • The rule can be formulated according to the characteristics of the system. For example, in a splicing wall system, the nodes corresponding to the screens can be numbered from left to right and from top to bottom, with the node in the upper left corner numbered 1 and the numbers increasing in sequence. The rule can be that each time, the node with the smallest or the largest number among the current faulty nodes is replaced.
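  • The splicing wall example can be sketched as follows. The grid coordinates, function names, and the `smallest_first` flag are hypothetical; the source only states the left-to-right, top-to-bottom numbering and the smallest-or-largest-first choice.

```python
def number_nodes(rows: int, cols: int) -> dict:
    """Map (row, col) to a node number: left to right, then top to bottom, from 1."""
    return {(r, c): r * cols + c + 1 for r in range(rows) for c in range(cols)}


def replacement_order(faulty_positions, rows: int, cols: int, smallest_first: bool = True) -> list:
    """Return the faulty node numbers in the order they should be replaced."""
    numbers = number_nodes(rows, cols)
    return sorted((numbers[p] for p in faulty_positions), reverse=not smallest_first)
```

  • On a wall of 2 rows and 3 columns, faulty screens at positions (1, 2), (0, 1), and (1, 0) have numbers 6, 2, and 4, so with smallest-first ordering they are replaced in the order 2, 4, 6.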
  • The invention also provides a service recovery system for a distributed system.
  • The service recovery system of this embodiment specifically includes:
  • the monitoring module 301, configured to monitor the network connections with all nodes and, when the network connection of a node is detected to be disconnected, to determine that the disconnected node is a faulty node and record the IP address of the faulty node;
  • the prompt message issuing module 302, configured to send a prompt message indicating that the faulty node is to be replaced with the standby node by physical replacement;
  • the broadcast module 303, configured to send a device-discovery broadcast instruction;
  • the receiving module 304, configured to receive the status information that the standby node reports after receiving the device-discovery broadcast instruction;
  • the standby node determining module 305, configured to determine, according to the status information reported by the standby node, that the standby node is an available standby node;
  • the IP address modification module 306, configured to send the standby node a modification instruction to change its IP address to the IP address of the faulty node;
  • the connection establishment module 307, configured to re-establish the network connection with the standby node and reallocate to the standby node the services allocated to the faulty node.
  • The system of the present invention uses the monitoring module 301 to detect faulty nodes of the distributed system online. When a faulty node is detected, its IP address is recorded, and the prompt message issuing module 302 sends a prompt message reminding the operator to replace the faulty node with the standby node on the physical connection. The broadcast module 303 then sends the device-discovery broadcast instruction so that the standby node physically connected in place of the faulty node can receive it. The receiving module 304 receives the status information reported by the standby node, and the standby node determining module 305 determines from that status information that the standby node is an available standby node. The IP address modification module 306 then sends a modification instruction to change the standby node's IP address to the IP address of the faulty node, and finally the connection establishment module 307 establishes a network connection with the standby node, so that the standby node takes over the original services of the faulty node.
  • With the system of the present invention, after a node in the distributed system fails and is replaced with a new node, the configuration changes are completed automatically online without any manual configuration, so that the distributed system returns to its original working state.
  • The standby node is plug and play, which simplifies the operation steps, reduces the operation difficulty, and greatly shortens the fault repair time.
  • The service recovery system of this embodiment can be applied to a control module of a distributed system, and the service recovery system is implemented by arranging the modules of this embodiment on the control module.
  • The control module can be arranged on a server or on a node of the distributed system. Its purpose is to manage the logical relationships between the nodes and the states of the nodes.
  • The service recovery system of this embodiment further includes an alarm information sending module, which is configured to send alarm information to the control end of the distributed system when the monitoring module 301 detects that the network connection with a node is disconnected. The alarm information alerts the control end so that it can make corresponding decisions.
  • The prompt message issuing module 302 issues a prompt message indicating that a node has failed, so that the faulty node can be physically replaced with the standby node.
  • Replacing the faulty node with the standby node by physical replacement means plugging the external physical wiring of the faulty node into the standby node. The physical wiring may include one or more of the faulty node's power cord, network cable, video cable, and USB cable.
  • The prompt message may include the IP address of the faulty node, so that the faulty node can be quickly located in practice and the standby node can quickly replace it on the physical connection.
  • The status information may include the IP address and MAC address of the standby node, and information about whether an end-to-end network connection has been established.
  • The IP address of the standby node is a preset initial value.
  • The preset IP address of the standby node enables it to receive broadcast instructions once it is physically connected and to report its status information.
  • However, a standby node that has only its preset IP address has not established an end-to-end connection with the service recovery system of the distributed system or with any other service recovery system. The standby node determining module 305 of the service recovery system can therefore use whether a node has established an end-to-end connection to determine whether the node reporting status information is an available standby node.
  • The standby node determining module 305 is specifically configured to:
  • determine, according to the status information reported by the standby node, whether the IP address of the standby node is the preset initial value and whether the standby node has not established an end-to-end network connection with the service recovery system; if the IP address is the preset initial value and no end-to-end network connection with the service recovery system has been established, the standby node is determined to be an available standby node.
  • The standby node determining module 305 evaluates the received status information of the standby node, and if the IP address of the standby node is the preset initial value and the standby node has not established an end-to-end network connection, the standby node is determined to be an available standby node.
  • When the monitoring module 301 detects multiple disconnected nodes, it determines that they are all faulty nodes, records the IP addresses of the multiple faulty nodes, and controls the prompt message issuing module 302 to send the prompt message for each faulty node one by one, so that the faulty nodes can be replaced with standby nodes one at a time and the service recovery system can repair them quickly.
  • Specifically, the monitoring module 301 controls the prompt message issuing module 302 to send the prompt message for each faulty node one by one as follows:
  • The monitoring module 301 numbers and sorts the multiple faulty nodes according to the preset numbering rule and, following the number sequence of the faulty nodes, has the prompt message for each faulty node sent one by one, until all faulty nodes have been replaced by available standby nodes.
  • The preset numbering rule may be to number the nodes according to their logical relationship in the distributed system and then to sort the faulty nodes by node number.
  • The rule can be formulated according to the characteristics of the system.
  • For example, in a splicing wall system, the nodes corresponding to the screens can be numbered from left to right and from top to bottom, with the node in the upper left corner numbered 1 and the numbers increasing in sequence.
  • The rule can be that each time, the node with the smallest or the largest number among the current faulty nodes is replaced.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention relates to a service recovery method and system for a distributed system. The method includes the following steps: when the network connection of a node is detected to be in a disconnected state, determining that the node in the disconnected state is a faulty node and recording an IP address of the faulty node; and performing replacement processing on the faulty node, the replacement processing including: sending prompt information indicating that the faulty node is to be replaced with a standby node by physical replacement; sending a device-discovery broadcast instruction; receiving status information reported by the standby node after the standby node has received the device-discovery broadcast instruction; determining, according to the status information reported by the standby node, that the standby node is an available standby node; sending the standby node a modification instruction to change its IP address to the IP address of the faulty node; and re-establishing the network connection with the standby node and reallocating to the standby node a service allocated to the faulty node. In the present invention, when a node of a distributed system fails, plug and play of the standby node is achieved and a service of the distributed system can be quickly restored.
PCT/CN2020/141371 2019-12-30 2020-12-30 Service recovery method and system for a distributed system WO2021136370A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911396738.3 2019-12-30
CN201911396738.3A CN111130899A (zh) 2019-12-30 2019-12-30 Service recovery method and system for a distributed system

Publications (1)

Publication Number Publication Date
WO2021136370A1 2021-07-08

Family

ID=70505248

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/141371 WO2021136370A1 (fr) 2019-12-30 2020-12-30 Service recovery method and system for a distributed system

Country Status (2)

Country Link
CN (1) CN111130899A (fr)
WO (1) WO2021136370A1 (fr)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
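CN111130899A (zh) * 2019-12-30 2020-05-08 威创集团股份有限公司 Service recovery method and system for a distributed system
CN115002001B (zh) * 2022-02-25 2023-08-04 苏州浪潮智能科技有限公司 Method, apparatus, device, and medium for detecting sub-health of a cluster network
CN116743752A (zh) * 2023-08-11 2023-09-12 山东恒宇电子有限公司 System for implementing data-processing load balancing through distributed network communication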

Citations (5)

Publication number Priority date Publication date Assignee Title
CN105406980A (zh) * 2015-10-19 2016-03-16 浪潮(北京)电子信息产业有限公司 Multi-node backup method and device
CN109151028A (zh) * 2018-08-23 2019-01-04 郑州云海信息技术有限公司 Disaster recovery method and device for a distributed storage system
WO2019049433A1 (fr) * 2017-09-06 2019-03-14 日本電気株式会社 Cluster system, cluster system control method, server device, control method, and non-transitory computer-readable medium storing a program
CN110572275A (zh) * 2019-08-01 2019-12-13 新华三技术有限公司成都分公司 Network card switching method, apparatus, server, and computer-readable storage medium
CN111130899A (zh) * 2019-12-30 2020-05-08 威创集团股份有限公司 Service recovery method and system for a distributed system

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
JPH09244910A (ja) * 1996-03-11 1997-09-19 Nippon Steel Corp Backup method for a distributed control system
CN107145306B (zh) * 2017-04-27 2020-08-21 杭州哲信信息技术有限公司 Distributed data storage method and system

Also Published As

Publication number Publication date
CN111130899A (zh) 2020-05-08

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20909901; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 20909901; Country of ref document: EP; Kind code of ref document: A1)