WO2014176969A1 - Automatic disaster recovery switching method and device - Google Patents

Automatic disaster recovery switching method and device

Info

Publication number
WO2014176969A1
Authority
WO
WIPO (PCT)
Prior art keywords
resource
primary system
resources
primary
switching
Prior art date
Application number
PCT/CN2014/075319
Other languages
French (fr)
Chinese (zh)
Inventor
李果
Original Assignee
中兴通讯股份有限公司
Priority date
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2014176969A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/53 Centralised arrangements for recording incoming messages, i.e. mailbox systems
    • H04M3/533 Voice mail systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M7/00 Arrangements for interconnection between switching centres
    • H04M7/006 Networks other than PSTN/ISDN providing telephone service, e.g. Voice over Internet Protocol (VoIP), including next generation networks with a packet-switched transport layer
    • H04M7/0081 Network operation, administration, maintenance, or provisioning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2203/00 Aspects of automatic or semi-automatic exchanges
    • H04M2203/55 Aspects of automatic or semi-automatic exchanges related to network data storage and management
    • H04M2203/554 Data synchronization

Definitions

  • A Voice Message System (VMS) service is a telecom value-added service.
  • The service is based on a voice message platform and implements voice mail; its main function is to let users receive and send voice messages.
  • The service provides the basic functions of receiving and sending messages, new-message notification, personalized mailbox settings, different Class of Service (COS) levels, and so on.
  • COS: Class of Service
  • IVR: Interactive Voice Response
  • In addition to the existing Interactive Voice Response (IVR) flow, users can access and customize their voicemail through personal portals, WAP portals, and VVM.
  • The broadband intelligent network platform controls service calls, encapsulates the various protocols, and reconstructs messages, which supports the operation of various telecom value-added services well.
  • For data backup, service data held in ORACLE uses a mature stream replication scheme.
  • EBASE uses the self-developed X2X scheme.
  • Voice data is copied directly through an internal interface, so only network bandwidth needs to be considered.
  • In the related art, there is no application instance in which a voice mail service automatically switches to the standby system after the primary system fails. Most switching is done by hand. Even where resource status is monitored automatically, the switching operations are separate steps and are not fully automatic.
  • In the related art, a failure of the primary system interrupts the VMS voice mail service.
  • An automatic disaster recovery switching method includes: monitoring the resources of a primary system; and, when a fault in the resources of the primary system is detected, stopping the operation of those resources and starting the resources of a standby system to carry the services of the primary system.
  • Monitoring the resources of the primary system includes: probing the resource status of the primary system by sending a message or command; if a correct response is received from the primary system, determining that its resources are operating normally; if no response or an error response is received, determining that its resources are operating abnormally.
  • The method further includes: once the number of consecutive checks in which the primary system's resource is found to be operating abnormally exceeds a threshold, determining that the resource has failed.
  • An automatic disaster recovery switching condition is set according to how strongly each resource affects system operation.
  • When a fault in a resource of the primary system is detected, it is determined whether that resource meets the automatic disaster recovery switching condition; if it does, the resources of the primary system are stopped and the resources of the standby system are started.
  • The resources of the primary system include at least one of the following: a data block (Data Block, DB for short) resource, a storage resource, and an internal interface resource.
  • An automatic disaster recovery switching apparatus is further provided. The apparatus includes: a monitoring module configured to monitor the resources of the primary system; and a switching module configured to, when a fault in the resources of the primary system is detected, stop the operation of those resources and start the resources of the standby system to carry the services of the primary system.
  • The monitoring module includes: a state detecting unit configured to probe the resource status of the primary system by sending a message or command; a first determining unit configured to determine, when a correct response is received from the primary system, that its resources are operating normally; and a second determining unit configured to determine, when no response or an error response is received, that its resources are operating abnormally.
  • The monitoring module further includes: a third determining unit configured to determine that a resource of the primary system has failed once the number of consecutive checks in which the resource is found to be operating abnormally exceeds a threshold.
  • The switching module includes: a condition setting unit configured to set the automatic disaster recovery switching condition according to how strongly each resource affects system operation; and a switching unit configured to determine, when a fault in a resource of the primary system is detected, whether that resource meets the automatic disaster recovery switching condition and, if it does, to stop the resources of the primary system and start the resources of the standby system.
  • The resources of the primary system include at least one of the following: a DB resource, a storage resource, and an internal interface resource.
  • In the embodiments of the present invention, the resources of the primary system are monitored. When a fault in those resources is detected, they are stopped and the resources of the standby system are started to carry the services of the primary system. This achieves automatic disaster recovery switching for the VMS voice mail service: the service keeps running normally, users notice no difference, and the switchover is seamless and smooth.
  • FIG. 1 is a flowchart of an automatic disaster tolerance switching method according to an embodiment of the present invention
  • FIG. 2 is a structural block diagram of an automatic disaster tolerance switching apparatus according to an embodiment of the present invention
  • FIG. 3 is a system architecture diagram of partition-level automatic disaster recovery for the VMS voice mail service according to an embodiment of the present invention.
  • FIG. 1 is a flowchart of an automatic disaster recovery switching method according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps (step S102 to step S104). Step S102: monitor the resources of the primary system.
  • The resources of the primary system include at least one of the following: a DB resource, a storage resource, and an internal interface resource.
  • Step S104: when a fault in the resources of the primary system is detected, stop the operation of those resources and start the resources of the standby system to carry the services of the primary system.
  • With this method, the resources of the primary system are monitored.
  • When a fault is detected, the resources of the primary system are stopped and the resources of the standby system are started to carry the services of the primary system.
  • This achieves automatic disaster recovery switching for the VMS voice mail service: the service keeps running normally, users notice no difference, and the switchover is seamless and smooth.
  • One implementation of this embodiment may use the following technical solution:
      1. Establish partitions under a VMS logical site; each partition contains the resources needed to run the service independently.
      2. Deploy a resource monitoring module in each partition to monitor the critical resources that support the VMS service in that partition. When a critical resource fails, the module automatically stops the other resources in the partition and notifies the resource monitoring module of the standby partition, which starts the standby resources in the standby partition to continue carrying the VMS service of the primary partition.
      3. The data used by the service in the standby partition is backed up before the disaster recovery switchover as follows:
         a) Data backup uses stream replication.
         b) Voice data is copied directly across partitions.
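The partition layout described in the three items above can be pictured with a small data model. This is only an illustrative sketch: the class names (`LogicalSite`, `Partition`, `Resource`), the `DataKind` labels, and the scheme strings are hypothetical and do not come from the publication, which does not specify any data structures.

```python
from dataclasses import dataclass, field
from enum import Enum


class DataKind(Enum):
    """Hypothetical labels for the two kinds of data named in item 3."""
    BACKUP_DATA = "backup_data"
    VOICE_DATA = "voice_data"


# Backup scheme per data kind, following items 3 a) and 3 b) of the text.
BACKUP_SCHEME = {
    DataKind.BACKUP_DATA: "stream_replication",
    DataKind.VOICE_DATA: "direct_cross_partition_copy",
}


@dataclass
class Resource:
    name: str
    kind: str        # assumed values: "db", "storage", "internal_interface"
    critical: bool   # True if the VMS service cannot run without it


@dataclass
class Partition:
    name: str
    role: str        # "primary" or "standby"
    resources: list = field(default_factory=list)

    def critical_resources(self):
        """Resources the partition's monitoring module has to watch."""
        return [r for r in self.resources if r.critical]


@dataclass
class LogicalSite:
    name: str
    partitions: list = field(default_factory=list)

    def standby_for(self, primary):
        """Return the first standby partition other than `primary`, or None."""
        for p in self.partitions:
            if p is not primary and p.role == "standby":
                return p
        return None


# Example: one logical site with a primary and a standby partition.
primary = Partition("P1", "primary", [
    Resource("main_db", "db", critical=True),
    Resource("portal_if", "internal_interface", critical=False),
])
standby = Partition("P2", "standby", [Resource("standby_db", "db", critical=True)])
site = LogicalSite("vms-site", [primary, standby])
```

Keeping the critical flag on each resource lets the monitoring module of a partition restrict its checks to the resources whose failure should trigger a switchover.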
  • Monitoring the resources of the primary system includes: probing the resource status of the primary system by sending a message or command; if a correct response is received from the primary system, determining that its resources are operating normally; if no response is received, or an error response is received, determining that its resources are operating abnormally.
  • The method further includes: once the number of consecutive checks in which the primary system's resource is found to be operating abnormally exceeds a threshold, determining that the resource has failed.
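The probe-and-count rule in the two items above can be sketched as follows. The names `classify` and `FaultDetector`, the expected reply string `"OK"`, and the default threshold of 3 are assumptions made for illustration; the publication only states that a missing or erroneous response is abnormal and that a fault is declared after the count of consecutive abnormal results exceeds a threshold.

```python
from enum import Enum
from typing import Callable, Optional


class ProbeResult(Enum):
    NORMAL = "normal"
    ABNORMAL = "abnormal"


def classify(response: Optional[str], expected: str = "OK") -> ProbeResult:
    """A missing response (None) and a wrong response both count as abnormal."""
    return ProbeResult.NORMAL if response == expected else ProbeResult.ABNORMAL


class FaultDetector:
    """Counts consecutive abnormal probes for a single resource."""

    def __init__(self, probe: Callable[[], Optional[str]], threshold: int = 3):
        self.probe = probe            # sends the message/command, returns the reply
        self.threshold = threshold    # assumed configurable; value not given in the text
        self.consecutive_abnormal = 0

    def check_once(self) -> bool:
        """Run one probe; return True once the resource is considered faulty."""
        if classify(self.probe()) is ProbeResult.NORMAL:
            self.consecutive_abnormal = 0   # any good reply resets the streak
        else:
            self.consecutive_abnormal += 1
        return self.consecutive_abnormal > self.threshold
```

Resetting the counter on every normal reply means a single transient timeout never accumulates toward a switchover, which is the point of requiring consecutive abnormal results.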
  • FIG. 2 is a structural block diagram of an automatic disaster recovery switching apparatus according to an embodiment of the present invention. As shown in FIG. 2,
  • the apparatus includes a monitoring module 10 and a switching module 20.
  • the monitoring module 10 is configured to monitor resources of the primary system.
  • the resources of the foregoing active system include at least one of the following: a DB resource, a storage resource, and an internal interface resource.
  • The switching module 20 is connected to the monitoring module 10 and is configured to, when a fault in the resources of the primary system is detected, stop the operation of those resources and start the resources of the standby system to carry the services of the primary system.
  • With this apparatus, the resources of the primary system are monitored. When a fault in those resources is detected, they are stopped and the resources of the standby system are started to carry the services of the primary system.
  • The monitoring module 10 includes: a state detecting unit configured to probe the resource status of the primary system by sending a message or command; a first determining unit configured to determine, when a correct response is received from the primary system, that its resources are operating normally; and a second determining unit configured to determine, when no response or an error response is received, that its resources are operating abnormally.
  • The monitoring module 10 further includes: a third determining unit configured to determine that a resource of the primary system has failed once the number of consecutive checks in which the resource is found to be operating abnormally exceeds a threshold.
  • The switching module 20 includes: a condition setting unit configured to set the automatic disaster recovery switching condition according to how strongly each resource affects system operation; and a switching unit configured to determine, when a fault in a resource of the primary system is detected, whether that resource meets the automatic disaster recovery switching condition and, if it does, to stop the resources of the primary system and start the resources of the standby system.
  • the monitoring module is a core module for implementing automatic disaster tolerance at the partition level. It mainly includes the following functions:
  • Periodically check the status of the specified resources:
      1) Monitoring DB resources: the resource monitoring module probes the DB resource status with a message or command. No response, or an error response, means the resource is abnormal.
      2) Monitoring storage resources: the resource monitoring module probes the storage resource status with a message or command. No response, or an error response, means the resource is abnormal.
      3) Monitoring other internal interface resources: the resource monitoring module probes the status of the other internal interface resources with a message or command. No response, or an error response, means the resource is abnormal.
    The resource monitoring module probes these resources periodically and treats a resource as failed once the number of consecutive abnormal results exceeds a threshold.
  • Automatic DR switchover conditions: the resource monitoring module detects and monitors all other resource modules, identifies the faulty resources, and ranks the resource modules by how strongly they affect system operation. A failure of a resource module that has a major impact on system operation is set as an automatic disaster recovery switching condition. When the condition is met, automatic DR switching is performed. When it is not met, no automatic DR switching is performed, and the faulty non-critical module is either handled on its own or left alone. At switching time the module also checks whether the standby partition and the standby resources are available; if they are not, the condition for automatic disaster recovery switching is not satisfied.
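The switching condition just described combines two checks: the failed resource must be one preset as having a major impact, and the standby side must be usable. A minimal sketch under those assumptions follows; `ResourceState`, `should_switch`, and the rule that standby resources are matched to primary resources by name are hypothetical choices, not details taken from the publication.

```python
from dataclasses import dataclass


@dataclass
class ResourceState:
    name: str
    high_impact: bool        # preset: a failure of this resource stops the service
    faulty: bool = False
    available: bool = True   # only meaningful for standby-side resources


def should_switch(primary_resources, standby_partition_available, standby_resources):
    """Decide whether automatic DR switching should start."""
    failed_critical = [r for r in primary_resources if r.faulty and r.high_impact]
    if not failed_critical:
        # Only non-critical faults (or none): handle locally, no switchover.
        return False
    if not standby_partition_available:
        return False
    standby_by_name = {r.name: r for r in standby_resources}
    # Assumption: every failed critical resource needs an available standby twin.
    return all(
        r.name in standby_by_name and standby_by_name[r.name].available
        for r in failed_critical
    )
```

For example, a failed `db` resource marked `high_impact` triggers switching only if the standby partition is available and its own `db` resource is available; a failed low-impact resource never triggers it.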
  • The resource monitoring module changes the status of the partition and its resources to offline or unavailable, and records the reason for the offline or unavailable state.
  • Control the specified modules to stop when switching: during automatic disaster recovery switching, the resource monitoring module issues the corresponding commands to stop the specified resources of the faulty partition. Different resource types correspond to different command types.
  • Notify the standby partition's resource monitoring module of the switching event: during automatic disaster recovery switching, the resource monitoring module notifies the standby partition's resource monitoring module of the switching event.
  • After receiving the switching notification from the primary partition's resource monitoring module, the standby partition's resource monitoring module enables the standby resource modules on the standby partition to take over the service.
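The four switchover actions above (mark the partition offline with a reason, stop resources with type-specific commands, notify the standby monitor, and have the standby take over) can be sketched as one sequence. Everything named here is hypothetical: the `ResourceMonitor` class, the command strings in `STOP_COMMANDS`, and the direct method call standing in for the notification, whose transport the publication does not specify.

```python
class ResourceMonitor:
    """Hypothetical per-partition resource monitoring module."""

    # Assumed mapping: each resource type has its own stop instruction.
    STOP_COMMANDS = {
        "db": "db_shutdown",
        "storage": "storage_unmount",
        "internal_interface": "interface_down",
    }

    def __init__(self, partition_name, resources, running):
        self.partition_name = partition_name
        self.resources = dict(resources)   # resource name -> resource type
        state = "running" if running else "stopped"
        self.status = {name: state for name in self.resources}
        self.partition_status = "online" if running else "standby"
        self.offline_reason = None
        self.issued = []                   # (command, resource) pairs, for inspection
        self.peer = None                   # monitor of the standby partition

    def switch_over(self, reason):
        """Primary side: record the reason, stop resources, notify the standby."""
        self.partition_status = "offline"
        self.offline_reason = reason
        for name, rtype in self.resources.items():
            self.issued.append((self.STOP_COMMANDS[rtype], name))
            self.status[name] = "stopped"
        if self.peer is not None:
            self.peer.on_switch_event(self.partition_name, reason)

    def on_switch_event(self, from_partition, reason):
        """Standby side: start the standby resources and take over the service."""
        for name in self.resources:
            self.status[name] = "running"
        self.partition_status = "online"
        self.issued.append(("takeover", from_partition))


primary_monitor = ResourceMonitor("P1", {"main_db": "db", "msg_store": "storage"}, running=True)
standby_monitor = ResourceMonitor("P2", {"standby_db": "db", "standby_store": "storage"}, running=False)
primary_monitor.peer = standby_monitor
primary_monitor.switch_over(reason="main_db failed")
```

Stopping the primary's resources before notifying the standby keeps the two partitions from serving at the same time, which is consistent with the order in which the text lists the actions.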
  • the resource monitoring module provides the function of receiving the notification of other modules to trigger the disaster recovery switchover.
  • the switching conditions and the operations to be performed are the same as those for automatic disaster recovery.
  • the monitoring module includes the following sub-modules:
  • Status monitoring sub-module: mainly reads from the DB the information about the resources to be checked, then performs the status checks.
  • Switch operation processing sub-module: performs the partition switching operation.
  • Takeover service control sub-module: mainly handles switching requests from the primary partition and performs the service takeover.
  • Active/standby partition offline control sub-module: mainly handles offline requests from the web and performs partition switching or takeover.
  • Service start and stop control sub-module: mainly starts and stops modules or resource objects.
  • SMDB stop processing sub-module: mainly handles requests from the web to stop the primary DB and performs the stop-service operation on the primary DB.
  • Once the partition's resource monitoring module has found the DB abnormal in more consecutive checks than the threshold allows, it confirms that the DB has failed. The DB module is preset as a key module for system operation.
  • The resource monitoring module then determines that its partition has a standby partition and that a standby DB exists and is available on that standby partition.
  • The resource monitoring module starts automatic disaster recovery switching, updates the partition and resource status in the database, and stops the other related resources.
  • The resource monitoring module of the standby partition is notified of the disaster recovery switching event.
  • After receiving the notification event, the resource monitoring module of the standby partition starts the related standby resources, which completes the partition-level automatic switching process.
  • This embodiment describes automatic disaster recovery switching at the partition level. If the faulty partition returns to normal, its resource monitoring module can detect this automatically, update the partition and resource status, start the related resource modules, and notify the standby partition's resource monitoring module that the primary partition has recovered. After receiving that event, the standby partition's resource monitoring module stops the related standby resources and stops taking over the service, and the primary partition takes over again, so that the system provides automatic disaster recovery at the partition level. To achieve this, data must be synchronized from the standby partition to the primary partition during the disaster recovery period.
  • the synchronization method and process are similar to the data synchronization of the primary partition to the standby partition.
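The return path described above can be sketched as a short sequence. The function name `fail_back`, the dictionary fields, and the choice to require that standby-to-primary synchronization has finished before the primary resumes service are assumptions; the publication states only that such synchronization is needed and does not say how its completion is checked.

```python
def fail_back(primary, standby, sync_standby_to_primary, events):
    """Return the service to a recovered primary partition.

    `primary` and `standby` are plain dicts standing in for partition state;
    `sync_standby_to_primary` is a callable returning True when data written
    during the takeover period has been copied back to the primary.
    """
    # Assumed precondition: the primary must hold the data written while the
    # standby was serving, otherwise messages left during the outage are lost.
    if not sync_standby_to_primary():
        events.append("sync_incomplete")
        return False

    # Primary side: update status and restart its resources.
    primary["status"] = "online"
    primary["resources_running"] = True
    events.append("primary_restarted")

    # Notify the standby that the primary has recovered; the standby stops
    # its standby resources and stops taking over the service.
    standby["resources_running"] = False
    standby["serving"] = False
    events.append("standby_released")

    # The primary partition carries the service again.
    primary["serving"] = True
    events.append("primary_serving")
    return True
```

If the synchronization callable reports failure, the sketch leaves the standby serving and changes nothing, so a premature recovery signal does not interrupt the service.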
  • With this approach, the VMS voice mail service switches automatically to the standby system after the primary system fails, so the service runs without interruption and users notice no difference before and after the switch. The result is fully automatic disaster recovery with smooth switching. While the preferred embodiments of the present invention have been disclosed for purposes of illustration, those skilled in the art will recognize that various modifications, additions and substitutions are possible, and the scope of the invention should not be limited to the embodiments described above.
  • The technical solution provided by the embodiments of the present invention can be applied to the field of computer communication. It achieves automatic disaster recovery switching for the VMS voice mail service: the service keeps running normally, users notice no difference, and the switchover is seamless and smooth.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Telephonic Communication Services (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Hardware Redundancy (AREA)
  • Alarm Systems (AREA)

Abstract

Disclosed are an automatic disaster recovery switching method and device. The method comprises: monitoring resources of a primary system; and, when a failure of the resources of the primary system is detected, stopping operation of those resources and starting resources of a standby system to bear the service of the primary system. By monitoring the resources of the primary system and, on failure, stopping them and starting the resources of the standby system to bear the service of the primary system, the present invention realizes automatic disaster recovery switching of a VMS voice mail service, ensures normal operation of the service with no difference in the user experience, and performs the switching seamlessly and smoothly.

Description

一种自动容灾切换方法及装置 技术领域 本发明涉及计算机通信领域, 特别是涉及一种自动容灾切换方法及装置。 背景技术 语音消息系统 (Voice Message System, 简称为 VMS) 业务是一种电信增值业务, 该业务基于语音消息平台, 用于语音信箱业务的实现, 主要是提供用户进行语音留言 的接收和发送的功能。 该业务提供用户接收留言和发送留言的基本功能, 新留言的通 知功能, 个性化的信箱设置功能, 不同的服务等级 (Class of Service, 简称为 COS) 等级功能等, 除了已有的互动式语音应答 (Interactive Voice Response, 简称为 IVR) 流程, 用户甚至可以通过个人门户、 WAP门户以及 VVM等访问和定制自己的语音信 箱。 宽带智能网平台实现对业务呼叫的控制, 对各种协议的封装, 对消息的重构, 很 好地支持了各种电信增值业务的运行。 针对不同类型的数据备份, 业务数据 ORACLE 采用了成熟的流复制方案, EBASE则采用自研的 X2X方案, 语音数据则通过内部接 口直接拷贝, 只需要考虑网络带宽即可。 相关技术中, 还没有语音信箱业务在主用系统发生故障后能自动切换到备用系统 上运行的应用实例。 大部分是通过手工切换, 即使对资源的状态能做到自动监控, 但 切换的操作也是相对独立的并且做不到完全自动。 在相关技术中, VMS语音信箱业务在主用系统发生故障后, 导致业务间断运行。 针对相关技术中的上述问题, 目前尚未提出有效的解决方案。 发明内容 针对相关技术中的上述问题,本发明实施例提供了一种自动容灾切换方法及装置, 用以解决上述技术问题。 根据本发明实施例的一个方面, 提供了一种自动容灾切换方法, 其中, 该方法包 括: 对主用系统的资源进行监控; 当监控到上述主用系统的资源出现故障时, 停止上 述主用系统的资源的运行, 启动备用系统的资源, 以承载上述主用系统的业务。 优选地, 对主用系统的资源进行监控包括: 以消息 /命令发送方式, 探测上述主用 系统的资源状态; 如果收到上述主用系统的正确响应, 则确定上述主用系统的资源运 行正常; 如果未收到上述主用系统的响应, 或者收到错误响应, 则确定上述主用系统 的资源运行异常。 优选地, 确定上述主用系统的资源运行异常之后, 上述方法还包括: 在连续监控 上述主用系统的资源运行异常的次数超过阈值次数后, 则确定上述主用系统的资源出 现故障。 优选地, 当监控到上述主用系统的资源出现故障时, 停止上述主用系统的资源的 运行, 启动备用系统的资源包括: 根据资源对系统运行的影响程度, 设定自动容灾切 换条件; 当监控到上述主用系统的资源出现故障时, 判断上述主用系统的资源是否满 足上述自动容灾切换条件; 如果满足, 则停止上述主用系统的资源的运行, 启动备用 系统的资源。 优选地, 上述主用系统的资源包括以下至少之一: 数据块 (Data Block, 简称为 DB) 资源, 存储资源, 内部接口资源。 根据本发明实施例的另一方面, 还提供了一种自动容灾切换装置, 其中, 该装置 包括: 监控模块, 设置为对主用系统的资源进行监控; 切换模块, 设置为在监控到上 述主用系统的资源出现故障时, 停止上述主用系统的资源的运行, 启动备用系统的资 源, 以承载上述主用系统的业务。 优选地, 上述监控模块包括: 状态探测单元, 设置为以消息 /命令发送方式, 探测 上述主用系统的资源状态; 第一确定单元,设置为在收到上述主用系统的正确响应时, 确定上述主用系统的资源运行正常; 第二确定单元, 设置为在未收到上述主用系统的 响应, 或者收到错误响应时, 确定上述主用系统的资源运行异常。 优选地, 上述监控模块还包括: 第三确定单元, 设置为在连续监控上述主用系统 的资源运行异常的次数超过阈值次数后, 确定上述主用系统的资源出现故障。 优选地, 上述切换模块包括: 条件设定单元, 设置为根据资源对系统运行的影响 程度, 设定自动容灾切换条件; 切换单元, 设置为在监控到上述主用系统的资源出现 故障时, 判断上述主用系统的资源是否满足上述自动容灾切换条件; 如果满足, 则停 止上述主用系统的资源的运行, 启动备用系统的资源。 优选地, 上述主用系统的资源包括以下至少之一: DB 资源, 存储资源, 内部接 口资源。 通过本发明实施例, 对主用系统的资源进行监控, 当监控到主用系统的资源出现 故障时, 停止主用系统的资源的运行, 启动备用系统的资源, 以承载上述主用系统的 业务, 从而实现了 VMS语音信箱业务的自动容灾切换, 保证业务的正常运行以及用 户的无差别感受, 进行无缝的平滑切换, 进行自动容灾切换。 
上述说明仅是本发明实施例技术方案的概述, 为了能够更清楚了解本发明实施例 的技术手段, 而可依照说明书的内容予以实施, 并且为了让本发明实施例的上述和其 它目的、 特征和优点能够更明显易懂, 以下特举本发明实施例的具体实施方式。 附图说明 图 1 是根据本发明实施例的自动容灾切换方法的流程图; 图 2 是根据本发明实施例的自动容灾切换装置的结构框图; 图 3 是根据本发明实施例的 VMS语音信箱业务的分区级自动容灾的系统架构图。 具体实施方式 本发明实施例提供了一种自动容灾切换方法及装置, 以下结合附图以及实施例, 对本发明实施例进行进一步详细说明。 应当理解, 此处所描述的具体实施例仅仅用以 解释本发明, 并不限定本发明。 本实施例提供了一种自动容灾切换方法, 图 1是根据本发明实施例的自动容灾切 换方法的流程图, 如图 1所示, 该方法包括以下步骤 (步骤 S102至步骤 S104): 步骤 S102, 对主用系统的资源进行监控。 优选地, 主用系统的资源包括以下至少 之一: DB资源, 存储资源, 内部接口资源。 步骤 S104, 当监控到上述主用系统的资源出现故障时, 停止上述主用系统的资源 的运行, 启动备用系统的资源, 以承载上述主用系统的业务。 通过上述方法, 对主用系统的资源进行监控, 当监控到主用系统的资源出现故障 时, 停止主用系统的资源的运行, 启动备用系统的资源, 以承载上述主用系统的业务, 从而实现了 VMS语音信箱业务的自动容灾切换, 保证业务的正常运行以及用户的无 差别感受, 进行无缝的平滑切换, 进行自动容灾切换。 本实施例的一个实施方式, 可以采用以下技术方案实现: 1、 在一个 VMS逻辑站 点下建立分区, 每个分区中包含能支撑业务独立运行的相关资源。 2、每个分区中部署 一个资源监控模块, 对该分区中支撑 VMS业务运行的关键资源进行监控, 发现关键 资源出现故障后, 自动停止该分区中其他资源的运行, 并通知备用分区资源监控模块, 启动备用分区中的备用资源继续承载该主用分区的 VMS业务。 3、 业务在备用分区所 使用的相关数据在容灾切换前通过以下方案进行备份: a) 数据备份采用流复制。 b ) 语音数据通过跨分区的直接复制。 优选地, 对主用系统的资源进行监控包括: 以消息 /命令发送方式, 探测主用系统 的资源状态; 如果收到主用系统的正确响应, 则确定主用系统的资源运行正常; 如果 未收到主用系统的响应, 或者收到错误响应, 则确定主用系统的资源运行异常。 优选地, 确定主用系统的资源运行异常之后, 上述方法还包括: 在连续监控主用 系统的资源运行异常的次数超过阈值次数后, 则确定主用系统的资源出现故障。 优选地, 当监控到主用系统的资源出现故障时, 停止主用系统的资源的运行, 启 动备用系统的资源包括: 根据资源对系统运行的影响程度, 设定自动容灾切换条件; 当监控到主用系统的资源出现故障时, 判断主用系统的资源是否满足自动容灾切换条 件; 如果满足, 则停止主用系统的资源的运行, 启动备用系统的资源。 对应于上述实施例介绍的自动容灾切换方法, 本实施例提供了一种自动容灾切换 装置, 用以实现上述实施例。 图 2是根据本发明实施例的自动容灾切换装置的结构框 图, 如图 2所示, 该装置包括: 监控模块 10和切换模块 20。 下面对该结构进行详细 介绍。 监控模块 10, 设置为对主用系统的资源进行监控。 优选地, 上述主用系统的资源 包括以下至少之一: DB资源, 存储资源, 内部接口资源。 切换模块 20,连接至监控模块 10,设置为在监控到上述主用系统的资源出现故障 时, 停止上述主用系统的资源的运行, 启动备用系统的资源, 以承载上述主用系统的 业务。 通过上述装置, 对主用系统的资源进行监控, 当监控到主用系统的资源出现故障 时, 停止主用系统的资源的运行, 启动备用系统的资源, 以承载上述主用系统的业务, 从而实现了 VMS语音信箱业务的自动容灾切换, 保证业务的正常运行以及用户的无 差别感受, 进行无缝的平滑切换, 进行自动容灾切换。 优选地, 上述监控模块 10包括: 状态探测单元, 设置为以消息 /命令发送方式, 探测上述主用系统的资源状态; 第一确定单元, 设置为在收到上述主用系统的正确响 应时, 确定上述主用系统的资源运行正常; 第二确定单元, 设置为在未收到上述主用 系统的响应, 或者收到错误响应时, 确定上述主用系统的资源运行异常。 优选地, 上述监控模块 10还包括: 第三确定单元, 设置为在连续监控上述主用系 统的资源运行异常的次数超过阈值次数后, 确定上述主用系统的资源出现故障。 优选地, 上述切换模块 20包括: 条件设定单元, 
设置为根据资源对系统运行的影 响程度, 设定自动容灾切换条件; 切换单元, 设置为在监控到上述主用系统的资源出 现故障时, 判断上述主用系统的资源是否满足上述自动容灾切换条件; 如果满足, 则 停止上述主用系统的资源的运行, 启动备用系统的资源。 下面通过具体实施例对 VMS语音信箱业务的分区级自动容灾操作进行介绍,图 3 是根据本发明实施例的 VMS 语音信箱业务的分区级自动容灾的系统架构图, 如图 3 所示, 监控模块是实现分区级自动容灾的核心模块, 主要包括以下功能: The present invention relates to the field of computer communications, and in particular, to an automatic disaster tolerance switching method and apparatus. A Voice Message System (VMS) service is a telecom value-added service. The service is based on a voice message platform and is used for voice mail service. It mainly provides users with the function of receiving and sending voice messages. . The service provides basic functions for users to receive messages and send messages, new message notification functions, personalized mailbox settings, different Class of Service (COS) level functions, etc., in addition to existing interactive voices. In the Interactive Voice Response (IVR) process, users can even access and customize their voicemail through personal portals, WAP portals, and VVM. The broadband intelligent network platform realizes the control of service calls, encapsulation of various protocols, and reconstruction of messages, which well supports the operation of various telecom value-added services. For different types of data backup, the business data ORACLE adopts a mature stream replication scheme, and EBASE adopts the self-developed X2X scheme. The voice data is directly copied through the internal interface, and only the network bandwidth needs to be considered. In the related art, no voice mail service can automatically switch to an application instance running on the standby system after the primary system fails. Most of them are switched by hand. Even if the status of the resource can be automatically monitored, the switching operation is relatively independent and cannot be completely automatic. 
In the related art, the VMS voice mail service causes a business to run intermittently after the failure of the primary system. In view of the above problems in the related art, an effective solution has not yet been proposed. SUMMARY OF THE INVENTION The present invention provides an automatic disaster tolerance switching method and apparatus for solving the above problems in the related art. According to an aspect of the embodiments of the present invention, an automatic disaster tolerance switching method is provided, where the method includes: monitoring resources of a primary system; and stopping monitoring when the resources of the primary system are faulty With the operation of the system resources, the resources of the standby system are started to carry the services of the above-mentioned main system. Preferably, monitoring the resources of the primary system includes: detecting a resource status of the primary system by using a message/command transmission manner; if receiving the correct response of the primary system, determining that the resources of the primary system are operating normally. If the response of the above-mentioned main system is not received, or an error response is received, it is determined that the resources of the above-mentioned main system are abnormal. Preferably, after determining that the resource of the primary system is abnormal, the method further includes: after continuously monitoring the number of times that the resource operation abnormality of the primary system exceeds a threshold number of times, determining that the resource of the primary system is faulty. 
Preferably, when the resource of the primary system is faulty, the resource of the primary system is stopped, and the resource of the standby system is started: the automatic disaster recovery switching condition is set according to the impact of the resource on the system operation; When the resource of the primary system is faulty, it is determined whether the resource of the primary system meets the automatic disaster tolerance switching condition; if yes, the resource of the primary system is stopped, and the resource of the standby system is started. Preferably, the resources of the foregoing active system include at least one of the following: a data block (Data Block, DB for short) resource, a storage resource, and an internal interface resource. According to another aspect of the present invention, an automatic disaster tolerance switching apparatus is further provided, where the apparatus includes: a monitoring module configured to monitor resources of the primary system; and a switching module configured to monitor the foregoing When the resources of the primary system fail, the resources of the primary system are stopped, and the resources of the standby system are started to carry the services of the primary system. Preferably, the monitoring module includes: a state detecting unit configured to detect a resource state of the primary system in a message/command sending manner; and a first determining unit configured to determine, when receiving the correct response of the primary system, The resource of the above-mentioned main system runs normally; the second determining unit is configured to determine that the resource of the main system is abnormally operated when the response of the main system is not received or an error response is received. 
Preferably, the monitoring module further includes: a third determining unit, configured to determine that the resource of the primary system is faulty after continuously monitoring the number of times that the resource running abnormality of the primary system exceeds a threshold number of times. Preferably, the switching module includes: a condition setting unit configured to set an automatic disaster tolerance switching condition according to a degree of influence of resources on system operation; and a switching unit configured to detect when a resource of the primary system is faulty, Determining whether the resource of the primary system meets the automatic disaster tolerance switching condition; if yes, stopping the operation of the resource of the primary system and starting the resource of the standby system. Preferably, the resources of the foregoing primary system include at least one of the following: a DB resource, a storage resource, and an internal interface resource. The resource of the active system is monitored by the embodiment of the present invention. When the resource of the active system is detected to be faulty, the resource of the active system is stopped, and the resource of the standby system is started to carry the service of the primary system. In this way, the automatic disaster recovery switching of the VMS voice mail service is realized, the normal operation of the service and the user's indistinct feeling are ensured, seamless and smooth switching is performed, and automatic disaster recovery switching is performed. 
The above description is only an overview of the technical solutions of the embodiments of the present invention. So that the technical means of the embodiments can be understood more clearly and implemented according to the contents of the specification, and so that the above and other objects, features and advantages of the embodiments are more apparent, specific embodiments are set out below.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a flowchart of an automatic disaster tolerance switching method according to an embodiment of the present invention;
FIG. 2 is a structural block diagram of an automatic disaster tolerance switching apparatus according to an embodiment of the present invention;
FIG. 3 is a system architecture diagram of partition-level automatic disaster tolerance for the VMS voice mail service according to an embodiment of the present invention.

DETAILED DESCRIPTION OF THE EMBODIMENTS

The embodiments of the present invention provide an automatic disaster tolerance switching method and apparatus. The embodiments are described in further detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described herein merely illustrate the invention and are not intended to limit it.

This embodiment provides an automatic disaster tolerance switching method. FIG. 1 is a flowchart of an automatic disaster tolerance switching method according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps (step S102 to step S104):

Step S102: monitor the resources of the primary system. Preferably, the resources of the primary system include at least one of the following: a DB resource, a storage resource, and an internal interface resource.
Step S104: when a fault in the resources of the primary system is detected, stop the operation of the resources of the primary system and start the resources of the standby system to carry the services of the primary system.

With the above method, the resources of the primary system are monitored; when a fault in those resources is detected, their operation is stopped and the resources of the standby system are started to carry the services of the primary system. Automatic disaster tolerance switching of the VMS voice mail service is thereby achieved: the service keeps running normally, users perceive no difference, and the switchover is seamless and smooth.

This embodiment may be implemented as follows:

1. Partitions are established under a VMS logical site, and each partition includes the related resources needed to run the service independently.
2. A resource monitoring module is deployed in each partition to monitor the critical resources that support the VMS service in that partition. When a critical resource is detected to have failed, the other resources in the partition are stopped automatically and the resource monitoring module of the standby partition is notified; the standby resources in the standby partition are then started and continue to carry the VMS service of the primary partition.
3. The data used by the service in the standby partition is backed up before the disaster tolerance switch as follows:
a) Data backup uses stream replication.
b) Voice data is copied directly across partitions.
Preferably, monitoring the resources of the primary system includes: probing the resource status of the primary system by sending a message or command; if a correct response is received from the primary system, determining that the resources of the primary system are operating normally; and if no response or an error response is received from the primary system, determining that the resources of the primary system are operating abnormally.

Preferably, after determining that the resources of the primary system are operating abnormally, the method further includes: determining that the resources of the primary system have failed once the number of consecutively detected abnormal results exceeds a threshold.

Preferably, stopping the operation of the resources of the primary system and starting the resources of the standby system when a fault is detected includes: setting an automatic disaster tolerance switching condition according to the degree to which each resource affects system operation; when a fault in the resources of the primary system is detected, determining whether the resources of the primary system meet the automatic disaster tolerance switching condition; and, if so, stopping the operation of the resources of the primary system and starting the resources of the standby system.

Corresponding to the automatic disaster tolerance switching method described in the foregoing embodiment, this embodiment provides an automatic disaster tolerance switching apparatus for implementing that method. FIG. 2 is a structural block diagram of an automatic disaster tolerance switching apparatus according to an embodiment of the present invention. As shown in FIG. 2, the apparatus includes a monitoring module 10 and a switching module 20, which are described in detail below.

The monitoring module 10 is configured to monitor resources of the primary system. Preferably, the resources of the primary system include at least one of the following: a DB resource, a storage resource, and an internal interface resource.

The switching module 20 is connected to the monitoring module 10 and is configured to, when a fault in the resources of the primary system is detected, stop the operation of the resources of the primary system and start the resources of the standby system to carry the services of the primary system.

With the foregoing apparatus, the resources of the primary system are monitored; when a fault in those resources is detected, their operation is stopped and the resources of the standby system are started to carry the services of the primary system. Automatic disaster tolerance switching of the VMS voice mail service is thereby achieved: the service keeps running normally, users perceive no difference, and the switchover is seamless and smooth.

Preferably, the monitoring module 10 includes: a state detecting unit configured to probe the resource state of the primary system by sending a message or command; a first determining unit configured to determine, when a correct response is received from the primary system, that the resources of the primary system are operating normally; and a second determining unit configured to determine, when no response or an error response is received from the primary system, that the resources of the primary system are operating abnormally.

Preferably, the monitoring module 10 further includes: a third determining unit configured to determine that the resources of the primary system have failed once the number of consecutively detected abnormal results exceeds a threshold.
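The probing and fault-confirmation behaviour of the monitoring module described above could be sketched roughly as follows. This is only an illustrative sketch, not the implementation of the embodiment: the class name `ResourceMonitor`, the `OK` marker, the `probe` callable and the default threshold of 3 are all assumptions introduced here, since the text specifies neither a message format nor a threshold value.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical marker for a "correct response"; the text only distinguishes
# a correct response from a missing or erroneous one.
OK = "ok"


@dataclass
class ResourceMonitor:
    """Tracks one resource of the primary system (illustrative names only)."""
    name: str
    probe: Callable[[], Optional[str]]  # returns OK, an error string, or None
    threshold: int = 3                  # assumed value, not given in the text
    abnormal_count: int = 0

    def check_once(self) -> bool:
        """Send one probe; return True once the resource counts as faulty."""
        try:
            response = self.probe()
        except Exception:
            response = None  # a probe that raises is treated as "no response"
        if response == OK:
            self.abnormal_count = 0  # a correct response ends the abnormal run
            return False
        self.abnormal_count += 1     # no response or an error response
        # Faulty only after the consecutive abnormal count exceeds the threshold.
        return self.abnormal_count > self.threshold


# Usage: one normal probe followed by four consecutive abnormal ones.
responses = iter([OK, None, "error", None, None])
monitor = ResourceMonitor("db", probe=lambda: next(responses), threshold=3)
print([monitor.check_once() for _ in range(5)])
# [False, False, False, False, True]
```

Resetting the counter on every correct response is one reading of "continuously"; a deployment that tolerates intermittent failures differently would need another counting rule.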
Preferably, the switching module 20 includes: a condition setting unit configured to set an automatic disaster tolerance switching condition according to the degree to which each resource affects system operation; and a switching unit configured to, when a fault in the resources of the primary system is detected, determine whether the resources of the primary system meet the automatic disaster tolerance switching condition and, if so, stop the operation of the resources of the primary system and start the resources of the standby system.

The partition-level automatic disaster tolerance operation of a VMS voice mail service is described below by way of a specific embodiment. FIG. 3 is a system architecture diagram of partition-level automatic disaster tolerance for the VMS voice mail service according to an embodiment of the present invention. As shown in FIG. 3, the monitoring module is the core module for implementing partition-level automatic disaster tolerance. It mainly provides the following functions:
1. Periodically checking the status of specified resources:
1) Monitoring DB resources: the resource monitoring module probes the DB resource status by sending a message or command; if there is no response or the response is erroneous, the resource is judged abnormal.
2) Monitoring storage resources: the resource monitoring module probes the storage resource status by sending a message or command; if there is no response or the response is erroneous, the resource is judged abnormal.
3) Monitoring other internal interface resources: the resource monitoring module probes the status of other internal interface resources by sending a message or command; if there is no response or the response is erroneous, the resource is judged abnormal.
The resource monitoring module checks the above resources periodically and determines that a resource is abnormal once the number of consecutive abnormal checks exceeds a threshold.

2. Automatic disaster tolerance switching conditions: the resource monitoring module can detect and monitor all other resource modules and identify the resources that have failed. Resource modules are classified by how strongly they affect system operation, and the failure of a resource module with a large impact on system operation is set as an automatic disaster tolerance switching condition. When the condition is met, automatic disaster tolerance switching is performed; when it is not met, no automatic switching is performed, and non-critical modules receive other handling or none. In addition, at switching time the module further checks whether the standby partition and the standby resources are available; if they are not, the condition for automatic disaster tolerance switching is likewise not met.

3. Modifying the status of the partition and resources at switching: once the conditions for automatic disaster tolerance switching are met, the resource monitoring module changes the status of the partition and its resources to offline or unavailable, and may provide or mark information such as the reason for the offline or unavailable status.

4. Controlling specified modules to stop at switching: during an automatic disaster tolerance switch, the resource monitoring module issues the corresponding commands to stop the specified resources of the faulty partition; different resource types correspond to different command types.

5. Notifying the resource monitoring module of the standby partition of the switching event: during an automatic disaster tolerance switch, the resource monitoring module notifies the resource monitoring module of the standby partition of the switching event.

6. Controlling the startup of the corresponding modules at switching: after receiving the switching notification from the resource monitoring module of the primary partition, the resource monitoring module of the standby partition enables the standby resource modules on the standby partition and takes over the service.

7. Triggering a switch on notification from other modules: the resource monitoring module can receive notifications from other modules that trigger a disaster tolerance switch; the switching conditions and the operations performed are the same as for automatic disaster tolerance switching.

In FIG. 3, the monitoring module includes the following sub-modules:
1. Status monitoring sub-module: obtains from the DB the information on the resources to be checked and then performs the status checks.
2. Switching operation processing sub-module: performs the partition switching operations.
3. Takeover service control sub-module: handles switching requests from the primary partition and performs the service takeover.
4. Primary/standby partition offline control sub-module: handles offline requests from the web and performs the partition switching or takeover.
5. Service start/stop control sub-module: starts and stops modules or resource objects.
6. SMDB stop processing sub-module: handles requests from the web to stop the primary DB and stops the service of the primary DB.

When a DB resource on the primary partition fails and the resource monitoring module of that partition has detected the abnormality consecutively more than the set threshold number of times, the DB is confirmed to have failed. The DB module is one of the preset critical modules for system operation, so the resource monitoring module then checks that the partition has a standby partition and that a standby DB exists on that standby partition and is available. If so, the resource monitoring module starts the automatic disaster tolerance switch: it updates the partition and resource status in the database, stops the other related resources, and notifies the resource monitoring module of the standby partition of the switching event. On receiving the notification, the resource monitoring module of the standby partition starts the related standby resources, which completes the partition-level automatic disaster tolerance switch. The voice mail service remains available throughout, and users perceive no difference from normal operation.
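The switchover sequence described above (check the switching condition, mark the primary partition offline, stop its resources, notify the standby side, start the standby resources) could be sketched as follows. This is a simplified model under stated assumptions, not the implementation of the embodiment: the `Partition` class, the `CRITICAL_RESOURCES` set, the status strings and the direct call to `take_over()` in place of a notification message are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

# Assumption: which resource types count as critical is a deployment choice.
CRITICAL_RESOURCES: Set[str] = {"db", "storage"}


@dataclass
class Partition:
    """A partition and the run state of its resources (illustrative model)."""
    name: str
    status: str                                   # "online", "offline", "standby"
    running: Dict[str, bool] = field(default_factory=dict)
    offline_reason: Optional[str] = None

    def take_over(self) -> None:
        """Standby side: start the standby resources and carry the service."""
        for resource in self.running:
            self.running[resource] = True
        self.status = "online"


def switch_condition_met(faulty: str, standby: Optional[Partition]) -> bool:
    """A critical resource failed and a usable standby resource exists."""
    return (
        faulty in CRITICAL_RESOURCES
        and standby is not None
        and standby.status == "standby"
        and faulty in standby.running
    )


def fail_over(primary: Partition, standby: Optional[Partition], faulty: str) -> bool:
    """Run the switchover steps in order; return True if a switch happened."""
    if not switch_condition_met(faulty, standby):
        return False                              # condition not met: no switch
    assert standby is not None                    # guaranteed by the check above
    primary.status = "offline"                    # update the partition status
    primary.offline_reason = f"{faulty} fault"    # record why it went offline
    for resource in primary.running:              # stop the partition's resources
        primary.running[resource] = False
    standby.take_over()                           # stands in for the notification
    return True


primary = Partition("A", "online", {"db": True, "storage": True, "iface": True})
standby = Partition("B", "standby", {"db": False, "storage": False, "iface": False})
print(fail_over(primary, standby, "iface"))  # False: not a critical resource
print(fail_over(primary, standby, "db"))     # True: the partition is switched
```

In a real deployment the two monitoring modules would run in separate partitions and exchange a notification message; the direct method call here only keeps the sketch self-contained.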
This embodiment describes automatic disaster tolerance switching at the partition level. When the faulty partition returns to normal, its resource monitoring module can still detect this automatically: it updates the partition and resource status, starts the related resource modules, and notifies the resource monitoring module of the standby partition that the primary partition has recovered. On receiving that message, the resource monitoring module of the standby partition stops the related standby resources and stops the takeover service, so that the primary partition takes the service back. The system thus also provides automatic recovery at the partition level. Implementing this function additionally requires that data be synchronized from the standby partition to the primary partition during the disaster tolerance period; the synchronization method and process are similar to those used for synchronizing data from the primary partition to the standby partition.

As can be seen from the above description, with the present invention the VMS voice mail service switches automatically to the standby system after the primary system fails, so the service runs without interruption and users perceive no difference before and after the switch. The disaster tolerance is achieved by a fully automatic, smooth switchover.

Although preferred embodiments of the present invention have been disclosed for purposes of illustration, those skilled in the art will recognize that various modifications, additions and substitutions are possible; the scope of the invention should therefore not be limited to the embodiments described above.
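As an illustration of the partition-level recovery described in this embodiment (the primary partition returns to normal, data is synchronized back, the primary resources are restarted, and the standby partition stops its takeover), a minimal sketch might look like the following. The `Partition` class, the `data` list and the one-shot copy used for synchronization are assumptions made for brevity; the text describes synchronization as an ongoing process and does not specify its mechanism.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Partition:
    """Illustrative partition model; field names are hypothetical."""
    name: str
    status: str                                   # "online", "offline", "standby"
    running: Dict[str, bool] = field(default_factory=dict)
    data: List[str] = field(default_factory=list)


def fail_back(primary: Partition, standby: Partition, primary_healthy: bool) -> bool:
    """Return the service to a recovered primary partition."""
    if not primary_healthy or primary.status != "offline":
        return False                              # nothing to recover
    # Simplification: copy the data written while the standby carried the service.
    primary.data = list(standby.data)
    primary.status = "online"                     # update the partition status
    for resource in primary.running:              # restart the primary resources
        primary.running[resource] = True
    # Stands in for notifying the standby monitor, which then stops its
    # standby resources and gives up the takeover.
    for resource in standby.running:
        standby.running[resource] = False
    standby.status = "standby"
    return True


primary = Partition("A", "offline", {"db": False}, ["msg1"])
standby = Partition("B", "online", {"db": True}, ["msg1", "msg2"])
print(fail_back(primary, standby, primary_healthy=True))  # True
print(primary.status, standby.status)                     # online standby
```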
Industrial Applicability

The technical solution provided by the embodiments of the present invention can be applied in the field of computer communication. It achieves automatic disaster tolerance switching of the VMS voice mail service, keeps the service running normally with no perceptible difference for users, and performs the switchover seamlessly and smoothly.

Claims

1. An automatic disaster tolerance switching method, comprising:
monitoring resources of a primary system; and
when a fault in the resources of the primary system is detected, stopping operation of the resources of the primary system and starting resources of a standby system to carry the services of the primary system.

2. The method according to claim 1, wherein monitoring the resources of the primary system comprises:
probing the resource status of the primary system by sending a message/command;
if a correct response is received from the primary system, determining that the resources of the primary system are operating normally; and
if no response is received from the primary system, or an error response is received, determining that the resources of the primary system are operating abnormally.

3. The method according to claim 1, wherein, after determining that the resources of the primary system are operating abnormally, the method further comprises:
determining that the resources of the primary system have failed after the number of consecutively detected abnormal results for the resources of the primary system exceeds a threshold.

4. The method according to claim 1, wherein stopping operation of the resources of the primary system and starting the resources of the standby system when a fault in the resources of the primary system is detected comprises:
setting an automatic disaster tolerance switching condition according to the degree of influence of resources on system operation; and
when a fault in the resources of the primary system is detected, determining whether the resources of the primary system meet the automatic disaster tolerance switching condition and, if so, stopping operation of the resources of the primary system and starting the resources of the standby system.

5. The method according to any one of claims 1 to 4, wherein the resources of the primary system comprise at least one of the following:
a data block (DB) resource, a storage resource, and an internal interface resource.

6. An automatic disaster tolerance switching apparatus, comprising:
a monitoring module configured to monitor resources of a primary system; and
a switching module configured to, when a fault in the resources of the primary system is detected, stop operation of the resources of the primary system and start resources of a standby system to carry the services of the primary system.

7. The apparatus according to claim 6, wherein the monitoring module comprises:
a state detecting unit configured to probe the resource status of the primary system by sending a message/command;
a first determining unit configured to determine, when a correct response is received from the primary system, that the resources of the primary system are operating normally; and
a second determining unit configured to determine, when no response is received from the primary system or an error response is received, that the resources of the primary system are operating abnormally.

8. The apparatus according to claim 7, wherein the monitoring module further comprises:
a third determining unit configured to determine that the resources of the primary system have failed after the number of consecutively detected abnormal results for the resources of the primary system exceeds a threshold.

9. The apparatus according to claim 7, wherein the switching module comprises:
a condition setting unit configured to set an automatic disaster tolerance switching condition according to the degree of influence of resources on system operation; and
a switching unit configured to, when a fault in the resources of the primary system is detected, determine whether the resources of the primary system meet the automatic disaster tolerance switching condition and, if so, stop operation of the resources of the primary system and start the resources of the standby system.

10. The apparatus according to any one of claims 7 to 9, wherein the resources of the primary system comprise at least one of the following:
a data block (DB) resource, a storage resource, and an internal interface resource.
PCT/CN2014/075319 2013-10-30 2014-04-14 Automatic disaster recovery switching method and device WO2014176969A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310528896.6 2013-10-30
CN201310528896.6A CN104601350A (en) 2013-10-30 2013-10-30 Automatic disaster-tolerant switching method and device

Publications (1)

Publication Number Publication Date
WO2014176969A1 true WO2014176969A1 (en) 2014-11-06

Family

ID=51843106

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/075319 WO2014176969A1 (en) 2013-10-30 2014-04-14 Automatic disaster recovery switching method and device

Country Status (2)

Country Link
CN (1) CN104601350A (en)
WO (1) WO2014176969A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109726190B (en) * 2018-09-07 2021-06-29 网联清算有限公司 Automatic switching method and device for database control center and storage medium
CN114143619A (en) * 2021-11-25 2022-03-04 新华三技术有限公司成都分公司 Base station over-temperature protection method and device and electronic equipment
CN116863723B (en) * 2023-08-14 2024-05-07 深圳市双银科技有限公司 Use method of digital twin base

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101136758A (en) * 2007-07-20 2008-03-05 南京联创科技股份有限公司 Application method for online accounting system in owing risk control system
CN102025551A (en) * 2010-12-23 2011-04-20 中兴通讯股份有限公司 Method and device for switching master device to backup device based on access gateway
CN102487332A (en) * 2010-12-03 2012-06-06 中兴通讯股份有限公司 Fault processing method, apparatus thereof and system thereof
CN102497288A (en) * 2011-12-13 2012-06-13 华为技术有限公司 Dual-server backup method and dual system implementation device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101110862A (en) * 2006-07-18 2008-01-23 中兴通讯股份有限公司 Speech processing system implementing method
US8676753B2 (en) * 2009-10-26 2014-03-18 Amazon Technologies, Inc. Monitoring of replicated data instances
CN102571310B (en) * 2010-12-09 2016-03-30 中兴通讯股份有限公司 The disaster recovery method of Voice Mail Service and device

Also Published As

Publication number Publication date
CN104601350A (en) 2015-05-06

Similar Documents

Publication Publication Date Title
US11307943B2 (en) Disaster recovery deployment method, apparatus, and system
CN101908980B (en) Network management upgrading method and system
CN105933407B (en) method and system for realizing high availability of Redis cluster
CN112181660A (en) High-availability method based on server cluster
WO2012174893A1 (en) Dual-center disaster recovery-based switching method and device in iptv system
CN103812675A (en) Method and system for realizing allopatric disaster recovery switching of service delivery platform
CN102916825A (en) Management equipment of dual-computer hot standby system, management method and dual-computer hot standby system
CN106533736B (en) Network equipment restarting method and device
EP2637102B1 (en) Cluster system with network node failover
CN110673981B (en) Fault recovery method, device and system
CN105577444B (en) A kind of wireless controller management method and wireless controller
CN112799786A (en) Exit method, device, equipment and storage medium of micro-service instance
CN101594383A (en) A kind of service of double controller storage system and controller state method for supervising
WO2014176969A1 (en) Automatic disaster recovery switching method and device
CN109842526B (en) Disaster recovery method and device
CN101442437B (en) Method, system and equipment for implementing high availability
CN113438111A (en) Method for restoring RabbitMQ network partition based on Raft distribution and application
JP2006285443A (en) Object relief system and method
JP5285044B2 (en) Cluster system recovery method, server, and program
CN114598594B (en) Method, system, medium and equipment for processing application faults under multiple clusters
CN107682888B (en) Cloud AC redundancy backup system and method
CN106817238A (en) Virtual machine repair method, virtual machine, system and business function network element
JP6856574B2 (en) Service continuation system and service continuation method
CN115408199A (en) Disaster tolerance processing method and device for edge computing node
CN107783855B (en) Fault self-healing control device and method for virtual network element

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14791406

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14791406

Country of ref document: EP

Kind code of ref document: A1