WO2005055517A1 - Apparatus for realizing soft-switch allopatric disaster recovery based on packet network - Google Patents

Apparatus for realizing soft-switch allopatric disaster recovery based on packet network

Info

Publication number
WO2005055517A1
WO2005055517A1 PCT/CN2003/001041 CN0301041W WO2005055517A1 WO 2005055517 A1 WO2005055517 A1 WO 2005055517A1 CN 0301041 W CN0301041 W CN 0301041W WO 2005055517 A1 WO2005055517 A1 WO 2005055517A1
Authority
WO
WIPO (PCT)
Prior art keywords
core control
control device
access
disaster recovery
devices
Prior art date
Application number
PCT/CN2003/001041
Other languages
English (en)
French (fr)
Inventor
Chen Wang
Xianli Hu
Rujun Li
Haipeng Li
Original Assignee
Zte Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Family has litigation
First worldwide family litigation filed: https://patents.darts-ip.com/?family=34638029&utm_source=google_patent&utm_medium=platform_link&utm_campaign=public_patent_search&patent=WO2005055517(A1) ("Global patent litigation dataset" by Darts-ip is licensed under a Creative Commons Attribution 4.0 International License.)
Application filed by Zte Corporation
Priority to PCT/CN2003/001041 (WO2005055517A1)
Priority to AU2003292846A (AU2003292846A1)
Priority to CN2003801103950A (CN100407620C)
Priority to EP03782048.7A (EP1705829B2)
Priority to US10/581,387 (US7675850B2)
Publication of WO2005055517A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/04: Network management architectures or arrangements
    • H04L41/042: Network management architectures or arrangements comprising distributed management centres cooperatively managing the network

Definitions

  • The present invention belongs to the field of communications and relates to the improvement of communication switching equipment under a next-generation network architecture. Specifically, the invention relates to a device for implementing soft-switch remote disaster recovery based on a packet network.
  • In past communication switching equipment, switching control was mainly performed by hardware. In that case the control domain of each device is fixed, and the control device and the controlled equipment are connected by cables. It is almost impossible for such a system to achieve disaster recovery at a remote or other location.
  • The next-generation network uses soft switching as the control core and a packet-switched network as the transport network. Eliminating fixed cable connections makes it possible for a control device to provide a remote disaster recovery function.
  • The core control devices of current next-generation networks do not yet provide a remote disaster recovery function.
  • The object of the present invention is to provide a device for implementing soft-switch remote disaster recovery based on a packet network, so that the core control devices of the next-generation network have a remote disaster recovery function. When one of the core control devices fails, a core control device at another location takes over the controlled equipment under its control and continues to provide control services.
  • The invention is implemented as follows:
  • A device for implementing soft-switch remote disaster recovery based on a packet network includes at least two core control devices located at different physical locations, each providing control services for its own access devices;
  • each core control device further includes, for remote disaster recovery: a processing unit, a database unit, a sharing unit, and a synchronization process unit;
  • the processing unit and the database unit are independent of the existing processor and database in the core control device and serve the access devices of the remote site, so that core control devices at different physical locations are in a mutual remote disaster recovery relationship;
  • the sharing unit is used to share processing capability and data;
  • the synchronization process unit completes data synchronization between core control devices that are in a mutual remote disaster recovery relationship.
  • The data of the database unit serving the remote access devices comes from the database of the remote core control device and is obtained through the synchronization process units in the two core control devices;
  • a change in the configuration data of any core control device triggers its synchronization process unit to synchronize the data to the other core control device in the disaster recovery relationship;
  • any core control device may also actively request the relevant configuration data from the other core control device in the disaster recovery relationship through its synchronization process unit.
  • In the sharing unit, the shared processing capability is mainly network processing capability, and the shared data includes the local basic environment parameters, the distribution table, and the current distribution of access devices; the distribution table is used to make routing decisions for requests entering the core control device, deciding whether the existing processor or the processing unit handles them.
  • The current distribution of access devices covers not only the locally mastered access devices but also the remotely controlled access devices.
  • When an access device registers or unregisters on the local core control device, its current distribution must be recorded whether or not it is a locally mastered access device, and synchronized in real time through the synchronization process unit to the remote core control device in the disaster recovery relationship, so that the access device can be accessed by other devices.
  • The data synchronization between the remote disaster recovery devices, carried out by the synchronization process unit, generally runs over a TCP connection that is maintained throughout the life cycle of the system, to ensure reliable and timely synchronization of data over the IP network.
  • When the disaster recovery relationship is first established, or when data is resynchronized, a large amount of configuration data must be synchronized. A data file can be generated locally, transferred to the remote site via FTP, and the data then extracted from the file at the remote site, which improves transfer efficiency and network utilization.
  • Routine maintenance and management of each core control device are performed independently, while additions, deletions, and changes to configuration data are synchronized to the core control device in the disaster recovery relationship, so that a remote disaster recovery switchover can be performed immediately when a core control device fails.
  • Failure of a core control device is sensed by the access devices it serves: an access device must be able to actively detect, through a protocol handshake mechanism, whether the core control device is available, and after detecting a failure it automatically switches to the pre-configured core control device in the disaster recovery relationship.
  • The core control device and the access devices it controls and serves are located in a packet-switched network;
  • the core control device supports remote disaster recovery switchover of a subset of access devices: while its disaster recovery peer is operating normally, it can take over some or all of the peer's access devices.
  • The current registration distribution of access devices is synchronized in real time between core control devices in the disaster recovery relationship. Based on this distribution, a core control device decides whether access to a local access device is completed locally or forwarded to the remote site, and whether access to a remote access device is forwarded to the remote site or completed locally.
  • After a failed core control device resumes operation, the switchover of its local access devices registered on the remote core control device back to the local core control device is carried out by the remote core control device according to a pre-established disaster recovery strategy.
  • FIG. 1 is a schematic structural diagram of a next-generation network core control device providing remote disaster recovery.
  • In the next-generation network architecture that uses soft switching as the core control device, there are two or more core control devices located at different physical locations. Under normal circumstances all of these control devices are in service; each manages its own access devices and provides control services. When one core control device suffers a catastrophic failure and cannot continue to provide services, the remote core control device in the disaster recovery relationship can take over from the failed control device in the shortest time and continue to provide full service. As long as the access devices it controlled have not themselves failed, their operation is not affected.
  • In the control device of the present invention, to provide a remote disaster recovery function, an independent processor and database space are set aside to serve the peer control device, which minimizes the impact on local system services during remote disaster recovery.
  • Routine maintenance and management of each core control device are performed independently, but additions, deletions, and changes to configuration data are synchronized to the device in the disaster recovery relationship. The intent is that when a control device suffers a disaster, a remote disaster recovery switchover can be performed immediately to take over from the failed device.
  • Failure of a core control device is sensed by the access devices rather than by other control devices. That is, an access device must be able to actively detect, through a protocol handshake mechanism, whether the control device is available, and automatically switch to the pre-configured backup control device after detecting a failure.
  • The control device and the access devices it controls are located in a packet-switched network, and the availability of the control device is sensed by the access device. Therefore, when an access device considers its master control device unavailable for some reason (such as network unreachability), it switches to the pre-configured backup control device even though the control device may still be operating normally.
  • The control device according to the present invention supports remote disaster recovery switchover of a subset of access devices: while its disaster recovery peer is operating normally, it can take over some or all of the peer's access devices.
  • Because the control device supports remote disaster recovery switchover of a subset of access devices, the current registration distribution of access devices is synchronized in real time between the control devices in the disaster recovery relationship. Based on this distribution, the control device decides whether access to a local access device is completed locally or forwarded to the remote site, and whether access to a remote access device is forwarded to the remote site or completed locally.
  • After the failed control device resumes operation, the switchover of the access devices registered on the backup control device back to the master control device is carried out by the backup control device according to a pre-established disaster recovery strategy.
  • FIG. 1 is a schematic diagram of how the core control device according to the present invention provides the remote disaster recovery function.
  • The core control devices C1 and C2 are in a mutual remote disaster recovery relationship.
  • In the figure, distinct connector symbols indicate, respectively, data synchronization between control devices in the remote disaster recovery relationship, processor access to the database, and data exchange between the processor and the shared area.
  • To provide remote disaster recovery, an independent processing unit and database unit are set aside in the control device to serve the peer control device, which minimizes the impact on local system services during remote disaster recovery.
  • P (processor) and DB (database) serve the locally mastered access devices, while P_ (processing unit) and DB_ (database unit) serve the access devices of the remote control device; the two pairs are independent of each other (for example, in control device C1, P1 and DB1 serve C1, while P2_ and DB2_ serve C2).
  • The Shared Area in the device is the shared part, covering shared processing capability and shared data. Synchronization (the synchronization process) completes data synchronization between control devices in a mutual remote disaster recovery relationship.
  • The data in the database unit DB_, which serves the remote access devices, comes from the database DB of the remote control device: the data of DB2_ in C1 comes from DB2 in C2, and vice versa. This is achieved through the synchronization processes in C1 and C2.
  • When the configuration data of any control device changes, the synchronization process is triggered to synchronize the data to the other control device in the disaster recovery relationship.
  • Any control device can also actively request the relevant configuration data from the other control device in the disaster recovery relationship through the synchronization process.
  • The processing capability shared in the Shared Area is mainly network processing capability.
  • The shared data includes the local basic environment parameters, the distribution table, and the current distribution of access devices.
  • The distribution table makes routing decisions for requests entering the device, deciding whether they are handled by P or by P_.
  • The current distribution of access devices covers not only local access devices but also remote access devices. When an access device registers or unregisters on the control device, its current distribution must be recorded whether or not it is a locally mastered access device, and synchronized in real time through the synchronization process to the remote control device in the disaster recovery relationship. Only then can the access device be accessed by other devices. It is this distribution data that allows the control device to support remote disaster recovery switchover of a subset of access devices.
  • Synchronization completes data synchronization between the control devices in a mutual remote disaster recovery relationship.
  • To ensure reliable and timely synchronization over the IP network, data synchronization between the remote disaster recovery devices generally runs over a TCP connection that is maintained throughout the life cycle of the system.
  • The networking diagram in FIG. 2 shows a typical next-generation network architecture.
  • The core control devices C1, C2, and C3 are next-generation network core control devices that provide remote disaster recovery. They are located at the control layer of the network architecture, and C1 and C2 are in a mutual remote disaster recovery relationship.
  • The access devices A1, A2, A3, A4, and A5 are controlled by the core control devices and are located in the access layer of the network architecture. The master control device of A1 and A2 is C1, the master control device of A3 and A4 is C2, and the master control device of A5 is C3.
  • When control device C1 or C2 fails, for example C1, the other control device C2 takes over the access devices A1 and A2 that C1 controlled; this is remote disaster recovery of the core control device.
  • This process is initiated by the access devices A1 and A2, not by the control device C2. That is, when the access devices A1 and A2 controlled by C1 actively detect that their master control device C1 is unavailable, they automatically turn to their backup control device C2, register there, and become current access devices of C2.
  • When one of C2's own access devices, such as A3, wants to access A1, C2 queries the current distribution data of access devices in the Shared Area, learns that A1 is now attached locally, and converts the remote access request into a local access request.
  • For access to A1 by access device A5 of a third-party control device (such as C3), under normal circumstances C3 sends the access request to A1's master control device C1, and C1 then sends the access request to A1. For A5's access to A1 to still succeed when C1 has failed and its access devices have been taken over by C2 (remote disaster recovery), two alternative routes must be configured in C3, pointing respectively to the two core control devices in the disaster recovery relationship. When C1 fails and becomes unreachable, C3 automatically sends the access request to C2 over the alternative route; because A1 has already failed over to C2, the access request is accepted.
  • The control device of the present invention supports remote disaster recovery switchover of a subset of access devices.
  • Taking access device A1 as an example: when A1 cannot register with its master control device C1, or detects through the protocol handshake mechanism that C1 is unavailable (because of network unreachability or similar causes), A1 registers with the backup control device C2, which is in a remote disaster recovery relationship with C1, even if C1 is in fact operating normally at that time. C2 accepts A1 as a current access device.
  • If A1 then wants to access access device A2, which is still controlled by C1, C2 queries the current distribution data of access devices in the Shared Area and converts the local access request into a remote (C1) access request.
  • Likewise, for A2's access to A1, C1 converts the local access request into a remote (C2) access request according to its current distribution data of access devices.
  • When one of C2's own access devices, such as A3, wants to access A1, C2 queries its current distribution data of access devices, learns that A1 is now attached locally, and converts the remote access request into a local access request.
  • For access to A1 by access device A5 of a third-party control device (such as C3), C1 is still operating normally, so C3 sends the access request to A1's master control device C1 according to the normal process. After receiving the request, C1 queries the current distribution data of access devices in the Shared Area, learns that A1 is currently attached to C2, and forwards the access request to C2.
  • After a remote disaster recovery switchover, the access device is registered on the backup control device (for example, A1 on C2). Until it detects that the current control device (C2) is unavailable, it no longer actively checks whether its master control device is available.
  • That is, whether or not C1 has recovered or A1's network environment has improved, A1 will not actively resume registration with C1.
  • A1 resumes registration with its master control device C1 only when triggered by the current control device C2, or when the current control device C2 becomes unavailable.
  • A disaster recovery strategy is configured in the core control device.
  • The strategy specifies the timing and the actions for actively requesting, through signaling, that access devices re-register with their master control device. At that time, if the connection between the control devices in the disaster recovery relationship is active, that is, from the control device's perspective the peer is considered to be operating normally, P_ queries the remote devices currently attached in the Shared Area according to the strategy. To avoid simultaneous registration of a large number of access devices, the signaling asking them to re-register with their master control device can be sent in batches according to the strategy.
  • Disaster recovery strategies with manual intervention can also be used.
  • The control device of the present invention adopts a technical solution in which local services and remote disaster recovery services coexist yet remain independent of each other.
  • The core control devices in a mutual remote disaster recovery relationship are normally all in service, so when one of them fails, the time needed to complete the remote disaster recovery switchover is in effect the time the access devices need to re-attach, which ensures that service is restored quickly. Because the decision on whether a disaster has occurred is delegated to the access devices rather than determined by mutual monitoring between the control devices, misjudgments caused by factors such as network instability are avoided, together with the simultaneous registration and deregistration of large numbers of access devices that such misjudgments would trigger.
  • Because the control device according to the present invention allows disaster recovery switchover of a subset of access devices, an access device that cannot reach its master control device because of certain faults in its network environment can choose to attach to its backup control device, which greatly improves the availability of the access devices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Telephonic Communication Services (AREA)

Description

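An apparatus for implementing soft-switch remote disaster recovery based on a packet network

Technical Field

The present invention belongs to the field of communications and relates to the improvement of communication switching equipment under the next-generation network architecture. Specifically, it is an apparatus for implementing soft-switch remote disaster recovery based on a packet network.

Background

In past communication switching equipment, switching control was mainly performed by hardware. In that case the control domain of each device is fixed, and the control device and the controlled devices are connected by cables. For such a system, disaster recovery at a remote or other location is almost impossible.
The next-generation network uses the softswitch as its control core and a packet-switched network as its transport network. Fixed cable connections are eliminated, which makes a control device that provides remote disaster recovery possible. However, the core control devices of current next-generation networks do not yet provide a remote disaster recovery function.

Summary of the Invention

The object of the present invention is to provide an apparatus for implementing soft-switch remote disaster recovery based on a packet network, so that the core control devices of the next-generation network have a remote disaster recovery function: when one core control device fails, a core control device at another location takes over the controlled devices under its control and continues to provide control services. The invention is implemented as follows: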
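An apparatus for implementing soft-switch remote disaster recovery based on a packet network comprises at least two core control devices located at different physical locations, each providing control services to its own access devices;
it is characterized in that each core control device further comprises, for remote disaster recovery: a processing unit, a database unit, a sharing unit, and a synchronization process unit;
the processing unit and the database unit are independent of the processor and database already present in the core control device and serve the access devices of the remote site, so that the core control devices at different physical locations are in a mutual remote disaster recovery relationship;
the sharing unit is used for sharing processing capability and data;
the synchronization process unit is used for completing data synchronization between core control devices in a mutual remote disaster recovery relationship.
The data of the database unit serving the remote access devices comes from the database of the remote core control device and is obtained through the synchronization process units in the two core control devices;
a change in the configuration data of any core control device triggers the synchronization process unit to synchronize the data to the other core control device in the disaster recovery relationship;
any core control device may also actively request the relevant configuration data from the other core control device in the disaster recovery relationship through the synchronization process unit.
In the sharing unit, the shared processing capability is mainly network processing capability, and the shared data includes the local basic environment parameters, the distribution table, and the current distribution of access devices; the distribution table is used to make routing decisions for requests entering the core control device, deciding whether they are handled by the existing processor or by the processing unit; the current distribution of access devices covers not only the locally mastered access devices but also the remotely controlled access devices. When an access device registers or unregisters on the local core control device, its current distribution must be recorded whether or not it is a locally mastered access device, and synchronized in real time through the synchronization process unit to the remote core control device in the disaster recovery relationship, to ensure that the access device can be accessed by other devices.
The data synchronization between the remote disaster recovery devices, carried out by the synchronization process unit, generally runs over a TCP connection that is maintained throughout the life cycle of the system, to ensure reliable and timely synchronization of data over the IP network.
When the remote disaster recovery relationship is first established, or when data is resynchronized, a large amount of configuration data must be synchronized. A data file can be generated locally, transferred to the remote site via FTP, and the data then extracted from the file at the remote site, to improve transfer efficiency and network utilization.
Routine maintenance and management of each core control device are performed independently, while additions, deletions, and changes to configuration data are synchronized to the core control device in the disaster recovery relationship, so that a remote disaster recovery switchover can be performed immediately when a core control device fails.
Failure of a core control device is sensed by the access devices it serves: an access device must be able to actively detect, through a protocol handshake mechanism, whether the core control device is available, and after sensing a failure it automatically switches to the pre-configured core control device in the disaster recovery relationship.
The core control device and the access devices it controls and serves are located in a packet-switched network;
the core control device supports remote disaster recovery switchover of a subset of access devices: while its disaster recovery peer is operating normally, it can take over some or all of the remote access devices.
Between core control devices in the disaster recovery relationship, the current registration distribution of access devices is synchronized in real time; based on the current distribution, a core control device decides whether access to a local access device is completed locally or forwarded to the remote site, and whether access to a remote access device is forwarded to the remote site or completed locally.
After the failed core control device resumes operation, the switchover of the local access devices registered on the remote core control device back to the local core control device is carried out by the remote core control device in the disaster recovery relationship according to a pre-established disaster recovery strategy.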
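In the next-generation network architecture, the control functions of the core control device are more centralized, which places higher demands on the stability and reliability of the equipment. Nevertheless, failures caused by factors outside the equipment cannot be avoided. In that situation, a core control device that can provide remote disaster recovery safeguards the uninterrupted operation of the system. In addition, because the core control device of the invention supports remote disaster recovery switchover of a subset of access devices, it provides high availability for the normal attachment of access devices in the packet-switched network.

Brief Description of the Drawings

FIG. 1 is a schematic structural diagram of a next-generation network core control device providing remote disaster recovery;
FIG. 2 is a schematic networking diagram of next-generation network core control devices providing remote disaster recovery.

Detailed Description

The implementation of the technical solution is described in further detail below with reference to the drawings:
In the next-generation network architecture of the invention, which uses the softswitch as the core control device, there are two or more core control devices at different physical locations. Under normal conditions all of these control devices are in service; each manages its own access devices and provides control services. When one core control device suffers a catastrophic failure and cannot continue to provide service, the remote core control device in the disaster recovery relationship can take over from the failed control device in the shortest possible time and continue to provide full service. As long as the access devices it controlled have not themselves failed, their operation is not affected.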
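In the control device of the invention, to provide remote disaster recovery, an independent processor and database space are set aside to serve the peer control device in the remote disaster recovery relationship, which minimizes the impact on local system services during remote disaster recovery. Routine maintenance and management of each core control device are performed independently, but additions, deletions, and changes to configuration data are synchronized to the device in the disaster recovery relationship, so that when a control device suffers a disaster, a remote disaster recovery switchover can be performed immediately to take over from the failed device.
Further, in the system of the invention, failure of a core control device is sensed by the access devices rather than by other control devices. That is, an access device must be able to actively check, through a protocol handshake mechanism, whether the control device is available, and after detecting a control device failure it automatically switches to the pre-configured backup control device.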
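The detection rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class `AccessDevice`, the `handshake` callback, and the `max_missed` threshold are hypothetical, since the source names neither a specific handshake protocol nor a failure threshold.

```python
from typing import Callable


class AccessDevice:
    """Hypothetical access device that tracks which control device it is registered with."""

    def __init__(self, name: str, master: str, backup: str,
                 handshake: Callable[[str], bool], max_missed: int = 3):
        self.name = name
        self.master = master            # pre-configured master control device
        self.backup = backup            # pre-configured backup control device
        self.handshake = handshake      # returns True if the controller answered
        self.max_missed = max_missed    # assumed threshold, not given in the source
        self.current = master
        self.missed = 0

    def poll(self) -> str:
        """Run one handshake round; switch controllers if the current one looks dead."""
        if self.handshake(self.current):
            self.missed = 0
        else:
            self.missed += 1
            if self.missed >= self.max_missed:
                # Only the current controller is probed; on failure, move to the other one.
                self.current = self.backup if self.current == self.master else self.master
                self.missed = 0
        return self.current
```

Note that the device only ever probes the controller it is currently registered with, which matches the later statement that a switched-over device does not probe its master on its own initiative.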
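Further, the control device of the invention and the access devices it controls are located in a packet-switched network, and the availability of the control device is sensed by the access device. Therefore, when an access device considers its master control device unavailable for some reason (such as network unreachability), it switches to the pre-configured backup control device even though the control device may still be operating normally. The control device of the invention supports remote disaster recovery switchover of a subset of access devices: while its disaster recovery peer is operating normally, it can take over some or all of the peer's access devices.
Further, because the control device of the invention supports remote disaster recovery switchover of a subset of access devices, the current registration distribution of access devices is synchronized in real time between the control devices in the disaster recovery relationship. Based on the current distribution, the control device decides whether access to a local access device is completed locally or forwarded to the remote site, and whether access to a remote access device is forwarded to the remote site or completed locally.
Further, in the remote disaster recovery architecture of the invention, after the failed control device resumes operation, the switchover of the access devices registered on the backup control device back to the master control device is carried out by the backup control device according to a pre-established disaster recovery strategy.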
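FIG. 1 is a schematic diagram of how the core control device of the invention provides the remote disaster recovery function. The core control devices C1 and C2 are in a mutual remote disaster recovery relationship. In the figure, distinct connector symbols indicate, respectively, data synchronization between the control devices in the remote disaster recovery relationship, processor access to the database, and data exchange between the processor and the shared area. To provide remote disaster recovery, an independent processing unit and database unit are set aside in the control device to serve the peer control device, which minimizes the impact on local system services during remote disaster recovery. P (processor) and DB (database) serve the locally mastered access devices, while P_ (processing unit) and DB_ (database unit) serve the access devices of the remote control device; the two pairs are independent of each other (for example, in control device C1, P1 and DB1 serve C1, while P2_ and DB2_ serve C2). The Shared Area in the device is the shared part, covering shared processing capability and shared data. Synchronization (the synchronization process) completes data synchronization between the control devices in the remote disaster recovery relationship.
The data in the database unit DB_, which serves the remote access devices, comes from the database DB of the remote control device: the data of DB2_ in C1 comes from DB2 in C2, and vice versa. This is achieved through the synchronization processes in C1 and C2. When the configuration data of any control device changes, the synchronization process is triggered to synchronize the data to the other control device in the disaster recovery relationship. Any control device can also actively request the relevant configuration data from the other control device in the disaster recovery relationship through the synchronization process.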
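The push-on-change and pull-on-request behaviour of the synchronization process can be illustrated with a small in-memory model. This is a sketch under assumptions: the class `ControlDeviceConfig`, its method names, and the direct method call standing in for the network transfer are all hypothetical and do not come from the source.

```python
from typing import Dict, Optional


class ControlDeviceConfig:
    """Hypothetical configuration store with a mirror (DB_) of the peer's data."""

    def __init__(self, name: str):
        self.name = name
        self.db: Dict[str, str] = {}        # DB: local configuration data
        self.db_peer: Dict[str, str] = {}   # DB_: copy of the peer's DB
        self.peer: Optional["ControlDeviceConfig"] = None

    def set_config(self, key: str, value: Optional[str]) -> None:
        """Add, change, or (value=None) delete a local entry, then push it to the peer."""
        if value is None:
            self.db.pop(key, None)
        else:
            self.db[key] = value
        if self.peer is not None:
            self.peer.apply_remote_change(key, value)

    def apply_remote_change(self, key: str, value: Optional[str]) -> None:
        """Apply one change pushed by the peer to the local mirror DB_."""
        if value is None:
            self.db_peer.pop(key, None)
        else:
            self.db_peer[key] = value

    def request_full_config(self) -> None:
        """Pull: actively request the peer's complete configuration data."""
        if self.peer is not None:
            self.db_peer = dict(self.peer.db)


def pair(a: ControlDeviceConfig, b: ControlDeviceConfig) -> None:
    """Put two control devices into a mutual disaster recovery relationship."""
    a.peer, b.peer = b, a
```

In this model a change made before the relationship exists is not pushed, so the pull path (`request_full_config`) is what brings the mirror up to date afterwards.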
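The processing capability shared in the Shared Area is mainly network processing capability, and the shared data includes the local basic environment parameters, the distribution table, and the current distribution of access devices. The distribution table makes routing decisions for requests entering the device, deciding whether they are handled by P or by P_. The current distribution of access devices covers not only local access devices but also remote access devices. When an access device registers or unregisters on this control device, its current distribution must be recorded whether or not it is a locally mastered access device, and synchronized in real time through the synchronization process to the remote control device in the disaster recovery relationship; only then can the access device be accessed by other devices. It is this distribution data that allows the control device of the invention to support remote disaster recovery switchover of a subset of access devices.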
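One way to picture the distribution table's routing decision is the sketch below. It assumes, purely for illustration, that the table maps each access device to its master control device and that requests are routed by that mapping; the source does not specify the table layout, and the function and constant names are hypothetical.

```python
from typing import Dict

LOCAL_PROCESSOR = "P"    # existing processor: serves locally mastered access devices
REMOTE_PROCESSOR = "P_"  # disaster recovery processing unit: serves the peer's access devices


def dispatch(local_name: str, masters: Dict[str, str], device_id: str) -> str:
    """Decide whether P or P_ handles a request from the given access device.

    masters is an assumed table layout: access device -> master control device.
    """
    master = masters.get(device_id)
    if master is None:
        raise KeyError(f"unknown access device: {device_id}")
    return LOCAL_PROCESSOR if master == local_name else REMOTE_PROCESSOR
```

For example, on C2 a request from A3 (mastered by C2) would go to P, while a request from A1 (mastered by C1 but currently attached to C2) would go to P_.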
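Synchronization completes data synchronization between the control devices in the remote disaster recovery relationship. To ensure reliable and timely synchronization of data over the IP network, data synchronization between the remote disaster recovery devices generally runs over a TCP connection that is maintained throughout the life cycle of the system. For the synchronization of a large amount of configuration data (for example, when the remote disaster recovery relationship is first established or when data is resynchronized), a data file can be generated locally, transferred to the remote site via FTP, and the data then extracted from the file at the remote site, which can greatly improve transfer efficiency and network utilization.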
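The two synchronization paths can be sketched as a choice of transport plus a file round trip. Everything named here is an assumption made for illustration: the `BULK_THRESHOLD` value, the JSON file format, and the function names are not from the source, and the FTP transfer itself is deliberately left out.

```python
import json
import os
import tempfile
from typing import Dict

BULK_THRESHOLD = 1000  # assumed cut-over point; the source gives no number


def choose_transport(num_records: int, initial_sync: bool) -> str:
    """Pick the persistent TCP connection for small updates, a data file for bulk sync."""
    if initial_sync or num_records >= BULK_THRESHOLD:
        return "file"
    return "tcp"


def export_data_file(records: Dict[str, str], path: str) -> None:
    """Generate the data file locally (JSON is an assumed format)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(records, fh)


def import_data_file(path: str) -> Dict[str, str]:
    """Extract the data from the file at the remote site after transfer."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)
```

A quick local round trip, with the transfer step omitted:

```python
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "config.json")
    export_data_file({"A1": "C1"}, path)
    restored = import_data_file(path)
```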
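The remote disaster recovery implementation is described in further detail below, with reference to FIG. 2 and FIG. 1, from three aspects: remote disaster recovery of the core control device, remote disaster recovery switchover of a subset of access devices, and the disaster recovery strategy. The networking diagram in FIG. 2 shows a typical next-generation network architecture. The core control devices C1, C2, and C3 are next-generation network core control devices that provide remote disaster recovery; they are located at the control layer of the network architecture, and C1 and C2 are in a mutual remote disaster recovery relationship. The access devices A1, A2, A3, A4, and A5 are controlled by the core control devices and are located at the access layer of the network architecture. The master control device of A1 and A2 is C1, the master control device of A3 and A4 is C2, and the master control device of A5 is C3.
1. Remote disaster recovery of the core control device
When control device C1 or C2 fails, for example C1, the other control device C2 takes over the access devices A1 and A2 that C1 controlled; this is remote disaster recovery of the core control device. The process is initiated by the access devices A1 and A2, not by the control device C2. That is, when the access devices A1 and A2 controlled by C1 actively detect that their master control device C1 is unavailable, they automatically turn to their backup control device C2, register there, and become current access devices of C2.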
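At this point, mutual access between A1 and A2 is similar to when they were controlled by C1.
When one of C2's own access devices, such as A3, wants to access A1, C2 queries the current distribution data of access devices in the Shared Area, learns that A1 is now attached locally, and converts the remote access request into a local access request.
For access to A1 by access device A5 of a third-party control device (such as C3), under normal circumstances its control device C3 sends the access request to A1's master control device C1, and C1 then sends the access request to A1. For A5's access to A1 to still succeed when C1 has failed and its access devices have been taken over by C2 (remote disaster recovery), two alternative routes must be configured in C3, pointing respectively to the two core control devices in the remote disaster recovery relationship. When C1 fails and becomes unreachable, C3 automatically sends the access request to C2 over the alternative route; because A1 has already failed over to C2, the access request is accepted.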
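The alternative-route behaviour of the third-party control device C3 can be illustrated with a short sketch. The function `pick_route`, the `reachable` callback, and the ordered route tuple are hypothetical; the source only states that two alternative routes are configured and that the second is used when C1 is unreachable.

```python
from typing import Callable, Sequence


def pick_route(routes: Sequence[str], reachable: Callable[[str], bool]) -> str:
    """Return the first reachable control device among the configured routes."""
    for controller in routes:
        if reachable(controller):
            return controller
    raise ConnectionError("no configured control device is reachable")


# Assumed configuration on C3 for requests toward A1: master C1 first, then its peer C2.
ROUTES_TO_A1 = ("C1", "C2")
```

With C1 reachable, `pick_route(ROUTES_TO_A1, ...)` selects C1; once C1 is unreachable it falls through to C2, which accepts the request because A1 has registered there.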
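2. Remote disaster recovery switchover of a subset of access devices
The control device of the invention supports remote disaster recovery switchover of a subset of access devices. Taking access device A1 as an example: when A1 cannot register with its master control device C1, or detects through the protocol handshake mechanism that C1 is unavailable (because of network unreachability or similar causes), A1 registers with the backup control device C2, which is in a remote disaster recovery relationship with C1, even if C1 is in fact operating normally at that time. Control device C2 accepts A1 as a current access device.
If A1 then wants to access access device A2, which is still controlled by C1, C2 queries the current distribution data of access devices in the Shared Area and converts the local access request into a remote (C1) access request. Likewise, for A2's access to A1, C1 converts the local access request into a remote (C2) access request according to its current distribution data of access devices.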
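The conversion between local and remote access requests amounts to a lookup in the current distribution data. The sketch below assumes a simple mapping from access device to the control device it is currently registered on; the function name and the return shape are hypothetical.

```python
from typing import Dict, Tuple


def resolve_access(local_name: str, distribution: Dict[str, str],
                   target_device: str) -> Tuple[str, str]:
    """Decide where an access request for target_device must be completed.

    distribution maps each access device to the control device it is
    currently registered on (an assumed layout of the shared data).
    Returns ("local", controller) or ("remote", controller).
    """
    current = distribution[target_device]
    if current == local_name:
        return ("local", local_name)
    return ("remote", current)


# Partial switchover from the text: A1 has moved to C2, A2 is still on C1.
DISTRIBUTION = {"A1": "C2", "A2": "C1", "A3": "C2"}
```

Here `resolve_access("C2", DISTRIBUTION, "A2")` yields a remote request to C1, while `resolve_access("C2", DISTRIBUTION, "A1")` is completed locally on C2. Because both peers hold the same synchronized distribution, C1 reaches the mirror-image decision for A1.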
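When one of C2's own access devices, such as A3, wants to access A1, C2 queries its current distribution data of access devices, learns that A1 is now attached locally, and converts the remote access request into a local access request.
For access to A1 by access device A5 of a third-party control device (such as C3), C1 is still operating normally, so C3 sends the access request to A1's master control device C1 according to the normal process. After receiving the request, C1 queries the current distribution data of access devices in the Shared Area, learns that A1 is currently attached to C2, and forwards the access request to C2.
3. Disaster recovery strategy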
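After a remote disaster recovery switchover, the access device is registered on the backup control device (for example, A1 on C2). Until it detects that the current control device (C2) is unavailable, it no longer actively checks whether its master control device is available. That is, whether or not C1 has recovered or A1's network environment has improved, A1 will not actively resume registration with C1. A1 resumes registration with its master control device C1 only when triggered by the current control device C2, or when the current control device C2 becomes unavailable.
A disaster recovery strategy is configured in the core control device. The strategy specifies the timing and the actions for actively requesting, through signaling, that access devices re-register with their master control device. At that time, if the connection between the control devices in the disaster recovery relationship is active, that is, from the control device's perspective the peer is considered to be operating normally, P_ queries the remote devices currently attached in the Shared Area according to the strategy. To avoid simultaneous registration of a large number of access devices, the signaling asking them to re-register with their master control device can be sent in batches according to the strategy. A disaster recovery strategy with manual intervention may also be used.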
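The batched recovery step can be sketched as below. This is an illustration under assumptions: the function `recovery_batches`, the two mapping arguments, and the fixed `batch_size` are hypothetical, and the source leaves the actual timing and batching policy to the configured strategy.

```python
from typing import Dict, List


def recovery_batches(current: Dict[str, str], masters: Dict[str, str],
                     local_name: str, peer_alive: bool,
                     batch_size: int) -> List[List[str]]:
    """Group the foreign access devices attached here into re-registration batches.

    current: access device -> control device it is registered on now.
    masters: access device -> its master control device.
    Returns no batches while the link to the peer is not active.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if not peer_alive:
        return []
    foreign = sorted(dev for dev, ctrl in current.items()
                     if ctrl == local_name and masters.get(dev) != local_name)
    return [foreign[i:i + batch_size] for i in range(0, len(foreign), batch_size)]
```

Each returned batch would be signalled in turn, with a pause between batches chosen by the strategy, so that the recovered master is not hit by all registrations at once.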
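The control device of the invention adopts a technical solution in which local services and remote disaster recovery services coexist yet remain independent of each other. The core control devices in a mutual remote disaster recovery relationship are normally all in service, so when one of them fails, the time needed to complete the remote disaster recovery switchover is in effect the time the access devices need to re-attach, which ensures that service is restored quickly. Because the decision on whether a disaster has occurred is delegated to the access devices rather than determined by mutual monitoring between the control devices, misjudgments caused by factors such as network instability are avoided, together with the simultaneous registration and deregistration of large numbers of access devices that such misjudgments would trigger. In addition, because the control device of the invention allows disaster recovery switchover of a subset of access devices, an access device that cannot reach its master control device because of certain faults in its network environment can choose to attach to its backup control device, which greatly improves the availability of the access devices.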

Claims

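1. An apparatus for implementing soft-switch remote disaster recovery based on a packet network, comprising at least two core control devices located at different physical locations, each providing control services to its own access devices;
characterized in that each core control device further comprises, for remote disaster recovery: a processing unit, a database unit, a sharing unit, and a synchronization process unit;
the processing unit and the database unit are independent of the processor and database already present in the core control device and serve the access devices of the remote site, so that the core control devices at different physical locations are in a mutual remote disaster recovery relationship;
the sharing unit is used for sharing processing capability and data;
the synchronization process unit is used for completing data synchronization between core control devices in a mutual remote disaster recovery relationship.
2. The apparatus for implementing soft-switch remote disaster recovery based on a packet network according to claim 1, characterized in that:
the data of the database unit serving the remote access devices comes from the database of the remote core control device and is obtained through the synchronization process units in the core control devices in the disaster recovery relationship;
a change in the configuration data of any core control device triggers the synchronization process unit to synchronize the data to the other core control device in the disaster recovery relationship;
any core control device may also actively request the relevant configuration data from the other core control device in the disaster recovery relationship through the synchronization process unit.
3. The apparatus for implementing soft-switch remote disaster recovery based on a packet network according to claim 1, characterized in that:
in the sharing unit, the shared processing capability is mainly network processing capability, and the shared data includes the local basic environment parameters, the distribution table, and the current distribution of access devices;
the distribution table is used to make routing decisions for requests entering the core control device, deciding whether they are handled by the existing processor or by the processing unit;
the current distribution of access devices covers not only the locally mastered access devices but also the remotely controlled access devices; when an access device registers or unregisters on the local core control device, its current distribution must be recorded whether or not it is a locally mastered access device, and synchronized in real time through the synchronization process unit to the remote core control device in the disaster recovery relationship, to ensure that the access device can be accessed by other devices.
4. The apparatus for implementing soft-switch remote disaster recovery based on a packet network according to claim 1, characterized in that:
the data synchronization between the remote disaster recovery devices, carried out by the synchronization process unit, generally runs over a TCP connection that is maintained throughout the life cycle of the system, to ensure reliable and timely synchronization of data over the IP network.
5. The apparatus for implementing soft-switch remote disaster recovery based on a packet network according to claim 4, characterized in that:
when the remote disaster recovery relationship is first established, or when data is resynchronized, a large amount of configuration data must be synchronized; a data file can be generated locally, transferred to the remote site via FTP, and the data then extracted from the file at the remote site, to improve transfer efficiency and network utilization.
6. The apparatus for implementing soft-switch remote disaster recovery based on a packet network according to claim 1, characterized in that:
routine maintenance and management of each core control device are performed independently, while additions, deletions, and changes to configuration data are synchronized to the core control device in the disaster recovery relationship, so that a remote disaster recovery switchover can be performed immediately when a core control device fails.
7. The apparatus for implementing soft-switch remote disaster recovery based on a packet network according to claim 1, characterized in that:
failure of a core control device is sensed by the access devices it serves, that is, an access device must be able to actively detect, through a protocol handshake mechanism, whether the core control device is available, and after sensing a failure of the core control device it automatically switches to the pre-configured core control device in the disaster recovery relationship.
8. The apparatus for implementing soft-switch remote disaster recovery based on a packet network according to claim 1, characterized in that:
the core control device and the access devices it controls and serves are located in a packet-switched network;
the core control device supports remote disaster recovery switchover of a subset of access devices, that is, while its disaster recovery peer is operating normally, it can take over some or all of the remote access devices.
9. The apparatus for implementing soft-switch remote disaster recovery based on a packet network according to claim 1, characterized in that:
between core control devices in the disaster recovery relationship, the current registration distribution of access devices is synchronized in real time; based on the current distribution of access devices, a core control device decides whether access to a local access device is completed locally or forwarded to the remote site, and whether access to a remote access device is forwarded to the remote site or completed locally.
10. The apparatus for implementing soft-switch remote disaster recovery based on a packet network according to claim 1, characterized in that:
after the failed core control device resumes operation, the switchover of the local access devices registered on the remote core control device in the disaster recovery relationship back to the local, recovered core control device is carried out by the remote core control device in the disaster recovery relationship according to a pre-established disaster recovery strategy.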
PCT/CN2003/001041 2003-12-05 2003-12-05 Apparatus for realizing soft-switch allopatric disaster recovery based on packet network WO2005055517A1

Priority Applications (5)

Application Number | Priority Date | Filing Date | Title
PCT/CN2003/001041 (WO2005055517A1) | 2003-12-05 | 2003-12-05 | Apparatus for realizing soft-switch allopatric disaster recovery based on packet network
AU2003292846A (AU2003292846A1) | 2003-12-05 | 2003-12-05 | A apparatus for realizing softswitch allopatric disaster recovery based on packet network
CN2003801103950A (CN100407620C) | 2003-12-05 | 2003-12-05 | An apparatus for implementing soft-switch remote disaster recovery based on a packet network
EP03782048.7A (EP1705829B2) | 2003-12-05 | 2003-12-05 | System for realizing softswitch allopatric disaster recovery based on packet network
US10/581,387 (US7675850B2) | 2003-12-05 | 2003-12-05 | Apparatus for realizing soft-switch allopatric disaster recovery based on packet network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2003/001041 WO2005055517A1 (fr) 2003-12-05 2003-12-05 Appareil de realisation de la recuperation d'un systeme allopatrique de commutateur logiciel utilisant un reseau a paquets

Publications (1)

Publication Number Publication Date
WO2005055517A1 true WO2005055517A1 (fr) 2005-06-16

Family

ID=34638029

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2003/001041 WO2005055517A1 (fr) 2003-12-05 2003-12-05 Appareil de realisation de la recuperation d'un systeme allopatrique de commutateur logiciel utilisant un reseau a paquets

Country Status (5)

Country Link
US (1) US7675850B2 (zh)
EP (1) EP1705829B2 (zh)
CN (1) CN100407620C (zh)
AU (1) AU2003292846A1 (zh)
WO (1) WO2005055517A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9047248B2 (en) 2013-01-29 2015-06-02 Sungard Availability Services, Lp Logical domain recovery
CN106375102B (zh) * 2015-07-22 2019-08-27 华为技术有限公司 一种服务注册方法、使用方法及相关装置

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0874314A1 (fr) * 1997-04-21 1998-10-28 Alcatel Système à stations réceptrices de données installées en réseau

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6470398B1 (en) 1996-08-21 2002-10-22 Compaq Computer Corporation Method and apparatus for supporting a select () system call and interprocess communication in a fault-tolerant, scalable distributed computer environment
US6467006B1 (en) * 1999-07-09 2002-10-15 Pmc-Sierra, Inc. Topology-independent priority arbitration for stackable frame switches
US6430577B1 (en) * 1999-10-08 2002-08-06 Unisys Corporation System and method for asynchronously receiving multiple packets of audit data from a source databased host in a resynchronization mode and asynchronously writing the data to a target host
US6970448B1 (en) * 2000-06-21 2005-11-29 Pulse-Link, Inc. Wireless TDMA system and method for network communications
US6615322B2 (en) 2001-06-21 2003-09-02 International Business Machines Corporation Two-stage request protocol for accessing remote memory data in a NUMA data processing system
US7023794B2 (en) 2002-02-11 2006-04-04 Net2Phone, Inc. Method and architecture for redundant SS7 deployment in a voice over IP environment
US7113938B2 (en) * 2002-02-14 2006-09-26 Gravic, Inc. Method of increasing system availability by splitting a system
CN100336309C (zh) * 2002-05-02 2007-09-05 中兴通讯股份有限公司 一种在主备模块上进行协议呼叫数据处理的方法
US7468984B1 (en) * 2004-12-29 2008-12-23 At&T Corp. Method and apparatus for providing disaster recovery using network peering arrangements

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
C. GROVES: "Gateway Control Protocol Version 1", IETF STANDARD, INTERNET ENGINEERING TASK FORCE, 1 June 2003 (2003-06-01)
CISCO MGC NODE MANAGER PROVISIONING TOOL USER'S GUIDE, vol. 2.4, no. 1, May 2003 (2003-05-01), Retrieved from the Internet <URL:http://www.cisco.com/en/US/docs/net_mgmt/vspt/2.4/vs241.pdf>
LIN L. ET AL: "Application of Data Remote Back-up in Electronic Business Security", COMPUTER APPLICATION, vol. 22, no. 7, July 2002 (2002-07-01), pages 83 - 85, XP008100723 *
See also references of EP1705829A4
YINGFENG L. ET AL: "Disaster Recovery:Concept and Application", COMPUTER APPLICATION RESEARCH, no. 6, 2002, pages 7 - 10, XP008100651 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007048319A1 (fr) * 2005-10-26 2007-05-03 Huawei Technologies Co., Ltd. Systeme et procede de recuperation sur sinistre de dispositif de commande de service dans un reseau intelligent
CN101160794B (zh) * 2005-10-26 2010-05-19 华为技术有限公司 一种智能网业务控制设备容灾系统和方法
CN102355595A (zh) * 2011-09-23 2012-02-15 中兴通讯股份有限公司 一种网络电视多平台容灾方法及系统
WO2017198144A1 (zh) * 2016-05-20 2017-11-23 中兴通讯股份有限公司 一种iptv系统容灾方法及iptv容灾系统
CN111988808A (zh) * 2019-05-22 2020-11-24 普天信息技术有限公司 核心网容灾备份方法和装置

Also Published As

Publication number Publication date
EP1705829B1 (en) 2013-09-25
US7675850B2 (en) 2010-03-09
EP1705829A4 (en) 2009-07-22
CN1802814A (zh) 2006-07-12
US20090109839A1 (en) 2009-04-30
AU2003292846A1 (en) 2005-06-24
EP1705829A1 (en) 2006-09-27
EP1705829B2 (en) 2016-09-28
CN100407620C (zh) 2008-07-30

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 200380110395.0

Country of ref document: CN

AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

WWE Wipo information: entry into national phase

Ref document number: 10581387

Country of ref document: US

WWE Wipo information: entry into national phase

Ref document number: 2003782048

Country of ref document: EP

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWP Wipo information: published in national office

Ref document number: 2003782048

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: JP