CN100452797C - High-available distributed boundary gateway protocol system based on cluster router structure - Google Patents

High-available distributed boundary gateway protocol system based on cluster router structure

Publication number
CN100452797C
Authority
CN
China
Prior art keywords
node
message
backup
routing
peer
Prior art date
Application number
CN 200510012192
Other languages
Chinese (zh)
Other versions
CN1719831A (en)
Inventor
吴建平
勇 崔
张智泉
恪 徐
Original Assignee
清华大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 清华大学 filed Critical 清华大学
Priority to CN 200510012192 priority Critical patent/CN100452797C/en
Publication of CN1719831A publication Critical patent/CN1719831A/en
Application granted granted Critical
Publication of CN100452797C publication Critical patent/CN100452797C/en

Abstract

A highly available distributed Border Gateway Protocol (BGP) system based on a cluster router structure belongs to the field of routing protocol system architecture. It is characterized in that, within the cluster router structure, one node is selected as the master node, another node as the backup node of the master node, one node as the connection node, and at least one node as a slave node. Backing up the master node leaves the system without a single point of failure and improves its reliability. The partitioning algorithm distributes work so that the load on the slave nodes is balanced, which improves the performance of the overall BGP system. Together these achieve fast processing of BGP messages and reliable BGP service.

Description

Highly available distributed Border Gateway Protocol system based on a cluster router structure

Technical field

The highly available distributed Border Gateway Protocol system based on a cluster router structure belongs to the field of routing protocol system architecture, and in particular relates to dual-node backup technology and multi-node distributed computing systems.

Background

The rapid growth of the Internet places ever higher demands on the computing power, forwarding capacity, and port density of network equipment. A single routing node faces hard limits in reliability and in the scalability of performance, scale, and services, and can no longer meet the needs of the next-generation Internet. Core router technology is undergoing major change: as exemplified by terabit-class core routers, router architecture is moving toward clustered, distributed, and scalable designs.

Router hardware architecture has evolved from centralized control to distributed parallel processing within a cluster structure, while router software has lagged behind. In a traditional router, all computation related to routing protocols and routing policy still runs on a single node; the other nodes serve only as backups, so the software system is neither truly scalable nor highly available.

The Border Gateway Protocol (BGP) is the inter-domain routing protocol of the Internet and is responsible for exchanging route reachability information between autonomous systems. BGP peers establish connections with one another and announce changes in routing information by sending route update (UPDATE) messages. Each BGP entity computes the priority of routing information according to its own policy and selects the best route.

The BGP performance of the control plane of Internet core routers faces new challenges. The size of the BGP routing table at Internet backbone nodes currently shows alternating phases of linear and exponential growth. With a large routing table, a router consumes more storage, route update processing slows down, and the computational overhead of BGP increases. A traditional single-process, centrally controlled BGP implementation cannot meet the future Internet's requirements for reliability, routing table capacity, route computation capacity, or the number of neighbors supported.

The present invention makes full use of the distributed computing and storage resources offered by a cluster-structured router hardware platform. It designs a partitioning algorithm that distributes the BGP implementation across the nodes, which run in parallel, so that the computing load and memory consumption of the nodes are balanced and the overall efficiency of the BGP system improves. In addition, redundant backup is provided for the single points of failure that may exist in the system, improving overall system reliability.

Summary of the invention

The object of the present invention is to overcome the limited computing capacity, storage capacity, and reliability of traditional single-node BGP implementations by providing a highly available distributed BGP implementation based on a cluster router structure.

The technical solution adopted by the present invention is as follows. As shown in Figure 1, within the cluster structure one node is the connection node, one node is the master node, another node is the backup node of the master node, and the remaining nodes are slave nodes. The connection node handles the connection to the external Internet and forwards data between the external Internet and the internal nodes. The master node manages the slave nodes and establishes connections with peers; according to the partitioning algorithm, it assigns the peers' route update (UPDATE) messages to slave nodes for processing. A slave node parses the UPDATE messages and then computes routes.

The highly available distributed BGP system based on a cluster router structure consists of two parts: the master node subsystem and the slave node subsystem. The master node subsystem runs on the master node; it establishes connections with peers, manages the slave nodes, distributes load, and sends important information to the backup node. The slave node subsystem runs on the slave nodes; it parses UPDATE messages and performs route computation.

Centralized control by the master node makes the distributed BGP system easy to manage. Backing up the master node leaves the system without a single point of failure and improves its reliability. The partitioning algorithm distributes work so that the load on the slave nodes is balanced, improving the performance of the overall BGP system.

The invention is characterized in that, within the cluster router structure, one node is selected as the master node and another node as the backup node of the master node, which together form the master node subsystem; one node is the connection node; and the other nodes are slave nodes, which form the slave node subsystem. The master node, the slave nodes, and the connection node are joined by a high-speed switching network to form the highly available distributed Border Gateway Protocol system based on a cluster router structure. The system establishes connections with peers through the connection node over the Transmission Control Protocol (TCP); a peer is a Border Gateway Protocol system that exchanges protocol information with the system. In this system:

A. The master node subsystem runs on the master node and is responsible for the following tasks: establishing connections with peers; according to the partitioning algorithm, sending route update messages received from peers (hereafter "UPDATE messages") to the corresponding slave node for processing; receiving the local optimal routes that each slave node produces from UPDATE message processing and selecting the global optimal route from among them; advertising UPDATE messages to peers; managing the slave nodes; and sending important messages to the backup node.

The following databases are maintained on the master node:

Global optimal routing information base: stores the router's global optimal routing information obtained by route computation.

Slave node database: stores the IDs of the slave nodes working in the distributed BGP system (where "BGP system" means Border Gateway Protocol system), the workload of each slave node, and a backup of the communication operations between the master node and the slave nodes.

Output routing information base: stores the route update information sent to peers.

The following software modules are configured on the master node:

(1) Distributed partitioning algorithm module

When the BGP system establishes a connection with a new peer, the master node selects the least-loaded slave node to process that peer's UPDATE messages.
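The selection rule can be sketched in a few lines. This is a minimal illustration, assuming that load is measured as the number of peers assigned to each slave node; the function names and the tie-break by smallest node ID are hypothetical and are not specified in the text.

```python
def pick_least_loaded(slave_load):
    """Return the ID of the slave node that currently handles the fewest peers.

    slave_load maps slave node ID -> number of peers assigned to that node.
    Ties go to the smaller node ID (an assumption made for determinism).
    """
    if not slave_load:
        raise ValueError("no slave nodes available")
    return min(slave_load, key=lambda node_id: (slave_load[node_id], node_id))


def assign_peer(slave_load, assignment, peer):
    """Assign a newly connected peer to the least-loaded slave node."""
    node_id = pick_least_loaded(slave_load)
    assignment[peer] = node_id
    slave_load[node_id] += 1
    return node_id


# Three idle slave nodes, four peers connecting one after another.
slave_load = {1: 0, 2: 0, 3: 0}
assignment = {}
for peer in ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]:
    assign_peer(slave_load, assignment, peer)
```

Counting peers is the simplest load measure; a real system might instead weight each peer by the volume of UPDATE traffic it generates.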

(2) Slave node management module. This module comprises the following submodules:

(2.1) Slave node join submodule. An administrator configures the new node's ID and the master node's ID. When the new node joins the cluster (that is, the cluster router structure), it immediately sends a message to notify the master node. The master node replies to this message to confirm that the new node has joined and adds the new node's information to the slave node information base.

(2.2) Slave node exit submodule. The master node deletes the exiting slave node's information from the slave node information base and, according to the partitioning algorithm, reassigns the peers handled by that slave node to other slave nodes.

(2.3) Slave node status monitoring submodule

The master node periodically sends a query message to every slave node. A slave node that receives the query replies to the master node; a slave node that does not reply is considered to have failed.

(2.4) Slave node failure handling submodule

When status monitoring reveals that a slave node has failed, the master node deletes that slave node's information from the slave node information base and, according to the partitioning algorithm, reassigns the peers handled by that slave node to other slave nodes.
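The four submodules can be sketched together as one small registry. This is an illustrative sketch only: the class `SlaveRegistry`, its method names, and the representation of a monitoring round as "the set of slave IDs that replied" are assumptions, not part of the described system.

```python
class SlaveRegistry:
    """Toy slave node information base kept by the master node."""

    def __init__(self):
        self.peers_by_slave = {}  # slave ID -> set of peers it handles

    def join(self, slave_id):
        """(2.1) Record a newly joined slave node."""
        self.peers_by_slave.setdefault(slave_id, set())

    def assign(self, peer):
        """Give a peer to the least-loaded slave (ties: smallest ID, an assumption)."""
        if not self.peers_by_slave:
            raise RuntimeError("no slave nodes available")
        slave_id = min(self.peers_by_slave,
                       key=lambda s: (len(self.peers_by_slave[s]), s))
        self.peers_by_slave[slave_id].add(peer)
        return slave_id

    def remove(self, slave_ids):
        """(2.2)/(2.4) Delete exiting or failed slaves and reassign their peers."""
        orphaned = set()
        for slave_id in slave_ids:
            orphaned |= self.peers_by_slave.pop(slave_id, set())
        return {peer: self.assign(peer) for peer in sorted(orphaned)}

    def poll(self, replied):
        """(2.3) One monitoring round: slaves that did not reply count as failed."""
        failed = [s for s in sorted(self.peers_by_slave) if s not in replied]
        return failed, self.remove(failed)


registry = SlaveRegistry()
for slave_id in (1, 2, 3):
    registry.join(slave_id)
for peer in ["A", "B", "C", "D", "E", "F"]:
    registry.assign(peer)

# Slave 2 misses the query; its peers B and E move to slaves 1 and 3.
failed, moved = registry.poll(replied={1, 3})
```

Removing all failed nodes before reassigning avoids moving a peer onto a node that is about to be removed in the same round.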

(3) Peer connection establishment module

This module connects to a peer by performing the following steps in order:

Step 3-1: Start the connection with the peer.

Step 3-2: Start the TCP connection.

Step 3-3: Establish the BGP connection, as follows:

Step 3-3-1: Send the peer an OPEN message, the message used to open a BGP peer connection.

Step 3-3-2: After receiving the peer's OPEN message, reply with a KEEPALIVE message, the notification that keeps a BGP connection alive, then wait for the peer's KEEPALIVE message; set the connection state to OpenConfirm.

Step 3-3-3: On receiving the peer's KEEPALIVE message, the connection with the peer is complete; set the connection state to Established.

Step 3-4: The master node uses the partitioning algorithm to select the least-loaded slave node, which then processes the peer's UPDATE messages.
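The handshake in steps 3-1 to 3-3 amounts to a small state machine. The sketch below models only the successful path and raises an error on anything unexpected; the class and method names are hypothetical, messages are represented as plain strings rather than encoded BGP packets, and the OpenSent state name is taken from the standard BGP state list rather than from these steps.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    OPEN_SENT = "OpenSent"
    OPEN_CONFIRM = "OpenConfirm"
    ESTABLISHED = "Established"


class PeerSession:
    """Tracks the connection state for one peer during the handshake."""

    def __init__(self):
        self.state = State.IDLE
        self.sent = []  # messages that would be written to the TCP socket

    def start(self):
        # Steps 3-1 and 3-2: begin the peer connection and the TCP connection.
        self.state = State.CONNECT

    def tcp_connected(self):
        # Step 3-3-1: TCP is up, send OPEN.
        self.sent.append("OPEN")
        self.state = State.OPEN_SENT

    def receive(self, msg_type):
        if self.state is State.OPEN_SENT and msg_type == "OPEN":
            # Step 3-3-2: answer the peer's OPEN with KEEPALIVE.
            self.sent.append("KEEPALIVE")
            self.state = State.OPEN_CONFIRM
        elif self.state is State.OPEN_CONFIRM and msg_type == "KEEPALIVE":
            # Step 3-3-3: the peer confirmed, the session is up.
            self.state = State.ESTABLISHED
        else:
            raise ValueError(f"unexpected {msg_type} in state {self.state.value}")


session = PeerSession()
session.start()
session.tcp_connected()
session.receive("OPEN")
session.receive("KEEPALIVE")
```

Once the session reaches Established, step 3-4 would hand the peer to a slave node.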

(4) BGP message processing module. This module processes messages by the following steps:

Step 4-1: The master node calls the TCP socket read function to obtain a BGP message.

Step 4-2: The master node processes the message according to its type:

Step 4-2-1: Processing an OPEN message

Read the values of four fields from the OPEN message (version number, autonomous system number, hold time, and BGP identifier) and check each of them.

Use the autonomous system number and BGP identifier to determine whether the OPEN message comes from a neighbor configured by the administrator. If not, send a fault message (NOTIFICATION) and close the connection with the peer; if so, perform the next check.

Perform connection collision detection as defined by the BGP protocol. If there is a collision and this connection must be closed, send a NOTIFICATION message to close the connection with the peer; if there is no collision, perform the next check.

Check whether the version number is correct. If it is not, send a NOTIFICATION message to the peer to close the connection; if it is, perform the next check.

Check that the hold time is either zero or at least 3 seconds. If it is not, send a NOTIFICATION message to close the connection with the peer; otherwise, continue.

Compare the hold time configured on the local router's BGP entity with the hold time in the received OPEN message and use the smaller value as the hold time for this connection. Set the KEEPALIVE timer to one third of the connection's hold time.

Send a KEEPALIVE message to the peer to acknowledge the OPEN message, and set the connection state to OpenConfirm.
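The OPEN checks and the hold time negotiation can be sketched as a single function. This is a simplified illustration: the OPEN message is assumed to be an already-decoded dictionary, the function name and return shape are hypothetical, version 4 is assumed as the supported version, and connection collision detection is left out.

```python
def check_open(open_msg, configured_peers, local_hold_time, supported_version=4):
    """Validate a decoded OPEN message.

    Returns ("ok", hold_time, keepalive_interval) on success, or
    ("notify", reason) when a NOTIFICATION should be sent and the
    connection closed. Collision detection is omitted from this sketch.
    """
    # Is this a neighbor the administrator configured?
    if (open_msg["asn"], open_msg["bgp_id"]) not in configured_peers:
        return ("notify", "unknown peer")
    if open_msg["version"] != supported_version:
        return ("notify", "unsupported version")
    # Acceptable hold times are 0 (no keepalives) or at least 3 seconds.
    hold = open_msg["hold_time"]
    if hold != 0 and hold < 3:
        return ("notify", "unacceptable hold time")
    # Use the smaller hold time; the KEEPALIVE timer is one third of it.
    negotiated = min(local_hold_time, hold)
    return ("ok", negotiated, negotiated / 3)


configured_peers = {(65001, "192.0.2.1")}
open_msg = {"version": 4, "asn": 65001, "bgp_id": "192.0.2.1", "hold_time": 90}
result = check_open(open_msg, configured_peers, local_hold_time=180)
rejected = check_open({**open_msg, "hold_time": 2}, configured_peers, 180)
```

With a local hold time of 180 seconds and a peer offering 90 seconds, the session uses 90 seconds and sends a KEEPALIVE every 30 seconds.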

Step 4-2-2: Processing a KEEPALIVE message

If the connection state is OpenConfirm, the master node changes it to Established and sends a KEEPALIVE message to the peer.

If the connection state is Established, increment the count of received KEEPALIVE messages and reset the hold timer.

Step 4-2-3: Processing a route update message received from a peer

On receiving an UPDATE message, the master node forwards it to the corresponding slave node, which performs the following checks.

Check the total path attribute length. If it exceeds the permitted length, notify the peer with a NOTIFICATION message and discard the UPDATE message.

If the UPDATE message contains withdrawn routes, check that their length is correct. If it exceeds the permitted value, send a NOTIFICATION message to the peer and discard the UPDATE message. Otherwise, check the syntax of the withdrawn routes: if there is an error, discard the UPDATE message; if they are correct, store the withdrawn route values in a variable.

If the UPDATE message contains feasible routes, check their length. If it exceeds the permitted value, send a NOTIFICATION message to the peer and discard the UPDATE message. Otherwise, check each field of the routes' path attributes: if there is an error, discard the UPDATE message; if they are correct, store the value of each path attribute field in a structure variable.

For a withdrawn route, delete the route from the input routing information base and start the distributed BGP route computation.

For a feasible route, update the input routing information base, save the path attributes, and start the distributed BGP route computation.
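The last two actions, applying withdrawn and feasible routes to the input routing information base, can be sketched as follows. The sketch assumes the UPDATE message has already passed the length and syntax checks and has been decoded into prefix lists; the dictionary layout keyed by (peer, prefix) and the function name are hypothetical.

```python
def apply_update(adj_rib_in, peer, withdrawn, announced, attrs):
    """Apply one validated UPDATE message to the input routing information base.

    adj_rib_in maps (peer, prefix) -> path attributes.
    Returns the set of prefixes whose entries changed, for which route
    computation should then be started.
    """
    changed = set()
    for prefix in withdrawn:
        # A withdrawn route is deleted from the input RIB.
        if adj_rib_in.pop((peer, prefix), None) is not None:
            changed.add(prefix)
    for prefix in announced:
        # A feasible route is stored together with its path attributes.
        adj_rib_in[(peer, prefix)] = attrs
        changed.add(prefix)
    return changed


adj_rib_in = {("peerA", "198.51.100.0/24"): {"as_path": [65001]}}
changed = apply_update(
    adj_rib_in,
    "peerA",
    withdrawn=["198.51.100.0/24"],
    announced=["203.0.113.0/24"],
    attrs={"as_path": [65001, 65002]},
)
```

Returning the changed prefixes lets the caller trigger priority calculation only for destinations that were actually affected.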

Step 4-2-4: Processing a NOTIFICATION message

The master node reads the value of each field in the NOTIFICATION message, displays the error information, and disconnects from the faulty peer. It then instructs the slave node that processes this peer's UPDATE messages to delete all related information, including the routes and route attributes advertised by the faulty peer.

(5) Dual-node redundant backup module

The master node and the backup node form a dual-node backup hardware environment. The hardware provides no mechanism for one node to detect hardware or software failure of the other, so the two nodes monitor each other's status with a heartbeat algorithm. Both nodes run the master node subsystem. While the master node is working normally, the backup node only receives backup messages from the master node and stores the backup data they carry in the corresponding databases. When the master node fails, the backup node takes over its work.

Failover is implemented by backing up state at checkpoints and then recovering by state rollback. The module works in the following steps:

Step 1. Status detection for dual-node backup

The master node periodically sends a query message to the backup node, and the backup node replies. If the master node does not receive the backup node's reply, it considers the backup node to have failed and stops sending it backup messages. If the backup node does not receive the master node's query message, it considers the master node to have failed; the backup node then performs state rollback recovery and takes over the master node's work.

Step 2. State backup

In the master node module, the state information to be backed up falls into two categories. The first is communication-related state, which covers the communication between the master node and the slave nodes. The second is application-related state data, which includes configuration parameters such as the IP addresses and autonomous system numbers (ASNs) of the BGP peers connected to this BGP instance, the global optimal routes of this cluster router, the output routes of this BGP instance, and the slave nodes in this cluster router.

For communication-related state, any single operation may change the state of a slave node, so the backup must be fine-grained: state is backed up after every communication between the master node and a slave node. Whenever the master node communicates with a slave node, it also backs up the communication read or write operation to the backup node. A backed-up operation includes the data read or written, the data length, and the result returned by the operation.

Application-related state data is large and is backed up at a coarser granularity: the master node sends it to the backup node at regular intervals.

Step 3. State rollback recovery

When the master node fails, the backup node takes over its work. The application-related state data is already stored in the backup node's databases, so the master node subsystem on the backup node can start directly from that state. It then repeats the communication read and write operations; these are not performed as real I/O, however, but return the corresponding data and results from the backed-up operations.
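The record-and-replay idea behind the fine-grained backup can be sketched with two interchangeable channel objects. This is a toy illustration under stated assumptions: `LiveChannel`, `ReplayChannel`, and `master_logic` are hypothetical names, the "network" is an in-memory list, and the backup log is a Python list standing in for data held on the backup node.

```python
class LiveChannel:
    """Stand-in for the real master/slave channel; mirrors each operation to the backup log."""

    def __init__(self, incoming, backup_log):
        self.incoming = list(incoming)  # messages the slave nodes will deliver
        self.backup_log = backup_log    # held by the backup node in a real system

    def read(self):
        data = self.incoming.pop(0)
        # Log entry: (operation, data, data length, result of the operation).
        self.backup_log.append(("read", data, len(data), len(data)))
        return data

    def write(self, data):
        result = len(data)  # pretend every byte was sent
        self.backup_log.append(("write", data, len(data), result))
        return result


class ReplayChannel:
    """Used after takeover: serves reads and writes from the log, with no real I/O."""

    def __init__(self, backup_log):
        self.entries = list(backup_log)
        self.position = 0

    def _next(self, expected_kind):
        kind, data, _length, result = self.entries[self.position]
        if kind != expected_kind:
            raise RuntimeError("replay diverged from the backed-up operations")
        self.position += 1
        return data, result

    def read(self):
        return self._next("read")[0]

    def write(self, data):
        return self._next("write")[1]


def master_logic(channel):
    """Toy master-node logic: read two slave messages and acknowledge each."""
    seen = []
    for _ in range(2):
        message = channel.read()
        seen.append(message)
        channel.write(b"ack:" + message)
    return seen


backup_log = []
live = master_logic(LiveChannel([b"slave1-up", b"slave2-up"], backup_log))
replayed = master_logic(ReplayChannel(backup_log))
```

Because the same logic runs against either channel, the backup node reaches the state the failed master had reached without contacting the slave nodes again.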

B. The slave node subsystem processes route update messages, selects local optimal routes, and cooperates with the master node in selecting the global optimal route. It contains only the distributed BGP route computation submodule, which carries out the slave node subsystem's tasks in the following steps:

(1) Priority calculation

When a slave node parses an UPDATE message and finds feasible routes, the priority calculation process is triggered. The process locks the input routing information base, computes a priority for each new or replacement feasible route according to the preconfigured policy, unlocks the input routing information base when the calculation is complete, and triggers the route selection process.

(2) Route selection

In the distributed BGP system, route selection takes two steps: first a slave node selects the local optimal route, then the master node selects the global optimal route.

When the priority calculation completes, slave node route selection is activated first. The slave node's route selection process locks the input routing information base and, from all routes with the same destination as the new feasible route, selects the one with the highest priority. If the selected route is the same as the route stored in the local optimal routing information base, the route selection process ends. Otherwise, it updates the local optimal routing information base, unlocks the input routing information base, sends the route to the master node through the system's distributed messaging mechanism, and activates the master node's global route selection process.

The master node stores the local optimal routes of all slave nodes. When it receives a new route from a slave node, it locks the global optimal routing information base, selects the highest-priority route from all routes with the same destination as the new route, updates the global optimal routing information base, unlocks it, and triggers the route distribution process.

(3) Route distribution

The route distribution process is activated by the route selection process. It packs the updated routes from the global optimal routing information base into UPDATE messages, sends them to every BGP peer connected to this BGP instance, and records the routes sent in each peer's output routing information base.
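The two-step route selection described in (2) can be sketched as two small functions, one for each side. The route representation (a dictionary with `prefix`, `peer`, and a precomputed numeric `priority`) and the function names are assumptions for illustration; locking and the messaging between nodes are omitted.

```python
def local_best(adj_rib_in, prefix):
    """Slave side: the highest-priority route to prefix among this slave's peers."""
    candidates = [route for route in adj_rib_in if route["prefix"] == prefix]
    return max(candidates, key=lambda route: route["priority"], default=None)


def global_best(local_bests):
    """Master side: the highest-priority route among the slaves' local optima."""
    candidates = [route for route in local_bests if route is not None]
    return max(candidates, key=lambda route: route["priority"], default=None)


prefix = "192.0.2.0/24"
slave1_rib = [
    {"prefix": prefix, "peer": "A", "priority": 100},
    {"prefix": prefix, "peer": "B", "priority": 200},
]
slave2_rib = [
    {"prefix": prefix, "peer": "C", "priority": 150},
]

# Step 1: each slave picks its local optimum. Step 2: the master picks the global one.
bests = [local_best(slave1_rib, prefix), local_best(slave2_rib, prefix)]
best = global_best(bests)
```

Only one candidate per slave reaches the master, so the master compares at most as many routes as there are slave nodes for each destination.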

The highly available distributed BGP system based on a cluster router structure proposed by the present invention overcomes the performance and reliability limits of traditional single-node BGP systems and provides a new way to implement a BGP system. By building a cluster-structured distributed processing system, it achieves fast processing of BGP messages and reliable BGP service.

Brief description of the drawings

Figure 1. Structure of the distributed BGP system based on a cluster router structure

Figure 2. Slave node status query

Figure 3. Flowchart of connection establishment between the master node subsystem and a peer

Figure 4. Distributed BGP route computation

Figure 5. State backup and rollback recovery in a fault-tolerant application system

Detailed description

The highly available distributed BGP system based on a cluster router structure consists mainly of two subsystems: the master node subsystem and the slave node subsystem.

* Main functions

Master node subsystem: establishes connections with peers; sends received route update messages to the corresponding slave node according to the partitioning algorithm; receives the local optimal routes that the slave nodes produce from BGP message processing and selects the global optimal route; advertises route updates to peers; manages the slave nodes.

Slave node subsystem: parses UPDATE messages; computes the priority of each route; selects the local optimal route.

* Key concepts

BGP entity: the BGP system running on a router.

BGP peer: a BGP system that exchanges protocol messages with the current system.

BGP defines four types of messages:

□ OPEN message: used to open a BGP peer connection;

□ UPDATE message: route update message;

□ KEEPALIVE message: notification that keeps a BGP connection alive;

□ NOTIFICATION message: fault notification message.

BGP also defines six peer connection states that describe the stages of establishing a connection with a BGP peer: Idle (starting the peer connection), Connect (starting the TCP connection), Active (waiting for the TCP connection), OpenSent (OPEN message sent), OpenConfirm (waiting for confirmation of the OPEN message), and Established (BGP connection successful). Each state expects different BGP messages, and the state changes according to the BGP message received.
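On the wire, the four message types are distinguished by a type code in the fixed 19-byte BGP message header. The sketch below parses that header; the layout and type codes come from the BGP-4 specification (RFC 4271) rather than from this text, and the function name is hypothetical.

```python
import struct

# Type codes from the BGP-4 message header.
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}
HEADER_LEN = 19       # 16-byte marker + 2-byte length + 1-byte type
MAX_MESSAGE_LEN = 4096


def parse_header(data):
    """Return (message type name, total message length) for a BGP message."""
    if len(data) < HEADER_LEN:
        raise ValueError("fewer than 19 bytes available")
    marker, length, type_code = struct.unpack("!16sHB", data[:HEADER_LEN])
    if marker != b"\xff" * 16:
        raise ValueError("marker is not all ones")
    if not HEADER_LEN <= length <= MAX_MESSAGE_LEN:
        raise ValueError("message length out of range")
    if type_code not in MESSAGE_TYPES:
        raise ValueError("unknown message type")
    return MESSAGE_TYPES[type_code], length


# A KEEPALIVE message consists of the header alone.
keepalive = b"\xff" * 16 + struct.pack("!HB", 19, 4)
parsed = parse_header(keepalive)
```

A reader built on this would read 19 bytes from the TCP socket, call `parse_header`, and then read the remaining `length - 19` bytes before dispatching on the type.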

Databases maintained by the master node subsystem:

□ Global optimal routing information base: stores the router's global optimal routing information obtained by route computation;

□ Slave node database: stores the IDs of the slave nodes working in the distributed BGP system and the workload of each slave node;

□ Output routing information base: stores the route update messages sent to peers.

Databases maintained by the slave node subsystem:

□ Input routing information base: stores the update messages received from peers;

□ Local optimal routing information base: stores the node's own optimal routes obtained by the slave node's route computation.

* Distributed partitioning algorithm

The slave node information base maintained by the master subsystem records which slave nodes currently exist and how many peers' messages each slave node has been assigned to process. When the BGP system establishes a connection with a new peer, the master subsystem selects the least-loaded slave node to process the new peer's UPDATE messages. This allocation keeps the load on the slave nodes fairly balanced.

* Slave node management

□ Slave node join

1. Configure the slave node's ID and the master node's ID.

2. The slave node sends a join notification to the master node and waits for the master node's response.

3. The master node receives the slave node's join notification, adds the new node's information to the slave node database, and sends a response to the slave node.

□ Slave node exit

1. The master node deletes the exiting slave node's information from the slave node database.

2. According to the partitioning algorithm, the load handled by the exiting slave node is reassigned to other slave nodes.

□ Slave node status check

The master node periodically sends a query message to all slave nodes. A node that receives the query replies to the master node; a slave node that does not reply is considered to have failed. The master-slave status query process is shown in Figure 2.

□ Slave node failure handling

1. The master node detects the failed slave node through the slave node status check;

2. The master node waits for the slave node to recover, for a waiting time set by the administrator, and buffers the UPDATE messages that the failed slave node would have processed;

3. If the slave node does not recover within the waiting time, the master node deletes the failed slave node's information from the slave node information base and, following the partitioning algorithm, reassigns the load handled by the failed slave node to the other slave nodes;

4. If the failed slave node recovers within the waiting time, the master node sends it the buffered UPDATE messages for processing.
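The partitioning rule and the join, exit and failure steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `SlaveRegistry`, its methods, and the use of "number of assigned peers" as the load metric with a tie-break on the slave identifier are all assumptions made for the example.

```python
class SlaveRegistry:
    """Master-side view of the slave nodes and the peers assigned to each.

    Hypothetical sketch: load is measured as the number of assigned peers.
    """

    def __init__(self):
        self.peers_by_slave = {}  # slave id -> set of peer ids

    def join(self, slave_id):
        # A slave announces itself; the master records it with no load.
        self.peers_by_slave.setdefault(slave_id, set())

    def least_loaded(self):
        if not self.peers_by_slave:
            raise RuntimeError("no slave nodes available")
        # Tie-break on the slave id so the choice is deterministic.
        return min(self.peers_by_slave,
                   key=lambda s: (len(self.peers_by_slave[s]), s))

    def assign_peer(self, peer_id):
        # A new peer goes to the slave with the smallest load.
        slave = self.least_loaded()
        self.peers_by_slave[slave].add(peer_id)
        return slave

    def remove(self, slave_id):
        """Drop a slave (exit or unrecovered failure) and reassign its peers.

        Returns a mapping peer id -> new slave id.
        """
        orphaned = self.peers_by_slave.pop(slave_id, set())
        return {peer: self.assign_peer(peer) for peer in sorted(orphaned)}
```

The same `remove` call serves both the exit case and the case where a failed slave does not recover within the administrator's waiting time; buffering of UPDATE messages during the wait is left out of the sketch.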

* Establishing a connection with a peer

The router's BGP entity must first establish a connection with the peer; the flow is shown in Figure 3. BGP is a routing protocol that runs on top of the Transmission Control Protocol (TCP). Establishing a connection with a peer therefore takes two steps: first establish a TCP connection, then establish a BGP connection. Before the connection with a peer is set up, the connection state is set to Idle.

□ Establishing the TCP connection, in one of two modes: active mode and passive mode

Active mode: the master node subsystem initiates a TCP connection request to the peer and establishes the TCP connection with the peer through a three-way handshake. Passive mode: the master node subsystem listens on TCP port 179; the peer requests a TCP connection, and the TCP connection with the peer is established through a three-way handshake.

Before the TCP connection is started, the connection state is set to Connect.

□ Establishing the BGP connection

1. Send an OPEN message to the peer and wait for the peer's OPEN message; the connection state is set to OpenSent;

2. On receiving the peer's OPEN message, reply with a KEEPALIVE message and wait for the peer's KEEPALIVE message; the connection state is set to OpenConfirm;

3. On receiving the KEEPALIVE message, the connection with the peer is complete; the connection state is set to Established.

Once the BGP entity has established a connection with the peer, the master node uses the allocation algorithm to select the slave node with the smallest load, and that slave node processes the peer's UPDATE messages.

* Handling BGP messages

The master node subsystem obtains BGP messages by calling the TCP socket read function.
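As an illustration of that read step, the sketch below pulls one complete BGP message off a connected TCP socket. It assumes the standard BGP-4 framing (a 19-byte header made of a 16-byte marker, a 2-byte length and a 1-byte type, with a maximum message size of 4096 bytes); the function names `recv_exact` and `read_bgp_message` are hypothetical and do not come from the patent.

```python
import socket
import struct

HEADER_LEN = 19         # 16-byte marker + 2-byte length + 1-byte type
MAX_MESSAGE_LEN = 4096  # upper bound on a BGP-4 message
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


def recv_exact(sock, n):
    """Read exactly n bytes; TCP may deliver them in several pieces."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def read_bgp_message(sock):
    """Return (type name, body bytes) for the next message on the socket."""
    header = recv_exact(sock, HEADER_LEN)
    _marker, length, msg_type = struct.unpack("!16sHB", header)
    if not HEADER_LEN <= length <= MAX_MESSAGE_LEN:
        raise ValueError(f"bad message length {length}")
    if msg_type not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type {msg_type}")
    body = recv_exact(sock, length - HEADER_LEN)
    return MESSAGE_TYPES[msg_type], body
```

The returned type name is what the master node would use to dispatch to the OPEN, KEEPALIVE, UPDATE or NOTIFICATION handling described in the following subsections.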

□ OPEN message processing

In the distributed BGP system, OPEN messages are processed on the master node. The processing flow is as follows:

1. Read the values of four fields from the OPEN message: version number (Version), autonomous system number (AS Number), hold time (Hold Time) and BGP identifier (BGP Identifier);

2. Use the AS Number and the BGP Identifier to determine whether the OPEN message comes from a neighbor configured by the administrator. If it does not, send a NOTIFICATION message to the peer.

3. Perform connection collision detection as defined by the BGP protocol. If there is a collision and this connection has to be closed, send a NOTIFICATION message to the peer to terminate the connection.

4. Check whether the version number is correct; if it is not, send a NOTIFICATION message to the peer to terminate the connection.

5. Check whether the AS Number is correct; if it is not, send a NOTIFICATION message to the peer to terminate the connection.

6. Check whether the Hold Time is either zero or at least 3 seconds; if it is neither, send a NOTIFICATION message to the peer to terminate the connection.

7. Compare the Hold Time configured on this router's BGP entity with the Hold Time in the received OPEN message, use the smaller of the two as the Hold Time of this connection, and set the KEEPALIVE timer to one third of the connection's Hold Time.

8. Send a KEEPALIVE message to the peer to acknowledge the OPEN message, and move the finite state machine of the peer connection to the OpenConfirm state.
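The field extraction and the checks in the steps above can be sketched as follows. The sketch assumes the standard BGP-4 OPEN layout (1-byte version, 2-byte AS number, 2-byte hold time, 4-byte BGP identifier, 1-byte optional parameter length). The names `OpenMessage`, `OpenRejected`, `parse_open` and `validate_open`, and the representation of configured neighbors as a `{bgp_id: asn}` dictionary, are assumptions for illustration; connection collision detection and optional parameters are omitted.

```python
import struct
from dataclasses import dataclass

BGP_VERSION = 4


@dataclass
class OpenMessage:
    version: int
    asn: int
    hold_time: int
    bgp_id: int


class OpenRejected(Exception):
    """Raised where the master node would send a NOTIFICATION and close."""


def parse_open(body):
    # Fixed part of the OPEN body: version, AS, hold time, identifier, opt len.
    if len(body) < 10:
        raise OpenRejected("OPEN body too short")
    version, asn, hold_time, bgp_id, _opt_len = struct.unpack("!BHHIB", body[:10])
    return OpenMessage(version, asn, hold_time, bgp_id)


def validate_open(msg, neighbors, local_hold_time):
    """Return (negotiated hold time, keepalive interval) in seconds.

    `neighbors` is a hypothetical {bgp_id: asn} map of configured peers.
    """
    if msg.bgp_id not in neighbors:
        raise OpenRejected("not a configured neighbor")
    if msg.version != BGP_VERSION:
        raise OpenRejected("unsupported version")
    if msg.asn != neighbors[msg.bgp_id]:
        raise OpenRejected("bad peer AS")
    if msg.hold_time != 0 and msg.hold_time < 3:
        raise OpenRejected("unacceptable hold time")
    hold_time = min(local_hold_time, msg.hold_time)
    # The keepalive timer is one third of the negotiated hold time.
    return hold_time, hold_time // 3
```

For example, with a local Hold Time of 180 seconds and a peer that proposes 90 seconds, the connection uses 90 seconds and the KEEPALIVE timer is set to 30 seconds.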

□ KEEPALIVE message processing

In the distributed BGP system, KEEPALIVE messages are processed on the master node. A KEEPALIVE message consists only of a message header, so processing it is relatively simple.

When the connection state is OpenConfirm, the processing flow is as follows:

1. Change the connection state to Established.

2. Send a KEEPALIVE message to the peer.

3. Send the router's entire current routing table to the peer in UPDATE messages.

When the connection state is Established, the processing flow is as follows:

1. Increment the KEEPALIVE receive counter.

2. Reset the Hold Time timer.

□ UPDATE message processing

In the distributed BGP system, UPDATE messages are received by the master node but processed on the slave nodes. The processing flow is as follows:

The master node receives the UPDATE message and forwards it to the corresponding slave node;

Check the total attribute length; if it exceeds the permitted length, notify the peer with a NOTIFICATION message and discard the UPDATE message;

If the UPDATE message contains withdrawn (unfeasible) routes, check whether the withdrawn routes length is correct. If it exceeds the permitted length, notify the peer with a NOTIFICATION message and discard the UPDATE message;

Check the syntax of the withdrawn routes; if it is wrong, discard the UPDATE message; if it is correct, store the withdrawn routes in a variable;

If the UPDATE message contains feasible routes, check the feasible routes length; if it is too long, notify the peer with a NOTIFICATION message and discard the UPDATE message;

Check every field of the feasible routes' path attributes; if there is an error, notify the peer with a NOTIFICATION message and discard the UPDATE message; if they are correct, store the value of each path attribute field in a structure variable;

Check the syntax of the feasible routes; if it is wrong, discard the UPDATE message; if it is correct, store the feasible routes in a variable;

If there are withdrawn routes, delete them from the input routing information base and start the distributed BGP route calculation;

If there are feasible routes, update the input routing information base, save the path attributes, and start the distributed BGP route calculation.
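The length and syntax checks above can be illustrated with a small parser for the UPDATE body. It assumes the standard BGP-4 layout (2-byte withdrawn routes length, withdrawn prefixes, 2-byte total path attribute length, path attributes, then the feasible prefixes) and IPv4 prefixes only. The names `UpdateError`, `parse_prefixes` and `parse_update` are hypothetical, and the path attributes are returned undecoded to keep the sketch short.

```python
import struct


class UpdateError(ValueError):
    """Malformed UPDATE; the caller would notify the peer and discard it."""


def parse_prefixes(data):
    """Decode consecutive (length in bits, prefix bytes) IPv4 entries."""
    prefixes, i = [], 0
    while i < len(data):
        bits = data[i]
        nbytes = (bits + 7) // 8
        if bits > 32 or i + 1 + nbytes > len(data):
            raise UpdateError("bad prefix encoding")
        # Pad the prefix to four octets for dotted-decimal display.
        raw = data[i + 1:i + 1 + nbytes] + bytes(4 - nbytes)
        prefixes.append((".".join(str(b) for b in raw), bits))
        i += 1 + nbytes
    return prefixes


def parse_update(body):
    """Return (withdrawn prefixes, raw path attributes, feasible prefixes)."""
    if len(body) < 4:
        raise UpdateError("UPDATE body too short")
    (wd_len,) = struct.unpack("!H", body[:2])
    if 2 + wd_len + 2 > len(body):
        raise UpdateError("withdrawn routes length too long")
    withdrawn = parse_prefixes(body[2:2 + wd_len])
    (attr_len,) = struct.unpack("!H", body[2 + wd_len:4 + wd_len])
    attr_end = 4 + wd_len + attr_len
    if attr_end > len(body):
        raise UpdateError("path attribute length too long")
    attrs = body[4 + wd_len:attr_end]  # left undecoded in this sketch
    feasible = parse_prefixes(body[attr_end:])
    return withdrawn, attrs, feasible
```

A slave node would then remove the withdrawn prefixes from its input routing information base, store the feasible prefixes with their attributes, and start the route calculation.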

□ NOTIFICATION message processing

In the distributed BGP system, NOTIFICATION messages are processed jointly by the master node and the slave nodes. The processing flow is as follows:

1. Read the value of each field in the NOTIFICATION message;

2. Display the error information;

3. Disconnect from the peer;

4. The master node instructs the slave node that handles this peer's UPDATE messages to delete all information related to the peer (including the routes it advertised and the attributes describing those routes) and starts the distributed BGP route calculation.

* Distributed BGP route calculation

In the BGP protocol, route calculation is also called the decision process. It has three phases: priority calculation, route selection and route distribution. The three phases are independent processes, each triggered by a different event. Figure 4 is a schematic of the route calculation.

The distributed BGP routing algorithm is described as follows:

1. Priority calculation

When a slave node has parsed an UPDATE message and found feasible routes, the priority calculation process is triggered. During priority calculation, the input routing information base is locked and a priority is computed, according to the preconfigured policy, for each new feasible route or replacement route. When the calculation finishes, the input routing information base is unlocked and the route selection process is triggered.

2. Route selection

In the distributed BGP system, route selection is completed in two steps: first the slave node selects the locally optimal route, then the master node selects the globally optimal route.

When the priority calculation finishes, slave node route selection is activated first. The slave node's route selection process locks the input routing information base and selects the route with the highest priority among all routes that have the same destination as the new feasible route. If the selected route is the same as the one stored in the local optimal routing information base, the route selection process ends. Otherwise, the local optimal routing information base is updated, the input routing information base is unlocked, and the route is sent to the master node through the system's distributed messaging mechanism, which activates the master node's global route selection process.

The master node stores the locally optimal routes of all slave nodes. When it receives a new route from a slave node, it locks the global optimal routing information base, selects the route with the highest priority among all routes that have the same destination as the new route, updates the global optimal routing information base, unlocks it, and triggers the route distribution process.
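The two-step selection can be sketched in a single process as follows. The classes `Route`, `SlaveSelector` and `MasterSelector` are hypothetical, the priority is taken as a precomputed integer, and the locking and the distributed messaging between nodes are replaced by direct method calls; ties are broken on the peer name purely to make the example deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    priority: int
    peer: str


def best(routes):
    # Highest priority wins; the peer name breaks ties deterministically.
    return max(routes, key=lambda r: (r.priority, r.peer))


class SlaveSelector:
    def __init__(self):
        self.rib_in = {}      # prefix -> {peer: Route}
        self.local_best = {}  # prefix -> Route

    def learn(self, route):
        """Store a route; return the new local best if it changed, else None."""
        self.rib_in.setdefault(route.prefix, {})[route.peer] = route
        winner = best(self.rib_in[route.prefix].values())
        if self.local_best.get(route.prefix) == winner:
            return None  # nothing to report to the master node
        self.local_best[route.prefix] = winner
        return winner


class MasterSelector:
    def __init__(self):
        self.by_slave = {}     # prefix -> {slave id: Route}
        self.global_best = {}  # prefix -> Route

    def on_local_best(self, slave_id, route):
        """Record a slave's local best and recompute the global best."""
        self.by_slave.setdefault(route.prefix, {})[slave_id] = route
        winner = best(self.by_slave[route.prefix].values())
        self.global_best[route.prefix] = winner
        return winner
```

Because a slave reports to the master only when its local best changes, the master compares at most one candidate per slave for each destination.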

3. Route distribution

The route distribution process is activated by the route selection process. It packs the updated routes of the global optimal routing information base into UPDATE messages, sends them to every peer, and records the routes sent in each peer's output routing information base.

* Dual-node redundant backup of the master node

The master node and the backup node form a dual-node backup hardware environment, but there is no hardware mechanism between the nodes for detecting each other's software or hardware failures; they monitor each other's state with a heartbeat algorithm. Both the master node and the backup node run the master node subsystem. While the master node is working normally, the backup node only receives backup messages from the master node and stores the backup data they carry in the corresponding databases. When the master node fails, the backup node takes over the master node's work.

To achieve this failover, the method used is checkpoint state backup followed by state rollback recovery, as shown in Figure 5. The master node subsystem on the master node executes step by step. After each step completes, a checkpoint is inserted that captures the current state of the system and saves it to the corresponding database on the backup node. When the master node subsystem on the master node fails at some step, for example step 3, the backup node restores the system state saved at checkpoint 2, and the master node subsystem on the backup node can continue from step 3.
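The checkpoint and rollback idea can be sketched as follows. `BackupStore` stands in for the backup node's database, the steps are plain functions that mutate a state dictionary, and the `fail_at` parameter only simulates a master failure; all of these names are assumptions made for the example.

```python
import copy


class BackupStore:
    """Stands in for the checkpoint database on the backup node."""

    def __init__(self):
        self.last_step = 0
        self.state = {}

    def save(self, step, state):
        self.last_step = step
        self.state = copy.deepcopy(state)


def run_steps(steps, store, state=None, start=1, fail_at=None):
    """Run steps from `start`, checkpointing to `store` after each one."""
    state = {} if state is None else state
    for number in range(start, len(steps) + 1):
        if number == fail_at:
            raise RuntimeError(f"master failed during step {number}")
        steps[number - 1](state)
        store.save(number, state)  # checkpoint after the completed step
    return state


def take_over(steps, store):
    """Backup node: restore the last checkpoint and resume at the next step."""
    restored = copy.deepcopy(store.state)
    return run_steps(steps, store, restored, start=store.last_step + 1)
```

If the master fails during step 3, the store still holds checkpoint 2, so `take_over` resumes at step 3 with the state as it was after step 2.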

The processing flow is as follows:

The master node periodically sends a query message to the backup node, and the backup node replies. When the master node receives no reply from the backup node, it considers the backup node to have failed and stops sending it backup messages. When the master node does receive the reply, it considers the backup node to be working normally and can back up state information to it;

In the master node module, the state information to be backed up falls into two categories. The first is communication-related state information, which covers the communication between the master node and the slave nodes. The second is application-related state data, which includes configuration parameters such as the IP addresses and autonomous system numbers (ASN) of the other BGP peers connected to this BGP system, the global optimal routes of this cluster router, the output routes of this BGP system, and the slave nodes in this cluster router;

For communication-related state data, any single operation may change the state of a slave node, so this state must be backed up at fine granularity: a backup is made after every communication between the master node and a slave node. When the master node communicates with a slave node, it also backs up the communication read or write operation to the backup node. The backed-up operation includes the data read or written, the data length, and the result returned by the operation;

Application-related state data is large and is backed up at a coarser granularity: the master node sends it to the backup node at regular intervals;

When the backup node no longer receives query messages from the master node, it considers the master node to have failed, performs state rollback recovery, and takes over the master node's work;

When the backup node takes over, the application-related state data is already stored in the backup node's databases, so the master node subsystem on the backup node starts directly from that data. It then repeats the communication read and write operations. These are not performed as real reads and writes; instead, they return the data and results recorded in the backed-up operations.
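The fine-grained backup and replay of communication operations can be sketched as a record/replay pair. This is only one way to read the description: `RecordingChannel` performs real I/O on the master and logs each operation (data, length, result), while `ReplayChannel` on the backup returns the logged results without touching the network. `InMemoryTransport` is a hypothetical stand-in for the link to a slave node, and the Python list used as the log stands in for the copy held by the backup node.

```python
from collections import deque


class InMemoryTransport:
    """Hypothetical stand-in for the master-to-slave link."""

    def __init__(self, incoming):
        self.incoming = deque(incoming)
        self.sent = []

    def write(self, data):
        self.sent.append(data)
        return len(data)

    def read(self):
        return self.incoming.popleft()


class RecordingChannel:
    """Master side: do the real I/O and log every operation."""

    def __init__(self, transport, log):
        self.transport = transport
        self.log = log

    def write(self, data):
        result = self.transport.write(data)
        self.log.append(("write", data, len(data), result))
        return result

    def read(self):
        data = self.transport.read()
        self.log.append(("read", data, len(data), len(data)))
        return data


class ReplayChannel:
    """Backup side: return logged results instead of doing real I/O."""

    def __init__(self, log):
        self.pending = deque(log)

    def _next(self, op):
        kind, data, _length, result = self.pending.popleft()
        if kind != op:
            raise RuntimeError(f"replay diverged: expected {kind}, got {op}")
        return data, result

    def write(self, data):
        logged, result = self._next("write")
        if logged != data:
            raise RuntimeError("replay diverged: different write payload")
        return result

    def read(self):
        data, _result = self._next("read")
        return data
```

Replaying the log lets the subsystem on the backup node rebuild the same communication state the master had, without resending anything to the slave nodes.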

Claims (1)

1.基于集群路由器结构的高可用分布式边界网关协议系统,其特征在于:在集群路由器结构中,选取一个节点作为主控节点,另一个节点为主控节点的备份节点,构成主控节点子系统; 一个连接节点;其他节点作为从节点,构成从节点子系统;主控节点、从节点和连接节点通过髙速交换网组成所述基于集群路由器结构的高可用分布式边界网关协议系统,所述系统基于网络传输控制协议通过连接节点和对等体建立连接,所述对等体是指与所述系统交互协议信息的边界网关协议系统;其中,A.主控节点子系统运行在所述主控节点上,负责以下任务:与所述对等体建立连接; 根据划分算法把从对等体接收的载有路由更新消息的路由更新报文发送给相应的从节点处理,所述的路由更新消息用"UPDATE消息"表示;接收各从节点UPDATE消息处理后的局部最优路由并 1. Distributed availability Border Gateway Protocol router architecture based on the cluster system, comprising: a router in a cluster configuration, select a node as a master node, the other nodes as a backup master node to node, constituting the sub master node system; a connection node; other nodes as slave nodes, the nodes constituting the subsystem; master node, and the node from the node connected via the switching network consisting of the speed Gao availability distributed border gateway protocol router architecture based on the cluster system, the said transmission control protocol-based network system to establish a connection via connecting node and the peer, the peer is a border gateway protocol refers to the system information with the interactive protocol system;. 
wherein, a subsystem is operating in the master node the master node is responsible for the following tasks: establish a connection with the peer; the partitioning algorithm from the update message containing routing peer receives a route update message sent from the node to the corresponding processing routing according to update message is represented by "uPDATE message"; and each receiving the local optimal route from node uPDATE message processing 中选择出全局最优路由;将UPDATE消息通告给所述对等体;管理从节点以及把重要消息发送给所述备份节点;在所述主控节点上维护以下数据库-全局最优路由数据信息库:保存路由计算得到的路由器全局最优路由信息; 从节点数据库:保存分布式BGP系统中工作的从节点ID,每个从节点的工作负责情况, 以及主控节点与从节点的通讯操作备份,所述BGP系统指的是边界网关协议系统: 输出路由信息库:保存发送给对等体的路由更新信息; 在所述主控节点上配置了以下软件模块-(1 )分布式划分算法模块当所述BGP系统与新的对等体建立连接后,主控节点选择负载最小的从节点来处理新的对等体的UPDATE消息; (2)从节点管理模块该模块包括以下各子模块-(2.1) 从节点加入子模块新加入节点由管理员配置ID和主控节点ID,当新节点加入Cluster时,立即发送消息通告主 The global optimal route selected; advertise the UPDATE message to the peer; from the management node, and transmitting the important message to the backup node; maintaining the database on the master node - the global optimal routing data library: save the router global optimal routing information in the routing calculated; from the node database: save the node ID from the distributed BGP system to work, each responsible for the case from the nodes and the master node and communications operations from the backup node the BGP system refers to a system border gateway protocol: output routing information base: stored routing updates sent to the peer; on the master node is configured with the following software modules - (1) dividing the distributed algorithm module when the system with the new BGP peer connection is established, the master node selects the minimum load to process a new UPDATE message from peer node; (2) from the node management module following the module includes sub-modules - (2.1) was added sub-module added to the new node from the master node ID and the node ID configuration by the administrator, the Cluster when a new node is added, 
immediately sends a message announcement master 节点,主控节点回应这个消息,确认新节点的加入,并将新节点的信息加入到从节点信息库中,所述Cluster即集群路由器结构;(2.2) 从节点退出子模块主控节点删除从节点信息库中退出从节点的信息,并按照划分算法把这个从节点上处理的对等体重新分配给其他的从节点处理;(2.3) 从节点状态监控子模块主控节点周期性的向其他所有从节点发送询问消息,收到询问消息的从节点向主控节点回复消息,没有回复消息的从节点将被认为故障;(2.4) 从节点故障处理子模块主控节点通过状态监控发现某个从节点出现故障,主控节点删除从节点信息库中这个从节点的信息,并按照划分算法把这个从节点上处理的对等体重新分配给其他的从节点处理;(3) 与对等体建立连接模块该模块依次按以下步骤实现与对等体的连接:步骤3-l:启动与对等体的连接;步骤3-2:启动 Node, the master node in response to this message, confirm a new node is added, and the new node is added to the information from the node repository, i.e., the cluster router Cluster structure; (2.2) exits from the node is deleted from the master node submodule exit information database node information from the nodes, and partitioning algorithm according to this new assignment from the peer node processing from the weight to other processing nodes; (2.3) from the master node status monitoring sub-modules to other nodes periodically All sends a query message from the node, receive a reply message from the node to the master node interrogation message, there is no reply message from the node to be considered a fault; (2.4) from the failed master node processing sub-module finds a node status monitoring from node failure occurs, the master node deletes the information from the node, and in accordance with the newly allocated partitioning algorithm from the database of the node information and the like from the processing node to another slave node weight process; (3) the peer the module connection establishing module implemented by the steps of sequentially connected to the peer: initiate a connection with a peer; step 3-2:: start step 3-l TCP连接;步骤3-3:建立BGP连接,按以下步骤进行:步骤3-3-l:向对等体发送用来建立BGP对等体连接的问讯消息,称为OPEN消息; 步骤3-3-2:接收到对等体的OPEN消息后,向对等体回复保持BGP连接的通告消息称为KEEPALIVE消息,同时等待对等体的KEEPALIVE消息,连接状态设置为OpenConfirm; 步骤3-3-3:接收对等体的KEEPALIVE消息,完成与对等体的连接,连接状态设置为Established;步骤3-4:主控节点根据所述分配算法选出负载最小的从节点,由该从节点处理该对等体的UPDATE消息;(4) 处理BGP消息模块该模块按以下步骤实现消息处理:步骤4-l:主控节点调用TCP socket读函数得到BGP消息; 
步骤4-2:主控节点处理不同类型消息: 步骤4-2-1:处理OPEN消息从OPEN消息中读取版本号、自治域号、超时时间、BGP标识符四个域的值,并分别予以检验;根据自治域号和BGP标识符判断OPEN消息是否来自管理 TCP connection; Step 3-3: BGP connection is set, perform the following steps: Step 3-3-l: peer sends Inquiry messages used to establish the BGP peer connections, called OPEN message; Step 3-3 -2: after receiving the OPEN message peer to peer connection BGP announcement message reply holding referred KEEPALIVE messages, while waiting for the KEEPALIVE messages to a peer, a connection state is set to the OpenConfirm; step 3-3-3 : a KEEPALIVE message receiving peer, and the peer connection is completed, the connection status is set to the Established; step 3-4: the master node from the selected node with the lowest load based on the allocation algorithm, the processing by this node from UPDATE message peer; (4) the processing module implements the messaging module BGP message according to the processing steps: step 4-l: the master node calls the read function to obtain TCP socket BGP message; step 4-2: different master node processing message type: step 4-2-1: processing OPEN message read the version number, AS number, timeout, BGP identifier value from the four fields in the OPEN message, and be examined separately; the BGP AS number and identification OPEN operator determines whether the message from the management 设置的邻居节点:若不是, 则发送用NOTIFICATION表示的故降消息与对等体中断连接;若是,则进行以下检测-,根据BGP协议的连接冲突检測定义进行沖突检测:若有冲突并霈关闭该连接便发送故障消息以中断与该对等体的连接:若无冲突,便执行以下检测;检测版本号是否正确:若不正确,发送故障消息给该对等体以中断连接;若正确,便执行以下检测;检测超时时间是否为零或者小于3秒:若不是,发送故障消息以中断与该对等体的连接: 否则,便执行以下检测:比较本路由器BGP实体设置的超时时间置和接收的OPEN消息中的超时时间值,以值小的作为这个连接的超时时间值,设置保持BGP连接的通告消息定时器的值为所述连接超时时间值的三分之一;发送保持BGP连接的通告消息给该对等体确认接收OPEN消息,连接状态设g为OpenConfirm状态;步骤4-2-2:处理保 Neighbor node set: if not, then the message and transmitting it down peer NOTIFICATION represented disconnected; if yes, perform the following tests -, collision detection collision detection connector according to the BGP protocol defined: In case of conflict and close Pei the 
connection failure message will be sent to interrupt the connection with the peer: the absence of conflict, it performs the following tests; to detect the version number is correct: if correct, the failure to send a message to the peer to disconnect; if correct, it performs the following tests; detecting whether the time-out time is zero or less than 3 seconds: if not, sending a failure message to the disconnected peer: otherwise, it performs the following tests: Comparative present BGP router entity set timeout counter and timeout value OPEN message received, to a small value as the timeout values ​​for the connection, is provided to maintain BGP announcement message timer is connected to the third connection timeout value; holding the transmission BGP connection advertisement message to the receiving peer OPEN acknowledgment message, the connection state is set OpenConfirm state g; step 4-2-2: processing Paul BGP连接的通告消息当连接状态为OpenConfirm状态时,主控节点把连接状态变为Established状态并向对等体发送保持BGP连接的通告消息;当连接状态为Established状态时,增加保持BGP连接的通告消息接收计数,重置超时时间定时器;步骤4-2-3:处理从对等体接收到的路由更新消息主控节点收到路由更新消息后,把路由更新消息发送给相应的从节点;由从节点作以下检查;对整个属性长度作检查,若超过规定长度,通过故障消息通告对等体,丢弃该路由更新消息;若路由更新消息中包括不可用路山,检査该路由长度是否正确,若超过规定值,向对等体发送故降消息并丢弃该路由更新消息;否则,对该不可用路由进行语法检査,若有错误, 便丢弃该路由更新消息;若正确,便获取不可用路由的值存入变量中;若路由更新消息中包含可用路由,则检査该路由的长度 When BGP announcement message when the connection state is connected OpenConfirm state, the master node changes the connected state to transmit keep Established state BGP announcement message peer connection; state when the connection is in the Established state, BGP connections increase the holding advertised receiving a message count, resets the timeout timer; step 4-2-3: processing received from peer routing update messages to the master node receives the routing update message, the route update message to the corresponding slave node; the slave node for the following checks; for inspection of the entire length of the property, when more than a predetermined length, a failure message advertised 
by peer discards the route update message; if the routing update message includes a not available mountain road, the route length is checked right, when more than a predetermined value, reducing transmit it discards the message and a route update message to the peer; otherwise, the route is unavailable syntax checked and any errors, the route update message is discarded; if correct, it acquires value stored in the variable routing is unavailable; if the routing update message includes routing is available, then check the length of the route 若超过规定值,向对等体发送故障消息并丢弃该路由更新消息;否则,对该可用路由的路径属性的每个域进行检査,若有错误,便丢弃该路由更新消息;若正确,便获取路由属性各个域的值存入一个结构变量中;对于不可用路由,从输入路由信息库中删除该路由,启动分布式BGP路由计算;对于可用路由,更新输入路由信息库,保存路径属性,启动分布式BGP路由计算;步骤4-2-4:处理故障消息主控节点获取该故障消息中各个域的值,显示错误信息,断幵与故障对等体的连接;接着,通知该对等体UPDATE消息的处理从节点删除包括故障对等体所发布的路由以及路由属性在内的所有相关信息;(5)双节点冗余备份模块主控节点和备份节点形成双节点备份的硬件环境,但节点之间不提供相互的软硬件失效的硬件检测机制,它们通过心跳算法实现双机状态监测; Exceeds a predetermined value, sending a failure message to the peer, and discards the route update message; otherwise, the path for each attribute domain available routes to check for any errors, the route update message is discarded; if correct, You will get the value of each domain routing attribute structure into a variable; not available for routing, delete this route from the routing information base input, distributed BGP route calculation starts; available for routing, a routing information base update input, path attribute stored , distributed BGP route calculation starts; step 4-2-4: Troubleshooting message master node acquires the value of the fault message in each domain, the error message display is connected, and Jian-off failure peer; Next, the notification on UPDATE message peer processing comprises deleting the faulty peer published routes and routing attributes including all the relevant information from the node; (5) two-node module redundancy and backup nodes master node form a double backup hardware environment , but it does not provide another mechanism for detecting 
hardware failures between hardware and software nodes that implement dual-state algorithm monitors heartbeat; 控节点和备份节点都运行主控节点子系统,当主控节点正常工作时,备份节点只能接收主控节点的备份消息,并把备份消息中的备份数据备份到相应的数据库中;主控节点出现故障时,备份节点接替主控节点的工作:为实现这种失效转移,采用了的方法是进行检査点状态备份,然后进行状态回滚恢复:主控节点定时发送査询消息给备份节点,备份节点回复消息;当主控节点收不到备份节点的回复消息时,就认为备份节点故障,这时主控节点将不会向备份节点发送备份消息;当主控节点能够收到备份节点的回复消息时,就认为备份节点正常工作,可以向备份节点备份状态信息;在主控节点模块中,需要备份的状态信息可以分为两类, 一类是:与通讯相关的状态信息,包括主控节点与从节点的通讯信息;另一类是:与应用相关的状态数据,包括和本BG Control and backup nodes are running subsystem master node, the master node when working properly, the backup node can backup master node receives the message, and the backup data backup in a backup message to the appropriate database; master when a node failure, the backup node to take over the work of the master node: failover to achieve this, a method is adopted to check status of the backup point, and roll back the state: the master node periodically sends a query message to the backup node, the backup node reply message; when the master node can not receive a reply message to the backup node, the backup node failure is considered, then the backup master node will not send a message to the backup node; when the master node can receive backups when a node reply message, it is considered the backup node is working properly, you can backup status information to the backup node; the master node module, the state needs to back up information can be divided into two categories, one is: status information associated with communications, comprising a master node and communications information from the node; the other is: the state data associated with the application, and including the present BG P 协议建立连接的其它BGP对等体的IP地址和自治系统号码ASN等配置参数、本集群路由器与全局最优路由相关的数据、本BGP协议与输出路由相关的数据、本集群路由器中与从节点相关的数据;对于通讯相关的状态数据来说,任何一次操作都可能涉及到从节点的状态变化,所以它们的状态备份必须做到小粒度的备份,在每一次主控节点与从节点进行通讯后进行相应的状们的状态备份必须做到小粒度的备份,在每一次主控节点与从节点进行通讯后进行相应的状态备份:当主控节点与从节点通讯时,主控节点同时将通讯数据读写操作备份到备份节点中, 
the backed-up read and write operations include the data read and written, the data length, and the result returned by each operation; for application-related state data, which are large in volume, the backup granularity is coarser: the master node sends these application-related data to the backup node at regular intervals;

when the backup node no longer receives query messages from the master node, it considers the master node to have failed; the backup node then performs state rollback recovery and takes over the work of the master node. At takeover, the application-related state data are already stored in the corresponding database on the backup node; the master-node subsystem on the backup node starts directly from these state data and then repeats the communication read and write operations. These repeated operations do not perform actual reads and writes; instead, they return the corresponding data and results from the backed-up read and write operations;

B. The slave-node subsystem is responsible for processing route update messages and for local optimal route selection, and it also cooperates with the master node in global optimal route selection. The slave-node subsystem contains only the distributed BGP route calculation submodule, which completes the tasks of the slave-node subsystem in the following steps:

(1) Priority calculation. When a slave node parses an UPDATE message and finds an available route, the priority calculation process is triggered. During priority calculation, the input routing information base is locked and a priority is calculated for the new available route or replacement route according to a preset policy. When the calculation is complete, the input routing information base is unlocked and the route selection process is triggered;

(2) Route selection. In the distributed BGP system, route selection is completed in two steps: first, a slave node selects the local optimal route; second, the master node selects the global optimal route. When the priority calculation process is complete, slave-node route selection is activated first. The slave-node route selection process locks the input routing information base and selects the route with the highest priority from all routes that have the same destination as the new available route. If the selected route is the same as the route stored in the local optimal routing information base, the route selection process ends. Otherwise, the local optimal routing information base is updated, the input routing information base is unlocked, and the route is sent to the master node through the system's distributed message mechanism, which activates the global route selection process on the master node. The master node stores the local optimal routes of all slave nodes. When it receives a new route from a slave node, it locks the global optimal routing information base, selects the route with the highest priority from all routes that have the same destination as the new available route, updates the global optimal routing information base, unlocks the global optimal routing information base, and triggers the route distribution process;

(3) Route distribution. The route distribution process is activated by the route selection process. It packs the updated routes of the global optimal routing information base into UPDATE messages, sends them to every BGP peer that has established a connection with this BGP protocol, and records the routes sent in the output routing information base of each peer.
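The two-step route selection in steps (1) to (3) can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names `SlaveNode` and `MasterNode`, the use of a single integer as the priority, the direct method call standing in for the distributed message mechanism, and the list `sent` standing in for UPDATE messages are all assumptions made for illustration. The sketch also assumes that a new route from a peer replaces that peer's earlier route to the same destination, and that the master distributes a route only when the global optimum actually changes.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str    # destination, e.g. "10.0.0.0/8"
    peer: str      # BGP peer the route was learned from
    priority: int  # result of the priority calculation; higher wins


class MasterNode:
    """Holds every slave's local optimum and picks the global optimum."""

    def __init__(self):
        self.local_bests = {}   # (slave name, prefix) -> Route
        self.global_best = {}   # prefix -> Route
        self.lock = threading.Lock()
        self.sent = []          # hypothetical stand-in for UPDATE messages

    def on_local_best(self, slave, route):
        with self.lock:  # lock the global optimal routing information base
            self.local_bests[(slave, route.prefix)] = route
            same_dest = [r for (_, p), r in self.local_bests.items()
                         if p == route.prefix]
            best = max(same_dest, key=lambda r: r.priority)
            changed = self.global_best.get(route.prefix) != best
            self.global_best[route.prefix] = best
        if changed:  # assumption: distribute only on an actual change
            self.distribute(best)

    def distribute(self, route):
        # Step (3): a real system would build an UPDATE per peer here.
        self.sent.append(route)


class SlaveNode:
    """Computes priorities and the local optimum for its own peers."""

    def __init__(self, name, master):
        self.name = name
        self.master = master
        self.rib_in = {}       # prefix -> list of candidate routes
        self.local_best = {}   # prefix -> Route
        self.rib_in_lock = threading.Lock()

    def on_update(self, prefix, peer, policy_value):
        # Step (1): priority calculation with the input base locked.
        with self.rib_in_lock:
            route = Route(prefix, peer, priority=policy_value)
            kept = [r for r in self.rib_in.get(prefix, []) if r.peer != peer]
            self.rib_in[prefix] = kept + [route]
        self.select_local(prefix)

    def select_local(self, prefix):
        # Step (2), first half: local optimum among same-destination routes.
        with self.rib_in_lock:
            best = max(self.rib_in[prefix], key=lambda r: r.priority)
            if self.local_best.get(prefix) == best:
                return  # unchanged, so selection ends here
            self.local_best[prefix] = best
        # Input base is unlocked before the master is notified.
        self.master.on_local_best(self.name, best)
```

In this sketch a slave contacts the master only when its local optimum changes, which is the property the claims rely on to keep most UPDATE processing on the slave nodes.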
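The fine-grained backup of communication operations and their replay after takeover can be sketched in the same hedged way. The names `LoggedOp`, `BackupLog`, and `ReplayChannel` are hypothetical, and the sketch assumes that the operation log is truncated whenever a new application-state snapshot arrives, a detail the claims do not spell out.

```python
from dataclasses import dataclass


@dataclass
class LoggedOp:
    kind: str     # "read" or "write"
    data: bytes   # data that was read or written
    length: int   # data length
    result: int   # value the original operation returned


class BackupLog:
    """State kept on the backup node (illustrative only)."""

    def __init__(self):
        self.app_state = {}  # coarse-grained, sent at intervals
        self.ops = []        # fine-grained, one entry per operation

    def checkpoint(self, app_state):
        self.app_state = dict(app_state)
        self.ops.clear()     # assumption: earlier operations are covered

    def record(self, kind, data, result):
        self.ops.append(LoggedOp(kind, data, len(data), result))


class ReplayChannel:
    """Returns logged data and results instead of doing real I/O."""

    def __init__(self, ops):
        self._ops = list(ops)
        self._pos = 0

    def _next(self, kind):
        if self._pos >= len(self._ops):
            raise EOFError("replay log exhausted")
        op = self._ops[self._pos]
        if op.kind != kind:
            raise RuntimeError(f"expected {op.kind}, got {kind}")
        self._pos += 1
        return op

    def read(self):
        op = self._next("read")
        return op.data, op.result

    def write(self, data):
        # The data is not sent anywhere; only the logged result is returned.
        return self._next("write").result

    @property
    def exhausted(self):
        return self._pos == len(self._ops)
```

Once `exhausted` becomes true, a real takeover would switch from the replay channel to live communication with the slave nodes; that switch is outside the scope of this sketch.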
CN 200510012192 2005-07-15 2005-07-15 High-available distributed boundary gateway protocol system based on cluster router structure CN100452797C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200510012192 CN100452797C (en) 2005-07-15 2005-07-15 High-available distributed boundary gateway protocol system based on cluster router structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 200510012192 CN100452797C (en) 2005-07-15 2005-07-15 High-available distributed boundary gateway protocol system based on cluster router structure

Publications (2)

Publication Number Publication Date
CN1719831A (en) 2006-01-11
CN100452797C (en) 2009-01-14

Family

ID=35931554

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200510012192 CN100452797C (en) 2005-07-15 2005-07-15 High-available distributed boundary gateway protocol system based on cluster router structure

Country Status (1)

Country Link
CN (1) CN100452797C (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104036548A (en) * 2014-07-01 2014-09-10 浪潮(北京)电子信息产业有限公司 MHA cluster environment reconstruction method, device and system
US9934114B2 (en) 2013-09-26 2018-04-03 Mitsubishi Electric Corporation Communication system, standby device, communication method, and standby program

Families Citing this family (40)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2007254088A1 (en) * 2006-05-16 2007-11-29 Bea Systems, Inc. Next generation clustering
US9384103B2 (en) 2006-05-16 2016-07-05 Oracle International Corporation EJB cluster timer
CN101141382A (en) * 2006-09-07 2008-03-12 华为技术有限公司 Routing update method and router
CN101427531B (en) 2007-01-04 2011-11-23 中兴通讯股份有限公司 A protection method of the speakers of the interdomain protocol in the optical network
CN101014011B (en) 2007-01-31 2010-06-09 华为技术有限公司 Router switching equipment, IP network, communication system and path switching method
CN101309201B (en) * 2007-05-14 2012-05-23 华为技术有限公司 Route processing method, routing processor and router
CN101056270B (en) * 2007-05-18 2010-10-06 华为技术有限公司 A route convergence method and routing device
CN101110776B (en) 2007-07-05 2011-06-01 华为技术有限公司 Backup method, backup device and backup system for data business
CN100550866C (en) 2007-08-06 2009-10-14 中兴通讯股份有限公司 Method, device and communication system to recover boundary gateway protocol connection
CN101127705B (en) 2007-09-20 2011-09-21 中兴通讯股份有限公司 Method for realizing network transmission service quality
CN100574284C (en) 2007-09-24 2009-12-23 中兴通讯股份有限公司 Operation control method of distributed router system
CN101179504B (en) 2007-11-20 2011-05-04 华为技术有限公司 Method, system and network appliance to restrain routing
CN101217402B (en) 2008-01-15 2012-01-04 杭州华三通信技术有限公司 A method to enhance the reliability of the cluster and a high reliability communication node
CN101534239B (en) * 2008-03-13 2012-01-25 华为技术有限公司 Method and device for installing routers
CN101605089B (en) 2008-06-11 2012-02-22 华为技术有限公司 Method and apparatus for dynamic BGP migration
CN101309167B (en) 2008-06-27 2011-04-20 华中科技大学 Disaster allowable system and method based on cluster backup
CN101360056B (en) 2008-09-12 2011-04-20 中兴通讯股份有限公司 System and method solving backup routing engine upper label competition
CN101488966A (en) * 2009-01-14 2009-07-22 深圳市同洲电子股份有限公司 Video service system
CN101483548B (en) 2009-02-26 2011-01-19 中国人民解放军信息工程大学 Method and system for distance vector routing protocol self-recovery
CN102064954B (en) * 2009-11-17 2013-09-18 腾讯科技(深圳)有限公司 Distributed fault tolerant system, equipment and method
CN102135929B (en) * 2010-01-21 2013-11-06 腾讯科技(深圳)有限公司 Distributed fault-tolerant service system
CN102340410B (en) * 2010-07-21 2014-09-10 中兴通讯股份有限公司 Cluster management system and method
CN101958805B (en) 2010-09-26 2014-12-10 中兴通讯股份有限公司 Terminal access and management method and system in cloud computing
CN102694825A (en) * 2011-03-22 2012-09-26 腾讯科技(深圳)有限公司 Data processing method and data processing system
CN102202425B (en) * 2011-06-24 2013-09-18 中国人民解放军国防科学技术大学 Satellite cluster self-organization networking method based on master-slave heterogeneous data transmission module
CN102291455B (en) * 2011-08-10 2014-02-19 华为技术有限公司 Distributed cluster processing system and message processing method
CN103023673A (en) * 2011-09-21 2013-04-03 中兴通讯股份有限公司 Starting method and apparatus of control units
JP5927871B2 (en) * 2011-11-30 2016-06-01 富士通株式会社 Management device, information processing apparatus, management program, management method, program, and processing method
CN102523257A (en) * 2011-11-30 2012-06-27 广东电子工业研究院有限公司 Infrastructure as a service (IAAS)-cloud-platform-based virtual machine fault-tolerance method
CN102904761B (en) * 2012-10-24 2016-08-17 浙江宇视科技有限公司 NVR stacking method and NVR
CN103036717B (en) * 2012-12-12 2015-11-04 北京邮电大学 Consistency maintenance system and method for distributed data
CN103166796B (en) * 2013-03-13 2014-12-10 武汉邮电科学研究院 Method for realizing consistency of service signal transceiving paths during service recovery of power communication network
CN103269286B (en) * 2013-06-04 2016-01-13 上海数讯信息技术有限公司 Visual route monitoring and management system based on Border Gateway Protocol
CN103888310B (en) * 2013-09-04 2017-11-24 中寰卫星导航通信有限公司 Processing method and system monitoring
CN103491011B (en) * 2013-09-05 2017-02-08 杭州华三通信技术有限公司 BGP session changing method and apparatus
CN103491192B (en) * 2013-09-30 2016-08-17 北京搜狐新媒体信息技术有限公司 Namenode switching method and system for a distributed system
US9626261B2 (en) * 2013-11-27 2017-04-18 Futurewei Technologies, Inc. Failure recovery resolution in transplanting high performance data intensive algorithms from cluster to cloud
CN104821892B (en) * 2015-04-09 2018-06-19 清华大学 Inter-plane switching system routing behavior verification method and apparatus for collaborative
CN106302198A (en) * 2015-05-25 2017-01-04 中兴通讯股份有限公司 Configuration method for cluster router CPU resources, and cluster router
CN105095008B (en) * 2015-08-25 2018-04-17 国电南瑞科技股份有限公司 Suitable for distributed cluster system task fault tolerance method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1431808A (en) 2003-01-27 2003-07-23 西安电子科技大学 Large capacity and expandable packet switching network structure
WO2005002136A1 (en) 2003-06-06 2005-01-06 Microsoft Corporation Method and system for global routing and bandwidth sharing
CN1610332A (en) 2004-07-09 2005-04-27 清华大学 Non-state end-to-end constraint entrance permit control method for kernel network

Also Published As

Publication number Publication date
CN1719831A (en) 2006-01-11

Similar Documents

Publication Publication Date Title
US8806031B1 (en) Systems and methods for automatically detecting network elements
US7185096B2 (en) System and method for cluster-sensitive sticky load balancing
US9088478B2 (en) Methods, systems, and computer readable media for inter-message processor status sharing
US7284067B2 (en) Method for integrated load balancing among peer servers
JP4680919B2 (en) Redundant routing function for the network node cluster
US8095935B2 (en) Adapting message delivery assignments with hashing and mapping techniques
CN101207550B (en) Load balancing system and method for multi business to implement load balancing
US6614764B1 (en) Bridged network topology acquisition
CA2555545C (en) Interface bundles in virtual network devices
US7894372B2 (en) Topology-centric resource management for large scale service clusters
US7765283B2 (en) Network provisioning in a distributed network management architecture
US6490246B2 (en) System and method for using active and standby routers wherein both routers have the same ID even before a failure occurs
US6092096A (en) Routing in data communications network
CN100456738C (en) Routing system and method for synchronizing
US20010037387A1 (en) Method and system for optimizing a network by independently scaling control segments and data flow
US6910149B2 (en) Multi-device link aggregation
JP5235998B2 (en) Method and apparatus for establishing and managing Diameter association
US10225157B2 (en) Dynamically deployable self configuring distributed network management system and method having execution authorization based on a specification defining trust domain membership and/or privileges
RU2375746C2 (en) Method and device for detecting network devices
US20020089982A1 (en) Dynamic multicast routing facility for a distributed computing environment
US6934875B2 (en) Connection cache for highly available TCP systems with fail over connections
US6721275B1 (en) Bridged network stations location revision
CN101471885B (en) Virtual multicast routing for a cluster having state synchronization
CN1849783B (en) Distributed software architecture for implementing BGP
JP3765138B2 (en) Improved node discovery and monitoring with a network management system

Legal Events

Date Code Title Description
C06 Publication
C10 Request of examination as to substance
C14 Granted
C17 Cessation of patent right