CN106603261B - Hot backup method, first main device, standby device and communication system - Google Patents
- Publication number
- CN106603261B (CN201510675648.3A)
- Authority
- CN
- China
- Prior art keywords
- link quality
- link
- active
- backup
- quality level
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0654—Management of faults, events, alarms or notifications using network fault recovery
- H04L41/0668—Management of faults, events, alarms or notifications using network fault recovery by dynamic selection of recovery network elements, e.g. replacement by the most appropriate element after failure
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/22—Alternate routing
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
Embodiments of the present invention provide a hot backup method, a first active device, a standby device, and a communication system, which help save system resources. In the method, the first active device obtains a first link quality level from the first active device to the standby device, X1 link quality levels from the first active device to the other N-1 active devices, and Y link quality levels from the N-1 active devices to the standby device. When the number of abnormal levels among the first link quality level and the X1 link quality levels is less than the number of abnormal levels among the Y link quality levels, the number of normal levels among the X1 link quality levels is greater than or equal to the number of normal levels among the Y link quality levels, and the first active device's own link quality level is abnormal, the first active device performs hot backup to the standby device; otherwise, the first active device remains in the active state.
Description
Technical Field
Embodiments of the present invention relate to the field of communication technologies, and in particular to a hot backup method, a first active device, a standby device, and a communication system.
Background
As communication technology develops, more and more functions are integrated into the communication devices in a network. A failure of a communication device prevents the terminal devices it serves from communicating normally and may even disrupt the operation of the entire network. Therefore, each communication device is generally configured with a backup channel so that it can keep working when its main communication channel fails, and the network is also configured with standby devices so that the network as a whole can still work normally when a device fails.
Hot backup is a reliable protection technique for networks. When a single communication device in the network fails, the services on the failed device are quickly switched to a standby communication device, and terminal devices do not perceive the failover. Hot backup generally takes the form of dual-device hot backup or multi-device hot backup. In dual-device hot backup, one standby communication device protects one active communication device; in multi-device hot backup, one standby communication device protects N active communication devices. Dual-device hot backup is thus a special case of multi-device hot backup.
Hot backup is used in a wide range of scenarios. For example, a broadband remote access server (Broadband Remote Access Server, BRAS) uses the Virtual Router Redundancy Protocol (VRRP) to implement hot backup. Taking dual-device hot backup with two BRASs as an example, when the active BRAS fails, an active/standby switchover takes place, and both the uplink and downlink data traffic on the active BRAS is switched to the standby BRAS. The user equipment does not perceive the switchover at any point and therefore does not need to reconnect to the network. After the original active BRAS recovers from the failure, services are switched back to it and it continues to work.
Conventional multi-device hot backup techniques can judge only link-interruption failures correctly. When a link suffers severe packet loss, they are prone to wrong judgments, which may waste resources.
Summary of the Invention
Embodiments of the present invention provide a hot backup method, a first active device, a standby device, and a communication system. Whether to perform backup is determined by evaluating the quality of the links among the active devices and the standby device, so that the communication devices in the system can communicate normally after the backup, which saves system resources.
According to a first aspect, a hot backup method is provided, including:
a first active device obtains a first link quality level, where the first link quality level indicates the link quality between the first active device and a standby device;
when the first link quality level indicates that the link is abnormal, the first active device obtains N-1 X_i link quality levels between the first active device and the other N-1 active devices, where the X_i link quality level indicates the link quality between the first active device and an i-th active device, N is greater than or equal to 2, and i is greater than 1 and less than or equal to N;
the first active device sends a request message to each of the other N-1 active devices, where the request message requests each of the other N-1 active devices to send the link quality level between that active device and the standby device;
the first active device receives M Y_j link quality levels sent by M of the other N-1 active devices, where the Y_j link quality level indicates the link quality between a j-th active device and the standby device, j is greater than 1 and less than or equal to N, and M is greater than or equal to zero and less than or equal to N-1; and
when the number of levels indicating an abnormal link among the first link quality level and the M Y_j link quality levels is less than the number of levels indicating an abnormal link among the N-1 X_i link quality levels, the number of levels indicating a normal link among the M Y_j link quality levels is greater than or equal to the number of levels indicating a normal link among the N-1 X_i link quality levels, and the service of the first active device is impaired, the first active device switches its services to the standby device.
With reference to the first aspect, in a first possible implementation of the first aspect, the method further includes:
when the number of levels indicating an abnormal link among the first link quality level and the M Y_j link quality levels is greater than or equal to the number of levels indicating an abnormal link among the N-1 X_i link quality levels, or when the number of levels indicating a normal link among the M Y_j link quality levels is less than the number of levels indicating a normal link among the N-1 X_i link quality levels, or when the service of the first active device itself is not impaired, the first active device does not back up data to the standby device within a preset silent time window.
With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the method further includes:
after the silent time window expires, the first active device obtains the first link quality level again; and
when the first link quality level indicates that the link is normal, the first active device notifies the standby device to switch services back to the first active device.
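The silent time window and the re-check after it expires might be modeled as follows. This is a minimal sketch, not the patented implementation: `SilentWindow`, `may_send_backup`, `action_after_expiry`, and the 30-second duration are all hypothetical, and time is passed in explicitly so the behavior is deterministic. The text does not say what happens when the link is still abnormal after the window expires, so the sketch returns `None` in that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SilentWindow:
    """Preset silent time window during which no backup data is sent."""
    duration_s: float
    started_at: Optional[float] = None

    def start(self, now: float) -> None:
        self.started_at = now

    def is_open(self, now: float) -> bool:
        # The window is open from start() until duration_s seconds have passed.
        return self.started_at is not None and now - self.started_at < self.duration_s


def may_send_backup(window: SilentWindow, now: float) -> bool:
    """Backup data is held back while the silent window is open."""
    return not window.is_open(now)


def action_after_expiry(first_link_normal: bool) -> Optional[str]:
    """After expiry the first link quality level is obtained again.

    Only the 'normal' case is described in the text; the other case is left
    unspecified here (None).
    """
    return "notify_standby_switch_back" if first_link_normal else None


window = SilentWindow(duration_s=30.0)  # hypothetical duration
window.start(now=100.0)
print(may_send_backup(window, now=110.0))  # False
print(may_send_backup(window, now=130.0))  # True
print(action_after_expiry(first_link_normal=True))  # notify_standby_switch_back
```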
With reference to the first aspect, the first possible implementation of the first aspect, or the second possible implementation of the first aspect, in a third possible implementation of the first aspect, the method further includes:
within a waiting time window that starts when the first active device sends the request message to each of the other N-1 active devices, the first active device suspends sending backup data to the standby device.
According to a second aspect, a hot backup method is provided, including:
a standby device obtains a first link quality level, where the first link quality level indicates the link quality between a first active device and the standby device;
when the first link quality level indicates that the link is abnormal, the standby device obtains N-1 Y_j link quality levels between the standby device and each of the other N-1 active devices other than the first active device, where the Y_j link quality level indicates the link quality between the standby device and a j-th active device, N is greater than or equal to 2, and j is greater than 1 and less than or equal to N;
the standby device sends a request message to each of the other N-1 active devices, where the request message requests each of the other N-1 active devices to send the link quality level between that active device and the first active device;
the standby device receives M X_i link quality levels sent by the other N-1 active devices, where the X_i link quality level indicates the link quality between an i-th active device and the first active device, i is greater than 1 and less than or equal to N, and M is greater than or equal to zero and less than or equal to N-1; and
when the number of levels indicating an abnormal link among the first link quality level and the N-1 Y_j link quality levels is less than the number of levels indicating an abnormal link among the M X_i link quality levels, and the number of levels indicating a normal link among the N-1 Y_j link quality levels is greater than or equal to the number of levels indicating a normal link among the M X_i link quality levels, the standby device raises its access priority, where the raised access priority instructs users of the first active device to switch to the standby device.
With reference to the second aspect, in a first possible implementation of the second aspect, the method further includes:
when the number of levels indicating an abnormal link among the first link quality level and the N-1 Y_j link quality levels is greater than or equal to the number of levels indicating an abnormal link among the M X_i link quality levels, or the number of levels indicating a normal link among the N-1 Y_j link quality levels is less than the number of levels indicating a normal link among the M X_i link quality levels, the standby device suspends receiving backup data sent by the first active device within a preset silent time window.
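From the standby device's side, the second aspect and its first possible implementation reduce to a two-way choice. The sketch below is illustrative only: `standby_action`, the string constants, and the returned action names are hypothetical, and levels are again simplified to "normal" and "abnormal".

```python
from typing import Sequence

ABNORMAL, NORMAL = "abnormal", "normal"


def standby_action(first: str, y_levels: Sequence[str], x_levels: Sequence[str]) -> str:
    """Decide what the standby device does once its link to the first active
    device has been judged abnormal.

    first    -- level of the link standby <-> first active device
    y_levels -- N-1 levels measured by the standby towards the other active devices
    x_levels -- M levels reported by other active devices for their link to the
                first active device (M may be 0 if nobody answered)
    """
    own_abnormal = [first, *y_levels].count(ABNORMAL)
    reported_abnormal = list(x_levels).count(ABNORMAL)
    own_normal = list(y_levels).count(NORMAL)
    reported_normal = list(x_levels).count(NORMAL)

    if own_abnormal < reported_abnormal and own_normal >= reported_normal:
        # The first active device looks unreachable from its peers as well.
        return "raise_access_priority"
    # Otherwise hold off for the preset silent time window.
    return "pause_backup_reception"


print(standby_action(ABNORMAL, [NORMAL, NORMAL], [ABNORMAL, ABNORMAL]))  # raise_access_priority
print(standby_action(ABNORMAL, [NORMAL, NORMAL], [NORMAL, NORMAL]))      # pause_backup_reception
print(standby_action(ABNORMAL, [NORMAL, NORMAL], []))                    # pause_backup_reception
```

The last call shows the M = 0 case: with no reports from the other active devices, the condition cannot be met and the standby device stays passive.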
According to a third aspect, a first active device is provided, including:
a first obtaining module, configured to obtain a first link quality level, where the first link quality level indicates the link quality between the first active device and a standby device;
a judging module, configured to judge whether the first link quality level indicates that the link is abnormal;
a second obtaining module, configured to: when the judging module judges that the first link quality level indicates that the link is abnormal, obtain N-1 X_i link quality levels between the first active device and the other N-1 active devices, where the X_i link quality level indicates the link quality between the first active device and an i-th active device, N is greater than or equal to 2, and i is greater than 1 and less than or equal to N;
a sending module, configured to send a request message to each of the other N-1 active devices, where the request message requests each of the other N-1 active devices to send the link quality level between that active device and the standby device;
a receiving module, configured to receive M Y_j link quality levels sent by M of the other N-1 active devices, where the Y_j link quality level indicates the link quality between a j-th active device and the standby device, j is greater than 1 and less than or equal to N, and M is greater than or equal to zero and less than or equal to N-1; and
a processing module, configured to switch services to the standby device when the number of levels indicating an abnormal link among the first link quality level and the M Y_j link quality levels is less than the number of levels indicating an abnormal link among the N-1 X_i link quality levels, the number of levels indicating a normal link among the M Y_j link quality levels is greater than or equal to the number of levels indicating a normal link among the N-1 X_i link quality levels, and the service of the first active device is impaired.
With reference to the third aspect, in a first possible implementation of the third aspect, the processing module is further configured to not back up data to the standby device within a preset silent time window when the number of levels indicating an abnormal link among the first link quality level and the M Y_j link quality levels is greater than or equal to the number of levels indicating an abnormal link among the N-1 X_i link quality levels, or when the number of levels indicating a normal link among the M Y_j link quality levels is less than the number of levels indicating a normal link among the N-1 X_i link quality levels, or when the service of the first active device itself is not impaired.
With reference to the first possible implementation of the third aspect, in a second possible implementation of the third aspect, the first obtaining module is further configured to obtain the first link quality level again after the silent time window expires; and
the sending module is further configured to notify the standby device to switch services back to the first active device when the judging module judges that the first link quality level indicates that the link is normal.
With reference to the third aspect, the first possible implementation of the third aspect, or the second possible implementation of the third aspect, in a third possible implementation of the third aspect, the sending module is further configured to suspend sending backup data to the standby device within a waiting time window that starts when the request message is sent to each of the other N-1 active devices.
According to a fourth aspect, a standby device is provided, including:
a first obtaining module, configured to obtain a first link quality level, where the first link quality level indicates the link quality between a first active device and the standby device;
a judging module, configured to judge whether the first link quality level indicates that the link is abnormal;
a second obtaining module, configured to: when the judging module judges that the first link quality level indicates that the link is abnormal, obtain N-1 Y_j link quality levels between the standby device and each of the other N-1 active devices other than the first active device, where the Y_j link quality level indicates the link quality between the standby device and a j-th active device, N is greater than or equal to 2, and j is greater than 1 and less than or equal to N;
a sending module, configured to send a request message to each of the other N-1 active devices, where the request message requests each of the other N-1 active devices to send the link quality level between that active device and the first active device;
a receiving module, configured to receive M X_i link quality levels sent by the other N-1 active devices, where the X_i link quality level indicates the link quality between an i-th active device and the first active device, i is greater than 1 and less than or equal to N, and M is greater than or equal to zero and less than or equal to N-1; and
a processing module, configured to raise the access priority when the number of levels indicating an abnormal link among the first link quality level and the N-1 Y_j link quality levels is less than the number of levels indicating an abnormal link among the M X_i link quality levels, and the number of levels indicating a normal link among the N-1 Y_j link quality levels is greater than or equal to the number of levels indicating a normal link among the M X_i link quality levels, where the raised access priority instructs users of the first active device to switch to the standby device.
With reference to the fourth aspect, in a first possible implementation of the fourth aspect, the processing module is further configured to suspend receiving backup data sent by the first active device within a preset silent time window when the number of levels indicating an abnormal link among the first link quality level and the N-1 Y_j link quality levels is greater than or equal to the number of levels indicating an abnormal link among the M X_i link quality levels, or the number of levels indicating a normal link among the N-1 Y_j link quality levels is less than the number of levels indicating a normal link among the M X_i link quality levels.
According to a fifth aspect, a communication system is provided, including N active devices and the standby device provided in the fourth aspect or the first possible implementation of the fourth aspect, where any one of the N active devices is the first active device provided in the third aspect or any possible implementation of the third aspect.
The hot backup method, first active device, standby device, and communication system provided in the embodiments of the present invention take into account the link states between the first active device and each of the other active devices, as well as the link states between each active device and the standby device. The cause of a link problem between the first active device and the standby device can therefore be determined accurately, and more reasonable hot backup handling can be performed. Packet loss on the link between the first active device and the standby device does not lead to erroneous hot backup handling, which saves system resources.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the following briefly introduces the accompanying drawings needed for describing the embodiments or the prior art. Apparently, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1A is a schematic diagram of BRAS dual-device hot backup;
FIG. 1B is another schematic diagram of BRAS dual-device hot backup;
FIG. 2 is a schematic diagram of a multi-device hot backup network including N active devices and one standby device;
FIG. 3 is a flowchart of a hot backup method according to an embodiment of the present invention;
FIG. 4 is a flowchart of a hot backup method according to an embodiment of the present invention;
FIG. 5 is a flowchart of a hot backup method according to an embodiment of the present invention;
FIG. 6 is a flowchart of a hot backup method according to an embodiment of the present invention;
FIG. 7 is a flowchart of a hot backup method according to an embodiment of the present invention;
FIG. 8 is a flowchart of a hot backup method according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a first active device according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a standby device according to an embodiment of the present invention;
FIG. 11 is a schematic structural diagram of a first active device according to an embodiment of the present invention;
FIG. 12 is a schematic structural diagram of a standby device according to an embodiment of the present invention;
FIG. 13 is a schematic structural diagram of a communication system according to an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the following clearly and completely describes the technical solutions in the embodiments with reference to the accompanying drawings. Apparently, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Hot backup is a commonly used technique in communication networks. Taking a BRAS that implements hot backup with VRRP as an example, the basic procedure of BRAS dual-device hot backup is shown in FIG. 1A and FIG. 1B. FIG. 1A is a schematic diagram of BRAS dual-device hot backup, and FIG. 1B is another schematic diagram of BRAS dual-device hot backup.
In FIG. 1A and FIG. 1B, terminal 11 communicates with switch 13 through access device 12. Switch 13 can communicate with BRAS 14 and BRAS 15. Both BRAS 14 and BRAS 15 access core network 17 through core router (Core Router, CR) 16.
BRAS 14 is the active BRAS and BRAS 15 is the standby BRAS. BRAS 14 handles the access of terminal 11 and carries user services. BRAS 15 synchronizes user session and other database information from BRAS 14, such as the user's media access control (Media Access Control, MAC) and Internet Protocol (IP) addresses, the Dynamic Host Configuration Protocol (DHCP) lease, and DHCP option 82. Through routing policy, the user routes advertised by BRAS 14 can be given a higher priority than those advertised by BRAS 15, so that downlink traffic from the network side to terminal 11 is preferentially forwarded by BRAS 14, as shown by path 18 in FIG. 1A.
When BRAS 14 fails (the failure may be a link failure or a failure of the whole device), VRRP and redundant user information (Redundant User Information, RUI) undergo an active/standby switchover: BRAS 15 becomes the new active BRAS and BRAS 14 becomes the new standby BRAS. BRAS 15 advertises the users' IP routes with high priority, and BRAS 14 withdraws the users' IP routes or lowers their priority, so that downlink traffic from the network side to terminal 11 is switched to BRAS 15. On the user access side, BRAS 15 sends Address Resolution Protocol (ARP) messages to update the MAC table in switch 13, so that uplink traffic from terminal 11 is also switched to BRAS 15, as shown by path 19 in FIG. 1B. Terminal 11 does not perceive this protection switchover; that is, terminal 11 does not need to dial up again to go online. After BRAS 14 recovers from the failure, the users connected to BRAS 15 can be switched back to BRAS 14.
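The two halves of the switchover (downlink via route priority, uplink via the switch's MAC table) can be mimicked with two dictionaries. This is a toy model under stated assumptions, not router code: the preference values, the convention that a higher value wins, the MAC string, the port names, and the functions `best_next_hop` and `fail_over` are all hypothetical.

```python
from typing import Dict, Tuple


def best_next_hop(route_preference: Dict[str, int]) -> str:
    """Downlink: choose the BRAS advertising the user route with the highest
    preference (assumed convention: larger value wins)."""
    return max(route_preference, key=lambda name: route_preference[name])


def fail_over(
    routes: Dict[str, int], mac_table: Dict[str, str], gateway_mac: str
) -> Tuple[Dict[str, int], Dict[str, str]]:
    """Model the switchover from BRAS 14 to BRAS 15 described in the text."""
    # BRAS 14 withdraws its user route; BRAS 15 advertises with high priority.
    new_routes = {name: pref for name, pref in routes.items() if name != "BRAS 14"}
    new_routes["BRAS 15"] = max(routes.values()) + 1
    # The ARP message from BRAS 15 makes switch 13 relearn the gateway MAC.
    new_mac_table = dict(mac_table)
    new_mac_table[gateway_mac] = "port-to-BRAS-15"
    return new_routes, new_mac_table


routes = {"BRAS 14": 200, "BRAS 15": 100}
mac_table = {"00-00-5e-00-01-01": "port-to-BRAS-14"}
print(best_next_hop(routes))                      # BRAS 14

routes, mac_table = fail_over(routes, mac_table, "00-00-5e-00-01-01")
print(best_next_hop(routes))                      # BRAS 15
print(mac_table["00-00-5e-00-01-01"])             # port-to-BRAS-15
```

Because terminal 11 keeps using the same gateway MAC and IP address before and after, it has no reason to notice the change.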
A BRAS uses several important modules to implement multi-device hot backup with VRRP:
Module 1: VRRP module. VRRP is deployed between every two BRASs to elect the active and standby roles. Once VRRP is deployed, the active BRAS sends VRRP heartbeat packets to the standby BRAS. These are Layer 2 multicast packets, so a Layer 2 channel must be established between the active BRAS and the standby BRAS. If both are attached to the same Layer 2 network, for example a switch, a Layer 2 channel already exists between them. If the active BRAS and the standby BRAS are far apart, or no channel through a Layer 2 network exists between them, a Layer 2 virtual private network (Layer 2 Virtual Private Network, L2VPN) tunnel across the metropolitan area network must be deployed between them. The VRRP module can track the state of the access interface, such as available (up) or unavailable (down), to dynamically adjust the VRRP priority. The VRRP module can also be linked with detection techniques such as Bidirectional Forwarding Detection (BFD) to trigger re-election of the active and standby BRASs and to trigger service switchover.
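How interface tracking can change the outcome of the election is easiest to see with numbers. The following sketch assumes a common vendor-style scheme in which a tracked interface going down subtracts a configured amount from the base priority; the `Router` class, the priority values, the decrement, and the 192.0.2.x addresses are hypothetical. The election itself follows the usual VRRP rule that the highest priority wins, with the higher IP address as tie-break.

```python
import ipaddress
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Router:
    name: str
    ip: str
    base_priority: int          # configured VRRP priority (1..254)
    decrement: int = 0          # amount subtracted while the tracked interface is down
    tracked_up: bool = True     # state of the tracked access interface

    def effective_priority(self) -> int:
        priority = self.base_priority
        if not self.tracked_up:
            priority -= self.decrement
        # Keep the result inside the range usable by non-owner routers.
        return max(1, min(254, priority))


def elect_master(routers: Iterable[Router]) -> Router:
    """Highest effective priority wins; the higher IP address breaks ties."""
    return max(
        routers,
        key=lambda r: (r.effective_priority(), int(ipaddress.ip_address(r.ip))),
    )


bras14 = Router("BRAS 14", "192.0.2.14", base_priority=120, decrement=30)
bras15 = Router("BRAS 15", "192.0.2.15", base_priority=100)

print(elect_master([bras14, bras15]).name)   # BRAS 14

bras14.tracked_up = False                    # access interface goes down: 120 - 30 = 90
print(elect_master([bras14, bras15]).name)   # BRAS 15
```

The decrement has to be larger than the priority gap between the two devices, otherwise the interface failure would not change the election result.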
Module 2: Remote Backup Service (RBS) module. The RBS module transmits backup data between two BRASs. It is implemented over the Transmission Control Protocol (TCP) and provides services with batch backup and real-time backup mechanisms. Batch backup takes place after the TCP connection is successfully established. Real-time backup takes place when a user successfully goes online, when an online user's attributes are successfully changed, or when the backup period for user statistics expires.
Module 3: RUI processing module. The RUI processing module backs up user information, tracks the VRRP state, determines the active or standby state of the devices serving the access users, and, based on that state, controls forwarding-route switchover, accounting, and so on.
VRRP can effectively determine the active/standby relationship between two BRASs. Combined with detection techniques such as BFD, it ensures that the standby BRAS quickly detects a failure of the active BRAS and quickly switches over. Specifically, as shown in FIG. 1A and FIG. 1B, VRRP heartbeat packets are forwarded through the switch attached below the two BRASs, while BFD tracks the network-side link of the active BRAS. The standby BRAS can therefore quickly detect a whole-device failure of the active BRAS, a failure of its link to the switch, or a failure of its link to the network-side CR device. More complex multi-device hot backup networks likewise rely on VRRP combined with BFD for rapid fault detection and switchover.
In current multi-device hot backup technology, VRRP combined with BFD can handle only fault types such as link interruption and whole-device failure; for faults such as packet loss on a link, it may make a wrong judgment. FIG. 2 is a schematic diagram of a multi-device hot backup network containing N active devices and one standby device. Taking a network with three active BRASs and one standby BRAS as an example, the active BRASs in FIG. 2 are BRAS 21, BRAS 22, and BRAS 23, and the standby BRAS is BRAS 24. BRAS 21, BRAS 22, and BRAS 23 each establish VRRP and RBS with BRAS 24, forming three device pairs, and the two BRASs in each pair have an active/standby relationship. Either BRAS in a pair can judge the state of its peer from VRRP and BFD, and decides whether to establish the RBS connection according to whether the route is reachable. Because these detection techniques take effect only on a link failure or a whole-device failure, neither BRAS in a pair can accurately judge the state of its peer when the intermediate network suffers severe packet loss.
Specifically, when the RBS connection repeatedly drops or BFD flaps, the active BRAS still considers itself the VRRP master and keeps that role. Because BFD flaps and the VRRP heartbeats sent by the active BRAS arrive only occasionally, the VRRP state of the standby BRAS keeps flapping. From this information alone, however, neither the active BRAS nor the standby BRAS can tell whether the problem lies with the peer or with the network where the peer resides. The standby BRAS cannot accurately tell whether it is itself at fault either, unless it also serves non-RUI users and a large number of them go offline, in which case the standby BRAS can learn that its own network has failed.
VRRP cannot respond correctly in the following three scenarios. (1) The network where the active BRAS resides drops packets, causing RBS to flap. The active BRAS should become the standby BRAS and an RUI switchover should be performed, but because packets are merely being lost and the link is not down, no VRRP switchover is triggered. (2) The network where the standby BRAS resides is faulty and drops packets, causing RBS to flap. The RUI active/standby state remains unchanged and no further data should be backed up to the standby BRAS, but in practice backup restarts whenever RBS recovers, which wastes resources. (3) The network between the active BRAS and the standby BRAS is faulty, causing RBS to flap. Neither side can guarantee reachability to all users: some users can reach the active BRAS and others can reach the standby BRAS. RUI should then serve all users, but in practice VRRP keeps flapping, the active and standby roles cannot stabilize, and data is backed up between the two BRASs continuously.
It follows that VRRP makes the correct backup choice only for a whole-device failure or a link interruption; when the device or link has not failed completely, VRRP may make a wrong judgment, which may waste resources.
In a multi-device hot backup scenario, the links between the multiple active devices and the one standby device may be degraded to varying degrees, but measurements made only by the two devices at the ends of a link may not reveal which device is responsible. To make the correct backup choice, the device causing the link problem must therefore be identified first. In an N:1 multi-device hot backup scenario, if an active BRAS finds that the packet loss rate between itself and the other active BRASs is low while the packet loss rate between itself and the standby BRAS is high, the standby BRAS is the device causing the link problem. If all the active BRASs and the standby BRAS find that the packet loss rate between themselves and one particular active BRAS is high, that active BRAS is the device causing the link problem. A performance detection protocol can therefore be configured between all the BRASs, so that these two checks can be applied whenever VRRP flapping or RBS flapping occurs.
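The two checks described above can be sketched as a simple vote over pairwise loss measurements. This is only an illustration: the function name `suspect_devices`, the device labels, and the 5% loss threshold are assumptions and do not come from the patent.

```python
from collections import Counter


def suspect_devices(loss, threshold=0.05):
    """Return the device(s) that sit on the largest number of lossy links.

    loss maps a pair of device names to the measured packet loss rate
    (0.0 to 1.0) on the link between them. The threshold is an assumed value.
    """
    votes = Counter()
    for (a, b), rate in loss.items():
        if rate > threshold:
            # A lossy link counts against both of its endpoints.
            votes[a] += 1
            votes[b] += 1
    if not votes:
        return []
    top = max(votes.values())
    return sorted(dev for dev, count in votes.items() if count == top)


# Three active devices (A1..A3) and one standby device (B):
# every link to B is lossy, links among the active devices are clean.
standby_faulty = {
    ("A1", "A2"): 0.0, ("A1", "A3"): 0.001, ("A2", "A3"): 0.0,
    ("A1", "B"): 0.2, ("A2", "B"): 0.3, ("A3", "B"): 0.25,
}
print(suspect_devices(standby_faulty))
```

With a single lossy link the function returns both endpoints, which mirrors the observation that two devices alone cannot tell which of them is at fault.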
Based on this idea, the embodiments of the present invention provide a hot backup method that can accurately determine which device is causing a link fault and then perform hot backup processing appropriately.
FIG. 3 is a flowchart of a hot backup method provided by an embodiment of the present invention. The method is applied to a multi-device hot backup system in which the ratio of active devices to standby devices is N:1, where N ≥ 2. The method is described in detail below with reference to FIG. 3.
S301. The first active device obtains a first link quality level, which indicates the link quality between the first active device and the standby device.
Specifically, the hot backup method provided in this embodiment is applied to a multi-device hot backup system that includes N active devices and one standby device, where N ≥ 2.
The method is executed by any one of the active devices, referred to here as the first active device. The first active device first obtains the first link quality level, which indicates the link quality between the first active device and the standby device. The level can be expressed using various parameters, such as the packet loss rate or the data throughput between the two devices, and it tells the first active device whether the link to the standby device is normal. The first link quality level may be divided into several grades, for example normal, abnormal, and unavailable, where unavailable is the state in which the first active device cannot obtain the first link quality level. Alternatively, the level may be a numerical value or a ratio, with a corresponding value range defined for each of the normal, abnormal, and unavailable states; these are not enumerated here.
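As a minimal sketch of how a measured parameter might be mapped to the three grades mentioned above, the following assumes that the parameter is a packet loss rate and that a missing measurement means the level could not be obtained. The names `LinkQuality` and `classify_link` and the 1% threshold are illustrative assumptions, not values given in the patent.

```python
from enum import Enum
from typing import Optional


class LinkQuality(Enum):
    NORMAL = "normal"
    ABNORMAL = "abnormal"
    UNAVAILABLE = "unavailable"


def classify_link(loss_rate: Optional[float],
                  abnormal_threshold: float = 0.01) -> LinkQuality:
    """Map a packet loss rate (0.0 to 1.0) to a link quality level.

    None means no measurement could be obtained, which corresponds to the
    "unavailable" state. The threshold is an assumed, configurable value.
    """
    if loss_rate is None:
        return LinkQuality.UNAVAILABLE
    if loss_rate >= abnormal_threshold:
        return LinkQuality.ABNORMAL
    return LinkQuality.NORMAL


print(classify_link(0.0), classify_link(0.2), classify_link(None))
```

A deployment that uses throughput or a combined score instead of loss rate would only need a different comparison inside `classify_link`.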
S302. When the first link quality level indicates that the link is abnormal, the first active device obtains N-1 X_i-th link quality levels between itself and the other N-1 active devices. The X_i-th link quality level indicates the link quality between the first active device and the i-th active device, where N ≥ 2 and 1 < i ≤ N.
Specifically, if the first link quality level obtained by the first active device indicates that the link is normal, the link between the first active device and the standby device is in a normal state, and processing follows the normal hot backup procedure.
If the first link quality level indicates that the link is abnormal, there is a problem on the link between the first active device and the standby device, and the first active device needs to determine whether the problem is caused by a fault in the standby device or by a fault in the first active device itself.
For example, the first active device first obtains the N-1 X_i-th link quality levels between itself and the other N-1 active devices, where the X_i-th link quality level indicates the link quality between the first active device and the i-th active device, N ≥ 2, and 1 < i ≤ N. Because the system contains N-1 active devices besides the first active device, the first active device obtains N-1 such levels. If W of them are abnormal and S of them are normal, then W + S = N-1.
S303. The first active device sends a request message to each of the other N-1 active devices, requesting each of them to send the link quality level between itself and the standby device.
Specifically, the first active device also needs to know the link quality levels, that is, the link states, between the other N-1 active devices and the standby device. It therefore sends a request message to each of the other N-1 active devices.
Note that the abnormal link between the first active device and the standby device may be caused by a fault in the first active device, and the same fault may also make the links between the first active device and the other N-1 active devices abnormal. The request messages sent by the first active device are therefore not necessarily received by all of the other N-1 active devices.
S304. The first active device receives M Y_j-th link quality levels sent by M of the other N-1 active devices. The Y_j-th link quality level indicates the link quality between the j-th active device and the standby device, where 1 < j ≤ N and 0 ≤ M ≤ N-1.
Specifically, after sending the request messages to the other N-1 active devices, the first active device receives the Y_j-th link quality level sent by the j-th active device, which indicates the link quality between the j-th active device and the standby device. Because the links between the first active device and the other N-1 active devices may themselves be faulty, the first active device may not receive the Y_j-th link quality level from every j-th active device, and so may receive fewer than N-1 link quality levels. Suppose the first active device receives M link quality levels in total, where M ≤ N-1. If P of them are abnormal and Q of them are normal, then P + Q ≤ N-1.
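The counts W, S, P, and Q used in the following steps can be tallied as sketched here. The string labels, the dictionary layout keyed by device index, and the sample values are illustrative assumptions; the sample models a case with N = 4 in which the third active device never replies, so M = 2.

```python
def tally(levels):
    """Return (abnormal_count, normal_count) for an iterable of levels."""
    levels = list(levels)
    abnormal = sum(1 for level in levels if level == "abnormal")
    normal = sum(1 for level in levels if level == "normal")
    return abnormal, normal


N = 4  # assumed number of active devices

# X_i: links measured by the first active device to active devices 2..N.
x_levels = {2: "abnormal", 3: "normal", 4: "normal"}

# Y_j: levels reported back by the other active devices for their own link
# to the standby device. Device 3 did not reply, so only M = 2 entries exist.
y_replies = {2: "normal", 4: "abnormal"}

W, S = tally(x_levels.values())
P, Q = tally(y_replies.values())
M = len(y_replies)

print(f"W={W} S={S} P={P} Q={Q} M={M}")
```

Missing replies are simply absent from `y_replies`, which is why P + Q can be smaller than N-1 while W + S is always N-1.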
S305. When the number of link-abnormal indications among the first link quality level and the M Y_j-th link quality levels is less than the number of link-abnormal indications among the N-1 X_i-th link quality levels, the number of link-normal indications among the M Y_j-th link quality levels is greater than or equal to the number of link-normal indications among the N-1 X_i-th link quality levels, and the service of the first active device is impaired, the first active device switches its services to the standby device.
For example, the M Y_j-th link quality levels are the link quality levels between each of the M active devices and the standby device. If M is greater than 3, they are the Y_2-th link quality level, the Y_3-th link quality level, ..., and the Y_(M+1)-th link quality level. The Y_2-th link quality level indicates the state of the link from the second active device to the standby device, the Y_3-th link quality level indicates the state of the link from the third active device to the standby device, and the Y_(M+1)-th link quality level indicates the state of the link from the (M+1)-th active device to the standby device. The cases for other values of M are not illustrated here.
For example, the N-1 X_i-th link quality levels are the link quality levels between the first active device and each of the other N-1 active devices. If N is greater than 3, they are the X_2-th link quality level, the X_3-th link quality level, ..., and the X_N-th link quality level. The X_2-th link quality level indicates the state of the link between the first active device and the second active device, the X_3-th link quality level indicates the state of the link between the first active device and the third active device, and the X_N-th link quality level indicates the state of the link between the first active device and the N-th active device. The cases for other values of N are not illustrated here.
Specifically, once the first active device has obtained the first link quality level, the N-1 X_i-th link quality levels, and the M Y_j-th link quality levels, it can determine which device is responsible for the problem on the link between itself and the standby device. The first condition is that the number of link-abnormal indications among the first link quality level and the M Y_j-th link quality levels is less than the number of link-abnormal indications among the N-1 X_i-th link quality levels, that is, P + 1 < W. The 1 is added to P because the first active device itself also considers its link to the standby device abnormal. In other words, fewer link quality levels report the standby device's links as abnormal than report the first active device's links as abnormal. The second condition is that the number of link-normal indications among the M Y_j-th link quality levels is greater than or equal to the number of link-normal indications among the N-1 X_i-th link quality levels, that is, Q ≥ S. In other words, at least as many link quality levels report the standby device's links as normal as report the first active device's links as normal. Taken together, the standby device is less likely than the first active device to have abnormal links and at least as likely to have normal links. If, in addition, the first active device finds that its own service is impaired, it determines that the problem on the link between itself and the standby device is caused by its own fault, and it therefore switches its services to the standby device.
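The switchover condition of S305 reduces to three tests on the counts P, Q, W, and S defined earlier. The sketch below is one way to express it; the function name and argument names are assumptions made for illustration.

```python
def should_switch_to_standby(first_link_abnormal, p, q, w, s, service_impaired):
    """Evaluate the S305 condition from the first active device's viewpoint.

    p, q: abnormal / normal counts among the M received Y_j levels.
    w, s: abnormal / normal counts among the N-1 measured X_i levels.
    """
    if not first_link_abnormal:
        # S302 onward applies only when the link to the standby is abnormal.
        return False
    # The first active device's own abnormal link to the standby counts as
    # one more abnormal report against the standby device, hence p + 1.
    fewer_abnormal_for_standby = (p + 1) < w
    no_fewer_normal_for_standby = q >= s
    return (fewer_abnormal_for_standby
            and no_fewer_normal_for_standby
            and service_impaired)


# N = 4: all three links from the first active device to its peers are
# abnormal (W = 3, S = 0), while all three peers report a normal link to the
# standby device (P = 0, Q = 3), and the device's own service is impaired.
print(should_switch_to_standby(True, p=0, q=3, w=3, s=0, service_impaired=True))
```

In the sample call, P + 1 = 1 is less than W = 3 and Q = 3 is at least S = 0, so the first active device would attribute the fault to itself and hand its services over.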
In addition to evaluating the link quality levels, the first active device switches services only after determining that its own service is impaired. The reason is that in an N:1 multi-device hot backup system the standby device is normally kept in the standby state. Services are switched to it only when the first active device really has a problem, because the standby device is not suited to acting as an active device in the network for a long time. If the service of the first active device is not impaired, it can still provide network service to users; the possible problems on its links to the other active devices and to the standby device affect only the hot backup of data. To avoid switching users frequently and to keep the system stable, the services of the first active device are switched to the standby device only when its service is impaired and the problem on the link between it and the standby device is caused by a fault in the first active device. The first active device can determine whether its service is impaired by checking whether connected users have gone offline, whether there has been no data transmission for a long time, or whether data transmission with other connected users is normal. In other words, the first active device determines that its service is impaired when it detects that connected users have gone offline, that no data has been transmitted for a long time, or that data transmission with other connected users is abnormal.
The first active device can switch its services to the standby device using a conventional service switchover method, which is not described here.
The hot backup method provided in this embodiment collects the states of the links between the first active device and the other active devices, as well as the states of the links between each active device and the standby device. It can therefore accurately determine the cause of a problem on the link between the first active device and the standby device and handle hot backup more appropriately. Packet loss on that link no longer leads to incorrect hot backup processing, which saves system resources.
Further, in the embodiment shown in FIG. 3, after the first active device switches its services to the standby device, the first active device changes to the standby state and the standby device changes to the active state. At this point the first active device may also start a preset silent time window, during which it does not switch back to the active state. This avoids frequent switching between the active device and the standby device.
Further, after the silent time window expires, the first active device obtains the first link quality level between itself and the standby device again. If the first link quality level is normal, the first active device notifies the standby device to switch the services back to the first active device. The length of the silent time window can be set from experience; it can generally be set to the mean time to repair of the system.
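The switchback rule can be sketched as a small timer object plus a check that runs after the window expires. The class and function names and the 600-second duration are illustrative assumptions; the current time is passed in explicitly so the logic does not depend on any particular clock.

```python
class SilentWindow:
    """Tracks a preset silent time window started after a switchover."""

    def __init__(self, duration_s):
        self.duration_s = duration_s
        self.started_at = None

    def start(self, now):
        self.started_at = now

    def expired(self, now):
        # A window that was never started is not considered expired.
        return (self.started_at is not None
                and now - self.started_at >= self.duration_s)


def may_switch_back(window, now, first_link_level):
    """Allow switchback only after the window expires and the link is normal."""
    return window.expired(now) and first_link_level == "normal"


window = SilentWindow(duration_s=600)  # assumed: roughly the mean time to repair
window.start(now=1000)
print(may_switch_back(window, now=1300, first_link_level="normal"))
print(may_switch_back(window, now=1600, first_link_level="normal"))
```

The first call falls inside the window and is refused; the second falls after it and is allowed because the re-measured link level is normal.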
Further, in the embodiment shown in FIG. 3, the first active device may trigger the switchover of services to the standby device as follows: the first active device lowers its access priority or initializes itself, which causes the user equipment connected to the first active device to switch to the standby device. That is, once the first active device determines that its own fault is causing the problem on the link to the standby device, it can actively lower its own access priority so that the user equipment connected to it switches to the standby device, completing the service switchover. Alternatively, the first active device can initialize itself, which likewise causes the user equipment connected to it to switch to the standby device. When lowering its access priority, the first active device must lower it below the access priority of the standby device; only then will the users originally connected to the first active device switch to the standby device.
Further, after the first active device lowers its access priority or initializes itself, the hot backup method provided in this embodiment further includes: the first active device sends the standby device a message indicating that it has lowered its access priority or initialized itself. The standby device thereby learns of the change and can prepare to accept the user equipment that was originally connected to the first active device. Note that because the link between the first active device and the standby device is faulty, the standby device may not receive this message. However, because the first active device has already lowered its access priority, the user equipment can still be switched over even if the message is not received.
FIG. 4 is a flowchart of a hot backup method provided by an embodiment of the present invention. On the basis of the method provided in the embodiment shown in FIG. 3, the hot backup method provided in this embodiment may further include:
S401. When the number of link-abnormal indications among the first link quality level and the M Y_j-th link quality levels is greater than or equal to the number of link-abnormal indications among the N-1 X_i-th link quality levels, or when the number of link-normal indications among the M Y_j-th link quality levels is less than the number of link-normal indications among the N-1 X_i-th link quality levels, or when the service of the first active device is not impaired, the first active device does not back up data to the standby device within a preset silent time window.
For example, once the first active device has obtained the first link quality level, the N-1 X_i-th link quality levels, and the M Y_j-th link quality levels, it can also make the following judgment. Suppose the number of link-abnormal indications among the first link quality level and the M Y_j-th link quality levels is greater than or equal to the number of link-abnormal indications among the N-1 X_i-th link quality levels, that is, P + 1 ≥ W (1 is added to P because the first active device itself also considers its link to the standby device abnormal). Then at least as many link quality levels report the standby device's links as abnormal as report the first active device's links as abnormal. The first active device can therefore conclude that the links between the standby device and the other N-1 active devices are in a worse state than the links between the first active device and the other N-1 active devices, and that the problem on the link between itself and the standby device is caused by a fault in the standby device.
As a further example, suppose the number of link-normal indications among the M Y_j-th link quality levels is less than the number of link-normal indications among the N-1 X_i-th link quality levels, that is, Q < S. Then fewer link quality levels report the standby device's links as normal than report the first active device's links as normal. The first active device can again conclude that the links from the standby device to the other N-1 active devices are in a worse state than the links from the first active device to the other N-1 active devices, and that the problem on the link from itself to the standby device is caused by a fault in the standby device. In other words, the collective vote of the devices is either that the standby device is at least as likely as the first active device to have abnormal links, or that the standby device is less likely than the first active device to have normal links.
那么,当第一主用设备确定第一主用设备到备用设备的链路问题是由于备用设备故障造成的,那么若第一主用设备继续向备用设备发送备份数据,则可能由于链路丢包等原因造成备份失败或者反复备份,从而造成资源浪费。那么此时第一主用设备将保持主用设备状态,而不进行热备份,即不向备用设备备份数据。Then, when the first active device determines that the link problem between the first active device and the backup device is caused by the failure of the backup device, if the first active device continues to send backup data to the backup device, it may be due to link loss. Backup failures or repeated backups due to reasons such as data packets, resulting in waste of resources. Then at this time, the first active device will maintain the state of the active device without performing hot backup, that is, not backing up data to the standby device.
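The two conditions stated above can be sketched as a single predicate. This is only an illustrative sketch: the function name and parameter names are hypothetical, and the comparison is transcribed literally from the text (P + 1 ≥ W, or Q < S), not taken from any reference implementation.

```python
def keeps_active_state(p: int, q: int, w: int, s: int) -> bool:
    """Return True when the first active device should stay active and skip hot backup.

    p, q: counts of abnormal / normal levels among the N-1 Xi-th link quality levels
          (P and Q in the text).
    w, s: counts of abnormal / normal levels among the M Yj-th link quality levels
          (W and S in the text).
    """
    # The +1 accounts for the first link quality level, which is already abnormal.
    abnormal_vote = (p + 1) >= w
    normal_vote = q < s
    return abnormal_vote or normal_vote
```

For instance, with P = 2, Q = 1, W = 3, S = 0 the first condition holds (3 ≥ 3), so the device keeps its active state.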
Further, the first active device may start a silent time window. Within the silent time window, the first active device performs no hot backup processing, that is, it no longer backs up data to the backup device. After the silent time window expires, the first active device may perform the hot backup processing of the embodiment shown in FIG. 3 again. The length of the silent time window may be set from empirical values, and may generally be set to the mean time to repair of the system.
Further, in the embodiment shown in FIG. 4, after the first active device determines that the link state from the backup device to the other N-1 active devices is worse than the link state from the first active device to the other N-1 active devices, the method further includes: the first active device sends the silent time window to the backup device, where the silent time window is used so that the backup device does not process received backup data within the silent time window.
Specifically, when the first active device determines that the link problem between itself and the backup device is caused by a fault of the backup device, decides to keep its active state, and starts the silent time window, it may also send the silent time window to the backup device. On receiving the silent time window, the backup device determines that the first active device will not perform hot backup within that window. During the window, the backup device does not process backup data even if it receives any.
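The silent time window behaviour described above can be illustrated with a small timer object. This is a minimal sketch under stated assumptions: `SilentWindow`, `handle_backup_data`, and the injectable `clock` parameter are hypothetical names introduced for illustration, and the text does not specify how the window is encoded when it is sent between devices.

```python
import time


class SilentWindow:
    """Tracks whether a silent time window is currently running."""

    def __init__(self, duration_s: float, clock=time.monotonic):
        self._duration_s = duration_s
        self._clock = clock          # injectable so the window can be tested
        self._started_at = None

    def start(self) -> None:
        self._started_at = self._clock()

    def active(self) -> bool:
        if self._started_at is None:
            return False
        return self._clock() - self._started_at < self._duration_s


def handle_backup_data(window: SilentWindow, payload: bytes, store: list) -> bool:
    """Backup-side handling: ignore backup data while the window is active."""
    if window.active():
        return False                 # received, but deliberately not processed
    store.append(payload)
    return True
```

The duration would typically be chosen as the system's mean time to repair, as the text suggests.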
Further, the method provided in the embodiment corresponding to FIG. 3 or in the embodiment corresponding to FIG. 4 further includes: the first active device suspends sending backup data to the backup device within a waiting time window that starts when it sends a request message to each of the other N-1 active devices.
Specifically, after the first active device sends the request messages to the other N-1 active devices, it does not yet know which device's fault caused the link problem between itself and the backup device. It may therefore suspend sending backup data to the backup device within a preset waiting time window. This avoids the resource waste that would result if the backup device were faulty and the first active device nevertheless kept backing up data to it.
FIG. 5 is a flowchart of a hot backup method provided by an embodiment of the present invention. The hot backup method of this embodiment is applied to a multi-device hot backup system in which the ratio of active devices to backup devices is N:1, where N is greater than or equal to 2. The method is described below with reference to FIG. 5.
S501. The backup device acquires a first link quality level, where the first link quality level represents the link quality between the first active device and the backup device.
The hot backup method of this embodiment is executed by the backup device. For the meaning of the first link quality level, see the corresponding content in the embodiment corresponding to FIG. 3; it is not repeated here.
S502. When the first link quality level indicates an abnormal link, the backup device acquires N-1 Yj-th link quality levels, one for the link between the backup device and each of the other N-1 active devices besides the first active device. The Yj-th link quality level represents the link quality between the backup device and the j-th active device, where N is greater than or equal to 2 and j is greater than 1 and less than or equal to N.
Specifically, after the backup device acquires the first link quality level, if that level indicates a normal link, the link state between the first active device and the backup device is normal, and processing can follow the normal hot backup procedure.
When the first link quality level indicates an abnormal link, there is a problem on the link between the first active device and the backup device, and the backup device needs to judge whether the problem is caused by a fault of the backup device or by a fault of the first active device. The backup device acquires the N-1 link quality levels between itself and the other N-1 active devices. The Yj-th of these N-1 link quality levels represents the link quality between the backup device and the j-th active device, where N is greater than or equal to 2 and j is greater than 1 and less than or equal to N. The meaning of the N-1 Yj-th link quality levels is the same as in the embodiment corresponding to FIG. 3 and is not repeated here. Of the N-1 Yj-th link quality levels that the backup device acquires, suppose p are abnormal and q are normal, so that p + q = N-1.
S503. The backup device sends a request message to each of the other N-1 active devices. The request message asks each of the other N-1 active devices to send the link quality level between itself and the first active device.
Specifically, by sending the request messages, the backup device learns the link quality levels between the other N-1 active devices and the first active device.
Note that the abnormal link state between the first active device and the backup device may be caused by a fault of the backup device, so the request messages sent by the backup device are not necessarily received by all of the other N-1 devices.
S504. The backup device receives M Xi-th link quality levels sent by the other N-1 active devices. The Xi-th link quality level represents the link quality between the i-th active device and the first active device, where i is greater than 1 and less than or equal to N, and M is greater than or equal to 0 and less than or equal to N-1.
Specifically, the meaning of the M Xi-th link quality levels is the same as in the embodiment corresponding to FIG. 3 and is not repeated here. After sending the request messages, the backup device receives from the other active devices the link quality levels between each of them and the first active device. Because the links between the backup device and the other N-1 active devices may themselves have problems, the backup device may not receive all N-1 link quality levels. Suppose the backup device receives M Xi-th link quality levels in total, where M is less than or equal to N-1. Of these, suppose w are abnormal and s are normal, with w + s ≤ N-1.
S505. When the number of levels indicating an abnormal link among the first link quality level and the N-1 Yj-th link quality levels is less than the number of levels indicating an abnormal link among the M Xi-th link quality levels, and the number of levels indicating a normal link among the N-1 Yj-th link quality levels is greater than or equal to the number of levels indicating a normal link among the M Xi-th link quality levels, the backup device raises its access priority. The raised access priority is used to direct the users of the first active device to switch to the backup device.
Specifically, once the backup device has acquired the first link quality level, the N-1 Yj-th link quality levels, and the M Xi-th link quality levels, it can judge which device caused the link problem between the first active device and the backup device. The first condition is that the number of levels indicating an abnormal link among the first link quality level and the N-1 Yj-th link quality levels is less than the number indicating an abnormal link among the M Xi-th link quality levels, that is, p + 1 < w. Here 1 is added to p because the backup device itself also considers the link quality level between itself and the first active device to indicate an abnormal link. The condition means that fewer link quality levels regard the backup device's links as abnormal than regard the first active device's links as abnormal. The second condition is that the number of levels indicating a normal link among the N-1 Yj-th link quality levels is greater than or equal to the number indicating a normal link among the M Xi-th link quality levels, that is, q ≥ s. It means that at least as many link quality levels regard the backup device's links as normal as regard the first active device's links as normal. In other words, the vote of the devices is that the backup device's links are less likely to be abnormal than those of the first active device, and at least as likely to be normal. When both conditions hold, the backup device determines that the link problem between the first active device and the backup device is caused by a fault of the first active device, and therefore determines that the services of the first active device need to be switched to the backup device.
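The S505 vote can be sketched as follows. This is an illustrative sketch only: the `Level` enum and the function name are hypothetical, and the lists stand in for the N-1 Yj-th levels measured by the backup device and the M Xi-th levels it received in replies.

```python
from enum import Enum


class Level(Enum):
    NORMAL = "normal"
    ABNORMAL = "abnormal"


def backup_should_take_over(y_levels: list, x_levels: list) -> bool:
    """Return True when the backup device should raise its access priority.

    y_levels: the N-1 Yj-th levels (backup device <-> other active devices).
    x_levels: the M Xi-th levels received (other active devices <-> first active device).
    """
    p = sum(1 for lv in y_levels if lv is Level.ABNORMAL)
    q = sum(1 for lv in y_levels if lv is Level.NORMAL)
    w = sum(1 for lv in x_levels if lv is Level.ABNORMAL)
    s = sum(1 for lv in x_levels if lv is Level.NORMAL)
    # +1: the first link quality level is itself abnormal at this point.
    return (p + 1 < w) and (q >= s)
```

Note that when no replies arrive (M = 0, so w = 0), the first condition cannot hold and the backup device does not take over.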
For example, if the backup device wants to switch the services of the first active device to itself, it can do so by raising its access priority. Once the backup device raises its access priority, user equipment that originally accessed the first active device can switch to the backup device. The rule is that the backup device must raise its access priority above that of the first active device; only then will users that originally accessed the first active device switch to the backup device.
The hot backup method of this embodiment collects the link states between the backup device and the other active devices, as well as the link states between each active device and the first active device. The cause of the link problem between the first active device and the backup device can therefore be judged accurately, and more reasonable hot backup handling can be performed. Packet loss on the link between the first active device and the backup device no longer triggers incorrect hot backup handling, which saves system resources.
FIG. 6 is a flowchart of a hot backup method provided by an embodiment of the present invention. On the basis of the method provided in the embodiment corresponding to FIG. 5, the hot backup method of this embodiment further includes:
S601. When the number of levels indicating an abnormal link among the first link quality level and the N-1 Yj-th link quality levels is greater than or equal to the number of levels indicating an abnormal link among the M Xi-th link quality levels, or the number of levels indicating a normal link among the N-1 Yj-th link quality levels is less than the number of levels indicating a normal link among the M Xi-th link quality levels, the backup device suspends receiving the backup data sent by the first active device within a preset silent time window.
Specifically, after the backup device acquires the first link quality level, the N-1 Yj-th link quality levels, and the M Xi-th link quality levels, it may also make the following judgment. If the number of levels indicating an abnormal link among the first link quality level and the N-1 Yj-th link quality levels is greater than or equal to the number indicating an abnormal link among the M Xi-th link quality levels, that is, p + 1 ≥ w (1 is added to p because the backup device itself also considers the link quality level between itself and the first active device to indicate an abnormal link), then at least as many link quality levels regard the backup device's links as abnormal as regard the first active device's links as abnormal. The backup device can then consider that the link state between itself and the other N-1 active devices is worse than the link state between the first active device and the other N-1 active devices, and can determine that the link problem between the first active device and the backup device is caused by a fault of the backup device. If the number of levels indicating a normal link among the N-1 Yj-th link quality levels is less than the number indicating a normal link among the M Xi-th link quality levels, that is, q < s, then fewer link quality levels regard the backup device's links as normal than regard the first active device's links as normal. The backup device can again consider that its link state with the other N-1 active devices is worse than that of the first active device, and can likewise determine that the link problem is caused by a fault of the backup device. In other words, the backup device's links are at least as likely to be abnormal as those of the first active device, or less likely to be normal than those of the first active device.
When the backup device determines that the link problem between the first active device and the backup device is caused by a fault of the backup device, continuing to receive the backup data sent by the first active device could lead to failed or repeated backups because of link packet loss and similar causes, which wastes resources. The backup device therefore suspends receiving the backup data sent by the first active device. Further, the backup device may start a silent time window, within which it suspends receiving the backup data sent by the first active device. After the silent time window expires, the backup device may perform the hot backup processing of the embodiment shown in FIG. 5 again. The length of the silent time window may be set from empirical values, and may generally be set to the mean time to repair of the system.
Further, in the embodiments shown in FIG. 5 and FIG. 6, the backup device judges, from the link states between itself and each active device, whether to perform a service switchover between the first active device and the backup device. Alternatively, the first active device may judge whether to start the service switchover. If the first active device determines that the link problem between itself and the backup device is caused by the backup device and decides not to perform a service switchover, it may send a silent time window to the backup device. On receiving the silent time window, the backup device learns that the first active device has decided not to perform a service switchover and has determined that the link problem is caused by a fault of the backup device, so the backup device does not process received backup data within the silent time window. That is, once the first active device decides not to switch services, it can send the silent time window to the backup device, and the backup device no longer needs to make its own judgment; it simply does not process received backup data within the window. After the silent time window expires, the backup device may again proceed according to the method of the embodiment shown in FIG. 5 or FIG. 6.
In addition, when the first active device judges whether to start the service switchover and determines that the link problem between itself and the backup device is caused by the first active device, it decides to perform the service switchover and may send the backup device a message indicating that the first active device is lowering its access priority or initializing. On receiving this message, the backup device learns that the first active device has determined that a service switchover is needed, and can accept the service switchover from the first active device. That is, once the first active device decides to trigger the service switchover, it can send this message to the backup device, and the backup device no longer needs to make its own judgment; it can directly accept the service switchover from the first active device.
Further, in the hot backup methods provided in the embodiments shown in FIG. 3 to FIG. 6, the active devices and the backup device are BRASs, and the first active device performs hot backup to the backup device through VRRP. The hot backup method provided by the embodiments of the present invention is described further below, taking BRAS hot backup through VRRP as an example.
To apply the hot backup method of this embodiment under the existing system architecture shown in FIG. 1A and FIG. 1B, additional functional modules need to be deployed in the BRAS-based system. A performance detection module is deployed between BRASs, pairwise among all BRASs, including between every two active BRASs and between each active BRAS and its corresponding backup BRAS. An "inter-BRAS link quality detection" module is added inside the system with a detection period D, for example D = 10 seconds. Every period D, the module collects the packet loss detection results (mainly the throughput indicator) of IP Flow Performance Monitor (FPM) and converts them into a packet drop ratio (DropPacketRate) between 0 and 100; the larger the value, the more severe the packet loss.
A "BRAS terminal drop rate detection" module is added inside the system, with a statistics period P, for example P = 60 seconds, and a detection period E, for example E = 10 seconds. Every period E, the module computes the average per-second rate at which terminals go offline due to probe timeout (DownRate) over the period P. The system sets a maximum offline rate (MaxDownRate), for example MaxDownRate = 200 sessions/second. The offline health index (CientDownHealthTarget) is defined as CientDownHealthTarget = DownRate*100/MaxDownRate. Its range is 0-100, and a larger value indicates more severe terminal drops.
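The index above is a simple normalization, which can be sketched as follows. The function name is hypothetical, and clamping to the stated 0-100 range is an assumption; the text does not say what happens when DownRate exceeds MaxDownRate.

```python
def client_down_health_target(down_rate: float, max_down_rate: float = 200.0) -> float:
    """Compute CientDownHealthTarget = DownRate * 100 / MaxDownRate.

    down_rate: average sessions per second going offline due to probe timeout.
    max_down_rate: MaxDownRate, 200 sessions/second in the text's example.
    """
    value = down_rate * 100 / max_down_rate
    # Assumption: clamp so the result stays within the documented 0-100 range.
    return max(0.0, min(100.0, value))
```

With the example MaxDownRate of 200 sessions/second, a DownRate of 50 sessions/second gives an index of 25.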
An "RBS flap detection" module is added inside the system, with a statistics period Q, for example Q = 1800 seconds (30 minutes), and a detection period F, for example F = 10 seconds. Every period F, the module computes the RBS flap frequency (RbsShakeRate), averaged per fixed interval (for example every 300 seconds) over the period Q. A change of the RBS from Down to Up, or from Up to Down, counts as one flap. The range of RbsShakeRate is, for example, 0-25 (300/12). A maximum flap frequency (MaxRbsShakeRate) is set, for example MaxRbsShakeRate = 300 (computed at one flap per second). An RBS health index (RBSHealthTarget) is then derived as RBSHealthTarget = RbsShakeRate*100/MaxRbsShakeRate, a value between 0 and 100; the larger the value, the less healthy the RBS.
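The flap counting and normalization described above might look like the sketch below. The function names, the string state representation, and the upper clamp are assumptions made for illustration; the default values mirror the examples in the text (Q = 1800 s, a 300 s averaging interval, MaxRbsShakeRate = 300).

```python
def count_flaps(states: list) -> int:
    """Count Up<->Down transitions in a chronological list of RBS states."""
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)


def rbs_health_target(flaps_in_q: int,
                      q_seconds: int = 1800,
                      slot_seconds: int = 300,
                      max_shake_rate: float = 300.0) -> float:
    """Compute RBSHealthTarget = RbsShakeRate * 100 / MaxRbsShakeRate."""
    # RbsShakeRate: average number of flaps per slot (e.g. per 300 s) within Q.
    shake_rate = flaps_in_q / (q_seconds / slot_seconds)
    # Assumption: clamp to the documented upper bound of 100.
    return min(100.0, shake_rate * 100 / max_shake_rate)
```

For example, 18 flaps within Q = 1800 s average to 3 flaps per 300 s, which yields an index of 1.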
In addition, a new Device Health Advertising Protocol (DHAP) is defined. Its content is shown in Table 1.
Table 1 DHAP protocol
In Table 1, the Version, Role, Message Type, and Interval fields each occupy 1 byte, the HealthTarget and Reserved fields each occupy 2 bytes, and the System ID field occupies 4 bytes.
The Version field is the protocol version number; for the current first version its value is 1. The Role field is the role of the device, for example 1 for an active device and 2 for a backup device; the role is set according to whether the device currently acts as an active device or a backup device in the system. The MessageType field is the message type, for example 1 for a Health Advertisement; other types are also possible. Interval is the advertisement period in seconds, with a default of 1 and a range of 1 to 255 seconds. Reserved is a reserved field. The SystemID field is the system ID, which uniquely identifies the device by one of its IP addresses; this may be, for example, the source IP address used for RBS communication. The HealthTarget field is the health indicator of the device, with the value DropPacketRate + CientDownHealthTarget + RBSHealthTarget. A smaller value means a healthier device, and the range is 0-300.
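The field sizes listed above add up to a 12-byte message, which can be illustrated with Python's `struct` module. This is a sketch under stated assumptions: Table 1 is not reproduced here, so the field order (Version, Role, MessageType, Interval, HealthTarget, Reserved, SystemID), the network byte order, and the function names are all assumptions rather than the protocol's actual wire layout.

```python
import ipaddress
import struct

# Assumed layout: four 1-byte fields, two 2-byte fields, one 4-byte field,
# packed in network byte order without padding (12 bytes in total).
DHAP_FORMAT = "!BBBBHHI"


def pack_dhap(role: int, interval: int, health_target: int, system_ip: str,
              version: int = 1, message_type: int = 1) -> bytes:
    """Build a health advertisement; SystemID is an IPv4 address of the device."""
    system_id = int(ipaddress.IPv4Address(system_ip))
    reserved = 0
    return struct.pack(DHAP_FORMAT, version, role, message_type, interval,
                       health_target, reserved, system_id)


def unpack_dhap(data: bytes) -> dict:
    """Parse a 12-byte advertisement back into named fields."""
    (version, role, message_type, interval,
     health_target, _reserved, system_id) = struct.unpack(DHAP_FORMAT, data)
    return {
        "version": version,
        "role": role,
        "message_type": message_type,
        "interval": interval,
        "health_target": health_target,
        "system_ip": str(ipaddress.IPv4Address(system_id)),
    }
```

A 2-byte HealthTarget comfortably holds the 0-300 range produced by summing the three component indices.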
The transport layer of the DHAP protocol may be, for example, the User Datagram Protocol (UDP) or the Transmission Control Protocol (TCP); in the embodiments of the present invention it is implemented over UDP, for example.
Detection parameters for detecting the peer device also need to be designed for the DHAP protocol, including the advertisement period (Advertisement_Interval), the skew time (Skew_Time), and the peer-down detection period (PeerDownInterval). The advertisement period is the period at which DHAP protocol messages are advertised, in seconds; the default is, for example, 1 second, and the configurable range is 1 to 255 seconds. The skew time is an adjustment to the peer-down detection period, in seconds, and may be, for example, ((256 - HealthTarget)/256). The peer-down detection period is the period for detecting that the peer device is down, in seconds, and may be, for example, (3 × Advertisement_Interval) + Skew_Time.
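The two timer formulas above can be worked through directly. The function names are hypothetical, and the text does not state whose HealthTarget (local or peer) enters the skew, so the parameter is left generic.

```python
def skew_time(health_target: int) -> float:
    """Skew_Time = (256 - HealthTarget) / 256, in seconds."""
    return (256 - health_target) / 256


def peer_down_interval(advertisement_interval: int, health_target: int) -> float:
    """PeerDownInterval = 3 * Advertisement_Interval + Skew_Time, in seconds."""
    return 3 * advertisement_interval + skew_time(health_target)
```

With the default 1-second advertisement period and a HealthTarget of 128, the peer is declared down after 3.5 seconds without an advertisement. Since HealthTarget can range up to 300, the formula as written yields a negative skew for values above 256; the text does not say how that case is handled.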
Two states may be defined for the DHAP protocol: the Initialize state and the Running state. In the Initialize state, each device configures its neighbor information and then switches to the Running state. In the Running state, whenever a device receives an Advertisement message from a neighbor device, it immediately replies with an Advertisement message (that is, it treats the peer's Advertisement as a request) and records the HealthTarget advertised by the neighbor in its database. The device periodically monitors the neighbor states in the database; if no advertisement is received from a neighbor before the timeout, that neighbor is regarded as unavailable and its PeerHealthTarget parameter is set to the maximum value 65535.
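The Running-state bookkeeping described above can be sketched as a small neighbor table. The class and method names are hypothetical, timestamps are passed in explicitly to keep the sketch deterministic, and sending the reply Advertisement is omitted.

```python
PEER_UNAVAILABLE = 65535  # maximum PeerHealthTarget, marks an unavailable neighbor


class NeighborTable:
    """Records the latest HealthTarget advertised by each neighbor."""

    def __init__(self, peer_down_interval_s: float):
        self.peer_down_interval_s = peer_down_interval_s
        self._entries = {}  # system_id -> (health_target, last_seen)

    def on_advertisement(self, system_id: str, health_target: int, now: float) -> None:
        # Record what the neighbor advertised and when it was heard.
        self._entries[system_id] = (health_target, now)

    def sweep(self, now: float) -> None:
        # Periodic check: neighbors silent for too long become unavailable.
        for system_id, (_health, last_seen) in list(self._entries.items()):
            if now - last_seen > self.peer_down_interval_s:
                self._entries[system_id] = (PEER_UNAVAILABLE, last_seen)

    def health_of(self, system_id: str) -> int:
        return self._entries[system_id][0]
```

A later advertisement from the same neighbor simply overwrites the entry, so a recovered neighbor returns to its advertised HealthTarget.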
The neighbor device health information database (HealthTargetDB) collected by each device is defined as shown in Table 2.
Table 2 HealthTargetDB information
The health information of the device itself is also recorded in this database.
A variable MySystemId is defined to record the system ID of this device. A variable TotalDeviceNumber is defined to record the total number of devices, including this device. Health thresholds are defined as: Normal <= 50, 50 < Abnormal <= 100, Unavailable > 100. Terminal drop thresholds are defined as: Normal <= 100; Abnormal > 100. RBS flap thresholds are defined as: Normal = 0, Abnormal > 2 (more than 2 flaps per minute).
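The health thresholds above map a HealthTarget value to one of three levels, which can be sketched as follows. The function name and the string labels are hypothetical; only the threshold values come from the text.

```python
def classify_health(health_target: int) -> str:
    """Map a HealthTarget to a level: Normal <= 50 < Abnormal <= 100 < Unavailable."""
    if health_target <= 50:
        return "Normal"
    if health_target <= 100:
        return "Abnormal"
    return "Unavailable"
```

Under these thresholds, a neighbor whose PeerHealthTarget has been set to 65535 after a timeout is classified as Unavailable.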
Once the above modules and protocol are set up in the system, hot backup can be performed according to the method shown in FIG. 7 or FIG. 8, where FIG. 7 shows the processing flow of an active BRAS and FIG. 8 shows the processing flow of the standby BRAS.
FIG. 7 is a flowchart of the hot backup method provided by an embodiment of the present invention. This embodiment applies to a multi-device hot backup scenario in which N active BRASs and one standby BRAS perform hot backup through VRRP. The method is executed by any one of the active BRASs, referred to here as the first active BRAS. As shown in FIG. 7, the hot backup method of this embodiment includes:
S701: The first active BRAS monitors the HealthTarget of the standby BRAS. When the first active BRAS finds that the HealthTarget of the standby BRAS has changed to the Abnormal level, it proceeds to S702; otherwise, it follows the existing hot backup procedure. The HealthTarget of the standby BRAS as measured by the first active BRAS is the first link quality level.
S702: The first active BRAS compares its HealthTarget values toward the other N-1 active BRASs and finds that its IP FPM indicators are degraded toward W devices and normal toward S devices, where W + S = N-1. The HealthTarget values of the other N-1 active BRASs as measured by the first active BRAS are the Xi-th link quality levels.
S703: The first active BRAS sends a first request message to the other N-1 active BRASs, asking each of them to report whether its link to the standby BRAS is degraded, and notifies them of its own health toward the standby BRAS. It then waits for responses during a time window Tw and suspends backing up data to the standby BRAS during Tw.
S704: Within the waiting time window Tw, the first active BRAS receives L responses. Of these, P report that the responder's health toward the standby BRAS is degraded and Q report that it is not, where P + Q = L and L <= N-1. A show-of-hands vote is then taken. The responses received by the first active BRAS are the Yj-th link quality levels.
S705: If (P+1 >= W or Q < S), the links from the standby BRAS to the other active BRASs are less stable than those of the first active BRAS, and the vote concludes that the standby BRAS is faulty. VRRP goes silent and performs no switchover. The first active BRAS starts a silent time window Ts, suspends backing up data to the standby BRAS during Ts, and repeatedly sends the standby BRAS a message announcing VRRP silence (the standby BRAS may not receive this message).
S706: If (P+1 < W and Q >= S) and the first active BRAS also finds that its own health is Abnormal (services are impaired, for example BRAS RUI user detection shows users going offline, or ARP RUI detection reports the peer unreachable), the links from the standby BRAS to the other active BRASs are more stable than those of the first active BRAS, and the vote concludes that the first active BRAS itself is faulty. The first active BRAS immediately lowers its own VRRP priority or initializes VRRP, which triggers a VRRP/RUI switchover. It sends the standby BRAS a VRRP heartbeat message carrying the current priority or a VRRP initialization message (the standby BRAS may not receive this message), and at the same time starts the silent time window Ts.
S707: In all cases other than S705 and S706, the first active BRAS keeps the VRRP active state, starts the silent time window Ts, and does not back up data to the peer.
In S705 to S707, the first active BRAS makes the following check during the silent time window Ts. If the throughput between the first active BRAS and the standby BRAS returns to normal, the silent state is lifted and VRRP negotiates with the peer from its current state, either keeping the active state unchanged or switching back to the active state after a delay. Otherwise, if the silent time window Ts expires, S701 is executed again.
S708: Resume RUI user backup.
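The vote in S705 to S707 reduces to a comparison of four counts. The sketch below assumes the counts have already been collected in S702 and S704; the function name and the returned step labels are illustrative only.

```python
def active_vote(P: int, Q: int, W: int, S: int, self_abnormal: bool) -> str:
    """Return the step the first active BRAS takes after the vote.

    P, Q: responses reporting degraded / not degraded health toward the standby.
    W, S: other active BRASs toward which this device's indicators are
          degraded / normal.
    self_abnormal: True if the device finds its own health Abnormal.
    """
    if P + 1 >= W or Q < S:
        # Standby BRAS judged faulty: VRRP stays silent, no switchover.
        return "S705"
    # At this point P + 1 < W and Q >= S both hold.
    if self_abnormal:
        # This device judged faulty: lower VRRP priority or initialize VRRP.
        return "S706"
    # Stay active, start Ts, do not back up data to the peer.
    return "S707"
```

The "+1" accounts for the first active BRAS's own link to the standby BRAS, which is already known to be abnormal from S701.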
FIG. 8 is a flowchart of the hot backup method provided by an embodiment of the present invention. This embodiment applies to a multi-device hot backup scenario in which N active BRASs and one standby BRAS perform hot backup through VRRP. The method is executed by the standby BRAS. As shown in FIG. 8, the hot backup method of this embodiment includes:
S801: The standby BRAS checks whether the RBS health index (RBSHealthTarget) between itself and the first active BRAS exceeds the threshold. If it does, the standby BRAS proceeds to S802; otherwise, it follows the existing hot backup procedure. The HealthTarget of the first active BRAS as measured by the standby BRAS is the first link quality level.
S802: If the standby BRAS receives a VRRP silence message sent by the first active BRAS, go to S810.
S803: If the standby BRAS receives a VRRP initialization message or a low-priority VRRP message sent by the first active BRAS, go to S811.
S804: The standby BRAS compares its IP FPM packet loss ratios (DropHealthTarget) toward the other N-1 active BRASs and finds that its IP FPM indicators are degraded toward p devices and normal toward q devices, where p + q = N-1. The DropHealthTarget values of the other N-1 active BRASs as measured by the standby BRAS are the Yj-th link quality levels.
S805: The standby BRAS sends a second request message to the other N-1 active BRASs, asking each of them to report whether its link to the first active BRAS is degraded, and notifies them of its own health indicator toward the first active BRAS (throughput degradation) through the HealthTarget attribute of the message. It then waits for responses during a time window Tw and suspends receiving backup data from the first active BRAS during Tw.
S806: Within the waiting time window Tw, the standby BRAS receives l responses. Of these, w report that the responder's throughput toward the first active BRAS is degraded and s report that the responder's health indicator toward the first active BRAS is not degraded, where w + s = l and l <= N-1. A show-of-hands vote is then taken. The responses received by the standby BRAS are the Xi-th link quality levels.
S807: If (p+1 >= w or q < s), the links from the standby BRAS to the other active BRASs are less stable than those of the first active BRAS, and the vote concludes that the standby BRAS is faulty: go to S810.
S808: If (p+1 < w and q >= s), the links from the standby BRAS to the other active BRASs are more stable than those of the first active BRAS, and the vote concludes that the first active BRAS is faulty: the standby BRAS immediately raises its VRRP priority and goes to S811.
S809: In all cases other than S807 and S808, each device keeps its VRRP state, whether active or standby, but a device in the active state does not back up data to the peer.
S810: The VRRP of the standby BRAS enters the initialization state and does not process RBS backup messages even if it receives them.
S811: The VRRP of the standby BRAS is promoted from the standby state to the active state.
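The standby-side flow of S802 to S811 can be sketched in the same way. The function and parameter names are hypothetical, and the inputs are assumed to have been gathered in S804 and S806. Because the S807 and S808 conditions as written are complementary, the S809 fall-through is never reached in this sketch.

```python
def standby_next_step(p: int, q: int, w: int, s: int,
                      got_silence: bool = False,
                      got_init_or_low_priority: bool = False) -> str:
    """Return the terminal step (S810 or S811) the standby BRAS ends in.

    p, q: active BRASs toward which the standby's indicators are degraded / normal.
    w, s: responses reporting degraded / not degraded links to the first active BRAS.
    """
    if got_silence:
        # S802: VRRP silence message received from the first active BRAS.
        return "S810"
    if got_init_or_low_priority:
        # S803: VRRP initialization or low-priority VRRP message received.
        return "S811"
    if p + 1 >= w or q < s:
        # S807: standby judged faulty, its VRRP enters the initialization state.
        return "S810"
    # S808: first active BRAS judged faulty; raise VRRP priority, then become active.
    return "S811"
```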
The embodiments of FIG. 7 and FIG. 8 are only one implementation, in which BRASs perform hot backup through VRRP; the hot backup method provided by the embodiments of the invention is not limited to it. The link quality level between BRASs may include at least one of the following: packet loss ratio, timeout offline rate ratio, and remote backup flapping ratio. The abnormal level means that at least one of the packet loss ratio, the timeout offline rate ratio, and the remote backup flapping ratio, or the sum of any of them, exceeds a preset abnormal ratio threshold.
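One possible reading of the abnormal-level rule is sketched below: a link is abnormal if any single ratio, or the sum of any combination of the ratios, exceeds the threshold. The function name, the dictionary keys, and the use of integer percentages are assumptions made for illustration.

```python
from itertools import combinations


def is_abnormal(ratios_percent: dict, threshold_percent: int) -> bool:
    """True if any one ratio, or the sum of any group of ratios, exceeds the threshold."""
    values = list(ratios_percent.values())
    for size in range(1, len(values) + 1):
        for group in combinations(values, size):
            if sum(group) > threshold_percent:
                return True
    return False
```

For example, `is_abnormal({"packet_loss": 2, "timeout_offline": 3}, 4)` is true because the two ratios together exceed the threshold even though neither does alone.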
FIG. 9 is a schematic structural diagram of the first active device provided by an embodiment of the present invention. The first active device is used in a multi-device hot backup system in which the ratio of active devices to standby devices is N:1, where N ≥ 2. As shown in FIG. 9, the first active device provided by this embodiment includes:
A first obtaining module 91, configured to obtain a first link quality level, where the first link quality level represents the link quality between the first active device and the standby device.
A judging module 92, configured to judge whether the first link quality level indicates a link abnormality.
A second obtaining module 93, configured to obtain, when the judging module 92 judges that the first link quality level indicates a link abnormality, the N-1 Xi-th link quality levels between the first active device and the other N-1 active devices, where the Xi-th link quality level represents the link quality between the first active device and the i-th active device, N is greater than or equal to 2, and i is greater than 1 and less than or equal to N.
A sending module 94, configured to send a request message to each of the other N-1 active devices, where the request message asks each of those devices to send the link quality level between itself and the standby device.
A receiving module 95, configured to receive, from M of the other N-1 active devices, M Yj-th link quality levels, where the Yj-th link quality level represents the link quality between the j-th active device and the standby device, j is greater than 1 and less than or equal to N, and M is greater than or equal to zero and less than or equal to N-1.
A processing module 96, configured to switch services to the standby device when all of the following hold: the number of levels indicating a link abnormality among the first link quality level and the M Yj-th link quality levels is less than the number of levels indicating a link abnormality among the N-1 Xi-th link quality levels; the number of levels indicating a normal link among the M Yj-th link quality levels is greater than or equal to the number of levels indicating a normal link among the N-1 Xi-th link quality levels; and the services of the first active device are impaired.
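The condition evaluated by processing module 96 can be expressed directly over the collected levels. This is a sketch under stated assumptions: levels are represented as the strings "Normal" and "Abnormal", and the function name and argument layout are hypothetical.

```python
ABNORMAL = "Abnormal"
NORMAL = "Normal"


def should_switch_to_standby(first_level: str,
                             x_levels: list,
                             y_levels: list,
                             service_impaired: bool) -> bool:
    """Evaluate the switchover condition of the first active device.

    first_level: link quality level toward the standby device.
    x_levels: the N-1 levels toward the other active devices.
    y_levels: the M levels reported by other active devices toward the standby.
    """
    abnormal_first_and_y = int(first_level == ABNORMAL) + y_levels.count(ABNORMAL)
    abnormal_x = x_levels.count(ABNORMAL)
    normal_y = y_levels.count(NORMAL)
    normal_x = x_levels.count(NORMAL)
    return (abnormal_first_and_y < abnormal_x
            and normal_y >= normal_x
            and service_impaired)
```

With N = 4, `x_levels = ["Abnormal", "Abnormal", "Normal"]` and two "Normal" replies in `y_levels`, the function returns True only when the device's own services are impaired.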
The first active device provided by this embodiment implements the technical solution of the method embodiment shown in FIG. 3. Its implementation principle and technical effects are similar and are not repeated here.
Further, in the embodiment shown in FIG. 9, the processing module 96 is also configured to start a preset silent time window after switching services to the standby device, and not to switch back to the active state within the silent time window.
Further, in the embodiment shown in FIG. 9, the processing module 96 is specifically configured to lower the access priority or to initialize, which triggers users connected to the first active device to switch their services to the standby device.
Further, in the embodiment shown in FIG. 9, the sending module 94 is also configured to send the standby device a message indicating that the first active device has lowered its access priority or initialized, after the processing module 96 lowers the access priority or initializes.
Further, in the embodiment shown in FIG. 9, the first obtaining module 91 is also configured to obtain the first link quality level again after the silent time window expires. The sending module 94 is also configured to notify the standby device to switch services back to the first active device when the judging module 92 judges that the first link quality level indicates a normal link.
Further, in the embodiment shown in FIG. 9, the processing module 96 is also configured not to back up data to the standby device within a preset silent time window in any of the following cases: the number of levels indicating a link abnormality among the first link quality level and the M Yj-th link quality levels is greater than or equal to the number of levels indicating a link abnormality among the N-1 Xi-th link quality levels; the number of levels indicating a normal link among the M Yj-th link quality levels is less than the number of levels indicating a normal link among the N-1 Xi-th link quality levels; or the services of the first active device itself are not impaired.
Further, in the embodiment shown in FIG. 9, the sending module 94 is also configured to send the silent time window to the standby device after the processing module 96 determines that the link state from the standby device to the other N-1 active devices is worse than the link state from the first active device to the other N-1 active devices. The silent time window causes the standby device not to process received backup data within that window.
Further, in the embodiment shown in FIG. 9, the sending module 94 is also configured to suspend sending backup data to the standby device during a waiting time window that starts when the request message is sent to each of the other N-1 active devices.
Further, in the embodiment shown in FIG. 9, the active devices and the standby device are BRASs;
the processing module 96 is specifically configured to perform hot backup to the standby device through VRRP.
Further, the link quality level in the embodiment shown in FIG. 9 includes at least one of the following: packet loss ratio, timeout offline rate ratio, and remote backup flapping ratio. The link quality level may be the first link quality level, the Xi-th link quality level, or the Yj-th link quality level. The abnormal level means that at least one of the packet loss ratio, the timeout offline rate ratio, and the remote backup flapping ratio exceeds a preset abnormal ratio threshold, or that the sum of any two of them exceeds a preset abnormal ratio threshold.
FIG. 10 is a schematic structural diagram of the standby device provided by an embodiment of the present invention. The standby device is used in a multi-device hot backup system in which the ratio of active devices to standby devices is N:1, where N ≥ 2. As shown in FIG. 10, the standby device provided by this embodiment includes:
A first obtaining module 101, configured to obtain a first link quality level, where the first link quality level represents the link quality between the first active device and the standby device.
A judging module 102, configured to judge whether the first link quality level indicates a link abnormality.
A second obtaining module 103, configured to obtain, when the judging module 102 judges that the first link quality level indicates a link abnormality, the N-1 Yj-th link quality levels between the standby device and each of the other N-1 active devices (all active devices except the first active device), where the Yj-th link quality level represents the link quality between the standby device and the j-th active device, N is greater than or equal to 2, and j is greater than 1 and less than or equal to N.
A sending module 104, configured to send a request message to each of the other N-1 active devices, where the request message asks each of those devices to send the link quality level between itself and the first active device.
A receiving module 105, configured to receive, from the other N-1 active devices, M Xi-th link quality levels, where the Xi-th link quality level represents the link quality between the i-th active device and the first active device, i is greater than 1 and less than or equal to N, and M is greater than or equal to zero and less than or equal to N-1.
A processing module 106, configured to raise the access priority when both of the following hold: the number of levels indicating a link abnormality among the first link quality level and the N-1 Yj-th link quality levels is less than the number of levels indicating a link abnormality among the M Xi-th link quality levels; and the number of levels indicating a normal link among the N-1 Yj-th link quality levels is greater than or equal to the number of levels indicating a normal link among the M Xi-th link quality levels. The raised access priority instructs users of the first active device to switch to the standby device.
The standby device provided by this embodiment implements the technical solution of the method embodiment shown in FIG. 5. Its implementation principle and technical effects are similar and are not repeated here.
Further, in the embodiment shown in FIG. 10, the processing module 106 is also configured to suspend receiving backup data sent by the first active device within a preset silent time window in either of the following cases: the number of levels indicating a link abnormality among the first link quality level and the N-1 Yj-th link quality levels is greater than or equal to the number of levels indicating a link abnormality among the M Xi-th link quality levels; or the number of levels indicating a normal link among the N-1 Yj-th link quality levels is less than the number of levels indicating a normal link among the M Xi-th link quality levels.
Further, in the embodiment shown in FIG. 10, the receiving module 105 is also configured to receive the silent time window sent by the first active device; the processing module 106 is also configured not to process received backup data within the silent time window.
Further, in the embodiment shown in FIG. 10, the receiving module 105 is also configured to receive a message, sent by the first active device, indicating that the first active device has lowered its access priority or initialized; the processing module 106 is also configured to accept, after that message is received, the switching of the first active device's users to the standby device.
Further, in the embodiment shown in FIG. 10, the active devices and the standby device are BRASs; the first active device performs hot backup to the standby device through VRRP.
Further, the link quality level in the embodiment shown in FIG. 10 includes at least one of the following: packet loss ratio, timeout offline rate ratio, and remote backup flapping ratio. The link quality level may be the first link quality level, the Xi-th link quality level, or the Yj-th link quality level. The abnormal level means that at least one of the packet loss ratio, the timeout offline rate ratio, and the remote backup flapping ratio exceeds a preset abnormal ratio threshold, or that the sum of any two of them exceeds a preset abnormal ratio threshold.
FIG. 11 is a schematic structural diagram of the first active device provided by an embodiment of the present invention. As shown in FIG. 11, the first active device of this embodiment includes a processor 111, a transmitter 112, and a receiver 113. Optionally, the first active device may also include a memory 114. The processor 111, the transmitter 112, the receiver 113, and the memory 114 may be connected through a system bus or in other ways; FIG. 11 takes a system bus connection as an example. The system bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The system bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, FIG. 11 shows it as a single line, but this does not mean that there is only one bus or one type of bus. The memory 114 is used to store a computer program. The processor 111 can read the code of the stored computer program from the memory 114 and perform the following operations:
obtaining a first link quality level, where the first link quality level represents the link quality between the first active device and the standby device;
judging whether the first link quality level indicates a link abnormality;
when the first link quality level is judged to indicate a link abnormality, obtaining the N-1 Xi-th link quality levels between the first active device and the other N-1 active devices, where the Xi-th link quality level represents the link quality between the first active device and the i-th active device, N is greater than or equal to 2, and i is greater than 1 and less than or equal to N;
sending, through the transmitter 112, a request message to each of the other N-1 active devices, where the request message asks each of those devices to send the link quality level between itself and the standby device;
receiving, through the receiver 113, from M of the other N-1 active devices, M Yj-th link quality levels, where the Yj-th link quality level represents the link quality between the j-th active device and the standby device, j is greater than 1 and less than or equal to N, and M is greater than or equal to zero and less than or equal to N-1;
switching services to the standby device when the number of levels indicating a link abnormality among the first link quality level and the M Yj-th link quality levels is less than the number of levels indicating a link abnormality among the N-1 Xi-th link quality levels, the number of levels indicating a normal link among the M Yj-th link quality levels is greater than or equal to the number of levels indicating a normal link among the N-1 Xi-th link quality levels, and the services of the first active device are impaired.
The first active device of this embodiment implements the hot backup method shown in FIG. 3. Its implementation principle and technical effects are similar and are not repeated here.
In the first active device provided by this embodiment, the processor 111 implements the processing of the judging module 92, the processing module 96, the first obtaining module 91, and the second obtaining module 93 of the first active device shown in FIG. 9; the transmitter 112 implements the processing of the sending module 94 of the first active device shown in FIG. 9; and the receiver 113 implements the processing of the receiving module 95 of the first active device shown in FIG. 9.
FIG. 12 is a schematic structural diagram of the standby device provided by an embodiment of the present invention. As shown in FIG. 12, the standby device of this embodiment includes a processor 121, a transmitter 122, and a receiver 123. Optionally, the standby device may also include a memory 124. The processor 121, the transmitter 122, the receiver 123, and the memory 124 may be connected through a system bus or in other ways; FIG. 12 takes a system bus connection as an example. The system bus may be an ISA bus, a PCI bus, an EISA bus, or the like. The system bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, FIG. 12 shows it as a single line, but this does not mean that there is only one bus or one type of bus. The memory 124 is used to store a computer program. The processor 121 can read the code of the stored computer program from the memory 124 and perform the following operations:
获取第一链路质量级别,所述第一链路质量级别用于表示第一主用设备和所述备用设备之间的链路质量;Acquiring a first link quality level, where the first link quality level is used to represent the link quality between the first active device and the backup device;
判断所述第一链路质量级别是否表示链路异常;judging whether the first link quality level indicates a link abnormality;
用于判断所述第一链路质量级别表示链路异常级别时,获取所述备用设备和除所述第一主用设备外的其它N-1台主用设备中每一台主用设备之间的N-1个第Yj链路质量级别,所述第Yj链路质量级别用于表示所述备用设备和所述第j主用设备之间的链路质量,N大于或等于2,j大于1且小于等于N;When it is used to determine that the first link quality level indicates a link abnormality level, obtain a link between the backup device and each of the other N-1 active devices except the first active device. N-1 Y j -th link quality levels between, the Y j -th link quality level is used to represent the link quality between the backup device and the j-th active device, and N is greater than or equal to 2 , j is greater than 1 and less than or equal to N;
通过发送器122,向所述其它N-1台主用设备中的每一台主用设备分别发送请求报文,所述请求报文用于请求所述其它N-1台主用设备中的每一台主用设备分别发送所述其它N-1台主用设备与所述第一主用设备之间的链路质量级别;Send a request message to each of the other N-1 active devices through the transmitter 122, and the request message is used to request each of the other N-1 active devices Each active device respectively sends the link quality level between the other N-1 active devices and the first active device;
通过接收器123,分别接收所述其它N-1台主用设备发送的第i主用设备和所述第一主用设备之间的M个第Xi链路质量级别,所述第Xi链路质量级别用于表示第i主用设备和所述第一主用设备之间的链路质量,i大于1且小于或等于N,M大于等于零且小于或等于N-1;Through the receiver 123, respectively receive the M X i -th link quality levels between the i-th active device and the first active device sent by the other N-1 active devices, and the X i -th The link quality level is used to indicate the link quality between the i-th active device and the first active device, where i is greater than 1 and less than or equal to N, and M is greater than or equal to zero and less than or equal to N-1;
when the number of levels indicating a link abnormality among the first link quality level and the N-1 Yj-th link quality levels is less than the number of levels indicating a link abnormality among the M Xi-th link quality levels, and the number of levels indicating a normal link among the N-1 Yj-th link quality levels is greater than or equal to the number of levels indicating a normal link among the M Xi-th link quality levels, raising the access priority, where the raised access priority instructs users of the first active device to switch to the backup device.
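The decision made by the processor 121 in the operations above can be sketched as follows. This is a minimal illustration only, not the patented implementation: it assumes each link quality level reduces to one of two values (normal or abnormal), and the names `LinkQuality`, `count_levels`, and `should_raise_access_priority` are hypothetical and do not appear in the original text.

```python
from enum import Enum
from typing import Sequence


class LinkQuality(Enum):
    """Assumed two-valued link quality level (hypothetical simplification)."""
    NORMAL = "normal"
    ABNORMAL = "abnormal"


def count_levels(levels: Sequence[LinkQuality], target: LinkQuality) -> int:
    """Count how many levels in the sequence equal the target level."""
    return sum(1 for level in levels if level is target)


def should_raise_access_priority(
    first_level: LinkQuality,
    y_levels: Sequence[LinkQuality],
    x_levels: Sequence[LinkQuality],
) -> bool:
    """Return True when the backup device should raise its access priority.

    first_level: link quality between the first active device and the backup device.
    y_levels:    the N-1 levels between the backup device and each other active device.
    x_levels:    the M levels (0 <= M <= N-1) reported by the other active devices for
                 their links to the first active device.
    """
    # The Y and X levels are only gathered when the first link is abnormal.
    if first_level is not LinkQuality.ABNORMAL:
        return False

    # Abnormal count on the backup side includes the first link itself.
    abnormal_backup_side = count_levels([first_level, *y_levels], LinkQuality.ABNORMAL)
    abnormal_first_side = count_levels(x_levels, LinkQuality.ABNORMAL)
    normal_backup_side = count_levels(y_levels, LinkQuality.NORMAL)
    normal_first_side = count_levels(x_levels, LinkQuality.NORMAL)

    return (
        abnormal_backup_side < abnormal_first_side
        and normal_backup_side >= normal_first_side
    )


# Example with N = 4: the backup device reaches all three other active devices,
# while two of them report abnormal links to the first active device.
N, A = LinkQuality.NORMAL, LinkQuality.ABNORMAL
print(should_raise_access_priority(A, [N, N, N], [A, A, N]))  # True
```

Note that when no replies are received (M = 0), the first condition cannot hold, because the abnormal first link already contributes one abnormal level on the backup side, so the access priority is left unchanged in this sketch.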
In the backup device provided in this embodiment, the processor 121 implements the processing of the first acquisition module 101, the second acquisition module 103, the judgment module 102, and the processing module 106 of the backup device shown in FIG. 10; the transmitter 122 implements the processing of the sending module 104 of the backup device shown in FIG. 10; and the receiver 123 implements the processing of the receiving module 105 of the backup device shown in FIG. 10.
FIG. 13 is a schematic structural diagram of Embodiment 1 of the communication system provided by an embodiment of the present invention. The communication system is a multi-machine hot backup system in which the ratio of active devices to backup devices is N:1. As shown in FIG. 13, the communication system of this embodiment includes N active devices 131 and one backup device 132.
The active device 131 includes the first active device shown in FIG. 9 or FIG. 11; the backup device 132 includes the backup device shown in FIG. 10 or FIG. 12.
Those of ordinary skill in the art can understand that all or some of the steps of the foregoing method embodiments may be completed by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the foregoing method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the foregoing embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some or all of the technical features therein, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (13)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510675648.3A CN106603261B (en) | 2015-10-15 | 2015-10-15 | Hot backup method, first main device, standby device and communication system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510675648.3A CN106603261B (en) | 2015-10-15 | 2015-10-15 | Hot backup method, first main device, standby device and communication system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106603261A (en) | 2017-04-26 |
CN106603261B (en) | 2019-12-06 |
Family
ID=58554246
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510675648.3A Active CN106603261B (en) | 2015-10-15 | 2015-10-15 | Hot backup method, first main device, standby device and communication system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106603261B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TW201929551A (en) * | 2017-12-26 | 2019-07-16 | 圓剛科技股份有限公司 | Streaming system with backup mechanism and backup method thereof |
CN108334425A (en) * | 2018-01-26 | 2018-07-27 | 郑州云海信息技术有限公司 | A kind of the redundancy replacement method, apparatus and equipment of server QPI link |
CN108880917B (en) * | 2018-08-23 | 2021-01-05 | 华为技术有限公司 | Switching method, device and switching control separation system of control plane equipment |
CN110874929A (en) * | 2018-08-31 | 2020-03-10 | 株式会社电装天 | Data collection device, data collection system, data collection method, and vehicle-mounted device |
CN109327398B (en) * | 2018-11-21 | 2021-05-28 | 新华三技术有限公司 | Method and device for preventing packet loss |
KR102481113B1 (en) * | 2019-02-11 | 2022-12-26 | 주식회사 엘지에너지솔루션 | System and method for checking slave battery management system |
CN110119111B (en) * | 2019-02-26 | 2021-04-16 | 北京龙鼎源科技股份有限公司 | Communication method and device, storage medium, and electronic device |
CN111953436A (en) * | 2020-08-12 | 2020-11-17 | 深圳市泛海三江电子股份有限公司 | Communication method and system based on redundancy technology |
US11503526B2 (en) * | 2020-09-15 | 2022-11-15 | International Business Machines Corporation | Predictive communication compensation |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100479434C (en) * | 2005-09-15 | 2009-04-15 | 华为技术有限公司 | Method and system for realizing virtual router redundant protocol master and standby equipment switching |
CN101447858B (en) * | 2008-01-17 | 2012-01-11 | 中兴通讯股份有限公司 | Method for realizing synchronous switching of virtual router redundancy protocol in dual-machine hot backup system |
CN101257405B (en) * | 2008-04-03 | 2010-12-08 | 中兴通讯股份有限公司 | Method for implementing double chain circuits among master-salve equipments |
CN102448095A (en) * | 2012-01-20 | 2012-05-09 | 杭州华三通信技术有限公司 | Double-homing protection method and equipment |
CN102664750B (en) * | 2012-04-09 | 2014-09-10 | 北京星网锐捷网络技术有限公司 | Method, system and device for hot backup of multi-machine |
CN103368712A (en) * | 2013-07-18 | 2013-10-23 | 华为技术有限公司 | Switchover method and device for main equipment and standby equipment |
Also Published As
Publication number | Publication date |
---|---|
CN106603261A (en) | 2017-04-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106603261B (en) | Hot backup method, first main device, standby device and communication system | |
CN110912780B (en) | High-availability cluster detection method, system and controlled terminal | |
CN105164991B (en) | Redundant Network Protocol System | |
EP3398068B1 (en) | Network redundancy and failure detection | |
US6658595B1 (en) | Method and system for asymmetrically maintaining system operability | |
US9344325B2 (en) | System, method and apparatus providing MVPN fast failover | |
US6581166B1 (en) | Network fault detection and recovery | |
CN104168193B (en) | A kind of method and routing device of Virtual Router Redundancy Protocol fault detect | |
US9032240B2 (en) | Method and system for providing high availability SCTP applications | |
US6782422B1 (en) | Systems and methods for resynchronization and notification in response to network media events | |
CN100499485C (en) | Maintaining method of Ethernet link state | |
US9674285B2 (en) | Bypassing failed hub devices in hub-and-spoke telecommunication networks | |
CN102006189B (en) | Primary access server determination method and device for dual-machine redundancy backup | |
CN109861867B (en) | MEC service processing method and device | |
CN101060533B (en) | A method, system and device for improving reliability of VGMP protocol | |
CN111030926B (en) | A method and device for improving network high availability | |
CN110324375A (en) | A kind of information backup method and relevant device | |
CN102970167A (en) | Method for detecting faults of network nodes in cluster system, network node and system | |
CN108092857A (en) | A kind of distributed system heartbeat detecting method and relevant apparatus | |
CN107332793B (en) | A message forwarding method, related equipment and system | |
CN102187627B (en) | Method, device and broadband access server system for load share | |
CN102006268A (en) | Method, equipment and system for switching main interface and standby interface | |
CN105281929B (en) | A kind of service network interface state-detection and fault-tolerant devices and methods therefor | |
EP3627766B1 (en) | Method and system for switching between active bng and standby bng | |
CN113852514A (en) | Data processing system with uninterrupted service, processing equipment switching method and connecting equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||