WO2023274164A1 - Automatic main/standby switching method, control plane device, vbras system and storage medium - Google Patents

Automatic main/standby switching method, control plane device, vBRAS system and storage medium

Info

Publication number
WO2023274164A1
Authority
WO
WIPO (PCT)
Prior art keywords
plane device
instance
control plane
state
channel
Prior art date
Application number
PCT/CN2022/101589
Other languages
French (fr)
Chinese (zh)
Inventor
刘硕
Original Assignee
中兴通讯股份有限公司
Priority date
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2023274164A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0663 Performing the actions predefined by failover planning, e.g. switching to standby network elements

Definitions

  • the present application relates to the technical field of communications, and in particular to a master-standby automatic switching method, a control plane device, a vBRAS system, and a computer-readable storage medium.
  • vBRAS: virtual Broadband Remote Access Server
  • BRAS: Broadband Remote Access Server
  • a vBRAS system with separated forwarding and control is a vBRAS system in which forwarding and control are separated, the control plane is virtualized and centralized, and virtual and physical forwarding plane devices coexist.
  • a vBRAS system with separation of forwarding and control includes a control plane device, a forwarding plane device, and a standardized interface between the control plane device and the forwarding plane device.
  • control plane devices cannot be switched automatically. Therefore, if a single control plane device fails due to a power outage in the computer room, fire, etc., the peer control plane device will not be able to sense and switch in time, causing the user's service function to fail, thereby affecting the user's network experience.
  • This application aims to solve at least one of the technical problems existing in the prior art. To this end, the present application proposes a master-standby automatic switching method, a control plane device, a vBRAS system, and a computer-readable storage medium.
  • the embodiment of the present application provides an active-standby automatic switchover method, which is applied to the second control plane device in the vBRAS system.
  • the vBRAS system also includes a first control plane device and a forwarding plane device.
  • the first control plane device is provided with a first instance in a master state
  • the second control plane device is provided with a second instance in a standby state
  • the first instance communicates with the forwarding plane device through a first channel
  • the second instance communicates with the forwarding plane device through a second channel
  • the method includes: receiving fault information, wherein the fault information indicates that the failure rate of the first channel is greater than a first preset threshold; obtaining the current failure rate of the second channel according to the fault information; and, when the current failure rate of the second channel is less than or equal to a second preset threshold, controlling the second instance to switch from the standby state to the master state.
  • the embodiment of the present application also provides a control plane device, including: a memory, a processor, and a computer program stored in the memory and operable on the processor, wherein the processor, when executing the computer program, implements the master-standby automatic switching method described in the embodiment of the first aspect above.
  • the embodiment of the present application further provides a vBRAS system, including the control plane device described in the embodiment of the second aspect above.
  • the embodiment of the present application also provides a computer-readable storage medium, the computer-readable storage medium stores computer-executable instructions, and the computer-executable instructions are used to make a computer execute the master-standby automatic switching method described in the embodiment of the first aspect above.
  • Fig. 1 is the schematic diagram of the system architecture platform of the main-standby automatic switching method provided by one embodiment of the present application;
  • FIG. 2 is a schematic diagram of an application scenario of an active-standby switching method provided by an embodiment of the present application
  • FIG. 3 is a schematic diagram of a network of an active-standby switching method provided by an embodiment of the present application
  • Fig. 4 is a schematic diagram of the state transitions of the second instance provided by an embodiment of the present application.
  • Fig. 5 is a specific step diagram of an active-standby automatic switching method provided by an embodiment of the present application.
  • FIG. 6 is a diagram of specific steps of an active-standby automatic switching method provided in another embodiment of the present application.
  • FIG. 7 is a diagram of specific steps of an active-standby automatic switching method provided by another embodiment of the present application.
  • FIG. 8 is a diagram of specific steps of an active-standby automatic switching method provided by another embodiment of the present application.
  • FIG. 9 is a diagram of specific steps of an active-standby automatic switching method provided by another embodiment of the present application.
  • FIG. 10 is a diagram of specific steps of an active-standby automatic switching method provided by another embodiment of the present application.
  • FIG. 11 is a diagram of specific steps of a method for automatically switching between master and backup according to another embodiment of the present application.
  • Fig. 12 is a diagram of specific steps of an active-standby automatic switching method provided by another embodiment of the present application.
  • by architecture, vBRAS systems fall mainly into two types: centralized, and separated forwarding and control.
  • the vBRAS system with separated forwarding and control draws on the technical ideas of SDN (Software Defined Network) and NFV (Network Functions Virtualization) and combines CT (communication technology) with IT (information technology). Based on the actual application scenarios of operators, it separates forwarding from control, virtualizes and centralizes the control plane, and lets virtual and physical forwarding plane devices coexist.
  • a vBRAS system with separation of forwarding and control includes a control plane device, a forwarding plane device, and a standardized interface between the control plane device and the forwarding plane device.
  • an embodiment of the present application provides an active-standby automatic switching method, a control plane device, a vBRAS system, and a computer-readable storage medium, wherein the active-standby automatic switching method is applied to the second control plane device in the vBRAS system.
  • the vBRAS system further includes a first control plane device and a forwarding plane device, the first control plane device is provided with a first instance in a master state, and the second control plane device is provided with a second instance in a standby state; the first instance communicates with the forwarding plane device through a first channel, and the second instance communicates with the forwarding plane device through a second channel.
  • the method includes but is not limited to the following steps: receiving fault information, wherein the fault information indicates that the failure rate of the first channel is greater than a first preset threshold; obtaining the current failure rate of the second channel according to the fault information; and, when the current failure rate of the second channel is less than or equal to a second preset threshold, controlling the second instance to switch from the standby state to the master state.
  • the second control plane device can detect the failure of the first control plane device in time according to the fault information and, when it judges that it is capable of being promoted to master, switch the second instance from the standby state to the master state. This completes the automatic active/standby switchover between the control plane devices, improving the disaster recovery performance of the vBRAS system and the user's network experience.
  • FIG. 1 is a schematic diagram of a system architecture platform for performing an active-standby automatic switching method provided by an embodiment of the present application.
  • the system architecture platform is provided with a processor 100 and a memory 200 , wherein the processor 100 and the memory 200 may be connected via a bus or in other ways.
  • connection via a bus is taken as an example.
  • the memory 200 can be used to store non-transitory software programs and non-transitory computer-executable programs.
  • the memory 200 may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or another non-transitory solid-state storage device.
  • the memory 200 may optionally include memories located remotely relative to the processor 100, and these remote memories may be connected to the system architecture platform through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • system architecture platform can be applied to 3G communication network systems, LTE communication network systems, 5G communication network systems and subsequent evolved mobile communication network systems, etc., which is not specifically limited in this embodiment.
  • the system architecture platform shown in FIG. 1 does not constitute a limitation on the embodiments of the present application; it may include more or fewer components than shown, combine some components, or use a different arrangement of components.
  • the processor 100 can call an information processing program stored in the memory 200 to execute the method for automatically switching between master and backup.
  • the active-standby automatic switching method is applied to the second control plane device 400 in the vBRAS system.
  • the vBRAS system also includes a first control plane device 300 and a forwarding plane device 500.
  • the first control plane device 300 is provided with a first instance in the master state
  • the second control plane device 400 is provided with a second instance in the standby state
  • the first instance communicates with the forwarding plane device 500 through the first channel (not shown in the figure)
  • the second instance communicates with the forwarding plane device 500 through the second channel (not shown in the figure).
  • the method for automatic active-standby switching specifically includes but is not limited to the following steps S100 , S200 and S300 .
  • Step S100 Receive failure information, wherein the failure information indicates that the failure rate of the first channel is greater than a first preset threshold.
  • the fault information may be generated by the first control plane device 300 and sent to the second control plane device 400, or may be generated by the second control plane device 400 itself according to the information sent by the forwarding plane device 500 , which is not limited in this embodiment.
  • the first preset threshold is the channel failure rate threshold preset for the first instance; it may be, for example, 50 or 100, which is not limited in this embodiment.
  • Step S200 Obtain the current failure rate of the second channel according to the failure information.
  • when the second control plane device 400 receives fault information indicating that the failure rate of the first channel is greater than the first preset threshold, the first control plane device 300 is in a fault state; that is, the forwarding plane device 500 communicating with the first instance needs another control plane device to be promoted to master and take it over.
  • the second control plane device 400 therefore obtains the current failure rate of the second channel to determine whether the second instance is capable of being promoted to master.
  • Step S300 When the current failure rate of the second channel is less than or equal to the second preset threshold, control the second instance to switch from the standby state to the main state.
  • when the current failure rate of the second channel is less than or equal to the second preset threshold, the second control plane device 400 itself is capable of being promoted to master and can switch the second instance from the standby state to the master state, thereby taking over the forwarding plane device 500 that communicated with the first instance.
  • the second preset threshold is the channel failure rate threshold preset for the second instance; it may be, for example, 50 or 100, which is not limited in this embodiment.
  • the second control plane device 400 can detect the fault of the first control plane device 300 in time according to the fault information and, when it judges that it is capable of being promoted to master, switch the second instance from the standby state to the master state. This completes the automatic switchover between the active and standby control plane devices, improving the disaster recovery performance of the vBRAS system and the user's network experience.
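  • As a minimal sketch of steps S100 to S300, the decision can be written as one function. The names `FaultInfo`, `Instance` and `handle_fault_info`, and the choice to express failure rates in percent, are illustrative assumptions and do not come from the application.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    MASTER = "master"
    STANDBY = "standby"


@dataclass
class FaultInfo:
    # Hypothetical message shape: failure rate of the first channel, in percent.
    first_channel_failure_rate: float


@dataclass
class Instance:
    state: State
    failure_threshold: float  # plays the role of the second preset threshold


def handle_fault_info(fault, second, first_threshold, second_channel_failure_rate):
    """Apply S100 to S300 to the second instance and return its resulting state."""
    # S100: the fault information must report a rate above the first preset threshold.
    if fault.first_channel_failure_rate <= first_threshold:
        return second.state
    # S200 / S300: promote only if the second channel itself is healthy enough.
    if (second.state is State.STANDBY
            and second_channel_failure_rate <= second.failure_threshold):
        second.state = State.MASTER
    return second.state
```

  • With both thresholds set to 40, a reported first-channel rate of 60 and a second-channel rate of 10 promotes the second instance, while a second-channel rate of 80 leaves it in the standby state.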
  • the active/standby automatic switchover method can be used to manage the active/standby states of two control plane devices, so that active/standby automatic switchover can be performed between the two control plane devices.
  • the automatic switching function of vBRAS system products across control plane devices is realized, thereby improving the disaster recovery performance of vBRAS system products and the reliability of system products.
  • because the control plane device in this technical solution can identify a failure of the peer control plane device in time and automatically perform active/standby switchover, the operating stability of vBRAS system products is greatly improved and the user's network experience is optimized. This reduces user complaints and customers' operation and maintenance costs, and is of great significance to the subsequent development of vBRAS-related technologies.
  • step S100 specifically includes but is not limited to the following step S110.
  • Step S110 Receive the fault information sent from the first control plane device 300 through the heartbeat line, where the first control plane device 300 and the second control plane device 400 communicate through the heartbeat line.
  • when the first channel between the first control plane device 300 and the forwarding plane device 500 is disconnected, so that the failure rate of the first channel exceeds the first preset threshold of the first instance, the first control plane device 300 generates fault information and sends it to the second control plane device 400 through the heartbeat line. On receiving the fault information through the heartbeat line, the second control plane device 400 can judge that the first control plane device 300 is in a fault state; it then judges whether it is capable of being promoted to master and, if so, promotes itself.
  • step S100 specifically includes but is not limited to the following steps S120, S130 and S140.
  • Step S120 Receive a fault event sent from the forwarding plane device 500 through the second channel, where the fault event indicates that there is a fault in the first channel.
  • Step S130 Calculate the failure rate of the first channel according to the failure events.
  • Step S140 When the failure rate of the first channel is greater than a first preset threshold, generate failure information.
  • the second control plane device 400 receives the fault event sent from the forwarding plane device 500, calculates the failure rate of the first channel according to the fault event, and then judges whether that failure rate is greater than the first preset threshold. If so, it can judge that the first control plane device 300 is in a fault state; the second control plane device 400 then judges whether it is capable of being promoted to master and, if so, promotes itself.
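  • The application does not state how the failure rate is computed from fault events. The sketch below assumes, purely for illustration, that it is the percentage of forwarding plane devices managed by the first instance whose first channel has been reported as faulty; `channel_failure_rate` and `make_fault_info` are hypothetical names.

```python
def channel_failure_rate(managed_ups, fault_events):
    """S130 (assumed formula): percentage of managed devices reporting a fault."""
    managed = set(managed_ups)
    if not managed:
        return 0.0
    # Ignore events from devices the first instance does not manage.
    failed = {up for up in fault_events if up in managed}
    return 100.0 * len(failed) / len(managed)


def make_fault_info(managed_ups, fault_events, first_threshold):
    """S140: generate fault information only above the first preset threshold."""
    rate = channel_failure_rate(managed_ups, fault_events)
    if rate > first_threshold:
        return {"first_channel_failure_rate": rate}
    return None
```

  • Under this assumption, one fault event from UP1 out of the managed devices UP1 and UP2 gives a rate of 50, which exceeds a threshold of 40 and produces fault information.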
  • the vBRAS system further includes a database 600, and the first control plane device 300 and the second control plane device 400 communicate with the database 600 respectively.
  • for step S300, the method specifically includes but is not limited to the following steps S400 and S500.
  • Step S400 Control the second instance to switch from the standby state to the recovering state, where the recovering state is used for the second instance to extract the user data of the first instance from the database 600 .
  • Step S500 When the second instance finishes extracting the user data, control the second instance to switch from the recovering state to the main state.
  • when the second control plane device 400 performs the promotion to master, it first switches the second instance from the standby state to the recovering state, then pulls from the database 600 the online users of the forwarding plane device 500 that communicated with the first control plane device 300, and extracts and restores these online users one by one to the second control plane device 400. After all of these online users have been restored, the second control plane device 400 switches the second instance from the recovering state to the master state.
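  • The standby, recovering, master sequence of steps S400 and S500 can be sketched as follows. The list of dictionaries standing in for the database 600 and the function name `promote_with_recovery` are assumptions made for the example.

```python
from enum import Enum


class State(Enum):
    STANDBY = "standby"
    RECOVERING = "recovering"
    MASTER = "master"


def promote_with_recovery(database, taken_over_ups):
    """Walk the second instance through S400 and S500.

    `database` is a hypothetical list of {"id": ..., "up": ...} records.
    Returns the sequence of states visited and the ids of restored users.
    """
    trace = [State.STANDBY]
    trace.append(State.RECOVERING)  # S400: enter the recovering state
    restored = []
    for record in database:  # restore the online users one by one
        if record["up"] in taken_over_ups:
            restored.append(record["id"])
    trace.append(State.MASTER)  # S500: every user restored, become master
    return trace, restored
```

  • Only users attached to the forwarding plane devices being taken over are restored; users of other devices stay untouched.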
  • step S400 specifically includes but is not limited to the following steps S410 and S420.
  • Step S410 Generate a pointing switch instruction.
  • Step S420 Send a pointing switch command to the forwarding plane device 500, so that the encapsulation and decapsulation table of the forwarding plane device 500 points to the second instance or the channel link of the forwarding plane device 500 points to the second instance.
  • when the second control plane device 400 controls the second instance to switch from the standby state to the recovering state, it generates a pointing switch instruction and sends it to the forwarding plane device 500, so that the encapsulation and decapsulation table of the forwarding plane device 500 points to the second instance; or it sends the pointing switch instruction to the forwarding plane device 500 so that the channel link of the forwarding plane device 500 points to the second instance.
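  • A small sketch of steps S410 and S420 follows. The message fields and the `ForwardingPlane` model are hypothetical; the application only says that the encapsulation and decapsulation table or the channel link ends up pointing to the second instance.

```python
from dataclasses import dataclass


@dataclass
class ForwardingPlane:
    name: str
    encap_target: str    # instance the encapsulation/decapsulation table points to
    channel_target: str  # instance the channel link points to


@dataclass(frozen=True)
class PointingSwitch:
    # S410: hypothetical shape of the pointing switch instruction.
    new_target: str
    switch_encap_table: bool = True
    switch_channel_link: bool = True


def apply_pointing_switch(up, instruction):
    """S420 as seen by the forwarding plane device: repoint what was requested."""
    if instruction.switch_encap_table:
        up.encap_target = instruction.new_target
    if instruction.switch_channel_link:
        up.channel_target = instruction.new_target
    return up
```

  • Keeping the two flags separate mirrors the "or" in the text: an instruction may repoint the table, the channel link, or both.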
  • step S300 specifically includes but is not limited to the following steps S600 and S700.
  • Step S600 Generate a state switching instruction.
  • Step S700 Send a state switching instruction to the first instance, so that the first instance is switched from the master state to the standby state.
  • when the current failure rate of the second channel is less than or equal to the second preset threshold, the second control plane device 400 generates a state switching instruction and sends it to the first instance, so that the first instance switches from the master state to the standby state. It should be noted that the first control plane device 300 may not receive the state switching instruction because of its failure, but since the first instance and the second instance are independent of each other, this does not prevent the second control plane device 400 from performing the promotion to master.
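  • The best-effort nature of steps S600 and S700 can be sketched as below. The transport callable `send`, the message fields, and the use of `ConnectionError` to model an unreachable peer are assumptions for illustration only.

```python
def down_peer(message):
    """Hypothetical transport for a peer that is down: every send fails."""
    raise ConnectionError("peer control plane device unreachable")


def promote_to_master(send, instance_name):
    """Send the state switching instruction, then promote regardless of delivery."""
    message = {"instance": instance_name, "target_state": "standby"}  # S600
    try:
        send(message)  # S700
        peer_notified = True
    except ConnectionError:
        # The peer never saw the instruction; promotion still goes ahead.
        peer_notified = False
    return {"local_state": "master", "peer_notified": peer_notified}
```

  • Recording whether the peer was notified is a design choice of this sketch: it makes the later dual-master case easy to recognize.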
  • step S300 specifically includes but is not limited to the following steps S800 and S900.
  • Step S800 When the failure rate of the first channel returns to less than or equal to the first preset threshold, obtain the first priority of the first instance and the second priority of the second instance, and send the second priority to the first control plane device 300.
  • Step S900 Compare the first priority with the second priority, and control the states of the second instance and the first instance according to the comparison result.
  • because the first control plane device 300 may fail to receive the state switching instruction from the second control plane device 400, both the first instance and the second instance will be in the master state once the first control plane device 300 returns to normal, and the first control plane device 300 and the second control plane device 400 need to negotiate to determine which control plane device is finally the master.
  • the second control plane device 400 acquires the first priority of the first instance and the second priority of the second instance, sends the second priority to the first control plane device 300, then compares the first priority with the second priority and controls the states of the second instance and the first instance according to the comparison result.
  • both the first priority and the second priority are preset values, specifically 100, 200, etc., which are not limited in this embodiment.
  • step S900 specifically includes but is not limited to the following steps S910 and S920.
  • Step S910 When the first priority is higher than the second priority, control the second instance to return from the master state to the standby state, and make the first control plane device 300 maintain the state of the first instance as the master state according to the first priority and the second priority.
  • Step S920 When the first priority is lower than the second priority, maintain the state of the second instance as the master state, and make the first control plane device 300 control the first instance to switch from the master state to the standby state according to the first priority and the second priority.
  • when both the first control plane device 300 and the second control plane device 400 are in a normal working state without failure and the first priority is higher than the second priority, the second control plane device 400 controls the second instance to return from the master state to the standby state, and makes the first control plane device 300 maintain the state of the first instance as the master state according to the first priority and the second priority; when the first priority is lower than the second priority, the second control plane device 400 maintains the state of the second instance as the master state, and makes the first control plane device 300 control the first instance to switch from the master state to the standby state according to the first priority and the second priority.
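  • The comparison in steps S910 and S920 reduces to a small function. The text does not say what happens when the two priorities are equal; the sketch assumes, as a labeled guess, that the instance that is already serving (the second) stays master. The name `negotiate` is hypothetical.

```python
def negotiate(first_priority, second_priority):
    """Resolve a dual-master situation; returns (first_state, second_state)."""
    if first_priority > second_priority:
        # S910: the first instance stays master, the second returns to standby.
        return ("master", "standby")
    if first_priority < second_priority:
        # S920: the second instance stays master, the first becomes standby.
        return ("standby", "master")
    # Equal priorities are not covered by the source. Assumption: keep the
    # second instance as master to avoid a needless extra switchover.
    return ("standby", "master")
```

  • Both control plane devices can run the same function on the same pair of priorities and reach the same result, which is why exchanging priorities is enough for the negotiation.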
  • the vBRAS system includes a first control plane device 300 , a second control plane device 400 and a forwarding plane device 500 .
  • CP1 represents the first control plane device 300
  • CP2 represents the second control plane device 400
  • UP1 to UP4 represent forwarding plane devices 500
  • CP1 and CP2 include the same instance, instance1: instance1 in CP1 represents the first instance, and instance1 in CP2 represents the second instance.
  • because the forwarding plane device 500 always sends user data to the control plane device in the master state, the first control plane device 300 takes over UP1 and UP2 when it is operating normally.
  • the online message of a user whose physical network is connected to UP1 or UP2 is delivered to the first control plane device 300, which processes the message and then saves the user information into the database 600.
  • one control plane device can be configured with multiple geo-backup-instance instances, and each instance has its own independent master and backup state management, as follows.
  • the command supports configuring the geo-backup-instance instance switching mode as automatic: switch-mode auto.
  • the command supports configuring the priority of the geo-backup-instance instance, and the range is 1-254.
  • the command supports configuring the upper threshold of the channel failure rate that the geo-backup-instance instance applies to the forwarding plane devices 500 it manages, and the range is 1-100.
  • the command supports configuring the delay, delay-time, between the moment the geo-backup-instance instance decides that it needs to be promoted to master automatically and the moment it performs the promotion, and the range is 240-3600 seconds.
  • the command supports configuring whether preemption of the master role is enabled for the geo-backup-instance instance: preempt enable/disable.
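  • The configuration items listed above can be modeled with range checks. The class and field names below are hypothetical; only the ranges (priority 1-254, threshold 1-100, delay-time 240-3600 seconds), the `auto` switch mode and the preempt switch come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoBackupInstanceConfig:
    name: str
    priority: int           # 1-254
    failure_threshold: int  # upper threshold of the channel failure rate, 1-100
    delay_time: int         # seconds between the decision and the promotion, 240-3600
    switch_mode: str = "auto"  # only "auto" is named in the text
    preempt: bool = False      # preempt enable/disable

    def __post_init__(self):
        # Reject values outside the documented ranges.
        if not 1 <= self.priority <= 254:
            raise ValueError("priority must be in 1-254")
        if not 1 <= self.failure_threshold <= 100:
            raise ValueError("failure_threshold must be in 1-100")
        if not 240 <= self.delay_time <= 3600:
            raise ValueError("delay_time must be in 240-3600 seconds")
```

  • Because each instance carries its own configuration object, two instances on the same control plane device can hold different priorities and thresholds, matching the per-instance state management described above.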
  • the forwarding plane device 500 supports reporting the state of the first channel between itself and the first control plane device 300 (for example, an OpenFlow channel) to the second control plane device 400, so that the second control plane device 400 can determine from the messages reported by the forwarding plane device 500 whether the current failure rate of the first channel between the peer first control plane device 300 and the forwarding plane device 500 has exceeded the first preset threshold.
  • the first control plane device 300 and the second control plane device 400 notify each other of the configuration under the geo-backup-instance instance through the SIB heartbeat line, and the parameters configured under the same instance are allowed to differ between the two control plane devices.
  • instance1 and instance2 are configured on CP1 and CP2, where instance1 on CP1 is in the active state, instance2 is in the standby state, and instance1 on CP2 is in the standby state, and instance2 is in the active state.
  • instance1 on CP1 and CP2 is configured in automatic mode, and the OpenFlow channels between CP1, CP2 and the UPs are in good condition.
  • the priority of instance1 of CP1 is configured as 200, and the priority of instance2 of CP1 is configured as 100. As a result, instance1 of CP1 is the master and instance1 of CP2 is the backup (slave).
  • the user dials up to go online, and the user whose physical network is connected to UP1 and UP2 will deliver the online message to CP1, and CP1 will write the user information table into the database 600 .
  • when CP1 needs to be restarted or an unexpected failure occurs (such as CP1 server downtime, a power failure in the computer room, or a link failure on CP1's outbound network interface), the first channel between CP1 and UP1 and UP2 is disconnected, and the SIB heartbeat line between CP1 and CP2 is also disconnected.
  • CP2 needs to take over UP1 and UP2 without affecting online users.
  • CP2 calculates that the failure rate of the first channel of CP1 exceeds the first preset threshold (40%). CP2 then obtains the current failure rate of the second channel, finds that it is less than or equal to the second preset threshold, and so knows that its second channel is in good condition. CP2 therefore decides to promote itself to master and takes over UP1 and UP2.
  • after CP2 performs the promotion, the status of instance1 of CP2 becomes recovery, indicating that CP2 is recovering data.
  • CP2 sends a state switching command to CP1, commanding instance1 of CP1 to change to the standby state (slave).
  • if CP1 is down at this time and does not receive this message, CP2 can still continue its promotion to master; at the same time, CP2 sends a pointing switch command to UP1 and UP2 through the second channel, so that they point the NSH encapsulation and decapsulation table to CP2 and point the channel link to the second instance on CP2.
  • after UP1 and UP2 receive the master upgrade message from CP2, they start aging data such as user tables and network segment routes, and then wait for CP2 to re-deliver service data.
  • instance1 of CP2 is in recovery state, pulls the already online users of UP1 and UP2 from the database 600, and restores them to CP2.
  • each time CP2 acquires a user it synchronizes the user with UP1 or UP2, so that UP1 and UP2 stop aging of the user after receiving the user synchronization information.
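  • The interplay of aging and per-user synchronization on a forwarding plane device can be sketched as follows. The `UserTable` class and its method names are hypothetical, and the sketch models only the aging flag, not timers or route entries.

```python
class UserTable:
    """Hypothetical user table kept on a forwarding plane device."""

    def __init__(self, users):
        # Every known user starts out not aging.
        self._aging = {user: False for user in users}

    def on_master_upgrade(self):
        # On the master upgrade message, start aging every existing entry.
        for user in self._aging:
            self._aging[user] = True

    def on_user_sync(self, user):
        # A user synchronized by the new master stops aging.
        self._aging[user] = False

    def aging_users(self):
        # Users the new master has not yet restored, in sorted order.
        return sorted(u for u, aging in self._aging.items() if aging)
```

  • In this model the set returned by `aging_users()` shrinks by one with each synchronization and is empty once the new master has restored every online user.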
  • the state of CP2 changes from recovery to master state, that is, the master-standby switchover between CP1 and CP2 is completed.
  • while instance1 of CP2 is in the recovery state, the load sharing mechanism ensures that instance2 of CP2 stays in the master state unaffected, and new users on UP3 and UP4 can still go online normally.
  • CP1 cannot receive the message from CP2 ordering CP1 to change to standby, so after CP2 becomes the master state, both CP1 and CP2 will be in the master state, resulting in a dual-master phenomenon. Therefore, when the CP1 server network is back to normal, CP1 and CP2 will negotiate to determine the active CP and standby CP, and the CP with the highest priority is determined to be the final active CP.
  • An embodiment of the present application provides a control plane device, which includes: a memory 200 , a processor 100 , and a computer program stored in the memory 200 and operable on the processor 100 .
  • the processor 100 and the memory 200 may be connected via a bus or in other ways.
  • the control plane device in this embodiment may include the memory 200 and the processor 100 of the embodiment shown in FIG. 1 and can form part of the system architecture platform of that embodiment. Both belong to the same inventive concept and therefore share the same implementation principle and beneficial effects, which are not described in detail here.
  • the non-transitory software programs and instructions required to implement the master-standby automatic switching method of the above embodiments are stored in the memory 200 and, when executed by the processor 100, perform that method, for example the method steps described above.
  • the device embodiments described above are only illustrative, and the units described as separate components may or may not be physically separated, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • Because the control plane device in the embodiment of the second aspect of the present application and the master-standby automatic switching method in any embodiment of the first aspect above belong to the same inventive concept, for the specific implementation and technical effects of the control plane device, reference may be made to the specific implementation and technical effects of the master-standby automatic switching method in any embodiment of the first aspect above; details are not repeated here.
  • The vBRAS system in this embodiment of the present application is a forwarding-control separated vBRAS system.
  • The vBRAS system includes the control plane device of any embodiment of the second aspect above, and also includes the forwarding plane device 500 and at least one other control plane device; a standardized interface is provided between the control plane device and the forwarding plane device 500.
  • The computer-readable storage medium stores computer-executable instructions which, when executed, perform the above-mentioned master-standby automatic switching method, for example method steps S100 to S300 in FIG. 5, method step S110 in FIG. 6, method steps S120 to S140 in FIG. 7, method steps S400 to S500 in FIG. 8, method steps S410 to S420 in FIG. 9, method steps S600 to S700 in FIG. 10, method steps S800 to S900 in FIG. 11, and method steps S910 to S920 in FIG. 12.
  • The embodiments of the present application include a master-standby automatic switching method, a control plane device, a vBRAS system, and a computer-readable storage medium. The master-standby automatic switching method is applied to the second control plane device in the vBRAS system, and the vBRAS system also includes a first control plane device and a forwarding plane device. The first control plane device is provided with a first instance in a master state, and the second control plane device is provided with a second instance in a standby state. The first instance communicates with the forwarding plane device through a first channel, and the second instance communicates with the forwarding plane device through a second channel. The method includes: receiving fault information, where the fault information indicates that the failure rate of the first channel is greater than a first preset threshold; obtaining the current failure rate of the second channel according to the fault information; and, when the current failure rate of the second channel is less than or equal to a second preset threshold, controlling the second instance to switch from the standby state to the master state.
  • The second control plane device can detect the failure of the first control plane device in time from the fault information and, once it determines that it is capable of becoming master, switch the second instance from the standby state to the master state. This completes the automatic master-standby switchover between the control plane devices, thereby improving the disaster recovery performance of the vBRAS system and the user's network experience.
  • a processor, such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit.
  • Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
  • computer storage media includes both volatile and nonvolatile media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data.
  • Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store desired information and that can be accessed by a computer.
  • Communication media typically embody computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Hardware Redundancy (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

An automatic main/standby switching method, a control plane device, a vBRAS system and a storage medium. The automatic main/standby switching method is applied to a second control plane device in a vBRAS system. The vBRAS system further comprises a first control plane device and a forwarding plane device, wherein the first control plane device is provided with a first instance in a main state, the second control plane device is provided with a second instance in a standby state, the first instance communicates with the forwarding plane device by means of a first channel, and the second instance communicates with the forwarding plane device by means of a second channel. The method comprises: receiving fault information, wherein the fault information represents that a failure rate of a first channel is greater than a first preset threshold value (S100); acquiring the current failure rate of a second channel according to the fault information (S200); and when the current failure rate of the second channel is less than or equal to a second preset threshold value, controlling a second instance to switch from a standby state to a main state (S300).

Description

Master-standby automatic switching method, control plane device, vBRAS system and storage medium
Cross-Reference to Related Applications
This application is based on, and claims priority to, Chinese patent application No. 202110719886.5, filed on June 28, 2021, the entire content of which is incorporated herein by reference.
Technical Field
The present application relates to the technical field of communications, and in particular to a master-standby automatic switching method, a control plane device, a vBRAS system, and a computer-readable storage medium.
Background
A vBRAS (virtual Broadband Remote Access Server) is an emerging device form of the BRAS (Broadband Remote Access Server). By architecture, vBRAS systems fall mainly into two types: centralized and forwarding-control separated. A forwarding-control separated vBRAS system is one in which forwarding and control are separated, the control plane is virtualized and centralized, and virtual and physical forwarding planes coexist. Such a system typically includes a control plane device, a forwarding plane device, and a standardized interface between the control plane device and the forwarding plane device.
At present, the related art has the shortcoming that control plane devices cannot switch over automatically. If a single control plane device fails because of, for example, a power outage or fire in the equipment room, the peer control plane device cannot detect the failure and switch over in time, so the user's services fail and the user's network experience suffers.
Summary
The present application aims to solve at least one of the technical problems existing in the prior art. To this end, the present application proposes a master-standby automatic switching method, a control plane device, a vBRAS system, and a computer-readable storage medium.
In a first aspect, an embodiment of the present application provides a master-standby automatic switching method applied to a second control plane device in a vBRAS system. The vBRAS system further includes a first control plane device and a forwarding plane device. The first control plane device is provided with a first instance in a master state, and the second control plane device is provided with a second instance in a standby state. The first instance communicates with the forwarding plane device through a first channel, and the second instance communicates with the forwarding plane device through a second channel. The method includes: receiving fault information, where the fault information indicates that the failure rate of the first channel is greater than a first preset threshold; obtaining the current failure rate of the second channel according to the fault information; and, when the current failure rate of the second channel is less than or equal to a second preset threshold, controlling the second instance to switch from the standby state to the master state.
In a second aspect, an embodiment of the present application further provides a control plane device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the master-standby automatic switching method described in the embodiment of the first aspect above.
In a third aspect, an embodiment of the present application further provides a vBRAS system, including the control plane device described in the embodiment of the second aspect above.
In a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium storing computer-executable instructions, where the computer-executable instructions are used to cause a computer to execute the master-standby automatic switching method described in the embodiment of the first aspect above.
Additional aspects and advantages of the present application will be set forth in part in the description that follows, and in part will become apparent from that description or be learned through practice of the present application.
Brief Description of the Drawings
The accompanying drawings provide a further understanding of the technical solution of the present application and constitute a part of the specification. Together with the embodiments of the present application, they serve to explain the technical solution of the present application and do not limit it.
FIG. 1 is a schematic diagram of a system architecture platform for the master-standby automatic switching method provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of an application scenario of the master-standby switching method provided by an embodiment of the present application;
FIG. 3 is a schematic networking diagram of the master-standby switching method provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of the state transitions of the second instance provided by an embodiment of the present application;
FIG. 5 is a diagram of the specific steps of the master-standby automatic switching method provided by an embodiment of the present application;
FIG. 6 is a diagram of the specific steps of the master-standby automatic switching method provided by another embodiment of the present application;
FIG. 7 is a diagram of the specific steps of the master-standby automatic switching method provided by another embodiment of the present application;
FIG. 8 is a diagram of the specific steps of the master-standby automatic switching method provided by another embodiment of the present application;
FIG. 9 is a diagram of the specific steps of the master-standby automatic switching method provided by another embodiment of the present application;
FIG. 10 is a diagram of the specific steps of the master-standby automatic switching method provided by another embodiment of the present application;
FIG. 11 is a diagram of the specific steps of the master-standby automatic switching method provided by another embodiment of the present application;
FIG. 12 is a diagram of the specific steps of the master-standby automatic switching method provided by another embodiment of the present application.
Detailed Description
To make the purpose, technical solution, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it.
It should be noted that although functional modules are divided in the device schematic diagrams and a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed with a module division different from that in the device, or in an order different from that in the flowcharts. The terms "first", "second", and the like in the specification, the claims, and the above drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence.
As an emerging device form of the BRAS, the vBRAS can be divided by architecture mainly into centralized and forwarding-control separated types. A forwarding-control separated vBRAS system draws on the technical ideas of SDN (Software Defined Network) and NFV (Network Functions Virtualization) and combines the technical advantages of CT (Communication Technology) and IT (Information Technology). According to the scenario requirements of operators' actual applications, it separates forwarding from control, virtualizes and centralizes the control plane, and lets virtual and physical forwarding planes coexist. Such a system typically includes a control plane device, a forwarding plane device, and a standardized interface between the control plane device and the forwarding plane device.
At present, the related art has the shortcoming that multiple control plane devices cannot switch over automatically among themselves. If a single control plane device fails because of, for example, a power outage or fire in the equipment room, the peer control plane device cannot detect the failure and switch over in time, so the user's services fail and the user's network experience suffers.
In view of the above, the embodiments of the present application provide a master-standby automatic switching method, a control plane device, a vBRAS system, and a computer-readable storage medium. The master-standby automatic switching method is applied to a second control plane device in a vBRAS system, and the vBRAS system further includes a first control plane device and a forwarding plane device. The first control plane device is provided with a first instance in a master state, and the second control plane device is provided with a second instance in a standby state. The first instance communicates with the forwarding plane device through a first channel, and the second instance communicates with the forwarding plane device through a second channel. The method includes but is not limited to the following steps: receiving fault information, where the fault information indicates that the failure rate of the first channel is greater than a first preset threshold; obtaining the current failure rate of the second channel according to the fault information; and, when the current failure rate of the second channel is less than or equal to a second preset threshold, controlling the second instance to switch from the standby state to the master state.
According to the solution provided by the embodiments of the present application, the second control plane device can detect the failure of the first control plane device in time from the fault information and, once it determines that it is capable of becoming master, switch the second instance from the standby state to the master state. This completes the automatic master-standby switchover between the control plane devices, thereby improving the disaster recovery performance of the vBRAS system and the user's network experience.
The embodiments of the present application are further described below with reference to the accompanying drawings.
As shown in FIG. 1, FIG. 1 is a schematic diagram of a system architecture platform for executing the master-standby automatic switching method provided by an embodiment of the present application.
In the example of FIG. 1, the system architecture platform is provided with a processor 100 and a memory 200. The processor 100 and the memory 200 may be connected via a bus or in other ways; connection via a bus is taken as the example in FIG. 1.
As a non-transitory computer-readable storage medium, the memory 200 can be used to store non-transitory software programs and non-transitory computer-executable programs. In addition, the memory 200 may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or another non-transitory solid-state storage device. In some implementations, the memory 200 may optionally include memories located remotely from the processor 100, and these remote memories may be connected to the system architecture platform through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Those skilled in the art can understand that the system architecture platform can be applied to 3G communication network systems, LTE communication network systems, 5G communication network systems, subsequently evolved mobile communication network systems, and the like, which is not specifically limited in this embodiment.
Those skilled in the art can understand that the system architecture platform shown in FIG. 1 does not limit the embodiments of the present application; it may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
In the system architecture platform shown in FIG. 1, the processor 100 can call an information processing program stored in the memory 200 to execute the master-standby automatic switching method.
Based on the above system architecture platform, various embodiments of the master-standby automatic switching method of the present application are set out below.
Referring to FIG. 3, by way of example, the master-standby automatic switching method is applied to the second control plane device 400 in the vBRAS system. The vBRAS system also includes a first control plane device 300 and a forwarding plane device 500. The first control plane device 300 is provided with a first instance in a master state, and the second control plane device 400 is provided with a second instance in a standby state. The first instance communicates with the forwarding plane device 500 through a first channel (not shown in the figure), and the second instance communicates with the forwarding plane device 500 through a second channel (not shown in the figure).
Referring to FIG. 5, the master-standby automatic switching method specifically includes but is not limited to the following steps S100, S200 and S300.
Step S100: Receive fault information, where the fault information indicates that the failure rate of the first channel is greater than a first preset threshold.
It should be noted that the fault information may be generated by the first control plane device 300 and sent to the second control plane device 400, or may be generated by the second control plane device 400 itself from information sent by the forwarding plane device 500; this embodiment does not limit this.
It should be noted that the first preset threshold is the channel failure rate threshold preset in the first instance. The first preset threshold may be 50, 100, or another value; this embodiment does not limit this.
Step S200: Obtain the current failure rate of the second channel according to the fault information.
It should be noted that when the second control plane device 400 receives fault information indicating that the failure rate of the first channel is greater than the first preset threshold, the first control plane device 300 is in a fault state. That is, the forwarding plane device 500 communicating with the first instance needs another control plane device to become master and take it over. The second control plane device 400 therefore obtains the current failure rate of the second channel to determine whether the second instance is capable of becoming master.
Step S300: When the current failure rate of the second channel is less than or equal to a second preset threshold, control the second instance to switch from the standby state to the master state.
It should be noted that when the current failure rate of the second channel is less than or equal to the second preset threshold, the second control plane device 400 is capable of becoming master. It can then switch the second instance from the standby state to the master state, thereby taking over the forwarding plane device 500 that communicated with the first instance.
It should be noted that the second preset threshold is the channel failure rate threshold preset in the second instance. The second preset threshold may be 50, 100, or another value; this embodiment does not limit this.
It can be understood that, through steps S100 to S300, the second control plane device 400 can detect the failure of the first control plane device 300 in time from the fault information and, once it determines that it is capable of becoming master, switch the second instance from the standby state to the master state. This completes the automatic master-standby switchover between the control plane devices, thereby improving the disaster recovery performance of the vBRAS system and the user's network experience.
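The decision made in steps S100 to S300 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `FaultInfo` fields, the `next_state` function, and the numeric scale of the failure rates are all assumptions introduced here for illustration, since the source does not define a message format or units.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    STANDBY = "standby"
    MASTER = "master"


@dataclass
class FaultInfo:
    # Hypothetical message shape; the source does not define the fields.
    first_channel_failure_rate: float


def next_state(state: State, fault: FaultInfo, first_threshold: float,
               second_channel_failure_rate: float,
               second_threshold: float) -> State:
    """Return the second instance's state after evaluating S100 to S300."""
    if state is not State.STANDBY:
        return state  # only a standby instance is a candidate for promotion
    # S100: the fault information must report a first-channel failure rate
    # strictly greater than the first preset threshold.
    if fault.first_channel_failure_rate <= first_threshold:
        return state
    # S200/S300: promote only if the second channel's current failure rate
    # is less than or equal to the second preset threshold.
    if second_channel_failure_rate <= second_threshold:
        return State.MASTER
    return state
```

Note the asymmetry in the comparisons: the first channel must be strictly above its threshold, while the second channel only needs to be at or below its own. The intermediate recovering state introduced in steps S400 and S500 is deliberately left out of this sketch.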
It is worth noting that the master-standby automatic switching method can be used to manage the master and standby states of two control plane devices, so that automatic master-standby switchover can take place between them. According to the scenario requirements of operators' actual applications, it gives vBRAS system products the ability to switch automatically across control plane devices, thereby improving their disaster recovery performance and reliability. In addition, because the control plane device in this technical solution can identify a failure of the peer control plane device in time and perform the master-standby switchover automatically, the operating stability of vBRAS system products is greatly improved and the user's network experience is optimized, which lowers the user complaint rate and reduces customers' operation and maintenance costs. This is of great significance to the subsequent development of vBRAS-related technologies.
Referring to FIG. 3 and FIG. 6, by way of example, the above step S100 specifically includes but is not limited to the following step S110.
Step S110: Receive the fault information sent by the first control plane device 300 over a heartbeat line, where the first control plane device 300 and the second control plane device 400 communicate with each other over the heartbeat line.
Specifically, when the first channel between the first control plane device 300 and the forwarding plane device 500 is disconnected, so that the failure rate of the first channel exceeds the first preset threshold of the first instance, the first control plane device 300 generates fault information and sends it to the second control plane device 400 over the heartbeat line connecting the two devices. When the second control plane device 400 receives the fault information over the heartbeat line, it can determine that the first control plane device 300 is in a fault state. The second control plane device 400 then determines whether it is capable of becoming master and, if so, becomes master.
Referring to FIG. 7, by way of example, the above step S100 specifically includes but is not limited to the following steps S120, S130 and S140.
Step S120: Receive a fault event sent by the forwarding plane device 500 through the second channel, where the fault event indicates that the first channel is faulty.
Step S130: Calculate the failure rate of the first channel according to the fault event.
Step S140: When the failure rate of the first channel is greater than the first preset threshold, generate the fault information.
Specifically, when the first control plane device 300 suddenly loses power or becomes unreachable, the heartbeat line between the first control plane device 300 and the second control plane device 400 is disconnected, and the first channel between the first control plane device 300 and the forwarding plane device 500 is disconnected as well. In this case the second control plane device 400 receives the fault event sent by the forwarding plane device 500, calculates the failure rate of the first channel from the fault event, and determines whether that failure rate is greater than the first preset threshold. If so, it can determine that the first control plane device 300 is in a fault state. The second control plane device 400 then determines whether it is capable of becoming master and, if so, becomes master.
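Steps S120 to S140 can be sketched in Python as follows. The source does not say how the failure rate is computed from fault events, so this sketch assumes, purely for illustration, that the rate is the percentage of forwarding plane devices served by the first instance that have reported a first-channel fault. The function names and the dictionary used as fault information are hypothetical.

```python
from typing import Iterable, Optional, Set


def first_channel_failure_rate(fault_events: Iterable[str],
                               managed_ups: Set[str]) -> float:
    """S130 (assumed formula): percentage of managed forwarding plane
    devices that reported a fault on the first channel."""
    if not managed_ups:
        return 0.0
    # Each event carries the ID of the reporting forwarding plane device;
    # duplicates and unknown IDs are ignored.
    faulty = {up for up in fault_events if up in managed_ups}
    return 100.0 * len(faulty) / len(managed_ups)


def build_fault_info(fault_events: Iterable[str], managed_ups: Set[str],
                     first_threshold: float) -> Optional[dict]:
    """S140: generate fault information only when the computed rate is
    strictly greater than the first preset threshold."""
    rate = first_channel_failure_rate(fault_events, managed_ups)
    if rate > first_threshold:
        return {"first_channel_failure_rate": rate}
    return None
```

For example, with three managed forwarding plane devices of which two report faults, the assumed rate is about 66.7, which exceeds a threshold of 50 but not a threshold of 100.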
Referring to FIG. 2 to FIG. 4 and FIG. 8, by way of example, the vBRAS system further includes a database 600, and the first control plane device 300 and the second control plane device 400 each communicate with the database 600. After the above step S300, the method further includes, but is not limited to, the following steps S400 and S500.
Step S400: Control the second instance to switch from the standby state to the recovering state, where the recovering state is used by the second instance to retrieve the user data of the first instance from the database 600.
Step S500: When the second instance has finished retrieving the user data, control the second instance to switch from the recovering state to the master state.
Specifically, when the second control plane device 400 performs the promotion to master, it first switches the state of the second instance from the standby state to the recovering state. It then pulls from the database 600 the online users of the forwarding plane device 500 that communicates with the first control plane device 300, and retrieves and restores these online users one by one to the second control plane device 400. Once all of these online users have been restored, the second control plane device 400 switches the second instance from the recovering state to the master state.
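The state sequence of steps S400 and S500 can be illustrated with a small sketch. The database is modelled here as a plain dictionary keyed by instance name, and the names `State`, `Instance`, and `promote` are hypothetical; the application does not describe the database schema or any programming interface.

```python
from enum import Enum


class State(Enum):
    SLAVE = "slave"        # standby state
    RECOVERY = "recovery"  # recovering state
    MASTER = "master"      # master state


class Instance:
    """Hypothetical model of the second instance."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state = State.SLAVE
        self.users: dict[str, dict] = {}
        self.history = [State.SLAVE]

    def _enter(self, state: State) -> None:
        self.state = state
        self.history.append(state)

    def promote(self, database: dict[str, dict[str, dict]]) -> None:
        if self.state is not State.SLAVE:
            raise RuntimeError("only a standby instance is promoted")
        self._enter(State.RECOVERY)                    # step S400
        for user_id, record in database.get(self.name, {}).items():
            self.users[user_id] = dict(record)         # restore users one by one
        self._enter(State.MASTER)                      # step S500
```

The instance passes through the recovering state even when the database holds no users for it, so an observer always sees the same three-state sequence.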
Referring to FIG. 9, by way of example, after the above step S400, the method specifically includes, but is not limited to, the following steps S410 and S420.
Step S410: Generate a pointing switch instruction.
Step S420: Send the pointing switch instruction to the forwarding plane device 500, so that the encapsulation/decapsulation table of the forwarding plane device 500 points to the second instance, or so that the channel link of the forwarding plane device 500 points to the second instance.
Specifically, after the second control plane device 400 switches the second instance from the standby state to the recovering state, it generates a pointing switch instruction and sends it to the forwarding plane device 500, so that the encapsulation/decapsulation table of the forwarding plane device 500 points to the second instance; alternatively, it sends the pointing switch instruction to the forwarding plane device 500 so that the channel link of the forwarding plane device 500 points to the second instance.
Referring to FIG. 10, by way of example, after the above step S300, the method further includes, but is not limited to, the following steps S600 and S700.
Step S600: Generate a state switching instruction.
Step S700: Send the state switching instruction to the first instance, so that the first instance switches from the master state to the standby state.
Specifically, once the current failure rate of the second channel is found to be less than or equal to the second preset threshold, the second control plane device 400 generates a state switching instruction and sends it to the first instance, so that the first instance switches from the master state to the standby state. It should be noted that, at this point, the first control plane device 300 may not receive the state switching instruction because of its failure. However, because the first instance and the second instance are independent of each other and do not affect each other, this does not prevent the second control plane device 400 from performing the promotion to master.
Referring to FIG. 11, by way of example, after the above step S300, the method further includes, but is not limited to, the following steps S800 and S900.
Step S800: When the failure rate of the first channel recovers to less than or equal to the first preset threshold, obtain the first priority of the first instance and the second priority of the second instance, and send the second priority to the first control plane device 300.
Step S900: Compare the first priority with the second priority, and control the states of the second instance and the first instance according to the comparison result.
It should be noted that, after the second control plane device 400 switches the second instance from the standby state to the master state, the first control plane device 300 may not have received the state switching instruction from the second control plane device 400 while it was faulty. Therefore, when the first control plane device 300 returns to normal, the first instance and the second instance may both be in the master state, and the first control plane device 300 and the second control plane device 400 need to negotiate with each other to decide which control plane device finally holds the master state. Specifically, when the failure rate of the first channel recovers to less than or equal to the first preset threshold, the second control plane device 400 obtains the first priority of the first instance and the second priority of the second instance, sends the second priority to the first control plane device 300, then compares the first priority with the second priority, and controls the states of the second instance and the first instance according to the comparison result.
It should be noted that the first priority and the second priority are both preset values, for example 100 or 200; this embodiment does not limit them.
Referring to FIG. 12, by way of example, the above step S900 specifically includes, but is not limited to, the following steps S910 and S920.
Step S910: When the first priority is higher than the second priority, control the second instance to return from the master state to the standby state, so that the first control plane device 300 keeps the first instance in the master state according to the first priority and the second priority.
Step S920: When the first priority is lower than the second priority, keep the second instance in the master state, so that the first control plane device 300 switches the first instance from the master state to the standby state according to the first priority and the second priority.
Specifically, with neither the first control plane device 300 nor the second control plane device 400 faulty, that is, with both in a normal working state: when the first priority is higher than the second priority, the second control plane device 400 returns the second instance from the master state to the standby state, so that the first control plane device 300 keeps the first instance in the master state according to the first priority and the second priority; when the first priority is lower than the second priority, the second control plane device 400 keeps the second instance in the master state, so that the first control plane device 300 switches the first instance from the master state to the standby state according to the first priority and the second priority.
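The comparison in steps S910 and S920 amounts to a two-way decision, sketched below. The function name `resolve_dual_master` and the returned dictionary are illustrative assumptions. The text does not say what happens when the two priorities are equal, so this sketch raises an error in that case rather than guessing a tie-break rule.

```python
def resolve_dual_master(first_priority: int, second_priority: int) -> dict[str, str]:
    """Return the final state of each instance after negotiation (hypothetical helper)."""
    if first_priority > second_priority:
        # Step S910: the second instance returns to standby, the first stays master.
        return {"first_instance": "master", "second_instance": "slave"}
    if first_priority < second_priority:
        # Step S920: the second instance stays master, the first switches to standby.
        return {"first_instance": "slave", "second_instance": "master"}
    # Equal priorities are not covered by the text.
    raise ValueError("tie-breaking for equal priorities is not specified")
```

Because both control plane devices hold both priorities after step S800, each side can evaluate the same comparison independently and reach the same result.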
It can be understood that, when both the first control plane device 300 and the second control plane device 400 have the preemptive promotion switch enabled and the second priority is raised so that it is higher than the first priority, the second control plane device 400 automatically performs the promotion to master even if the first control plane device 300 is not faulty.
Based on the embodiments of the master-standby automatic switching method described above, an overall embodiment of the master-standby automatic switching method of the present application is presented below.
Referring to FIG. 2 and FIG. 3, the vBRAS system includes a first control plane device 300, a second control plane device 400, and a forwarding plane device 500. By way of example, as shown in FIG. 3, CP1 represents the first control plane device 300, CP2 represents the second control plane device 400, and UP1 to UP4 represent forwarding plane devices 500. CP1 and CP2 contain the same instance, instance1: instance1 in CP1 is the first instance, and instance1 in CP2 is the second instance. When CP1 is normal, the first instance is in the master state and the second instance is in the standby state; UP1 and UP2 are connected to the first instance in CP1 through the first channel and to the second instance in CP2 through the second channel. Because the forwarding plane device 500 always sends user data to the control plane device whose instance is in the master state, the first control plane device 300 manages UP1 and UP2 while it is normal. When a user dials in to go online, the online packets of users whose physical network is connected to UP1 and UP2 are delivered to the first control plane device 300, which saves the user information to the database 600 after processing the online packets.
It should be noted that one control plane device can be configured with multiple geo-backup-instance instances, each with its own independent master/standby state management, as follows.
1. A command supports setting the switching mode of a geo-backup-instance instance to automatic: switch-mode auto.
2. A command supports configuring the priority (priority) of a geo-backup-instance instance; the range is 1 to 254.
3. A command supports configuring the upper limit (threshold) of the channel failure rate that a geo-backup-instance instance applies to the forwarding plane devices 500 it manages; the range is 1 to 100.
4. A command supports configuring the delay (delay-time) between the moment a geo-backup-instance instance decides that automatic promotion to master is needed and the moment it performs the promotion; the range is 240 to 3600 seconds.
5. A command supports configuring whether the preemptive promotion switch of a geo-backup-instance instance is enabled: preempt enable/disable.
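The five per-instance parameters and their stated ranges can be captured in a small validated record, sketched below. The class name `GeoBackupInstanceConfig` and the field names are illustrative assumptions that mirror the command keywords; the actual command-line syntax of the device is not given in the text and is not reproduced here.

```python
from dataclasses import dataclass


def _check_range(name: str, value: int, low: int, high: int) -> None:
    if not low <= value <= high:
        raise ValueError(f"{name} must be in {low}..{high}, got {value}")


@dataclass(frozen=True)
class GeoBackupInstanceConfig:
    """Hypothetical container for one geo-backup-instance configuration."""
    switch_mode: str   # "auto" is the only mode named in the text
    priority: int      # 1 to 254
    threshold: int     # channel failure rate upper limit, 1 to 100
    delay_time: int    # seconds between deciding and promoting, 240 to 3600
    preempt: bool      # preemptive promotion switch

    def __post_init__(self) -> None:
        _check_range("priority", self.priority, 1, 254)
        _check_range("threshold", self.threshold, 1, 100)
        _check_range("delay_time", self.delay_time, 240, 3600)
```

No default values are set in this sketch because the text states only the ranges, not the defaults.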
It should be noted that the forwarding plane device 500 supports reporting the state of the first channel between itself and the first control plane device 300, such as an OpenFlow channel, to the second control plane device 400, so that the second control plane device 400 can use the messages reported by the forwarding plane device 500 to determine whether the current failure rate of the first channel between the peer first control plane device 300 and the forwarding plane device 500 has exceeded the first preset threshold.
It should be noted that the first control plane device 300 and the second control plane device 400 advertise to each other, over the sib heartbeat line, the configuration under each geo-backup-instance instance; the same instance is allowed to have different parameters configured on the two control plane devices.
By way of example, two instances, instance1 and instance2, are configured on CP1 and CP2. On CP1, instance1 is in the master state and instance2 is in the standby state; on CP2, instance1 is in the standby state and instance2 is in the master state. instance1 on CP1 and CP2 is configured in automatic mode, and the OpenFlow channels between CP1 and the UPs and between CP2 and the UPs are all healthy. The priority of instance1 on CP1 is configured as 200, and the priority of instance2 on CP1 is configured as 100. It can be seen that instance1 on CP1 is the master and instance1 on CP2 is the standby (slave).
On CP1, the first preset threshold (threshold) of the OpenFlow channel is configured by command as 40, and on CP2 the second preset threshold (threshold) of the OpenFlow channel is likewise configured by command as 40. This value means that when the failure rate of CP1's OpenFlow channels exceeds 40%, CP1 can be judged abnormal and CP2 needs to take over the UPs below it.
A user dials in to go online. The online packets of users whose physical network is connected to UP1 and UP2 are delivered to CP1, and CP1 writes the user information tables into the database 600.
When CP1 needs to be restarted or fails unexpectedly (for example, the CP1 server goes down, the equipment room loses power, or the link of CP1's outbound network interface fails), the first channel between CP1 and UP1 and UP2 is disconnected, and the sib heartbeat line between CP1 and CP2 is disconnected as well. CP2 then needs to take over UP1 and UP2 without affecting online users.
When UP1 and UP2 detect that their first channel to CP1 is disconnected, they report this event to CP2. After receiving the messages reported by the UPs, CP2 calculates that the failure rate of CP1's first channel exceeds the first preset threshold (40%). CP2 then obtains the current failure rate of the second channel and, because this failure rate is less than or equal to the second preset threshold, knows that its own second channel is healthy. CP2 therefore decides to promote itself to master and takes over UP1 and UP2.
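CP2's decision combines two threshold checks, which the following sketch makes explicit. The function name `should_promote` and its parameter names are illustrative; rates and thresholds are percentages, as in the example configuration of 40.

```python
def should_promote(first_channel_rate: float, first_threshold: float,
                   second_channel_rate: float, second_threshold: float) -> bool:
    """Decide whether the standby side promotes itself (hypothetical helper)."""
    # The peer is judged faulty only when its rate strictly exceeds the first threshold.
    peer_faulty = first_channel_rate > first_threshold
    # The local side is judged healthy when its rate is at or below the second threshold.
    self_healthy = second_channel_rate <= second_threshold
    return peer_faulty and self_healthy
```

The second check keeps a standby device whose own channels to the UPs are broken from taking over, since it could not manage those UPs anyway.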
After CP2 performs the promotion to master, the state of instance1 on CP2 becomes recovering (recovery), which indicates that CP2 is restoring data. CP2 sends a state switching instruction to CP1, ordering instance1 on CP1 to change to the standby state (slave). In practice CP1 is already down at this point and does not receive the message, but this does not prevent CP2 from continuing the promotion. At the same time, CP2 sends a pointing switch instruction to UP1 and UP2 through the second channel, so that UP1 and UP2 point their NSH encapsulation/decapsulation tables to CP2 and point their channel links to the second instance on CP2. In addition, after UP1 and UP2 receive the promotion message from CP2, they start aging the user tables, network segment routes, and other data on UP1 and UP2, and then wait for CP2 to deliver the service data again.
In addition, after CP2 performs the promotion to master, instance1 on CP2, while in the recovering (recovery) state, pulls the already online users of UP1 and UP2 from the database 600 and restores them to CP2. Each time CP2 retrieves a user, it synchronizes that user to UP1 or UP2, and UP1 and UP2 stop aging that user once they receive the synchronization information. After CP2 has restored all users from the database 600, the state of CP2 changes from recovering (recovery) to master, which completes the master/standby switchover between CP1 and CP2.
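The aging behaviour on the forwarding plane side can be sketched as follows. The text states that aging starts when the promotion message arrives and stops per user on synchronization; it does not say what happens to entries that are never synchronized, so the `expire_aged` step below (dropping them when the aging period ends) is an assumption. The class and method names are hypothetical.

```python
class ForwardingPlaneTable:
    """Hypothetical user table on a forwarding plane device."""

    def __init__(self, users: set[str]) -> None:
        self.users = set(users)
        self.aging: set[str] = set()

    def on_master_promoted(self) -> None:
        # Every existing entry starts aging but stays usable for forwarding.
        self.aging = set(self.users)

    def on_user_synced(self, user_id: str) -> None:
        # The new master re-delivers one user; that user stops aging.
        self.users.add(user_id)
        self.aging.discard(user_id)

    def expire_aged(self) -> None:
        # Assumed behaviour: entries still aging when the timer ends are removed.
        self.users -= self.aging
        self.aging.clear()
```

Keeping the entries in place while they age is what lets user traffic continue to be forwarded during the recovering state.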
It can be understood that, while instance1 on CP2 is in the recovering (recovery) state, CP2 needs to prevent new users of UP1 and UP2 from going online, because the resources occupied by newly online users may conflict with those of the users under instance1 that are about to be restored from the database 600.
It can be understood that, while instance1 on CP2 is in the recovering (recovery) state, the load sharing mechanism ensures that instance2 on CP2 remains in the master state unaffected, and new users can still go online normally on UP3 and UP4.
It can be understood that, while instance1 on CP2 is in the recovering (recovery) state, the user forwarding tables on UP1 and UP2 still exist, so the users' upstream and downstream traffic remains normal and their Internet access and experience are not affected.
It can be understood that, if a network failure occurs on the CP1 server, CP1 cannot receive the message in which CP2 orders CP1 to become the standby. After CP2 enters the master state, CP1 and CP2 are therefore both in the master state, which is a dual-master situation. When the network of the CP1 server is restored, CP1 and CP2 negotiate to decide the master CP and the standby CP, and the CP with the higher priority becomes the final master CP.
Based on the master-standby automatic switching method of the embodiments of the first aspect above, embodiments of the control plane device of the second aspect of the present application are presented below.
An embodiment of the present application provides a control plane device, which includes a memory 200, a processor 100, and a computer program stored in the memory 200 and executable on the processor 100.
The processor 100 and the memory 200 may be connected through a bus or in other ways.
It should be noted that the controller in this embodiment may correspondingly include the memory 200 and the processor 100 of the embodiment shown in FIG. 1 and can form part of the system architecture platform of the embodiment shown in FIG. 1. The two belong to the same inventive concept and therefore share the same implementation principle and beneficial effects, which are not described in detail here.
The non-transitory software programs and instructions required to implement the master-standby automatic switching method of the above embodiments are stored in the memory 200. When executed by the processor 100, they perform the master-standby automatic switching method of the above embodiments, for example, method steps S100 to S300 in FIG. 5, method step S110 in FIG. 6, method steps S120 to S140 in FIG. 7, method steps S400 to S500 in FIG. 8, method steps S410 to S420 in FIG. 9, method steps S600 to S700 in FIG. 10, method steps S800 to S900 in FIG. 11, and method steps S910 to S920 in FIG. 12, as described above.
The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
It can be understood that, because the control plane device of the embodiments of the second aspect of the present application and the master-standby automatic switching method of any embodiment of the first aspect above belong to the same inventive concept, the specific implementations and technical effects of the control plane device of the embodiments of the second aspect can be found in those of the master-standby automatic switching method of any embodiment of the first aspect above, and are not repeated here.
Based on the control plane device of the embodiments of the second aspect above, embodiments of the vBRAS system of the third aspect of the present application are presented below.
Specifically, the vBRAS system of the embodiments of the present application is a vBRAS system with separated forwarding and control planes. It includes the control plane device of the embodiments of the second aspect above, and further includes a forwarding plane device 500 and at least one other control plane device, with a standardized interface provided between the control plane device and the forwarding plane device 500.
It can be understood that, because the vBRAS system of the embodiments of the third aspect of the present application and the control plane device of any embodiment of the second aspect above belong to the same inventive concept, the specific implementations and technical effects of the vBRAS system of the embodiments of the third aspect can be found in those of the control plane device of any embodiment of the second aspect above, and are not repeated here.
Based on the master-standby automatic switching method of the embodiments of the first aspect above, embodiments of the computer-readable storage medium of the fourth aspect of the present application are presented below.
The computer-readable storage medium stores computer-executable instructions, and the computer-executable instructions are used to perform the master-standby automatic switching method described above, for example, method steps S100 to S300 in FIG. 5, method step S110 in FIG. 6, method steps S120 to S140 in FIG. 7, method steps S400 to S500 in FIG. 8, method steps S410 to S420 in FIG. 9, method steps S600 to S700 in FIG. 10, method steps S800 to S900 in FIG. 11, and method steps S910 to S920 in FIG. 12, as described above.
The embodiments of the present application include a master-standby automatic switching method, a control plane device, a vBRAS system, and a computer-readable storage medium. The master-standby automatic switching method is applied to a second control plane device in a vBRAS system. The vBRAS system further includes a first control plane device and a forwarding plane device; the first control plane device is provided with a first instance in the master state, and the second control plane device is provided with a second instance in the standby state; the first instance communicates with the forwarding plane device through a first channel, and the second instance communicates with the forwarding plane device through a second channel. The method includes: receiving fault information, where the fault information indicates that the failure rate of the first channel is greater than a first preset threshold; obtaining the current failure rate of the second channel according to the fault information; and, when the current failure rate of the second channel is less than or equal to a second preset threshold, controlling the second instance to switch from the standby state to the master state. With the solution provided by the embodiments of the present application, the second control plane device can detect a failure of the first control plane device promptly from the fault information and, having determined that it is capable of being promoted to master, switch the second instance from the standby state to the master state. This completes the automatic master/standby switchover between the control plane devices, which improves the disaster recovery performance of the vBRAS system and the users' network experience.
Those of ordinary skill in the art will understand that all or some of the steps and systems in the methods disclosed above may be implemented as software, firmware, hardware, or an appropriate combination thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically contain computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
The embodiments of the present application have been described in detail above with reference to the accompanying drawings, but the present application is not limited to these embodiments. Within the knowledge of those of ordinary skill in the art, various changes may be made without departing from the purpose of the present application.

Claims (11)

  1. A master-standby automatic switching method, applied to a second control plane device in a vBRAS system, the vBRAS system further comprising a first control plane device and a forwarding plane device, the first control plane device being provided with a first instance in a master state, the second control plane device being provided with a second instance in a standby state, the first instance communicating with the forwarding plane device through a first channel, and the second instance communicating with the forwarding plane device through a second channel, the method comprising:
    receiving fault information, wherein the fault information indicates that a failure rate of the first channel is greater than a first preset threshold;
    obtaining a current failure rate of the second channel according to the fault information;
    when the current failure rate of the second channel is less than or equal to a second preset threshold, controlling the second instance to switch from the standby state to the master state.
  2. The method according to claim 1, wherein the receiving fault information comprises:
    receiving fault information sent by the first control plane device through a heartbeat line, wherein the first control plane device and the second control plane device communicate with each other through the heartbeat line.
  3. The method according to claim 1, wherein the receiving fault information comprises:
    receiving a fault event sent by the forwarding plane device through the second channel, wherein the fault event indicates that the first channel is faulty;
    calculating the failure rate of the first channel according to the fault event; and
    when the failure rate of the first channel is greater than the first preset threshold, generating the fault information.
  4. The method according to claim 1, wherein the vBRAS system further comprises a database, the first control plane device and the second control plane device each communicate with the database, and the controlling the second instance to switch from the standby state to the master state comprises:
    controlling the second instance to switch from the standby state to a recovering state, wherein the recovering state is used for the second instance to extract user data of the first instance from the database; and
    when the second instance has finished extracting the user data, controlling the second instance to switch from the recovering state to the master state.
  5. The method according to claim 4, wherein, after the controlling the second instance to switch from the standby state to the recovering state, the method further comprises:
    generating a pointing switching instruction; and
    sending the pointing switching instruction to the forwarding plane device, so that an encapsulation/decapsulation table of the forwarding plane device points to the second instance, or so that a channel link of the forwarding plane device points to the second instance.
  6. The method according to claim 1, wherein, after the current failure rate of the second channel is determined to be less than or equal to the second preset threshold, the method further comprises:
    generating a state switching instruction; and
    sending the state switching instruction to the first instance, so that the first instance switches from the master state to the standby state.
  7. The method according to claim 1, wherein, after the second instance switches from the standby state to the master state, the method further comprises:
    when the failure rate of the first channel recovers to less than or equal to the first preset threshold, acquiring a first priority of the first instance and a second priority of the second instance, and sending the second priority to the first control plane device; and
    comparing the first priority with the second priority, and controlling the states of the second instance and the first instance according to the comparison result.
  8. The method according to claim 7, wherein the controlling the states of the second instance and the first instance according to the comparison result comprises:
    when the first priority is higher than the second priority, controlling the second instance to revert from the master state to the standby state, so that the first control plane device maintains the first instance in the master state according to the first priority and the second priority; and
    when the first priority is lower than the second priority, maintaining the second instance in the master state, so that the first control plane device controls the first instance to switch from the master state to the standby state according to the first priority and the second priority.
  9. A control plane device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the automatic master/standby switching method according to any one of claims 1 to 8.
  10. A vBRAS system, comprising the control plane device according to claim 9.
  11. A computer-readable storage medium storing computer-executable instructions, wherein the computer-executable instructions are used to execute the automatic master/standby switching method according to any one of claims 1 to 8.
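
The decision logic recited in claims 1, 4, 7 and 8 can be illustrated with a minimal Python sketch. This is not the applicant's implementation: the names `State`, `Instance`, `should_take_over`, `switch_to_master` and `resolve_after_recovery`, the numeric thresholds, and the `fetch_user_data` callable standing in for the database read are all hypothetical. For brevity the sketch updates both instances in one process, whereas in the claims each control plane device controls its own instance.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    STANDBY = "standby"
    RECOVERING = "recovering"
    MASTER = "master"


@dataclass
class Instance:
    name: str
    state: State
    priority: int = 0  # hypothetical numeric priority; larger means higher


def should_take_over(first_rate, second_rate, first_threshold, second_threshold):
    """Claim 1: take over only if the first channel's failure rate exceeds the
    first threshold and the second channel's rate is at or below the second."""
    fault_reported = first_rate > first_threshold
    return fault_reported and second_rate <= second_threshold


def switch_to_master(second, fetch_user_data):
    """Claim 4: standby -> recovering -> master, loading user data in between."""
    second.state = State.RECOVERING
    user_data = fetch_user_data()  # assumed stand-in for reading the database
    second.state = State.MASTER
    return user_data


def resolve_after_recovery(first, second):
    """Claims 7-8: once the first channel recovers, the higher priority wins."""
    if first.priority > second.priority:
        first.state, second.state = State.MASTER, State.STANDBY
    elif first.priority < second.priority:
        first.state, second.state = State.STANDBY, State.MASTER
    # Equal priorities are not addressed by the claims; states stay unchanged.


# Usage: the first channel degrades while the second channel stays healthy.
first = Instance("first", State.MASTER, priority=10)
second = Instance("second", State.STANDBY, priority=5)
if should_take_over(0.4, 0.01, first_threshold=0.2, second_threshold=0.05):
    switch_to_master(second, fetch_user_data=lambda: {"users": []})
    first.state = State.STANDBY  # claim 6: first instance is told to step down
```

Checking the second channel before switching avoids promoting a standby instance whose own channel to the forwarding plane device is also unhealthy, and the intermediate recovering state keeps the second instance from serving as master before the user data has been loaded.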
PCT/CN2022/101589 2021-06-28 2022-06-27 Automatic main/standby switching method, control plane device, vbras system and storage medium WO2023274164A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110719886.5 2021-06-28
CN202110719886.5A CN115604087A (en) 2021-06-28 2021-06-28 Main/standby automatic switching method, control plane equipment, vBRAS system and storage medium

Publications (1)

Publication Number Publication Date
WO2023274164A1 (en) 2023-01-05

Family

ID=84690073

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/101589 WO2023274164A1 (en) 2021-06-28 2022-06-27 Automatic main/standby switching method, control plane device, vbras system and storage medium

Country Status (2)

Country Link
CN (1) CN115604087A (en)
WO (1) WO2023274164A1 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180123874A1 (en) * 2012-08-31 2018-05-03 Bce Inc. Ip mpls pop virtualization and fault tolerant virtual router
CN108512703A (en) * 2018-03-28 2018-09-07 新华三技术有限公司 BRAS turns backup method, device, equipment and the machine readable storage medium of control separation
CN110022236A (en) * 2019-05-30 2019-07-16 新华三技术有限公司 A kind of message forwarding method and device
CN111654384A (en) * 2019-09-27 2020-09-11 中兴通讯股份有限公司 Main/standby switching method, BRAS (broadband remote Access Server) equipment and storage medium
CN112367182A (en) * 2020-09-29 2021-02-12 新华三大数据技术有限公司 Configuration method and device of disaster recovery main and standby equipment
CN112367252A (en) * 2020-09-25 2021-02-12 新华三技术有限公司合肥分公司 Method and device for realizing disaster recovery backup
CN112887127A (en) * 2021-01-12 2021-06-01 烽火通信科技股份有限公司 vBRAS equipment and method for realizing transfer control separation

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180123874A1 (en) * 2012-08-31 2018-05-03 Bce Inc. Ip mpls pop virtualization and fault tolerant virtual router
CN108512703A (en) * 2018-03-28 2018-09-07 新华三技术有限公司 BRAS turns backup method, device, equipment and the machine readable storage medium of control separation
CN110022236A (en) * 2019-05-30 2019-07-16 新华三技术有限公司 A kind of message forwarding method and device
CN111654384A (en) * 2019-09-27 2020-09-11 中兴通讯股份有限公司 Main/standby switching method, BRAS (broadband remote Access Server) equipment and storage medium
CN112367252A (en) * 2020-09-25 2021-02-12 新华三技术有限公司合肥分公司 Method and device for realizing disaster recovery backup
CN112367182A (en) * 2020-09-29 2021-02-12 新华三大数据技术有限公司 Configuration method and device of disaster recovery main and standby equipment
CN112887127A (en) * 2021-01-12 2021-06-01 烽火通信科技股份有限公司 vBRAS equipment and method for realizing transfer control separation

Also Published As

Publication number Publication date
CN115604087A (en) 2023-01-13

Similar Documents

Publication Publication Date Title
US9853856B2 (en) Method and device for protecting service reliability and network virtualization system
US11734138B2 (en) Hot standby method, apparatus, and system
CN101951616B (en) Switching method, system and device for wireless controller
CN102137017B (en) Working method and device used for virtual network unit
CN109982447B (en) Wireless network networking method and system and wireless AP
US9385944B2 (en) Communication system, path switching method and communication device
CN112769587A (en) Forwarding method and device for access flow of dual-homing device and storage medium
US10911295B2 (en) Server apparatus, cluster system, cluster control method and program
CN104486128B (en) A kind of system and method for realizing redundancy heartbeat between dual controller node
CN110351127B (en) Graceful restart method, device and system
WO2011110135A2 (en) Master-standby switching method, system control unit and communication system
CN106911597B (en) Cross-board forwarding method and device
US20140050092A1 (en) Load sharing method and apparatus
CN112218321B (en) Master-slave link switching method, device, communication equipment and storage medium
CN111371680B (en) Route management method, device, equipment and storage medium for dual-computer hot standby
CN111654384A (en) Main/standby switching method, BRAS (broadband remote Access Server) equipment and storage medium
CN112511326A (en) Switching method, device, equipment and storage medium
EP3002906B1 (en) Method and device for updating radio network controller
CN114554615A (en) Service switching method, device and network equipment
CN102487332B (en) Fault processing method, apparatus thereof and system thereof
WO2023274164A1 (en) Automatic main/standby switching method, control plane device, vbras system and storage medium
WO2019134572A1 (en) Sdn-based optical transport network protection recovery method and device, and storage medium
CN114697195A (en) Fault processing method, transmission path adjusting method, network element and storage medium
CN108243052A (en) A kind of network system and the data transmission method for uplink based on network system
CN111224803B (en) Multi-master detection method in stacking system and stacking system

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22831978; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 22831978; Country of ref document: EP; Kind code of ref document: A1)