CN110391929B - Fault-tolerant control method and device and fault-tolerant component - Google Patents

Info

Publication number
CN110391929B
Authority
CN
China
Prior art keywords
controller
network element
controllers
state
fault
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810368148.9A
Other languages
Chinese (zh)
Other versions
CN110391929A (en
Inventor
张建辉
陈保正
胡涛
郭泽华
李艳捷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Gree Information Technology Co ltd
Original Assignee
Shenzhen Gree Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Gree Information Technology Co ltd
Priority to CN201810368148.9A
Publication of CN110391929A
Application granted
Publication of CN110391929B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0659 Management of faults, events, alarms or notifications using network fault recovery by isolating or reconfiguring faulty entities
    • H04L41/0661 Management of faults, events, alarms or notifications using network fault recovery by isolating or reconfiguring faulty entities by reconfiguring faulty entities
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/28 Routing or path finding of packets in data switching networks using route fault recovery

Abstract

The application belongs to the technical field of computer networks and provides a fault-tolerant control method, a fault-tolerant control device and a fault-tolerant component. The method comprises the following steps: acquiring the current states of at least two controllers and generating a set FC of controllers in a failure state; acquiring the data packet request amount of the network element devices managed by each controller in the FC; adaptively generating a reallocation scheme according to the data packet request amounts; and connecting the network element devices to controllers in an active state according to the reallocation scheme. In this process, when a controller failure is detected in the network, a reallocation scheme is generated adaptively according to the data packet request amounts of the network element devices managed by the failed controller, so that those devices are reconnected to, and communicate reliably with, a controller. The control plane can thus tolerate faults in real time when some of the controllers fail, and the network data analysis system can continue to run normally.

Description

Fault-tolerant control method and device and fault-tolerant component
Technical Field
The application belongs to the technical field of computer networks, and particularly relates to a fault-tolerant control method and device and a fault-tolerant component.
Background
As a novel network architecture, the Software Defined Network (SDN) completely decouples the data plane from the control plane, and more and more data centers, large-scale local area networks and the like are adopting SDN to improve network performance. The SDN framework can monitor network data at the control layer, and multiple controllers can be used in a large-scale network to improve the scalability of SDN. However, with the rapid growth in the number of underlying network element devices and in data traffic, controllers face a huge burden when processing the data flows generated by the network element devices, and they are prone to failure when their processing capacity is insufficient. In addition, hardware or software faults of the server can also cause a controller to fail, which seriously threatens the normal operation of the network.
Disclosure of Invention
In view of this, embodiments of the present application provide a fault-tolerant control method and apparatus, and a fault-tolerant component, so as to solve the problems in the prior art that the control plane in an SDN framework has poor fault tolerance and that controller failures affect the normal operation of the network.
A first aspect of an embodiment of the present application provides a fault-tolerant control method, where the fault-tolerant control method includes:
acquiring the current states of at least two controllers, and generating a controller set FC in a failure state;
acquiring a data packet request quantity of network element equipment managed by each controller in the FC;
adaptively generating a redistribution scheme according to the data packet request quantity;
and connecting the network element equipment to the controller in the effective state according to the redistribution scheme.
A second aspect of an embodiment of the present application provides a fault-tolerant control apparatus, including:
a first obtaining module, configured to obtain the current states of at least two controllers and generate a controller set FC in a failure state;
a second obtaining module, configured to obtain a data packet request amount of the network element devices managed by each controller in the FC;
a generating module, configured to adaptively generate a reallocation scheme according to the data packet request amount;
and a connecting module, configured to connect the network element devices to a controller in an active state according to the reallocation scheme.
A third aspect of embodiments of the present application provides a fault tolerant component comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method according to any one of the fault tolerant control methods when executing the computer program.
A fourth aspect of embodiments of the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method according to any one of the fault-tolerant control methods.
In the embodiments of the present application, a fault-tolerant component is provided in a network data analysis system. The fault-tolerant component acquires the current states of at least two controllers in the system, generates a set FC of controllers in a failure state, acquires the data packet request amount of each network element device managed by each controller in the FC, generates a reallocation scheme for the network element devices managed by the failed controllers according to those request amounts, and connects each network element device to a controller in an active state according to the reallocation scheme. In this process, when a controller failure is detected in the network, the reallocation scheme is generated adaptively according to the data packet request amounts of the network element devices that need a new controller, so that those devices are reconnected to, and communicate reliably with, a controller. The control plane can therefore tolerate faults in real time, and the network data analysis system can continue to operate normally, when some of the controllers fail.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is an overall framework of SDN fault tolerance in a network data analysis system according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of an implementation of a fault-tolerant control method according to an embodiment of the present application;
Fig. 3 is a schematic flowchart of an implementation of step S21 in Fig. 2 according to an embodiment of the present application;
Fig. 4 is a schematic flowchart of an implementation of step S23 in Fig. 2 according to an embodiment of the present application;
Fig. 5 is a schematic diagram of a network element device reallocation process provided in an embodiment of the present application;
Fig. 6 is a schematic diagram of a fault-tolerant control apparatus provided by an embodiment of the present application;
Fig. 7 is a schematic diagram of a fault-tolerant component provided by an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
Fig. 1 illustrates the overall framework of SDN fault tolerance in a network data analysis system in an embodiment of the present application. The system includes a fault-tolerant component disposed in the control plane, more than one controller, and network element devices in the data plane that are managed by the controllers. The fault-tolerant component is responsible for detecting the network connection relationship between the network element devices and the controllers, reallocating controllers to the network element devices according to the detection result, and authenticating the connections between the network element devices and the controllers.
Specifically, the fault-tolerant component acquires the current states of at least two controllers in the system, generates a controller set FC in a failure state, acquires a data packet request quantity of each network element device controlled by each controller in the controller set FC, generates a reallocation scheme of the network element devices managed by the controller in the failure state according to the data packet request quantity, and connects each network element device to the controller in an effective state according to the reallocation scheme.
In order to explain the technical means described in the present application, the following description will be given by way of specific examples.
Embodiment one:
Fig. 2 shows a schematic flowchart of an implementation of the fault-tolerant control method provided in an embodiment of the present application, detailed as follows:
Step S21, acquiring the current states of at least two controllers, and generating a controller set FC in a failure state;
In the embodiment provided by the application, the fault-tolerant component is arranged in a network data analysis system that uses SDN. It detects, in real time, the network connection state between each controller in the system and the network element devices that the controller manages, so as to distinguish controllers in an active state from controllers in a failure state. Controllers in the failure state are recorded together in the controller set FC by their identification information. Further, several fault-tolerant components may be provided to form a fault-tolerant component group, with each fault-tolerant component detecting the controllers in a different area.
Fig. 3 illustrates, according to another embodiment of the present application, how the current states of at least two controllers are acquired and the controller set FC in the failure state is generated, including:
Step S31, acquiring network data;
In this step, network data is collected and a data set for the network data analysis system is established. The network data includes the data flow information sent by network element devices in the data plane to the controllers (encapsulated as data packet access requests, i.e. Packet-in messages) and the state information exchanged between controllers in the control plane.
Step S32, traversing the operating states and the load states of the at least two controllers according to the network data;
In this step, the data set formed from the network data is used to traverse the operating state and the load of all controllers {C_1, C_2, …, C_n} in the system, that is, to check the operating state and the load state of each controller one by one. Further, the operating state of a controller is detected according to whether its hardware is working normally, and the load state of a controller is detected according to the Packet-in messages sent by the network element devices it manages and the Packet-out response messages it sends back to those devices.
Step S33, determining, among the at least two controllers, the state of any controller that is in a faulty operating state and/or whose load is exhausted as the failure state;
Specifically, during the traversal, if a hardware fault occurs in controller C_i, the controller is judged to be in a faulty operating state and is determined to be in the failure state; and/or, if the load of controller C_i exceeds its maximum load, the controller is determined to be in the failure state.
Step S34, marking the controllers in the failure state to generate the set FC of controllers in the failure state.
In this step, the controllers in the failure state are marked and added to the same set, generating the set FC of failed controllers. This process is repeated until all controllers in the system have been checked.
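Steps S32 to S34 can be illustrated with a minimal Python sketch. The `Controller` record, its field names and the sample values are assumptions made for illustration only; the patent does not prescribe a data model, nor how load is measured beyond Packet-in and Packet-out messages.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    ident: str         # identification information of the controller
    hardware_ok: bool  # operating state found in step S32 (assumed boolean)
    load: float        # current load, e.g. Packet-in requests per second (assumed unit)
    max_load: float    # maximum load the controller can process


def is_failed(c: Controller) -> bool:
    """Step S33: hardware fault and/or load exceeding the maximum load."""
    return (not c.hardware_ok) or c.load > c.max_load


def build_failed_set(controllers: list[Controller]) -> list[str]:
    """Steps S32 and S34: check every controller and collect the failed ones."""
    return [c.ident for c in controllers if is_failed(c)]


# Hypothetical sample data.
controllers = [
    Controller("C1", True, 500, 5000),
    Controller("C2", True, 5500, 5000),   # load exhausted
    Controller("C3", False, 2000, 5000),  # hardware fault
]
FC = build_failed_set(controllers)
print(FC)  # ['C2', 'C3']
```

The set FC is kept as an ordered list of identifiers here so that later steps can call failed controllers sequentially; a random call order would work equally well according to the description.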
Step S22, obtaining the data packet request amount of the network element devices managed by each controller in the FC;
In the embodiment provided by the present application, the network element devices managed by each controller in the set FC are obtained. Because such a controller is in the failure state, the network element devices it manages must be reallocated to a controller in an active state so that they continue to operate normally in the system. For each failed controller thus determined, the data packet request amount of each network element device it manages is acquired so that a new controller can be found for each device.
Step S23, adaptively generating a reallocation scheme according to the data packet request amount;
In this step, a new controller in an active state is sought for each network element device managed by a failed controller according to that device's data packet request amount, and the reallocation scheme is generated once every such device has found a new controller.
Specifically, Fig. 4 shows a specific implementation flow of adaptively generating the reallocation scheme according to the data packet request amount, which includes:
Step S41, calling a controller C_i in the failure state from the FC, and arranging the network element devices managed by controller C_i in descending order of their data packet request amounts to obtain a sorting result;
In this step, a controller C_i is called from the set FC of controllers in the failure state; controllers may be called randomly or sequentially. Suppose the i-th controller C_i is called this time. The network element devices managed by the failed controller C_i are identified, and the set of network element devices managed by C_i is denoted SW_i. The devices in SW_i are arranged in descending order of their current data packet request amounts to obtain the sorting result.
Step S42, according to the sorting result, allocating the corresponding network element devices in turn to the controller in the active state that has the largest remaining load, and generating the reallocation scheme P_i.
In this step, the network element devices in SW_i are adaptively allocated according to the sorting result to generate the reallocation scheme. The adaptive allocation follows this principle: the network element device with the highest data packet request amount is allocated to the controller in the active state that has the largest remaining load.
Optionally, after the corresponding network element devices are allocated in turn, according to the sorting result, to the controller in the active state with the largest remaining load, the method includes:
judging whether the load of that controller exceeds its maximum load after the new network element device is allocated to it;
if not, judging the allocation to be a valid allocation.
In this step, to ensure that the reallocation does not itself create a new failed controller by exhausting a controller's load, the load of the controller receiving the new network element device is checked; if it does not exceed the controller's maximum load capacity after the allocation, the allocation is judged to be valid.
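The allocation principle of steps S41 and S42, together with the optional validity check, can be sketched as a greedy loop. This is a minimal sketch under stated assumptions: the function name `reallocate` and the dictionary layout are hypothetical, and because the description does not say what happens when an allocation is not valid, the sketch simply raises an error in that case.

```python
def reallocate(requests: dict[str, float],
               remaining: dict[str, float]) -> dict[str, str]:
    """Build P_i for one failed controller.

    requests:  device id -> data packet request amount (the set SW_i)
    remaining: active controller id -> remaining load (updated in place)
    """
    plan: dict[str, str] = {}
    # Step S41: sort the devices in descending order of request amount.
    for device in sorted(requests, key=requests.get, reverse=True):
        # Step S42: pick the active controller with the largest remaining load.
        target = max(remaining, key=remaining.get)
        # Validity check: the target must not exceed its maximum load.
        # Raising here is an assumption; the source leaves this case open.
        if requests[device] > remaining[target]:
            raise RuntimeError(f"no active controller can absorb {device}")
        remaining[target] -= requests[device]
        plan[device] = target
    return plan
```

Because the largest requests are placed first, while the most capacity is still available, a device that fits nowhere is detected as early as possible.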
Fig. 5 is a schematic diagram of a network element device reallocation process; the Packet-in request amounts of the network element devices and the remaining loads of the controllers are shown in the figure. All controllers have an initial capacity of 5000 packets/s. When controller C2 fails due to load exhaustion, the network element devices S2 and S3 that it manages need to be reallocated. At this time, the request amount of S2 is 4000 packets/s, and S2 selects controller C1, which has the largest remaining load, as its reallocation target. After S2 has been reallocated, S3 selects controller C3, which then has the largest remaining load, as its reallocation target.
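The Fig. 5 scenario can be traced numerically. Only the 5000 packets/s capacity and the 4000 packets/s request of S2 come from the text; the current loads of C1 and C3 and the request amount of S3 below are hypothetical values chosen so that the trace reproduces the described outcome, since the figure itself is not available here.

```python
CAPACITY = 5000                       # packets/s, from the text
load = {"C1": 500, "C3": 2000}        # assumed current loads of the active controllers
requests = {"S2": 4000, "S3": 1500}   # S2 from the text; S3 is an assumed value

remaining = {c: CAPACITY - used for c, used in load.items()}
plan = {}
for device, amount in sorted(requests.items(), key=lambda kv: kv[1], reverse=True):
    target = max(remaining, key=remaining.get)
    if amount > remaining[target]:
        raise RuntimeError(f"allocation of {device} would overload {target}")
    remaining[target] -= amount
    plan[device] = target
    print(device, "->", target, remaining)
# S2 -> C1 {'C1': 500, 'C3': 3000}
# S3 -> C3 {'C1': 500, 'C3': 1500}
```

With these assumed numbers, C1 starts with 4500 packets/s of remaining load and takes S2; C3 then has more remaining load than C1 and takes S3.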
Step S43, deleting controller C_i.
In this step, after the network element devices managed by controller C_i have been reallocated, the loads of the controllers in the active state in the system are updated, and controller C_i is deleted from the FC.
Step S44, continuing to call the other controllers in the FC and executing the above process, to generate the reallocation schemes corresponding to the other controllers in the FC;
In this step, steps S41-S43 are repeated for the remaining controllers in the FC until the FC is empty, and a corresponding reallocation scheme P_x is generated each time steps S41-S43 are performed.
Step S45, generating a total reallocation scheme P according to P_i and the reallocation schemes corresponding to the other controllers in the FC.
In this step, after all network element devices managed by the controllers in the FC have been reallocated, the total reallocation scheme P is determined from the reallocation schemes generated in each round.
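The outer loop of steps S41 to S45 can be sketched as follows. The function name, parameter names and dictionary layout are hypothetical, failed controllers are called sequentially (the description also allows random calls), and an invalid allocation raises an error because the source does not specify that case.

```python
def total_reallocation(fc, managed, requests, remaining):
    """Return the total scheme P (device id -> new controller id).

    fc:        list of failed controller ids (emptied by this function)
    managed:   failed controller id -> list of device ids (SW_i)
    requests:  device id -> data packet request amount
    remaining: active controller id -> remaining load (updated in place)
    """
    total = {}
    while fc:                                   # S44: repeat until FC is empty
        ci = fc[0]                              # S41: call a failed controller
        ordered = sorted(managed[ci], key=lambda d: requests[d], reverse=True)
        p_i = {}
        for device in ordered:                  # S42: greedy allocation
            target = max(remaining, key=remaining.get)
            if requests[device] > remaining[target]:
                raise RuntimeError(f"no valid allocation for {device}")
            remaining[target] -= requests[device]  # S43: update active loads
            p_i[device] = target
        fc.remove(ci)                           # S43: delete C_i from FC
        total.update(p_i)                       # S45: merge P_i into P
    return total


# Hypothetical example with two failed controllers.
fc = ["C2", "C4"]
managed = {"C2": ["S2", "S3"], "C4": ["S5"]}
requests = {"S2": 4000, "S3": 1500, "S5": 1000}
remaining = {"C1": 4500, "C3": 3000}
P = total_reallocation(fc, managed, requests, remaining)
print(P)  # {'S2': 'C1', 'S3': 'C3', 'S5': 'C3'}
```

Since the remaining loads are updated after every allocation, each later round sees the capacity already consumed by earlier rounds.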
Step S24, connecting the network element device to the controller in the active state according to the reallocation scheme.
In the embodiment provided by the application, the fault-tolerant component performs equipment connection authentication on the network element equipment managed by the controller in the failure state, and accurately positions the network element equipment and the controller related in the redistribution scheme through view information shared in a database; and according to the positioning result, the reconnection between the network element equipment and the controller is authenticated, and the network element equipment of all failed controllers is ensured to be reconnected to the new controller. The view information stores the identifier and the position of each network element device and the controller, so that the controller and the network element devices connected with the controller can be conveniently and quickly located.
Optionally, the connecting the network element device to the controller in an active state according to the reallocation scheme includes:
sending the reassignment scheme to the controller in an active state;
and indicating the controller in the effective state to connect the corresponding network element equipment according to the redistribution scheme.
In this step, the generated reallocation scheme may also be output to the controller in an active state, so that the controller connects the corresponding network element device according to the location of the network element device.
Optionally, after the controller reconnects to the network element device, the fault-tolerant component updates the current network state (such as the load of each controller), and updates the new connection relationship between the controller and the network element device.
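Step S24 and the subsequent state update can be sketched as below. The shared view is modelled as a plain device-to-controller mapping, which is a simplifying assumption: the description says the view stores identifiers and positions but does not define its structure. The actual message exchange with the controllers is not specified in the source and is therefore left as a comment.

```python
from collections import defaultdict


def group_by_controller(plan):
    """Turn device -> controller into controller -> [devices] for dispatch."""
    groups = defaultdict(list)
    for device, controller in plan.items():
        groups[controller].append(device)
    return dict(groups)


def apply_plan(plan, view):
    """Record the new connections in the shared view (updated in place)."""
    for controller, devices in group_by_controller(plan).items():
        # A real system would send this controller its part of the scheme
        # here and wait for the connections to be authenticated.
        for device in devices:
            view[device] = controller
    return view


# Hypothetical view before reallocation: S2 and S3 still point at failed C2.
view = {"S1": "C1", "S2": "C2", "S3": "C2", "S4": "C3"}
plan = {"S2": "C1", "S3": "C3"}
apply_plan(plan, view)
print(view)  # {'S1': 'C1', 'S2': 'C1', 'S3': 'C3', 'S4': 'C3'}
```

Grouping the scheme by target controller means each active controller receives only the devices it has to connect.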
In the embodiments of the present application, a fault-tolerant component is provided in a network data analysis system. The fault-tolerant component acquires the current states of at least two controllers in the system, generates a set FC of controllers in a failure state, acquires the data packet request amount of each network element device managed by each controller in the FC, generates a reallocation scheme for the network element devices managed by the failed controllers according to those request amounts, and connects each network element device to a controller in an active state according to the reallocation scheme. In this process, when a controller failure is detected in the network, the reallocation scheme is generated adaptively according to the data packet request amounts of the network element devices that need a new controller, so that those devices are reconnected to, and communicate reliably with, a controller. The control plane can therefore tolerate faults in real time, and the network data analysis system can continue to operate normally, when some of the controllers fail.
Embodiment two:
Fig. 6 shows a structural block diagram of the fault-tolerant control apparatus provided in an embodiment of the present application; for convenience of description, only the parts related to the embodiment are shown.
Referring to fig. 6, the fault-tolerant control apparatus includes: a first obtaining module 61, a second obtaining module 62, a generating module 63, and a connecting module 64, wherein:
a first obtaining module 61, configured to obtain current states of at least two controllers, and generate a controller set FC in a failure state;
a second obtaining module 62, configured to obtain a data packet request amount of the network element device managed by each controller in the FC;
a generating module 63, configured to generate a redistribution scheme adaptively according to the data packet request amount;
a connecting module 64, configured to connect the network element device to the controller in an active state according to the redistribution scheme.
Further, the first obtaining module 61 includes:
the traversing unit is used for acquiring network data; traversing the operating conditions and the load conditions of at least two controllers according to the network data;
a state determination unit, configured to determine a state of a controller in a fault operation state or/and a load exhaustion state of the at least two controllers as a failure state;
and the marking unit is used for marking the controllers in the failure state to generate the controller set FC in the failure state.
Further, the generating module 63 includes:
a calling unit, configured to call a controller C_i in the failure state from the FC;
a sorting unit, configured to arrange the network element devices managed by controller C_i in descending order of their data packet request amounts to obtain a sorting result;
a reallocation scheme generating unit, configured to allocate, according to the sorting result, the corresponding network element devices in turn to the controller in the active state with the largest remaining load, and generate a reallocation scheme P_i;
a deleting unit, configured to delete controller C_i;
a loop execution unit, configured to continue calling the other controllers in the FC and executing the above process to generate the reallocation schemes corresponding to the other controllers in the FC, and to generate a total reallocation scheme P according to P_i and the reallocation schemes corresponding to the other controllers in the FC.
Further, the generating module 63 further includes:
the judging unit is used for judging whether the load of the controller in the effective state with the largest load residual quantity exceeds the maximum load after the controller is distributed with new network element equipment; if not, the allocation is judged to be effective allocation.
Further, the connecting module 64 includes:
a transmitting unit for transmitting the reallocation scheme to a controller in an active state;
and an indicating unit, configured to indicate the controller in the valid state to connect to the corresponding network element device according to the reallocation scheme.
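How the four modules 61 to 64 might compose can be sketched as one class with one method per module. Everything here is hypothetical: the class and method names, the dictionary-based state, the error raised on an invalid allocation, and the choice to reset a failed controller's load to zero once its devices have been moved.

```python
class FaultTolerantDevice:
    """Illustrative composition of modules 61-64; not the patented structure."""

    def __init__(self, controllers, managed, requests):
        self.controllers = controllers  # id -> {"ok": bool, "load": num, "max": num}
        self.managed = managed          # controller id -> list of device ids
        self.requests = requests        # device id -> data packet request amount

    def obtain_failed(self):                  # first obtaining module 61
        return [c for c, s in self.controllers.items()
                if not s["ok"] or s["load"] > s["max"]]

    def obtain_requests(self, fc):            # second obtaining module 62
        return {c: {d: self.requests[d] for d in self.managed[c]} for c in fc}

    def generate(self, fc, per_controller):   # generating module 63
        remaining = {c: s["max"] - s["load"]
                     for c, s in self.controllers.items() if c not in fc}
        plan = {}
        for c in fc:
            ordered = sorted(per_controller[c].items(),
                             key=lambda kv: kv[1], reverse=True)
            for device, amount in ordered:
                target = max(remaining, key=remaining.get)
                if amount > remaining[target]:
                    raise RuntimeError(f"no valid allocation for {device}")
                remaining[target] -= amount
                plan[device] = target
        return plan

    def connect(self, fc, plan):              # connecting module 64
        for c in fc:                          # failed controllers lose their devices
            self.managed[c] = []
            self.controllers[c]["load"] = 0   # assumption: load reset after handover
        for device, target in plan.items():
            self.managed[target].append(device)
            self.controllers[target]["load"] += self.requests[device]

    def run(self):
        fc = self.obtain_failed()
        plan = self.generate(fc, self.obtain_requests(fc))
        self.connect(fc, plan)
        return plan


# Hypothetical usage.
ftd = FaultTolerantDevice(
    controllers={"C1": {"ok": True, "load": 500, "max": 5000},
                 "C2": {"ok": True, "load": 5500, "max": 5000},
                 "C3": {"ok": True, "load": 2000, "max": 5000}},
    managed={"C1": ["S1"], "C2": ["S2", "S3"], "C3": ["S4"]},
    requests={"S1": 500, "S2": 4000, "S3": 1500, "S4": 2000},
)
result = ftd.run()
print(result)  # {'S2': 'C1', 'S3': 'C3'}
```

The sketch does not handle the case in which every controller has failed; `max` would then be called on an empty mapping.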
In the embodiments of the present application, a fault-tolerant component is provided in a network data analysis system. The fault-tolerant component acquires the current states of at least two controllers in the system, generates a set FC of controllers in a failure state, acquires the data packet request amount of each network element device managed by each controller in the FC, generates a reallocation scheme for the network element devices managed by the failed controllers according to those request amounts, and connects each network element device to a controller in an active state according to the reallocation scheme. In this process, when a controller failure is detected in the network, the reallocation scheme is generated adaptively according to the data packet request amounts of the network element devices managed by the failed controller, so that those devices are reconnected to, and communicate reliably with, a controller. The control plane can therefore tolerate faults in real time when some of the controllers fail, and the network data analysis system can continue to run normally.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Embodiment three:
Fig. 7 is a schematic diagram of a fault-tolerant component provided by an embodiment of the present application. As shown in Fig. 7, the fault-tolerant component 7 of this embodiment comprises: a processor 70, a memory 71 and a computer program 72 stored in the memory 71 and executable on the processor 70. The processor 70, when executing the computer program 72, implements the steps of the above-described embodiments of the fault-tolerant control method, such as steps S21 to S24 shown in Fig. 2. Alternatively, the processor 70, when executing the computer program 72, implements the functions of the modules/units in the above-described apparatus embodiments, such as the functions of the modules 61 to 64 shown in Fig. 6.
Illustratively, the computer program 72 may be partitioned into one or more modules/units that are stored in the memory 71 and executed by the processor 70 to implement the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments describe the execution of the computer program 72 in the fault-tolerant component 7. For example, the computer program 72 may be divided into a first obtaining module, a second obtaining module, a generating module, and a connecting module, wherein:
the first obtaining module is configured to obtain the current states of at least two controllers and generate a controller set FC in a failure state;
the second obtaining module is configured to obtain a data packet request amount of the network element devices managed by each controller in the FC;
the generating module is configured to adaptively generate a reallocation scheme according to the data packet request amount;
and the connecting module is configured to connect the network element devices to a controller in an active state according to the reallocation scheme.
Further, the first obtaining module includes:
the traversing unit is used for acquiring network data; traversing the operating conditions and the load conditions of at least two controllers according to the network data;
a state determination unit, configured to determine a state of a controller in a fault operation state or/and a load exhaustion state of the at least two controllers as a failure state;
and the marking unit is used for marking the controllers in the failure state to generate the controller set FC in the failure state.
Further, the generating module includes:
a calling unit, configured to call a controller Ci in the failure state in the FC;
a sorting unit, configured to sort the network element devices managed by the controller Ci in descending order of their data packet request amounts to obtain a sorting result;
a reallocation scheme generating unit, configured to sequentially allocate, according to the sorting result, the corresponding network element devices to the controller that is in the active state and has the largest remaining load, and generate a reallocation scheme Pi;
a deleting unit, configured to delete the controller Ci;
and a loop execution unit, configured to continue calling the other controllers in the FC and execute the above process to generate the reallocation schemes corresponding to the other controllers in the FC, and to generate a total reallocation scheme P according to Pi and the reallocation schemes corresponding to the other controllers in the FC.
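The generating module can be read as a greedy loop over the failed controllers. The sketch below is one possible interpretation in Python, not the authors' code. The function name `reallocate`, the dictionary layouts, and the assumption that a controller's remaining load decreases by the request amount of each device assigned to it are all hypothetical. No capacity check is performed here.

```python
def reallocate(failed, active_remaining):
    """
    failed: {failed controller name: {device name: packet request amount}}
    active_remaining: {active controller name: remaining load}
    Returns the total scheme P as {device name: active controller name}.
    """
    remaining = dict(active_remaining)  # work on copies so inputs stay unchanged
    pending = dict(failed)              # the set FC
    total = {}                          # total reallocation scheme P
    while pending:
        ci = next(iter(pending))        # calling unit: pick a failed controller Ci
        # sorting unit: devices of Ci in descending order of request amount
        devices = sorted(pending[ci].items(), key=lambda kv: kv[1], reverse=True)
        pi = {}                         # reallocation scheme Pi for Ci
        for device, request in devices:
            # active controller with the largest remaining load
            target = max(remaining, key=remaining.get)
            pi[device] = target
            remaining[target] -= request  # assumed load accounting
        total.update(pi)                # merge Pi into P
        del pending[ci]                 # deleting unit: remove Ci from FC
    return total


failed = {"C2": {"s1": 30, "s2": 10, "s3": 20}, "C3": {"s4": 5}}
active = {"C1": 50, "C4": 45}
P = reallocate(failed, active)
print(P)  # {'s1': 'C1', 's3': 'C4', 's2': 'C4', 's4': 'C1'}
```

Handling the devices with the largest request amounts first means the heaviest devices are placed while the active controllers still have the most headroom.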
Further, the generating module further includes:
a judging unit, configured to judge whether the load of the active controller with the largest remaining load exceeds its maximum load after a new network element device is allocated to it; if not, the allocation is judged to be an effective allocation.
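The judging unit's check reduces to a single comparison. The following Python sketch assumes that load and request amount are expressed in the same unit and that a load exactly equal to the maximum does not count as exceeding it; the function name and parameters are hypothetical. The text does not state what happens when the check fails, so the sketch only reports the result.

```python
def is_effective_allocation(current_load, request_amount, max_load):
    """Return True if the controller's load stays within max_load after the allocation."""
    return current_load + request_amount <= max_load


print(is_effective_allocation(70, 30, 100))  # True: load reaches but does not exceed 100
print(is_effective_allocation(80, 30, 100))  # False: load would be 110
```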
Further, the connecting module includes:
a sending unit, configured to send the reallocation scheme to the controllers in the active state;
and an indicating unit, configured to instruct the controllers in the active state to connect to the corresponding network element devices according to the reallocation scheme.
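As an illustration of the sending and indicating units, the Python sketch below groups a device-to-controller scheme by target controller and hands each active controller its list of devices. The `send` callback stands in for whatever transport a real deployment would use; it, the function names, and the scheme layout are assumptions, not part of the source.

```python
from collections import defaultdict


def build_instructions(scheme):
    """Group a {device: controller} scheme into {controller: [devices]}."""
    per_controller = defaultdict(list)
    for device, controller in scheme.items():
        per_controller[controller].append(device)
    return {c: sorted(devs) for c, devs in per_controller.items()}


def dispatch(scheme, send):
    """Send each active controller the devices it should connect to."""
    for controller, devices in build_instructions(scheme).items():
        send(controller, devices)  # hypothetical transport callback


sent = []
dispatch({"s1": "C1", "s2": "C4", "s3": "C4"},
         lambda controller, devices: sent.append((controller, devices)))
print(sent)  # [('C1', ['s1']), ('C4', ['s2', 's3'])]
```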
The fault-tolerant component 7 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The fault-tolerant component may include, but is not limited to, the processor 70 and the memory 71. Those skilled in the art will appreciate that Fig. 7 is only an example of the fault-tolerant component 7 and does not limit it; the component may include more or fewer components than shown, combine some components, or use different components. For example, the fault-tolerant component may also include input/output devices, network access devices, buses, and the like.
The processor 70 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The memory 71 may be an internal storage unit of the fault-tolerant component 7, such as a hard disk or internal memory of the fault-tolerant component 7. The memory 71 may also be an external storage device of the fault-tolerant component 7, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card) provided on the fault-tolerant component 7. Further, the memory 71 may comprise both an internal storage unit and an external storage device of the fault-tolerant component 7. The memory 71 is used for storing the computer program and other programs and data required by the fault-tolerant component. The memory 71 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. For the specific working processes of the units and modules in the system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not described herein again.
In the above embodiments, the description of each embodiment has its own emphasis; for parts that are not described or recited in detail in a given embodiment, reference may be made to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described apparatus/terminal device embodiments are merely illustrative: the division into modules or units is only one logical division, and other divisions are possible in an actual implementation; for instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.
The integrated module/unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow in the methods of the above embodiments can be realized by a computer program, which can be stored in a computer-readable storage medium and which, when executed by a processor, implements the steps of the above method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be expanded or restricted as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals or telecommunications signals.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the embodiments of the present application, and they should be construed as being included in the present application.

Claims (8)

1. A fault tolerant control method, wherein the fault tolerant control method comprises:
acquiring current states of at least two controllers, and generating a controller set FC in a failure state, wherein the method comprises the following steps: detecting in real time by a fault tolerant component the network connection status between each controller and the network element devices it manages to distinguish between controllers in an active state and controllers in a failed state;
acquiring a data packet request quantity of network element equipment managed by each controller in the FC;
the adaptive generation of the redistribution scheme according to the data packet request quantity comprises the following steps: calling a controller Ci in a failure state in the FC; the network element equipment managed by the controller Ci is sorted in a descending order according to the data packet request quantity of the network element equipment managed by the controller Ci to obtain a sorting result; according to the sorting result, sequentially distributing the corresponding network element equipment to the controller with the largest residual load amount and in the effective state to generate a redistribution scheme Pi; deleting the controller Ci; continuing to call other controllers in the FC and executing the process to generate redistribution schemes corresponding to the other controllers in the FC; generating a total reallocation scheme P according to the Pi and reallocation schemes corresponding to other controllers in the FC;
and connecting the network element equipment to the controller in the effective state according to the redistribution scheme.
2. The fault-tolerant control method of claim 1, wherein said obtaining a current state of at least two controllers and generating a set of controllers FC in a failed state comprises:
acquiring network data;
traversing the operating conditions and the load conditions of at least two controllers according to the network data;
determining that a controller among the at least two controllers that is in a fault operation state and/or a load exhaustion state is in a failure state;
the controllers in the failure state are marked to generate a set of controllers FC in the failure state.
3. The fault-tolerant control method according to claim 1, wherein after said sequentially allocating the corresponding network element devices to the controller with the largest remaining load amount and in the active state according to the sorting result, the method comprises:
judging whether the load of the controller in the effective state with the largest load residual quantity exceeds the maximum load after the controller is distributed with new network element equipment;
if not, the allocation is judged to be effective allocation.
4. The fault-tolerant control method of claim 1, wherein said connecting said network element device to a controller in an active state according to said reallocation scheme, comprises:
sending the reassignment scheme to a controller in an active state;
and indicating the controller in the effective state to connect the corresponding network element equipment according to the redistribution scheme.
5. A fault tolerant control apparatus, characterized in that the fault tolerant control apparatus comprises:
the first obtaining module is configured to obtain current states of at least two controllers, and generate a controller set FC in a failure state, including: detecting in real time by a fault tolerant component the network connection status between each controller and the network element devices it manages to distinguish between controllers in an active state and controllers in a failed state;
a second obtaining module, configured to obtain a data packet request amount of network element equipment managed by each controller in the FC;
the generating module is used for generating a redistribution scheme in a self-adaptive manner according to the data packet request quantity; the generation module comprises: the calling unit is used for calling the controller Ci in the failure state in the FC; the sorting unit is used for sorting the network element equipment managed by the controller Ci in a descending order according to the data packet request quantity of the network element equipment managed by the controller Ci to obtain a sorting result; a reallocation scheme generation unit, configured to sequentially allocate, according to the sorting result, the corresponding network element devices to the controller with the largest remaining load amount and in the active state, and generate a reallocation scheme Pi; a deleting unit configured to delete the controller Ci; the circulation execution unit is used for continuously calling other controllers in the FC and executing the process to generate redistribution schemes corresponding to the other controllers in the FC; generating a total redistribution scheme P according to the Pi and the redistribution schemes corresponding to other controllers in the FC;
and a connecting module, configured to connect the network element device to the controller in an active state according to the reallocation scheme.
6. The fault-tolerant control device of claim 5, wherein the first obtaining module comprises:
the traversing unit is used for acquiring network data; traversing the operating conditions and the load conditions of at least two controllers according to the network data;
a state determination unit, configured to determine that a controller among the at least two controllers that is in a fault operation state and/or a load exhaustion state is in a failure state;
and the marking unit is used for marking the controllers in the failure state to generate the controller set FC in the failure state.
7. A fault-tolerant component comprising a memory, a processor and a computer program stored in said memory and executable on said processor, characterized in that said processor implements the steps of the method according to any one of claims 1 to 4 when executing said computer program.
8. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of a method according to any one of claims 1 to 4.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810368148.9A CN110391929B (en) 2018-04-23 2018-04-23 Fault-tolerant control method and device and fault-tolerant component

Publications (2)

Publication Number Publication Date
CN110391929A CN110391929A (en) 2019-10-29
CN110391929B true CN110391929B (en) 2022-07-26

Family

ID=68284460

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810368148.9A Active CN110391929B (en) 2018-04-23 2018-04-23 Fault-tolerant control method and device and fault-tolerant component

Country Status (1)

Country Link
CN (1) CN110391929B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111650878B (en) * 2020-06-15 2021-05-04 北京理工大学 Method for optimizing programmability of flow when multiple controllers in software defined network fail

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103501509A (en) * 2013-09-04 2014-01-08 大唐移动通信设备有限公司 Method and device for balancing loads of radio network controller
CN103596652A (en) * 2013-07-30 2014-02-19 华为技术有限公司 Network control method and device
CN103607310A (en) * 2013-11-29 2014-02-26 华为技术有限公司 Method for arbitration of remote disaster recovery
CN103618621A (en) * 2013-11-21 2014-03-05 华为技术有限公司 Method, device and system for automatic configuration of SDN
CN105227385A (en) * 2015-11-12 2016-01-06 中国联合网络通信集团有限公司 A kind of method and system of troubleshooting

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9456314B2 (en) * 2014-05-09 2016-09-27 Aruba Networks, Inc. Multicast transmissions in a network environment with user anchor controllers

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103596652A (en) * 2013-07-30 2014-02-19 华为技术有限公司 Network control method and device
CN103501509A (en) * 2013-09-04 2014-01-08 大唐移动通信设备有限公司 Method and device for balancing loads of radio network controller
CN103618621A (en) * 2013-11-21 2014-03-05 华为技术有限公司 Method, device and system for automatic configuration of SDN
CN103607310A (en) * 2013-11-29 2014-02-26 华为技术有限公司 Method for arbitration of remote disaster recovery
CN105227385A (en) * 2015-11-12 2016-01-06 中国联合网络通信集团有限公司 A kind of method and system of troubleshooting

Also Published As

Publication number Publication date
CN110391929A (en) 2019-10-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant