CN110708245B - SDN data plane fault monitoring and recovery method under multi-controller architecture - Google Patents
- Publication number
- CN110708245B (application number CN201910933770.4A)
- Authority
- CN
- China
- Prior art keywords
- switch
- domain
- fault
- network
- sdn
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/28—Routing or path finding of packets in data switching networks using route fault recovery
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0677—Localisation of faults
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/12—Shortest path evaluation
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention provides an SDN data plane fault monitoring and recovery method under a multi-controller architecture, comprising the following steps: S1, the SDN controllers synchronize the global topology and construct and update the topology of each intra-domain network; S2, the SDN controller judges whether a data plane link fault caused by a port abnormality or a switch node fault has occurred; S3, the SDN controller resolves the fault; S4, the SDN controllers cooperate to determine a routing path for an arriving data flow and then issue flow tables to the switches on the path to complete the routing of the data flow. The invention improves the fault detection rate while adding only a small amount of network load, balances the flexibility and accuracy of detection, and reduces the fault recovery time.
Description
Technical Field
The invention relates to the field of network reliability research under an SDN architecture, in particular to a fault monitoring and recovering method for an SDN data plane under a multi-controller architecture.
Background
Conventional networks integrate control and forwarding in the same physical device, in a tightly coupled relationship. Software-defined networking (SDN) decouples logical control from data forwarding: the control plane remotely controls forwarding on the data plane devices through an API (application programming interface), so that the two planes can be scaled independently and flexibly. Because control is logically centralized, the controller can also obtain a global view of the whole network, which makes network control and management more convenient. The SDN architecture comprises a data plane, a control plane, and an application plane. The data plane consists of network forwarding devices; an SDN switch is responsible only for forwarding data flows. The control plane consists of logically centralized controllers and is responsible for controlling and managing the data plane devices and for maintaining network topology and state information. The application plane consists of a number of SDN services.
SDN, like conventional networks, is inevitably exposed to various faults, which cause network performance degradation or network paralysis. Common network fault monitoring methods are divided into active and passive monitoring, briefly introduced as follows:
Prior art 1: active network fault monitoring
Principle: fault information is obtained by actively sending probe messages into the network.
Disadvantage: the probe messages affect the traffic in the network and increase the network load.
Prior art 2: passive network fault monitoring
Principle: faults are inferred and located by passively collecting fault information in the network.
Disadvantages: symptom loss and false symptoms in the network make this kind of fault monitoring inaccurate. In a large-scale distributed network, collecting and processing the information introduces a large delay, so the fault monitoring mechanism lacks real-time performance.
Common fault recovery mechanisms can likewise be classified as active or passive. In passive recovery, the controller is notified after the network fails, reacquires the topology, and reroutes the data flow to recover from the fault. In active recovery, redundancy is provided: the controller prepares a backup path in advance, and when a fault occurs it is resolved by switching to the backup path. The two are briefly introduced as follows:
Prior art 1: passive fault recovery mechanism
Principle: the controller is notified after the network fails, recalculates the route, and issues flow table entries to the affected switches.
Disadvantages: recovery takes longer, the load on the controllers is higher, and the 50 ms fault recovery time required by operators cannot be met.
Prior art 2: active fault recovery mechanism
Principle: redundancy is provided; the controller prepares a backup path in advance, so that when a fault occurs the switch does not need to ask the controller to establish a new path but switches directly to the backup path.
Disadvantages: the faults that can occur in a network are diverse, and backup paths cannot resolve all of them, so flexibility and applicability are lacking.
In summary, in the prior art, fault detection either increases the network load in order to achieve high detection accuracy, or is inaccurate and suffers large delay; fault recovery cannot meet the requirements on both flexibility and recovery time; and the application scenario is limited by the inherent scalability of the SDN network architecture.
Disclosure of Invention
The SDN data plane fault monitoring and recovery method under a multi-controller architecture is intended to solve the following problems of the prior art: fault monitoring either increases the network load in order to achieve a higher detection accuracy, or is inaccurate and suffers large delay; fault recovery cannot meet the requirements on both flexibility and recovery time; and the application scenario is limited by the inherent scalability of SDN, so that existing methods are not suitable for large networks.
The invention is realized by at least one of the following technical schemes.
The SDN data plane fault monitoring and recovery method under the multi-controller architecture comprises the following steps:
S1, the SDN controllers synchronize the global topology and construct and update the topology of each intra-domain network;
S2, the SDN controller judges, by monitoring Port-status messages (port state messages) and Echo messages, whether a data plane link fault caused by a port abnormality or a switch node fault has occurred in the SDN network;
S3, the SDN controller resolves the fault: when the SDN controller detects a fault, active fault recovery is applied first, and if active fault recovery fails, the fault is resolved by passive fault recovery;
S4, the SDN controllers cooperate to determine a routing path for an arriving data flow and then issue flow tables to the switches on the path to complete the routing of the data flow.
Further, step S1 specifically includes: each SDN controller periodically sends LLDP data packets, by means of Packet_out messages, to all switches connected to it, thereby constructing and updating the topology of the network within that controller's domain;
the SDN controllers synchronize the global topology by communicating over east-west interfaces, which guarantees consistency of the underlying network and of the service processing logic; only updated information is transmitted during synchronization, so that the global topology is maintained while network bandwidth is saved and network load is reduced.
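The update-only synchronization described above can be sketched as follows. This is a minimal illustration, assuming each controller models its topology as a set of links; the names `make_link`, `topology_delta`, and `apply_delta` are hypothetical and do not come from the patent, and the east-west transport itself is omitted.

```python
from typing import FrozenSet, Set, Tuple

# Hypothetical model: a link is an unordered pair of (switch_id, port) endpoints.
Endpoint = Tuple[str, int]
Link = FrozenSet[Endpoint]


def make_link(a: Endpoint, b: Endpoint) -> Link:
    return frozenset((a, b))


def topology_delta(old: Set[Link], new: Set[Link]) -> Tuple[Set[Link], Set[Link]]:
    """Return (added, removed) links so that only the changes are transmitted."""
    return new - old, old - new


def apply_delta(view: Set[Link], added: Set[Link], removed: Set[Link]) -> Set[Link]:
    """Apply a peer controller's delta to the local copy of the global view."""
    return (view - removed) | added


l1 = make_link(("s1", 1), ("s2", 1))
l2 = make_link(("s2", 2), ("s3", 1))
l3 = make_link(("s3", 2), ("s4", 1))

previous = {l1, l2}
current = {l1, l3}  # l2 disappeared, l3 was newly discovered via LLDP
added, removed = topology_delta(previous, current)
peer_view = apply_delta({l1, l2}, added, removed)
```

Here only `added` and `removed` would cross the east-west interface, rather than the full link set.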
Further, in step S2 the SDN controller judges, by monitoring Port-status messages (port state messages) and Echo messages, whether a data plane link fault caused by a port abnormality or a switch node fault has occurred, specifically as follows:
1) judging a data plane link fault: the SDN controller captures the Port-status messages sent by the data plane for fault monitoring; when the SDN controller parses a Port-status message and learns that a port of a switch has been deleted, it judges from the local network topology whether that port is contained in the network; if it is, the SDN controller considers that a data plane link fault caused by a switch port fault has occurred and that the fault needs to be resolved; otherwise, the deletion of the port is considered a normal network topology change;
2) judging a switch node fault: the SDN controller actively monitors the switch nodes of the data plane and judges whether a switch node has failed by sending and receiving Echo messages; when the SDN controller fails for the first time to receive an Echo-reply message from a switch, it immediately sends an Echo-request message to that switch again; if it still receives no Echo-reply message from the switch, the switch node is considered to have failed and the fault is resolved; if the SDN controller receives an Echo-reply message from the switch, the data plane is operating normally.
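The port-deletion judgment in item 1) can be sketched as below. This is an illustrative sketch only: it assumes the local topology is available as a set of (switch, port) pairs, and the function name `classify_port_delete` and the returned labels are hypothetical.

```python
from typing import Set, Tuple

PortId = Tuple[str, int]  # (switch_id, port_number), a hypothetical representation


def classify_port_delete(port: PortId, topology_ports: Set[PortId]) -> str:
    """Classify a Port-status 'port deleted' event against the local topology."""
    if port in topology_ports:
        # The port carried a known link, so its deletion is treated as a fault.
        return "link_fault"
    # The port was not part of the topology: an ordinary topology change.
    return "normal_topology_change"


topology_ports = {("s1", 1), ("s2", 1), ("s2", 2), ("s3", 1)}
in_use = classify_port_delete(("s2", 2), topology_ports)
unused = classify_port_delete(("s2", 9), topology_ports)
```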
Further, the resolving of the fault by the SDN controller in step S3 specifically includes the following steps:
(1) active fault recovery: the SDN controller first issues a group table entry to the SDN switch, the entry containing the port number of the primary forwarding path of the packet flow and the port number of the backup forwarding path; when the SDN controller detects that the primary path has failed and is unavailable, it sends an OFPGC_MODIFY instruction to the switch on the failed path, and the switch executes the action instruction of the group table entry, selecting the action instruction of the next priority according to priority; meanwhile, the controller checks, from the existing fault information, whether the backup forwarding path has failed, and if the backup forwarding path has not failed, traffic is switched to the backup forwarding path to recover from the fault;
(2) passive fault recovery: when the backup forwarding path used in active fault recovery cannot resolve the data plane link fault, the SDN controller reacquires the global topology of the data plane through the LLDP protocol, performs rerouting, and issues a new forwarding path to the switches, completing fault recovery and restoring normal operation;
the rerouting performed by the SDN controller includes performing domain division again, performing the pre-routing process again, and determining a forwarding path for the data flow again.
Further, the domain division divides the SDN network data plane into a plurality of domains according to network key elements manually designated in advance; the network key elements include the IP address prefixes designated in an IP network;
different SDN controllers manage and control different domains, and the switches are also divided into boundary (edge) switches, located at the boundaries between domains, and core switches, located within a domain.
Further, the pre-routing performs routing and issues flow tables in advance, before network data flows arrive;
in the pre-routing process, the hop count of a path is used as the routing cost, so the optimal path is the path with the minimum cost; pre-routing is carried out domain by domain: for each domain, the SDN controller uses a routing algorithm to obtain the optimal paths from all switches in the domain to the boundary switches, and then adds flow tables to each switch in the domain.
Further, the determining of a forwarding path for a data flow specifically includes:
if the network key elements of the sending end and the receiving end are the same, the two ends are in the same domain, the routing is intra-domain routing, and the data forwarding path is determined according to the routing mode of an OpenFlow network;
if the network key elements are different, cross-domain routing is performed, which is divided into three steps: calculating the inter-domain optimal route, routing the data to the destination domain, and routing the data within the destination domain.
Further, the calculating of the inter-domain optimal route specifically includes:
the SDN controller determines source and destination domains of the data flow according to the source and destination network key elements of the data flow, and obtains an optimal route between the source domain and the destination domain based on a boundary switch by using a global topology and a routing algorithm.
Further, the routing of the data to the destination domain includes: according to the obtained inter-domain optimal route, the SDN controller issues a flow table to the source switch, which modifies the destination network key element of the flow into the network key element of the first boundary switch on the path; the flow is then routed from the source switch to the boundary switch of the source domain by means of pre-routing; meanwhile, the controller issues a flow table to each boundary switch on the inter-domain optimal route, whose flow table entry action modifies the destination network key element of the matched flow into the network key element of the next boundary switch on the inter-domain optimal route, so that the flow can be routed to the last boundary switch;
at the last boundary switch, the destination network key element of the data flow is changed back to the destination network key element of the original data flow.
The routing of the data within the destination domain includes: calculating the optimal intra-domain path according to the routing mode of an OpenFlow network and issuing the corresponding flow table entries to each switch on the path, so that the data flow completes its routing.
Further, step S4 specifically includes: the switches of the SDN network data plane are first divided into domains, pre-routing is then performed on the switches of the data plane, and finally the forwarding path of the data flow is determined and the data flow is routed.
Compared with the prior art, the invention has the following beneficial effects: the SDN controller monitors switch port faults by parsing the Port-status messages sent by the data plane in combination with local network topology information, and monitors switch node faults by immediately sending an Echo-request message again to any switch that has not returned an Echo-reply message, so that fault detection delay is small while a higher accuracy is guaranteed;
fault recovery combines the active and passive modes: it takes advantage of the small delay of active recovery, which switches directly to the backup path, and uses the passive mode as a fallback to compensate for the inflexibility of relying on backup paths in the active mode; the network is divided into independent domains according to the management of the SDN controllers, and a single SDN controller is responsible for the faults within its domain, which reduces the complexity of fault recovery; fault monitoring and recovery for a boundary switch are completed by the previous SDN controller in the direction of the data flow;
the data flow is routed by the cooperation of multiple controllers, and pre-routing greatly reduces the load that routing decisions place on the controllers, so the method is suitable for large SDN networks.
Drawings
Fig. 1 shows the general structure of the SDN data plane fault monitoring and recovery method under the multi-controller architecture in this embodiment;
Fig. 2 is a data flow routing flowchart of the SDN data plane fault monitoring and recovery method under the multi-controller architecture in this embodiment;
Fig. 3 is a flowchart of the SDN data plane fault monitoring and recovery method under the multi-controller architecture in this embodiment.
Detailed Description
The present embodiment will be described below with reference to the accompanying drawings.
As shown in fig. 1, the SDN data plane fault monitoring and recovering method under the multi-controller architecture of the present embodiment includes the following steps:
S1, the SDN controllers synchronize the global topology and construct and update the topology of each intra-domain network, specifically as follows:
cooperative routing and information synchronization: a plurality of SDN controllers cooperate to determine the routing path of a data flow and then issue flow tables to some of the switches on the path to complete the routing of the data flow. The SDN controllers are peers of one another; each controller acquires and updates its intra-domain topology information by periodically sending LLDP data packets, and the controllers communicate over east-west interfaces (for example, using the AMQP protocol) to synchronize the global topology, which guarantees consistency of the underlying network and of the service processing logic; only updated information is transmitted during synchronization. The arrows in Fig. 1 represent data flows (within the control plane, within the data plane, and between the control plane and the data plane).
S2, the SDN controller judges, by monitoring Port-status messages and Echo messages, whether a data plane link fault caused by a port abnormality or a switch node fault has occurred in the SDN network;
Port-status messages are used to monitor link faults caused by switch port faults on the data plane. The principle is that a switch triggers a Port-status message when the state of one of its ports changes, notifying the SDN controller of the change; the SDN controller receives and parses the Port-status message and, in combination with local network topology information, detects link faults caused by port faults;
specifically, a data plane link fault is judged as follows: the SDN controller captures the Port-status messages sent by the data plane for fault monitoring; when the SDN controller parses a Port-status message and learns that a port of a switch has been deleted, it judges from the local network topology whether that port is contained in the network (the network topology consists of points and lines, each switch being represented by a point; if the topology does not contain the point, the switch does not belong to the network topology); if the port is contained in the network, the SDN controller considers that a data plane link fault caused by a switch port fault has occurred and that the fault needs to be resolved; otherwise, the deletion of the port is considered a normal network topology change;
after a switch and the SDN controller are connected, they periodically exchange Echo-request and Echo-reply messages to keep the connection alive, so switch node faults can be monitored with higher accuracy by immediately sending an Echo-request message again to any switch that has not returned an Echo-reply message; requiring two consecutive missing Echo-reply messages before declaring a fault improves the accuracy of fault monitoring.
Specifically, a switch node fault is judged as follows: the SDN controller actively monitors the switch nodes of the data plane and judges whether a switch node has failed by sending and receiving Echo messages; when the SDN controller fails for the first time to receive an Echo-reply message from a switch (on receiving a data packet from the network, the controller itself checks a flag bit), it immediately sends an Echo-request message to that switch again; if it still receives no Echo-reply message from the switch, the switch node is considered to have failed and the fault is resolved; if the SDN controller receives an Echo-reply message from the switch, the data plane is operating normally.
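The two-probe Echo check can be sketched as follows. This is a minimal sketch under stated assumptions: `send_echo_request` is a hypothetical callable that sends one Echo-request and returns True if an Echo-reply arrives before some timeout; the timeout value and the real OpenFlow channel are outside the sketch.

```python
from typing import Callable, Iterable


def switch_node_failed(send_echo_request: Callable[[], bool]) -> bool:
    """Declare a switch failed only after two consecutive missing Echo-replies."""
    if send_echo_request():
        return False  # reply received: data plane is operating normally
    # First miss: immediately probe once more before declaring a fault.
    return not send_echo_request()


def scripted(replies: Iterable[bool]) -> Callable[[], bool]:
    """Test helper (hypothetical): replay a fixed sequence of probe outcomes."""
    it = iter(replies)
    return lambda: next(it)


healthy = switch_node_failed(scripted([True]))
transient = switch_node_failed(scripted([False, True]))
failed = switch_node_failed(scripted([False, False]))
```

A single lost reply (`transient`) is not reported as a fault, which is the accuracy gain the text attributes to the second probe.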
S3, the SDN controller resolves the fault: when the SDN controller detects a fault, active fault recovery is applied first, and if active fault recovery fails, the fault is resolved by passive fault recovery;
active fault recovery relies on redundant backup paths, which replace the failed path to resolve the fault;
specifically, active fault recovery: the SDN controller first issues a group table entry to the SDN switch, the entry containing the port number of the primary forwarding path of the packet flow and the port number of the backup forwarding path; when the SDN controller detects that the primary path has failed, it sends an OFPGC_MODIFY instruction to the switch on the failed path, and the switch executes the action instruction of the group table entry, selecting the action instruction of the next priority according to priority; meanwhile, the controller uses step S2 to check, from the existing fault messages, whether the backup forwarding path has failed, and if the backup forwarding path has not failed, traffic is switched to the backup forwarding path to recover from the fault; recovering by switching to an available backup path has the advantage of a small fault recovery delay.
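The priority-based selection among group table actions can be illustrated with a small model. This sketch does not use any OpenFlow library: `Bucket` and `select_bucket` are hypothetical names, and it is assumed here that a lower `priority` number means a more preferred action.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Bucket:
    priority: int  # assumption: lower number = more preferred
    out_port: int  # port number of the primary or backup forwarding path


def select_bucket(buckets: List[Bucket], failed_ports: Set[int]) -> Optional[Bucket]:
    """Pick the most preferred bucket whose output port has not failed."""
    for bucket in sorted(buckets, key=lambda b: b.priority):
        if bucket.out_port not in failed_ports:
            return bucket
    return None  # no usable path left: active recovery cannot resolve the fault


group_entry = [Bucket(priority=0, out_port=1), Bucket(priority=1, out_port=2)]
normal = select_bucket(group_entry, failed_ports=set())
after_primary_fault = select_bucket(group_entry, failed_ports={1})
both_failed = select_bucket(group_entry, failed_ports={1, 2})
```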
Passive fault recovery relies on the SDN controller reacquiring the topology and rerouting the data flow to resolve the fault; because it recovers by reacquiring the global topology and rerouting, it has the advantage of flexibility.
Specifically, passive fault recovery: when the backup forwarding path used in active fault recovery cannot resolve the data plane link fault, the SDN controller reacquires the global topology of the data plane through the LLDP protocol, performs rerouting, and issues a new forwarding path to the switches, completing fault recovery and restoring normal operation;
the rerouting performed by the SDN controller includes performing domain division again, performing the pre-routing process again, and determining a forwarding path for the data flow again.
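The order of the two recovery modes (try the backup path first, reroute only if the backup is also affected) can be sketched as below. This is an illustrative sketch: paths are modeled as lists of switch names, links as unordered pairs, and `reroute` is a hypothetical callback standing in for topology reacquisition and route recomputation.

```python
from typing import Callable, FrozenSet, List, Optional, Set, Tuple

Link = FrozenSet[str]  # unordered pair of switch names (hypothetical model)


def path_links(path: List[str]) -> Set[Link]:
    """Return the set of links traversed by a path."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def path_usable(path: List[str], failed_links: Set[Link]) -> bool:
    return not (path_links(path) & failed_links)


def recover(
    backup_path: List[str],
    failed_links: Set[Link],
    reroute: Callable[[], Optional[List[str]]],
) -> Tuple[str, Optional[List[str]]]:
    """Active recovery if the backup path is intact, otherwise passive recovery."""
    if path_usable(backup_path, failed_links):
        return "active", backup_path
    return "passive", reroute()


backup = ["s1", "s3", "s4"]
primary_down = {frozenset(("s1", "s2"))}
backup_down_too = primary_down | {frozenset(("s3", "s4"))}

first = recover(backup, primary_down, lambda: None)
second = recover(backup, backup_down_too, lambda: ["s1", "s5", "s4"])
```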
The domain division divides the SDN network data plane into a plurality of domains according to network key elements manually designated in advance; the network key elements include the IP address prefixes designated in an IP network;
different SDN controllers manage and control different domains, and the switches are also divided into boundary (edge) switches, located at the boundaries between domains, and core switches, located within a domain.
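Domain division by IP prefix, and the split into boundary and core switches, can be sketched as follows. The prefixes, domain names, and function names here are all hypothetical examples; a switch is treated as a boundary switch if it has at least one link to a switch in a different domain.

```python
import ipaddress
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical, manually designated prefixes (the "network key elements").
DOMAIN_PREFIXES = {
    "domain_a": ipaddress.ip_network("10.0.1.0/24"),
    "domain_b": ipaddress.ip_network("10.0.2.0/24"),
}


def domain_of(ip: str) -> Optional[str]:
    """Map an IP address to the domain whose prefix contains it."""
    addr = ipaddress.ip_address(ip)
    for name, prefix in DOMAIN_PREFIXES.items():
        if addr in prefix:
            return name
    return None


def boundary_switches(
    switch_domain: Dict[str, str], links: List[Tuple[str, str]]
) -> Set[str]:
    """Switches that have a link crossing a domain boundary."""
    result: Set[str] = set()
    for a, b in links:
        if switch_domain[a] != switch_domain[b]:
            result.update((a, b))
    return result


switch_domain = {"s1": "domain_a", "s2": "domain_a", "s3": "domain_b"}
links = [("s1", "s2"), ("s2", "s3")]
boundary = boundary_switches(switch_domain, links)
core = set(switch_domain) - boundary
```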
The pre-routing performs routing and issues flow tables in advance, before network data flows arrive;
in the pre-routing process, the hop count of a path is used as the routing cost, so the optimal path is the path with the minimum cost; pre-routing is carried out domain by domain: for each domain, the SDN controller uses a routing algorithm (such as the Floyd-Warshall algorithm) to obtain the optimal paths from all switches in the domain to the boundary switches, and then adds flow tables to each switch in the domain. Pre-routing reduces the load on the controller and, to a certain extent, alleviates the scalability problem of a single controller.
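Pre-routing with hop count as the cost can be sketched with the Floyd-Warshall algorithm mentioned above. This is an illustrative sketch for one domain: the resulting `table` maps each switch to its next hop toward each boundary switch, which stands in for the flow tables the controller would issue; all names and the example topology are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple

INF = float("inf")


def floyd_warshall(nodes: List[str], links: List[Tuple[str, str]]):
    """All-pairs shortest paths with every link costing one hop."""
    dist = {(u, v): (0 if u == v else INF) for u, v in product(nodes, nodes)}
    nxt: Dict[Tuple[str, str], str] = {}
    for u, v in links:
        dist[u, v] = dist[v, u] = 1
        nxt[u, v], nxt[v, u] = v, u
    # product(...) varies its first element slowest, so k is the outer loop.
    for k, i, j in product(nodes, repeat=3):
        if dist[i, k] + dist[k, j] < dist[i, j]:
            dist[i, j] = dist[i, k] + dist[k, j]
            nxt[i, j] = nxt[i, k]
    return dist, nxt


def pre_route(nodes, links, boundary) -> Dict[str, Dict[str, str]]:
    """For each switch, the next hop toward each reachable boundary switch."""
    _, nxt = floyd_warshall(nodes, links)
    return {sw: {b: nxt[sw, b] for b in boundary if (sw, b) in nxt} for sw in nodes}


nodes = ["s1", "s2", "s3", "s4"]
links = [("s1", "s2"), ("s2", "s3"), ("s3", "s4"), ("s1", "s3")]
dist, _ = floyd_warshall(nodes, links)
table = pre_route(nodes, links, boundary=["s4"])
```

In this example `s1` reaches the boundary switch `s4` in two hops through `s3` rather than three hops through `s2`.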
The determining of a forwarding path for a data flow specifically includes:
if the network key elements of the sending end and the receiving end are the same, the two ends are in the same domain, the routing is intra-domain routing, and the data forwarding path is determined according to the routing mode of an OpenFlow network;
if the network key elements are different, cross-domain routing is performed, which is divided into three steps: calculating the inter-domain optimal route, routing the data to the destination domain, and routing the data within the destination domain.
The calculating of the inter-domain optimal route specifically comprises the following steps:
the SDN controller determines source and destination domains of the data flow according to the source and destination network key elements of the data flow, and obtains an optimal route between the source domain and the destination domain based on a boundary switch by using a global topology and a routing algorithm.
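One way to compute a minimum-hop route over boundary switches is a breadth-first search, sketched below. The patent does not name the algorithm used at this step, so BFS is an assumption chosen because hop count is the stated cost; the graph of boundary switches and the function name are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def inter_domain_route(
    graph: Dict[str, List[str]], sources: List[str], targets: Set[str]
) -> Optional[List[str]]:
    """Minimum-hop path from any source-domain boundary switch to any
    destination-domain boundary switch, or None if none is reachable."""
    parents: Dict[str, Optional[str]] = {s: None for s in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        if node in targets:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph.get(node, []):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical boundary switches: a1 in domain A, b1 and b2 in B, c1 in C.
graph = {"a1": ["b1"], "b1": ["a1", "b2"], "b2": ["b1", "c1"], "c1": ["b2"]}
route = inter_domain_route(graph, sources=["a1"], targets={"c1"})
unreachable = inter_domain_route(graph, sources=["a1"], targets={"d1"})
```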
The routing of the data to the destination domain includes: according to the obtained inter-domain optimal route, the SDN controller issues a flow table to the source switch, which modifies the destination network key element of the flow into the network key element of the first boundary switch on the path; the flow is then routed from the source switch to the boundary switch of the source domain by means of pre-routing; meanwhile, the SDN controller issues a flow table to each boundary switch on the inter-domain optimal route, whose flow table entry action modifies the destination network key element of the matched flow into the network key element of the next boundary switch on the inter-domain optimal route, so that the flow can be routed to the last boundary switch;
at the last boundary switch, the destination network key element of the data flow is changed back to the destination network key element of the original data flow.
The routing of the data within the destination domain includes: calculating the optimal intra-domain path according to the routing mode of an OpenFlow network and issuing the corresponding flow table entries to each switch on the path, so that the data flow completes its routing.
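The chain of destination rewrites used for cross-domain routing can be sketched as a list of (switch, new destination) pairs. This is an illustration only: it assumes the network key element of a boundary switch is an IP address, and `rewrite_plan`, the switch names, and the addresses are hypothetical.

```python
from typing import Dict, List, Tuple


def rewrite_plan(
    source_switch: str,
    boundary_path: List[str],
    boundary_key: Dict[str, str],
    original_dst: str,
) -> List[Tuple[str, str]]:
    """List of (switch, destination to set) along the inter-domain route."""
    # The source switch points the flow at the first boundary switch.
    plan = [(source_switch, boundary_key[boundary_path[0]])]
    # Each boundary switch points the flow at the next boundary switch.
    for current, following in zip(boundary_path, boundary_path[1:]):
        plan.append((current, boundary_key[following]))
    # The last boundary switch restores the original destination.
    plan.append((boundary_path[-1], original_dst))
    return plan


boundary_key = {"a1": "10.0.1.254", "b1": "10.0.2.1", "b2": "10.0.2.254"}
plan = rewrite_plan("s1", ["a1", "b1", "b2"], boundary_key, "10.0.3.7")
```

Because each hop only needs to reach the next boundary switch, the pre-installed intra-domain routes can carry the flow without any per-flow entries on core switches.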
S4, the SDN controllers cooperate to determine a routing path for an arriving data flow and then issue flow tables to the switches on the path to complete the routing of the data flow; specifically, the switches of the SDN network data plane are first divided into domains, pre-routing is then performed on the switches of the data plane, and finally the forwarding path of the data flow is determined and the data flow is routed.
Cooperative routing under multiple SDN controller architectures as shown in fig. 2, comprising the steps of:
The flow chart of the fault monitoring and fault recovery method shown in fig. 3 includes the following steps:
if the abnormal port does not belong to the network, the deletion of the port is considered to belong to normal network topology change, and an Echo message judgment 304 is carried out;
step 310 and step 311, judging whether the switch node fault exists through the Echo message;
if the network is judged to have no fault through the steps 309, 310 and 311, the network normally operates, namely, the fault recovery is completed through an active method;
if the network is judged to have a fault through the steps 309, 310 and 311, the step 312 is entered for passive fault recovery;
step 313, the SDN controller reacquires the global topology of the data plane and reroutes the data flow;
step 314, a new forwarding path is formulated and sent to the switches to complete fault recovery and restore normal operation.
From the above description of the embodiments, those skilled in the art will clearly understand that the present invention may be implemented by software plus a necessary hardware platform, and may of course also be implemented entirely in hardware, although in many cases the former is the better embodiment. With this understanding, all or part of the contribution of the technical solution of the present invention over the background art can be embodied in the form of a software product, which can be stored in a storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes instructions for causing a computer device (which can be a personal computer, a server, a network device, or the like) to execute the methods of the embodiments, or of some parts of the embodiments, of the present invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
Claims (4)
1. An SDN data plane fault monitoring and recovery method under a multi-controller architecture, characterized by comprising the following steps:
S1: a plurality of SDN controllers synchronize the global topology, and construct and update the topology of the intra-domain network;
S2: the SDN controller judges, by monitoring Port-status messages and Echo messages, whether an SDN data plane link fault or a switch node fault caused by a port abnormality has occurred; the judgment specifically comprises:
1) judging a data plane link fault: the SDN controller captures the Port-status messages sent by the data plane for fault monitoring; when the SDN controller learns, by parsing a Port-status message, that a port of a switch has been deleted, it determines from the local network topology whether that port is part of the network; if it is, the SDN controller considers that a data plane link fault caused by a switch port fault has occurred and must be resolved; otherwise, the deletion of the port is treated as a normal network topology change;
2) judging a switch node fault: the SDN controller actively monitors the switch nodes of the data plane and judges whether a switch node has failed by sending and receiving Echo messages; when the SDN controller fails for the first time to receive an Echo-reply message from a switch, it sends an Echo-request message to that switch again; if it still receives no Echo-reply message, the switch node is considered to have failed and the fault must be resolved; if the SDN controller receives an Echo-reply message from the switch, the data plane is operating normally;
S3: the SDN controller resolves the fault: when the SDN controller detects a fault, active fault recovery is attempted first, and when active fault recovery fails, the fault is resolved by passive fault recovery;
fault resolution by the SDN controller specifically comprises: (1) active fault recovery: the SDN controller first sends a corresponding group entry to the SDN switch, the group entry containing the port number of the primary forwarding path of the packet flow and the port number of the backup forwarding path; when the SDN controller detects that the primary path has failed, it sends an OFPGC_MODIFY instruction to the switch on the failed path, and the switch executes the action instruction of the group entry, selecting the group entry action instruction of the next priority; meanwhile, the SDN controller checks, based on the existing fault information, whether the backup forwarding path has failed, and if the backup forwarding path has not failed, traffic is switched to the backup forwarding path to recover from the fault;
(2) passive fault recovery: when the backup forwarding path of active fault recovery cannot resolve the data plane link fault, the SDN controller re-acquires the global topology of the data plane through LLDP (Link Layer Discovery Protocol), performs rerouting, and issues a new forwarding path to the switches, completing fault recovery and restoring normal operation;
the rerouting performed by the SDN controller comprises domain division, a pre-routing process, and forwarding path determination for the data flow;
the domain division divides the SDN data plane into a plurality of domains according to network key elements manually designated in advance; the network key elements include IP address prefixes designated in the IP network;
different SDN controllers manage and control different domains, and the switches are divided into boundary switches, located at the boundaries between domains, and intra-domain switches;
the pre-routing is a process of computing routes and issuing flow tables before network data flows arrive;
in the pre-routing process, the hop count of a path is used as the routing cost, so that the optimal path is the path with the minimum cost; pre-routing is carried out domain by domain: for each domain, the SDN controller uses the Floyd-Warshall algorithm to obtain the optimal paths from all switches in the domain to the boundary switches, and then uses these optimal paths to add flow table entries to each switch in the domain;
determining a forwarding path for the data flow specifically comprises: if the network key elements of the source and destination are the same, the sender and receiver are in the same domain, the route is an intra-domain route, and the data forwarding path is determined according to the OpenFlow network routing mode;
if the network key elements differ, cross-domain routing is performed, which is divided into three steps: calculating the inter-domain optimal route, routing the data to the destination domain, and routing the data within the destination domain;
S4: the SDN controllers cooperate to determine the routing path of an arriving data flow, and then issue flow tables to the switches on the path to complete the routing of the data flow.
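The per-domain pre-routing step of claim 1 (hop count as cost, Floyd-Warshall over the switches of one domain, then one flow entry per switch toward each boundary switch) can be sketched as follows. This is an illustrative sketch only: the function names, the representation of links as bidirectional pairs of switch names, and the modelling of a "flow entry" as a plain `(switch, boundary switch) -> next hop` mapping are assumptions made for this example and do not correspond to real OpenFlow messages.

```python
from itertools import product
from typing import Dict, Iterable, List, Tuple

INF = float("inf")
Pair = Tuple[str, str]


def floyd_warshall_hops(
    nodes: Iterable[str], links: Iterable[Pair]
) -> Tuple[Dict[Pair, float], Dict[Pair, str]]:
    """All-pairs minimum hop counts and next-hop table for one domain."""
    nodes = list(nodes)
    dist = {(u, v): (0 if u == v else INF) for u, v in product(nodes, repeat=2)}
    nxt: Dict[Pair, str] = {}
    for u, v in links:                      # each link is bidirectional, cost 1 hop
        dist[u, v] = dist[v, u] = 1
        nxt[u, v], nxt[v, u] = v, u
    for k in nodes:                         # classic triple loop
        for i in nodes:
            for j in nodes:
                if dist[i, k] + dist[k, j] < dist[i, j]:
                    dist[i, j] = dist[i, k] + dist[k, j]
                    nxt[i, j] = nxt[i, k]
    return dist, nxt


def path(nxt: Dict[Pair, str], src: str, dst: str) -> List[str]:
    """Rebuild the switch sequence from src to dst; empty if unreachable."""
    if src == dst:
        return [src]
    if (src, dst) not in nxt:
        return []
    hops = [src]
    while src != dst:
        src = nxt[src, dst]
        hops.append(src)
    return hops


def prerouting_entries(
    nodes: Iterable[str], links: Iterable[Pair], boundary: Iterable[str]
) -> Dict[Pair, str]:
    """Next hop from every switch toward every reachable boundary switch."""
    nodes = list(nodes)
    _, nxt = floyd_warshall_hops(nodes, links)
    return {
        (s, b): nxt[s, b]
        for s in nodes
        for b in boundary
        if s != b and (s, b) in nxt
    }
```

For a domain with switches `s1`, `s2`, `s3` and boundary switch `b1`, calling `prerouting_entries` once at start-up would yield the entries a controller could translate into flow table entries before any traffic arrives.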
2. The SDN data plane fault monitoring and recovery method under a multi-controller architecture of claim 1, wherein step S1 specifically comprises: the plurality of SDN controllers send LLDP (Link Layer Discovery Protocol) data packets to all switches connected to them through Packet_out messages, thereby constructing and updating the topology of the network within each SDN controller's domain;
the SDN controllers synchronize the global topology through east-west interface communication, ensuring consistency between the underlying network and the service processing logic; only updated information is transmitted during global topology synchronization, which saves network bandwidth and reduces network load while the global topology is maintained.
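The idea of transmitting only updated information during topology synchronization can be illustrated with a small Python sketch. The set-of-links representation and the function names `topology_delta` and `apply_delta` are assumptions made for this example; the claim does not specify the message format used on the east-west interface.

```python
from typing import Dict, Set, Tuple

Link = Tuple[str, str]  # hypothetical (switch, switch) link identifier


def topology_delta(old: Set[Link], new: Set[Link]) -> Dict[str, Set[Link]]:
    """Compute the update a controller would send instead of its full topology."""
    return {"added": new - old, "removed": old - new}


def apply_delta(links: Set[Link], delta: Dict[str, Set[Link]]) -> Set[Link]:
    """Apply a received update to a peer controller's copy of the topology."""
    return (links - delta["removed"]) | delta["added"]
```

A peer that applies each delta in order ends up with the same link set as the sender while receiving only the changed links.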
3. The SDN data plane fault monitoring and recovery method under a multi-controller architecture of claim 2, wherein calculating the inter-domain optimal route specifically comprises: the SDN controller determines the source and destination domains of the data flow according to the flow's source and destination network key elements, and uses the global topology and a routing algorithm to obtain the inter-domain optimal route between the source domain and the destination domain based on the boundary switches.
4. The SDN data plane fault monitoring and recovery method under a multi-controller architecture of claim 3, wherein routing the data to the destination domain comprises: the SDN controller issues a flow table entry to the source switch according to the obtained inter-domain optimal route, the entry modifying the destination network key element of the flow to the network key element of the first boundary switch on the path, so that the flow is routed from the source switch to the boundary switch of the source domain by means of pre-routing; meanwhile, the SDN controller issues a flow table entry to each boundary switch on the inter-domain optimal route, whose action modifies the destination network key element of the matched flow to the network key element of the next boundary switch on the inter-domain optimal route, so that the flow can be routed to the last boundary switch;
at the last boundary switch, the destination network key element of the data flow is changed back to the original destination network key element;
routing the data within the destination domain comprises: calculating the optimal intra-domain path according to the routing mode of the OpenFlow network, and issuing the corresponding flow table entries to each switch on the path, so that the data flow completes routing.
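The destination-rewriting scheme of claim 4 can be sketched as a function that lists, for each switch on the inter-domain route, the destination key element its flow entry writes into matching packets. The function name `cross_domain_rewrites`, the `key_of` mapping from boundary switch to key element, and the example addresses are hypothetical and serve only to illustrate the sequence of rewrites.

```python
from typing import Dict, List, Tuple


def cross_domain_rewrites(
    source_switch: str,
    boundary_path: List[str],
    key_of: Dict[str, str],
    original_dst: str,
) -> List[Tuple[str, str]]:
    """Return (switch, new destination key element) pairs along the route.

    boundary_path is the ordered list of boundary switches on the
    inter-domain optimal route; key_of maps each of them to its key element.
    """
    if not boundary_path:
        raise ValueError("an inter-domain route needs at least one boundary switch")
    # The source switch aims the flow at the first boundary switch.
    rewrites = [(source_switch, key_of[boundary_path[0]])]
    # Each boundary switch aims the flow at the next boundary switch.
    for current, following in zip(boundary_path, boundary_path[1:]):
        rewrites.append((current, key_of[following]))
    # The last boundary switch restores the original destination.
    rewrites.append((boundary_path[-1], original_dst))
    return rewrites
```

Between consecutive rewrites the flow is carried by the pre-installed pre-routing entries, so the controller only needs to touch the source switch and the boundary switches when a cross-domain flow arrives.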
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910933770.4A CN110708245B (en) | 2019-09-29 | 2019-09-29 | SDN data plane fault monitoring and recovery method under multi-controller architecture |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910933770.4A CN110708245B (en) | 2019-09-29 | 2019-09-29 | SDN data plane fault monitoring and recovery method under multi-controller architecture |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110708245A (en) | 2020-01-17 |
CN110708245B (en) | 2021-10-22 |
Family
ID=69196551
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910933770.4A Active CN110708245B (en) | 2019-09-29 | 2019-09-29 | SDN data plane fault monitoring and recovery method under multi-controller architecture |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110708245B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114915602B (en) * | 2021-01-29 | 2024-01-26 | 中移(苏州)软件技术有限公司 | Processing method, processing device and terminal for flow table in virtual switch |
CN112887202B (en) * | 2021-02-02 | 2022-05-27 | 浙江工商大学 | SDN link fault network convergence method based on sub-topology network |
CN115086978B (en) * | 2021-03-11 | 2024-05-07 | 中国移动通信集团四川有限公司 | Network function virtualization SDN network system |
CN113660140B (en) * | 2021-08-17 | 2023-04-07 | 北京交通大学 | Service function chain fault detection method based on data control plane hybrid sensing |
CN113992569B (en) * | 2021-09-29 | 2023-12-26 | 新华三大数据技术有限公司 | Multipath service convergence method, device and storage medium in SDN network |
CN114039833B (en) * | 2021-11-09 | 2024-04-12 | 江苏大学 | SRv 6-based industrial Internet multi-domain integrated architecture |
CN115277424B (en) * | 2022-06-23 | 2023-10-03 | 中国联合网络通信集团有限公司 | Decision issuing method, device and storage medium in software defined network |
CN115150322B (en) * | 2022-09-06 | 2022-11-25 | 中勍科技股份有限公司 | Multichannel RapidIO distribution system and fault self-isolation method thereof |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105871718A (en) * | 2016-03-21 | 2016-08-17 | 东南大学 | SDN (Software-Defined Networking) inter-domain routing implementation method |
CN106506353A (en) * | 2016-10-27 | 2017-03-15 | 吉林大学 | Virtual network single link failure restoration methods and system based on SDN |
CN106888163A (en) * | 2017-03-31 | 2017-06-23 | 中国科学技术大学苏州研究院 | The method for routing divided based on network domains in software defined network |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10356011B2 (en) * | 2014-05-12 | 2019-07-16 | Futurewei Technologies, Inc. | Partial software defined network switch replacement in IP networks |
Non-Patent Citations (1)
Title |
---|
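Research and Implementation of SDN Fault Monitoring and Recovery Technology; 卞宇翔 (Bian Yuxiang); Master's thesis, Nanjing University of Posts and Telecommunications; 20180228; Sections 3.3-3.5 and 4.1-4.3 * |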
Also Published As
Publication number | Publication date |
---|---|
CN110708245A (en) | 2020-01-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110708245B (en) | SDN data plane fault monitoring and recovery method under multi-controller architecture | |
US5016243A (en) | Automatic fault recovery in a packet network | |
US8441941B2 (en) | Automating identification and isolation of loop-free protocol network problems | |
EP0452487B1 (en) | Automatic fault recovery in a packet network | |
EP1511238B1 (en) | Distributed and disjoint forwarding and routing system and method | |
US6983294B2 (en) | Redundancy systems and methods in communications systems | |
EP0452466B1 (en) | Automatic fault recovery in a packet network | |
US7155632B2 (en) | Method and system for implementing IS-IS protocol redundancy | |
JP5941404B2 (en) | Communication system, path switching method, and communication apparatus | |
JP2017508401A (en) | Switch replacement of partial software defined network in IP network | |
JP2004173136A (en) | Network management device | |
JP2009239359A (en) | Communication network system, communication device, route design device, and failure recovery method | |
JP2002033767A (en) | Network-managing system | |
JP2009303092A (en) | Network equipment and line switching method | |
EP1940091B1 (en) | Autonomous network, node device, network redundancy method and recording medium | |
KR102157711B1 (en) | Methods for recovering failure in communication networks | |
CN111404734B (en) | Cross-layer network fault recovery system and method based on configuration migration | |
WO2011120423A1 (en) | System and method for communications system routing component level high availability | |
WO2023015897A1 (en) | Intelligent control method, apparatus and system for optical network | |
JP4717796B2 (en) | Node device and path setting method | |
CN114039833B (en) | SRv 6-based industrial Internet multi-domain integrated architecture | |
Hraska et al. | Enhanced Derived Fast Reroute Techniques in SDN | |
Hainana et al. | Design of a NFV Traffic Engineering Middlebox for Efficient Link Failure Detection and Recovery in SDN Core Networks | |
Valcarenghi et al. | Which resilience for the optical internet? an e-Photon/ONe+ outlook |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
OL01 | Intention to license declared | ||