KR20150059697A - Method and System for detecting network failure in Software Defined Network - Google Patents

Method and System for detecting network failure in Software Defined Network

Info

Publication number
KR20150059697A
Authority
KR
South Korea
Prior art keywords
controller
switch
active controller
active
standby
Prior art date
Application number
KR1020130143238A
Other languages
Korean (ko)
Inventor
박재우
김덕열
송세준
최백영
박형배
백은경
정기태
홍성민
Original Assignee
주식회사 케이티
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 주식회사 케이티 filed Critical 주식회사 케이티
Priority to KR1020130143238A priority Critical patent/KR20150059697A/en
Publication of KR20150059697A publication Critical patent/KR20150059697A/en

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/0686: Additional information in the notification, e.g. enhancement of specific meta-data
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/08: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0811: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking connectivity
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00: Routing or path finding of packets in data switching networks
    • H04L45/28: Routing or path finding of packets in data switching networks using route fault recovery
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00: Packet switching elements
    • H04L49/55: Prevention, detection or correction of errors
    • H04L49/557: Error correction, e.g. fault recovery or fault tolerance

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention relates to a method and a system for accurately and rapidly detecting a network failure in an SDN system. The method for detecting a link failure in a network system including an active controller and a standby controller comprises: a step in which the active controller, upon detecting a failure of a link connected to itself, checks its connection state with a switch connected to itself; a step in which the active controller, if the connection to the switch is found to be disconnected, transmits to the standby controller a first alarm message reporting the disconnection from the switch; a step in which the switch, upon detecting a failure of a link connected to itself, checks its connection state with the active controller; and a step in which the switch, if the connection to the active controller is found to be disconnected, transmits to the standby controller a second alarm message reporting the disconnection from the active controller, wherein the standby controller, upon receiving the first and second alarm messages, assumes the role of the active controller.

Description

METHOD AND SYSTEM FOR DETECTING NETWORK DEFECTS IN A SOFTWARE DEFINED NETWORK

The present invention relates to a method and system for detecting a link defect between a controller and an OpenFlow switch in a Software Defined Network (SDN), and more particularly to a method for quickly detecting and coping with a network defect in order to prevent an OpenFlow switch from being isolated from a controller due to a link defect, and to a network system to which such a method is applied.

A software defined network (SDN) provides a centralized network control structure using software programming by abstracting and separating control planes and data planes from existing integrated networks.

OpenFlow is a technology used as an interface specification between the controller and network equipment in SDN. The control and data planes can be implemented in software using the OpenFlow protocol, and new functions can be implemented quickly by installing such software on a general-purpose server. Based on the OpenFlow protocol, the controller sends commands to the switch over a secure channel, and the switch processes packets according to those commands, for example by forwarding them to the destination, modifying them, or discarding them.
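The command-driven packet handling described above can be illustrated with a minimal sketch. It is not the OpenFlow message format: the `FlowRule` fields, the `SimpleSwitch` class, and the choice to drop packets on a table miss are all simplifying assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlowRule:
    """Hypothetical, heavily simplified rule installed by a controller."""
    match_dst: str                  # destination address to match
    action: str                     # "forward", "modify", or "drop"
    out_port: Optional[int] = None  # port used by "forward" and "modify"
    new_dst: Optional[str] = None   # rewritten destination for "modify"


class SimpleSwitch:
    """Toy switch that applies the first matching rule to each packet."""

    def __init__(self):
        self.flow_table = []

    def install(self, rule):
        # Stands in for a command received from the controller.
        self.flow_table.append(rule)

    def process(self, packet):
        for rule in self.flow_table:
            if rule.match_dst == packet["dst"]:
                if rule.action == "drop":
                    return None
                if rule.action == "modify":
                    packet = {**packet, "dst": rule.new_dst}
                return (rule.out_port, packet)
        return None  # table miss: dropped in this sketch (an assumption)


switch = SimpleSwitch()
switch.install(FlowRule("10.0.0.2", "forward", out_port=2))
switch.install(FlowRule("10.0.0.3", "modify", out_port=1, new_dst="10.0.0.9"))
switch.install(FlowRule("10.0.0.4", "drop"))
```

If the switch loses its connection to every controller, no new rules can be installed, which is why the isolation scenario discussed in this document matters.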

According to OpenFlow Specification 1.3, each switch maintains redundant connections to the active controller and the standby controller to ensure the stability of the SDN. The redundant active controller and standby controller periodically check each other's status and maintain their own roles. If the active controller fails, the roles of the active controller and the standby controller are swapped to overcome the failure. However, because this method does not jointly consider the link between the controllers and the links connected to the OpenFlow switch, the OpenFlow switch may become isolated from the controller, leaving the network paralyzed.

SUMMARY OF THE INVENTION Accordingly, the present invention seeks to provide a method and system for accurately and quickly detecting network faults in an SDN system.

The present invention also seeks to provide a method for quickly detecting and coping with a network fault, in order to prevent an OpenFlow switch from being isolated from the controller that manages it due to a network fault in the SDN system, and a network system to which such a method is applied.

According to an embodiment of the present invention, a method for detecting link defects in a network system including an active controller and a standby controller is provided. The method includes the steps of: the active controller checking its connection state with a switch connected to the active controller when the active controller detects a defect in a link connected to itself; the active controller sending, if the connection to the switch is confirmed to be disconnected, a first notification message to the standby controller informing that the connection with the switch is broken; the switch checking its connection state to the active controller when the switch detects a defect in a link connected to itself; and the switch transmitting, if the connection to the active controller is confirmed to be disconnected, a second notification message to the standby controller informing that the connection with the active controller is broken. The standby controller that receives the first and second notification messages starts to function as an active controller.

A network system according to an embodiment of the present invention includes: an active controller having a processor and a memory in which a first instruction set executed by the processor is stored; a standby controller having a processor and a memory in which a second instruction set executed by the processor is stored; and a switch controlled by the active and standby controllers. The first instruction set, when executed by the processor of the active controller, causes the active controller to check its connection to the switch when it detects a defect in a link connected to itself and, if the connection to the switch is confirmed to be disconnected, to transmit to the standby controller a first notification message indicating that the connection with the switch is disconnected. The second instruction set, when executed by the processor of the standby controller, includes instructions that cause the standby controller to start acting as the active controller when it has received the first notification message from the active controller and a second notification message from the switch indicating that the connection with the active controller is broken.

According to an embodiment of the present invention, network faults are detected more quickly and accurately by using not only heartbeat (HB) messages, with which redundant or clustered controllers check each other's state, but also the controllers' topology awareness and link signals, so that network failures can be minimized.

FIG. 1 is a diagram illustrating a situation where an OpenFlow switch is isolated from a controller due to a link defect in an SDN network system.
FIG. 2 is a conceptual diagram illustrating a link defect detection method according to the present invention.
FIG. 3 is a flowchart illustrating a link defect detection method according to an embodiment of the present invention.

While the present invention has been described in connection with certain exemplary embodiments, it is to be understood that the invention is not limited to the disclosed embodiments, but, on the contrary, is intended to cover various modifications and similarities. It should be understood, however, that the invention is not intended to be limited to the particular embodiments, but includes all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS Hereinafter, the present invention will be described in detail with reference to the accompanying drawings.

In addition, the singular phrases used in the present specification and claims should be interpreted generally to mean "one or more" unless otherwise stated.

Also, the terms "module," "part," "interface," and the like used in the present specification generally mean a computer-related object and may mean, for example, hardware, software, and combinations thereof.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

FIG. 1 is a diagram illustrating a situation where an OpenFlow switch is isolated from a controller due to a link defect in an SDN network system.

The illustrated SDN network system includes an active controller 110, a standby controller 120, non-OpenFlow switches 130 and 140, and an OpenFlow switch 150. Although only one active controller 110 and one standby controller 120 are shown in the illustrated system, the present invention is not necessarily limited to such a configuration. The SDN network system may include a controller cluster comprising a plurality of controllers, with one controller in the cluster acting as the active controller and the remaining controllers acting as standby controllers.

Likewise, although only one OpenFlow switch is shown in FIG. 1, it will be understood by those skilled in the art that this is for convenience of description and that a plurality of OpenFlow switches may be controlled by the same controller cluster.

According to OpenFlow Specification 1.3, existing controller multiplexing methods require each OpenFlow switch to learn the physical IP addresses of the active controller and the standby controller through a configuration message.

As shown in FIG. 1, when defects occur simultaneously in the two links F1 and F2 in the redundant SDN network, the OpenFlow switch 150 loses access to the active controller 110. Thus, the current standby controller 120 must be switched to active.

However, under the current fault detection scheme, in which the active controller 110 and the standby controller 120 periodically check each other's states by exchanging heartbeat signals, the controllers 110 and 120 each confirm that the other is normal. The isolation of the OpenFlow switch 150 caused by the defects of the two links F1 and F2 is therefore not recognized, and the active controller remains active. As a result, the OpenFlow switch 150 managed by these controllers cannot operate normally, resulting in a failure of the entire network.
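The blind spot of a heartbeat-only scheme can be reproduced with a small reachability check. The wiring below is a hypothetical topology loosely modeled on FIG. 1; the text does not spell out the exact links, so the link names other than F1 and F2, the dedicated heartbeat link `HB`, and both helper functions are assumptions.

```python
from collections import deque

# Hypothetical wiring (an assumption, not taken from the patent text):
# 110 = active controller, 120 = standby controller,
# 130/140 = non-OpenFlow switches, 150 = OpenFlow switch.
CONTROL_LINKS = {
    "F1": (110, 130),
    "F2": (110, 140),
    "L3": (120, 130),
    "L4": (120, 140),
    "L5": (130, 150),
    "L6": (140, 150),
}
HEARTBEAT_LINK = "HB"  # assumed dedicated controller-to-controller link


def reachable(src, dst, failed):
    """Breadth-first search over the control links that are still up."""
    adjacency = {}
    for name, (a, b) in CONTROL_LINKS.items():
        if name not in failed:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def heartbeat_ok(failed):
    """The heartbeat succeeds while the controller-to-controller link is up."""
    return HEARTBEAT_LINK not in failed


failed = {"F1", "F2"}
print(heartbeat_ok(failed))          # controllers still see each other
print(reachable(110, 150, failed))   # active controller to OpenFlow switch
print(reachable(120, 150, failed))   # standby controller to OpenFlow switch
```

Under these assumptions the heartbeat check still passes while the active controller can no longer reach switch 150, even though the standby controller can. This is the condition that heartbeats alone fail to reveal.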

In order to solve such a problem, the present invention proposes a method for minimizing a network failure by detecting network faults more quickly and accurately using topology recognition and link signals of a controller.

FIG. 2 is a conceptual diagram illustrating a link defect detection method according to the present invention. As shown in the figure, when the active controller 110 detects a defect F1 in a link connected to itself, it checks its connection to the OpenFlow switch 150. If the connection to the OpenFlow switch 150 is intact, the active controller 110 reports the link defect F1 to the network operator. However, if the connection to the OpenFlow switch 150 is confirmed to be disconnected, the active controller 110 transmits to the standby controller 120 a first notification message indicating that the connection to the switch is disconnected (1).

Meanwhile, the OpenFlow switch 150 checks its connection to the active controller 110 when it detects a defect in a link connected to itself. If the connection to the active controller 110 is confirmed to be disconnected, the OpenFlow switch 150 sends the standby controller 120 a second notification message informing it that the connection to the active controller 110 is disconnected (2).

In one embodiment, when there are a plurality of standby controllers in the controller cluster, the OpenFlow switch 150 first determines a standby controller that can operate as the active controller, checks the connection state between the determined standby controller and the OpenFlow switch, and then transmits the second notification message to the standby controller that is determined to be in a normal connection state with the OpenFlow switch.

The standby controller 120, having received the first and second notification messages from the active controller 110 and the OpenFlow switch 150, detects from these messages that the OpenFlow switch 150 is isolated from the active controller 110 and starts its role as the active controller.

FIG. 3 is a flowchart illustrating a link defect detection method according to an embodiment of the present invention.

Steps S310-S330 described below are performed in the active controller, and steps S340-S360 are performed in the OpenFlow switch. Those skilled in the art will appreciate that these two groups of operations are performed independently and do not depend on each other in time order.

First, the steps (S310-S330) performed by the active controller will be described.

In step S310, the active controller detects a defect in a link connected to itself.

In step S320, the active controller checks whether the link defect detected in step S310 has broken its connection to the OpenFlow switch connected to itself.

In step S330, if the connection to the OpenFlow switch is determined to be disconnected, the active controller sends the standby controller a first notification message indicating that the connection with the OpenFlow switch is lost. On the other hand, if the link defect is determined not to affect the connection with the OpenFlow switch, the active controller can simply report the link defect to the network operator.
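The controller-side branch of steps S310-S330 might be sketched as follows. The function name, the message fields, and the three callbacks (a reachability probe, a channel to the standby controller, and an operator report) are hypothetical; the text does not define a message format or an API.

```python
from typing import Callable


def on_controller_link_defect(
    link_id: str,
    switch_reachable: Callable[[], bool],
    notify_standby: Callable[[dict], None],
    report_to_operator: Callable[[str], None],
) -> str:
    """Called after a link defect has been detected (S310)."""
    # S320: check whether the switch can still be reached despite the defect.
    if switch_reachable():
        # The defect does not isolate the switch: only report it.
        report_to_operator(link_id)
        return "reported"
    # S330: the switch is unreachable, so tell the standby controller.
    notify_standby({
        "type": "FIRST_NOTIFICATION",   # hypothetical message name
        "reason": "switch_unreachable",
        "link": link_id,
    })
    return "notified_standby"


sent, reports = [], []
outcome = on_controller_link_defect("F1", lambda: False, sent.append, reports.append)
```

Passing the probe and the two channels in as callbacks keeps the decision logic separate from whatever transport a real controller would use.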

Next, the steps (S340-S360) performed by the OpenFlow switch will be described.

In step S340, the OpenFlow switch detects a defect in a link connected to itself.

In step S350, the OpenFlow switch determines whether the link defect detected in step S340 has broken the connection between itself and the active controller.

In step S360, if the connection to the active controller is determined to be disconnected, the OpenFlow switch sends the standby controller a second notification message indicating that the connection with the active controller is lost.

In one embodiment, when there are a plurality of standby controllers in the controller cluster, the OpenFlow switch first determines a standby controller that can operate as the active controller, checks the connection state between the determined standby controller and the OpenFlow switch, and then transmits the second notification message to the standby controller that is determined to be in a normal connection state.
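The switch-side steps S340-S360, including the choice among several standby controllers, could look roughly like the sketch below. The text does not say how a candidate is chosen, so iterating over a list that is assumed to be ordered by priority is an illustrative assumption, as are the function names and the message fields.

```python
from typing import Callable, Iterable, Optional


def pick_standby(
    standbys: Iterable[str],
    link_ok: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate whose connection to the switch is healthy."""
    for controller in standbys:  # assumed to be ordered by priority
        if link_ok(controller):
            return controller
    return None


def on_switch_link_defect(
    active_reachable: Callable[[], bool],
    standbys: Iterable[str],
    link_ok: Callable[[str], bool],
    send: Callable[[str, dict], None],
) -> Optional[str]:
    """Called after the switch has detected a link defect (S340)."""
    # S350: check whether the active controller can still be reached.
    if active_reachable():
        return None
    # S360: notify a standby controller that is itself reachable.
    target = pick_standby(standbys, link_ok)
    if target is not None:
        send(target, {
            "type": "SECOND_NOTIFICATION",   # hypothetical message name
            "reason": "active_unreachable",
        })
    return target


outbox = []
chosen = on_switch_link_defect(
    lambda: False,
    ["standby-a", "standby-b"],
    lambda name: name == "standby-b",
    lambda target, message: outbox.append((target, message)),
)
```

In this usage example `standby-a` is skipped because its link to the switch is down, so the notification goes to `standby-b`.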

Upon receiving both the first notification message and the second notification message as a result of the above-described operations of the active controller and the OpenFlow switch, the standby controller starts its role as the active controller (step S370).
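The promotion rule of step S370 can be sketched as a small state holder that switches role only once both notifications have arrived, in either order. The class name, the message type strings, and the absence of any timeout or reset are assumptions of this sketch, not details given in the text.

```python
class StandbyController:
    """Toy standby controller that promotes itself after both notifications."""

    REQUIRED = {"FIRST_NOTIFICATION", "SECOND_NOTIFICATION"}  # hypothetical names

    def __init__(self):
        self.role = "standby"
        self._seen = set()

    def receive(self, message_type: str) -> str:
        # Record only the two notification types; ignore anything else.
        if message_type in self.REQUIRED:
            self._seen.add(message_type)
        # S370: take over once both the controller and the switch have reported.
        if self.role == "standby" and self._seen == self.REQUIRED:
            self.role = "active"
        return self.role


standby = StandbyController()
after_first = standby.receive("SECOND_NOTIFICATION")   # switch reports first
after_second = standby.receive("FIRST_NOTIFICATION")   # controller reports next
```

Requiring both messages means a single report from one side alone does not trigger a role change in this sketch.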

The apparatus and method according to embodiments of the present invention may be implemented in the form of program instructions that can be executed through various computer means and recorded in a computer-readable medium. The computer readable medium may include program instructions, data files, data structures, and the like, alone or in combination.

Program instructions to be recorded on a computer-readable medium may be those specially designed and constructed for the present invention or may be those known and available to those skilled in the computer software arts. Examples of computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media; and hardware devices specifically configured to store and execute program instructions, such as ROM, RAM, flash memory, and the like. The above-mentioned medium may also be a transmission medium, such as an optical or metal wire or a waveguide, that carries a carrier wave transmitting a signal designating program instructions, data structures, and the like. Examples of program instructions include machine language code such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter or the like.

The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

The embodiments of the present invention have been described above. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. Therefore, the disclosed embodiments should be considered in an illustrative rather than a restrictive sense. The scope of the present invention is defined by the appended claims rather than by the foregoing description, and all differences within the scope of equivalents thereof should be construed as being included in the present invention.

Claims (9)

1. A method for detecting a link defect in a network system including an active controller and a standby controller, the method comprising:
the active controller checking a connection state with a switch connected to the active controller when the active controller detects a defect in a link connected to the active controller;
the active controller sending, if the connection to the switch is confirmed to be disconnected, a first notification message to the standby controller informing that the connection with the switch is broken;
the switch checking a connection state to the active controller when the switch detects a defect in a link connected to the switch; and
the switch sending, if the connection to the active controller is confirmed to be disconnected, a second notification message to the standby controller informing that the connection with the active controller is broken,
wherein the standby controller that has received the first and second notification messages starts serving as an active controller.
2. The method of claim 1, wherein the switch is an OpenFlow switch.
3. The method of claim 1, further comprising reporting the detected link defect to a network operator when, as a result of checking the connection state to the switch connected to the active controller, the switch is determined to be connected.
4. The method of claim 1, wherein the network system includes a controller cluster composed of a plurality of controllers, one controller of the cluster being an active controller and the one or more remaining controllers being standby controllers.
5. The method of claim 4, wherein the active controller transmits to the standby controller a first notification message indicating that the connection to the switch is broken, the method comprising:
determining a controller that can operate as an active controller among the standby controllers in the cluster;
checking a connection state between the determined standby controller and the switch; and
transmitting the second notification message to the standby controller that is determined to be in a normal connection state with the switch.
6. A network system comprising:
an active controller having a processor and a memory in which a first set of instructions executed by the processor is stored;
a standby controller having a processor and a memory in which a second set of instructions executed by the processor is stored; and
a switch controlled by the active and standby controllers,
wherein the first set of instructions, when executed by the processor of the active controller, causes the active controller to check the connection to the switch connected to the active controller when the active controller detects a defect in a link connected to the active controller, and to send to the standby controller a first notification message indicating that the connection with the switch is disconnected, and
wherein the second set of instructions, when executed by the processor of the standby controller, causes the standby controller to initiate a role of an active controller when it has received the first notification message from the active controller and a second notification message from the switch indicating that the connection with the active controller is broken.
7. The network system of claim 6, wherein the switch is an OpenFlow switch.
8. The network system of claim 6, wherein the first set of instructions includes an instruction to report a detected link defect to a network operator if the connection to the switch connected to the active controller is confirmed to be intact.
9. The network system of claim 6, wherein the network system includes a controller cluster composed of a plurality of controllers, one controller of the cluster being an active controller and the one or more remaining controllers being standby controllers.
KR1020130143238A 2013-11-22 2013-11-22 Method and System for detecting network failure in Software Defined Network KR20150059697A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020130143238A KR20150059697A (en) 2013-11-22 2013-11-22 Method and System for detecting network failure in Software Defined Network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1020130143238A KR20150059697A (en) 2013-11-22 2013-11-22 Method and System for detecting network failure in Software Defined Network

Publications (1)

Publication Number Publication Date
KR20150059697A true KR20150059697A (en) 2015-06-02

Family

ID=53490819

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020130143238A KR20150059697A (en) 2013-11-22 2013-11-22 Method and System for detecting network failure in Software Defined Network

Country Status (1)

Country Link
KR (1) KR20150059697A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150215156A1 (en) * 2014-01-24 2015-07-30 Electronics And Telecommunications Research Institute Method and apparatus for network failure restoration
CN108011815A (en) * 2016-10-28 2018-05-08 中国电信股份有限公司 Network control method and software defined network equipment and system

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150215156A1 (en) * 2014-01-24 2015-07-30 Electronics And Telecommunications Research Institute Method and apparatus for network failure restoration
CN108011815A (en) * 2016-10-28 2018-05-08 中国电信股份有限公司 Network control method and software defined network equipment and system
CN108011815B (en) * 2016-10-28 2020-12-01 中国电信股份有限公司 Network control method and software defined network device and system

Legal Events

Date Code Title Description
WITN Withdrawal due to no request for examination