KR20180081204A

KR20180081204A - System and method for controller failure recovery in a distributed SDN controller structure

Info

Publication number: KR20180081204A
Application number: KR1020170002008A
Authority: KR
Inventors: 정치욱; 유현; 한영태
Original assignee: 주식회사 케이티
Priority date: 2017-01-05
Filing date: 2017-01-05
Publication date: 2018-07-16
Also published as: KR101909264B1

Abstract

When the role of an SDN controller is that of a preparation controller, the SDN controller receives first execution controller state information transmitted from a first execution controller among a plurality of adjacent SDN controllers. The SDN controller calculates an execution detection value from the received state information and checks whether the calculated value falls within a preset normal range. If the value is outside the normal range, the first execution controller is determined to have failed, and the preparation controller changes its role to that of a second execution controller. The execution fitness of the first execution controller and of the second execution controller is then calculated, and, based on the calculated fitness, the execution controller that takes over the role of the failed first execution controller is set, thereby recovering from the failure. According to the present invention, the time spent detecting and recovering from a failure can be minimized, so that high network availability can be ensured and the reliability of the SDN controller can be improved.

Description

The present invention relates to a system and method for controller failure recovery in a distributed SDN controller structure.

The network structure of a software-defined network (SDN) is centralized. That is, the data plane and the control plane of existing network equipment are separated, and management, monitoring, and control of the network are handled by a single centralized controller.

In this centralized SDN network structure, information about the network devices is collected at the SDN controller through a control-plane interface such as OpenFlow, and abstracted network resource information is provided to the user. Based on this abstracted network resource information, performance can be optimized beyond that of distributed network operation, and the network can be managed flexibly.

Meanwhile, to handle carrier-grade networks, which require stable and fast processing for a large number of network devices spread over a wide area, a distributed SDN controller structure has been proposed in which the SDN controllers are physically distributed but logically operate as a single SDN controller. In the distributed SDN controller structure, multiple SDN controllers physically exist as separate controllers, but they detect failures and share network information by exchanging information with one another, and thus operate as a single logical SDN controller that provides users with the same abstracted network information.

Accordingly, open SDN controllers such as ONOS (Open Network Operating System) and OpenDaylight all have a distributed SDN controller structure. Many ONF technical documents also recommend a distributed SDN controller structure. In addition, the ONF defines an interface between SDN controllers for exchanging information among distributed SDN controllers.

However, techniques for detecting failures of SDN controllers in a distributed SDN controller structure, and methods for responding when a failure occurs, are not yet specified in any standard technical document.

Accordingly, the present invention provides a failure recovery system and method that can identify and respond to failures in a distributed SDN controller structure by sharing the state information of the distributed controllers.

According to one aspect of the present invention for achieving the above technical object, a system for recovering from a failure of an SDN controller in a distributed SDN controller structure includes:

an interface that interworks with a plurality of adjacent SDN controllers connected to at least one OpenFlow switch on a network, receives adjacent controller state information transmitted from each of the plurality of adjacent SDN controllers, and delivers the collected controller state information of the SDN controller to the plurality of adjacent SDN controllers; and a processor that, based on the plurality of pieces of adjacent controller state information received through the interface and the transmitted controller state information, assigns the role of the controller as one of an execution role, a preparation role, or a standby role, and, according to the assigned role, detects a failure of an execution controller to which the execution role is assigned or of a preparation controller to which the preparation role is assigned, wherein the controller failure recovery system is included in each of the SDN controller and the plurality of adjacent SDN controllers.

According to another aspect of the present invention for achieving the above technical object, a method by which a controller failure recovery system included in an SDN controller recovers from a failure of a failed SDN controller in a distributed SDN controller structure includes:

receiving, by the controller failure recovery system of a preparation controller when the role of the SDN controller is that of the preparation controller, first execution controller state information transmitted from the controller failure recovery system of a first execution controller among a plurality of adjacent SDN controllers; calculating an execution detection value based on the received first execution controller state information and checking whether the calculated execution detection value falls within a preset normal range; changing the role of the preparation controller to that of a second execution controller if the value does not fall within the normal range; and calculating the execution fitness of the first execution controller and of the second execution controller and determining the role of the second execution controller based on the calculated execution fitness.
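The detection and role-change steps of this method can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function name `monitor_step`, the role strings, the dictionary layout of the state information, and the use of CPU usage as the detection value in the usage example are all assumptions made for illustration. The way the execution detection value is actually computed is not given in this passage, so it is passed in as a caller-supplied function, and the final fitness comparison is left out for the same reason.

```python
from typing import Callable, Dict, Tuple

State = Dict[str, float]  # hypothetical layout of received state information


def monitor_step(own_role: str,
                 exec_state: State,
                 detect: Callable[[State], float],
                 normal_range: Tuple[float, float]) -> str:
    """Run one monitoring step and return this controller's resulting role.

    Only a preparation controller monitors the execution controller.
    `detect` stands in for the execution detection value calculation,
    which is an assumption here because its formula is not given.
    """
    if own_role != "preparation":
        return own_role
    low, high = normal_range
    value = detect(exec_state)
    if low <= value <= high:
        return "preparation"  # first execution controller looks healthy
    # Out of range: treat the first execution controller as failed and
    # take over as the second execution controller.
    return "execution"


# Usage with a placeholder detection value (CPU usage only, illustrative).
cpu_only = lambda state: state["cpu_usage"]
healthy = monitor_step("preparation", {"cpu_usage": 0.40}, cpu_only, (0.0, 0.9))
failed = monitor_step("preparation", {"cpu_usage": 0.99}, cpu_only, (0.0, 0.9))
```

In this sketch `healthy` stays `"preparation"` and `failed` becomes `"execution"`; a controller in any other role returns its role unchanged.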

According to the present invention, a logical role is assigned to each distributed SDN controller through the exchange of state information between SDN controllers, and normal controller operation can be checked based on the information the controllers share with one another and on each logical role, so that failure detection time and failure recovery time can be minimized.

In addition, minimizing the failure detection and recovery time makes it possible to ensure high network availability and to improve the stability of the SDN controller.

FIG. 1 is an exemplary diagram of a general centralized SDN structure.
FIG. 2 is an exemplary diagram of a general distributed SDN controller structure.
FIG. 3 is an exemplary diagram of a structure for detecting SDN controller failures through a general controller management server.
FIG. 4 is an exemplary diagram of a distributed SDN controller structure to which a controller failure recovery system according to an embodiment of the present invention is applied.
FIG. 5 is a structural diagram of a controller failure recovery system according to an embodiment of the present invention.
FIG. 6 is a flowchart of an SDN controller role distribution process according to an embodiment of the present invention.
FIG. 7 is a flowchart of an SDN controller recovery process according to a first embodiment of the present invention.
FIG. 8 is a flowchart of an SDN controller recovery process according to a second embodiment of the present invention.
FIG. 9 is an exemplary diagram of SDN controller role distribution according to an embodiment of the present invention.
FIG. 10 is an exemplary diagram of an SDN controller recovery situation according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the invention pertains can readily practice it. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described here. To describe the invention clearly, parts unrelated to the description are omitted from the drawings, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

Hereinafter, a controller failure recovery system and method in a distributed SDN controller structure according to an embodiment of the present invention are described in detail with reference to the drawings.

Before the embodiments of the present invention are described, a general SDN structure is first described with reference to FIGS. 1 to 3.

FIG. 1 is an exemplary diagram of a general centralized SDN structure.

As shown in FIG. 1, in a general centralized SDN structure, a failure of the SDN controller affects the area for which the failed SDN controller is responsible. Therefore, as shown in FIG. 1, availability and reliability are secured through a master (active) and slave (standby) configuration.

FIG. 2 is an exemplary diagram of a general distributed SDN controller structure.

As shown in FIG. 2, in the distributed SDN controller structure the controllers are multiplexed so that signals for a wide area can be processed quickly and the processing load can be distributed. This allows fast network signal processing and high throughput for the area that each SDN controller manages.

FIG. 3 is an exemplary diagram of a structure for detecting SDN controller failures through a general controller management server.

As shown in FIG. 3, a single controller management server determines whether a plurality of SDN controllers are operating normally. The controller management server makes this determination based on whether the processes of the SDN controllers are running normally and on the connectivity between the SDN controllers and the OpenFlow switches.

In addition, when a failure occurs in an SDN controller, the SDN controllers to be recovered are recovered in an arbitrary order, or one slave controller is changed to the master controller at random. In this approach, the master controller and the slave controllers are confined to each network.

When failures are detected in the general SDN controller structures described above, checking only process information or the connectivity between the SDN controller and the OpenFlow switch does not clearly establish whether the SDN controller is operating normally.

This is because a failure is difficult to detect when the network interface operates normally and the process is still alive. Moreover, judging normal operation of the SDN controller from connectivity alone cannot detect situations in which normal flow processing is impossible because of a logic error in an internal application of the SDN controller, hacking, or the like.

In addition, promoting a slave controller to master in an arbitrary order, or simply because it is on standby, within the confines of a single network can lead to a situation in which the most suitable SDN controller among the multiple slave SDN controllers is not selected. Furthermore, when another controller is selected after an SDN controller fails, the time delay caused by information exchange hinders fast recovery from the failed controller.

Accordingly, the embodiments of the present invention describe a controller failure recovery system that assigns logical roles for a network based on the exchange of state information between SDN controllers, and that can thereby detect the failure of a failed controller and perform recovery. For convenience of explanation, the embodiments describe each SDN controller as including a controller failure recovery system (not shown in the drawings), but the SDN controller itself may perform the functions of the controller failure recovery system.

FIG. 4 is an exemplary diagram of a distributed SDN controller structure to which a controller failure recovery system according to an embodiment of the present invention is applied.

As shown in FIG. 4, each of the SDN controllers 10-1 to 10-8 shares its own controller state information with the other SDN controllers. Here, the controller state information includes the CPU usage of the SDN controller, the number of OpenFlow messages processed per second, the average latency to the OpenFlow switches, and the role currently assigned to the SDN controller.

The per-second OpenFlow message throughput cannot be shared when the SDN controller first boots; it is shared from the point at which the SDN controller has connected to an OpenFlow switch and processed OpenFlow messages. The average latency, in contrast, can be extracted when the SDN controller first boots, from the OpenFlow Hello messages exchanged with the OpenFlow switch. Because the OpenFlow Hello message is well known, a detailed description of it is omitted in the embodiments of the present invention.

Based on the controller state information received from each of the neighboring SDN controllers, each SDN controller assigns itself a logical role for each network and performs the assigned role. Controller state information is also received from the SDN controllers of an external network, to account for the case in which a controller of the external network is in a better state than the controllers connected to the network.

Receiving the state information of neighboring SDN controllers, transmitting the controller's own state information to neighboring SDN controllers, assigning logical roles, and detecting and recovering from SDN controller failures are described, by way of example, as being performed by the controller failure recovery systems 100a to 100h included in the SDN controllers 10-1 to 10-8, respectively, but the invention is not necessarily limited to this. In the embodiments of the present invention, every SDN controller is described, by way of example, as starting in the execution (run) state in a network when it is first installed in that network.

The structure of the controller failure recovery system 100 is described with reference to FIG. 5.

FIG. 5 is a structural diagram of a controller failure recovery system according to an embodiment of the present invention.

As shown in FIG. 5, the SDN controller interworks with all adjacent SDN controllers through the interface 110 of the controller failure recovery system 100. It provides its own controller state information to all adjacent SDN controllers and receives the controller state information of all adjacent SDN controllers.

Here, the controller state information is described, by way of example, as including the CPU usage of the SDN controller, the number of flows processed per second, the average latency to the SDN switches, and the state information of the SDN controller, but it is not necessarily limited to these. The controller state information is described, by way of example, as being shared periodically at a preset time interval; in the embodiments of the present invention, each controller provides its own controller state information to the adjacent SDN controllers every 10 ms.
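As an illustration of the controller state information described above, the following Python sketch models one shared record and its serialization. The class name, field names, units, and the JSON encoding are all assumptions made for this example; the text specifies only which quantities are shared and the 10 ms period. The throughput field is optional because, as noted earlier, it cannot be shared until the controller has processed OpenFlow messages.

```python
import json
from dataclasses import dataclass, asdict
from enum import Enum
from typing import Optional


class Role(Enum):
    RUN = "run"      # execution role
    READY = "ready"  # preparation role
    WAIT = "wait"    # standby role


@dataclass
class ControllerState:
    controller_id: str                 # hypothetical identifier
    cpu_usage: float                   # assumed to be a fraction in [0, 1]
    avg_latency_ms: float              # average latency to the switches (unit assumed)
    role: Role
    flows_per_second: Optional[float] = None  # unknown right after first boot

    def to_message(self) -> str:
        """Serialize the record for sharing with adjacent controllers."""
        data = asdict(self)
        data["role"] = self.role.value  # store the enum as a plain string
        return json.dumps(data)

    @staticmethod
    def from_message(message: str) -> "ControllerState":
        """Rebuild a record from a message produced by to_message()."""
        data = json.loads(message)
        data["role"] = Role(data["role"])
        return ControllerState(**data)


SHARE_INTERVAL_S = 0.010  # the 10 ms sharing period of the embodiment

# Usage: a freshly booted controller has no throughput figure yet.
booted = ControllerState("c1", cpu_usage=0.25, avg_latency_ms=3.5, role=Role.RUN)
restored = ControllerState.from_message(booted.to_message())
```

A record that passes through `to_message()` and `from_message()` compares equal to the original, which is the property a receiver relies on when it stores the state of its neighbors.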

The processor 120 assigns a role to the SDN controller based on the state information of all SDN controllers received through the interface 110 and on the SDN controller's own state information. If the role of the SDN controller is assigned as execution (run), the processor 120 sets the SDN controller as the master and notifies the OpenFlow switch, through the interface 110, that its role has been set to master.

The processor 120 also detects a failure of the SDN controller to which the execution (run) role is assigned. How roles are assigned and how failures are detected is described in detail later.

The storage unit 130 stores and manages the controller state information of all SDN controllers received through the interface 110. It also stores and manages the role information assigned to the SDN controller itself, together with the SDN controller's own controller state information.

In this environment, the way in which the controller failure recovery system 100 distributes roles to SDN controllers and recovers from the failure of an SDN controller is described with reference to FIGS. 6 to 8.

FIG. 6 is a flowchart of an SDN controller role distribution process according to an embodiment of the present invention.

As shown in FIG. 6, the controller failure recovery system 100 collects the controller state information of the SDN controller (S100). When the SDN controller boots for the first time, the controller state information is described, by way of example, as including the CPU usage and the average latency to the SDN switches, but it is not necessarily limited to these. In the embodiments of the present invention, SDN controllers are described, by way of example, as being given the execution (run) role by default when they first boot.

The controller failure recovery system 100 not only collects the controller state information of its own SDN controller but also receives and checks the controller state information of each adjacent SDN controller (S101). The controller state information collected from adjacent SDN controllers may also include SDN controller identification information and the role information of those adjacent SDN controllers.

Once the controller failure recovery system 100 has obtained, through steps S100 and S101, both the SDN controller's own controller state information and the state information of the adjacent SDN controllers, it checks, based on the role information of each controller, whether at least two controllers are in the execution state (S102).

If the controller failure recovery system 100 finds that at least two controllers are in the execution state, it examines the received controller state information and calculates the execution fitness of each SDN controller set to the execution state (S103). The execution fitness is calculated using Equation 1.

[Equation 1: not reproduced in this text]

Here, α is a configuration variable that is not limited to any particular value, l_i is the latency to the OpenFlow switch, and the remaining term, whose symbol is not reproduced in this text, is the CPU usage at time t.

The controller failure recovery system 100 compares all of the execution fitness values calculated in step S103 and checks whether its own SDN controller has the highest execution fitness among the SDN controllers in the execution state (S104). If the SDN controller in which the controller failure recovery system 100 is installed has the highest execution fitness, the system sets that SDN controller as the execution SDN controller (S105). It then transmits to the OpenFlow switch a signal announcing the role, together with the identification information of the SDN controller (S106).

However, if the check in step S104 shows that the SDN controller in which the controller failure recovery system 100 is installed does not have the highest execution fitness, the controller failure recovery system 100 performs a procedure for changing the state of the SDN controller from the execution state to the preparation (ready) state.

That is, when the controller failure recovery system 100 is to change the role of the SDN controller from the execution state to the preparation state, it checks all of the state information of the adjacent SDN controllers collected in step S101 and determines whether more than one SDN controller is in the preparation state (S107).

If more than one SDN controller is in the preparation state, the preparation fitness is calculated based on the controller state information collected in step S100 and checked in step S101 (S108). The preparation fitness is calculated using Equation 2.

[Equation 2: not reproduced in this text]

Here, β is a configuration variable that is not limited to any particular value, l_i is the latency to the OpenFlow switch, and the remaining term, whose symbol is not reproduced in this text, is the CPU usage at time t.

From the result calculated in step S108, the controller failure recovery system 100 determines whether the SDN controller in which it is installed has the highest preparation fitness (S109).

If the SDN controller itself has the highest preparation fitness, the controller failure recovery system 100 sets the role of the SDN controller to the preparation state (S110). If the SDN controller does not have the highest preparation fitness, the controller failure recovery system 100 changes the state of the SDN controller from the preparation state to the standby (wait) state (S111). An SDN controller whose role is set to the standby state is described, by way of example, as not periodically transmitting its own controller state information to the adjacent SDN controllers.
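The role distribution flow of steps S102 to S111 can be sketched in Python as follows. This is an illustrative sketch under several assumptions, not the flowchart of FIG. 6 itself: the names `Role`, `State`, and `decide_role` are hypothetical, the two fitness functions are supplied by the caller because Equations 1 and 2 are not reproduced in this text, and the behavior when a controller is alone in its state (it keeps that role) and on ties (the local controller wins) is assumed, since neither case is specified.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Role(Enum):
    RUN = "run"      # execution
    READY = "ready"  # preparation
    WAIT = "wait"    # standby


@dataclass
class State:
    controller_id: str
    cpu_usage: float
    latency: float
    role: Role


Fitness = Callable[[State], float]


def decide_role(own: State, neighbors: List[State],
                run_fitness: Fitness, ready_fitness: Fitness) -> Role:
    """Return the role this controller should take for one network."""
    role = own.role
    if role is Role.RUN:
        # S102: are there two or more controllers in the run state?
        running = [own] + [n for n in neighbors if n.role is Role.RUN]
        # S103-S105: stay in run if alone or if own fitness is the highest.
        if len(running) < 2 or max(running, key=run_fitness) is own:
            return Role.RUN  # S106 (notifying the switch) is omitted here
        role = Role.READY    # lost the contest: fall through to the ready check
    if role is Role.READY:
        # S107: which controllers are in the ready state?
        ready = [own] + [n for n in neighbors if n.role is Role.READY]
        # S108-S110: stay ready if alone or if own fitness is the highest.
        if len(ready) < 2 or max(ready, key=ready_fitness) is own:
            return Role.READY
        return Role.WAIT     # S111
    return Role.WAIT


def demo_fitness(s: State) -> float:
    """Placeholder score for demonstration only; NOT Equation 1 or 2."""
    return -(s.cpu_usage + s.latency)


c1 = State("c1", 0.2, 1.0, Role.RUN)
c2 = State("c2", 0.5, 2.0, Role.RUN)
c3 = State("c3", 0.9, 5.0, Role.RUN)
first_round = [decide_role(c, [o for o in (c1, c2, c3) if o is not c],
                           demo_fitness, demo_fitness) for c in (c1, c2, c3)]
```

With three freshly booted controllers, the first evaluation leaves `c1` in the run state and moves `c2` and `c3` to the ready state; once those new roles have been shared, a second evaluation keeps `c2` ready and moves `c3` to the standby state. A deployed system would need a deterministic tie-break (for example on the controller identifier), because in this sketch two controllers with equal fitness would each consider themselves the winner.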

An example in which the processor 120 assigns the SDN controller's own role as described above is first described with reference to FIG. 9. For convenience of explanation, the embodiments of the present invention describe, by way of example, a case in which a second network is already in operation and a new first network is built, and roles are distributed among the SDN controllers connected to the first network.

FIG. 9 is an exemplary diagram of SDN controller role distribution according to an embodiment of the present invention.

As shown in (a) of FIG. 9, it is assumed that four SDN controllers (① to ④) are directly connected to the newly formed first network, and that the first network is also indirectly connected to four SDN controllers (⑤ to ⑧) attached to the second network, which is already in operation. The first SDN controller (①) through the fourth SDN controller (④), which are directly connected to the first network, are described, by way of example, as being in the execution state as a result of their initial boot.

Because the fifth SDN controller (⑤) through the eighth SDN controller (⑧), which are indirectly connected to the first network, have already been given roles on the second network, their role for the first network is described, by way of example, as being set to the standby state. For example, the fifth SDN controller (⑤) has the execution role on the second network but the standby role on the first network. Likewise, the seventh SDN controller (⑦) has the preparation role on the second network but the standby role on the first network.

As shown in FIG. 9(b), the SDN controllers connected to the first network set one execution controller and a plurality of preparation controllers based on their respective controller state information. In this embodiment, the first SDN controller (①) is set as the execution controller.

Then, as shown in FIG. 9(c), one preparation controller and a plurality of standby controllers are set based on the controller state information of the second SDN controller (②) through the fourth SDN controller (④). In this embodiment, the second SDN controller (②) becomes the preparation controller, and the third SDN controller (③) and the fourth SDN controller (④) are set as standby controllers. The second SDN controller (②) continuously monitors whether the first SDN controller (①) is operating normally.
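
The two-stage role distribution of FIG. 9 can be sketched as follows. This is a minimal Python sketch, not the patented implementation: Equations 1 and 2 are not reproduced in this passage, so `execution_fitness` and `preparation_fitness` are hypothetical stand-ins that simply reward low CPU usage and low switch delay, and `ControllerState`, `alpha`, and `distribute_roles` are names assumed for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControllerState:
    controller_id: int
    cpu_usage: float     # fraction of CPU in use, 0.0 to 1.0
    avg_delay_ms: float  # average delay to the connected OpenFlow switches


def execution_fitness(state: ControllerState, alpha: float = 0.5) -> float:
    # Hypothetical stand-in for Equation 1: lower CPU usage and lower delay score higher.
    return alpha * (1.0 - state.cpu_usage) + (1.0 - alpha) / (1.0 + state.avg_delay_ms)


def preparation_fitness(state: ControllerState, alpha: float = 0.5) -> float:
    # Hypothetical stand-in for Equation 2; assumed here to use the same inputs.
    return execution_fitness(state, alpha)


def distribute_roles(states: list[ControllerState]) -> dict[int, str]:
    # Stage 1 (FIG. 9(b)): the fittest controller becomes the execution controller.
    executor = max(states, key=execution_fitness)
    roles = {executor.controller_id: "execution"}
    # Stage 2 (FIG. 9(c)): among the rest, the fittest becomes the preparation
    # controller and all others become standby controllers.
    rest = [s for s in states if s.controller_id != executor.controller_id]
    if rest:
        preparer = max(rest, key=preparation_fitness)
        for s in rest:
            roles[s.controller_id] = (
                "preparation" if s.controller_id == preparer.controller_id else "standby"
            )
    return roles


# Four controllers on the first network, with made-up state values.
states = [
    ControllerState(1, 0.2, 1.0),
    ControllerState(2, 0.4, 2.0),
    ControllerState(3, 0.6, 3.0),
    ControllerState(4, 0.8, 4.0),
]
roles = distribute_roles(states)
```

With these illustrative values, controller 1 takes the execution role, controller 2 the preparation role, and controllers 3 and 4 remain standby, matching the outcome drawn in FIG. 9(c).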

With roles assigned to the SDN controllers through the above procedure, and with the controllers connected to OpenFlow switches and processing OpenFlow messages, the process of detecting and recovering from an SDN controller failure is described with reference to FIGS. 7 and 8.

FIG. 7 is a flowchart of an SDN controller recovery process according to a first embodiment of the present invention, and FIG. 8 is a flowchart of an SDN controller recovery process according to a second embodiment of the present invention. FIG. 7 shows how an SDN controller that is a preparation controller detects and recovers from a failure of the execution controller, and FIG. 8 shows how an SDN controller that is a standby controller detects and recovers from a failure of the preparation controller.

First, the case in which the SDN controller is a preparation controller is described. As shown in FIG. 7, the controller failure recovery system 100 included in the SDN controller periodically receives controller state information from all connected SDN controllers, or collects and checks the controller state information of its own SDN controller (S200). The controller failure recovery system 100 then checks whether its own SDN controller is an execution controller, that is, whether it has been assigned the execution role (S201).

If the SDN controller is an execution controller, the failure recovery procedure ends. In this embodiment the execution controller performs no failure detection operation, but the invention is not necessarily limited to this.

If the check in step S201 shows that the SDN controller is not an execution controller but a preparation controller assigned the preparation role, the controller failure recovery system 100 determines whether the number of times the state information of the execution controller was not received within the preset period (the uncollected count) exceeds a preset number (S202). If the controller state information is received periodically, or the uncollected count does not exceed the preset number, the controller failure recovery system 100 calculates an execution detection value (S203). The execution detection value is calculated using Equation 3 below.

Here, Equation 3 uses the maximum CPU usage required per flow and f_i, where f_i denotes the OpenFlow message throughput per second of the i-th SDN controller. The maximum CPU usage required per flow is a variable that can change with network conditions and is not limited to any single value. The i-th SDN controller is limited to the execution controller and the preparation controller.

The controller failure recovery system 100 checks whether the execution detection value calculated with Equation 3 falls within the normal range (S204). If it does, the procedure is repeated from step S200.

However, if the check in step S204 shows that the execution detection value is outside the normal range, the controller failure recovery system 100 of the preparation controller determines that the execution controller is operating abnormally. The controller failure recovery system 100 of the preparation controller therefore changes the role of its own SDN controller to the execution state (S205).
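
Steps S202 through S205 amount to a two-part health check on the execution controller. The Python sketch below is an illustration under stated assumptions: Equation 3 is not reproduced in this passage, so the execution detection value is taken as an input rather than computed, and the function name, the parameter names, and the closed-interval form of the normal range are hypothetical. Treating too many missed reports as a failure follows claims 5 and 9.

```python
def should_take_over(missed_reports: int, max_missed: int,
                     detection_value: float,
                     normal_range: tuple[float, float]) -> bool:
    """Return True if the preparation controller should switch to the execution role (S205)."""
    # S202: too many missed state reports is treated as a failure (per claims 5 and 9).
    if missed_reports > max_missed:
        return True
    # S203/S204: the detection value is computed elsewhere (Equation 3) and
    # compared against the preset normal range (assumed inclusive at both ends).
    low, high = normal_range
    return not (low <= detection_value <= high)
```

A caller would invoke this once per reporting period and, when it returns False, loop back to step S200.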

In this case there are two execution controllers: the existing execution controller (hereinafter, the 'first execution controller') and the execution controller whose role was changed in step S205 (hereinafter, the 'second execution controller'). The two execution controllers therefore calculate their execution fitness using Equation 1 above (S206).

Based on the calculated execution fitness, the SDN controller with the higher fitness is set as the execution controller, and the remaining SDN controller is set as a preparation controller or a standby controller (S207). If the execution controller changes from the first execution controller to the second execution controller, the controller failure recovery system 100 of the second execution controller notifies the OpenFlow switch that the role has changed (S208). At the same time, the role of the first execution controller is changed to the standby state.

However, if the execution fitness of the first execution controller, the existing execution controller, is higher than that of the second execution controller, the second execution controller is changed from the execution state, its temporarily changed role, to the standby state. Because the first execution controller continues to serve as the execution controller, step S208 may be omitted.
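
The arbitration between the two execution controllers in steps S206 through S208 can be sketched as follows. This is a hedged Python illustration: the fitness values are assumed to have been computed already with Equation 1, the function name and the returned keys are hypothetical, the losing controller is given the standby role as stated in claim 10, and keeping the first execution controller on a tie is an assumption, since the text does not address equal fitness.

```python
def resolve_execution_conflict(first_fitness: float, second_fitness: float) -> dict[str, object]:
    """Decide the roles of the first and second execution controllers (S207)."""
    if second_fitness > first_fitness:
        # The promoted controller stays in the execution role and must tell
        # the OpenFlow switch about the change (S208).
        return {"first": "standby", "second": "execution", "notify_switch": True}
    # The original execution controller keeps its role, so S208 is skipped.
    # Ties are resolved in favour of the first controller (assumption).
    return {"first": "execution", "second": "standby", "notify_switch": False}
```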

When the standby state has been assigned as the role of the SDN controller, as shown in FIG. 8, the controller failure recovery system 100 in the standby SDN controller receives controller state information from the preparation controller (S210). The controller failure recovery system 100 checks whether the number of times the controller state information was not received from the preparation controller (the uncollected count) exceeds a preset number (S211).

If the controller state information is transmitted from the preparation controller at the preset period, or the uncollected count does not exceed the preset number, step S210 is repeated. However, if the uncollected count exceeds the preset number, the standby controller determines that the preparation controller has failed or that the preparation controller has changed its role to execution controller.

Each of the standby controllers therefore receives controller state information from all adjacent standby controllers and transmits its own controller state information to the adjacent standby controllers (S212). Based on the controller state information of the standby controllers shared in step S212, the controller failure recovery system 100 calculates the preparation fitness using Equation 2 above (S213).

Based on the preparation fitness calculated in step S213, the controller failure recovery system 100 checks whether its own SDN controller has the highest preparation fitness (S214). If it does, the controller failure recovery system 100 changes the role of the SDN controller from the standby state to the preparation state (S215). Otherwise, the controller failure recovery system 100 keeps the role of the SDN controller in the standby state (S216).
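
The standby-side procedure of FIG. 8 (S210 through S216) can be sketched in the same style. This Python sketch rests on assumptions: the preparation fitness of each standby controller is taken as already computed with Equation 2 from the state shared in S212, the function and parameter names are hypothetical, and breaking ties by the lowest controller ID is an added assumption so that every standby controller reaches the same decision.

```python
def next_role_for_standby(self_id: int, missed_reports: int, max_missed: int,
                          preparation_fitness_by_id: dict[int, float]) -> str:
    """Return the role a standby controller should hold after one check cycle."""
    # S211: while reports from the preparation controller keep arriving,
    # stay in standby and keep monitoring (back to S210).
    if missed_reports <= max_missed:
        return "standby"
    # S213/S214: pick the standby controller with the highest preparation
    # fitness; on a tie the lowest ID wins (assumption, not from the source).
    best_id = max(preparation_fitness_by_id,
                  key=lambda cid: (preparation_fitness_by_id[cid], -cid))
    # S215 if this controller is the best candidate, otherwise S216.
    return "preparation" if best_id == self_id else "standby"
```

Because every standby controller evaluates the same shared fitness table, at most one of them promotes itself in a given cycle.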

The SDN controller recovery described above is illustrated with reference to FIG. 10. FIG. 10 covers only recovery of the execution controller, as an example.

FIG. 10 is an illustration of an SDN controller recovery situation according to an embodiment of the present invention.

First, as shown in FIG. 10(a), it is assumed that the second SDN controller (②) connected to the first network is the execution controller and the fifth SDN controller (⑤) connected to the second network is the preparation controller. It is further assumed that the fifth SDN controller (⑤), which continuously monitors the second SDN controller (②), has determined that the second SDN controller (②) has failed.

The second SDN controller (②) and the fifth SDN controller (⑤) then go through the recovery procedure described with reference to FIG. 7 to determine which of the two has the higher execution fitness. As shown in FIG. 10(b), if the execution fitness of the fifth SDN controller (⑤) is higher, the fifth SDN controller (⑤) becomes the execution controller for the first network, and the second SDN controller (②), the previous execution controller, changes its role to standby controller.

Although embodiments of the present invention have been described in detail above, the scope of the present invention is not limited to them. Various modifications and improvements made by those skilled in the art using the basic concept of the present invention as defined in the following claims also fall within the scope of the present invention.

Claims

1. A controller failure recovery system for recovering from a failure of an SDN controller in a distributed SDN controller structure, the system comprising:
an interface that is connected to at least one OpenFlow switch and to a plurality of adjacent SDN controllers on a network, receives adjacent controller state information transmitted from the plurality of adjacent SDN controllers, and transmits controller state information of the SDN controller to the plurality of adjacent SDN controllers; and
a processor that assigns the role of the SDN controller as one of an execution role, a preparation role, and a standby role based on the plurality of pieces of adjacent controller state information received through the interface and the transmitted controller state information, and detects a failure of an execution controller or a preparation controller,
wherein the controller failure recovery system is included in each of the SDN controller and the plurality of adjacent SDN controllers.

2. The controller failure recovery system of claim 1,
wherein the processor,
if there are a plurality of execution controllers, calculates an execution fitness for each of the execution controllers based on the controller state information and the plurality of pieces of adjacent controller state information, and sets the SDN controller as the execution controller if the SDN controller has the highest execution fitness.

3. The controller failure recovery system of claim 2,
wherein the processor,
if there are a plurality of preparation controllers, calculates a preparation fitness for each of the preparation controllers based on the controller state information and the plurality of pieces of adjacent controller state information, and sets the SDN controller as the preparation controller if the SDN controller has the highest preparation fitness.

4. The controller failure recovery system of claim 3,
wherein the processor,
when the SDN controller is assigned the role of execution controller or preparation controller, transmits controller state information collected at preset time intervals to the plurality of adjacent SDN controllers through the interface and receives adjacent controller state information from the plurality of adjacent SDN controllers.

5. The controller failure recovery system of claim 1,
wherein the processor,
if the SDN controller is a preparation controller, changes the role of the preparation controller to execution controller if the number of times the state information of the execution controller transmitted from the execution controller is not received is equal to or greater than a preset number.

6. The controller failure recovery system of claim 5,
wherein the processor,
if the number of times of non-reception is less than or equal to the preset number, calculates an execution detection value based on the received controller state information of the execution controller, and changes the role of the preparation controller to execution controller if the calculated execution detection value does not fall within a preset range.

7. The controller failure recovery system of claim 1,
wherein the processor,
if the SDN controller is a standby controller and the number of times the state information of the preparation controller transmitted from the preparation controller is not received is equal to or greater than a preset number, calculates a preparation fitness based on the controller state information of a plurality of standby controllers, and changes the role of the standby controller having the highest preparation fitness to preparation controller.

8. A method for recovering from a failure of an SDN controller in a distributed SDN controller structure, the method comprising:
receiving, by the controller failure recovery system of a preparation controller when the role of the SDN controller is preparation controller, first execution controller state information transmitted from the controller failure recovery system of a first execution controller among a plurality of adjacent SDN controllers;
calculating an execution detection value based on the received first execution controller state information and checking whether the calculated execution detection value falls within a preset normal range;
changing the role of the preparation controller to a second execution controller if the execution detection value does not fall within the normal range; and
calculating the execution fitness of the first execution controller and the second execution controller, and determining the role of the second execution controller based on the calculated execution fitness.

9. The method of claim 8, further comprising,
after receiving the first execution controller state information:
checking whether the uncollected count for the first execution controller state information is greater than a preset number; and
determining that the first execution controller has failed if the first execution controller state information has not been collected more than the preset number of times.

10. The method of claim 8,
wherein determining the role of the second execution controller comprises:
if the execution fitness of the first execution controller is higher than the execution fitness of the second execution controller, maintaining the first execution controller in the role of execution controller and assigning the second execution controller the role of standby controller; and
if the execution fitness of the first execution controller is lower than the execution fitness of the second execution controller, maintaining the second execution controller in the role of execution controller and changing the role of the first execution controller to standby controller.

11. The method of claim 8, further comprising,
if the role of the SDN controller is standby controller:
receiving preparation controller state information of the preparation controller;
checking whether the uncollected count for the preparation controller state information is greater than a preset number;
determining that the preparation controller has failed if the uncollected count is greater than the preset number;
receiving standby controller state information from a plurality of standby controllers and calculating a preparation fitness based on the received standby controller state information; and
changing the role of the standby controller having the highest preparation fitness to preparation controller.

12. The method of claim 8, further comprising,
before receiving the first execution controller state information:
assigning, by the controller failure recovery system included in each of a plurality of SDN controllers, a role to its own SDN controller.

13. The method of claim 12,
wherein assigning the role comprises:
receiving adjacent controller state information from a plurality of adjacent SDN controllers and transmitting the controller state information of the SDN controller to the plurality of adjacent SDN controllers;
if the SDN controller is an execution controller, checking whether a plurality of execution controllers exist based on the adjacent controller state information;
if a plurality of execution controllers exist, calculating an execution fitness based on the controller state information of the SDN controller and the adjacent controller state information of each execution controller; and
if the execution fitness of the SDN controller is the highest, setting the SDN controller as the execution controller.

14. The method of claim 13, further comprising:
if the execution fitness of the SDN controller is not the highest, checking whether a plurality of preparation controllers exist based on the adjacent controller state information;
if a plurality of preparation controllers exist, calculating a preparation fitness based on the controller state information of the SDN controller and the adjacent controller state information of each preparation controller;
if the preparation fitness of the SDN controller is the highest, setting the SDN controller as a preparation controller; and
if the preparation fitness of the SDN controller is not the highest, setting the role of the SDN controller to standby controller.

15. The method of claim 14,
wherein the controller state information and the adjacent controller state information each include the CPU usage of the SDN controller that collects the controller state information and the average delay rate with the OpenFlow switch to which that SDN controller is connected.

16. The method of claim 15,
wherein the execution fitness and the preparation fitness are each calculated based on arbitrary setting variables, the delay rate with the OpenFlow switch, and the CPU usage.

17. The method of claim 15,
wherein the controller state information further includes an OpenFlow message throughput per second, and
the execution detection value is calculated based on the maximum CPU usage required per flow and the OpenFlow message throughput per second.