KR101913745B1

KR101913745B1 - Apparatus and method of configuring transmission route utilizing data plane application in software defined network

Info

Publication number: KR101913745B1
Application number: KR1020170028877A
Authority: KR
Inventors: 홍충선; 김도현
Original assignee: 경희대학교 산학협력단
Priority date: 2016-11-02
Filing date: 2017-03-07
Publication date: 2018-11-01
Also published as: KR20180048232A

Abstract

Disclosed is a transmission path setting apparatus that sets a transmission path in a software defined network by utilizing a data plane application. The apparatus includes a controller configured to set a flow table holding the priority of each transmission path between a source and a destination (Src-Dst). The apparatus may further include an application module configured to perform, for each state, the action with the largest reward value (Reward) based on the flow table. According to the present invention, the load concentrated on the controller in a software defined network can thereby be reduced.

Description

Apparatus and method for establishing a transmission path utilizing a data plane application in a software defined network

The present invention relates to an apparatus and method for setting a transmission path in a software defined network, and more particularly to an apparatus and method for setting a transmission path in a software defined network by utilizing a data plane application.

A software defined network (SDN) is a networking technology that separates the control plane and the data plane, which coexist in conventional network equipment such as routers, so that various network functions can be implemented and applied in software. In a traditional network environment, applying a specific function such as a firewall, Deep Packet Inspection, or a traffic monitor requires purchasing and installing expensive equipment that performs that function, which imposes a considerable burden in terms of cost and maintenance.

In a software defined network environment, however, such functions can be implemented and developed through programming, as noted above. Software defined networking has therefore recently been applied to various network environments (Internet of Things, Data Center Network), and management techniques suited to them are being researched and developed. The overall structure of a software defined network can be divided into an application layer, a control plane, and a data plane. In this regard, FIGS. 1 and 2 show the basic structure of an existing software defined network related to the present invention.

In such an existing software defined network, however, many functions for network operation and management are placed on the OpenFlow switch. In addition, functions required by specific fields that interwork with the software defined network are located in the application layer and carry out their work in coordination with the controller. For these reasons, when a large-scale network containing many nodes is operated and managed by the controller alone, the controller is easily overloaded, and the network may consequently fail to operate efficiently.

Accordingly, an object of the present invention is to reduce the load concentrated on the controller in a software defined network.

Another object of the present invention is to limit the increase in transmission delay of user data caused by the load concentrated on the controller in a software defined network.

To achieve the above objects, a transmission path setting apparatus that sets a transmission path in a software defined network by utilizing a data plane application is disclosed. The apparatus includes a controller configured to set, when there are multiple output ports for one input port (In_port), a flow table holding the priority of each transmission path between a source and a destination (Src-Dst). The apparatus may further include an application module configured to perform, for each state, the action with the largest reward value (Reward) based on the flow table. According to the present invention, the load concentrated on the controller in a software defined network can thereby be reduced.

According to one embodiment, when a packet arrives at an OpenFlow switch from a host, the controller may receive from the OpenFlow switch a request message (Packet_IN message) for the flow table needed to forward the packet. Upon receiving the request message, the controller may select the shortest-distance multiple paths over which the packet can be transmitted.

According to one embodiment, the OpenFlow switch may determine its current state (State) by using the output ports in each flow table.

According to one embodiment, the application module may generate, through a reward calculation module (R-Calculator), a reward table (Reward Table) based on the output port information for the same input port in the flow table.

According to one embodiment, the flow table may include: flow entries configured to distinguish a plurality of flows from one another; a match field (Match Field) that contains, for each flow entry, the priority of each transmission path, the input port (in_port), and the MAC and IP addresses of the source and the destination; and an action field that contains, for each flow entry, the output port corresponding to the input port (in_port).
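The flow table layout described above can be pictured with a small data structure. The following is a minimal sketch only: the class names `MatchField` and `FlowEntry` and the helper `output_ports_by_in_port` are hypothetical names chosen for illustration and are not taken from the patent or from any OpenFlow library.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MatchField:
    """Match field of one flow entry (field names are illustrative)."""
    priority: int   # priority of the transmission path
    in_port: int    # input port of the switch
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str


@dataclass(frozen=True)
class FlowEntry:
    """One flow entry: a match field plus an action field (output port)."""
    match: MatchField
    out_port: int   # output port paired with match.in_port


def output_ports_by_in_port(table: List[FlowEntry]) -> Dict[int, List[int]]:
    """Group candidate output ports by input port, highest priority first."""
    grouped: Dict[int, List[int]] = {}
    for entry in sorted(table, key=lambda e: -e.match.priority):
        grouped.setdefault(entry.match.in_port, []).append(entry.out_port)
    return grouped
```

Grouping by input port mirrors the case the text singles out, where one input port has several candidate output ports and a choice among them has to be made.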

According to one embodiment, the application module includes a state monitor (S-Monitor) module configured to monitor a state associated with whether the number of transmitted/received bytes on the ports other than the input port (In_port) has increased. The application module may further include a reward calculation module (R-Calculator) configured to calculate a rate for each of the transmitted/received byte counts and to monitor a state associated with whether the rate has increased. The application module may further include an action conductor module (Action Conductor) configured to receive the states associated with whether the byte counts and the respective rates have increased, and to perform, for each state, the action with the largest reward value.

According to one embodiment, the state monitor module may monitor the state of the ports by defining the state as 0 if the number of transmitted/received bytes at time t+1 equals the number at time t, and as 1 if the number at time t+1 is greater than the number at time t.
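The 0/1 state definition above can be sketched as a comparison of two byte-counter snapshots. This is an illustration under stated assumptions: the function name `port_states` and the counter layout are hypothetical, and treating a port as state 1 when either its transmit or its receive counter has grown is an assumption, since the text does not say how the two counters are combined.

```python
from typing import Dict, Tuple

# port number -> (tx_bytes, rx_bytes); layout assumed for illustration
ByteCounters = Dict[int, Tuple[int, int]]


def port_states(prev: ByteCounters, curr: ByteCounters, in_port: int) -> Dict[int, int]:
    """Return 0/1 states for every port except the input port.

    State 0: byte counters at time t+1 equal those at time t.
    State 1: a byte counter increased between t and t+1 (assumed: tx or rx).
    """
    states: Dict[int, int] = {}
    for port, (tx, rx) in curr.items():
        if port == in_port:
            continue  # the input port is excluded from monitoring
        # A port with no earlier snapshot is treated as unchanged.
        prev_tx, prev_rx = prev.get(port, (tx, rx))
        states[port] = 1 if (tx > prev_tx or rx > prev_rx) else 0
    return states
```

For example, with snapshots `{1: (100, 100), 2: (50, 50), 3: (10, 10)}` at time t and `{1: (200, 200), 2: (50, 50), 3: (10, 30)}` at time t+1, and port 1 as the input port, port 2 is in state 0 and port 3 is in state 1.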

According to one embodiment, for the reward table Q(s, a), when a packet arrival is detected the action conductor module may arbitrarily select a first action (a) from the state (s), perform the first action (a) to observe the reward value (r), select a second action (a') from the state (s), and update the reward table Q(s, a) to Q(s, a'). The action conductor module may repeat the reward observation and the reward table update until the reward table associated with the state (s) is complete, and a path decision may then be made by selecting the action with the largest reward value found through the repeated observations. If only one output port exists for the input port, the update may be performed by raising the priority of the corresponding flow entry.
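The path decision described above can be reduced to a short sketch for a single state. It is a simplification under stated assumptions: the reward table for the state is taken to be "complete" once every candidate output port has one observed reward, and `decide_output_port` and `observe_reward` are hypothetical names. The priority update for the single-port case is left to the caller.

```python
import random
from typing import Callable, Dict, List, Optional


def decide_output_port(
    out_ports: List[int],
    observe_reward: Callable[[int], float],
    rng: Optional[random.Random] = None,
) -> int:
    """Pick the output port with the largest observed reward for one state."""
    if not out_ports:
        raise ValueError("at least one output port is required")
    if len(out_ports) == 1:
        # Only one candidate: nothing to learn. Per the text, the caller
        # would raise the priority of this flow entry instead.
        return out_ports[0]

    rng = rng or random.Random()
    untried = list(out_ports)
    rng.shuffle(untried)  # actions are tried in arbitrary order

    rewards: Dict[int, float] = {}
    for port in untried:
        # Perform the action (forward via this port) and observe its reward.
        rewards[port] = observe_reward(port)

    # The row is complete: choose the action with the largest reward.
    return max(rewards, key=lambda p: rewards[p])
```

A real agent would keep refining these estimates as traffic changes; the sketch only shows the final selection step once the table row is filled.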

According to one embodiment, the reward value (Reward) may be calculated as the product of two terms: the reciprocal of the transmission delay (Transmission Delay) measured at the corresponding port when the action conductor module performs the action, and a weight that depends on the maximum bandwidth.
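Written out, the reward described above is reward = weight(max bandwidth) × (1 / transmission delay). The sketch below illustrates this under one loud assumption: the text does not specify how the weight is derived from the maximum bandwidth, so the linear scaling against a reference bandwidth (`ref_bandwidth_bps`) is purely hypothetical, as is the function name.

```python
def reward_value(
    delay_s: float,
    max_bandwidth_bps: float,
    ref_bandwidth_bps: float = 1e9,
) -> float:
    """Reward = weight * (1 / transmission delay).

    The weight here is max_bandwidth / ref_bandwidth, an assumed choice;
    the source only states that the weight depends on the maximum bandwidth.
    """
    if delay_s <= 0:
        raise ValueError("transmission delay must be positive")
    weight = max_bandwidth_bps / ref_bandwidth_bps
    return weight * (1.0 / delay_s)
```

With this choice, a port with lower measured delay or higher maximum bandwidth receives a larger reward, which matches the direction of the description.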

According to one embodiment, the action conductor module may extract the input port (In_port) information of each flow entry in the flow table, receive from the reward calculation module the reward value based on the available bandwidth of at least one output port corresponding to the input port, and determine the final action.

Also disclosed, according to another aspect of the present invention, is an apparatus that sets a transmission path in a software defined network by utilizing a data plane application. The apparatus includes a state monitor module configured to monitor a state associated with whether the number of transmitted/received bytes on the ports other than the input port (In_port) has increased. The apparatus further includes an action conductor (Action Conductor) module configured to receive the state associated with whether the number of transmitted/received bytes has increased and to perform, for each state, the action with the largest reward value (Reward).

According to one embodiment, the apparatus may further include a reward calculation module (R-Calculator) configured to calculate a rate for each of the transmitted/received byte counts and to monitor a state associated with whether the rate has increased. In this case, the action conductor module may be configured to receive the states associated with whether the byte counts and the respective rates have increased, and to perform, for each state, the action with the largest reward value.

According to one embodiment, the state monitor module may monitor the state of the ports by defining the state as 0 if the number of transmitted/received bytes at time t+1 equals the number at time t, and as 1 if the number at time t+1 is greater than the number at time t.

According to one embodiment, the action conductor module may be configured to perform, for each state, the action with the largest reward value (Reward), based on whether the ports are in use and on the received flow table. In the action conductor module, the state is defined by whether the ports are in use, so the reward value obtained from an action in that state may vary depending on whether the ports are in use.

Also disclosed, according to yet another aspect of the present invention, is a method of setting a transmission path in a software defined network by utilizing a data plane application. The method includes a state monitoring step of monitoring a state associated with whether the number of transmitted/received bytes on the ports other than the input port (In_port) has increased. The method further includes a rate monitoring step of calculating a rate for each of the transmitted/received byte counts and monitoring a state associated with whether the rate has increased. The method further includes an action performing step of receiving the states associated with whether the byte counts and the respective rates have increased, and performing, for each state, the action with the largest reward value.

According to one embodiment, the state monitoring step may monitor the state of the ports by defining the state as 0 if the number of transmitted/received bytes at time t+1 equals the number at time t, and as 1 if the number at time t+1 is greater than the number at time t.

According to one embodiment, the action performing step may include: a first action selection step of arbitrarily selecting, for the reward table Q(s, a), a first action (a) from the state (s) when a packet arrival is detected; a reward observation step of performing the first action (a) and observing the reward value (r); and a second action selection/reward table update step of selecting a second action (a') from the state (s) and updating the reward table Q(s, a) to Q(s, a').

According to one embodiment, the action performing step may further include: a reward table completion determination step of determining whether the reward table associated with the state (s) is complete; and a path decision step of repeating the reward observation step and the second action selection/reward table update step until the reward table is complete, and making a path decision by selecting the action with the largest reward value found through the repeated observations.

The transmission path setting method according to the present invention places functions such as transmission path update and request, in the form of an application, on the data plane, which previously handled only packet forwarding in a software defined network. This has the advantage of reducing the load concentrated on the controller.

In addition, by placing functions such as transmission path update and request, in the form of an application, on the data plane, which previously handled only packet forwarding in a software defined network, the method has the advantage of limiting the increase in transmission delay of user data caused by the load concentrated on the controller.

FIG. 1 shows the basic structure of an existing software defined network related to the present invention.
FIG. 2 is a conceptual diagram of an apparatus for setting a transmission path in a software defined network according to the present invention.
FIG. 3 shows the basic operating procedure of reinforcement learning in connection with the method of setting a transmission path in a software defined network according to the present invention.
FIG. 4 shows the components of a transmission path setting apparatus that sets a transmission path in a software defined network by utilizing a data plane application according to the present invention.
FIG. 5 shows the states that the Application (or application module) according to the present invention can obtain from the OpenVSwitch (OpenFlow switch).
FIG. 6 shows the reward table at each input port according to one embodiment of the present invention.
FIG. 7 shows the controller placing a flow table on the OpenVSwitch based on a Packet_IN message, according to one embodiment of the present invention.
FIG. 8 shows a configured transmission path according to one embodiment of the present invention.
FIG. 9 is a flowchart of a transmission path setting method that sets a transmission path in a software defined network by utilizing a data plane application according to the present invention.

The features and effects of the present invention described above will become more apparent from the following detailed description taken in conjunction with the accompanying drawings, so that a person of ordinary skill in the art to which the invention pertains can readily practice its technical idea. The present invention may be modified in various ways and may take various forms; specific embodiments are illustrated in the drawings and described in detail in the text. This is not intended to limit the invention to the particular forms disclosed, and the invention should be understood to cover all modifications, equivalents, and alternatives falling within its spirit and technical scope. The terminology used herein is used only to describe particular embodiments and is not intended to limit the invention.

Like reference numerals are used for like elements throughout the description of the drawings.

Terms such as "first" and "second" may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another.

For example, without departing from the scope of the present invention, a first component may be referred to as a second component, and similarly a second component may be referred to as a first component. The term "and/or" includes any combination of a plurality of related listed items or any one of a plurality of related listed items.

Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs.

Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and should not be interpreted in an idealized or overly formal sense unless expressly so defined in the present application.

The suffixes "module", "block", and "part" used for components in the following description are assigned or used interchangeably only for ease of drafting the specification, and do not by themselves have distinct meanings or roles.

Hereinafter, preferred embodiments of the present invention are described with reference to the accompanying drawings so that a person of ordinary skill in the art can readily practice them. In describing the embodiments, detailed descriptions of related known functions or configurations are omitted where they would unnecessarily obscure the gist of the invention.

The following describes a transmission path setting apparatus and method that set a transmission path in a software defined network by utilizing a data plane application according to the present invention. Here, the transmission path setting apparatus may be a source/destination node or any node on a network that includes a plurality of nodes. Accordingly, besides a user terminal, the apparatus may be a network device or a router.

FIG. 2 is a conceptual diagram of an apparatus for setting a transmission path in a software defined network according to the present invention.

FIG. 3 shows the basic operating procedure of reinforcement learning in connection with the method of setting a transmission path in a software defined network according to the present invention.

In the method of setting a transmission path in a software defined network according to the present invention, the transmission path may be set by reinforcement learning (Reinforcement Learning). Transmission path setting by reinforcement learning is described below.

Reinforcement learning is one of the learning methods in the category of machine learning. Within the target environment, an Agent observes the current State and learns by selecting, among the actions it can perform, the action or sequence of actions that maximizes the Reward or minimizes the Penalty. Based on the Markov Decision Process (MDP) model, the States the Agent can obtain, the Actions the Agent performs, and the Reward/Penalty for each Action are defined, and learning proceeds incrementally through interaction with the Environment. FIG. 2 is a conceptual diagram of an apparatus for setting a transmission path in a software defined network according to the present invention, and FIG. 3 shows the basic operating procedure of reinforcement learning in connection with the method of setting a transmission path. Referring to FIGS. 2 and 3, the transmission path setting apparatus (1000) that carries out this basic reinforcement learning procedure includes an application module (100) and an OpenFlow switch (200). Here, the application module (100) may be referred to as the agent (Agent), and the OpenFlow switch (200) may be referred to as the (runtime) environment module ((Runtime) Environment module).

As shown in FIG. 3, the application module (100) receives information about the state (State, S_t) from the OpenFlow switch (200) and performs a specific action (Action, A_t) based on it. Based on this action, the OpenFlow switch (200) delivers a reward value (Reward, R_t) to the application module (100). Obtaining the optimal reward value by repeating this process of deriving a reward from an action is the ultimate goal of reinforcement learning.

The SARSA (State-Action-Reward-State-Action) algorithm is one such reinforcement learning algorithm: after an Action is performed in each State, the resulting Reward is received and the Q table is updated.

In this regard, Table 1 shows the basic concept of the SARSA algorithm according to the present invention.

That is, the reinforcement learning process that obtains a reward value from a specific action according to the present invention begins by initializing a reward function Q (or reward table) for all states (s) and actions (a). Then, for each episode, the state set (S) is initialized and a specific action (A) is selected from the state set (S) using the policy derived from Q. The action (A) is then performed to obtain the reward value (R) and the updated state set (S'). Next, an updated action (A') is selected from the updated state set (S') using the policy derived from Q. This process of performing the selected action and obtaining the reward value may be repeated until the state set reaches a terminal state. The terminal state may correspond to a case where enough of the necessary actions have been performed for the episode, or where the obtained reward value has converged or exceeds a specific value.
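The loop described above follows the textbook SARSA procedure, whose update is Q(S, A) ← Q(S, A) + α[R + γ·Q(S', A') − Q(S, A)]. The sketch below illustrates that standard form rather than the patent's own Table 1, which is not reproduced in this text. Several things are assumptions: S is treated as a single current state, the policy derived from Q is epsilon-greedy, and the names `sarsa`, `epsilon_greedy`, and `step` as well as the parameter values are hypothetical.

```python
import random
from collections import defaultdict
from typing import Callable, DefaultDict, Hashable, List, Tuple

State = Hashable
Action = Hashable
# step(state, action) -> (reward, next_state, done); supplied by the caller
StepFn = Callable[[State, Action], Tuple[float, State, bool]]
QTable = DefaultDict[Tuple[State, Action], float]


def epsilon_greedy(q: QTable, state: State, actions: List[Action],
                   epsilon: float, rng: random.Random) -> Action:
    """Policy derived from Q: explore with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def sarsa(start_state: State, actions: List[Action], step: StepFn,
          episodes: int = 200, alpha: float = 0.5, gamma: float = 0.9,
          epsilon: float = 0.1, max_steps: int = 100, seed: int = 0) -> QTable:
    rng = random.Random(seed)
    q: QTable = defaultdict(float)          # Q(s, a) initialised to 0
    for _ in range(episodes):
        s = start_state                     # initialise S
        a = epsilon_greedy(q, s, actions, epsilon, rng)   # choose A from S
        for _ in range(max_steps):
            r, s_next, done = step(s, a)    # take A, observe R and S'
            a_next = epsilon_greedy(q, s_next, actions, epsilon, rng)
            # Terminal states contribute no future value.
            target = r if done else r + gamma * q[(s_next, a_next)]
            q[(s, a)] += alpha * (target - q[(s, a)])
            s, a = s_next, a_next
            if done:
                break
    return q
```

In the setting of this document, a state could encode the 0/1 port states, an action could be the choice of output port, and `step` would forward traffic and return the measured reward; those mappings are left to the caller here.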

Meanwhile, FIG. 4 shows the components of a transmission path setting apparatus that sets a transmission path by utilizing a data plane application in a software defined network according to the present invention. That is, it shows the structure of the data plane Application for the path setting function proposed in the present invention. As shown in FIG. 2 and FIG. 4, the present invention sets a transmission path by defining as an Action the Flow Table for the transmission path that the controller 10 delivers to the nodes on the data plane, and by obtaining the Reward for the Action in each State in order to determine the Action for the next State.

Meanwhile, in an SDN environment, a monitoring function may be implemented on the OpenFlow switch to collect information on all nodes and set a transmission path accordingly. However, when the controller 10 has many tasks to perform or many parts to manage, overload becomes more likely as the number of nodes in the Data Plane grows, so the monitoring function may be distributed across the individual nodes of the Data Plane.

As shown in FIG. 4, the transmission path setting apparatus 1000 includes an application module 100 and an OpenFlow switch 200. The application module 100 and the OpenFlow switch 200 may be referred to as the DP Application and the Environment, respectively. A runtime environment (Runtime Environment) is one example of the Environment.

In addition, the application module 100 includes a state monitor (S-Monitor) module 110, a reward calculation (R-Calculator) module 120, and an action conductor (Action conductor) module 130. These may be referred to as S Monitor, R Calculator, and Action conductor, respectively.

In addition, the OpenFlow switch 200 includes a port status providing module 210 and a flow table module 220, which may be referred to as Physical port status (providing module) and Flow Table (module), respectively. The flow table module 220 may also be referred to as an OpenFlow switch that contains a flow table. A network interface card (NIC: Network Interface Card) is one example of the port status providing module 210.

Meanwhile, as shown on the left side of FIG. 4, the application module 100 and the OpenFlow switch 200 may be referred to as Application and OVS (Open VSwitch), respectively. The State of the OpenVSwitch is determined by port monitoring through the SR Monitor module. For each port other than the port connected to the Host, the SR Monitor module reports 0 (not in use) if the Tx bytes and Rx bytes at time t+1 are the same as at time t, and 1 (in use) if they have increased. The set of 0 or 1 values obtained for the Output Ports is defined as the State, which is monitored on request and delivered to the action conductor (Action conductor) module 130. The reward value (Reward) is calculated as the reciprocal of the transmission delay (Transmission Delay) measured at the corresponding port when the action conductor module 130 performs the Action. Alternatively, the reward value may be calculated as the product of that reciprocal and a weight that depends on the maximum bandwidth. This is discussed in detail below.

The above is described in detail below from the standpoint of the application module 100 and the OpenFlow switch 200 of FIG. 4.

First, with reference to FIG. 2 and FIG. 4, the overall process of setting the transmission path over which a packet is sent proceeds in the following order. This order is not limiting and may be changed within the scope of the present invention.

1) When a packet enters the OpenFlow switch 200 from a host, the OpenFlow switch 200 requests from the SDN controller 10 a multipath flow table for transmitting the packet.

2) On receiving the request message (Packet_IN Message), the SDN controller 10 selects the shortest multipaths over which the packet can be transmitted. The shortest multipaths are selected for the multipath flow table simply by using the Hop Count. The SDN controller 10 then delivers the shortest multipaths to each OpenFlow switch 200.

3) Each OpenFlow switch 200 that receives the flow table determines the current State of the switch using the transmission direction ports (Output ports) in each flow table. The current State is determined from the status monitored by the state monitor (S-Monitor) module 110.

4) In addition, a reward value table (Reward Table) is generated from the transmission direction port information for the same input port in the flow table. The reward value table is built from the reward values calculated by the reward calculation (R-Calculator) module 120. One example is the reward value table shown in FIG. 6, which is described in detail below.

5) If one input port has multiple transmission direction ports, the port with the largest Reward value among them is selected and reflected in the flow table. Selecting the port with the largest Reward value means raising the priority of the corresponding flow entry so that its transmission direction port is used. This update is performed by the action conductor module 130.

6) If only one transmission direction port exists for the input port, the action conductor module 130 likewise raises the priority of that flow entry and performs the update.
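Steps 5) and 6) can be sketched as follows. This is an illustrative sketch only: the dictionary layout of a flow entry, the `boost` increment, and the sample reward values are hypothetical and are not taken from the specification.

```python
def select_output_port(rewards):
    """Return the output port with the largest reward. `rewards`: port -> reward."""
    return max(rewards, key=rewards.get)


def raise_priority(entries, in_port, chosen_out, boost=10):
    """Lift the chosen (in_port, out_port) entry above its sibling entries."""
    siblings = [e for e in entries if e["in_port"] == in_port]
    top = max(e["priority"] for e in siblings)
    for e in siblings:
        if e["out_port"] == chosen_out:
            e["priority"] = top + boost
    return entries


entries = [
    {"in_port": 1, "out_port": 2, "priority": 100},
    {"in_port": 1, "out_port": 3, "priority": 100},
    {"in_port": 1, "out_port": 4, "priority": 100},
    {"in_port": 2, "out_port": 1, "priority": 100},
]
rewards_in1 = {2: 0.8, 3: 2.5, 4: 1.1}    # hypothetical reward values
chosen = select_output_port(rewards_in1)
raise_priority(entries, in_port=1, chosen_out=chosen)
```

When an input port has a single output port, the same `raise_priority` call applies with that port as `chosen_out`, which corresponds to step 6).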

Meanwhile, the OpenFlow switch 200 that first receives a flow table, which is a collection of multiple flow entries, performs the above process through the application. It thus verifies the efficiency of each port based on the reward value (Reward Value) in the current state and selects the final transmission path.

The process of setting the transmission path described above is now examined in more detail in terms of the individual components.

The application module 100 is configured to deliver in advance the flow table (Flow Table) between the source and destination (Src-Dst) present in the data plane. When one input port (In_port) has multiple transmission direction ports (Output ports), the operations described above can be performed. The flow table is not supplied on request each time a packet must be transmitted. Instead, the action is performed (and/or learned) using a flow table received before packet transmission, and the packet is transmitted accordingly. Accordingly, the OpenFlow switch 200 is configured to set a flow table concerning the priority of each transmission path between the source and destination (Src-Dst).

Meanwhile, the application module 100 is configured to perform, for each State, the Action with the largest reward value (Reward) based on the flow table. The application module 100 is also configured to transmit the packet by performing the action with the largest reward value, and to update the stored reward value with the reward value corresponding to that action. The packet may of course be transmitted from any node in the network, not only from an edge (Edge) node. The packet may be delivered to the destination through the flow tables held by the intermediate nodes on the path between the source node and the destination node.

The state monitor module 110 is configured to monitor a state associated with whether the number of transmitted/received bytes has increased on the ports other than the input port (In_port). Specifically, the state monitor module 110 can monitor the Ethernet ports of the hardware on which OpenVSwitch is installed. This monitoring need not run continuously and may be configured to occur only when a Packet_IN Message is received from the OpenFlow switch 200. That is, it runs at the request of the action conductor module 130 when the corresponding information is delivered to each OpenVSwitch, and enters a standby state as soon as the transmission port is determined. This prevents resources from being wasted on monitoring at each node and reduces the overall load on the Data Plane.

In this regard, when a packet enters the OpenFlow switch 200 from a host, the controller 10 receives from the OpenFlow switch 200 a request message (Packet_IN Message) for the flow table needed to transmit the packet. On receiving the request message, the controller 10 can select the shortest multipaths over which the packet can be transmitted. The shortest multipaths are selected for the multipath flow table simply by using the Hop Count. The controller 10 then delivers the shortest multipaths to each OpenFlow switch 200.

The OpenFlow switch 200 can determine the current State of the switch using the transmission direction ports in each flow table. The application module 100 can generate, through the reward calculation module (R-Calculator, 120), a reward value table (Reward Table) from the transmission direction port information for the same input port in the flow table. For example, the reward calculation module 120 may be configured to calculate a rate for each of the transmitted and received bytes and to monitor a state associated with whether that rate has increased. The reward calculation module 120 can then generate the reward value table based on the transmission direction port information and the rate information.

In this regard, the State of the OpenVSwitch corresponding to the flow table module 220 is determined by port monitoring through the state monitor module 110. Using (1) and (2) of Equation 1, the state monitor module 110 monitors the state related to the Tx bytes and Rx bytes at times t and t+1 of the ports other than the input port (Input Port), and delivers it to the action conductor module 130.

Here, ΔRx(t) corresponds to the amount of packets received between times t-1 and t, and ΔTx(t) corresponds to the amount of packets transmitted between times t-1 and t. Rx(t) and Rx(t-1) correspond to the packet reception amounts at times t and t-1, and Tx(t) and Tx(t-1) correspond to the packet transmission amounts at times t and t-1.

Specifically, a port is represented as 0 (not in use) if the Tx bytes and Rx bytes at times t and t+1 are the same, and as 1 (in use) if they have increased. The number of transmitted/received bytes is the cumulative number of bytes transmitted/received up to time t or t+1, so an unchanged count means that no new data was transmitted or received between t and t+1. Conversely, if the number of transmitted/received bytes at time t+1 is greater than at time t, the state monitor module 110 defines the state as 1, since an increased cumulative count means that new data was transmitted or received between t and t+1. The set of 0 or 1 values obtained for the output ports (Output Ports) is defined as the State, which is monitored on request and delivered to the action conductor module 130.

The state may be defined in various ways, according to the presence or absence of newly transmitted/received data or according to other criteria. The method of defining the state is not limited to the above and may be freely modified.
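The 0/1 state definition can be illustrated with a short Python sketch. It assumes that a port counts as in use when either its Tx or its Rx cumulative counter has grown, which is one reading of the description; the function names and the counter values are hypothetical.

```python
def port_bit(prev_tx, prev_rx, cur_tx, cur_rx):
    """1 (in use) if either cumulative byte counter grew between t and t+1, else 0."""
    return 1 if (cur_tx > prev_tx or cur_rx > prev_rx) else 0


def switch_state(prev, cur, in_port):
    """Tuple of 0/1 bits for every port except the input port, ordered by port number.

    `prev` and `cur` map port number -> (tx_bytes, rx_bytes) at times t and t+1.
    """
    return tuple(
        port_bit(*prev[p], *cur[p])
        for p in sorted(cur)
        if p != in_port
    )


# Hypothetical counters for a four-port switch; the packet arrived on port 1.
prev = {1: (100, 100), 2: (500, 300), 3: (0, 0), 4: (10, 20)}
cur = {1: (150, 100), 2: (500, 300), 3: (64, 0), 4: (10, 20)}
state = switch_state(prev, cur, in_port=1)   # only port 3 carried new traffic
```

With three candidate output ports there are 2^3 = 8 possible bit tuples under this encoding.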

In addition, the action conductor module (Action conductor, 130) is configured to receive the state associated with whether the number of transmitted/received bytes has increased and to perform, for each State, the Action with the largest reward value (Reward). The action conductor module 130 may also be configured to receive the state associated with both whether the number of transmitted/received bytes has increased and whether each rate has increased, and to perform the Action with the largest reward value for each State. Whether the number of transmitted/received bytes has increased is monitored by the state monitor module 110, and whether each rate has increased is monitored by the reward calculation module 120. The associated state is then delivered to the action conductor module 130.

Here, the reward value (Reward) may be calculated as the reciprocal of the transmission delay (Transmission Delay) measured at the corresponding port when the action conductor module (Action conductor, 130) performs the action.

In this regard, FIG. 5 shows the states that the Application (application module 100) according to the present invention can obtain from the OpenVSwitch (OpenFlow switch 200). For example, FIG. 5 illustrates eight such states, s(1) to s(8).

Meanwhile, the action conductor module 130 generates a Reward table for each State and Action from the State, Action, and Reward defined above. With the SARSA algorithm described with reference to FIG. 3, the Q table is learned by the end of an episode and then used as the policy (Policy). In the scheme proposed in FIG. 4 and FIG. 5, however, the State is defined by whether each port is in use, so the Reward from an Action can differ even within the same State. For this reason, an algorithm can be applied that allows an efficient Action to be selected through continuous updates of the Q table.

Specifically, this is an algorithm for selecting the transmission port in OpenVSwitch. It differs from existing reinforcement learning algorithms in that it does not use an ε-greedy learning policy. Because the State in the proposed scheme is defined by whether each port is in use, the Reward from an Action can differ for each Packet_IN event even within the same State. The algorithm therefore detects the event of an incoming packet and selects the transmission port from the State and Action at that moment.

In this regard, Table 2 shows the pseudo code of the algorithm applied according to the scheme proposed in FIG. 4 and FIG. 5.

That is, when a packet arrival is detected, the action conductor module 130 arbitrarily selects a first action (a) from the state (s) in the reward value table Q(s, a), performs the first action (a), and observes the reward value (r). The action conductor module 130 then selects a second action (a') from the state (s) and updates the reward value table Q(s, a) to Q(s, a'). The action conductor module 130 repeats the reward observation and the table update until the reward table associated with the state (s) is complete. From these repeated observations, the action with the largest reward value is selected and the path decision (Path decision) is made. In other words, the action conductor module 130 may be configured to perform, for each State, the Action with the largest reward value (Reward), based on whether the ports are in use and on the received flow table. Because the state is defined by whether the ports are in use, the reward value obtained from an action in that state may vary with port usage. If only one transmission direction port exists for the input port, the update can be performed by raising the priority of the corresponding flow entry.
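The per-event procedure described above can be sketched as follows. This is not the pseudo code of Table 2, which is not reproduced in this text. The `observe_reward` callback, the nested-dictionary table layout, and the sample delays are hypothetical, and the sketch visits the actions in list order rather than choosing the first one arbitrarily.

```python
def decide_path(state, actions, observe_reward, reward_table=None):
    """On a packet-in event, complete the reward row for `state` and pick the best action.

    `observe_reward(state, action)` is a hypothetical callback that performs the
    action (tries the output port) and returns the reward measured for it.
    """
    table = {} if reward_table is None else reward_table
    row = table.setdefault(state, {})
    for action in actions:
        # Overwrite any older value: the same state can yield a different
        # reward on each packet-in event.
        row[action] = observe_reward(state, action)
    best = max(actions, key=lambda a: row[a])
    return best, table


# Hypothetical measured delays (seconds) per output port; reward = 1 / delay.
delays = {2: 0.004, 3: 0.001, 4: 0.002}
best, table = decide_path((0, 1, 0), [2, 3, 4], lambda s, a: 1.0 / delays[a])

# With a single candidate port, that port is the decision.
only, _ = decide_path((0,), [1], lambda s, a: 5.0)
```

Passing the returned `table` back in on the next event keeps earlier rows while refreshing the row for the current state.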

A concrete scenario in which a packet is transmitted through the Application proposed in the present invention is as follows. The transmission paths computed in response to the Packet_IN message are delivered to each OpenVSwitch. A flow entry (Flow Entry) pairs an In port with an Output port, so a total of two flow entries are received for one transmission link. To find an efficient transmission port, a reward value table (Reward table) is created for each In port based on the In port information in the match (Match) fields. The OpenFlow switch 200 computes the source-destination transmission paths with a hop-count-based algorithm, assigns a different priority to each transmission path, and places them in the OpenVSwitch. Dijkstra's algorithm may be used as the hop-count-based algorithm for the source-destination transmission paths. Dijkstra's algorithm takes as input a directed weighted graph (weighted graph) G and a starting vertex s. Let V be the set of all vertices of G, and represent each edge of the graph as an ordered pair (u, v) of a start vertex u and an end vertex v. Let E be the set of all edges of G, and let the edge weights be given by a function w: E → [0, ∞]. The weight w(u, v) is the cost (time, distance, and so on) of moving from vertex u to vertex v. The cost of a path is the sum of the weights of all edges on the path. Given any pair of vertices s and t in V, Dijkstra's algorithm can be used to find the lowest-cost path (shortest path) from s to t.
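As an illustration of the hop-count case, the following is a standard heap-based Dijkstra sketch in which every link has weight 1. The switch names and the topology are hypothetical, and the sketch returns a single shortest path, whereas the text describes selecting several hop-count-based paths.

```python
import heapq


def dijkstra(graph, source):
    """graph: node -> {neighbor: non-negative weight}. Returns (dist, prev)."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                       # stale heap entry, already improved
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def path_to(prev, source, target):
    """Rebuild the path by walking predecessors; raises KeyError if unreachable."""
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical topology; weight 1 per link makes the cost equal to the hop count.
topology = {
    "s1": {"s2": 1, "s3": 1},
    "s2": {"s1": 1, "s4": 1},
    "s3": {"s1": 1, "s5": 1},
    "s4": {"s2": 1, "s5": 1},
    "s5": {"s3": 1, "s4": 1},
}
dist, prev = dijkstra(topology, "s1")
route = path_to(prev, "s1", "s4")
```

Here the two-hop route through s2 is chosen over the three-hop route through s3 and s5.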

Meanwhile, Table 3 below shows an example of the table of Flow Entries between a source and destination that is placed in one OpenVSwitch.

That is, the flow table may include a flow entry (Flow Entry), a match field (Match Field), and an action (Action) field. The flow entry distinguishes the individual flows from one another. For each flow entry, the match field contains the priority of the transmission path, the input port (in_port), the MAC addresses of the source and destination (src_mac, dst_mac), and their IP addresses (src_ip, dst_ip). For each flow entry, the action field contains the output port (output port) corresponding to the input port (in_port).
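The fields listed above can be modelled with a small Python data structure. This is an illustrative sketch only: the class name, the exact-match lookup, and the sample addresses are assumptions, and it does not reproduce Table 3 or the OpenFlow wire format.

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    priority: int      # match field: priority of the transmission path
    in_port: int       # match field: input port
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str
    out_port: int      # action field: output port for this input port


def lookup(table, in_port, src_mac, dst_mac, src_ip, dst_ip):
    """Return the highest-priority entry matching the packet exactly, or None."""
    key = (in_port, src_mac, dst_mac, src_ip, dst_ip)
    hits = [
        e for e in table
        if (e.in_port, e.src_mac, e.dst_mac, e.src_ip, e.dst_ip) == key
    ]
    return max(hits, key=lambda e: e.priority, default=None)


# Hypothetical entries: same match, two candidate output ports.
A, B = "00:00:00:00:00:01", "00:00:00:00:00:02"
table = [
    FlowEntry(100, 1, A, B, "10.0.0.1", "10.0.0.2", out_port=2),
    FlowEntry(110, 1, A, B, "10.0.0.1", "10.0.0.2", out_port=3),
]
hit = lookup(table, 1, A, B, "10.0.0.1", "10.0.0.2")
miss = lookup(table, 4, A, B, "10.0.0.1", "10.0.0.2")
```

Raising an entry's priority is what makes its output port the one used for matching packets.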

Meanwhile, the action conductor module 130 may be configured to extract the input port (In port) information of each flow entry in the flow table, to receive from the reward calculation module 120 a reward value based on the available bandwidth of at least one output port (output port) corresponding to the input port, and to determine the final action.

Specifically, the action conductor module 130 extracts the In port information of each flow entry in the deployed flow table. This is done in order to build the Reward Table from the Action Rule information for each In port. Referring to Table 3, Tables are created for in_port=1, in_port=2, in_port=3, and in_port=4. The Table for in_port=1 has the Action Rules output:2, output:3, and output:4, so in the current State a Reward based on the available bandwidth of each Output port for in_port=1 is received from the reward calculation module (R Calculator) 120 and the final Action is determined. The Tables for in_port=2, in_port=3, and in_port=4 each have only one Action, output:1, so that Flow Entry is selected. After the Action is determined from the Reward values in each Reward Table, the application module 100 can update the Flow Entry that has that Action by raising its priority. A Timeout may also be applied to each Entry so that an Entry that is not used is deleted once the Timeout has elapsed. The Timeout may be set to 5 seconds.
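The grouping by In port and the Timeout can be sketched as follows. The (in_port, out_port) pairs mirror the ports named in the paragraph above; the function names, the timestamp bookkeeping, and the sample timestamps are hypothetical.

```python
from collections import defaultdict

IDLE_TIMEOUT_S = 5.0   # the description gives 5 seconds as one possible Timeout


def reward_table_skeleton(flow_entries):
    """Group candidate output ports by input port. Entries are (in_port, out_port)."""
    tables = defaultdict(list)
    for in_port, out_port in flow_entries:
        tables[in_port].append(out_port)
    return dict(tables)


def expire_unused(last_used, now, timeout=IDLE_TIMEOUT_S):
    """Keep only entries used within `timeout` seconds. `last_used`: entry -> time."""
    return {e: t for e, t in last_used.items() if now - t <= timeout}


flow_entries = [(1, 2), (1, 3), (1, 4), (2, 1), (3, 1), (4, 1)]
tables = reward_table_skeleton(flow_entries)

# Hypothetical timestamps: only the entry for in_port=2 was used recently.
alive = expire_unused({(2, 1): 10.0, (3, 1): 3.0, (4, 1): 2.0}, now=10.0)
```

Only the table for in_port=1 holds more than one candidate, so it is the only one that needs a reward comparison.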

Meanwhile, FIG. 6 shows the reward value table at each input port according to an embodiment of the present invention. As shown in FIG. 6 and described above, each input port has at least one output port. For example, the action rules for in_port=1 may be defined separately for multiple output ports (output:2, output:3, and output:4). The flow entry (Flow Entry) is finally determined based on the reward value (Reward) for each Action. Once each OpenVSwitch in the Data Plane has updated its final flow entry, packets are transmitted over the path formed by those Entries.

A Flow Entry determined through the Reward Table for in_port=2, in_port=3, or in_port=4 is used or deleted depending on whether a packet arrives from the neighboring node. For example, if a packet arrives from the neighboring node connected to port 2, the flow entries whose Match Field has in_port=3 or in_port=4 are deleted by the Timeout setting, and only the flow entry whose Match field has in_port=2 is used to deliver the packet to the destination. In this scheme, for the hop-count-based multipaths from source to destination selected by the OpenFlow switch, the Application on each OpenVSwitch determines by itself the most efficient transmission path in the current State. The main motivation for this scheme is to explore a more flexible way of setting transmission paths by making active use of Data Plane resources in an SDN environment.

For the compensation value (Reward) described above, the output ports are extracted based on the input port information of each flow entry. When the compensation value for each output port is calculated, the available bandwidth of that port is monitored and computed, and the result is passed to the action conductor module (130). Through (1) and (2) of Equation 1 described above, it is possible to check whether an Ethernet port is transmitting or receiving packets, that is, how much the port is being used. At the request of the action conductor module (130), the SMonitor module monitors the transmitted and received bytes. If the transmitted and received bytes have increased, the port is judged to be in use, and this is reflected in the state determination. In addition, when a rate has increased, it is used in the available bandwidth and transmission delay calculations from which the compensation value is computed.

In this regard, (3) to (5) of Equation 2 are used to calculate the compensation value (Reward) obtained when the application performs an action for each reward table in a given state. The maximum bandwidth of each port is used to obtain the currently available bandwidth, and the transmission delay value is calculated from it. The rate values in (3) and (4) of Equation 2 are multiplied by 8 because bandwidth is expressed in bits per second, so the rate values obtained in bytes must be converted to bits.

Here, the quantities appearing in Equation 2 are, in order: the available bandwidth of the port and the maximum bandwidth of the port at time t; the transmission delay value of the port and the size of the incoming packet at time t; and the compensation value of the port at time t.

Equations (3) and (4) are used to calculate the compensation value (Reward) obtained when the application performs an action for each reward table in a given state. The maximum bandwidth of each port is used to obtain the currently available bandwidth, and the transmission delay value is calculated from it. The rate values in Equations (3) and (4) are multiplied by 8 because bandwidth is expressed in bits per second, so the rate values obtained in bytes must be converted to bits.

The compensation value (Reward) is defined as the reciprocal of the transmission delay value, and it is obtained by actually transmitting packets, whether the available bandwidths of the ports are equal or different. The weight is determined by the value of the maximum bandwidth; it is multiplied in to keep the compensation value from growing too large and to allow the ports to be compared on an appropriate scale. Based on the transmission delay values obtained through this process, the application performs each action and, among the ports that can serve as output, selects the port with the largest compensation value to transmit the packet.
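The calculation described in words above can be sketched as follows. Equation 2 itself is not reproduced in this text, so the exact form used here is an assumption: available bandwidth as maximum bandwidth minus eight times the byte rate, transmission delay as the packet size in bits divided by the available bandwidth, and the compensation value as the weight times the reciprocal of the delay. The function names and all numeric values, including the weight, are hypothetical.

```python
import math


def available_bandwidth_bps(max_bw_bps, rate_bytes_per_s):
    """Assumed form: maximum bandwidth minus the used bandwidth (bytes/s * 8)."""
    return max(max_bw_bps - 8 * rate_bytes_per_s, 0.0)


def transmission_delay_s(packet_size_bytes, avail_bw_bps):
    """Assumed form: packet size in bits divided by the available bandwidth."""
    if avail_bw_bps <= 0:
        return math.inf  # no capacity left on this port
    return 8 * packet_size_bytes / avail_bw_bps


def reward(delay_s, weight):
    """Weight times the reciprocal of the transmission delay."""
    if delay_s <= 0:
        raise ValueError("delay_s must be positive")
    return 0.0 if math.isinf(delay_s) else weight / delay_s


# Hypothetical port: 100 Mbit/s link carrying 2.5 MB/s, 1500-byte packet.
avail = available_bandwidth_bps(100e6, 2.5e6)
delay = transmission_delay_s(1500, avail)
value = reward(delay, weight=1e-3)
print(avail, delay, value)
```

Under these assumptions a port with more spare bandwidth yields a shorter delay and therefore a larger compensation value, which is the ordering the application relies on when it picks an output port.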

In this regard, Equation 2 shows that the compensation value (Reward) is the reciprocal of the transmission delay value measured at the corresponding port when the action conductor module (130) performs an action. Meanwhile, the application module (100) may be configured to transmit a packet by performing the action with the largest compensation value and to update the compensation value corresponding to that action. That is, by selecting the path that minimizes the transmission delay measured at the corresponding port as packets are transmitted, the compensation value can be maximized.

FIG. 7 shows the flow tables that the controller places on the OpenVSwitches in response to a Packet_IN message, according to an embodiment of the present invention. A total of five paths can be generated as the multiple paths selected by the controller: path A through SW1-SW2-SW3-SW4, path B through SW1-SW4-SW5-SW6, path C through SW1-SW5-SW6, path D through SW1-SW2-SW6, and path E through SW1-SW2-SW5-SW6. The flow tables for these paths are delivered to the OpenVSwitches along each path. Each entry is placed after the controller has selected the multiple paths from source to destination, so that SW1, SW2, SW5, and SW6 each hold several action rules for a single input port. These are the actions that an OpenVSwitch can perform in its current state to forward an incoming packet; the compensation value (Reward) is obtained by transmitting the packet through the output port of each entry.

Each flow entry has a different priority. This allows several action fields to coexist for the same match field. Once the final action is determined, its entry is given a higher priority than the other entries, so that it is the one used when a packet arrives at the input port named in the match field.
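The role of the priority can be sketched as follows. This is a simplified model rather than an OpenFlow implementation: the list-of-dicts flow table, the functions `promote` and `lookup`, and the priority numbers are assumptions made for the example.

```python
# Hypothetical flow table: three entries share the match in_port=1 and differ
# only in priority and action.
flow_table = [
    {"priority": 10, "in_port": 1, "action": "output:2"},
    {"priority": 20, "in_port": 1, "action": "output:3"},
    {"priority": 30, "in_port": 1, "action": "output:4"},
]


def promote(table, in_port, action):
    """Give the chosen entry a higher priority than every entry with the same match."""
    top = max(e["priority"] for e in table if e["in_port"] == in_port)
    for entry in table:
        if entry["in_port"] == in_port and entry["action"] == action:
            entry["priority"] = top + 1


def lookup(table, in_port):
    """Return the highest-priority entry matching the input port, or None."""
    matches = [e for e in table if e["in_port"] == in_port]
    return max(matches, key=lambda e: e["priority"]) if matches else None


promote(flow_table, in_port=1, action="output:2")  # final action decided
chosen = lookup(flow_table, in_port=1)
print(chosen)
```

After promotion, a packet arriving on in_port=1 is handled by the output:2 entry even though the other entries with the same match are still present.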

Meanwhile, FIG. 8 shows a configured transmission path according to an embodiment of the present invention. As shown in FIG. 8, among the actions that SW1 and SW2 can perform, the flow entry with the largest compensation value (Reward) has been selected. Flow entries that are not used at an OpenVSwitch are deleted by the timeout setting, so that only the flow entries currently used for packet transmission remain. After the path has been selected, when another host transmits packets, the path is selected by going through the procedure described above: checking the state and obtaining the compensation value for each action that can be performed. In the present invention, the most efficient packet transmission path is set without additional intervention from the OpenFlow switch: the OpenVSwitch in the data plane checks the current state and efficiency of each port and selects the transmission link. Because the transmission link is selected without additional intervention from the controller, the scheme can be called a Self Routing Organization technique.

The above has described a transmission path setting apparatus that sets a transmission path by utilizing a data plane application in a software defined network according to the present invention. The following describes a transmission path setting method that sets a transmission path by utilizing a data plane application in a software defined network according to another aspect of the present invention. The description of the transmission path setting apparatus given above also applies to the transmission path setting method.

In this regard, FIG. 9 is a flowchart of a transmission path setting method that sets a transmission path by utilizing a data plane application in a software defined network according to the present invention. The method is performed by the transmission path setting apparatus and, in particular, may be performed by at least one of the application module (100) and the OpenFlow switch (200).

As shown in FIG. 9, the transmission path setting method includes a state monitoring step (S100), a rate monitoring step (S150), and an action performing step (S200). The action performing step (S200) includes a first action selection step (S210), a compensation value observation step (S220), and a second action selection/compensation value table update step (S230). The action performing step (S200) may further include a step of determining whether the compensation value table is complete (S240) and a path decision step (S250).

In the state monitoring step (S100), the state associated with whether the number of transmitted/received bytes has increased is monitored for the ports other than the input port (In_port).

In the rate monitoring step (S150), a rate is calculated for each of the transmitted/received byte counts, and the state associated with whether the rate has increased is monitored.
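The rate monitoring step can be sketched as below. Equation 1 is not reproduced in this text, so defining the rate as the change in a cumulative byte counter divided by the sampling interval is an assumption, as are the function names and the sample values.

```python
def byte_rate(prev_bytes, curr_bytes, interval_s):
    """Assumed rate definition: bytes per second over one sampling interval."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return (curr_bytes - prev_bytes) / interval_s


def rate_increased(prev_rate, curr_rate):
    """True when the rate has grown since the previous sample."""
    return curr_rate > prev_rate


# Hypothetical cumulative transmit counters sampled every 2 seconds.
tx_samples = [1_000, 3_000, 9_000]
r1 = byte_rate(tx_samples[0], tx_samples[1], 2.0)
r2 = byte_rate(tx_samples[1], tx_samples[2], 2.0)
print(r1, r2, rate_increased(r1, r2))
```

The same helper would be applied separately to the receive counter of each port.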

Meanwhile, in the state monitoring step (S100), if the number of transmitted/received bytes at time t+1 is equal to the number at time t, the state is defined as 0. If the number of transmitted/received bytes at time t+1 is greater than the number at time t, the state is defined as 1. The states of the ports are monitored in this way.
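The state definition above maps directly onto a small helper. In this sketch a single cumulative byte counter per port stands in for the transmitted/received bytes, which is a simplifying assumption; the function names and counter values are hypothetical.

```python
def port_state(bytes_t, bytes_t1):
    """Return 0 if the counter is unchanged between t and t+1, 1 if it increased."""
    if bytes_t1 == bytes_t:
        return 0
    if bytes_t1 > bytes_t:
        return 1
    # A decreasing counter is not covered by the description.
    raise ValueError("byte counter decreased")


def monitor_states(counters_t, counters_t1, in_port):
    """Compute the state of every port except the input port."""
    return {
        port: port_state(counters_t[port], counters_t1[port])
        for port in counters_t
        if port != in_port
    }


# Hypothetical counters for ports 1 to 4 at times t and t+1.
counters_t = {1: 500, 2: 100, 3: 100, 4: 0}
counters_t1 = {1: 900, 2: 100, 3: 250, 4: 0}
states = monitor_states(counters_t, counters_t1, in_port=1)
print(states)
```

Here only port 3 carried traffic between the two samples, so it is the only port in state 1; the input port itself is excluded from monitoring.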

In the action performing step (S200), the state associated with whether the number of transmitted/received bytes has increased is received, and the action with the largest compensation value (Reward) is performed for each state.

Specifically, in the first action selection step (S210), when an incoming packet is detected, a first action (a) is arbitrarily selected from the state (s) in the compensation value table Q(s, a) for the compensation value.

In the compensation value observation step (S220), the first action (a) is performed and the compensation value (r) is observed. Then, in the second action selection/compensation value table update step (S230), a second action (a') is selected from the state (s), and the compensation value table Q(s, a) is updated to Q(s, a').

Next, in the step of determining whether the compensation value table is complete (S240), it is determined whether the reward table associated with the state (s) within the compensation value table is complete. The compensation value observation step (S220) and the second action selection/compensation value table update step (S230) are repeated until the compensation value table is complete.

When the reward table associated with the state (s) is determined to be complete, a path decision is made in the path decision step (S250) by selecting the action with the largest compensation value among those observed through the repeated compensation value observations.
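Steps S210 to S250 can be read as the loop sketched below. This is one possible interpretation, not the specification's own procedure: no numeric update rule is given for Q(s, a), so the table here simply records the observed compensation value for each action, and the table is treated as complete once every action has been tried. The function `learn_reward_table`, the `perform` callback, and the reward values are hypothetical.

```python
import random


def learn_reward_table(state, actions, perform, rng=None):
    """Try every action once in the given state, then pick the best one.

    perform(action) is a hypothetical callback that transmits a packet through
    the action's output port and returns the observed compensation value.
    """
    rng = rng or random.Random()
    q = {}                                   # compensation value table Q(s, a)
    untried = list(actions)
    action = rng.choice(untried)             # S210: arbitrary first action
    while True:
        q[(state, action)] = perform(action)  # S220: perform and observe r
        untried.remove(action)
        if not untried:                      # S240: table complete for state s
            break
        action = rng.choice(untried)         # S230: select the next action a'
    best = max(actions, key=lambda a: q[(state, a)])  # S250: path decision
    return q, best


# Invented rewards per output action; the state is a tuple of port states.
rewards = {"output:2": 0.8, "output:3": 1.5, "output:4": 1.1}
q_table, best_action = learn_reward_table(
    state=(0, 1, 0),
    actions=list(rewards),
    perform=rewards.get,
    rng=random.Random(0),
)
print(best_action)
```

The order in which the actions are explored depends on the random generator, but the path decision does not: once the table is complete, the action with the largest observed compensation value is selected.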

According to the present invention, in a network operated through a controller that must perform functions other than path setting, the proposed scheme can reduce the load on the controller. In addition, even in a large-scale network with many nodes, the load caused by flow table requests can be reduced.

Regarding the expected effects of the present invention, by assisting the transmission path setting function, the proposed scheme reduces the load on the controller and provides an environment in which the controller can smoothly perform its other functions.

Regarding the commercialization prospects of the present invention, if the proposed scheme is applied to a communication network that uses a software defined network, the reduced controller load is expected to give users a smoother network environment.

The transmission path setting method according to at least one embodiment of the present invention places functions such as transmission path update and request, in the form of an application, in the data plane, which previously handled only packet forwarding in a software defined network. This has the advantage of reducing the load concentrated on the controller.

In addition, because the transmission path setting method according to at least one embodiment of the present invention handles functions such as transmission path update and request through an application placed in the data plane, it has the advantage of limiting the increase in user data transmission delay caused by the load concentrated on the controller.

In a software implementation, the procedures and functions described in this specification, as well as each of the components, may be implemented as separate software modules. Each of the software modules may perform one or more of the functions and operations described in this specification. The software code may be implemented as a software application written in a suitable programming language. The software code may be stored in a memory and executed by a controller or a processor (or an application processor).

10: Controller
100: Application module
110: State monitor module
120: Compensation value calculation module
130: Action conductor module
200: OpenFlow switch
210: Port status provision module
220: Flow table module
1000: Transmission path setting apparatus

Claims

1. A transmission path setting apparatus for establishing a transmission path by utilizing a data plane application in a software defined network, the apparatus comprising:
a controller configured to set a flow table regarding a priority for each transmission path between a source and a destination (Src-Dst); and
an application module configured to, when a plurality of output ports exist for one input port (In_port), perform, for each state, the action having the largest compensation value (Reward) based on the flow table.

2. The transmission path setting apparatus according to claim 1,
wherein the controller:
receives, when a packet flows from a host into an OpenFlow switch, a request message (Packet_IN message) from the OpenFlow switch for the flow table needed to forward the packet; and
selects, upon receiving the request message, shortest-path-based multiple paths over which the packet is to be transmitted,
wherein the OpenFlow switch:
determines a current state of the switch by utilizing the transmission direction ports in each flow table, and
wherein the application module:
generates a reward table, through a compensation value calculation module (R-Calculator), based on the transmission direction port information for the same input port in the flow table.

3. The transmission path setting apparatus according to claim 1,
wherein the application module comprises:
a state monitor (S-Monitor) module configured to monitor a state associated with whether the number of transmitted/received bytes of the ports other than the input port (In_port) has increased;
a compensation value calculation module (R-Calculator) configured to calculate a rate for each of the transmitted/received byte counts and to monitor a state associated with whether the rate has increased; and
an action conductor module configured to receive the state associated with whether the number of transmitted/received bytes has increased and whether each of the rates has increased, and to perform, for each state, the action having the largest compensation value.

4. The transmission path setting apparatus according to claim 3,
wherein the state monitor module:
defines the state as 0 if the number of transmitted/received bytes at time t+1 is equal to the number of transmitted/received bytes at time t, and
defines the state as 1 if the number of transmitted/received bytes at time t+1 is greater than the number of transmitted/received bytes at time t, thereby monitoring the states of the ports.

5. The transmission path setting apparatus according to claim 3,
wherein the action conductor module:
when an incoming packet is detected, arbitrarily selects a first action (a) from the state (s) in the compensation value table Q(s, a) for the compensation value;
performs the first action (a) to observe the compensation value (r);
selects a second action (a') from the state (s) and updates the compensation value table Q(s, a) to Q(s, a');
repeats the compensation value observation and the compensation value table update until the reward table associated with the state (s) within the compensation value table is complete;
if there is only one transmission direction port for the input port, updates the flow entry with a higher priority; and
selects the action having the largest compensation value through the repeated compensation value observations and performs a path decision.

6. The transmission path setting apparatus according to claim 3,
wherein the compensation value (Reward)
is calculated as the product of the reciprocal of the transmission delay value measured at the corresponding port when the action conductor module performs an action and a weight that depends on the value of the maximum bandwidth, and
wherein the action conductor module:
extracts the input port information of each flow entry in the flow table, receives from the compensation value calculation module the compensation value based on the available bandwidth of at least one output port corresponding to the input port, and determines a final action based on the received compensation value.

7. The transmission path setting apparatus according to claim 1,
wherein the flow table comprises:
a flow entry configured to distinguish a plurality of flows from each other;
a match field for the flow entry, including a priority for each transmission path, an input port (in_port), and the MAC addresses and IP addresses of the source and the destination; and
an action field for the flow entry, including an output port corresponding to the input port (in_port).

8. A transmission path setting apparatus for establishing a transmission path by utilizing a data plane application in a software defined network, the apparatus comprising:
a state monitor module configured to monitor a state associated with whether the number of transmitted/received bytes of the ports other than the input port (In_port) has increased; and
an action conductor module configured to, when a plurality of output ports exist for one input port (In_port), receive the state associated with whether the number of transmitted/received bytes has increased and perform, for each state, the action having the largest compensation value (Reward).

9. The transmission path setting apparatus according to claim 8,
further comprising a compensation value calculation module (R-Calculator) configured to calculate a rate for each of the transmitted/received byte counts and to monitor a state associated with whether the rate has increased,
wherein the action conductor module is configured to receive the state associated with whether the number of transmitted/received bytes has increased and whether each of the rates has increased, and to perform, for each state, the action having the largest compensation value.

10. The transmission path setting apparatus according to claim 8,
wherein the state monitor module:
defines the state as 0 if the number of transmitted/received bytes at time t+1 is equal to the number of transmitted/received bytes at time t, and
defines the state as 1 if the number of transmitted/received bytes at time t+1 is greater than the number of transmitted/received bytes at time t, thereby monitoring the states of the ports.

11. The transmission path setting apparatus according to claim 10,
wherein the action conductor module:
performs, for each state, the action having the largest compensation value (Reward) according to whether the ports are in use and the flow table received from the controller,
wherein the state is defined according to whether the ports are in use, and the compensation value resulting from an action in the state depends on whether the ports are in use.

12. A transmission path setting method for establishing a transmission path by utilizing a data plane application in a software defined network, the method comprising:
a state monitoring step of monitoring a state associated with whether the number of transmitted/received bytes of the ports other than the input port (In_port) has increased;
a rate monitoring step of calculating a rate for each of the transmitted/received byte counts and monitoring a state associated with whether the rate has increased; and
an action performing step of receiving the state associated with whether the number of transmitted/received bytes has increased and whether each of the rates has increased, and performing, for each state, the action having the largest compensation value.

13. The transmission path setting method according to claim 12,
wherein the state monitoring step:
defines the state as 0 if the number of transmitted/received bytes at time t+1 is equal to the number of transmitted/received bytes at time t, and
defines the state as 1 if the number of transmitted/received bytes at time t+1 is greater than the number of transmitted/received bytes at time t, thereby monitoring the states of the ports.

14. The transmission path setting method according to claim 12,
wherein the action performing step comprises:
a first action selection step of arbitrarily selecting, when an incoming packet is detected, a first action (a) from the state (s) in the compensation value table Q(s, a) for the compensation value;
a compensation value observation step of performing the first action (a) to observe the compensation value (r); and
a second action selection/compensation value table update step of selecting a second action (a') from the state (s) and updating the compensation value table Q(s, a) to Q(s, a').

15. The transmission path setting method according to claim 14,
wherein the action performing step further comprises:
a step of determining whether the reward table associated with the state (s) within the compensation value table is complete; and
a path decision step of repeating the compensation value observation step and the second action selection/compensation value table update step until the compensation value table is complete, selecting the action having the largest compensation value through the repeated compensation value observations, and performing a path decision.

16. A computer-readable recording medium having recorded thereon a program for performing the transmission path setting method according to any one of claims 12 to 15.