KR101884636B1

KR101884636B1 - Method of distributed service function fail-over for highly available service function chaining and system of the same

Info

Publication number: KR101884636B1
Application number: KR1020160165201A
Authority: KR
Inventors: 백상헌; 서동은; 백호성
Original assignee: 고려대학교 산학협력단
Priority date: 2016-12-06
Filing date: 2016-12-06
Publication date: 2018-08-03
Also published as: KR20180065070A

Abstract

A failure recovery method is disclosed. The method is performed by a failure recovery system and includes: transmitting, by a transmitter included in the failure recovery system, a ping message at regular intervals to the service node that provides the next service function on a service function chain; determining, by a determination unit included in the failure recovery system, whether a response to the ping message has been received; and detecting, by a failure detection unit included in the failure recovery system, a failure of the next service function when no response to the ping message is returned within a predefined threshold time.

Description

The present invention relates to a method of distributed service function fail-over for highly available service function chaining, and to a system using the method.

More specifically, the present invention relates to a distributed service function failure recovery method for handling service function failures that occur on a service function chain.

Network function virtualization (NFV) implements, in software on commodity servers, the hardware appliances that provide network service functions such as firewalls, NAT (Network Address Translation), and IDS (Intrusion Detection System). Because it reduces cost and improves flexibility, it has attracted considerable attention from network operators. Service function chaining composes these service functions, according to policy, into a single logical ordered list of services (a service function chain) and then forwards and processes traffic entering the network through them in sequence. When a service function on the chain fails, traffic that must receive that service can experience high latency. Preventing this requires a fail-over method that responds quickly to service function failures.

U.S. Patent Application No. 14/572,335, "Methods, systems, and computer readable storage devices for managing faults in a virtual machine network"
U.S. Patent Application No. 14/572,716, "System, method, and computer program for preserving service continuity in a network function virtualization (NFV) based communication network"

An object of the present invention is to reduce service function fail-over time through a distributed fail-over method for the service functions on a service function chain.

A further object of the present invention is to place a Distributed Fail-over Agent (DFA) at each service function so that it performs failure detection and fail-over for the next service function on the chain. Failures are thereby overcome in a distributed manner, without the chain update process that can only be carried out through a centralized SFC controller, and service function fail-over time is reduced.

A failure recovery method according to an embodiment of the present invention is performed by a failure recovery system and includes: transmitting, by a transmitter included in the failure recovery system, a ping message at regular intervals to the service node that provides the next service function on a service function chain; determining, by a determination unit included in the failure recovery system, whether a response to the ping message has been received; and detecting, by a failure detection unit included in the failure recovery system, a failure of the next service function when no response to the ping message is returned within a predefined threshold time.

A failure recovery system according to an embodiment of the present invention includes: a storage unit that maintains a backup instance of the (i+1)-th service function; a transmitter that transmits a ping message to the (i+1)-th service node at regular intervals; a determination unit that determines whether a response to the ping message has been received; and a failure detection unit that detects a failure of the (i+1)-th service function when no response to the ping message is returned within a predefined threshold time.

According to an embodiment, service function fail-over time can be reduced through a distributed fail-over method for the service functions on a service function chain.

According to an embodiment, a Distributed Fail-over Agent (DFA) can be placed at each service function to perform failure detection and fail-over for the next service function on the chain.

According to an embodiment, failures can be overcome in a distributed manner, without the chain update process that can only be carried out through a centralized service function chaining controller, which reduces service function fail-over time.

A brief description of each drawing is provided so that the drawings cited in the detailed description of the present invention may be more fully understood.
FIG. 1 shows a Service Function Chaining (SFC) architecture composed of a control plane and a data plane.
FIG. 2 shows the service function fail-over process carried out through the SFC controller.
FIG. 3 shows the service function fail-over process carried out through a Distributed Fail-over Agent (DFA).
FIG. 4 illustrates a failure recovery system according to an embodiment.
FIG. 5 is a flowchart of a method of operating a failure recovery processor according to an embodiment, for distributed service function fail-over by a Distributed Fail-over Agent (DFA).
FIG. 6 is a graph of the change in average fail-over time as the range of the delay between the controller and the service functions is varied.

The specific structural and functional descriptions of the embodiments according to the inventive concept disclosed in this specification are given only to illustrate those embodiments. Embodiments according to the inventive concept may be implemented in various forms and are not limited to the embodiments described herein.

Because embodiments according to the inventive concept can be modified in various ways and can take various forms, the embodiments are illustrated in the drawings and described in detail in this specification. This is not intended to limit the embodiments to the particular forms disclosed; the embodiments include all modifications, equivalents, and alternatives that fall within the spirit and technical scope of the invention.

Terms such as "first" and "second" may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of rights according to the inventive concept, a first component may be called a second component, and similarly a second component may be called a first component.

When a component is described as being "connected" or "coupled" to another component, it may be directly connected or coupled to that other component, but intervening components may also be present. In contrast, when a component is described as being "directly connected" or "directly coupled" to another component, it should be understood that no intervening components are present. Other expressions that describe relationships between components, such as "between" versus "directly between" and "adjacent to" versus "directly adjacent to", should be interpreted in the same way.

The terminology used in this specification is used only to describe particular embodiments and is not intended to limit the invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this specification, terms such as "comprise" or "have" specify the presence of the stated features, numbers, steps, operations, components, parts, or combinations thereof, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the invention belongs. Terms defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the relevant art and, unless explicitly defined in this specification, are not to be interpreted in an idealized or overly formal sense.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings.

FIG. 1 shows a Service Function Chaining (SFC) architecture (100) composed of a control plane and a data plane.

The controller in the control plane maintains an SFC table. A table entry may consist of the chain's Service Path Index (SPI), a policy, and the ordered sequence of service functions. For chain C, for example, the SPI is defined as P. The path in the actual network that packets assigned to the chain must traverse is called the Service Function Path (SFP), and the SPI is the identifier of that path. The table entry also specifies a policy under which the chain applies only to traffic whose destination port number is 80.

Next, the path in the actual network, made up of an ordered sequence of Service Function (SF) instances (or service functions), is specified in the order SF 1 - SF 2 - SF 3 - SF 4.

In this specification, expressions of the form SF i, such as SF 1 and SF 2, denote the first service function, the second service function, and in general the i-th service function.

The information in parentheses denotes the Service Function Forwarder (SFF) to which the corresponding service function is connected. An SFF forwards the packets it receives to a service function (or service node) or to another SFF, according to the defined chain.

The controller installs service function instances (or service functions) on the Service Nodes (SNs) in the data plane. An SN is a physical or virtual element that provides CPU, memory, and storage, and that can host a service function instance (or service function) in the form of a virtual machine (VM) by means of Network Function Virtualization (NFV). When the controller receives a request to configure a new chain, it adds a new entry to the SFC table. Once the entry is added, the controller updates the SFC tables of the Service Classifier (SC) and the SFFs so that, according to the policy, the specified traffic is serviced in the order defined by the path in the actual network.

The controller can also perform failure detection for each service function by means of periodic ping messages. That is, if there is no response to a ping message within a predefined threshold time (the failure detection timeout), the controller can judge that the corresponding service function has failed.

If a failure of a particular service function instance is detected, the controller installs a new instance of that service function at the same location as the service function instance (or service node) that has the smallest communication delay to the controller, chosen from among the service function instances (or service nodes) connected to the same SFF as the failed instance. The controller then configures a new path in the actual network so that traffic passes through the new service function instance (or the service node on which the new service function is installed).
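The node selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `pick_replacement_node`, the two dictionaries, and the choice to exclude the failed node itself from the candidates are all hypothetical.

```python
def pick_replacement_node(failed_sn, sff_of, delay_to_controller):
    """Pick the SN on the failed SN's SFF with the lowest delay to the controller.

    sff_of: maps each SN name to the SFF it is connected to (hypothetical layout).
    delay_to_controller: maps each SN name to its delay to the controller.
    Excluding the failed SN from the candidates is an assumption of this sketch.
    """
    candidates = [
        sn for sn, sff in sff_of.items()
        if sff == sff_of[failed_sn] and sn != failed_sn
    ]
    if not candidates:
        return None  # no other SN shares the SFF
    return min(candidates, key=lambda sn: delay_to_controller[sn])


sff_of = {"SN 1": "SFF 1", "SN 2": "SFF 1", "SN 3": "SFF 1", "SN 4": "SFF 2"}
delays_ms = {"SN 1": 5, "SN 2": 9, "SN 3": 3, "SN 4": 1}
print(pick_replacement_node("SN 2", sff_of, delays_ms))  # SN 3
```

SN 4 has the lowest delay overall but is skipped because it hangs off a different SFF, which mirrors the same-SFF restriction in the text.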

The ingress node of the data plane attaches a Network Service Header (NSH) to packets entering the SFC domain, according to the network policy received from the controller, so that the packets are serviced in the order defined by the path in the actual network. The NSH contains the SPI and a Service Index (SI). The SI is decremented by 1 by each service function instance and so indicates how far along the path the packet has been processed. In FIG. 1, for example, the ingress node assigns SPI=P and SI=255 to packets whose destination port number is 80, according to the policy.
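The classification step at the ingress node can be sketched as follows. This is an illustrative sketch only: the `Nsh` and `Packet` types, the `POLICY` dictionary, and the `classify` function are hypothetical names, and a real NSH carries more fields. Only the port-80 policy and the values SPI=P, SI=255 come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Nsh:
    """Hypothetical stand-in holding only the two NSH fields the text uses."""
    spi: str  # Service Path Index: identifies the path
    si: int   # Service Index: decremented by each service function


@dataclass
class Packet:
    dst_port: int
    nsh: Optional[Nsh] = None


# Policy received from the controller: destination port -> (SPI, initial SI).
# The single entry mirrors the example in the text.
POLICY = {80: ("P", 255)}


def classify(packet: Packet) -> Packet:
    """Attach an NSH if the packet matches the policy; otherwise leave it unchanged."""
    match = POLICY.get(packet.dst_port)
    if match is not None:
        spi, si = match
        packet.nsh = Nsh(spi=spi, si=si)
    return packet
```

A packet to port 80 leaves `classify` with `Nsh(spi="P", si=255)`, while a packet to any other port is left without an NSH.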

After attaching the NSH, the ingress node forwards the packet to the SFF connected to the first service function instance specified in the path. The SFF forwards the packet using an SFF table whose entries consist of SPI, SI, and Next Hop (NH). Because the SPI and SI of the packet received from the SC are P and 255 respectively, the SFF forwards the packet to the corresponding NH, SF 1.

After processing the packet, SF 1 decrements the SI by 1 and returns the packet to SFF 1. By repeating this process, the packet matches entries 1 - 2 - 3 - 4 - 5 of the SFF table in turn and passes through the service function instances in order.
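The lookup-and-decrement loop can be sketched as below. The table contents are an assumption made for illustration: the text only says that five entries are matched in order for a four-function chain, so the final next hop is labeled "egress" here as a guess, and `walk_chain` is a hypothetical helper.

```python
from typing import List

# SFF table: (SPI, SI) -> next hop. The fifth entry's next hop is an assumption.
SFF_TABLE = {
    ("P", 255): "SF 1",
    ("P", 254): "SF 2",
    ("P", 253): "SF 3",
    ("P", 252): "SF 4",
    ("P", 251): "egress",
}


def walk_chain(spi: str, si: int) -> List[str]:
    """Return the next hops the SFF selects for one packet, in order."""
    visited = []
    while True:
        next_hop = SFF_TABLE[(spi, si)]  # SFF lookup keyed on (SPI, SI)
        visited.append(next_hop)
        if next_hop == "egress":
            return visited
        si -= 1  # the SF decrements SI and returns the packet to the SFF


print(walk_chain("P", 255))  # ['SF 1', 'SF 2', 'SF 3', 'SF 4', 'egress']
```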

As mentioned above, the SFF table is updated by the controller whenever the controller configures a new path in the actual network. The egress node forwards packets whose processing within the SFC domain is complete to the outside of the SFC domain.

In the present invention, a distributed fail-over agent (which, depending on the embodiment, may also be called a distributed fail-over unit) is located on each of the SNs that host the service function instances included in the path in the actual network. All SFs included in the path may be assumed to be connected to the same SFF. If the path is written SF 1 - SF 2 - SF 3 - ... - SF N, and the distributed fail-over agent located at the i-th service function is called the i-th distributed fail-over agent, then the i-th agent can perform failure detection and recovery for the (i+1)-th service function when the (i+1)-th service function is connected to the same SFF as the i-th service function.

First, the i-th distributed fail-over agent can perform failure detection for the (i+1)-th service function. That is, the i-th agent sends periodic ping messages to the (i+1)-th service function (or the (i+1)-th service node) and, if there is no response within a predefined threshold time (the failure detection timeout), can judge that the (i+1)-th service function (or service node) has failed.
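The timeout rule can be sketched as a small detector. This is a hypothetical illustration: the class name, its methods, and the use of explicitly passed timestamps (rather than a real clock and real ping traffic) are assumptions made so the example stays deterministic.

```python
class PingFailureDetector:
    """Judges the next SF failed when no ping reply arrives within `timeout`."""

    def __init__(self, timeout: float, now: float = 0.0):
        self.timeout = timeout   # predefined threshold time
        self.last_reply = now    # time of the most recent ping reply

    def on_reply(self, now: float) -> None:
        """Record that a reply to a ping was received at time `now`."""
        self.last_reply = now

    def has_failed(self, now: float) -> bool:
        """True once more than `timeout` has elapsed since the last reply."""
        return (now - self.last_reply) > self.timeout


detector = PingFailureDetector(timeout=3.0)
detector.on_reply(now=1.0)
print(detector.has_failed(now=3.5))  # False: 2.5 s since the last reply
print(detector.has_failed(now=4.5))  # True: 3.5 s since the last reply
```

In a real agent, the ping period would need to be shorter than the timeout so that a single delayed reply does not trigger a false detection; the text does not specify either value.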

The i-th distributed fail-over agent also maintains a backup instance of the (i+1)-th service function. When it judges that the (i+1)-th service function has failed, it activates the backup instance and updates the internal network so that packets entering the i-th service function are processed by the i-th service function and then passed to, and processed by, the newly activated (i+1)-th service function.

For example, if SF 2 fails in FIG. 1, distributed fail-over agent 1 activates the previously inactive backup SF 2 and updates the internal network. Packets entering SF 1 are then processed by SF 1 and subsequently by the SF 2 located on the same SN 1.

As a result, packets entering SF 1 are processed by both SF 1 and SF 2, so the SI in the NSH has been decremented by 2 by the time the packet is returned to SFF 1. Because the SI of the returned packet is 253, SFF 1 forwards it directly to the SF 3 instance rather than to the failed SF 2 instance. The failure is thus overcome without the controller creating a new path in the actual network or updating the SFF table.
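The effect of the double decrement on SFF forwarding can be sketched as follows. The table layout (including the "egress" entry) and the function `hops_seen_by_sff` are assumptions for illustration. The point shown is that the SFF table itself is never modified.

```python
# Unchanged SFF table: (SPI, SI) -> next hop. The last entry is an assumption.
SFF_TABLE = {
    ("P", 255): "SF 1",
    ("P", 254): "SF 2",
    ("P", 253): "SF 3",
    ("P", 252): "SF 4",
    ("P", 251): "egress",
}


def hops_seen_by_sff(si: int, backup_active: bool) -> list:
    """List the next hops SFF 1 selects for one packet on path P."""
    hops = []
    while True:
        hop = SFF_TABLE[("P", si)]
        hops.append(hop)
        if hop == "egress":
            return hops
        si -= 1  # decrement by the SF the SFF forwarded to
        if hop == "SF 1" and backup_active:
            si -= 1  # second decrement by the backup SF 2 on the same SN 1


print(hops_seen_by_sff(255, backup_active=False))
# ['SF 1', 'SF 2', 'SF 3', 'SF 4', 'egress']
print(hops_seen_by_sff(255, backup_active=True))
# ['SF 1', 'SF 3', 'SF 4', 'egress']
```

With the backup active, the packet comes back from SN 1 with SI=253, so the lookup lands on SF 3 and the failed SF 2 entry is simply never matched.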

On the other hand, if SF 1 fails, or if the (i+1)-th service function fails while it is connected to a different SFF from the i-th service function, there is no distributed fail-over agent to detect and recover from the failure. In that case the controller overcomes the failure by creating a new path in the actual network and updating the SFF table through the failure detection and recovery mechanism described above.
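The division of responsibility between the agents and the controller can be sketched as a simple rule. The function `recovery_handler` and the `sff_of` mapping are hypothetical; the two fallback conditions are the ones stated in the text.

```python
def recovery_handler(i: int, sff_of: dict) -> str:
    """Return who recovers a failure of SF i (1-based position on the path).

    sff_of maps each SF position to the SFF it is connected to (hypothetical layout).
    """
    if i == 1:
        return "controller"  # no preceding agent watches SF 1
    if sff_of[i - 1] != sff_of[i]:
        return "controller"  # agent i-1 only covers SFs on its own SFF
    return f"agent {i - 1}"


sff_of = {1: "SFF 1", 2: "SFF 1", 3: "SFF 2"}
print(recovery_handler(1, sff_of))  # controller
print(recovery_handler(2, sff_of))  # agent 1
print(recovery_handler(3, sff_of))  # controller
```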

FIG. 2 shows an embodiment (200) of the SF fail-over process carried out through the SFC controller.

FIG. 2 shows the failure recovery process carried out by the SFC controller when the i-th service function fails, together with the corresponding failure recovery time. The controller periodically checks the state of the i-th service function and, if there is no response within the failure detection timeout, judges that the i-th service function has failed. Detecting the failure therefore takes a time equal to the failure detection timeout. Let the i-th service function be installed on SN i, and consider the network delay between the SFC controller and the i-th service function (or SN i). If j denotes the SN with the lowest delay to the controller among the SNs connected to the same SFF as SN i, the SFC controller sends SN j a request to install the i-th service function, which takes a time determined by the network delay between the controller and SN j. SN j, having received the request, then installs the i-th service function, which takes the SF installation time. When installation of the i-th service function is complete, SN j sends an ACK for the operation, and after receiving the ACK the controller updates the SFF table. The time from the update request until the SFF receives it is determined by the network delay between the controller and the SFF. The resulting failure recovery time for the i-th service function is calculated as in Equation 1.

[Equation 1] (equation image not reproduced)

The quantities used in the equations, whose symbols appear only in the equation images, are interpreted as follows:
the failure detection timeout;
the time required to install an SF;
the network delay between the i-th service function and the SFC controller;
the network delay between the SFF and the SFC controller;
the failure recovery time for the i-th service function under centralized fail-over;
the average failure recovery time under centralized fail-over;
the failure recovery time for the i-th service function under distributed fail-over;
the average failure recovery time under distributed fail-over.

A further relation is stated at this point in the source, but its expression appears only in an equation image. The average fail-over time through the SFC controller is calculated as in Equation 2.

[Equation 2] (equation image not reproduced)

In Equation 2, the per-function term is the failure recovery time for the i-th service function under centralized fail-over.
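Because the equation images are not reproduced in this text, the following is only a plausible reconstruction of Equations 1 and 2 from the steps described for FIG. 2, not the patent's verified formulas. All symbols are assumed notation introduced here. The sketch further assumes that each message (install request, ACK, SFF table update) costs one one-way delay, that the average is an arithmetic mean over the N service functions on the path, and that j is the minimum-delay SN on the same SFF as SN i.

```latex
% Assumed notation (the source's own symbols are not visible in this text):
%   T_{TO}    failure detection timeout
%   T_{inst}  time required to install an SF
%   d_k       one-way network delay between the SFC controller and SN k
%   d_{SFF}   one-way network delay between the SFC controller and the SFF
%   S_i       set of SNs connected to the same SFF as SN i
\begin{align}
j &= \operatorname*{arg\,min}_{k \in S_i} d_k \\
T_C^{(i)} &= \underbrace{T_{TO}}_{\text{detect}}
           + \underbrace{d_j}_{\text{install request}}
           + \underbrace{T_{inst}}_{\text{install}}
           + \underbrace{d_j}_{\text{ACK}}
           + \underbrace{d_{SFF}}_{\text{SFF table update}} \\
\bar{T}_C &= \frac{1}{N} \sum_{i=1}^{N} T_C^{(i)}
\end{align}
```

If the source counts round-trip rather than one-way delays for any of these messages, the corresponding delay terms would double; the structure of the sum would stay the same.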

FIG. 3 shows the service function fail-over process carried out through the distributed fail-over agent.

That is, FIG. 3 shows the fail-over process carried out through the distributed fail-over agent when the i-th service function fails and i is 2 or greater (that is, when the i-th service function is not the first SF on the path in the actual network), together with the corresponding failure recovery time.

The (i-1)-th distributed fail-over agent periodically checks the state of the i-th service function, and detecting a failure takes a time equal to the failure detection timeout. When the (i-1)-th distributed fail-over agent detects a failure, it activates the back-up i-th service function located on SN i-1 and updates the internal network so that the i-th service function can be provided on SN i-1.

The time this takes is assumed to be equal to the time required to install the i-th service function. The recovery time for the i-th service function is given by Equation 3.

[Equation 3] (equation image not reproduced)

The quantities listed with Equation 3, whose symbols appear only in the equation image, are interpreted as follows:
the failure detection timeout;
the time required to install a service function;
the failure recovery time for the i-th service function under centralized fail-over.
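Since the image for Equation 3 is not reproduced, the following is a reconstruction from the two steps the text describes for the distributed case (detection, then activation of the backup instance, with activation assumed to take as long as installation). The symbols are assumed notation, not the source's own.

```latex
% Assumed notation:
%   T_{TO}    failure detection timeout
%   T_{inst}  time required to install (here: activate) an SF
%   T_D^{(i)} failure recovery time for the i-th SF under distributed fail-over
\begin{equation}
T_D^{(i)} = T_{TO} + T_{inst}, \qquad i \ge 2
\end{equation}
```

No controller delay term appears because detection and activation both happen on SN i-1, without any message to or from the controller.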

When i is 1, failure recovery is performed by the controller, so the failure recovery time equals the centralized failure recovery time for the first service function. The average fail-over time through the distributed fail-over agents is calculated as in Equation 4.

[Equation 4] (equation image not reproduced)
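A small numeric sketch can make the comparison between the two averages concrete. It is not the patent's formula: the centralized term assumes one one-way delay per message (install request, ACK, SFF table update), every service function is given the same delays for simplicity, and the average in Equation 4 is assumed to be an arithmetic mean in which SF 1 uses the centralized time. All function names and input values are hypothetical.

```python
def centralized_time(t_to, t_inst, d_j, d_sff):
    """Assumed reading of Equation 1: detect, request, install, ACK, table update."""
    return t_to + d_j + t_inst + d_j + d_sff


def distributed_time(i, t_to, t_inst, d_j, d_sff):
    """Timeout plus activation for i >= 2; SF 1 falls back to the controller."""
    if i == 1:
        return centralized_time(t_to, t_inst, d_j, d_sff)
    return t_to + t_inst


def average(times):
    """Arithmetic mean (assumed form of Equations 2 and 4)."""
    return sum(times) / len(times)


# Hypothetical inputs in milliseconds, not taken from the source.
N, T_TO, T_INST, D_J, D_SFF = 4, 30, 50, 10, 10

avg_centralized = average(
    [centralized_time(T_TO, T_INST, D_J, D_SFF) for _ in range(N)]
)
avg_distributed = average(
    [distributed_time(i, T_TO, T_INST, D_J, D_SFF) for i in range(1, N + 1)]
)
print(avg_centralized)  # 110.0
print(avg_distributed)  # 87.5
```

Under these assumptions the gap between the two averages grows with the controller delays and with the chain length N, since only SF 1 still pays the controller round trip.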

FIG. 4 illustrates a failure recovery system according to an embodiment.

FIG. 4 proposes a structure and technique for performing failure detection and recovery in a distributed manner when a service function (SF) fails.

Specifically, according to the present invention, a Distributed Fail-over Agent and a back-up SF can be placed on an SN. The distributed fail-over agent can perform distributed failure detection and distributed fail-over by means of the back-up SF. Compared with the existing SFC-controller-based failure recovery technique, the present invention achieves a shorter fail-over time and allows service function failures to be recovered without going through the SFC controller. Depending on the embodiment, the failure recovery system (400) shown in FIG. 4 may itself be a service node.

To this end, a failure recovery system (400) according to an embodiment may include a storage unit (410), a transmitter (420), a determination unit (430), and a failure detection unit (440). The failure recovery system (400) may also be called a failure recovery apparatus.

The storage unit (410) according to an embodiment may maintain a back-up instance of the (i+1)-th service function (SF i+1). The transmitter (420) may transmit a ping message to the (i+1)-th service function (or the (i+1)-th service node) at regular intervals. The determination unit (430) may determine whether the transmitted ping message has been received (or whether a response to the ping message has been received).

The failure detection unit (440) according to an embodiment may detect a failure of the (i+1)-th Service Function (SF) or the (i+1)-th service node when no response to the transmitted ping message is returned within a predefined threshold time.

The failure recovery system 400 according to another embodiment may further include a failure recovery processing unit 450 that recovers from the failure by activating the backup instance (or backup service function). In particular, when a failure is detected, the failure recovery processing unit 450 may activate the back-up instance for the (i+1)-th service function (SF). It may then update the internal network so that packets arriving at the i-th SF are processed by the i-th SF and afterwards delivered to, and processed by, the (i+1)-th SF whose back-up instance has been activated.
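A minimal sketch of this recovery step follows, assuming the internal network can be modeled as a per-node next-hop map. `BackupInstance`, `ServiceNode`, `recover`, and the locator strings are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class BackupInstance:
    """Standby copy of SF i+1 held by the storage unit (hypothetical model)."""
    sf_index: int
    active: bool = False


@dataclass
class ServiceNode:
    index: int                # i: the SF this node provides
    backup: BackupInstance    # back-up instance for SF i+1
    # Assumed model of the internal network: SF index -> where to send packets.
    next_hop: Dict[int, str] = field(default_factory=dict)

    def recover(self) -> None:
        """Activate the backup and steer SF i+1 traffic to it locally."""
        self.backup.active = True
        self.next_hop[self.backup.sf_index] = "local-backup"
```

Because both changes are local to the i-th node, no message to the SFC controller is needed in this sketch, which is the property the description attributes to the distributed approach.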

In addition, the failure recovery processing unit 450 may process packets arriving at the i-th service function (or the i-th service node) so that the Service Index (SI) of the Network Service Header (NSH) is decremented by a fixed amount after processing in the i-th SF and the (i+1)-th SF.

When a failure is detected in the (i+1)-th service function (SF), the failure recovery processing unit 450 according to an embodiment may create a Service Function Path (SFP), that is, a path on the actual network, through the controller, update the Service Function Forwarder (SFF) table corresponding to the created SFP, and handle the failure using the updated SFF table.
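The SFF table update can be illustrated as follows. The specification does not define the table layout, so this sketch assumes a table keyed by (Service Path Identifier, Service Index) that maps to a next-hop locator; `build_sff_entries`, `install_path`, and the locator strings are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed layout: (SPI, SI) -> next-hop locator.
SffTable = Dict[Tuple[int, int], str]


def build_sff_entries(spi: int, first_si: int, hops: List[str]) -> SffTable:
    """Expand an SFP (ordered list of hop locators) into per-SI table entries."""
    return {(spi, first_si - k): hop for k, hop in enumerate(hops)}


def install_path(table: SffTable, spi: int, first_si: int, hops: List[str]) -> None:
    """Replace all entries of path `spi` with the entries for the new SFP."""
    for key in [k for k in table if k[0] == spi]:
        del table[key]
    table.update(build_sff_entries(spi, first_si, hops))
```

For example, replacing the hop list `["sn-1", "sn-2", "sn-3"]` with `["sn-1", "sn-2b", "sn-3"]` for the same path rewrites only the entry that pointed at the failed node.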

FIG. 5 is a flowchart of a method of operating a failure recovery processor according to an embodiment, for distributed SF fail-over by the distributed fail-over agent.

At least one Distributed Fail-over Agent (DFA) located at a Service Function (SF) may be maintained.

First, in the method of operating the failure recovery processor according to an example, the i-th distributed fail-over agent transmits a ping message to the (i+1)-th service function (step 510). That is, among the at least one Distributed Fail-over Agent (DFA), the i-th DFA may transmit a ping message to the (i+1)-th Service Function (SF) at regular intervals.

The method of operating the failure recovery processor according to an embodiment may then determine whether a response to the transmitted ping message has been received (step 520).

Next, if a response is received within the time To, the ping message is transmitted again (step 510). If no response is received within To, the back-up SF of the (i+1)-th service function is activated (step 530). The internal network is also updated so that packets processed by the i-th service function are delivered to the (i+1)-th service function (step 540).
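Steps 510 to 540 can be sketched as a loop with the network operations injected as callables. `run_agent` and its parameters are hypothetical names, `max_rounds` exists only so that the sketch terminates, and the fixed interval between pings is left out for brevity.

```python
from typing import Callable


def run_agent(
    ping: Callable[[float], bool],
    activate_backup: Callable[[], None],
    update_network: Callable[[], None],
    timeout: float,
    max_rounds: int,
) -> bool:
    """Run the ping loop; return True if fail-over was triggered."""
    for _ in range(max_rounds):
        # Steps 510/520: send a ping and wait up to `timeout` (To) for the reply.
        if ping(timeout):
            continue  # reply arrived in time, so return to step 510
        activate_backup()  # step 530: activate the back-up SF for SF i+1
        update_network()   # step 540: redirect packets leaving SF i
        return True
    return False
```

Here `ping(timeout)` is assumed to return `True` when a reply arrives within `timeout` seconds and `False` otherwise; injecting it keeps the control flow testable without a real network.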

That is, the method of operating the failure recovery processor may detect a failure of the (i+1)-th Service Function (SF) when a response to the transmitted ping message is not returned within a predefined threshold time. When a failure is detected, the i-th Distributed Fail-over Agent (DFA) may activate the back-up instance for the (i+1)-th SF. The internal network may then be updated so that packets arriving at the i-th SF are processed by the i-th SF and afterwards delivered to, and processed by, the (i+1)-th SF whose back-up instance has been activated.

For packets arriving at the i-th Service Function (SF), the method of operating the failure recovery processor may cause the Service Index (SI) of the Network Service Header (NSH) to be decremented by a fixed amount after processing in the i-th SF and the (i+1)-th SF.
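The effect on the Service Index can be illustrated with a small sketch. It assumes that each SF lowers the SI by one, so that a packet processed by both the i-th SF and the activated back-up of the (i+1)-th SF on the same node leaves with the SI lowered by two, which matches the value recited in claim 4. `Nsh`, `apply_sf`, and `process_on_node_with_backup` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nsh:
    """Only the two NSH fields this sketch needs."""
    spi: int  # Service Path Identifier
    si: int   # Service Index


def apply_sf(header: Nsh, step: int = 1) -> Nsh:
    """Return the header after one SF has processed the packet."""
    if header.si < step:
        raise ValueError("service index exhausted")
    # Assumption: each SF lowers the SI by `step` (one by default).
    return Nsh(header.spi, header.si - step)


def process_on_node_with_backup(header: Nsh) -> Nsh:
    """SF i and the activated back-up of SF i+1 both run on the same node."""
    return apply_sf(apply_sf(header))
```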

In the method of operating the failure recovery processor according to an embodiment, when the Service Index (SI) of an incoming packet is greater than or equal to a specific value, the (i+1)-th Service Function (SF) may be skipped and the incoming packet forwarded to the (i+2)-th SF.
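The skip rule reduces to a single comparison. The specification does not say how the specific value is chosen, so `skip_threshold` below is a hypothetical parameter, as is the function name `next_sf_index`.

```python
def next_sf_index(i: int, service_index: int, skip_threshold: int) -> int:
    """Pick the SF that receives a packet leaving the i-th SF."""
    # Assumption: `skip_threshold` stands in for the unspecified "specific value".
    if service_index >= skip_threshold:
        return i + 2  # skip the (i+1)-th SF
    return i + 1
```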

In addition, when a failure is detected in the (i+1)-th Service Function (SF), the method of operating the failure recovery processor may create a Service Function Path (SFP), that is, a path on the actual network, through the controller, update the Service Function Forwarder (SFF) table corresponding to the created SFP, and handle the failure using the updated SFF table.

FIG. 6 is a graph 600 showing how the average fail-over time changes as the range of the delay between the controller and the SFs is varied.

That is, the graph 600 of FIG. 6 reports a simulation carried out to evaluate the performance of the present invention against existing methods. In the figure showing the simulation results, the Distributed Fail-over result (reference numeral 620) corresponds to the present invention, and the Centralized Fail-over result (reference numeral 610) corresponds to a technique in which the SFC controller performs failure detection and recovery for all SFs.

The graph 600 shows the change in average fail-over time as the range of the delay between the controller and the SFs is varied. As the graph 600 shows, the present invention achieves a lower average fail-over time. This is because the distributed service function fail-over performed by the distributed fail-over agents minimizes the fail-over work that must go through the controller.

In short, the present invention reduces service function fail-over time by applying a distributed fail-over method to the service functions on a service function chain. A Distributed Fail-over Agent placed at each service function performs failure detection and fail-over for the next service function in the chain. Because the failure is overcome in a distributed manner, without the chain update procedure that only the centralized SFC controller can perform, the service function fail-over time is shortened.

The apparatus described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the apparatus and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable array (FPA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, the description sometimes refers to a single processing device, but a person of ordinary skill in the art will recognize that the processing device may include a plurality of processing elements and/or multiple types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or command the processing device independently or collectively. Software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to an embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that a computer can execute using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the embodiment, and vice versa.

Although the present invention has been described with reference to the embodiments shown in the drawings, these are merely illustrative, and a person of ordinary skill in the art will understand that various modifications and other equivalent embodiments are possible. Accordingly, the true technical scope of protection of the present invention should be determined by the technical idea of the appended claims.

Claims

A distributed service function failure recovery method performed by an i-th (i is a natural number smaller than N) failure recovery system among N (N is a natural number of 3 or more) failure recovery systems, each of which provides one of a plurality of service functions included in a service function chain, the method comprising:
transmitting, by a transmitter included in the i-th failure recovery system, a ping message at regular intervals to the (i+1)-th failure recovery system, which provides the next service function on the service function chain;
determining, by a determination unit included in the i-th failure recovery system, whether a response to the ping message is received;
detecting, by a failure detection unit included in the i-th failure recovery system, a failure of the next service function when a response to the ping message is not returned within a predefined threshold time; and
activating, by a failure recovery processing unit included in the i-th failure recovery system, a backup instance for the next service function when the failure is detected.

The method of claim 1, further comprising:
updating, by the failure recovery processing unit, the internal network so that packets arriving at the i-th failure recovery system are processed by the service function provided by the i-th failure recovery system and then delivered to the next service function for which the backup instance has been activated.

The method of claim 2,
wherein, for packets arriving at the service function provided by the i-th failure recovery system, the Service Index (SI) of the Network Service Header (NSH) is decremented by a fixed amount after processing in that service function and the next service function.

The method of claim 3,
wherein the service index is decremented by two.

An i-th (i is a natural number smaller than N) failure recovery system among N (N is a natural number of 3 or more) failure recovery systems, each of which provides one of a plurality of service functions included in a service function chain, the i-th failure recovery system comprising:
a storage unit that holds a backup instance for the (i+1)-th service function;
a transmitter that transmits a ping message to the (i+1)-th service node at regular intervals;
a determination unit that determines whether a response to the ping message is received;
a failure detection unit that detects a failure of the (i+1)-th service function when a response to the ping message is not returned within a predefined threshold time; and
a failure recovery processing unit that recovers from the failure by activating the backup instance for the (i+1)-th service function when the failure of the (i+1)-th service function is detected.

(deleted)

The failure recovery system of claim 5,
wherein the failure recovery processing unit updates the internal network so that packets arriving at the i-th service function provided by the i-th failure recovery system are processed by the i-th service function and then delivered to the (i+1)-th service function for which the backup instance has been activated.

The failure recovery system of claim 7,
wherein the failure recovery processing unit processes packets arriving at the i-th service function so that the service index of the network service header is decremented by a fixed amount after processing in the i-th service function and the (i+1)-th service function.