KR102239177B1 - Method for managing of cloud server, device and system for managing of cloud server performing the same

Info

Publication number: KR102239177B1
Application number: KR1020140117070A
Authority: KR
Inventors: 정현호; 옥기상
Original assignee: 주식회사 케이티
Priority date: 2014-09-03
Filing date: 2014-09-03
Publication date: 2021-04-09
Also published as: KR20160028247A

Abstract

In the cloud server management method according to the present invention, a cloud server management device manages a cloud server so that a bypass function is implemented when a virtual machine fails. The method includes monitoring at least one virtual machine, determining for each virtual machine whether to activate the bypass, and controlling traffic destined for a failed virtual machine so that the bypass function is implemented through a relay of logical queues.

Description

Cloud server management method, and cloud server management device and cloud service management system performing the same {METHOD FOR MANAGING OF CLOUD SERVER, DEVICE AND SYSTEM FOR MANAGING OF CLOUD SERVER PERFORMING THE SAME}

The present invention relates to a cloud server management method, a cloud server management device that performs the method, and a cloud service management system.

In a cloud service platform, the network interface card (NIC) provides a bypassing function for the physical link when a virtual machine fails or under user control.

When a virtual machine fails, the cloud service platform can respond to the software failure through virtual machine migration. However, virtual machine migration can take a long time, depending on technology maturity and operational know-how.

In addition, NIC bypassing technology to date can detect and switch over only on a host failure, and cannot selectively bypass traffic per virtual machine when an individual virtual machine fails.

The present invention proposes a cloud server management method capable of implementing a bypass function for each virtual machine, a cloud server management device that performs the method, and a cloud service management system.

The cloud server management method of the present invention is a method in which a cloud server management device manages a cloud server so that a bypass function is implemented when a virtual machine fails. The method includes monitoring at least one virtual machine, determining for each virtual machine whether to activate the bypass, and controlling traffic destined for a failed virtual machine so that the bypass function is implemented through a relay of logical queues.

The monitoring may include detecting, for each virtual machine, whether a failure has occurred.

The determining may include determining that a failure has occurred in a virtual machine when the status message of that virtual machine does not arrive for a predetermined period of time or longer.

The controlling may include implementing the bypass function for the traffic of a failed virtual machine by relaying an input queue to an output queue in software.

The controlling may include passing traffic destined for an abnormal virtual machine from the input queue to the output queue without delivering it to that virtual machine.

The cloud server management device of the present invention includes a failure detection unit that detects, for each of at least one virtual machine, whether a failure has occurred, and a bypass control unit that creates logical queues separating traffic for each virtual machine and controls traffic destined for a failed virtual machine so that a bypass function is implemented through a relay of the logical queues.

The device may further include a bypass unit that implements the bypass function for the traffic of the failed virtual machine and bypasses packets on a per-virtual-machine basis.

The failure detection unit may be disposed in each virtual machine to detect whether a failure occurs in that virtual machine, and may transmit a status message of each virtual machine to the bypass control unit.

The bypass control unit may collect the status message of each virtual machine from the failure detection unit and manage the state of the virtual machine using the collected status messages.

If the status message from the failure detection unit of a registered virtual machine does not arrive for a predetermined period of time or longer, the bypass control unit may determine that a failure has occurred in that virtual machine and send a message to the bypass unit to set the bypass mode for that virtual machine.

The bypass unit may include a queue storage unit that stores the logical queues separating traffic for each virtual machine, and a bypass management unit that receives status messages from the failure detection unit or the bypass control unit and manages traffic so that it is selectively bypassed for each failed virtual machine.

The bypass management unit may implement the bypass function for the traffic of a failed virtual machine by relaying an input queue to an output queue in software.

The cloud service management system of the present invention includes a cloud server management device and a service management device. The cloud server management device creates logical queues that separate traffic for each virtual machine and controls a cloud server so that a bypass function is implemented, through a relay of the logical queues, for traffic destined for a failed virtual machine. The service management device is connected to at least one cloud server and controls each cloud server to recover the resources of a failed virtual machine or to create a new virtual machine.

The service management device may include a virtual machine monitoring unit that receives status messages of virtual machines from the cloud server management device and monitors whether the bypass function has been implemented, and a resource management unit that controls recovery of the resources of a failed virtual machine.

The service management device may further include a configuration management unit that controls each cloud server to create a new virtual machine to replace the failed virtual machine.

The cloud server management device may include a failure detection unit that detects, for each of at least one virtual machine, whether a failure has occurred.

The cloud server management device may include a bypass control unit that creates logical queues separating traffic for each virtual machine and controls traffic destined for a failed virtual machine so that the bypass function is implemented through a relay of the logical queues.

The cloud server management device may manage traffic so that it is selectively bypassed for each failed virtual machine.

According to the present invention, a virtual machine failure is detected in real time and a bypass function is implemented for each virtual machine, which provides an environment in which network services can be restored immediately in the event of a service failure.

FIG. 1 is a diagram schematically showing the structure of a cloud server according to the prior art.
FIG. 2 is a diagram showing the structure of a cloud service management system according to an embodiment of the present invention.
FIG. 3 is a diagram showing a cloud server management device according to an embodiment of the present invention implementing a bypass function in a cloud server.
FIG. 4 is a flowchart showing a process in which a cloud server management device according to an embodiment of the present invention detects the state of a virtual machine and implements a bypass function.
FIG. 5 is a diagram showing a process in which a cloud server delivers traffic according to the state of a virtual machine, according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art can readily practice the invention. The present invention may, however, be implemented in various different forms and is not limited to the embodiments described herein. In the drawings, parts unrelated to the description are omitted for clarity, and similar reference numerals denote similar parts throughout the specification.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated to the contrary.

In addition, the terms "... unit" and "... module" used in the specification denote a unit that processes at least one function or operation, and may be implemented in hardware, in software, or in a combination of hardware and software.

FIG. 1 is a diagram schematically showing the structure of a cloud server according to the prior art.

Referring to FIG. 1, in the cloud server 10, the network interface card (NIC) 40 provides a bypassing function for the physical link when one of the virtual machines 32 to 36 fails or under user control. The network interface card 40 also provides a host-based pass-through function.

When a virtual machine fails, the software failure can be handled through virtual machine migration. However, virtual machine migration takes a long time, ranging from several minutes to several hours, depending on technology maturity and operational know-how.

Furthermore, to commercialize Network Function Virtualization (NFV) services and satisfy Service Level Agreements (SLAs), a real-time (within a few seconds) network bypassing function must be available when a virtual machine fails.

However, NIC bypassing technology to date can detect and switch over only on a host failure, and cannot selectively bypass traffic per virtual machine when an individual virtual machine fails.

A cloud server management method according to an embodiment of the present invention, and a cloud server management device and cloud service management system that perform it, are now described in detail with reference to FIGS. 2 to 5.

FIG. 2 is a diagram showing the structure of a cloud service management system according to an embodiment of the present invention. The figure shows only the schematic configuration needed to describe the embodiment, and the system is not limited to this configuration.

Referring to FIG. 2, the cloud service management system according to an embodiment of the present invention includes a cloud server management device 100 and a service management device 200.

The cloud server management device 100 implements a bypass function through per-virtual-machine virtual queue management. The cloud server management device 100 creates, on the network interface card (NIC), logical queues that separate traffic for each virtual machine.

The cloud server management device 100 also implements a watchdog function for each virtual machine. When a failure is detected in a virtual machine, it controls the cloud server so that a bypass function is implemented, through a relay of the logical queues, for traffic destined for the failed virtual machine.

The cloud server management device 100 manages traffic so that it is selectively bypassed for each failed virtual machine.

According to an embodiment of the present invention, the cloud server management device 100 includes a failure detection unit 110, a bypass control unit 120, and a bypass unit 130.

The failure detection unit 110 detects, for each of at least one virtual machine, whether a failure has occurred. The failure detection unit 110 is disposed in each virtual machine, detects whether that virtual machine has failed, and transmits a status message of each virtual machine to the bypass control unit 120.

The bypass control unit 120 creates logical queues that separate traffic for each virtual machine, and controls traffic destined for a failed virtual machine so that the bypass function is implemented through a relay of the logical queues.

The bypass control unit 120 collects the status message of each virtual machine from the failure detection unit 110 and manages the state of the virtual machine using the collected status messages.

If the status message from the failure detection unit 110 of a registered virtual machine does not arrive for a predetermined period of time or longer, the bypass control unit 120 determines that a failure has occurred in that virtual machine and sends a message to the bypass unit 130 to set the bypass mode for that virtual machine.

According to an embodiment of the present invention, the bypass control unit 120 includes a time control unit 122, a virtual machine state management unit 124, a bypass state management unit 126, and a TCP interworking unit 128.

The time control unit 122 manages time separately for each virtual machine in order to determine whether a message from the failure detection unit 110 arrives within the predetermined period.

The virtual machine state management unit 124 and the bypass state management unit 126 periodically collect status messages from the failure detection unit 110, and manage the state of each virtual machine and the reception time of each virtual machine's status message.
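The per-virtual-machine timing described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `HeartbeatMonitor`, the `timeout_s` parameter, and the VM identifiers are all hypothetical, and the specification does not state a concrete timeout value.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Tracks the reception time of the latest status message per registered VM."""
    timeout_s: float  # hypothetical threshold; the text only says "a predetermined period"
    last_seen: dict[str, float] = field(default_factory=dict)

    def register(self, vm_id: str, now: float) -> None:
        # A VM is tracked only after it has been registered.
        self.last_seen[vm_id] = now

    def on_status_message(self, vm_id: str, now: float) -> None:
        # Record the reception time; messages from unregistered VMs are ignored.
        if vm_id in self.last_seen:
            self.last_seen[vm_id] = now

    def failed_vms(self, now: float) -> list[str]:
        # A VM is treated as failed when no status message has arrived
        # for timeout_s seconds or longer.
        return sorted(vm for vm, seen in self.last_seen.items()
                      if now - seen >= self.timeout_s)


monitor = HeartbeatMonitor(timeout_s=3.0)
monitor.register("vm-32", now=0.0)
monitor.register("vm-34", now=0.0)
monitor.on_status_message("vm-32", now=2.5)
print(monitor.failed_vms(now=4.0))  # ['vm-34']
```

Passing the current time in as an argument, rather than reading a clock inside the class, keeps the sketch deterministic and easy to test.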

The TCP interworking unit 128 interworks over TCP (Transmission Control Protocol) with the failure detection unit 110 disposed in a virtual machine to detect whether a failure has occurred in that virtual machine.
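The specification does not define the format of the status message. As one possible illustration, the sketch below encodes each status message as a newline-delimited JSON record; the field names (`vm_id`, `healthy`, `seq`) are assumptions, and `socket.socketpair()` merely stands in for the TCP connection between the in-VM failure detection unit and the bypass control unit so that the example is self-contained.

```python
import json
import socket


def encode_status(vm_id: str, healthy: bool, seq: int) -> bytes:
    # Hypothetical wire format: one JSON object per line.
    record = {"vm_id": vm_id, "healthy": healthy, "seq": seq}
    return (json.dumps(record) + "\n").encode("utf-8")


def decode_status(line: bytes) -> dict:
    return json.loads(line.decode("utf-8"))


# A connected socket pair stands in for the agent-to-controller TCP connection.
agent_sock, controller_sock = socket.socketpair()
agent_sock.sendall(encode_status("vm-32", True, 1))
agent_sock.sendall(encode_status("vm-32", True, 2))
agent_sock.close()  # closing the sender lets the reader reach end-of-stream

with controller_sock.makefile("rb") as reader:
    received = [decode_status(line) for line in reader]
controller_sock.close()

print([msg["seq"] for msg in received])  # [1, 2]
```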

The bypass unit 130 implements the bypass function for the traffic of a failed virtual machine and operates to bypass packets on a per-virtual-machine basis. The bypass unit 130 may also perform a watchdog function that monitors the system.

According to an embodiment of the present invention, the bypass unit 130 includes a queue storage unit 132 and a bypass management unit 134.

The queue storage unit 132 stores the logical queues that separate traffic for each of the at least one virtual machine.

The bypass management unit 134 receives virtual machine messages from the failure detection unit 110 or the bypass control unit 120 and manages traffic so that it is selectively bypassed for each failed virtual machine.

The bypass management unit 134 implements the bypass function for the traffic of a failed virtual machine by relaying the input queue to the output queue in software.
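The software relay between the input queue and the output queue might look like the following sketch. It models the queues as in-memory `deque` objects; the class `BypassUnit`, its method names, and the `delivered` lists (which stand in for handing packets to a virtual machine) are hypothetical and are not taken from the specification.

```python
from collections import deque


class BypassUnit:
    """Per-VM logical input queues plus one shared output queue."""

    def __init__(self, vm_ids):
        self.input_queues = {vm: deque() for vm in vm_ids}
        self.output_queue = deque()
        self.delivered = {vm: [] for vm in vm_ids}  # stand-in for delivery to the VM
        self.bypassed = set()

    def set_bypass(self, vm_id, enabled=True):
        # Bypass mode is toggled per VM; other VMs are unaffected.
        if enabled:
            self.bypassed.add(vm_id)
        else:
            self.bypassed.discard(vm_id)

    def enqueue(self, vm_id, packet):
        self.input_queues[vm_id].append(packet)

    def drain(self):
        for vm_id, queue in self.input_queues.items():
            while queue:
                packet = queue.popleft()
                if vm_id in self.bypassed:
                    # Relay: input queue straight to output queue, skipping the VM.
                    self.output_queue.append(packet)
                else:
                    self.delivered[vm_id].append(packet)


unit = BypassUnit(["vm-32", "vm-34"])
unit.enqueue("vm-32", b"to-32")
unit.enqueue("vm-34", b"to-34")
unit.set_bypass("vm-34")
unit.drain()
print(unit.delivered["vm-32"], list(unit.output_queue))  # [b'to-32'] [b'to-34']
```

Because the bypass flag is kept per virtual machine, traffic for healthy virtual machines continues to be delivered while only the failed machine's queue is relayed.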

The service management device 200 is connected to at least one cloud server and controls each cloud server to recover the resources of a failed virtual machine or to create a new virtual machine.

The service management device 200 receives, from the cloud server management device 100, whether a virtual machine has failed and whether the bypass function has been implemented. The service management device 200 then recovers the resources of the failed virtual machine and creates a new virtual machine.

According to an embodiment of the present invention, the service management device 200 includes a virtual machine monitoring unit 210, a resource management unit 220, and a configuration management unit 230.

The virtual machine monitoring unit 210 receives virtual machine status messages from the cloud server management device 100 and monitors whether the bypass function has been implemented.

The resource management unit 220 controls recovery of the resources of a failed virtual machine. The configuration management unit 230 controls each cloud server to create a new virtual machine to replace the failed virtual machine.
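The recovery step can be pictured with a small bookkeeping sketch. It is only an illustration under simplifying assumptions: resources are reduced to a CPU count, and the class `ServiceManager`, its fields, and the generated VM names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceManager:
    free_cpus: int
    vms: dict[str, int] = field(default_factory=dict)  # vm_id -> allocated CPUs
    next_id: int = 1

    def create_vm(self, cpus: int) -> str:
        # Configuration management: create a new VM from free resources.
        if cpus > self.free_cpus:
            raise RuntimeError("not enough free CPUs for a replacement VM")
        vm_id = f"vm-new-{self.next_id}"
        self.next_id += 1
        self.free_cpus -= cpus
        self.vms[vm_id] = cpus
        return vm_id

    def on_vm_failure(self, vm_id: str) -> str:
        # Resource management: reclaim what the failed VM held,
        # then create a replacement of the same size.
        cpus = self.vms.pop(vm_id)
        self.free_cpus += cpus
        return self.create_vm(cpus)


manager = ServiceManager(free_cpus=0, vms={"vm-34": 2})
replacement = manager.on_vm_failure("vm-34")
print(replacement, manager.vms, manager.free_cpus)  # vm-new-1 {'vm-new-1': 2} 0
```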

As described above, the cloud service management system according to an embodiment of the present invention detects a virtual machine failure in real time and implements a bypass function for each virtual machine, which provides an environment in which network services can be restored immediately in the event of a service failure.

FIG. 3 is a diagram showing a cloud server management device according to an embodiment of the present invention implementing a bypass function in a cloud server. The figure shows only the schematic configuration needed to describe the embodiment, and the cloud server management device and the cloud server are not limited to this configuration.

Referring to FIG. 3, the NIC 40 of the cloud server 10 that includes the cloud server management device according to an embodiment of the present invention has logical queues for processing the traffic of each of the virtual machines 32 to 36, and has functional elements that allow a watchdog to run for each virtual machine.

In the operating system (OS) of the host server 20 and the OS of the guest server 30, first to third failure detection units 112 to 116 are installed in the respective virtual machines 32 to 36 to exchange health-check information with the bypass unit 130 disposed in the NIC 40.

After packets have been bypassed for a virtual machine, the service management device 200 receives the failure information of that virtual machine from the host server 20 and performs the virtual machine recovery procedure.

FIG. 4 is a flowchart showing a process in which a cloud server management device according to an embodiment of the present invention detects the state of a virtual machine and implements a bypass function. The flowchart is described using the same reference numerals as the configurations of FIGS. 1 to 3.

Referring to FIG. 4, in the cloud server management device 100, the bypass control unit 120 calls the watchdog API of the NIC 40 (S102). Here, the bypass control unit 120 may call the API of the bypass unit 130, which performs the watchdog function.

The cloud server management device 100 then determines whether each virtual machine is in a normal or abnormal state according to whether its message has arrived within a predetermined period (S104 to S108). If the status message of a virtual machine arrives within the period, the virtual machine is judged to be in a normal state; if the status message has not arrived when the period elapses, the virtual machine is judged to be in an abnormal state.

The cloud server management device 100 detects the abnormal virtual machine at the NIC 40 and activates the bypass function by relaying the input queue to the output queue for that virtual machine (S110, S112).

The bypass unit 130 passes the bypass information to the bypass control unit 120, which in turn sends the virtual machine failure information to the service management device 200 (S114, S116). The service management device 200 then carries out the virtual machine migration and resource recovery processes.

FIG. 5 is a diagram showing a process in which a cloud server delivers traffic according to the state of a virtual machine, according to an embodiment of the present invention. The process is described using the same reference numerals as the configurations of FIGS. 1 to 3.

Referring to FIG. 5, the cloud server 10 on which the cloud server management device 100 is disposed receives packets from external network links through an input port (S202).

When the virtual machines are in a normal state, the traffic is classified by destination virtual machine according to flow information and stored in the input logical queues of the NIC 40 (S204, S206). The cloud server 10 then delivers the traffic to the corresponding virtual machine (S208).
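The classification step can be sketched as a flow-table lookup. The specification says only that traffic is classified "according to flow information", so keying the table on the destination IP address is an assumption made here for brevity, as are the `Packet` type, the addresses, and the choice to drop packets that match no entry.

```python
from typing import NamedTuple, Optional


class Packet(NamedTuple):
    src_ip: str
    dst_ip: str
    payload: bytes


# Hypothetical flow table: destination IP -> destination VM.
FLOW_TABLE = {"10.0.0.32": "vm-32", "10.0.0.34": "vm-34"}


def classify(packet: Packet, flow_table: dict) -> Optional[str]:
    # Return the destination VM for this packet, or None if no flow matches.
    return flow_table.get(packet.dst_ip)


def sort_into_queues(packets, flow_table):
    # One logical input queue (modelled as a list) per VM in the flow table.
    queues = {vm: [] for vm in flow_table.values()}
    for packet in packets:
        vm_id = classify(packet, flow_table)
        if vm_id is not None:  # unmatched packets are dropped in this sketch
            queues[vm_id].append(packet)
    return queues


packets = [
    Packet("192.0.2.1", "10.0.0.32", b"a"),
    Packet("192.0.2.1", "10.0.0.34", b"b"),
    Packet("192.0.2.1", "10.0.0.99", b"c"),
]
queues = sort_into_queues(packets, FLOW_TABLE)
print({vm: len(q) for vm, q in queues.items()})  # {'vm-32': 1, 'vm-34': 1}
```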

When a virtual machine has failed and is in an abnormal state, the cloud server 10 activates the bypass function for that virtual machine through the cloud server management device 100 (S210).

In this case, traffic destined for the abnormal virtual machine is not delivered from the NIC 40 to the virtual machine, but passes directly from the input queue to the output queue (S212).

Thus, the cloud service management system according to an embodiment of the present invention is a technique that, when a virtual machine fails, protects (bypasses) the traffic destined for the virtual machine rather than protecting the service that the virtual machine provides.

The cloud service management system according to an embodiment of the present invention classifies traffic by virtual machine and then, based on real-time monitoring of the virtual machines, can implement the bypass function through a relay of the logical queues.

Accordingly, the cloud service management system according to an embodiment of the present invention detects a virtual machine failure in real time and implements a bypass function for each virtual machine, which provides an environment in which network services can be restored immediately in the event of a service failure.

The embodiments of the present invention described above are not implemented only through a device and a method; they may also be implemented through a program that realizes the functions corresponding to the configuration of the embodiments, or through a recording medium on which the program is recorded. Such a recording medium can be run on a user terminal as well as on a server.

Although the embodiments of the present invention have been described in detail above, the scope of the present invention is not limited to them. Various modifications and improvements made by those skilled in the art using the basic concept of the present invention defined in the following claims also fall within the scope of the present invention.

Claims

A method of managing a cloud server, in which a cloud server management device manages the cloud server so that a bypass function is implemented upon failure of a virtual machine, the method comprising:
monitoring whether a failure occurs in a plurality of virtual machines that transmit and receive traffic through a network interface card; and
when a packet is input from the outside through a port of the network interface card, creating a logical queue that separates the traffic for each virtual machine, and controlling the traffic stored in the logical queue to be delivered to the corresponding virtual machine or bypassed without being delivered to it, depending on whether each virtual machine has failed,
wherein the controlling
delivers the traffic stored in the logical queue of a normal virtual machine to the corresponding virtual machine, and implements the bypass function by relaying the traffic stored in the logical queue of a failed virtual machine to an output queue.

delete

The cloud server management method of claim 1,
wherein the monitoring comprises
determining that a failure has occurred in a virtual machine if the status message of the virtual machine does not arrive for a certain period of time or longer.

The cloud server management method of claim 1,
wherein the controlling comprises
implementing the bypass function for the traffic of a failed virtual machine by relaying the input queue to the output queue in software.

The cloud server management method of claim 1,
wherein the controlling comprises
passing traffic destined for an abnormal virtual machine from the input queue to the output queue without delivering it to the corresponding virtual machine.

delete

A cloud service management system comprising:
a cloud server management device that monitors whether a failure occurs in a plurality of virtual machines that transmit and receive traffic through a network interface card, creates a logical queue that separates traffic for each virtual machine when a packet is input from the outside through a port of the network interface card, and controls the traffic stored in the logical queue to be delivered to the corresponding virtual machine or bypassed without being delivered to it, depending on whether each virtual machine has failed; and
a service management device that recovers resources of a failed virtual machine among the plurality of virtual machines or creates a new virtual machine,
wherein the cloud server management device
delivers the traffic stored in the logical queue of a normal virtual machine to the corresponding virtual machine, and implements a bypass function by relaying the traffic stored in the logical queue of a failed virtual machine to an output queue.

The cloud service management system of claim 13,
wherein the service management device
receives a status message of a virtual machine from the cloud server management device, monitors whether the bypass function has been implemented, and controls recovery of the resources of a failed virtual machine.

The cloud service management system of claim 14,
wherein the service management device
controls each cloud server to create a new virtual machine to replace the failed virtual machine.

delete

The cloud service management system of claim 13,
wherein the cloud server management device
manages traffic so that it is selectively bypassed for each failed virtual machine.