KR20220164840A

KR20220164840A - Load balancer management system, method, and program in a cloud-native environment, and load balancer created by this method

Info

Publication number: KR20220164840A
Application number: KR1020210072567A
Authority: KR
Inventors: 이정복; 유태희; 이어형; 황병하; 조충희
Original assignee: 주식회사 카카오엔터프라이즈
Priority date: 2021-06-04
Filing date: 2021-06-04
Publication date: 2022-12-14
Also published as: KR102644436B1; KR20240008387A

Abstract

The present invention relates to a load balancer management system capable of flexibly coping with variable traffic. More specifically, in the present invention, a CR management unit generates a custom resource (CR) based on a load balancer node creation request. The custom resource includes at least one piece of parameter information defining a load balancer. The CR management unit stores the generated custom resource in a database. An operator performs at least one of creating, removing, and changing at least one load balancer node (LB node) so as to correspond to the at least one stored custom resource.

Description

System, method, program for managing load balancer in cloud native environment, and load balancer created by this method

The present invention relates to a system and a method for managing a load balancer in a cloud-native environment and to a load balancer created by the method, and more specifically to a control method that flexibly creates and removes load balancers according to network conditions in a cloud-native environment, and to a load balancer created by that method.

A cloud data center environment contains a large collection of interconnected servers that provide computing and/or storage capacity for running a variety of applications. For example, a data center may include facilities that host applications and services for subscribers, i.e., customers of the data center. A data center may also host all of the infrastructure equipment, such as networking and storage systems, redundant power supplies, and environmental controls.

Data centers are interconnected through a high-speed switch fabric provided by one or more layers of physical network switches and routers.

By its nature, a data center hosts numerous virtual machines (VMs), and these virtual machines can be easily created and removed. A load balancer that distributes traffic to virtual machines whose number and location change constantly must likewise be created and removed flexibly according to network conditions.

Load balancers are typically implemented as physical hardware devices that are separately connected to the data center. Relying on such physical hardware devices makes it difficult to respond flexibly to rapidly changing traffic volumes or to changes in the number of virtual machines. If physical equipment is deployed according to a prediction of the maximum traffic that may flow into the data center, not only are high equipment purchase costs incurred, but the design may also be uneconomical in terms of cost and performance when traffic is low.

In addition, a data center network must be designed to handle exponentially increasing traffic and abrupt changes in traffic patterns. This is because a sudden increase in a particular traffic flow can concentrate the load on a particular function, causing it to malfunction and in turn affecting other related functions.

Therefore, there is a need for research on a load balancer management system that can cope with rapidly changing traffic patterns without difficulty while remaining economical.

An object of the present invention is to provide a load balancer management system and control method optimized for a cloud-native environment.

Another object of the present invention is to provide a load balancer capable of supporting multiple tenants.

Another object of the present invention is to provide a load balancer management system that adapts dynamically to rapidly changing traffic patterns.

Another object of the present invention is to provide a container-type load balancer that can be easily controlled at the software level on the Linux operating system.

Another object of the present invention is to provide a container-type load balancer whose performance is comparable to that of a hardware load balancer.

The technical objects to be achieved by the present invention are not limited to those mentioned above, and other technical objects not mentioned will be clearly understood from the description below by those of ordinary skill in the art to which the present invention pertains.

To solve the above or other problems, one aspect of the present invention provides a control method of a load balancer management system, the method comprising: generating, by a CR management unit, a custom resource (CR) based on a load balancer creation request, the custom resource including at least one piece of parameter information defining a load balancer; storing the generated custom resource in a database; and performing, by an operator, at least one of creating, removing, and changing at least one load balancer node (LB node) so as to correspond to the at least one stored custom resource.

The method may further include allocating, by a VIP allocation unit, a virtual IP (VIP) to the load balancer to be added.

The generated load balancer may be implemented at the device driver (DD) layer of the Linux network stack.

The load balancer may be implemented using eBPF (extended Berkeley Packet Filter)/XDP (eXpress Data Path).

To solve the above or other problems, another aspect of the present invention provides a load balancer management system comprising: a CR management unit that generates a custom resource (CR) based on a load balancer creation request, the custom resource including at least one piece of parameter information defining a load balancer; a database that stores the generated custom resource; and an operator that performs at least one of creating, removing, and changing at least one load balancer node (LB node) so as to correspond to the at least one stored custom resource.

The system may further include a VIP allocation unit that allocates a virtual IP (VIP) to the load balancer to be added.
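The following is a minimal sketch, under stated assumptions, of how a CR management unit might generate a custom resource from a creation request, obtain a VIP from a VIP allocation unit, and store the result in a database. The class names, the `vlan_id` field, the in-memory dictionary standing in for the database, and the address pool are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv4Network
from typing import Dict, List, Optional


@dataclass
class CustomResource:
    """Hypothetical CR: the parameter information that defines one load balancer."""
    name: str
    vlan_id: int                        # assumed tenant identifier parameter
    vip: Optional[IPv4Address] = None   # filled in by the VIP allocation unit


class VipAllocator:
    """Hands out unused virtual IPs from a pool (the pool itself is an assumption)."""

    def __init__(self, pool: str) -> None:
        self._free: List[IPv4Address] = list(IPv4Network(pool).hosts())

    def allocate(self) -> IPv4Address:
        if not self._free:
            raise RuntimeError("VIP pool exhausted")
        return self._free.pop(0)


class CrManager:
    """Generates CRs from creation requests and stores them in a database."""

    def __init__(self, allocator: VipAllocator) -> None:
        self._allocator = allocator
        # A plain dict stands in for the database of the specification.
        self.database: Dict[str, CustomResource] = {}

    def handle_create_request(self, name: str, vlan_id: int) -> CustomResource:
        if name in self.database:
            raise ValueError(f"custom resource {name!r} already exists")
        if not 1 <= vlan_id <= 4094:
            raise ValueError("VLAN ID must be in the range 1..4094")
        cr = CustomResource(name=name, vlan_id=vlan_id, vip=self._allocator.allocate())
        self.database[name] = cr
        return cr
```

In this sketch the operator would later read `CrManager.database` and bring the running LB nodes into line with it; that step is deliberately left out here.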

The effects of the system and method for managing a load balancer in a cloud-native environment according to the present invention are as follows.

According to at least one of the embodiments of the present invention, load balancer management optimized for a cloud-native environment is possible.

In addition, according to at least one of the embodiments of the present invention, a load balancer capable of supporting multiple tenants can be provided.

In addition, according to at least one of the embodiments of the present invention, a container-based load balancer makes it possible to provide a load balancer that is relatively economical compared with an existing hardware load balancer.

Additionally, according to at least one of the embodiments of the present invention, a load balancer management system that adapts dynamically to rapidly changing traffic patterns can be provided.

Further scope of applicability of the present invention will become apparent from the detailed description that follows. However, since various changes and modifications within the spirit and scope of the present invention will be clearly understood by those skilled in the art, the detailed description and specific examples, such as the preferred embodiments of the present invention, should be understood as being given by way of example only.

FIG. 1 is a diagram showing the structure of a load balancer management system 100 according to an embodiment of the present invention.
FIG. 2 is a conceptual diagram in which the load balancer management system 100 according to an embodiment of the present invention is connected to a switch fabric.
FIG. 3 illustrates the concept of a control sequence for creating an additional LB node 203-3 according to an embodiment of the present invention.
FIG. 4 is a control flowchart for creating an additional LB node 203-3 according to an embodiment of the present invention.
FIG. 5 illustrates a control sequence in which the LB node 203 processes inbound traffic according to an embodiment of the present invention.
FIG. 6 illustrates an example in which the MAC address and the VID included in a packet header are changed according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating the Linux network stack, for explaining where a load balancer node according to an embodiment of the present invention is executed.
FIG. 8 shows an example of a hash table 801 according to an embodiment of the present invention.
FIG. 9 shows the relationship between the LBN core 905 in the Linux kernel and the LBN controller 910 in user space 707 according to an embodiment of the present invention.
FIG. 10 shows the experimental environment of the experiment.
FIGS. 11 to 13 show the throughput of each scenario when traffic consisting only of frames of a specific size is transmitted at the full rate of a 25G NIC.
FIG. 14 shows throughput results in BPS (at the L2 and L1 levels) for IMIX traffic.
FIG. 15 shows throughput results in PPS for IMIX traffic.

Hereinafter, the embodiments disclosed in this specification are described in detail with reference to the accompanying drawings. The same or similar elements are given the same reference numerals regardless of the figure in which they appear, and redundant descriptions of them are omitted. The suffixes "module" and "unit" used for elements in the following description are assigned or used interchangeably only for ease of drafting the specification, and do not in themselves have distinct meanings or roles. In addition, in describing the embodiments disclosed in this specification, a detailed description of related known technology is omitted where it is determined that it would obscure the gist of the embodiments. The accompanying drawings are intended only to facilitate understanding of the embodiments disclosed in this specification; the technical idea disclosed in this specification is not limited by the accompanying drawings, and should be understood to include all changes, equivalents, and substitutes falling within the spirit and technical scope of the present invention.

Terms including ordinal numbers, such as first and second, may be used to describe various elements, but the elements are not limited by these terms. The terms are used only to distinguish one element from another.

When an element is referred to as being "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, but it should be understood that intervening elements may be present. On the other hand, when an element is referred to as being "directly connected" or "directly coupled" to another element, it should be understood that no intervening element is present.

Singular expressions include plural expressions unless the context clearly dictates otherwise.

In this application, terms such as "comprise" or "have" are intended to specify the presence of the features, numbers, steps, operations, elements, parts, or combinations thereof described in the specification, and should be understood not to preclude the presence or possible addition of one or more other features, numbers, steps, operations, elements, parts, or combinations thereof.

The present invention relates to a virtualized computing infrastructure in a cloud-native environment, and more specifically to configuring, as a virtual execution element, a load balancer for virtual execution elements (for example, virtual machines (VMs) or containers) deployed in a virtualized computing infrastructure within a network.

Containerization refers to a virtualization method based on operating-system-level virtualization. Containers are lightweight, portable execution elements for applications that are isolated from one another and from the host.

Because containers are not tightly coupled to the host hardware computing environment, an application can be bound to a container image and run as a single lightweight package on any host or virtual host that supports the underlying container architecture. Because of this characteristic, containers can solve various problems that arise when running software in different computing environments.

Furthermore, because a container is an environment that virtualizes the operating system, it is lightweight relative to a virtual machine. Under the same conditions, more container instances than virtual machines can be supported, and containers can be created and removed quickly. Containers, which are often short-lived, can be created and moved more efficiently than virtual machines. Containers can also be managed as groups of logically related elements. In some orchestration platforms, for example Kubernetes, these elements are called "pods", and a group of such pods is called a "cluster".

The terms pod and cluster are used hereinafter in the present invention, but they merely refer to elements and groups of elements, and the present invention is not limited to the Kubernetes platform.

An embodiment of the present invention proposes an installable cloud architecture in which clusters and their roles and relationships are defined. It also proposes a cloud architecture that can provide a containerized, high-performance load balancer that can be easily managed through an orchestration tool such as Kubernetes and that can distribute traffic within the Linux kernel using eBPF (extended Berkeley Packet Filter)/XDP (eXpress Data Path).

In a Kubernetes-based cloud, applications in the form of containers or virtual machines can be easily created and removed to handle incoming traffic. Accordingly, a data center needs a load balancer that distributes traffic across multiple containers or virtual machines that have a virtual IP (VIP). In particular, the load balancer must be able to forward incoming traffic to multiple containers and virtual machines whose number and location change frequently.

In addition, the load balancer must be able to scale dynamically according to the network environment and must be able to deliver traffic reliably under any circumstances.

To meet these requirements, the present invention proposes deploying and managing the load balancer in the form of a container that can be deployed and managed through an orchestration tool (for example, Kubernetes). Using a container-based load balancer in this way has the advantage that cloud resources can be appropriately deployed and managed according to user requests.

In addition, it is proposed that the load balancer according to an embodiment of the present invention operate in a Layer 2 (L2) direct server return (DSR) mode, in which the destination MAC (Media Access Control) address and the VLAN ID are rewritten. When operating in this L2DSR mode, the server responds directly to the client without passing through the load balancer, which increases the response speed and at the same time reduces the load on the load balancer. Furthermore, to support multi-tenancy, the VLAN ID is rewritten so that the load balancing service is performed for the corresponding tenant. Hereinafter, the VLAN ID is used in the present invention as the identifier of a specific tenant, but this is merely a term for distinguishing tenants, and the present invention is not limited to VLANs.
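The header rewrite performed in L2DSR mode can be illustrated with the following sketch, which operates on a raw 802.1Q-tagged Ethernet frame held in a `bytes` object. It is an illustration only: the function name and parameters are hypothetical, and the specification implements the data path with eBPF/XDP in the kernel rather than in Python. Only the destination MAC address and the 12-bit VLAN ID are changed, so the IP header still carries the VIP as the destination and the selected server can reply to the client directly.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an IEEE 802.1Q VLAN tag


def rewrite_l2dsr(frame: bytes, backend_mac: bytes, backend_vid: int) -> bytes:
    """Return a copy of `frame` with destination MAC and VLAN ID rewritten.

    Assumed layout: dst MAC (6) | src MAC (6) | TPID (2) | TCI (2) | EtherType (2) | payload.
    """
    if len(frame) < 18:
        raise ValueError("frame too short for a VLAN-tagged Ethernet header")
    if len(backend_mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    if not 0 <= backend_vid <= 0x0FFF:
        raise ValueError("VLAN ID must fit in 12 bits")

    tpid, tci = struct.unpack_from("!HH", frame, 12)
    if tpid != TPID_8021Q:
        raise ValueError("frame is not 802.1Q tagged")

    # Keep the priority (PCP) and DEI bits; replace only the 12-bit VID.
    new_tci = (tci & 0xF000) | backend_vid

    # Source MAC, EtherType, and the whole L3 payload are left untouched.
    return backend_mac + frame[6:12] + struct.pack("!HH", tpid, new_tci) + frame[16:]
```

Because the payload is copied through unchanged, no IP or TCP checksum needs to be recomputed in this sketch.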

The present invention proposes a policy for deploying containerized load balancers (APIs for creation and deletion, status monitoring, and the like) and for setting packet matching rules in the load balancer. It also proposes a load balancer that can be expected to deliver sufficient performance when running on the Linux operating system.

In the present invention, the cloud components (pods) required for IaaS (Infrastructure-as-a-Service) are defined, and the components are then classified into clusters according to their characteristics (or functions). These clusters are described with reference to FIGS. 1 and 2 together.

FIG. 1 is a diagram showing the structure of a load balancer management system 100 according to an embodiment of the present invention. FIG. 2 is a conceptual diagram in which the load balancer management system 100 according to an embodiment of the present invention is connected to a switch fabric.

The following description refers to FIGS. 1 and 2 together.

The load balancer management system 100 according to an embodiment of the present invention may be configured to include a service cluster 110, an LB (load balancer) cluster 120, and an LB control cluster 130. However, additional clusters may be configured besides those mentioned. The components shown in FIG. 1 are not essential to implementing the load balancer management system 100, so the load balancer management system 100 described in this specification may have more or fewer components than those listed above.

An embodiment of the present invention proposes a method for creating, on bare metal servers, a cloud-based service that is stable and can be scaled incrementally, based on the clusters defined (configured) as shown in FIG. 1. In an embodiment of the present invention, the creation (deployment) of load balancers and of load balancing rules is automated based on a custom resource definition (CRD). For example, the CRD may be a custom object provided by the Kubernetes platform API. In addition, the load balancer management system 100 according to an embodiment of the present invention may implement a load balancer using eBPF (extended Berkeley Packet Filter)/XDP (eXpress Data Path) to ensure sufficient load balancing performance for the cloud.

A pod according to an embodiment of the present invention may refer to the smallest deployable computing unit that can be created and managed by the load balancer management system 100. A pod may consist of a group of one or more containers, and this group may share storage and network resources as well as a specification of how the containers are run. The contents of a pod may be co-located and co-scheduled.

In an embodiment of the present invention, virtual execution elements may be deployed to a virtualization environment using a cluster-based framework. A cluster may be installed on one or more nodes. Each node on which the cluster is installed is classified, according to its role, as a master node (or master) or a worker node (or worker). The master node controls the entire cluster.

The pods installed on a master node generally include an API server that exposes the functions of the cluster as an API, a database that stores cluster data, a scheduler that determines which node each pod is assigned to, a controller that manages and allocates cluster resources, and a DNS (Domain Name System) server. However, the pods installed on a master node are not limited to the functions listed.

The pods installed on a worker node generally include an agent that executes commands delivered from the API server of the master node, a proxy that manages the network between pods, and a runtime that actually runs the pods. However, the pods installed on a worker node are not limited to the functions listed.

The service cluster 110 is the component that performs the actual IaaS service. The service master pod 112 of the service cluster 110 belongs to a master node, and the service agent pod 111 and the service API pod 113 belong to worker nodes. Although only two service groups are shown in the drawing, a larger number of service groups may obviously be provided.

The service API pod 113 running on a worker node is a pod that runs OpenStack API servers such as "Nova", "Neutron", "Keystone", "Cinder", "Glance", "Horizon", and "Octavia". Using the OpenStack APIs mentioned above is only one example, and they may obviously be replaced by other APIs that perform the same functions. For example, "Neutron" is an API for networking control, and another networking control API may obviously be applied in its place.

The first and second service groups 200-1 and 200-2 shown in FIG. 2 include virtual machines 202-1 to 202-8 created by users.

Among the worker nodes constituting the service cluster 110, service agent pods 111 such as 'DHCP-agent', 'ovs(OpenVSwitch)-agent', and 'nova-compute' are installed on each of the servers 201-1 to 201-4 connected to the leaf switches 210-1 and 210-2. The installed service agent pod(s) 111 allow the virtual machines 202-1 to 202-4 to be installed on the servers 201-1 to 201-4 and to use resources such as computing and network. That is, the service agent pod(s) 111 allow each of the virtual machines 202-1 to 202-8 to be controlled through the service API.

The LB cluster 120 is the component that performs load balancing. The LB master pod 122 belongs to a master node, and the LB node pod 121 belongs to a worker node.

The LB node pod 121 serves as at least one LB node (LBN, load balancer node) 203-1, 203-2 that performs load balancing. The LB nodes 203-1 and 203-2 according to an embodiment of the present invention may be provided such that only one pod runs on each physical machine (server). This is so that the server can be used as a dedicated load balancing server whose resources (CPU, memory) are devoted solely to load balancing. Hereinafter, a single LB node is referred to as the LB node 203.

As shown in FIG. 2, the LB node pod 121 includes two LB nodes 203-1 and 203-2, but it is not limited thereto. In particular, as described later, the number of LB nodes 203-1 and 203-2 may be increased or decreased in synchronization with the custom resources (CRs) stored in the database of the LB master pod 122.
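The way in which the set of LB nodes follows the stored custom resources can be sketched as a comparison between a desired state (the CRs in the database) and the observed state (the LB nodes currently running). The function below is a minimal, hypothetical illustration: the dictionary representation, the node names, and the action labels are assumptions, and the specification does not prescribe this particular algorithm.

```python
from typing import Dict, List, Tuple

Spec = Dict[str, str]  # hypothetical parameter map of one LB node


def plan_reconcile(desired: Dict[str, Spec], running: Dict[str, Spec]) -> List[Tuple[str, str]]:
    """Compute the create/change/remove actions that make `running` match `desired`.

    `desired` stands for the CRs stored in the database and `running` for the
    LB nodes that currently exist; both are keyed by LB node name.
    """
    actions: List[Tuple[str, str]] = []
    for name in sorted(desired):
        if name not in running:
            actions.append(("create", name))   # CR exists, node does not
        elif running[name] != desired[name]:
            actions.append(("change", name))   # node exists but its parameters drifted
    for name in sorted(running):
        if name not in desired:
            actions.append(("remove", name))   # node exists, CR was deleted
    return actions
```

An operator built along these lines would run the comparison whenever a CR is added, modified, or deleted, and then carry out the returned actions; adding a CR therefore increases the number of LB nodes and deleting one decreases it.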

The basic pod(s) for operating the LB cluster 120 are installed in the LB master pod 122.

The LB control cluster 130 is the component that controls the LB cluster 120. The LB control master pod 132 belongs to a master node, and the LB control node pod 131 belongs to a worker node.

The basic pod(s) for operating the LB control cluster 130 are installed in the LB control master pod 132.

The LB control node pod 131 may include an API server and an operator that control the LB nodes 203-1 and 203-2. The API server and the operator are described in detail below with reference to FIG. 3.

To implement a load balancer that supports multiple tenants, an embodiment of the present invention proposes logically partitioning the networks of different tenants by VLAN ID (VID). According to this embodiment, even if VMs of different tenants have the same IP address, traffic belonging to different networks can be distinguished by VID, so multi-tenancy can be supported. Although the present invention uses a VLAN ID as the identifier of a specific tenant, the term merely denotes a tenant identifier, and the invention is not limited to VLANs. For example, in the case of VXLAN, the use of a VNI (Virtual Network Identifier, also called VXLAN Segment ID) instead of a VID also falls within the present invention.
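The tenant separation described above can be illustrated with a minimal sketch in which forwarding entries are keyed by the pair (VID, IP) rather than by IP alone. The table contents, the VID values, and the name `resolve` are hypothetical and serve only as an illustration; they are not taken from the embodiment.

```python
from typing import Dict, Tuple

# Forwarding entries are keyed by (segment ID, IP), not by IP alone, so two
# tenants may reuse the same private address without colliding.
# All VIDs, IPs, and VM names below are made-up illustration values.
ForwardingKey = Tuple[int, str]

forwarding_table: Dict[ForwardingKey, str] = {
    (100, "10.0.0.5"): "tenant-a-vm",
    (200, "10.0.0.5"): "tenant-b-vm",  # same IP, different tenant
}


def resolve(vid: int, ip: str) -> str:
    """Return the VM that owns `ip` inside the network identified by `vid`."""
    return forwarding_table[(vid, ip)]
```

With a VNI in place of the VID, the same keying scheme would apply unchanged.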

Referring to FIG. 2, the first to fourth servers 201-1 to 201-4, which are the computing nodes on which the first to eighth VMs 202-1 to 202-8 run, are connected to the leaf switches 210-1 and 210-2, which are the top-of-rack (TOR) switches of the two-tier leaf-spine fabric 230. Although each leaf switch is shown with only two servers connected, the number of servers is not limited thereto and may obviously vary.

The fabric configuration shown in FIG. 2 is described using VXLAN (Virtual Extensible LAN)-EVPN (Ethernet VPN) as an example, but it will be apparent that various other fabric configurations may be included in the present invention.

Each of the servers 201-1 to 201-4 may host at least one of the virtual machines 202-1 to 202-8. Although FIG. 2 shows two virtual machines 202-1 to 202-8 on each of the servers 201-1 to 201-4, the invention is not limited thereto.

In each server (physical machine) 201-1 to 201-4 constituting the load balancer management system 100 according to an embodiment of the present invention, one port (eth0) is connected to the first or second leaf switch 210-1, 210-2, which are the TOR switches of the service network 220.

User traffic received from the network 220 travels through the two-tier leaf-spine fabric 230, which consists of the leaf switches 210-1 and 210-2, the border leaf switch 212, and the spine switches 211-1 and 211-2. The external router is omitted from FIG. 2.

According to an embodiment of the present invention, when a packet generated by one of the first to eighth VMs 202-1 to 202-8 on the computing nodes leaves its node, a VLAN header carrying the VID of the corresponding network may be added to the original packet. Because the leaf-spine fabric 230 is configured as VXLAN (Virtual Extensible LAN)-EVPN (Ethernet VPN), also written EVPN-VXLAN, a VXLAN header may be added to the original packet in place of the VLAN header. Each of the leaf switches 210-1 and 210-2 has a VTEP (VXLAN Tunnel End Point).

VXLAN (Virtual Extensible LAN) is an encapsulation protocol that uses tunneling to provide data center connectivity, extending Layer 2 connections over an underlying Layer 3 network. A VTEP is the entity that encapsulates and decapsulates packets and is the start or end point of a VXLAN tunnel.
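As a rough illustration of what a VTEP does, the sketch below builds and strips the 8-byte VXLAN header laid out in RFC 7348 (an 8-bit flags field with the VNI-valid bit, 24 reserved bits, a 24-bit VNI, and 8 reserved bits). The outer Ethernet, IP, and UDP headers are omitted, and the function names are hypothetical.

```python
import struct
from typing import Tuple

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag defined in RFC 7348


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits).
    # Word 2: VNI (24 bits) + reserved (8 bits).
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prepend the VXLAN header (outer Ethernet/IP/UDP headers omitted)."""
    return vxlan_header(vni) + inner_frame


def decapsulate(packet: bytes) -> Tuple[int, bytes]:
    """Return (vni, inner_frame) from a VXLAN payload."""
    flags_word, vni_word = struct.unpack("!II", packet[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return vni_word >> 8, packet[8:]
```

A real VTEP would additionally wrap this payload in UDP and IP headers addressed to the remote VTEP.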

Before the LB nodes 203-1 and 203-2 can perform load balancing, the following preparation is required. Each server connected to the border leaf switch 212 is labeled as being reserved for use as an LB node 203-1, 203-2. Each labeled server is joined to the LB cluster 120, and an LB node 203-1, 203-2 is installed on each server in the form of a pod.

Each LB node 203-1, 203-2 installed in this way is assigned to one of the LB node groups. Three LB nodes are grouped together and serve the same VIP. For reference, when an LB node creation request arrives, three LB nodes are created by default. Although three LB nodes form one group in this embodiment, the number is not limited thereto and may obviously change depending on network conditions.

The external router may distribute traffic evenly across the three LB nodes using ECMP (Equal-cost multi-path routing). The use of ECMP is only one example, and the present invention is not limited thereto. Because each LB node and the external router are configured to establish a BGP peering, inbound traffic can be routed from the external router to each LB node.
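The ECMP distribution mentioned above can be sketched as hashing a flow's 5-tuple and taking the result modulo the number of equal-cost next hops. Actual routers use vendor-specific hash functions, so the SHA-256 hash and the name `ecmp_next_hop` below are assumptions made purely for illustration.

```python
import hashlib
from typing import List, Tuple

# (source IP, source port, destination IP, destination port, protocol)
FiveTuple = Tuple[str, int, str, int, str]


def ecmp_next_hop(flow: FiveTuple, next_hops: List[str]) -> str:
    """Pick one of several equal-cost next hops from a hash of the flow."""
    digest = hashlib.sha256(repr(flow).encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]
```

Because the choice depends only on the flow, all packets of one connection reach the same LB node, while different connections spread across the group.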

For convenience, this detailed description refers to a single LB node 203-1, 203-2, but each such reference may denote a group of LB nodes.

The operation of the LB nodes 203-1 and 203-2 according to an embodiment of the present invention is described below with reference to FIGS. 3 and 4.

FIG. 3 illustrates the concept of the control sequence for creating an additional LB node 203-3 according to an embodiment of the present invention. FIG. 4 is a control flowchart for creating the additional LB node 203-3 according to an embodiment of the present invention.

The service API pod 113 runs the LB creation unit 311, an API for creating and controlling LB nodes. As an example, the LB creation unit 311 may provide this API based on the 'Octavia API' among the APIs provided by OpenStack. Creation of an LB node begins when the LB creation unit 311 receives an LB creation request from a user (S301).

Upon receiving the request of S301, the LB creation unit 311 sends the 'network management unit 312' of OpenStack a request to create the port to be used by the additional LB node 203-3 (S302). As an example, the network management unit 312 may provide an API for network control based on the 'Neutron API' among the APIs provided by OpenStack.

The network management unit 312 creates a port in response to the port creation request of S302 (S303). In its response, the network management unit 312 returns to the LB creation unit 311 the information on the created port together with the VIP to be used by the additional LB node 203-3 (S304).

When the driver 313 included in the LB creation unit 311 receives the port information, it forwards an LB creation request, together with the created port information including the VIP, to the API server 301 of the LB control node pod 131 (S305).

The CR management unit 302 included in the API server 301 creates, modifies, and deletes custom resources (CRs) based on a CRD predefined for LB nodes. Upon receiving the request of S305, the CR management unit 302 creates a third CR 350-3 based on the port information and VIP information contained in the request (S306) and stores it in the database 303 (S307).

The database 303 may already hold the first and second CRs 350-1 and 350-2, which correspond to the existing first and second LB nodes 203-1 and 203-2, respectively. The third CR 350-3 created by the CR management unit 302 may then be stored in addition to them. The database 303 according to an embodiment of the present invention may be etcd, the key-value store of the Kubernetes platform.
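Steps S306 and S307 can be pictured with the following minimal sketch. The source does not disclose the CRD schema, so the `apiVersion`, `kind`, and every `spec` field below are hypothetical, and the `CRStore` class is only an in-memory stand-in for the database 303 (etcd in the embodiment).

```python
from typing import Any, Dict


def build_lb_custom_resource(name: str, vip: str, port_id: str,
                             replicas: int = 3) -> Dict[str, Any]:
    """Assemble a CR from the VIP and port information of request S305."""
    return {
        "apiVersion": "lb.example.com/v1",  # hypothetical API group
        "kind": "LoadBalancerNode",         # hypothetical kind
        "metadata": {"name": name},
        # Hypothetical parameters defining the load balancer.
        "spec": {"vip": vip, "portId": port_id, "replicas": replicas},
    }


class CRStore:
    """In-memory stand-in for database 303."""

    def __init__(self) -> None:
        self._items: Dict[str, Dict[str, Any]] = {}

    def save(self, cr: Dict[str, Any]) -> None:
        self._items[cr["metadata"]["name"]] = cr

    def delete(self, name: str) -> None:
        self._items.pop(name, None)

    def items_by_name(self) -> Dict[str, Dict[str, Any]]:
        return dict(self._items)
```

The default of three replicas mirrors the statement that three LB nodes are created per request by default.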

The operator 304 monitors the CRs stored in the database 303 (S308).

The operator 304 monitors the CRs stored in the database 303 and, according to whether a CR has been created, modified, or deleted, performs at least one of creating, changing, and removing the corresponding LB node. This continuous work by the operator 304 of creating, removing, and changing the LB nodes 203-1 to 203-3 so that they correspond to the CRs stored in the database 303 is called "Reconciling".

When the monitoring of S308 reveals the third CR 350-3 stored in the database 303, the operator 304 creates (or requests the creation of) the third LB node 203-3 (S309).
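The reconciling behavior of the operator 304 can be sketched as a comparison between the desired state (the stored CRs) and the actual state (the running LB nodes). This is a simplified, assumed model: the dictionaries and the function name `reconcile` are hypothetical, and a real operator would react to watch events from the Kubernetes API rather than compare plain dictionaries.

```python
from typing import Dict, List, Tuple


def reconcile(desired: Dict[str, dict],
              actual: Dict[str, dict]) -> List[Tuple[str, str]]:
    """Make the `actual` LB nodes match the `desired` CRs.

    Mutates `actual` in place and returns the list of actions taken.
    """
    actions: List[Tuple[str, str]] = []
    for name, spec in desired.items():
        if name not in actual:
            actual[name] = dict(spec)       # CR exists, node missing: create
            actions.append(("create", name))
        elif actual[name] != spec:
            actual[name] = dict(spec)       # CR modified: change the node
            actions.append(("update", name))
    for name in list(actual):
        if name not in desired:
            del actual[name]                # CR deleted: remove the node
            actions.append(("delete", name))
    return actions
```

For instance, with CRs for three nodes stored and only two nodes running, one call yields a single create action for the third node, and a second call yields no actions because the states already match.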

The control sequence for processing inbound traffic is described below with reference to FIG. 5.

FIG. 5 illustrates the control sequence in which the LB node 203 according to an embodiment of the present invention processes inbound traffic. The sequence of FIG. 5 is described together with FIG. 2. In the example scenario of this flowchart, inbound traffic is directed to the first LB node 203-1 and is delivered to the fifth VM 202-5.

Inbound traffic addressed to the VIP of a specific LB node 203 reaches that LB node 203 via the external router (not shown) and the border leaf switch 212. In the scenario, the first LB node 203-1 receives a packet of inbound traffic addressed to its own VIP (S501).

The first LB node 203-1 hashes the header information contained in the received packet (S502). Based on the hashing result, the first LB node 203-1 obtains the MAC address and VID to be written into the packet (S503). The specific method of obtaining the MAC address and VID is described later with reference to FIGS. 8 and 9.

The first LB node 203-1 then replaces the MAC address and VID contained in the packet header with the MAC address and VID obtained in step S503 (S504).

FIG. 6 illustrates an example in which the MAC address and VID contained in a packet header are changed according to an embodiment of the present invention. FIG. 6 (a) shows the header information 601-1 before the change, and FIG. 6 (b) shows the header information 601-2 after the change.

The pre-change header information 601-1 contains the pre-change VID 602-1, the pre-change source MAC 603-1, and the pre-change destination MAC 604-1. In step S503 described above, the first LB node 203-1 hashes the pre-change header information 601-1 to obtain the MAC address and VID to be written.

First, the first LB node 203-1 replaces the pre-change source MAC 603-1 with the pre-change destination MAC 604-1, because the destination before the change becomes the new source after the change. The first LB node 203-1 then replaces the pre-change destination MAC 604-1 with the MAC address obtained in step S503, yielding the post-change destination MAC 604-2. In the scenario, the post-change destination MAC 604-2 is therefore the MAC of the fifth VM 202-5. Likewise, the first LB node 203-1 replaces the pre-change VID 602-1 with the VID obtained in step S503, yielding the post-change VID 602-2.
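The header rewrite of FIG. 6 can be expressed as a small pure function. The `L2Header` type and the function name are hypothetical; only the three fields named in the description (VID, source MAC, destination MAC) are modeled.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class L2Header:
    vid: int
    src_mac: str
    dst_mac: str


def rewrite_header(before: L2Header, target_mac: str,
                   target_vid: int) -> L2Header:
    """Apply the rewrite of step S504.

    The old destination MAC becomes the new source MAC, and the MAC and
    VID obtained in step S503 become the new destination MAC and VID.
    """
    return replace(before,
                   vid=target_vid,
                   src_mac=before.dst_mac,
                   dst_mac=target_mac)
```

Only Layer 2 fields change, so the IP header, including the VIP as destination address, reaches the VM unmodified.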

Returning to FIG. 5, the first LB node 203-1 sends the packet with the replaced MAC address and VID back to the border leaf switch 212 (S504).

The border leaf switch 212 encapsulates the packet returned from the first LB node 203-1 (S505). As an example, the border leaf switch 212 may encapsulate the packet with VXLAN. Based on the post-change destination MAC 604-2 contained in the header of the returned packet, the border leaf switch 212 then forwards the packet to the second leaf switch 210-2, which is the TOR switch to which the post-change destination MAC 604-2 is attached (S506).

The second leaf switch 210-2 decapsulates the received packet by removing its VXLAN header and delivers the packet to the fifth VM 202-5, which corresponds to the post-change destination MAC 604-2 (S507).

The fifth VM 202-5 checks whether its loopback IP corresponds to the first LB node 203-1 and, if so, accepts the delivered packet (S508). If not, the fifth VM 202-5 may reject the delivered packet.

After processing the accepted packet, the fifth VM 202-5 can respond to the sender by swapping the source MAC and the destination MAC (S509). That is, the fifth VM 202-5 swaps the source MAC and the destination MAC and then sends the packet back to the second leaf switch 210-2 (S510).

The second leaf switch 210-2 encapsulates the returned packet (for example, with VXLAN) (S511), and the packet is forwarded to the border leaf switch 212 via the second spine switch 211-2 (S512). The border leaf switch 212 decapsulates the packet (S513) and sends it out to the network 220 through the external router (S514).

As described above, the return path of the packet according to an embodiment of the present invention leads out to the network 220 without passing through the LB node. This scheme is called DSR (Direct Server Return). With DSR, packets do not pass through the LB node when leaving the data center, which has the advantage of reducing the load on the LB node.

Furthermore, the load balancer management system 100 according to an embodiment of the present invention proposes a method that allows a load balancer implemented in software to reach speeds that do not fall behind those of a hardware implementation.

In an embodiment of the present invention, the first important consideration is where in Linux the load balancer is implemented, and the second is how it is implemented. The location of the load balancer is described with reference to FIG. 7.

FIG. 7 is a diagram of the Linux network stack, illustrating the location at which a load balancer node according to an embodiment of the present invention runs.

Viewed from the lowest level to the highest, the Linux network stack generally consists of the NIC (Network Interface Controller) 701, the device driver 702, the traffic controller 703, Netfilter 704, the TCP stack 705, the socket layer 706, and userspace 707. The closer a layer is to userspace 707 (the higher in the stack), the more functions (or code) can be executed and the more state information is available. Conversely, the closer a layer is to the NIC 701 (the lower in the stack), the fewer functions (or code) are available, but the higher the expected performance. That is, when Linux handles packets, performance improves as the data path that processes the packets runs at a lower layer.

In particular, passing a packet from the device driver 702 up to the network stack not only incurs memory copies that hurt performance but also makes it unavoidable to allocate an sk_buff structure for the packet.

Accordingly, the load balancer management system 100 according to an embodiment of the present invention proposes performing load balancing at a relatively low level of the Linux network stack. Specifically, an embodiment of the present invention proposes improving performance by implementing the forwarding plane, which processes packets in the kernel, with eXpress Data Path (XDP) and the eBPF (Extended Berkeley Packet Filter) virtual machine.

The load balancer management system 100 according to an embodiment of the present invention uses eBPF/XDP to process packets at the device driver 702 layer, before they are passed to the kernel's networking layers. BPF lets a program inserted into the kernel from userspace 707 run in a lightweight virtual machine. Such programs are attached to specific hooks in the kernel. eBPF is an extended version of classic BPF that additionally supports an extended set of registers and instructions, maps (key/value stores with no size limit), and so on. XDP is one type of eBPF hook and refers to a set of instructions that operate within the device driver 702.

FIG. 8 shows an example of the hash table 801 according to an embodiment of the present invention. FIG. 9 shows the relationship between the LBN core 905 in the Linux kernel and the LBN controller 910 in userspace 707 according to an embodiment of the present invention.

The control unit 902 of the LBN controller 910 inserts a program into the LBN core 905 and updates the second hash table storage 903 used by the LBN core 905.

The first hash table storage 901 included in the LBN controller 910 is the storage into which the rules stored in the second hash table storage 903 of the LB node 203 are loaded. How the first and second hash table storages 901 and 903 are updated is described later.

The LBN core 905 controls the overall operation of the LB node 203.

The second hash table storage 903 stores the rules by which the LB node 203 performs load balancing. For example, it stores the rules that determine to which virtual machine a received packet of inbound traffic is forwarded. These rules may be stored in the form of the hash table 801 shown in the example of FIG. 8.

The second hash table storage 903 according to an embodiment of the present invention may be implemented as a BPF map, a key/value store residing in the kernel. The second hash table storage 903 can be updated only through the LBN controller 910.

To add or change a load-balancing rule, the LBN controller 910 first loads the rules stored in the second hash table storage 903 (in hash table form, as in the example of FIG. 8) into the first hash table storage 901. The LBN controller 910 then applies the changes to the rules loaded in the first hash table storage 901. Finally, the LBN controller 910 overwrites the second hash table storage 903 with the updated rules, thereby adding or changing the rule. To this end, the LBN controller 910 according to an embodiment of the present invention may keep the rules stored in the first and second hash table storages 901 and 903 identical (in sync).
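The load, update, and overwrite sequence above can be sketched as follows. A plain dictionary stands in for the in-kernel BPF map (second hash table storage 903); the class and method names are hypothetical, and a real controller would go through BPF map system calls instead.

```python
from typing import Dict, Optional


class LbnController:
    """Userspace controller keeping a local copy of the kernel rule map."""

    def __init__(self, kernel_map: Dict[str, str]) -> None:
        self.kernel_map = kernel_map            # stand-in for storage 903
        self.local_copy: Dict[str, str] = {}    # stand-in for storage 901

    def apply_changes(self, changes: Dict[str, Optional[str]]) -> None:
        """Apply rule changes; a value of None deletes the rule."""
        # 1. Load the current rules from the kernel-side storage.
        self.local_copy = dict(self.kernel_map)
        # 2. Apply the changes to the local copy only.
        for key, value in changes.items():
            if value is None:
                self.local_copy.pop(key, None)
            else:
                self.local_copy[key] = value
        # 3. Overwrite the kernel-side storage with the updated rules,
        #    leaving both storages in sync.
        self.kernel_map.clear()
        self.kernel_map.update(self.local_copy)
```

Editing a local copy first means the kernel-side rules are only touched once the full updated rule set is ready.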

The following describes how the rules stored in the second hash table storage 903 are applied.

When a packet arrives at the LB node 203, as in step S501 of FIG. 5, the LBN core 905 of the receiving LB node 203 executes two hash functions to obtain the target information (destination MAC and VID).

Referring to FIG. 8 (a) and (b), the hash table according to an embodiment of the present invention may comprise a first hash table 801 and a second hash table 802. The first hash table 801 is used, together with the first hash function described below, to find the location at which the second hash table 802 is stored. The second hash table 802 is used, together with the second hash function, to find the final virtual machine (VM) to which the packet is to be delivered.

Referring to the second hash table 802 shown in FIG. 8, a separate table may exist for each VIP. Hash table A 802A may be provided for the first VIP (VIP 1), hash table B 802B for the second VIP (VIP 2), and hash table C 802C for the third VIP (VIP 3). In the illustrated example, hash tables A to C 802A to 802C are separated as rows of a single hash table 801, but this is only one example, and they may instead be stored as columns or as separate tables.

The separation by VIP in the second hash table 802 may correspond to a separation by service. For example, a first service may be assigned the first VIP and a second service the second VIP. That is, when several services are offered in one data center, an individual VIP can be assigned to each service, and a hash table A to C 802A to 802C corresponding to each assigned VIP may be provided.

Each entry of the second hash table 802 contains the destination MAC information of a target virtual machine (VM) to which traffic is to be delivered. For example, the illustrated hash table A 802A contains 'MAC A', the MAC address of virtual machine A, 'MAC B', the MAC address of virtual machine B, and 'MAC C', the MAC address of virtual machine C. Likewise, hash table B 802B contains 'MAC D' and 'MAC E', and hash table C 802C contains 'MAC F' and 'MAC G'.

First, the LBN core 905 of the LB node 203 that received the packet applies the first hash function to the first hash table 801 to determine the memory address of the corresponding VIP (one of 0, 1024, and 2048), and selects one of hash tables A to C 802A to 802C based on that memory address.

The parameters of the first hash function include at least one of the destination IP (VIP), the destination port, the protocol, the VID, and the segment type. In one example the segment type is set to VLAN, but it is not necessarily limited thereto, and the segment type may be defined so that a protocol other than VLAN can be used. Equation 1 is an example of the first hash function.

After one of hash tables A to C 802A to 802C has been selected, the LBN core 905 applies the second hash function. The parameters of the second hash function include at least one of the 5-tuple (source IP, source port, destination IP, destination port, protocol) and the VID of the packet header. The second hash function yields the MAC address and VID of the final virtual machine (VM) to which the packet is to be delivered (S503 of FIG. 5 described above). In the example, all entries of a given VIP hash table share the same VID, but virtual machines in the same service group may also have different VIDs. Equation 2 is an example of the second hash function.
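The two-stage lookup described above can be sketched as follows. This is not a reproduction of Equations 1 and 2: the SHA-256-based helper `_hash`, the use of a table identifier in place of a memory address, and all function names are assumptions chosen only to show how the first hash selects a per-VIP table and the second hash selects an entry within it.

```python
import hashlib
from typing import Dict, List, Tuple

Target = Tuple[str, int]  # (destination MAC, VID)
ServiceKey = Tuple[str, int, str, int, str]  # dst IP, dst port, proto, VID, segment type


def _hash(*fields: object) -> int:
    """Deterministic stand-in hash over the given fields."""
    data = "|".join(str(f) for f in fields).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def build_first_table(services: Dict[ServiceKey, str]) -> Dict[int, str]:
    """First table: hash of the service key -> identifier of a per-VIP table."""
    return {_hash(*key): table_id for key, table_id in services.items()}


def lookup(first_table: Dict[int, str],
           second_tables: Dict[str, List[Target]],
           dst_ip: str, dst_port: int, proto: str, vid: int,
           segment_type: str, src_ip: str, src_port: int) -> Target:
    # Stage 1: select the per-VIP table from destination-side fields.
    table_id = first_table[_hash(dst_ip, dst_port, proto, vid, segment_type)]
    slots = second_tables[table_id]
    # Stage 2: select the target VM from the 5-tuple and the VID.
    slot = _hash(src_ip, src_port, dst_ip, dst_port, proto, vid) % len(slots)
    return slots[slot]
```

Since stage 2 depends on the full 5-tuple, packets of one connection always map to the same VM, while different clients of the same VIP spread across the entries of that VIP's table.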

Meanwhile, for the load balancer management system 100 to balance load efficiently, the way entries are arranged in the hash table 801, as in the example of FIG. 8 above, is important. Virtual machines readily join and leave the pool of load-balancing targets, so the entries stored in each hash table 801 are replaced (deleted or added) frequently. Even when information about only one virtual machine is added to or deleted from the hash table 801, the positions of entries for other, existing virtual machines may change, for example to maintain the ratio of entries in the hash table 801.

A change in the hash result that changes the target virtual machine for a given VIP is called "hashing disruption". Hashing disruption can unintentionally terminate unrelated connections, and each connection terminated this way then requires an additional process to re-establish it with another target. An embodiment of the present invention therefore proposes a hash table mechanism that keeps hashing disruption to a minimum.

To this end, an embodiment of the present invention proposes using the Maglev hashing mechanism, an improved version of the consistent hashing algorithm. With Maglev hashing, the positions of existing entries change as little as possible when a new entry is added to the hash table 801 or an existing entry is deleted.

Maglev hashing places each virtual machine's entries in the hash table through two processes, "Permutation" and "Population". Permutation generates a two-dimensional array by deriving, for each entry, a non-repeating order of table positions. Population then assigns table positions to the entries fairly, taking the entries in turn and following each entry's order.
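The two processes can be sketched as below, following the commonly published form of Maglev hashing rather than any code from this document. The function names, the use of SHA-256 as the underlying hash, and the backend labels are assumptions made for illustration; the table size m is assumed to be a prime number so that every row produced by Permutation visits each slot exactly once.

```python
import hashlib


def _h(name: str, seed: str) -> int:
    # Seeded hash built from SHA-256; any two independent hashes would do.
    digest = hashlib.sha256(f"{seed}:{name}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def permutation(backends, m):
    """Permutation: one preference list of all m slots per backend (a 2-D array)."""
    perms = []
    for b in backends:
        offset = _h(b, "offset") % m
        skip = _h(b, "skip") % (m - 1) + 1  # 1..m-1, so the row covers every slot when m is prime
        perms.append([(offset + j * skip) % m for j in range(m)])
    return perms


def populate(backends, m):
    """Population: backends take turns claiming their next preferred free slot."""
    if not backends:
        raise ValueError("at least one backend is required")
    perms = permutation(backends, m)
    next_idx = [0] * len(backends)
    table = [None] * m
    filled = 0
    while filled < m:
        for i, b in enumerate(backends):
            slot = perms[i][next_idx[i]]
            while table[slot] is not None:  # skip slots already taken
                next_idx[i] += 1
                slot = perms[i][next_idx[i]]
            table[slot] = b
            next_idx[i] += 1
            filled += 1
            if filled == m:
                break
    return table
```

Because the backends take strict turns, the number of slots owned by any two backends differs by at most one. Rebuilding the table after a backend is added or removed reuses the same preference lists for the remaining backends, which is what keeps most existing slots in place.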

In an embodiment of the present invention, the number of virtual machine entries supported by the hash table 801 under Maglev hashing may be defined as MAX_VIPS * (MAX_VMS * FR). MAX_VIPS is the number of VIPs supported by one LBN; its default value is 4096, and it can be changed according to network conditions. MAX_VMS is the number of virtual machines that can be handled per VIP. FR is a factor multiplied by MAX_VMS: the larger its value, the less hashing disruption occurs, and it can also be used to assign a weight to each VIP.
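As a small worked example of the sizing formula above, the sketch below evaluates MAX_VIPS * (MAX_VMS * FR). Only the MAX_VIPS default of 4096 comes from the text; the values 256 for MAX_VMS and 4 for FR are hypothetical and chosen purely to show the arithmetic.

```python
MAX_VIPS_DEFAULT = 4096  # default stated in the description


def hash_table_entries(max_vms: int, fr: int, max_vips: int = MAX_VIPS_DEFAULT) -> int:
    """Total VM entries supported: MAX_VIPS * (MAX_VMS * FR)."""
    return max_vips * (max_vms * fr)


# Hypothetical values: 256 VMs per VIP and FR = 4 give 1024 slots per VIP.
slots_per_vip = 256 * 4
total_entries = hash_table_entries(max_vms=256, fr=4)
print(slots_per_vip, total_entries)
```

Raising FR enlarges every per-VIP table, which trades memory for less hashing disruption.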

- Experimental verification of load balancer performance according to an embodiment of the present invention

The performance of an LBN implemented in software can be judged by how close it comes to the packet rate of a direct wired connection. That degree of closeness indicates whether the LB can be used in a real network. If performance is similar to that of a direct wired connection (the "Loopback" item in the experimental results shown in FIGS. 11 to 15), it can be regarded as satisfactory for an LBN.

To this end, the performance of the LBN was measured in this experiment according to the RFC2544 standard, which is widely used to measure the performance of physical equipment such as firewalls and routers. Because the standard specifies the conditions of the performance test, the performance of different companies' products can be compared accurately. In addition to RFC2544, a test called IMIX, which resembles real traffic patterns, was also run to reflect the characteristics of a data center.

IMIX refers to a traffic pattern in which frames of different sizes are mixed in a fixed ratio per frame size. IMIX traffic can therefore be regarded as traffic that imitates the patterns that can occur in a real cloud network.

The following first describes in detail the property values set for the experiment, and then describes the results of the RFC2544 test and the IMIX test in that order.

FIG. 10 shows the experimental environment of this experiment.

A. Experiment environment setup

As defined in RFC2544, the experimental environment was configured so that one switch 1002 is connected to two servers 1001-1 and 1001-2. The tester server 1001-1 acts as the tester, and the DUT server 1001-2 acts as the DUT (device under test). In each test scenario (scenario A and scenario B), the tester server 1001-1 generates packets and sends them to the switch 1002, and the DUT server 1001-2 in turn receives them, so that performance metrics such as throughput can be obtained.

In this experiment, the open-source traffic generator "TRex" was installed on the tester server 1001-1 for performance measurement. "TRex" is a software test instrument that uses Intel's DPDK to achieve low cost and high performance, and it can generate both stateful and stateless traffic. With "TRex", scenarios can be written as Python scripts to run tests that comply with the RFC standard, and tests similar to those of commercial hardware testers can be performed.

Of the CPU cores of the tester server 1001-1, all except the minimum needed to run the tester server 1001-1 itself were assigned to "TRex" packet transmission. On the DUT server 1001-2, an LBN was installed that rewrites the header of each packet received from the switch 1002 with the MAC address and VID of the tester server 1001-1 and forwards the packet back to the switch 1002.

The tester server 1001-1 and the DUT server 1001-2 each used a 2.10 GHz CPU (16 cores in total, 11 MB L3 cache) and four 32 GB RAM modules. Ubuntu 20.04 with kernel 5.4 was installed on each server 1001-1 and 1001-2. The NIC used in each server 1001-1 and 1001-2 is a Mellanox dual-port 25G card. On the tester server 1001-1, one port was configured for RX only and the other for TX only. The DUT server 1001-2, by contrast, has only one port, which performs both RX and TX. To inspect packets including their VLAN headers at the kernel level, the kernel's rxvlan and txvlan offload features were turned off on both servers 1001-1 and 1001-2. On the tester server 1001-1, the MLX5 poll mode driver, which is the driver for using Mellanox NICs with DPDK, was installed.

Iptables, which belongs to the Linux Netfilter project, provides packet filtering and control functions such as NAT. Among these, DNAT can be used for simple load balancing: traffic can be load balanced with multiple DNAT rules that change the destination address of incoming packets. For this reason, Iptables DNAT was selected as the baseline against which to verify the performance of the LBN. In addition, a loopback test was performed by directly connecting the TX port and RX port of the tester server, and its result was taken as the upper bound on load balancing performance.

The LBN load balances each VIP across multiple virtual machines (VMs). Accordingly, the number of VIPs and the number of virtual machines were chosen as the adjustable parameters in this experiment.

Two scenarios for the use of the LBN were defined. The first load balances 1 VIP across 255 virtual machines (scenario A in FIG. 10), and the second load balances 128 VIPs across 4,000 virtual machines (scenario B in FIG. 10). In tests with multiple VIPs, the VIP for the traffic was selected at random from the range of VIPs. The range of VIPs sent by the tester server 1001-1 is the same as the range of VIPs in the rules added to the hash table of the LBN. LBNs belonging to the same group have the same rules. Because this is a test of L2DSR, the destination IP of the traffic is not changed, but the VID and destination MAC address are changed to those of one of the associated real target virtual machines. To make Iptables DNAT perform the same work as the LBN, the rules for the 128 VIPs were applied to Iptables DNAT in the same way. The switch 1002 connecting the tester server 1001-1 and the DUT server 1001-2 acts simply as a bridge that floods incoming packets to the port facing the other connected server. In a typical L2DSR configuration, each switch port connected to an LBN server is set to trunk mode to allow traffic for multiple VLANs, but in the experimental environment the ports were set to access mode so that only a specific VLAN could be sent and received.
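The L2DSR rewrite described above, in which the destination IP stays untouched while the destination MAC address and the VID are replaced, can be illustrated on a raw 802.1Q-tagged Ethernet frame. This is a sketch only: the function name rewrite_l2dsr and the sample addresses are hypothetical, and the actual LBN performs this step inside the kernel data path rather than in Python.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q VLAN tag


def mac_to_bytes(mac: str) -> bytes:
    return bytes.fromhex(mac.replace(":", ""))


def rewrite_l2dsr(frame: bytes, new_dst_mac: str, new_vid: int) -> bytes:
    """Return a copy of the frame with a new destination MAC and VLAN ID.

    Layout assumed: dst MAC (6) | src MAC (6) | TPID (2) | TCI (2) | rest.
    Everything after the VLAN tag, including the IP header, is left as is.
    """
    if len(frame) < 18:
        raise ValueError("frame too short for an 802.1Q header")
    if not 0 <= new_vid < 4096:
        raise ValueError("VID must fit in 12 bits")
    tpid, tci = struct.unpack_from("!HH", frame, 12)
    if tpid != TPID_8021Q:
        raise ValueError("frame is not 802.1Q tagged")
    new_tci = (tci & 0xF000) | new_vid  # keep priority/DEI bits, swap the VID
    return (mac_to_bytes(new_dst_mac) + frame[6:12]
            + struct.pack("!HH", tpid, new_tci) + frame[16:])


# Example: a tagged frame on VLAN 100 is redirected to a VM on VLAN 200.
original = (mac_to_bytes("ff:ff:ff:ff:ff:ff") + mac_to_bytes("02:00:00:00:00:01")
            + struct.pack("!HH", TPID_8021Q, 100) + struct.pack("!H", 0x0800)
            + b"\x00" * 46)
rewritten = rewrite_l2dsr(original, "02:00:00:00:00:aa", 200)
```

Since only the first 6 bytes and the 12-bit VID change, the frame length and the IP payload are identical before and after the rewrite.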

B. Performance testing according to RFC2544

This section describes the settings of the experiments performed according to the RFC2544 standard and presents the throughput (in BPS and PPS).

Seven frame sizes were tested in this experiment: 64, 128, 256, 512, 1024, 1280, and 1510 bytes. For each frame size, the following four types of tests were performed:

1) Loopback

2) Iptables DNAT

3) Scenario A for LBN (#VIP = 1, #Real-VM = 255)

4) Scenario B for LBN (#VIP = 128, #Real-VM = 4,000)

In each test trial, UDP traffic of a given frame size was transmitted for 60 seconds at the maximum rate of the 25G NIC, and the average throughput of three trials was used as the final result. An interval of 30 seconds was left between trials so that they would not interfere with one another.

FIGS. 11 to 13 show the throughput of each scenario when traffic consisting only of frames of a specific size is transmitted at the full rate of the 25G NIC.

Specifically, FIGS. 11 and 12 show throughput in BPS at the L2 and L1 levels, respectively, and FIG. 13 shows throughput in PPS at the L2 level. For every frame size, the loopback result can be regarded as the best performance that the LB can reach.

FIGS. 11 and 12 show that the smaller the frame size, the greater the frame processing overhead and the lower the BPS. The result in FIG. 11 is obtained after removing both the layer 1 and layer 2 headers of each frame, whereas the result in FIG. 12 is obtained after removing only the layer 1 header, so FIG. 12 shows the higher values.
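The relationship between frame size, PPS, and BPS can be made concrete with the theoretical ceilings of a 25 Gbps link. The sketch below assumes the standard Ethernet per-frame overhead on the wire of 20 bytes (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap) and the common traffic-generator convention that L1 rates count this overhead while L2 rates count only the frame bytes. The exact accounting behind FIGS. 11 and 12 is not specified beyond the description above, so these figures are illustrative ceilings, not the measured results.

```python
L1_OVERHEAD_BYTES = 20  # preamble (7) + SFD (1) + inter-frame gap (12)
LINK_BPS = 25_000_000_000  # 25G NIC used in the experiment

FRAME_SIZES = [64, 128, 256, 512, 1024, 1280, 1510]


def line_rate(frame_bytes: int, link_bps: int = LINK_BPS) -> dict:
    """Theoretical maximum rates for back-to-back frames of one size."""
    pps = link_bps / ((frame_bytes + L1_OVERHEAD_BYTES) * 8)
    return {
        "pps": pps,
        "l1_bps": pps * (frame_bytes + L1_OVERHEAD_BYTES) * 8,  # equals the link rate
        "l2_bps": pps * frame_bytes * 8,                        # frame bytes only
    }


for size in FRAME_SIZES:
    r = line_rate(size)
    print(f"{size:>5} B  {r['pps'] / 1e6:8.2f} Mpps  L2 {r['l2_bps'] / 1e9:6.2f} Gbps")
```

The fixed 20-byte overhead weighs most heavily on 64-byte frames, which is one reason the L2 bit rate falls while the packet rate rises as frames get smaller.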

The test results can be interpreted from three perspectives:

1) How close the performance of the LBN is to loopback

2) How large the performance difference between Iptables DNAT and the LBN is

3) How much performance differs when the number of rules applied to the LBN differs

For BPS at the L1 and L2 levels (FIGS. 11 and 12), the difference between loopback and the LBN at a frame size of 64 is less than about 5 to 6 Gbps; that is, the LBN is about 24% below the actual physical maximum. At a frame size of 128 the difference is only about 3%, and at larger frame sizes the performance is almost identical.

Apart from the unavoidable degradation in handling the smallest frame size of 64, these results show that the LBN performs remarkably well. Even in the worst case (frame size of 64), the LBN performs 16 times better than Iptables DNAT. Where the gap is smallest (frame size of 1,510), the LBN still performs about twice as well.

Comparing the two LBN scenarios, which differ in the number of rules, performance in the worst case (frame size of 64) decreases by about 4.2% as the number of rules increases. At the other frame sizes there is almost no difference, so the number of rules does not significantly affect performance.

FIG. 13 shows the results of FIG. 11 in units of PPS. In contrast to BPS, PPS rises as the frame size falls, because more packets are transmitted. At a frame size of 64, the difference between loopback and the LBN is less than about 7 to 8 Mpps. The percentage values for the three perspectives above are the same as those given for FIGS. 11 and 12.

C. Performance tests according to IMIX

FIG. 14 shows throughput in BPS (L2 and L1 levels) for IMIX traffic. FIG. 15 shows throughput in PPS for IMIX traffic.

The per-size ratios used for IMIX have changed over time, but there are commonly used ratios for each frame size. In this experiment, frames of size 64, 512, and 1510 made up 60%, 20%, and 20% of the traffic, respectively. Because the test mixed traffic of various frame sizes, it resembles a real network environment. Referring to FIG. 14, the difference in BPS measured at the L2 level between loopback and the LBN is less than about 0.7 Gbps; that is, the LBN is only about 2% below the theoretical maximum. The LBN also performs 27 times better than Iptables DNAT. Comparing the two LBN scenarios with different numbers of rules shows almost no difference. As in the RFC2544 test, the number of rules does not significantly affect the performance of the LBN.
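The IMIX mix used here can be summarized by its average frame size, as the short calculation below shows. It assumes that the 60/20/20 split is by frame count, which the text does not state explicitly, and it reuses the standard 20-byte Ethernet wire overhead per frame to estimate a theoretical packet rate on a 25 Gbps link. The names IMIX_MIX and imix_average_frame are illustrative only.

```python
# Frame size in bytes -> share of frames in percent (mix stated in the text).
IMIX_MIX = {64: 60, 512: 20, 1510: 20}

L1_OVERHEAD_BYTES = 20     # preamble + SFD + inter-frame gap per frame
LINK_BPS = 25_000_000_000  # 25G NIC used in the experiment


def imix_average_frame(mix: dict) -> float:
    """Weighted average frame size, assuming the shares are per frame count."""
    total = sum(mix.values())
    return sum(size * share for size, share in mix.items()) / total


def imix_line_rate_pps(mix: dict, link_bps: int = LINK_BPS) -> float:
    """Theoretical packets per second when the mix fills the link."""
    return link_bps / ((imix_average_frame(mix) + L1_OVERHEAD_BYTES) * 8)


avg = imix_average_frame(IMIX_MIX)  # (60*64 + 20*512 + 20*1510) / 100
print(f"average frame: {avg:.1f} bytes")
print(f"theoretical rate: {imix_line_rate_pps(IMIX_MIX) / 1e6:.2f} Mpps")
```

With an average frame of a few hundred bytes, the per-frame overhead matters far less than in the 64-byte RFC2544 case, which is consistent with the small gap between loopback and the LBN reported for IMIX.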

An embodiment of the control method of the load balancer management system according to the present invention has been described above, but it is described as at least one embodiment and does not limit the technical idea of the present invention or its configuration and operation; the scope of the technical idea of the present invention is not limited by the drawings or by the description referring to the drawings. In addition, the concepts and embodiments presented herein may be used by those of ordinary skill in the art to which the present invention belongs as a basis for modifying or designing other structures that serve the same purpose. Equivalent structures so modified or changed are bound by the technical scope of the present invention as described in the claims, and various changes, substitutions, and alterations are possible without departing from the spirit or scope of the invention described in the claims.

Claims

A control method of a system that manages a load balancer (LB, Load Balancer) in a cloud-native environment, the method comprising:
creating, by a CR management unit, a custom resource (CR, Custom Resource) based on a load balancer node creation request,
the custom resource including at least one piece of parameter information defining a load balancer;
storing, by the CR management unit, the created custom resource in a database; and
performing, by an operator, at least one of creating, removing, and changing at least one load balancer node (LB Node) to correspond to the stored at least one custom resource.

According to claim 1,
Further comprising allocating, by a VIP allocation unit, a VIP (Virtual IP) to the load balancer to be added,
How to control the load balancer management system.

According to claim 1,
The generated load balancer is implemented in the device driver (DD) stack of the Linux network stack.
How to control the load balancer management system.

According to claim 3,
The load balancer is implemented as an Extended Berkeley Packet Filter (eBPF) / eXpress Data Path (XDP),
How to control the load balancer management system.

A load balancer management system that manages a load balancer (LB, Load Balancer) in a cloud-native environment, the system comprising:
a CR management unit for creating a custom resource (CR, Custom Resource) based on a load balancer node creation request,
the custom resource including at least one piece of parameter information defining a load balancer;
a database for storing the created custom resource; and
an operator that performs at least one of creating, removing, and changing at least one load balancer node (LB Node) to correspond to the stored at least one custom resource.

According to claim 5,
Further comprising a VIP allocation unit for allocating a VIP (Virtual IP) to the load balancer to be added,
Load balancer management system.

According to claim 5,
The generated load balancer is implemented in the device driver (DD) stack of the Linux network stack.
Load balancer management system.

According to claim 7,
The load balancer is implemented as an Extended Berkeley Packet Filter (eBPF) / eXpress Data Path (XDP),
Load balancer management system.

In the control method of the load balancer node,
LBN core receiving a packet from the border leaf switch;
applying, by the LBN core, first and second hash functions to the received packet;
obtaining at least one of a destination address and a VID from a hash table based on a result of applying the first and second hash functions by the LBN core;
replacing, by the LBN core, at least one of the destination address and VID of the received packet with the obtained destination MAC address and VID;
Transmitting the replaced packet by the LBN core to the border leaf switch,
How to control load balancer nodes.

According to claim 9,
The first and second hash functions are applied to the header of the received packet,
How to control load balancer nodes.

The method of claim 9, wherein the parameters of the first hash function,
Including at least one of a destination IP (VIP), destination port, protocol, VID, and segment type for the received packet,
How to control load balancer nodes.

The method of claim 9, wherein the parameter of the second hash function,
At least one of a source IP, a source port, a destination IP, a destination port, a protocol, and a VID of a packet header for the received packet,
How to control load balancer nodes.

The method of claim 9, wherein the acquiring step,
Acquiring, by the LBN core, a Virtual IP (VIP) based on a first hashing result of the first hash function,
How to control load balancer nodes.

The method of claim 13, wherein the acquiring step,
The LBN core further comprising obtaining a destination MAC address based on the second hashing result of the second hash function and the first hashing result.
How to control load balancer nodes.

For load balancer nodes,
hash table storage to store hash tables; and
Including an LBN core for receiving packets from a border leaf switch,
The LBN core,
Applying first and second hash functions to the received packet;
Obtaining at least one of a destination address and a VID on the stored hash table based on a result of applying the first and second hash functions;
Replace at least one of the destination address and VID of the received packet with the obtained destination MAC address and VID;
The LBN core transmits the replaced packet to the border leaf switch,
load balancer node.

According to claim 15,
The first and second hash functions are applied to the header of the received packet,
load balancer node.

According to claim 15, wherein the parameters of the first hash function,
Including at least one of a destination IP (VIP), destination port, protocol, VID, and segment type for the received packet,
load balancer node.

According to claim 15, wherein the parameters of the second hash function,
At least one of a source IP, a source port, a destination IP, a destination port, a protocol, and a VID of a packet header for the received packet,
load balancer node.

According to claim 15,
The LBN core obtains a VIP (Virtual IP) based on the first hashing result of the first hash function.
load balancer node.

According to claim 19,
The LBN core obtains a destination MAC address based on the second hashing result and the first hashing result of the second hash function.
load balancer node.

A computer program stored in a medium to be combined with hardware to execute the method of any one of claims 1 to 4.