KR20240011357A

KR20240011357A - Method and Apparatus for Resource Allocating in Graphics Processing Unit Cloud Environment

Info

Publication number: KR20240011357A
Application number: KR1020220088706A
Authority: KR
Inventors: 김세웅; 고현우; 이정복
Original assignee: 주식회사 카카오엔터프라이즈
Priority date: 2022-07-19
Filing date: 2022-07-19
Publication date: 2024-01-26

Abstract

The present invention relates to a resource allocation method and apparatus. The method includes: receiving, by a control device, a resource allocation request signal from at least one application server; identifying, by the control device, a required computation amount corresponding to the resource allocation request signal; identifying, by the control device, according to a cluster allocation rule selected by the application server, at least one first idle cluster capable of processing the required computation amount from among at least one first GPU cluster included in a first group of at least one group; identifying, by the control device, if the first idle cluster needs to be changed, at least one second idle cluster capable of processing at least part of the required computation amount from among at least one second GPU cluster included in at least one second group; and allocating, by the control device, at least one of the first idle cluster and the second idle cluster to the application server. Other embodiments are also possible.

Description

Resource allocation method and apparatus {Method and Apparatus for Resource Allocating in Graphics Processing Unit Cloud Environment}

The present invention relates to a resource allocation method and apparatus.

With the rapid development of the Internet, personal communication speeds have improved sharply. This has created an environment in which users can connect to an application server that provides applications, download or upload large amounts of data, and enjoy multimedia data by streaming. In particular, cloud streaming services based on screen virtualization are attracting attention: the application runs on the application server, and the running screen is compressed by video encoding and transmitted to the client terminal, so that the client experiences the application as if it were running on the client's own terminal.

As such cloud streaming services have become common, a variety of media that can be served through cloud systems have been developed, and service requests for these media have become difficult to handle with the central processing unit (CPU) of the cloud system alone. To solve this problem, technology has been developed and is now actively used that distributes the packet processing of the CPU to a graphics processing unit (GPU), so that cloud services can be provided while using system resources more efficiently.

To this end, in order to provide more efficient cloud services to clients, the application server uses a service that exclusively contracts for the type and number of GPU clusters held by the cloud service provider. From the application server's standpoint, however, even when the number of clients increases, the terms contracted with the cloud service provider are difficult to change, so the application server may become unable to provide cloud services to its clients.

To solve these conventional problems, embodiments of the present invention provide a resource allocation method and apparatus that can supply a user with the resources the user needs, at the time the user wants them, in consideration of the user's contract terms and the available idle resources.

In addition, embodiments of the present invention provide a resource allocation method and apparatus that, when allocating resources at a user's request, supply the needed resources by allocating, from among pre-built idle resources, those that can minimize power consumption.

A resource allocation method according to an embodiment of the present invention includes: receiving, by a control device, a resource allocation request signal from at least one application server; identifying, by the control device, a required computation amount corresponding to the resource allocation request signal; identifying, by the control device, according to a cluster allocation rule selected by the application server, at least one first idle cluster capable of processing the required computation amount from among at least one first GPU cluster included in a first group of at least one group; identifying, by the control device, if the first idle cluster needs to be changed, at least one second idle cluster capable of processing at least part of the required computation amount from among at least one second GPU cluster included in at least one second group; and allocating, by the control device, at least one of the first idle cluster and the second idle cluster to the application server.
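The steps of the method can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Cluster` class, the `capacity` unit, the `allocate` function, and the pluggable `needs_change` callback are all hypothetical names introduced here, and the sketch simplifies the fallback by only accepting a second idle cluster that can process the whole request.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Cluster:
    name: str
    capacity: float   # computation the cluster can process (hypothetical unit)
    idle: bool = True


def allocate(required: float,
             first_group: List[Cluster],
             second_groups: List[List[Cluster]],
             needs_change: Callable[[Cluster, float], bool]) -> List[Cluster]:
    """Return the clusters allocated for `required` units of computation."""
    # Identify first idle clusters in the first group that can process the request.
    candidates = [c for c in first_group if c.idle and c.capacity >= required]
    if candidates and not needs_change(candidates[0], required):
        return [candidates[0]]
    # A change is needed: look for a second idle cluster in the other groups.
    for group in second_groups:
        for c in group:
            if c.idle and c.capacity >= required:
                return [c]
    return []  # nothing suitable is idle
```

Passing `needs_change` as a callback keeps the change conditions separate from the search order, which mirrors how the method states the change check as its own step.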

The method may further include allocating, by the control device, the identified first idle cluster to the application server if the first idle cluster does not need to be changed.

The method may further include determining, by the control device, whether the first idle cluster needs to be changed based on the required computation amount.

Determining whether the first idle cluster needs to be changed may include determining that the change is needed if the first idle cluster cannot process the entire required computation amount.

Determining whether the first idle cluster needs to be changed may include determining that the change is needed if a first power consumption of the first idle cluster exceeds a threshold.

Determining whether the first idle cluster needs to be changed may include determining that the change is needed if a schedule is set for the first idle cluster.

Determining whether the first idle cluster needs to be changed may include determining that the change is needed if an error has occurred in the first idle cluster.
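The four change conditions above can be combined into a single predicate. The following Python sketch assumes a hypothetical `IdleCluster` record and hypothetical units for capacity and power; the source does not specify how these quantities are measured or what the threshold value is.

```python
from dataclasses import dataclass


@dataclass
class IdleCluster:
    capacity: float          # computation the cluster can process (hypothetical unit)
    power_draw: float        # power needed to process the request (hypothetical unit)
    scheduled: bool = False  # a schedule is set for the cluster
    error: bool = False      # an error has occurred in the cluster


def needs_change(cluster: IdleCluster, required: float, power_threshold: float) -> bool:
    """Return True if any of the four change conditions holds."""
    return (
        cluster.capacity < required              # cannot process the entire amount
        or cluster.power_draw > power_threshold  # power consumption exceeds the threshold
        or cluster.scheduled                     # schedule already set
        or cluster.error                         # error state
    )
```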

The cluster allocation rule is a rule for variably allocating at least one of the first idle cluster and the second idle cluster to the application server based on the required computation amount.

The at least one first GPU cluster included in the first group and the at least one second GPU cluster included in the at least one second group differ from each other in at least one specification among type, computing performance, and power consumption.

In addition, a resource allocation apparatus according to an embodiment of the present invention includes: a communication unit that communicates with at least one application server; and a control unit that identifies a required computation amount corresponding to a resource allocation request signal received from the at least one application server, identifies, according to a cluster allocation rule selected by the application server, at least one first idle cluster capable of processing the required computation amount from among at least one first GPU cluster included in a first group of at least one group, identifies, if the first idle cluster needs to be changed, at least one second idle cluster capable of processing at least part of the required computation amount from among at least one second GPU cluster included in at least one second group, and allocates at least one of the first idle cluster and the second idle cluster to the application server.

The control unit may allocate the identified first idle cluster to the application server if the first idle cluster does not need to be changed.

The control unit may determine that the first idle cluster needs to be changed if the first idle cluster cannot process the entire required computation amount.

The control unit may determine that the first idle cluster needs to be changed if the first power consumption of the first idle cluster exceeds a threshold.

The control unit may determine that the first idle cluster needs to be changed if a schedule is set for the first idle cluster.

The control unit may determine that the first idle cluster needs to be changed if an error has occurred in the first idle cluster.

As described above, the resource allocation method and apparatus according to the present invention supply a user with the resources the user needs, at the time the user wants them, in consideration of the user's contract terms and the available idle resources, and thereby minimize the waste of idle resources.

In addition, when allocating resources at a user's request, the resource allocation method and apparatus according to the present invention allocate, from among pre-built idle resources, those that can minimize power consumption, and thereby minimize the amount of power consumed.

FIG. 1 is a diagram showing a GPU cloud system according to an embodiment of the present invention.
FIG. 2 is a diagram showing a control device according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating a resource allocation method in a GPU cloud environment according to an embodiment of the present invention.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. The detailed description set forth below, together with the accompanying drawings, is intended to describe exemplary embodiments of the present invention and is not intended to represent the only embodiments in which the invention may be practiced. To describe the present invention clearly, parts of the drawings unrelated to the description may be omitted, and the same reference numerals may be used for identical or similar components throughout the specification.

FIG. 1 is a diagram showing a GPU cloud system according to an embodiment of the present invention.

Referring to FIG. 1, the GPU cloud system 10 according to the present invention may include a terminal 100, an application server 150, a cloud server 200, and a network 250, and the cloud server 200 may include a control device 210 and a GPU group 220.

The terminal 100 is a terminal used by a plurality of users. It receives from the application server 150 the application execution result screen corresponding to the cloud service and provides it to the user. The terminal 100 may be any of various terminals, such as an information and communication device, a multimedia terminal, a smartphone, a tablet PC, or a laptop, and may refer to an electronic device that uses the cloud service through an application provided by the application server 150.

The application server 150 may be an electronic device, such as a server, used by a company that communicates with the terminals 100 used by a plurality of clients and provides the terminals 100 with, for example, applications through which the terminals 100 can download and upload data and enjoy multimedia data by streaming.

To provide the cloud service to the terminal 100, the application server 150 connects to the control device 210 of the cloud server 200 and is allocated, as a resource, at least one GPU cluster included in at least one of the GPU groups 220.

More specifically, the application server 150 may operate according to a cluster allocation rule, under which resources (that is, clusters) are allocated based on the amount of computation the application server 150 requires (hereinafter, the required computation amount), or according to a basic rule, under which clusters are allocated based on the product and quantity of the GPU clusters included in the GPU group 220. When operating under the basic rule, the application server 150 may select the GPU cluster it wishes to be allocated based on GPU cluster information, including the product's specifications, product name, manufacturer, production year, and price.

When the application server 150 operates according to the cluster allocation rule, it calculates the required computation amount for the cloud service and sends the request to the cloud server 200. The application server 150 is then allocated, by the control device 210 of the cloud server 200, at least one GPU cluster capable of processing the required computation amount, so that the terminal 100 can use the cloud service.

In the embodiment of the present invention, the application server 150 is described as calculating the required computation amount for the cloud service and requesting it from the cloud server 200, but the invention is not necessarily limited to this. That is, when the application server 150 wishes to operate under the cluster allocation rule, the cloud server 200 may calculate the required computation amount itself by checking, for example, the type of cloud service the application server 150 provides to the terminal 100 and the number of terminals 100 that connect to the application server 150 to use the cloud service. Here, the type of cloud service may be, for example, a service that provides only data upload and download, or a service that provides multimedia data by streaming.
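As a rough illustration of how the cloud server might derive the required computation amount from the service type and the number of connected terminals, consider the Python sketch below. The per-terminal cost table and the linear model are assumptions made purely for illustration; the source names the two inputs but gives no formula or figures.

```python
# Hypothetical per-terminal cost for each service type; the source gives no numbers.
COST_PER_TERMINAL = {
    "file_transfer": 1.0,  # upload and download only
    "streaming": 4.0,      # multimedia streaming
}


def required_computation(service_type: str, terminal_count: int) -> float:
    """Estimate the required computation amount as cost per terminal times terminals."""
    if terminal_count < 0:
        raise ValueError("terminal_count must be non-negative")
    return COST_PER_TERMINAL[service_type] * terminal_count
```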

When the application server 150 does not operate according to the cluster allocation rule, it requests from the cloud server 200 the product and quantity of GPU clusters needed for the type of cloud service it provides to the terminal 100. The application server 150 is then allocated the requested product and quantity of GPU clusters by the control device 210 of the cloud server 200, so that the terminal 100 can use the cloud service.
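The two request styles (a required computation amount under the cluster allocation rule, or a product and quantity under the basic rule) could be carried in one request record, as in the following Python sketch. The `AllocationRequest` fields and the `describe` helper are hypothetical; the source does not define the format of the resource allocation request signal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AllocationRequest:
    use_cluster_allocation_rule: bool
    required_computation: Optional[float] = None  # used under the cluster allocation rule
    product: Optional[str] = None                 # used under the basic rule
    quantity: Optional[int] = None                # used under the basic rule


def describe(request: AllocationRequest) -> str:
    """Summarize which rule applies and what the control device should allocate."""
    if request.use_cluster_allocation_rule:
        return f"allocate idle clusters for {request.required_computation} units"
    return f"allocate {request.quantity} x {request.product}"
```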

When the application server 150 operates according to the cluster allocation rule, the control device 210 of the cloud server 200 allocates GPU clusters to the application server 150 based on the required computation amount included in the resource allocation request signal received from the application server 150. The operation of the control device 210 is described in more detail with reference to FIG. 2 below.

The GPU group 220 of the cloud server 200 may include a first group 220a and a second group 220b through an n-th group 220n. Each group included in the GPU group 220 may be implemented with GPU clusters whose product specifications are identical within that group. For example, the plurality of first GPU clusters included in the first group 220a may all have the same specifications, and the plurality of second GPU clusters included in the second group 220b may all have the same specifications as one another but different specifications from the first GPU clusters.
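The grouping described here (identical specifications within a group, different specifications between groups) can be modeled with a small data structure. The Python sketch below uses hypothetical `Spec` and `GpuCluster` classes; the three spec fields follow the type, computing performance, and power consumption named in the source, but their units are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Spec:
    kind: str           # GPU cluster type
    performance: float  # computing performance (hypothetical unit)
    power: float        # power consumption (hypothetical unit)


@dataclass
class GpuCluster:
    name: str
    spec: Spec
    idle: bool = True


def make_group(spec: Spec, names: List[str]) -> List[GpuCluster]:
    """Build a group whose clusters all share one spec."""
    return [GpuCluster(name, spec) for name in names]


def is_uniform(group: List[GpuCluster]) -> bool:
    """Check that every cluster in the group has the same spec."""
    return len({cluster.spec for cluster in group}) <= 1
```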

The network 250 provides a path for transferring data among the terminal 100, the application server 150, and the cloud server 200, and may be understood as encompassing both networks currently in use and networks that may be developed in the future.

FIG. 2 is a diagram showing a control device according to an embodiment of the present invention.

Referring to FIG. 2, the control device 210 according to the present invention may include a communication unit 211, an input unit 213, a display unit 215, a memory 217, and a control unit 219, and the control unit 219 may include a detection unit 219a and a selection unit 219b.

The communication unit 211 communicates with the terminal 100 and the application server 150 through the network 250. To this end, the communication unit 211 may perform Internet communication such as 5G (5th generation), LTE-A (long term evolution-advanced), LTE (long term evolution), and Wi-Fi (wireless fidelity).

The input unit 213 generates input data in response to input from an administrator who manages the control device 210. To this end, the input unit 213 may include a keypad, a dome switch, a touch panel, touch keys, and buttons.

The display unit 215 outputs output data according to the operation of the control device 210. In particular, the display unit 215 may display the current status, such as usage and idle resources, of the GPU clusters in the GPU group 220 that is included in the cloud server 200 and managed by the control device 210. To this end, the display unit 215 may include a liquid crystal display (LCD), a light emitting diode (LED) display, an organic light emitting diode (OLED) display, a micro electro mechanical systems (MEMS) display, and an electronic paper display.

The memory 217 stores the operating programs of the control device 210. In particular, when the application server 150 has selected the cluster allocation rule, the memory 217 may store, for example, the required computation amount requested by the application server 150. When the application server 150 has not selected the cluster allocation rule, the memory 217 may store the product and quantity of GPU clusters selected by the application server 150.

The memory 217 may also store information on the GPU clusters that make up each group in the GPU group 220, as well as algorithms for monitoring GPU cluster usage and idle clusters. In addition, the memory 217 may store an algorithm with which the cloud server 200 can calculate the required computation amount by checking, for example, the type of cloud service the application server 150 provides to the terminal 100 and the number of terminals 100 that connect to the application server 150 to use the cloud service.

The control unit 219 identifies the required computation amount included in the resource allocation request signal received from at least one application server 150 that wishes to operate under the cluster allocation rule. Alternatively, when the resource allocation request signal is received from the application server 150, the control unit 219 may calculate the required computation amount directly using the algorithm stored in the memory 217.

The control unit 219 identifies a first idle cluster, which is an idle resource, from among the at least one first GPU cluster constituting the first group 220a. If the first idle cluster needs to be changed, the control unit 219 identifies an idle cluster from among the at least one GPU cluster included in another group and allocates it to the application server 150. To this end, the control unit 219 may include the detection unit 219a and the selection unit 219b, although it is not necessarily limited to this configuration; for convenience of explanation, the control unit 219 is described here as including the detection unit 219a and the selection unit 219b. The control unit 219 may determine that the first idle cluster needs to be changed in at least one of the following states: no first idle cluster has been identified; the first idle cluster cannot process the entire required computation amount; the power consumption of the first idle cluster exceeds a threshold; a future expansion of GPU clusters has been requested; a schedule is set for the first idle cluster; or an error has occurred in the first idle cluster. Depending on the required computation amount, a single first idle cluster may be able to process it, or a plurality of first idle clusters may be needed to process it.

When the control unit 219 receives from the application server 150 a resource allocation request signal requesting allocation of GPU clusters so that the application server 150 can provide the cloud service to the terminal 100, the control unit 219 checks whether the application server 150 has selected the cluster allocation rule. Here, the cluster allocation rule is a rule under which the application server 150 is variably allocated idle clusters according to the required computation amount so that it can provide a smooth cloud service to the terminal 100. For convenience of explanation, the present description assumes that the application server 150 has selected the first group 220a as the default from among the first group 220a through the n-th group 220n.

If the application server 150 has selected the cluster allocation rule, the control unit 219 calls the detection unit 219a and identifies the required computation amount requested by the application server 150. The detection unit 219a identifies, from among the plurality of first GPU clusters constituting the default first group 220a, a first idle cluster capable of processing the required computation amount.

If the first idle cluster needs to be changed, the detection unit 219a identifies, from among the plurality of GPU clusters constituting the second group 220b through the n-th group 220n, an idle cluster that can be allocated to the application server 150. According to the result from the detection unit 219a, the selection unit 219b allocates at least one of the first GPU clusters and the plurality of GPU clusters to the application server 150.

More specifically, one embodiment is the case in which no first idle cluster capable of processing the required computation amount is identified in the first group 220a. The detection unit 219a identifies at least one group that has an idle cluster capable of processing the required computation amount from among the plurality of GPU clusters constituting the groups other than the first group 220a, that is, the second group 220b through the n-th group 220n. For example, when the detection unit 219a identifies a second idle cluster capable of processing the required computation amount from among the plurality of second GPU clusters constituting the second group 220b, the selection unit 219b allocates the second idle cluster to the application server 150 according to that result.

Another embodiment is the case where the first idle cluster cannot process the entire required amount of computation. The detection unit 219a identifies a group having an idle cluster capable of processing the remaining amount of computation, that is, the required amount minus the amount the first idle cluster can process. For example, when the detection unit 219a identifies, among the plurality of second GPU clusters constituting the second group 220b, a second idle cluster capable of processing the remaining amount of computation, the selection unit 219b allocates the second idle cluster to the application server 150 together with the first idle cluster according to the identification result of the detection unit 219a.
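The remaining-computation case above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `IdleCluster` type, the `capacity` field (an abstract unit of computation), and the function name `cover_residual` are all hypothetical, and the sketch assumes a single additional cluster covers the remainder.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class IdleCluster:
    name: str
    group: str        # e.g. "220a" for the first group (labels are illustrative)
    capacity: float   # hypothetical: amount of computation the cluster can process


def cover_residual(required: float, first: IdleCluster,
                   candidates: List[IdleCluster]) -> Optional[List[IdleCluster]]:
    """Return the clusters to allocate together, or None if the remainder is uncovered."""
    residual = required - first.capacity
    if residual <= 0:
        # The first idle cluster alone can process the whole request.
        return [first]
    for cluster in candidates:
        # Only clusters outside the first group are considered for the remainder.
        if cluster.group != first.group and cluster.capacity >= residual:
            return [first, cluster]
    return None


first = IdleCluster("a1", "220a", 60.0)
others = [IdleCluster("b1", "220b", 30.0), IdleCluster("b2", "220b", 50.0)]
print([c.name for c in cover_residual(100.0, first, others)])
```

With a request of 100 units, the first idle cluster covers 60 and the sketch looks for a cluster in another group that can cover the remaining 40.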

Yet another embodiment is the case where the power consumption in the first idle cluster exceeds a threshold. The detection unit 219a confirms a first power consumption that the first idle cluster would consume to process the required amount of computation and the power consumption that an idle cluster included in another group would consume to process the same amount. For example, if the first power consumption of the first idle cluster identified in the first group 220a is greater than a second power consumption that the second idle cluster identified in the second group 220b would consume to process the required amount of computation, the selection unit 219b may allocate the second idle cluster to the application server 150 instead of the first idle cluster.
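A minimal sketch of this power-based comparison is given below. The linear power model (`watts_per_unit` times the required amount) is purely an assumption made for illustration; the specification does not state how power consumption is estimated. The names `IdleCluster`, `power_for`, and `pick_by_power` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdleCluster:
    name: str
    watts_per_unit: float  # hypothetical: power drawn per unit of computation


def power_for(cluster: IdleCluster, required: float) -> float:
    # Assumed linear model; the specification does not define the estimate.
    return cluster.watts_per_unit * required


def pick_by_power(required: float, first: IdleCluster,
                  second: IdleCluster, threshold: float) -> IdleCluster:
    """Keep the first idle cluster unless it exceeds the threshold and the second is cheaper."""
    first_power = power_for(first, required)
    if first_power <= threshold:
        return first
    second_power = power_for(second, required)
    return second if first_power > second_power else first


first = IdleCluster("a1", 2.0)
second = IdleCluster("b1", 1.5)
print(pick_by_power(100.0, first, second, threshold=150.0).name)
```

When the first cluster stays under the threshold it is kept, which matches the description that the comparison is only triggered by exceeding the threshold.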

Yet another embodiment is the case where future expansion of GPU clusters has been requested. If the first idle cluster can process the required amount of computation, but the first GPU clusters included in the first group 220a cannot accommodate the future expansion requested by the application server 150, the detection unit 219a identifies idle clusters in the second group 220b to the n-th group 220n. When it is confirmed that an idle cluster identified in at least one of the second group 220b to the n-th group 220n, for example a second idle cluster identified in the second group 220b, can accommodate the future expansion, the selection unit 219b may allocate the first idle cluster to the application server 150 for processing the required amount of computation and additionally allocate the second idle cluster to the application server 150 for the future expansion.

In addition, if the first idle cluster cannot process the required amount of computation and the first GPU clusters included in the first group 220a cannot accommodate future expansion, the detection unit 219a identifies idle clusters in the second group 220b to the n-th group 220n. The detection unit 219a allocates to the application server 150, for processing the required amount of computation, at least part of the idle clusters identified in at least one of those groups, for example the second idle clusters identified in the second group 220b. If the remaining part of the second idle clusters can be used for future expansion, the detection unit 219a may additionally allocate it to the application server 150. If the remaining part cannot be used for future expansion, the detection unit 219a may identify, among the third GPU clusters constituting a third group (not shown), a third idle cluster that can be used for future expansion. The selection unit 219b may additionally allocate the identified third idle cluster to the application server 150 for the future expansion.

Yet another embodiment is the case where a schedule is set for the first idle cluster or an error has occurred in the first idle cluster. If an unavailability schedule has been set for the first idle cluster, for example because of planned inspection or decommissioning, or if an error has occurred, the detection unit 219a identifies a group having an idle cluster capable of processing the required amount of computation among the plurality of GPU clusters constituting the other groups, that is, the second group 220b to the n-th group 220n. For example, when the detection unit 219a identifies, among the plurality of second GPU clusters constituting the second group 220b, a second idle cluster capable of processing the required amount of computation, the selection unit 219b may allocate the second idle cluster to the application server 150 according to the identification result of the detection unit 219a.

In the embodiments of the present invention, for convenience of description, the second idle cluster identified in the second group 220b is allocated to the application server 150 instead of the first idle cluster, but the invention is not necessarily limited to this. Based on the identification result of the detection unit 219a, and taking into account factors such as the computation speed and power consumption of the GPU clusters, the selection unit 219b may combine idle clusters identified in at least two of the second group 220b to the n-th group 220n and allocate the combination to the application server 150.
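One way to picture combining idle clusters from several groups is a greedy heuristic that prefers clusters with the lowest power per unit of computation. The specification does not say how the combination is chosen, so the heuristic, the `IdleCluster` fields, and the function name `combine_idle_clusters` are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class IdleCluster:
    name: str
    group: str
    capacity: float  # hypothetical: computation the cluster can process (must be > 0)
    power: float     # hypothetical: power drawn when fully loaded


def combine_idle_clusters(required: float,
                          idle: List[IdleCluster]) -> Optional[List[IdleCluster]]:
    """Greedily pick the most power-efficient clusters until the request is covered."""
    ranked = sorted(idle, key=lambda c: c.power / c.capacity)
    chosen: List[IdleCluster] = []
    remaining = required
    for cluster in ranked:
        if remaining <= 0:
            break
        chosen.append(cluster)
        remaining -= cluster.capacity
    return chosen if remaining <= 0 else None


idle = [
    IdleCluster("b1", "220b", 40.0, 100.0),
    IdleCluster("c1", "220c", 50.0, 100.0),
    IdleCluster("d1", "220d", 30.0, 120.0),
]
print([c.name for c in combine_idle_clusters(80.0, idle)])
```

A greedy ranking is not guaranteed to minimize total power; it is used here only because it is short and easy to follow.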

When the application server 150 has not selected a cluster allocation rule, the control unit 219 calls the detection unit 219a. In this case, the application server 150 may have selected the product and quantity of the GPU clusters it intends to use. The embodiment is described using an example in which the first group 220a is a group composed of the GPU cluster product that the application server 150 intends to use, and three GPU clusters have been requested from the first group 220a.

The detection unit 219a checks whether, among the plurality of first GPU clusters constituting the first group 220a, as many first GPU clusters as the quantity selected by the application server 150 are idle. If they are, the detection unit 219a calls the selection unit 219b, and the selection unit 219b may allocate the idle clusters to the application server 150. Conversely, if fewer first GPU clusters than the selected quantity are idle, for example only two, the detection unit 219a may transmit to the application server 150 a message indicating that the first GPU clusters cannot be allocated.
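The fixed-quantity case (no cluster allocation rule selected) can be sketched as below. The function name `allocate_fixed_quantity`, the dictionary input, and the message text are hypothetical; the specification only says that a message indicating that allocation is impossible may be sent.

```python
from typing import Dict, List, Optional, Tuple


def allocate_fixed_quantity(idle_by_name: Dict[str, bool],
                            quantity: int) -> Tuple[List[str], Optional[str]]:
    """Allocate exactly `quantity` idle clusters from the first group, or report failure.

    `idle_by_name` maps each first GPU cluster to whether it is currently idle.
    """
    idle = [name for name, is_idle in idle_by_name.items() if is_idle]
    if len(idle) >= quantity:
        return idle[:quantity], None
    # Fewer idle clusters than requested: nothing is allocated, a message is returned.
    return [], "allocation of first GPU clusters is not possible"


first_group = {"a1": True, "a2": False, "a3": True}
print(allocate_fixed_quantity(first_group, 3))
```

In the example from the text, three clusters are requested but only two are idle, so the sketch returns an empty allocation and the message.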

FIG. 3 is a flowchart illustrating a resource allocation method in a GPU cloud environment according to an embodiment of the present invention.

Referring to FIG. 3, in step 301 the control unit 219 performs step 303 if a resource allocation request signal is received from the application server 150; otherwise, it waits to receive the resource allocation request signal. Here, the resource allocation request signal may be a signal by which the application server 150 requests the control device 210 of the cloud server 200 to allocate GPU clusters so that it can smoothly provide a cloud service to the terminal 100.

In step 303, the control unit 219 checks whether the application server 150 has selected a cluster allocation rule. If a cluster allocation rule has been selected, the control unit 219 performs step 305; otherwise, it performs step 315. Here, the cluster allocation rule is a rule under which the application server 150 is variably allocated GPU clusters from at least one group included in the GPU group 200, based on the amount of computation required for the cluster service (hereinafter, the required amount of computation).

If the application server 150 has not selected a cluster allocation rule, in step 315 the detection unit 219a confirms the quantity of first GPU clusters that the application server 150 selected from among the plurality of first GPU clusters constituting the default group, for example the first group 220a, and then performs step 317. In step 317, the detection unit 219a checks whether as many first GPU clusters as the quantity selected by the application server 150 are idle.

If the check in step 317 shows that as many first GPU clusters as the quantity selected by the application server 150 are idle, the detection unit 219a performs step 313. In step 313, the detection unit 219a may allocate the idle clusters to the application server 150.

If the check in step 317 shows that, among the plurality of first GPU clusters constituting the first group 220a, the first GPU clusters corresponding to the quantity selected by the application server 150 are not idle, the detection unit 219a performs step 319. In step 319, the detection unit 219a transmits to the application server 150 a message indicating that the first GPU clusters cannot be allocated.

If the check in step 303 shows that the application server 150 has selected a cluster allocation rule, the control unit 219 calls the detection unit 219a to perform step 305. In step 305, the detection unit 219a confirms the required amount of computation included in the resource allocation request signal received in step 301.

In step 307, the detection unit 219a identifies, among the first GPU clusters included in the default first group 220a, a first GPU cluster that can be allocated to the application server 150 as the first idle cluster, and then performs step 309. In step 309, the detection unit 219a checks whether the identified first idle cluster needs to be changed. If no change is needed, the detection unit 219a performs step 313. In step 313, the selection unit 219b allocates the first idle cluster identified in step 307 to the application server 150 so that the application server 150 can provide the cluster service to the terminal 100.

Conversely, if it is confirmed in step 309 that the first idle cluster needs to be changed, in step 311 the detection unit 219a identifies, among the GPU clusters included in the second group 220b to the n-th group 220n, an idle cluster that can be allocated to the application server 150. In step 313, the selection unit 219b may allocate the idle cluster identified in step 311 to the application server 150. Here, the detection unit 219a may allocate to the application server 150 idle clusters identified in each of a plurality of groups, but the embodiment is described using an example in which a second idle cluster identified among the second GPU clusters included in the second group 220b is allocated to the application server 150.

More specifically, the detection unit 219a may determine that the first idle cluster needs to be changed in at least one of the following states: no first idle cluster is identified among the first GPU clusters; the first idle cluster cannot process the entire required amount of computation; the power consumption in the first idle cluster exceeds a threshold; future expansion of GPU clusters has been requested; a schedule is set for the first idle cluster; or an error has occurred in the first idle cluster.
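The six conditions listed above can be collected into a single check, sketched below. The `FirstClusterStatus` fields and the reason strings are hypothetical names chosen for illustration; how each state is actually detected is not specified in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FirstClusterStatus:
    found: bool                             # was a first idle cluster identified at all?
    capacity: float = 0.0                   # hypothetical: computation it can process
    power: float = 0.0                      # hypothetical: power needed for the request
    expansion_requested: bool = False       # future expansion of GPU clusters requested
    unavailability_scheduled: bool = False  # inspection or decommissioning planned
    has_error: bool = False


def change_reasons(status: FirstClusterStatus, required: float,
                   power_threshold: float) -> List[str]:
    """List every condition under which the first idle cluster needs to be changed."""
    if not status.found:
        return ["no first idle cluster identified"]
    reasons: List[str] = []
    if status.capacity < required:
        reasons.append("cannot process the entire required computation")
    if status.power > power_threshold:
        reasons.append("power consumption exceeds threshold")
    if status.expansion_requested:
        reasons.append("future expansion requested")
    if status.unavailability_scheduled:
        reasons.append("unavailability schedule set")
    if status.has_error:
        reasons.append("error occurred")
    return reasons


def needs_change(status: FirstClusterStatus, required: float,
                 power_threshold: float) -> bool:
    # A change is needed when at least one condition holds.
    return bool(change_reasons(status, required, power_threshold))


print(needs_change(FirstClusterStatus(found=True, capacity=100.0, power=50.0), 80.0, 60.0))
```

Returning the list of reasons rather than a bare boolean is a design choice of the sketch; it makes it easier to see which embodiment applies to a given request.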

According to one embodiment, in step 309 the detection unit 219a performs step 311 if the first group 220a contains no first idle cluster capable of processing the required amount of computation. In step 311, the detection unit 219a identifies, among the plurality of second GPU clusters constituting another group, for example the second group 220b, a second idle cluster capable of processing the required amount of computation. In step 313, the selection unit 219b allocates the second idle cluster identified in step 311 to the application server 150.

According to another embodiment, in step 309 the detection unit 219a performs step 311 if the first idle cluster identified in the first group 220a is not sufficient to process the required amount of computation. For example, if step 311 finds, among the plurality of second GPU clusters constituting the second group 220b, a second idle cluster capable of processing the remaining amount of computation, the detection unit 219a performs step 313. In step 313, the selection unit 219b allocates the second idle cluster to the application server 150 together with the first idle cluster.

According to yet another embodiment, in step 309 the detection unit 219a compares a first power consumption, which the first idle cluster consumes to process the required amount of computation, with a second power consumption, which the plurality of second GPU clusters constituting the second group 220b consume. If the first power consumption is greater than the second power consumption, the detection unit 219a performs step 311. In step 311, the detection unit 219a identifies a second idle cluster among the second GPU clusters constituting the second group 220b. In step 313, the selection unit 219b allocates the identified second idle cluster to the application server 150.

According to yet another embodiment, if in step 309 the application server 150 has requested future expansion of GPU clusters, the detection unit 219a performs step 311. For example, if the first idle cluster can process the required amount of computation but future expansion of GPU clusters is not possible in the first group 220a, the detection unit 219a performs step 311. In step 311, the detection unit 219a identifies a second idle cluster in the second group 220b. In step 313, the selection unit 219b may allocate the first idle cluster to the application server 150 for processing the required amount of computation and additionally allocate the identified second idle cluster to the application server 150 for the future expansion.

In addition, if in step 309 the first idle cluster cannot process the required amount of computation and future expansion of GPU clusters is not possible in the first group 220a, the detection unit 219a performs step 311. In step 311, the detection unit 219a identifies second idle clusters in the second group 220b. In step 313, the selection unit 219b may allocate at least part of the identified second idle clusters to the application server 150 to process the required amount of computation, and additionally allocate another part of the second idle clusters to the application server 150 for future expansion. If the second idle clusters cannot be allocated for the future expansion, the detection unit 219a may identify a third idle cluster among the third GPU clusters constituting a third group. The selection unit 219b may additionally allocate the identified third idle cluster to the application server 150 for the future expansion.

According to yet another embodiment, in step 309 the detection unit 219a performs step 311 if it confirms that an unavailability schedule has been set for the first idle cluster, for example because of planned inspection or decommissioning, or that an error has occurred. In step 311, the detection unit 219a identifies, among the plurality of second GPU clusters constituting the second group 220b, a second idle cluster capable of processing the required amount of computation. In step 313, the selection unit 219b may allocate the second idle cluster to the application server 150 according to the identification result of the detection unit 219a.
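The overall flow of FIG. 3 (steps 301 to 319) can be summarized in a short sketch. Everything here is an illustrative assumption rather than the claimed implementation: the `Cluster` and `Request` types, the `needs_change` callback that stands in for the step 309 check, and the dictionary returned to report which step ended the flow. The case where no other group has a suitable idle cluster is not described in the text, so the sketch simply returns an empty allocation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Cluster:
    name: str
    capacity: float  # hypothetical: computation the cluster can process
    idle: bool


@dataclass(frozen=True)
class Request:
    rule_selected: bool    # step 303: was a cluster allocation rule selected?
    required: float = 0.0  # required amount of computation (rule selected)
    quantity: int = 0      # number of clusters requested (no rule selected)


def find_idle(clusters: List[Cluster], required: float) -> Optional[Cluster]:
    for c in clusters:
        if c.idle and c.capacity >= required:
            return c
    return None


def handle_request(req: Request, groups: Dict[str, List[Cluster]], default_group: str,
                   needs_change: Callable[[Cluster], bool]) -> dict:
    first_group = groups[default_group]
    if not req.rule_selected:
        # Steps 315 and 317: check whether the requested quantity is idle.
        idle = [c for c in first_group if c.idle]
        if len(idle) >= req.quantity:
            return {"step": 313, "allocated": [c.name for c in idle[:req.quantity]]}
        return {"step": 319, "allocated": [], "message": "allocation not possible"}
    # Steps 305 and 307: look for a first idle cluster in the default group.
    first = find_idle(first_group, req.required)
    # Step 309: keep the first idle cluster unless a change is needed.
    if first is not None and not needs_change(first):
        return {"step": 313, "allocated": [first.name]}
    # Step 311: search the remaining groups.
    for name, clusters in groups.items():
        if name == default_group:
            continue
        other = find_idle(clusters, req.required)
        if other is not None:
            return {"step": 313, "allocated": [other.name]}
    # Assumption: the text does not describe this case.
    return {"step": 311, "allocated": []}


groups = {
    "220a": [Cluster("a1", 50.0, True), Cluster("a2", 50.0, False)],
    "220b": [Cluster("b1", 120.0, True)],
}
print(handle_request(Request(True, required=100.0), groups, "220a", lambda c: False))
```

The sketch covers only the single-cluster variants of steps 309 to 313; the combined allocations described in the other embodiments would need a richer return value.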

As described above, the present invention takes idle clusters into account among the GPU clusters constituting the cloud server 200 and supplies the clusters a user needs in response to a request from the user, that is, the application server 150, thereby minimizing the waste of idle resources. In addition, power consumption can be minimized by allocating, when resources are allocated to the user, idle clusters that minimize power consumption. Furthermore, supplying the resources the user needs regardless of the performance of the GPU clusters improves the operating efficiency of the GPU clusters.

The embodiments of the present invention disclosed in this specification and the drawings merely present specific examples to explain the technical content of the invention easily and to aid understanding of it; they are not intended to limit the scope of the invention. Accordingly, the scope of the present invention should be construed as including, in addition to the embodiments disclosed herein, all changes or modifications derived from the technical idea of the present invention.

Claims

A resource allocation method comprising:
receiving, by a control device, a resource allocation request signal from at least one application server;
confirming, by the control device, a required amount of computation corresponding to the resource allocation request signal;
identifying, by the control device, according to a cluster allocation rule selected by the application server, at least one first idle cluster capable of processing the required amount of computation among at least one first GPU cluster included in a first group among at least one group;
identifying, by the control device, when the first idle cluster needs to be changed, at least one second idle cluster capable of processing at least a portion of the required amount of computation among at least one second GPU cluster included in at least one second group; and
allocating, by the control device, at least one of the first idle cluster and the second idle cluster to the application server.

The resource allocation method of claim 1, further comprising:
allocating, by the control device, the identified first idle cluster to the application server if the first idle cluster does not need to be changed.

The resource allocation method of claim 1, further comprising:
checking, by the control device, whether the first idle cluster needs to be changed based on the required amount of computation.

The resource allocation method of claim 3, wherein checking whether the first idle cluster needs to be changed comprises
determining that the first idle cluster needs to be changed if the first idle cluster cannot process the entire required amount of computation.

The resource allocation method of claim 3, wherein checking whether the first idle cluster needs to be changed comprises
determining that the first idle cluster needs to be changed if a first power consumption in the first idle cluster exceeds a threshold.

The resource allocation method of claim 3, wherein checking whether the first idle cluster needs to be changed comprises
determining that the first idle cluster needs to be changed if a schedule is set for the first idle cluster.

The resource allocation method of claim 3, wherein checking whether the first idle cluster needs to be changed comprises
determining that the first idle cluster needs to be changed if an error has occurred in the first idle cluster.

The resource allocation method of claim 3, wherein the cluster allocation rule is
a rule for variably allocating at least one of the first idle cluster and the second idle cluster to the application server based on the required amount of computation.

The resource allocation method of claim 1, wherein
the at least one first GPU cluster included in the first group and the at least one second GPU cluster included in the at least one second group differ in at least one specification among type, computing performance, and power consumption.

A resource allocation device comprising:
a communication unit that communicates with at least one application server; and
a control unit that confirms a required amount of computation corresponding to a resource allocation request signal received from the at least one application server, and identifies, according to a cluster allocation rule selected by the application server, at least one first idle cluster capable of processing the required amount of computation among at least one first GPU cluster included in a first group among at least one group,
identifies, if the first idle cluster needs to be changed, at least one second idle cluster capable of processing at least a portion of the required amount of computation among at least one second GPU cluster included in at least one second group, and
allocates at least one of the first idle cluster and the second idle cluster to the application server.

The resource allocation device of claim 10, wherein the control unit
allocates the identified first idle cluster to the application server if the first idle cluster does not need to be changed.

The resource allocation device of claim 10, wherein the control unit
determines that the first idle cluster needs to be changed if the first idle cluster cannot process the entire required amount of computation.

The resource allocation device of claim 10, wherein the control unit
determines that the first idle cluster needs to be changed if a first power consumption in the first idle cluster exceeds a threshold.

The resource allocation device of claim 10, wherein the control unit
determines that the first idle cluster needs to be changed if a schedule is set for the first idle cluster.

The resource allocation device of claim 10, wherein the control unit
determines that the first idle cluster needs to be changed if an error has occurred in the first idle cluster.

The resource allocation device of claim 10, wherein the cluster allocation rule is
a rule for variably allocating at least one of the first idle cluster and the second idle cluster to the application server based on the required amount of computation.

The resource allocation device of claim 10, wherein
the at least one first GPU cluster included in the first group and the at least one second GPU cluster included in the at least one second group differ in at least one specification among type, computing performance, and power consumption.