KR102417882B1 - Method for managing GPU resources and computing device for executing the method

Info

Publication number: KR102417882B1
Application number: KR1020200117612A
Authority: KR
Inventors: 김정호; 박건용; 공선빈
Original assignee: 한화시스템 주식회사
Priority date: 2020-09-14
Filing date: 2020-09-14
Publication date: 2022-07-05
Also published as: KR20220035626A

Abstract

Disclosed are a GPU resource management method and a computing device for performing the same. According to an embodiment of the present invention, the computing device has one or more processors and a memory storing one or more programs executed by the one or more processors, and includes: a master node that receives a work command from a user and generates a boot signal based on the received work command; an operating system unit that boots a first operating system or a second operating system according to the boot signal input from the master node; and a worker node that receives the work command from the master node and allocates it to a graphics processing unit (GPU).

Description

A method for managing GPU resources and a computing device for performing the same

Embodiments of the present invention relate to GPU resource management technology.

A graphics processing unit (GPU) is an important device used throughout high-performance computing (HPC) fields such as molecular simulation, financial engineering, cloud gaming, and 2D/3D graphics computation. In particular, a compute node equipped with GPUs in a cloud environment offers users high energy and resource efficiency while reducing operating costs. Operating GPUs at low cost while maintaining high efficiency requires virtualization technology, and operating GPUs through such virtualization makes a high utilization rate achievable.

Meanwhile, a virtual machine is a technology that implements hardware in software and runs an operating system on top of that software. By partitioning a GPU through GPU virtualization, each virtual machine performs the task assigned to it.

However, although such virtual GPUs can handle many tasks, they are limited in that each is configured with relatively low performance compared with a real GPU.

Korean Patent Application Publication No. 10-2013-0010442 (2013.01.28.)

Embodiments of the present invention are intended to provide a GPU resource management method that can selectively operate a real GPU or a virtual GPU according to a user's needs.

According to an exemplary embodiment of the present invention, there is provided a GPU resource management device, which is a computing device having one or more processors and a memory storing one or more programs executed by the one or more processors, the device including: a master node that receives a work command from a user and generates a boot signal based on the received work command; an operating system unit that boots a first operating system or a second operating system according to the boot signal input from the master node; and a worker node that receives the work command from the master node and allocates it to a graphics processing unit (GPU).

The master node may generate the boot signal by determining whether to virtualize the GPU based on at least one of the quantity of the received work commands and their memory usage.

When the master node determines, based on at least one of the quantity of the received work commands and their memory usage, not to virtualize the GPU, it may generate a first boot signal to boot the first operating system; when it determines on the same basis to virtualize the GPU, it may generate a second boot signal to boot the second operating system.

The master node may determine not to virtualize the GPU when the quantity of the received work commands is less than or equal to a preset first criterion, or when the memory usage of the received work commands exceeds a preset second criterion.

The master node may determine to virtualize the GPU when the quantity of the received work commands exceeds the preset first criterion, or when the memory usage of the received work commands is less than or equal to the preset second criterion.

The worker node may include: a GPU virtual driver that virtualizes the GPU into a virtual GPU including a plurality of slots; and a plurality of virtual machines, each connected to one of the plurality of slots, that allocate the received work commands to the respective slots of the virtual GPU.

The master node may monitor the operating system unit, compare the generated boot signal with the operating system booted on the operating system unit, and, when the boot signal and the booted operating system differ, generate a termination signal to shut down the booted operating system.

When the master node confirms, after generating the first boot signal, that the second operating system is booted, it may transmit a termination signal to the operating system unit so that the second operating system is shut down, and, once the second operating system has shut down, deliver the first boot signal to the operating system unit so that the first operating system is booted.

When the master node confirms, after generating the second boot signal, that the first operating system is booted, it may transmit a termination signal to the operating system unit so that the first operating system is shut down, and, once the first operating system has shut down, deliver the second boot signal to the operating system unit so that the second operating system is booted.

According to another exemplary embodiment of the present invention, there is provided a GPU resource management method performed in a computing device having one or more processors and a memory storing one or more programs executed by the one or more processors, the method including: receiving, at a master node, a work command from a user; generating, at the master node, a boot signal based on the received work command; booting, at an operating system unit, a first operating system or a second operating system according to the boot signal input from the master node; and receiving, at a worker node, the work command from the master node and allocating it to a graphics processing unit (GPU).

The generating of the boot signal may include generating the boot signal by determining whether to virtualize the GPU based on at least one of the quantity of the received work commands and their memory usage.

The generating of the boot signal may include generating a first boot signal to boot the first operating system when it is determined, based on at least one of the quantity of the received work commands and their memory usage, not to virtualize the GPU.

The generating of the first boot signal may include determining not to virtualize the GPU when the quantity of the received work commands is less than or equal to a preset first criterion, or when the memory usage of the received work commands exceeds a preset second criterion.

The generating of the boot signal may include generating a second boot signal to boot the second operating system when it is determined, based on at least one of the quantity of the received work commands and their memory usage, to virtualize the GPU.

The generating of the second boot signal may include determining to virtualize the GPU when the quantity of the received work commands exceeds the preset first criterion, or when the memory usage of the received work commands is less than or equal to the preset second criterion.

The allocating to the GPU may include: virtualizing, at the worker node, the GPU into a virtual GPU including a plurality of slots through a GPU virtual driver; and allocating, at the worker node, the received work commands to the respective slots of the virtual GPU through a plurality of virtual machines, wherein each of the plurality of virtual machines may be connected to one of the plurality of slots.

The GPU resource management method may further include: monitoring the operating system unit; comparing the generated boot signal with the operating system booted on the operating system unit; and, when the boot signal and the booted operating system differ, generating a termination signal to shut down the booted operating system.

According to embodiments of the present invention, a GPU can be operated efficiently by selectively operating a real GPU or a virtual GPU according to a user's needs.

In addition, according to embodiments of the present invention, a GPU whose performance matches the size of a task can be provided by using either a real GPU or a virtual GPU obtained by virtualizing and partitioning the real GPU.

FIG. 1 is a configuration diagram illustrating a GPU resource management system according to an embodiment of the present invention.
FIG. 2 is a configuration diagram illustrating a GPU resource management system using a virtualization system according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating a GPU resource management method according to an embodiment of the present invention.
FIG. 4 is a block diagram illustrating a computing environment including a computing device suitable for use in exemplary embodiments.

Hereinafter, specific embodiments of the present invention will be described with reference to the drawings. The following detailed description is provided to aid a comprehensive understanding of the methods, devices, and/or systems described herein. However, this is merely an example, and the present invention is not limited thereto.

In describing the embodiments of the present invention, a detailed description of known technology related to the present invention is omitted where it is judged that it could unnecessarily obscure the gist of the present invention. The terms used below are defined in consideration of their functions in the present invention and may vary according to the intention or custom of users and operators. Their definitions should therefore be based on the content of this specification as a whole. The terminology used in the detailed description is only for describing embodiments of the present invention and should in no way be limiting. Unless clearly used otherwise, expressions in the singular include the plural. In this description, expressions such as "including" or "having" are intended to indicate certain features, numbers, steps, operations, elements, or some part or combination thereof, and should not be construed to exclude the presence or possibility of one or more other features, numbers, steps, operations, elements, or any part or combination thereof beyond those described.

In addition, terms such as first and second may be used to describe various components, but the components should not be limited by these terms. The terms are used to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be referred to as a second component, and similarly, a second component may be referred to as a first component.

In the following description, terms such as "transmission", "communication", "sending", and "reception" of a signal or information, and other terms of similar meaning, cover not only a signal or information being passed directly from one component to another but also its being passed via other components. In particular, "transmitting" or "sending" a signal or information to a component indicates the final destination of that signal or information and does not imply a direct destination. The same applies to "reception" of a signal or information. Also, in this specification, two or more pieces of data or information being "related" means that when one piece of data (or information) is obtained, at least part of the other data (or information) can be obtained based on it.

FIG. 1 is a configuration diagram illustrating a GPU resource management system according to an embodiment of the present invention.

Referring to FIG. 1, the GPU resource management system 100 may include a master node 110, an operating system (OS) unit 120, a worker node 130, and a graphics processing unit (GPU) 140. The operating system unit 120 includes a first operating system 121 and a second operating system 122.

The master node 110 may receive a work command (a graphics processing command performed using GPU resources) from a user and generate a boot signal based on the received work command. The master node 110 may also deliver the work command to the booted operating system unit 120. Specifically, the master node 110 may decide whether to virtualize the GPU 140 based on at least one of the quantity of received work commands and their memory usage (the memory usage of the GPU 140 caused by the work commands), and generate a boot signal to boot the operating system unit 120 accordingly.

In an exemplary embodiment, when the master node 110 decides, based on at least one of the quantity of received work commands and their memory usage, not to virtualize the GPU 140, it may generate a first boot signal to boot the first operating system 121.

When the master node 110 decides, based on at least one of the quantity of received work commands and their memory usage, to virtualize the GPU 140, it may generate a second boot signal to boot the second operating system 122.

For example, when the quantity of received work commands is less than or equal to a preset first criterion, or when the memory usage of the received work commands exceeds a preset second criterion, the master node 110 may decide not to virtualize the GPU 140 and generate the first boot signal to boot the first operating system 121 so that the work commands are allocated to the memory of the real GPU 140. Conversely, when the quantity of received work commands exceeds the preset first criterion, or when the memory usage of the received work commands is less than or equal to the preset second criterion, the master node 110 may decide to virtualize the GPU 140 and generate the second boot signal to boot the second operating system 122 so that the work commands are allocated to the memory of the virtual GPU 140.

Here, the preset first criterion may be the number of real GPUs 140, and the preset second criterion may be the memory capacity of a virtualized virtual GPU 140. The real GPU 140 may mean a physical GPU, and the virtual GPU 140 may mean a GPU implemented by dividing a physical GPU in software.

For example, when there are eight physical GPUs of 32 GB each, if the quantity of received work commands is eight or fewer, the master node may judge that the work can be performed on the real GPUs 140 and decide to boot the first operating system so that the work commands are allocated to the memory of the real GPUs 140. Likewise, when a 32 GB physical GPU is virtualized and divided into eight virtual GPUs 140, if the memory usage of a received work command exceeds 4 GB, the master node may judge that the work cannot be performed on a virtual GPU 140 and decide to boot the first operating system so that the work command is allocated to the memory of the real GPU 140.

Conversely, when there are eight physical GPUs of 32 GB each, if the quantity of received work commands exceeds eight, the master node may judge that the work cannot be performed on the real GPUs 140 and decide to boot the second operating system so that the work commands are allocated to the memory of the virtual GPUs 140. Likewise, when a 32 GB physical GPU is virtualized and divided into eight virtual GPUs 140, if the memory usage of a received work command is 4 GB or less, the master node may judge that the work can be performed on a virtual GPU 140 and decide to boot the second operating system so that the work command is allocated to the memory of the virtual GPU 140.
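The decision rule in the two examples above can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `BootSignal`, `Job`, and `decide_boot_signal` are hypothetical, the defaults (eight physical GPUs, 4 GB per virtual GPU slot) are taken from the worked example, and the precedence between the two criteria is an assumption, since the text joins them with "or" and does not say which one wins when they disagree.

```python
from dataclasses import dataclass
from enum import Enum


class BootSignal(Enum):
    FIRST_OS = 1   # boot the first OS: jobs run on the real (physical) GPUs
    SECOND_OS = 2  # boot the second OS: jobs run on virtual GPU slots


@dataclass(frozen=True)
class Job:
    name: str
    gpu_memory_gb: float  # GPU memory the job is expected to need


def decide_boot_signal(jobs, num_physical_gpus=8, vgpu_slot_memory_gb=4.0):
    """Pick a boot signal from the job count and per-job GPU memory usage.

    Assumed precedence (not specified in the source): a job too large for a
    virtual GPU slot forces the real GPU; otherwise the job count decides.
    """
    if not jobs:
        raise ValueError("at least one job is required")
    # Second criterion: a job above the slot size cannot run on a virtual GPU.
    if any(job.gpu_memory_gb > vgpu_slot_memory_gb for job in jobs):
        return BootSignal.FIRST_OS
    # First criterion: more jobs than physical GPUs calls for virtualization.
    if len(jobs) > num_physical_gpus:
        return BootSignal.SECOND_OS
    return BootSignal.FIRST_OS
```

With these defaults, twelve 2 GB jobs select the second operating system, while three 16 GB jobs select the first. A deployment that prefers virtualization whenever every job fits in a slot would reorder the checks.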

In this way, the GPU can be operated efficiently by allocating work to the real GPU through the first operating system or to the virtual GPU through the second operating system, depending on the quantity of work commands, their memory usage, and the like.

In addition, when the master node 110 confirms, after generating the first boot signal, that the second operating system 122 is booted, it may transmit a termination signal to the operating system unit 120 so that the second operating system 122 is shut down. Once the master node 110 confirms that the second operating system 122 has shut down, it may deliver the first boot signal to the operating system unit 120 so that the first operating system 121 is booted.

Similarly, when the master node 110 confirms, after generating the second boot signal, that the first operating system 121 is booted, it may transmit a termination signal to the operating system unit 120 so that the first operating system 121 is shut down. Once the master node 110 confirms that the first operating system 121 has shut down, it may deliver the second boot signal to the operating system unit 120 so that the second operating system 122 is booted.
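The switching sequence described above (compare the requested operating system with the booted one, shut down a mismatched one, then boot the requested one) can be sketched as a small state machine. The class `OperatingSystemUnit` and the function `apply_boot_signal` are hypothetical stand-ins for illustration; the text does not describe an actual interface for the operating system unit 120.

```python
from enum import Enum
from typing import Optional


class OS(Enum):
    FIRST = "first"
    SECOND = "second"


class OperatingSystemUnit:
    """Toy stand-in for the operating system unit 120 (hypothetical API)."""

    def __init__(self, booted: Optional[OS] = None):
        self.booted = booted
        self.log = []  # records the signals this unit has acted on

    def terminate(self):
        self.log.append(f"terminate {self.booted.value}")
        self.booted = None

    def boot(self, target: OS):
        self.log.append(f"boot {target.value}")
        self.booted = target


def apply_boot_signal(unit: OperatingSystemUnit, target: OS) -> None:
    """Bring the unit to the requested OS, shutting down a mismatched one first."""
    if unit.booted is target:
        return  # the requested OS is already running; nothing to do
    if unit.booted is not None:
        unit.terminate()  # termination signal for the mismatched OS
    if unit.booted is None:
        unit.boot(target)  # forward the boot signal only once shutdown is confirmed
```

For example, a unit currently running the second operating system that receives a request for the first one records a termination followed by a boot, while a request for the already running operating system records nothing.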

Meanwhile, the GPU resource management system 100 according to an embodiment of the present invention may be implemented with Kubernetes.

Kubernetes is open-source software that supports deploying and managing containerized applications at scale, and serves as an open-source platform for efficiently managing container virtualization and cluster configuration. Kubernetes manages a cluster of compute instances (for example, Amazon EC2 instances) and runs containers on those instances through processes of deployment, maintenance, and scaling.

A Kubernetes cluster is a logical group of compute instances (for example, EC2 instances) that run containers. A cluster consists of a control plane (the instances that control when, how, and where containers run) and a data plane (the instances on which containers run).

In an exemplary embodiment, the termination signal may be a cordon (Kubernetes cordon) signal and a drain (Kubernetes drain) signal. The cordon signal may prevent work commands from being assigned to the worker node 130, and the drain signal may cause the operating system unit 120 to shut down.
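As a rough illustration of how cordon and drain signals might be issued, the sketch below builds the corresponding `kubectl cordon` and `kubectl drain` command lines. Mapping the termination signal to these CLI calls is an assumption; the text does not say how the signals are sent, and the function name `termination_commands` is hypothetical. Note that in stock Kubernetes, cordon marks a node unschedulable and drain evicts the pods running on it; any shutdown of the operating system itself would need an additional mechanism that is not described here.

```python
def termination_commands(node_name, ignore_daemonsets=True):
    """Return the kubectl argv lists for cordoning and then draining a node."""
    if not node_name:
        raise ValueError("node_name must be non-empty")
    # Cordon first so that no new work is scheduled onto the node.
    cordon = ["kubectl", "cordon", node_name]
    # Then drain to evict the pods that are already running there.
    drain = ["kubectl", "drain", node_name]
    if ignore_daemonsets:
        # DaemonSet-managed pods cannot be evicted; skip them instead of failing.
        drain.append("--ignore-daemonsets")
    return [cordon, drain]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` on a machine with cluster access. Keeping command construction separate from execution makes the sequence easy to inspect and test without a live cluster.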

The operating system unit 120 may be installed in a memory. The operating systems may include the first operating system 121 and the second operating system 122, which may be different operating systems. For example, each may be a type of operating system such as Linux, Windows, or Unix, and may be Ubuntu, CentOS, RHEL, or the like. The memory may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card-type memory (for example, SD or XD memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, and an optical disk.

In an exemplary embodiment, when the first boot signal is input from the master node 110, the operating system unit 120 may boot the first operating system 121. When the second boot signal is input from the master node 110, the operating system unit 120 may boot the second operating system 122. When a termination signal is input from the master node 110, the booted operating system unit 120 may be shut down.

The worker node 130 may receive the user's work command from the master node 110 by way of the operating system unit 120. When the first operating system 121 is booted, the worker node 130 may allocate the user's work command to the real GPU 140. The allocated work command may then be executed on the real GPU 140.

FIG. 2 is a configuration diagram illustrating the GPU resource management system 100 using a virtualization system according to an embodiment of the present invention.

Referring to FIG. 2, when the second operating system 122 is booted, the worker node 130 may allocate the user's work command to the virtual GPU 140. The allocated work command may be executed on the virtual GPU 140. For example, a GPU virtual driver 131 may be installed on the worker node 130. A virtualization system divides physical hardware into multiple pieces of virtual hardware so that many of them can be operated at once. The GPU virtual driver 131 may be NVIDIA GRID, which supports virtualization of the real GPU 140. NVIDIA GRID can virtualize the GPU 140 at the hardware level, create virtual GPUs 140, and assign them to virtual machines 132. That is, the worker node 130 may implement a virtualization system by assigning each software-implemented virtual GPU 140 to one of a plurality of virtual machines 132 that perform specific tasks.

Each of the plurality of virtual machines 132 is allocated at least one of the slots of the virtual GPU 140 memory and uses the memory of the virtual GPU 140 to carry out the work command assigned to that virtual machine 132.

Referring again to FIG. 1, the GPU 140 is a computing device component for fast graphics rendering. Because rendering requires computing the same independent algorithm for every pixel, the GPU 140 is optimized for parallel execution. Today, owing to this characteristic, the GPU 140 is also used for general-purpose computing.

For the GPU 140, a task may be allocated to the memory of the real GPU 140, or to the memory of a virtual GPU 140 implemented by partitioning the real GPU 140 in software.

The real GPU 140 may be assigned a work command from the worker node 130 when the first operating system 121 is booted. The virtual GPU 140 may be assigned a work command from the worker node 130 when the second operating system 122 is booted.

The virtual GPU 140 includes a plurality of slots, and the plurality of slots of the virtual GPU 140 are respectively allocated to the plurality of virtual machines 132. For example, the real GPU 140 may be virtualized into eight virtual GPUs 140. Although the present invention has been described with the plurality of virtual machines 132 allocated to the virtual GPU 140 in equal proportions, the invention is not limited thereto, and the virtual machines 132 may be allocated to the virtual GPU 140 in mutually different proportions.
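The slot allocation described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the eight-slot figure used in the example, and the representation of a slot as an integer index are all assumptions made for illustration. It shows the equal-proportion case and an unequal-proportion case, with every virtual machine receiving at least one slot.

```python
def equal_slot_counts(vm_names, num_slots):
    """Give every VM the same number of slots (the equal-proportion case)."""
    if not vm_names or num_slots % len(vm_names) != 0:
        raise ValueError("slots must divide evenly among the VMs")
    per_vm = num_slots // len(vm_names)
    return {name: per_vm for name in vm_names}


def allocate_slots(num_slots, slot_counts):
    """Map each slot index of the virtual GPU to the VM that owns it.

    slot_counts maps a (hypothetical) VM name to the number of slots it gets.
    """
    if any(count < 1 for count in slot_counts.values()):
        raise ValueError("each VM must receive at least one slot")
    if sum(slot_counts.values()) > num_slots:
        raise ValueError("more slots requested than the virtual GPU has")
    owners = {}
    next_slot = 0
    for vm_name, count in slot_counts.items():
        for _ in range(count):
            owners[next_slot] = vm_name
            next_slot += 1
    return owners


# Equal proportions: four VMs share eight slots, two slots each.
equal = allocate_slots(8, equal_slot_counts(["vm0", "vm1", "vm2", "vm3"], 8))
# Different proportions: two VMs split the same eight slots 5 to 3.
uneven = allocate_slots(8, {"vm0": 5, "vm1": 3})
```

Passing explicit per-VM slot counts keeps the equal and unequal cases on the same code path, which mirrors the statement that the allocation ratio is not limited to equal proportions.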

FIG. 3 is a flowchart illustrating a GPU resource management method according to an embodiment of the present invention. As described above, the GPU resource management method according to an embodiment of the present invention may be performed in a computing device 12 having one or more processors and a memory storing one or more programs executed by the one or more processors. To this end, the GPU resource management method may be implemented in the form of a program or software including one or more computer-executable instructions and stored in the memory.

In addition, although the illustrated flowchart describes the method as a plurality of steps, at least some of the steps may be performed in a different order, combined with other steps and performed together, omitted, divided into sub-steps, or performed with one or more additional steps that are not shown.

In step 302, the master node 110 receives a work command (a graphics processing command performed using GPU resources) from a user.

In step 304, the master node 110 generates a booting signal based on the received work command. Specifically, the master node 110 may determine whether to virtualize the GPU 140 based on the quantity of the received work commands and their usage (the GPU memory usage according to the work commands), and may generate a booting signal that causes the operating system unit 120 to boot accordingly.
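The decision in step 304 can be sketched as a small Python function. This is a hedged illustration only. The threshold rule is taken from the dependent claims (do not virtualize when the quantity is at or below a first criterion or the memory usage exceeds a second criterion), but the two claimed conditions overlap, and the source does not say which one wins. This sketch assumes the non-virtualized case takes precedence. The names `BootSignal` and `generate_boot_signal` and the sample threshold values are hypothetical.

```python
from enum import Enum


class BootSignal(Enum):
    FIRST = 1   # boot the first operating system (work runs on the real GPU)
    SECOND = 2  # boot the second operating system (work runs on a virtual GPU)


def generate_boot_signal(command_count, memory_usage,
                         first_criterion, second_criterion):
    """Choose a booting signal from the quantity and memory usage of the work.

    Assumption: when both claimed conditions could apply, the
    non-virtualized (first operating system) case takes precedence.
    """
    few_commands = command_count <= first_criterion
    heavy_memory = memory_usage > second_criterion
    if few_commands or heavy_memory:
        return BootSignal.FIRST
    return BootSignal.SECOND


# Hypothetical thresholds: 4 commands, 8000 MB of GPU memory.
single_heavy_job = generate_boot_signal(1, 12000, 4, 8000)
many_light_jobs = generate_boot_signal(10, 500, 4, 8000)
```

Under these assumed thresholds, a single memory-heavy job selects the first operating system, while many light jobs select the second operating system so that they can share the GPU through virtualization.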

In step 306, the master node 110 detects the operating system unit 120 and compares the generated booting signal with the operating system booted in the operating system unit 120. Specifically, when the master node 110 confirms that the operating system corresponding to the generated booting signal is already booted in the operating system unit 120, it may transmit the work command to the operating system unit 120. When the master node 110 has generated the first booting signal and confirms that the second operating system 122 is booted, it may transmit a termination signal to the operating system unit 120 so that the second operating system 122 is terminated. When the master node 110 confirms that the second operating system 122 has been terminated, it may transmit the first booting signal to the operating system unit 120 so that the first operating system 121 is booted. Likewise, when the master node 110 has generated the second booting signal and confirms that the first operating system 121 is booted, it may transmit a termination signal to the operating system unit 120 so that the first operating system 121 is terminated. When the master node 110 confirms that the first operating system 121 has been terminated, it may transmit the second booting signal to the operating system unit 120 so that the second operating system 122 is booted.
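The comparison in step 306 amounts to a small reconciliation routine, sketched below in Python. The function name `reconcile`, the string labels for operating systems and actions, and the handling of the case where no operating system is booted yet (`None`) are assumptions for illustration; the source describes only the matching and mismatching cases.

```python
def reconcile(requested_os, booted_os):
    """Return the ordered actions the master node takes in step 306.

    requested_os: "first" or "second", the OS the booting signal asks for.
    booted_os: "first", "second", or None if nothing is booted (assumed case).
    """
    if requested_os not in ("first", "second"):
        raise ValueError("unknown operating system: %r" % (requested_os,))
    if booted_os == requested_os:
        # The right operating system is already running: forward the work.
        return ["forward_work_command"]
    actions = []
    if booted_os is not None:
        # A different operating system is running: terminate it first and
        # only proceed once termination has been confirmed.
        actions.append("send_termination_signal")
        actions.append("confirm_terminated")
    actions.append("send_boot_signal:" + requested_os)
    return actions


switch_to_first = reconcile("first", "second")
already_running = reconcile("second", "second")
```

Returning a list of actions rather than performing them keeps the sketch free of any real boot or shutdown mechanism, which the source does not specify.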

In step 308, the operating system unit 120 boots an operating system according to the booting signal input from the master node 110.

In step 310, the worker node 130 receives the user's work command from the master node 110 via the operating system unit 120 and allocates it to the GPU 140. Specifically, when the first operating system 121 is booted, the worker node 130 may allocate the user's work command to the real GPU 140, and the allocated work command may be executed through the real GPU 140. When the second operating system 122 is booted, the worker node 130 may allocate the user's work command to the virtual GPU 140, and the allocated work command may be executed through the virtual GPU 140.
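The dispatch in step 310 can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the function name `assign_commands`, the `"real_gpu"` key, and the round-robin distribution of commands across virtual machines are hypothetical choices, since the source does not state how commands are spread over the virtual machines.

```python
def assign_commands(booted_os, commands, vm_names=()):
    """Assign work commands to the real GPU or to the VMs of the virtual GPU.

    booted_os: "first" (real GPU) or "second" (virtual GPU).
    vm_names: VMs attached to the virtual GPU slots (needed for "second").
    """
    if booted_os == "first":
        # First operating system: all work goes to the real GPU.
        return {"real_gpu": list(commands)}
    if booted_os == "second":
        if not vm_names:
            raise ValueError("virtual GPU mode needs at least one VM")
        # Second operating system: spread work over the VMs.
        # Round-robin order is an assumption of this sketch.
        assignment = {vm: [] for vm in vm_names}
        for index, command in enumerate(commands):
            assignment[vm_names[index % len(vm_names)]].append(command)
        return assignment
    raise ValueError("unknown operating system: %r" % (booted_os,))


on_real_gpu = assign_commands("first", ["render_a", "render_b"])
on_virtual_gpu = assign_commands("second", ["render_a", "render_b", "render_c"],
                                 ["vm0", "vm1"])
```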

FIG. 4 is a block diagram illustrating a computing environment 10 including a computing device suitable for use in exemplary embodiments. In the illustrated embodiment, each component may have functions and capabilities different from those described below, and additional components beyond those described below may be included.

The illustrated computing environment 10 includes a computing device 12. In one embodiment, the computing device 12 may be a device for performing GPU resource management according to an embodiment of the present invention.

The computing device 12 includes at least one processor 14, a computer-readable storage medium 16, and a communication bus 18. The processor 14 may cause the computing device 12 to operate according to the exemplary embodiments described above. For example, the processor 14 may execute one or more programs stored in the computer-readable storage medium 16. The one or more programs may include one or more computer-executable instructions, which, when executed by the processor 14, may be configured to cause the computing device 12 to perform operations according to the exemplary embodiments.

The computer-readable storage medium 16 is configured to store computer-executable instructions or program code, program data, and/or other suitable forms of information. A program 20 stored in the computer-readable storage medium 16 includes a set of instructions executable by the processor 14. In one embodiment, the computer-readable storage medium 16 may be memory (volatile memory such as random access memory, non-volatile memory, or a suitable combination thereof), one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, any other form of storage medium that can be accessed by the computing device 12 and can store the desired information, or a suitable combination thereof.

The communication bus 18 interconnects the various components of the computing device 12, including the processor 14 and the computer-readable storage medium 16.

The computing device 12 may also include one or more input/output interfaces 22, which provide an interface for one or more input/output devices 24, and one or more network communication interfaces 26. The input/output interface 22 and the network communication interface 26 are connected to the communication bus 18. The input/output device 24 may be connected to other components of the computing device 12 through the input/output interface 22. Exemplary input/output devices 24 may include input devices such as a pointing device (a mouse, a trackpad, or the like), a keyboard, a touch input device (a touchpad, a touchscreen, or the like), a voice or sound input device, various types of sensor devices, and/or imaging devices, and/or output devices such as a display device, a printer, a speaker, and/or a network card. An exemplary input/output device 24 may be included inside the computing device 12 as one of its components, or may be connected to the computing device 12 as a separate device distinct from the computing device 12.

Although representative embodiments of the present invention have been described in detail above, those of ordinary skill in the art to which the present invention pertains will understand that various modifications can be made to the above-described embodiments without departing from the scope of the present invention. Therefore, the scope of the present invention should not be limited to the described embodiments, but should be defined by the claims below and their equivalents.

100: GPU resource management system
110: master node
120: operating system unit
121: first operating system
122: second operating system
130: worker node
140: GPU (Graphics Processing Unit)

Claims

A GPU resource management apparatus, which is a computing device having one or more processors and a memory storing one or more programs executed by the one or more processors, the apparatus comprising:
a master node that receives a work command from a user and generates a booting signal based on the received work command;
an operating system unit that boots a first operating system or a second operating system according to the booting signal input from the master node; and
a worker node that receives the work command from the master node and allocates it to a GPU (Graphics Processing Unit),
wherein the master node generates the booting signal by determining whether to virtualize the GPU based on one or more of the quantity and the memory usage of the received work command, and
wherein, when it is determined not to virtualize the GPU based on one or more of the quantity and the memory usage of the received work command, the master node generates a first booting signal to boot the first operating system.

(deleted)

The GPU resource management apparatus of claim 1,
wherein, when it is determined to virtualize the GPU based on one or more of the quantity and the memory usage of the received work command, the master node generates a second booting signal to boot the second operating system.

The GPU resource management apparatus of claim 3,
wherein the master node determines not to virtualize the GPU when the quantity of the received work command is less than or equal to a preset first criterion or the usage of the received work command exceeds a preset second criterion.

The GPU resource management apparatus of claim 3,
wherein the master node determines to virtualize the GPU when the quantity of the received work command exceeds a preset first criterion or the usage of the received work command is less than or equal to a preset second criterion.

The GPU resource management apparatus of claim 3,
wherein the worker node comprises:
a GPU virtual driver that virtualizes the GPU into a virtual GPU including a plurality of slots; and
a plurality of virtual machines that are respectively connected to the plurality of slots and allocate the received work command to the plurality of slots of the virtual GPU.

The GPU resource management apparatus of claim 3,
wherein the master node detects the operating system unit and compares the generated booting signal with the operating system booted in the operating system unit, and
when the booting signal differs from the booted operating system, generates a termination signal to terminate the booted operating system.

The GPU resource management apparatus of claim 7,
wherein, when the master node confirms that the second operating system is booted after generating the first booting signal, the master node transmits a termination signal to the operating system unit so that the second operating system is terminated, and when the second operating system has been terminated, transmits the first booting signal to the operating system unit so that the first operating system is booted.

The GPU resource management apparatus of claim 7,
wherein, when the master node confirms that the first operating system is booted after generating the second booting signal, the master node transmits a termination signal to the operating system unit so that the first operating system is terminated, and when the first operating system has been terminated, transmits the second booting signal to the operating system unit so that the second operating system is booted.

A GPU resource management method performed in a computing device having one or more processors and a memory storing one or more programs executed by the one or more processors, the method comprising:
receiving, at a master node, a work command from a user;
generating, at the master node, a booting signal based on the received work command;
booting, at an operating system unit, a first operating system or a second operating system according to the booting signal input from the master node; and
receiving, at a worker node, the work command from the master node and allocating it to a GPU (Graphics Processing Unit),
wherein the generating of the booting signal comprises:
determining whether to virtualize the GPU based on one or more of the quantity and the memory usage of the received work command to generate the booting signal; and
when it is determined not to virtualize the GPU based on one or more of the quantity and the memory usage of the received work command, generating a first booting signal to boot the first operating system.

(deleted)

The GPU resource management method of claim 10,
wherein the generating of the first booting signal comprises
determining not to virtualize the GPU when the quantity of the received work command is less than or equal to a preset first criterion or the usage of the received work command exceeds a preset second criterion.

The GPU resource management method of claim 10,
wherein the generating of the booting signal further comprises
generating a second booting signal to boot the second operating system when it is determined to virtualize the GPU based on one or more of the quantity and the memory usage of the received work command.

The GPU resource management method of claim 14,
wherein the generating of the second booting signal comprises
determining to virtualize the GPU when the quantity of the received work command exceeds a preset first criterion or the usage of the received work command is less than or equal to a preset second criterion.

The GPU resource management method of claim 14,
wherein the allocating to the GPU comprises:
virtualizing, through a GPU virtual driver in the worker node, the GPU into a virtual GPU including a plurality of slots; and
allocating, through a plurality of virtual machines in the worker node, the received work command to the plurality of slots of the virtual GPU, respectively,
wherein the plurality of virtual machines are respectively connected to the plurality of slots.

The GPU resource management method of claim 10,
further comprising, after the generating of the booting signal and before the booting:
detecting the operating system unit;
comparing the generated booting signal with the operating system booted in the operating system unit; and
when the booting signal and the booted operating system differ from each other, generating a termination signal to terminate the booted operating system.