KR20200052417A

KR20200052417A - Apparatus and method for selecting inference module of target device

Info

Publication number: KR20200052417A
Application number: KR1020180128249A
Authority: KR
Inventors: 윤석진; 박재복; 유승목; 이경희; 조창식
Original assignee: 한국전자통신연구원
Priority date: 2018-10-25
Filing date: 2018-10-25
Publication date: 2020-05-15

Abstract

Disclosed are an apparatus and a method for selecting an inference module of a target device. The method, performed by the apparatus for selecting an inference module of a target device, comprises the steps of: calculating a calculation amount of an artificial neural network model; calculating, based on the calculation amount, a calculation cost of a target device that performs inference using the artificial neural network model; and selecting, based on the calculation cost of the target device, a module of the target device that performs the inference using the artificial neural network model. Inference can thereby be performed in a manner optimized for the target device, based on information about the target device.

Description

Apparatus and method for selecting inference module of target device {APPARATUS AND METHOD FOR SELECTING INFERENCE MODULE OF TARGET DEVICE}

The present invention relates to a technique for selecting an inference module of a target device and, more particularly, to a technique that, when an artificial neural network model is loaded for inference, allows the inference to be performed using a calculation accelerator based on information about the target device.

In general, in the field of artificial intelligence, inference with an artificial neural network model is performed by adding a calculation accelerator to the CPU. Likewise, during AI development, a neural network model is generated using a CPU and a calculation accelerator.

FIG. 1 is a diagram illustrating an artificial neural network development process according to the prior art, and FIG. 2 is a diagram illustrating an inference process using an artificial neural network according to the prior art.

As shown in FIG. 1, the artificial neural network generating apparatus 10 includes a high-performance CPU 11 and a calculation accelerator 13, and the neural network learning unit 15 uses the high-performance CPU 11 and the calculation accelerator 13 to perform neural network training on the training data 30. In this way, the artificial neural network generating apparatus 10 can generate the neural network model 40.

The artificial neural network inference device 20 of FIG. 2 is a target device that loads the neural network model 40 and performs inference. It performs neural network inference 25 using a low-performance CPU 21 and a calculation accelerator 23.

The neural network inference unit 25 of the artificial neural network inference device 20 may apply actual data 50 to the neural network model 40 to perform neural network inference and generate an inference result 60. Here, the CPU 21 and the calculation accelerator 23 included in the artificial neural network inference device 20 of FIG. 2 have lower performance than the CPU 11 and the calculation accelerator 13 of FIG. 1.

The artificial neural network generating apparatus 10 develops the artificial neural network without considering the environment in which it will run, and the network, once its function has been verified, undergoes a tuning process for the artificial neural network inference device 20.

Whereas the execution environment of the artificial neural network generating apparatus 10 used during development is high-end, the artificial neural network inference device 20 on which the network will actually be used often lacks hardware computing performance.

In particular, when data transfer between the host and the calculation accelerator 23 is slow, it may be more efficient for the artificial neural network inference device 20 to process directly on the CPU 21 without using the calculation accelerator 23. In addition, when the size that the calculation accelerator 23 can process at once is small, the data must be split and transferred in parts, so the number of data transfers between the host and the artificial neural network inference device 20 increases and the overall transfer speed may decrease.

Accordingly, there is a need for a technique that overcomes the problems of the prior art, which arise because models are generated during AI development without considering the execution environment, and that can improve the inference speed of the target device.

Korean Registered Patent No. 10-1543969, published August 11, 2015 (Title: CPU control method and apparatus for improving application processing speed and power consumption)

An object of the present invention is to perform inference in a manner optimal for a target device, based on information about the target device.

Another object of the present invention is to enable an artificial neural network model, generated at a development stage in which the form of the target device is difficult to predict, to be applied in a form suitable for various target devices.

A further object of the present invention is to reduce the trial and error involved in developing an artificial neural network suited to a target device when developing artificial neural networks in the field of artificial intelligence.

To achieve the above objects, a method for selecting an inference module of a target device according to the present invention, performed by an apparatus for selecting an inference module of a target device, includes: calculating a calculation amount of an artificial neural network model; calculating, based on the calculation amount, a calculation cost of a target device that performs inference using the artificial neural network model; and selecting, based on the calculation cost of the target device, a module of the target device that performs the inference using the artificial neural network model.

Here, the selecting of the module of the target device may select, based on the calculated calculation cost, at least one of a CPU and a calculation accelerator of the target device as the module of the target device that is to perform inference using the artificial neural network model.

Here, the calculating of the calculation cost of the target device may calculate a calculation cost that includes at least one of a CPU calculation cost, which is the calculation cost when the CPU of the target device performs the inference, and a calculation accelerator cost, which is the calculation cost when the calculation accelerator of the target device performs the inference.

Here, the selecting of the module of the target device may select at least one module from among the CPU and the calculation accelerator of the target device, based on a result of comparing the CPU calculation cost with the calculation accelerator cost.

Here, the selecting of the module of the target device may select the CPU as the module of the target device that performs the inference when the CPU calculation cost is less than the calculation accelerator cost, and may select the calculation accelerator as the module of the target device that performs the inference when the calculation accelerator cost is less than the CPU calculation cost.

Here, the selecting of the module of the target device may select at least one module from among the CPU and the calculation accelerator, based on a result of comparing at least one of the CPU calculation cost and the calculation accelerator cost with a threshold value.

Here, the calculating of the calculation cost of the target device may calculate a calculation cost that includes at least one of the CPU calculation cost and the calculation accelerator cost, using information about the target device that was stored when an inference engine that performs the inference was installed on the target device.

Here, the information about the target device may include at least one of an average calculation time of the CPU of the target device, an average calculation time of the calculation accelerator of the target device, a simultaneous calculation capacity of the calculation accelerator, and a transfer speed between the calculation accelerator and host memory.

Here, the calculating of the calculation cost of the target device may calculate the CPU calculation cost based on the average calculation time of the CPU and the calculation amount.

Here, the calculating of the calculation cost of the target device may calculate the calculation accelerator cost based on at least one of the transfer speed between the calculation accelerator and the host memory, the simultaneous calculation capacity of the calculation accelerator, and the average calculation time of the calculation accelerator.

In addition, an apparatus for selecting an inference module of a target device according to an embodiment of the present invention includes: a calculation amount calculating unit that calculates a calculation amount of an artificial neural network model; a calculation cost calculating unit that calculates, based on the calculation amount, a calculation cost of a target device that performs inference using the artificial neural network model; and an inference module selecting unit that selects, based on the calculation cost of the target device, a module of the target device that performs the inference using the artificial neural network model.

Here, the inference module selecting unit may select, based on the calculated calculation cost, at least one of a CPU and a calculation accelerator of the target device as the module of the target device that is to perform inference using the artificial neural network model.

Here, the calculation cost calculating unit may calculate a calculation cost that includes at least one of a CPU calculation cost, which is the calculation cost when the CPU of the target device performs the inference, and a calculation accelerator cost, which is the calculation cost when the calculation accelerator of the target device performs the inference.

Here, the inference module selecting unit may select at least one module from among the CPU and the calculation accelerator of the target device, based on a result of comparing the CPU calculation cost with the calculation accelerator cost.

Here, the inference module selecting unit may select the CPU as the module of the target device that performs the inference when the CPU calculation cost is less than the calculation accelerator cost, and may select the calculation accelerator as the module of the target device that performs the inference when the calculation accelerator cost is less than the CPU calculation cost.

Here, the inference module selecting unit may select at least one module from among the CPU and the calculation accelerator, based on a result of comparing at least one of the CPU calculation cost and the calculation accelerator cost with a threshold value.

Here, the apparatus may further include a storage unit that stores information about the target device when an inference engine that performs the inference is installed on the target device, and the calculation cost calculating unit may use the information about the target device to calculate a calculation cost that includes at least one of the CPU calculation cost and the calculation accelerator cost.

Here, the storage unit may store information about the target device that includes at least one of an average calculation time of the CPU of the target device, an average calculation time of the calculation accelerator of the target device, a simultaneous calculation capacity of the calculation accelerator, and a transfer speed between the calculation accelerator and host memory.

Here, the calculation cost calculating unit may calculate the CPU calculation cost based on the average calculation time of the CPU and the calculation amount.

Here, the calculation cost calculating unit may calculate the calculation accelerator cost based on at least one of the transfer speed between the calculation accelerator and the host memory, the simultaneous calculation capacity of the calculation accelerator, and the average calculation time of the calculation accelerator.

According to the present invention, inference can be performed in a manner optimal for a target device, based on information about the target device.

In addition, according to the present invention, an artificial neural network model generated at a development stage in which the form of the target device is difficult to predict can be applied in a form suitable for various target devices.

In addition, according to the present invention, the trial and error involved in developing an artificial neural network suited to a target device can be reduced when developing artificial neural networks in the field of artificial intelligence.

FIG. 1 is a diagram illustrating an artificial neural network development process according to the prior art.
FIG. 2 is a diagram illustrating an inference process using an artificial neural network according to the prior art.
FIG. 3 is a block diagram showing the configuration of an apparatus for selecting an inference module of a target device according to an embodiment of the present invention.
FIG. 4 is a flowchart illustrating a method for selecting an inference module of a target device according to an embodiment of the present invention.
FIG. 5 is a flowchart illustrating a process of storing information about a target device according to an embodiment of the present invention.
FIG. 6 is a block diagram showing a computer system according to an embodiment of the present invention.

The present invention may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail.

However, this is not intended to limit the present invention to specific embodiments, and it should be understood to include all modifications, equivalents, and substitutes falling within the spirit and technical scope of the present invention.

The terms used in the present application are used only to describe specific embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, terms such as "include" or "have" are intended to indicate that a feature, number, step, operation, component, part, or combination thereof described in the specification exists, and should be understood not to exclude in advance the existence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical or scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention pertains. Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and are not to be interpreted in an idealized or excessively formal sense unless explicitly so defined in the present application.

Hereinafter, preferred embodiments of the present invention will be described in more detail with reference to the accompanying drawings. To facilitate overall understanding, the same reference numerals are used for the same components in the drawings, and duplicate descriptions of the same components are omitted.

Hereinafter, the configuration of an apparatus for selecting an inference module of a target device according to an embodiment of the present invention will be described in detail with reference to FIG. 3.

FIG. 3 is a block diagram showing the configuration of an apparatus for selecting an inference module of a target device according to an embodiment of the present invention.

As shown in FIG. 3, the apparatus 100 for selecting an inference module of a target device according to an embodiment of the present invention includes a calculation amount calculating unit 110, a calculation cost calculating unit 120, an inference module selecting unit 130, and a storage unit 140.

First, the calculation amount calculating unit 110 calculates the calculation amount of the artificial neural network model. Here, the artificial neural network model refers to a neural network in the field of artificial intelligence, and the calculation amount calculating unit 110 may calculate the calculation amount based on the size of the artificial neural network model loaded by the apparatus 100 for selecting an inference module of a target device.

Here, the calculation amount calculating unit 110 may obtain the calculation amount required for the artificial neural network model by multiplying the number of nodes represented in the model by the number of connections. The calculation amount obtained by the calculation amount calculating unit 110 is a maximum; the actual calculation amount may decrease in the course of performing inference. Nevertheless, the apparatus 100 for selecting an inference module of a target device according to an embodiment of the present invention may derive the calculation cost, described later, from this maximum calculation amount.
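The node-times-connection rule described above can be sketched in a few lines. This is a minimal illustration, not the patented implementation; the function name `max_calculation_amount` and the example counts are hypothetical.

```python
def max_calculation_amount(num_nodes: int, num_connections: int) -> int:
    """Upper bound on the calculation amount: node count times connection count."""
    if num_nodes < 0 or num_connections < 0:
        raise ValueError("counts must be non-negative")
    return num_nodes * num_connections


# Hypothetical model with 1,000 nodes and 5,000 connections.
amount = max_calculation_amount(1_000, 5_000)
print(amount)  # 5000000
```

Because the result is a maximum, any cost derived from it is a conservative estimate of the work actually done during inference.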

Next, the calculation cost calculating unit 120 calculates, based on the calculated calculation amount, the calculation cost of the target device, which includes a CPU calculation cost and a calculation accelerator cost. Here, the target device refers to a device that is equipped with an inference engine and performs inference using the artificial neural network model.

The calculation cost calculating unit 120 may calculate the CPU calculation cost and the calculation accelerator cost using the information about the target device and the calculation amount corresponding to the artificial neural network model. Here, the calculation cost calculating unit 120 may calculate, as the CPU calculation cost, the calculation cost when the CPU of the target device performs the inference, and may calculate, as the calculation accelerator cost, the calculation cost when the calculation accelerator of the target device performs the inference.

In particular, as in Equation 1 below, the calculation cost calculating unit 120 may calculate the CPU calculation cost based on the calculation amount and on the CPU calculation speed, which corresponds to the average calculation time of the CPU in the information about the target device.

[Equation 1]

Here, the CPU calculation speed may be a value derived from the average calculation time of the CPU, and the average calculation time of the CPU may be obtained by running a benchmark program.
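The body of Equation 1 is not reproduced in this text, so the following is only a sketch of one plausible form. It assumes that the CPU calculation speed is the reciprocal of the benchmarked average time per operation and that the CPU calculation cost is the calculation amount divided by that speed. The function names and the choice of seconds as the unit are hypothetical.

```python
def cpu_calculation_speed(avg_time_per_op_s: float) -> float:
    """Assumed: speed (operations per second) is the reciprocal of the average time."""
    if avg_time_per_op_s <= 0:
        raise ValueError("average time must be positive")
    return 1.0 / avg_time_per_op_s


def cpu_calculation_cost(calculation_amount: int, avg_time_per_op_s: float) -> float:
    """Assumed form of the cost: calculation amount divided by CPU speed (seconds)."""
    return calculation_amount / cpu_calculation_speed(avg_time_per_op_s)


# Hypothetical figures: 5,000,000 operations at 2 ns per operation on average.
cost = cpu_calculation_cost(5_000_000, 2e-9)
print(f"estimated CPU cost: {cost:.4f} s")
```

Under these assumptions the cost is an estimated run time, which makes it directly comparable with an accelerator estimate expressed in the same unit.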

In addition, as in Equation 2 below, the calculation cost calculating unit 120 may calculate the calculation accelerator cost based on at least one of the following items in the information about the target device: the transfer speed between host memory and the calculation accelerator, the simultaneous calculation capacity of the calculation accelerator, and the average calculation time of the calculation accelerator.

[Equation 2]

Here, the calculation speed of the calculation accelerator may be a value derived from the average calculation time of the calculation accelerator, and the average calculation time of the calculation accelerator may be the mean of the times obtained by performing a calculation a preset number of times. For example, the calculation cost calculating unit 120 may take the mean of the times obtained by performing the calculation about one million times as the calculation speed of the calculation accelerator.
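The body of Equation 2 is likewise not reproduced here. The sketch below shows one way the three listed quantities could combine, under stated assumptions: the input data is split into chunks no larger than the simultaneous calculation capacity, each chunk incurs a fixed per-transfer overhead, and the total cost is transfer time plus compute time. All names, the per-transfer overhead parameter, and the units are hypothetical and are not taken from the patent.

```python
def num_transfers(data_bytes: int, capacity_bytes: int) -> int:
    """Number of chunks needed when each transfer holds at most capacity_bytes."""
    if capacity_bytes <= 0:
        raise ValueError("capacity must be positive")
    return (data_bytes + capacity_bytes - 1) // capacity_bytes  # integer ceiling


def accelerator_cost(
    calculation_amount: int,
    avg_time_per_op_s: float,
    data_bytes: int,
    capacity_bytes: int,
    transfer_bytes_per_s: float,
    per_transfer_overhead_s: float = 0.0,  # assumed fixed latency per transfer
) -> float:
    """Assumed form: (bulk transfer time + per-chunk overhead) + compute time."""
    if transfer_bytes_per_s <= 0:
        raise ValueError("transfer speed must be positive")
    chunks = num_transfers(data_bytes, capacity_bytes)
    transfer_time = data_bytes / transfer_bytes_per_s + chunks * per_transfer_overhead_s
    compute_time = calculation_amount * avg_time_per_op_s
    return transfer_time + compute_time


# Hypothetical device: 1 MB capacity, 1 GB/s link, 0.1 ms overhead per transfer.
cost = accelerator_cost(1_000_000, 1e-9, 4_000_000, 1_000_000, 1e9, 1e-4)
print(f"estimated accelerator cost: {cost:.4f} s")
```

With a small capacity the chunk count grows, so the overhead term can dominate, which matches the earlier observation that a slow or narrow link may make direct CPU processing cheaper.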

Next, the inference module selecting unit 130 may select, based on the calculation cost of the target device, at least one of the CPU and the calculation accelerator of the target device as the module of the target device that is to perform the inference.

Here, the inference module selecting unit 130 may select the inference module, that is, the module of the target device that is to perform the inference, based on a result of comparing the CPU calculation cost with the calculation accelerator cost. For example, when the CPU calculation cost is less than the calculation accelerator cost, the inference module selecting unit 130 may select the CPU as the inference module. Conversely, when the calculation accelerator cost is less than the CPU calculation cost, the inference module selecting unit 130 may select the calculation accelerator as the inference module.

Here, the inference module selected by the inference module selecting unit 130 may refer to the module that plays the central role when the artificial neural network model is deployed to perform inference. For example, when the CPU is selected as the inference module, inference may be performed by calling a CPU calculation module, and when the calculation accelerator is selected as the inference module, inference may be performed using a preset module centered on the calculation accelerator.

For convenience of description, the inference module selecting unit 130 has been described as selecting the inference module by comparing the CPU calculation cost with the calculation accelerator cost. However, the invention is not limited thereto: the inference module selecting unit 130 may compare at least one of the CPU calculation cost and the calculation accelerator cost with a threshold value and select the inference module based on the result of that comparison.
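Both selection rules described above can be sketched as follows. The enum and function names are hypothetical. Two details are assumptions that the text does not specify: a tie between the two costs falls back to the CPU, and the threshold rule keeps the CPU whenever its cost does not exceed the threshold.

```python
from enum import Enum


class InferenceModule(Enum):
    CPU = "cpu"
    ACCELERATOR = "accelerator"


def select_by_comparison(cpu_cost: float, accelerator_cost: float) -> InferenceModule:
    """Pick the cheaper module; a tie falls back to the CPU (assumption)."""
    if accelerator_cost < cpu_cost:
        return InferenceModule.ACCELERATOR
    return InferenceModule.CPU


def select_by_threshold(cpu_cost: float, threshold: float) -> InferenceModule:
    """Assumed rule: stay on the CPU while its cost is within the threshold."""
    if cpu_cost <= threshold:
        return InferenceModule.CPU
    return InferenceModule.ACCELERATOR


print(select_by_comparison(0.010, 0.0054))  # InferenceModule.ACCELERATOR
print(select_by_threshold(0.010, 0.05))     # InferenceModule.CPU
```

The threshold variant needs only one cost estimate, which may be useful when the accelerator figures are unavailable.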

Finally, the storage unit 140 stores the information about the target device. Here, the storage unit 140 may store the information about the target device when the inference engine that performs the inference is installed on the target device.

For convenience of description, the inference engine that is mounted on the target device and performs the inference and the apparatus 100 for selecting an inference module of a target device have been described as separate devices. However, the invention is not limited thereto: when the inference engine and the apparatus 100 are substantially a single device, the storage unit 140 may store the information about the target device when the apparatus 100 is installed on the target device.

The storage unit 140 may store information about the target device that includes at least one of the average calculation time of the CPU included in the target device, the average calculation time of the calculation accelerator, the simultaneous calculation capacity of the calculation accelerator, and the transfer speed between the calculation accelerator and host memory.
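As an illustration of what the stored record might look like, the sketch below holds the four listed items in a dataclass, measures an average time by repeating an operation a preset number of times, and round-trips the record through JSON. The field names, units, JSON format, and the timing helper are all assumptions made for this example; the text does not specify a storage format.

```python
import json
import time
from dataclasses import asdict, dataclass
from typing import Callable


@dataclass
class TargetDeviceInfo:
    cpu_avg_time_s: float            # average CPU calculation time
    accel_avg_time_s: float          # average accelerator calculation time
    accel_capacity_bytes: int        # simultaneous calculation capacity
    transfer_bytes_per_s: float      # accelerator <-> host memory transfer speed


def average_time(op: Callable[[], object], repeats: int) -> float:
    """Mean wall-clock time of op over a preset number of runs."""
    if repeats <= 0:
        raise ValueError("repeats must be positive")
    start = time.perf_counter()
    for _ in range(repeats):
        op()
    return (time.perf_counter() - start) / repeats


def to_json(info: TargetDeviceInfo) -> str:
    return json.dumps(asdict(info))


def from_json(text: str) -> TargetDeviceInfo:
    return TargetDeviceInfo(**json.loads(text))


# Hypothetical install-time step: benchmark a toy operation, then store the record.
cpu_time = average_time(lambda: sum(range(100)), repeats=1_000)
info = TargetDeviceInfo(cpu_time, 1e-9, 1_000_000, 1e9)
restored = from_json(to_json(info))
print(restored == info)  # True
```

Measuring once at install time and reading the record back later keeps the benchmarking cost out of the inference path.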

Hereinafter, a method for selecting an inference module of a target device, performed by the apparatus for selecting an inference module of a target device according to an embodiment of the present invention, will be described in more detail with reference to FIG. 4.

FIG. 4 is a flowchart illustrating a method for selecting an inference module of a target device according to an embodiment of the present invention.

먼저, 타겟 디바이스의 추론 모듈 선택 장치(100)는 인공 신경망 모델의 계산량을 연산한다(S410). First, the apparatus 100 for selecting a reasoning module of a target device calculates a calculation amount of an artificial neural network model (S410).

The inference module selection apparatus 100 of the target device loads the artificial neural network model stored in a storage device. It then analyzes the loaded artificial neural network model and calculates the calculation amount of that model.

The inference module selection apparatus 100 of the target device may calculate the calculation amount of the artificial neural network model by multiplying the number of nodes of the model by the number of its connections.
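The calculation in step S410 can be sketched in Python as follows. The `ModelSummary` type and its field names are hypothetical and exist only for this example; how the node and connection counts are extracted from a loaded model is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSummary:
    """Hypothetical summary of a loaded artificial neural network model."""
    node_count: int
    connection_count: int


def calculation_amount(model: ModelSummary) -> int:
    # Step S410: number of nodes multiplied by number of connections.
    return model.node_count * model.connection_count


# Illustrative values only.
example_model = ModelSummary(node_count=1_000, connection_count=50_000)
amount = calculation_amount(example_model)
```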

Next, the inference module selection apparatus 100 of the target device calculates the calculation cost of the target device (S420).

The inference module selection apparatus 100 of the target device may calculate the CPU calculation cost and the calculation accelerator cost using the target device information and the calculation amount of the artificial neural network model.

The target device information may have been generated and stored when the inference engine was installed on the target device, and the inference module selection apparatus 100 may receive the stored target device information as input and calculate the calculation cost of the target device.

Here, the target device information means information on the device that performs inference using the artificial neural network model, and may include at least one of the average calculation time of the CPU of the target device, the average calculation time of the calculation accelerator, the simultaneous calculation capacity of the calculation accelerator, and the transfer speed between the calculation accelerator and host memory. The target device information is described in more detail below with reference to FIG. 5.

The inference module selection apparatus 100 of the target device may calculate the CPU calculation cost based on the average calculation time of the CPU and the calculation amount of the artificial neural network model. It may also calculate the calculation accelerator cost based on the transfer speed between the calculation accelerator and host memory, the simultaneous calculation capacity of the calculation accelerator, the average calculation time of the calculation accelerator, and the calculation amount of the artificial neural network model.
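One possible way to combine these quantities in step S420 is sketched below in Python. The exact formulas are assumptions for illustration: the CPU cost is taken as average time multiplied by calculation amount, and the accelerator cost as a transfer term plus the number of parallel rounds multiplied by the accelerator's average time. The `TargetDeviceInfo` type, its field names, and the `transfer_bytes` parameter are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetDeviceInfo:
    """Hypothetical container for the stored target device information."""
    cpu_avg_time: float           # seconds per calculation on the CPU
    accel_avg_time: float         # seconds per parallel round on the accelerator
    accel_parallel_capacity: int  # calculations the accelerator runs at once
    transfer_speed: float         # bytes per second, host memory <-> accelerator


def cpu_cost(info: TargetDeviceInfo, amount: int) -> float:
    # Assumed form: average CPU time scaled by the calculation amount.
    return info.cpu_avg_time * amount


def accelerator_cost(info: TargetDeviceInfo, amount: int, transfer_bytes: int) -> float:
    # Assumed form: time to move the data plus time for the parallel rounds.
    transfer_time = transfer_bytes / info.transfer_speed
    rounds = math.ceil(amount / info.accel_parallel_capacity)
    return transfer_time + rounds * info.accel_avg_time


# Illustrative values only.
info = TargetDeviceInfo(
    cpu_avg_time=1e-9,
    accel_avg_time=2e-9,
    accel_parallel_capacity=1_000,
    transfer_speed=1e9,
)
cost_on_cpu = cpu_cost(info, amount=50_000_000)
cost_on_accel = accelerator_cost(info, amount=50_000_000, transfer_bytes=4_000_000)
```

Under these assumed numbers the transfer term dominates the accelerator cost, which shows why a slow link between host memory and the accelerator can tip the comparison toward the CPU.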

For convenience of explanation, the inference module selection apparatus 100 of the target device has been described as calculating both the CPU calculation cost and the calculation accelerator cost. However, when the calculation cost of the target device is compared with a preset threshold in step S430, described below, the inference module selection apparatus 100 may calculate only one of the CPU calculation cost and the calculation accelerator cost.

Next, the inference module selection apparatus 100 of the target device compares the calculation costs of the target device (S430).

Here, the inference module selection apparatus 100 of the target device may compare the CPU calculation cost calculated in step S420 with the calculation accelerator cost.

When the comparison determines that the CPU calculation cost is greater than the calculation accelerator cost (S440 Yes), the inference module selection apparatus 100 of the target device may select the calculation accelerator as the inference module (S450).

Conversely, when the calculation accelerator cost is determined to be greater than the CPU calculation cost (S440 No), the inference module selection apparatus 100 of the target device may select the CPU as the inference module (S460).

Finally, after selecting the inference module, the inference module selection apparatus 100 of the target device may cause inference to be performed using the selected inference module (S470).

Here, the selected inference module means the module that plays the central role when inference is performed using the artificial neural network model. The inference module selection apparatus 100 of the target device may deploy the artificial neural network model so that the inference module of the target device can perform inference, or may call that inference module.

If the CPU calculation cost is less than the calculation accelerator cost, the inference engine installed on the target device may perform inference by calling the CPU calculation module. Conversely, if the calculation accelerator cost is less than the CPU calculation cost, the inference module selection apparatus 100 of the target device may deploy the artificial neural network model so that the inference engine can perform inference using a preset module centered on the calculation accelerator.
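Steps S430 to S470 can be sketched in Python as follows. The function names, the string labels, and the `runners` dictionary of stand-in back ends are hypothetical. Treating equal costs as the "No" branch of S440, and therefore selecting the CPU, is an assumption, since the description does not state how ties are handled.

```python
from typing import Callable, Dict, List, Tuple


def select_inference_module(cpu_cost: float, accel_cost: float) -> str:
    # S440: "Yes" when the CPU cost is greater than the accelerator cost.
    if cpu_cost > accel_cost:
        return "accelerator"  # S450
    return "cpu"              # S460 (equal costs land here by assumption)


def run_inference(
    cpu_cost: float,
    accel_cost: float,
    runners: Dict[str, Callable[[List[float]], List[float]]],
    inputs: List[float],
) -> Tuple[str, List[float]]:
    module = select_inference_module(cpu_cost, accel_cost)  # S430-S460
    return module, runners[module](inputs)                  # S470


# Hypothetical stand-ins for real CPU and accelerator inference back ends.
runners = {
    "cpu": lambda xs: [x * 2 for x in xs],
    "accelerator": lambda xs: [x * 2 for x in xs],
}
chosen, outputs = run_inference(0.05, 0.0041, runners, [1, 2, 3])
```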

In this way, the inference module selection apparatus 100 of the target device according to an embodiment of the present invention can cause inference to be performed in a form suited to the environment of the target device when a trained artificial neural network model is used.

That is, when creating an artificial neural network model, a developer can create the model without considering the target device to which it will be applied, which reduces trial and error during model creation.

Furthermore, even when the form of the target device cannot be predicted at the time the trained artificial neural network model is distributed, the model can be applied in a form suitable for the target device to perform inference.

Hereinafter, the process by which the inference module selection apparatus of the target device according to an embodiment of the present invention stores the target device information is described in more detail with reference to FIG. 5.

FIG. 5 is a flowchart illustrating a process of storing information on a target device according to an embodiment of the present invention.

The process of FIG. 5 for storing the target device information may be performed when the inference engine is installed on the target device.

As shown in FIG. 5, the inference module selection apparatus 100 of the target device calculates the average calculation time of the CPU (S510). It then calculates the average calculation time of the calculation accelerator (S520).

The inference module selection apparatus 100 of the target device may calculate the average calculation time of the CPU or of the calculation accelerator by means of a benchmark program. It may also perform a calculation a plurality of times to obtain the average calculation time of the CPU. For example, it may perform one million calculations to obtain the average calculation time of the CPU and the average calculation time of the calculation accelerator.
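A minimal Python sketch of such a benchmark is shown below. The function name and the sample operation are hypothetical, and the measured time includes Python loop and call overhead, so this only illustrates the averaging idea and is not a precise hardware benchmark. The example call uses a small repeat count so that it runs quickly; the default follows the one-million-calculation example in the description.

```python
import time
from typing import Callable


def average_calc_time(op: Callable[[], object], repeats: int = 1_000_000) -> float:
    """Run `op` `repeats` times and return the mean time per call in seconds."""
    if repeats <= 0:
        raise ValueError("repeats must be positive")
    start = time.perf_counter()
    for _ in range(repeats):
        op()
    return (time.perf_counter() - start) / repeats


# Hypothetical sample operation: one multiply-add on the CPU.
cpu_avg_time = average_calc_time(lambda: 3.0 * 7.0 + 1.0, repeats=10_000)
```

The same function could time an accelerator back end by passing a callable that launches the equivalent calculation on the accelerator and waits for it to finish.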

Next, the inference module selection apparatus 100 of the target device calculates the transfer speed between the calculation accelerator and host memory (S530), and calculates the simultaneous calculation capacity of the calculation accelerator (S540).

The inference module selection apparatus 100 of the target device may determine the time required to transfer data between the calculation accelerator and host memory, and from it calculate the transfer speed between the calculation accelerator and host memory.

In general, when memory is not shared, data stored in host memory must be transferred to the calculation accelerator. If transferring the data stored in host memory to the calculation accelerator takes a long time, the overall transfer speed is low, so it may be more efficient to perform the calculations on the CPU and infer without using the calculation accelerator.

Accordingly, the inference module selection apparatus 100 of the target device according to an embodiment of the present invention calculates the transfer speed between the calculation accelerator and host memory and stores it as target device information, thereby preventing in advance the performance degradation that would result from using the calculation accelerator when the transfer speed between host memory and the calculation accelerator is low.
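The transfer-speed measurement of step S530 can be sketched in Python as timing a copy of a known number of bytes. The `fake_copy` function is a hypothetical placeholder that only copies within host memory; on a real target device it would be replaced by the accelerator's own host-to-device transfer call, which is not specified here.

```python
import time
from typing import Callable


def measure_transfer_speed(copy_to_accelerator: Callable[[bytes], object],
                           payload: bytes) -> float:
    """Return the observed transfer speed in bytes per second."""
    start = time.perf_counter()
    copy_to_accelerator(payload)
    elapsed = time.perf_counter() - start
    if elapsed <= 0.0:
        # Timer resolution too coarse to observe the copy.
        return float("inf")
    return len(payload) / elapsed


def fake_copy(data: bytes) -> bytearray:
    # Placeholder: a host-side copy standing in for a real device transfer.
    return bytearray(data)


transfer_speed = measure_transfer_speed(fake_copy, bytes(1_000_000))
```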

Finally, the inference module selection apparatus 100 of the target device stores the calculation results as the target device information (S550).

The inference module selection apparatus 100 of the target device may store the target device information in the storage unit or in a separate storage device. The stored target device information may then be used when the inference module selection apparatus 100 loads an artificial neural network model and performs inference.
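Step S550 and the later reuse of the stored values could look like the following Python sketch. Storing the information as a JSON file, the file name `target_device_info.json`, and the `TargetDeviceInfo` field names are all assumptions made for illustration; the description only requires that the information be stored and read back later.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TargetDeviceInfo:
    """Hypothetical container for the four measured quantities."""
    cpu_avg_time: float
    accel_avg_time: float
    accel_parallel_capacity: int
    transfer_speed: float


def save_device_info(info: TargetDeviceInfo, path: str) -> None:
    # S550: persist the measurement results at install time.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(asdict(info), f)


def load_device_info(path: str) -> TargetDeviceInfo:
    # Read the stored information back when a model is loaded for inference.
    with open(path, encoding="utf-8") as f:
        return TargetDeviceInfo(**json.load(f))


# Illustrative values only, written to a temporary directory.
info = TargetDeviceInfo(1e-9, 2e-9, 1_000, 1e9)
with tempfile.TemporaryDirectory() as tmp:
    info_path = os.path.join(tmp, "target_device_info.json")
    save_device_info(info, info_path)
    restored = load_device_info(info_path)
```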

For convenience of explanation, the inference module selection apparatus 100 of the target device has been described as performing the process of storing the target device information shown in FIG. 5. However, the invention is not limited thereto: the target device itself may perform the process of storing the target device information shown in FIG. 5 while the inference engine is being installed.

FIG. 6 is a block diagram showing a computer system according to an embodiment of the present invention.

Referring to FIG. 6, an embodiment of the present invention may be implemented in a computer system 600, such as a computer-readable recording medium. As shown in FIG. 6, the computer system 600 may include one or more processors 610, a memory 630, a user interface input device 640, a user interface output device 650, and storage 660, which communicate with one another via a bus 620. The computer system 600 may further include a network interface 670 connected to a network 680. The processor 610 may be a central processing unit or a semiconductor device that executes processing instructions stored in the memory 630 or the storage 660. The memory 630 and the storage 660 may be various types of volatile or nonvolatile storage media. For example, the memory may include a ROM 631 or a RAM 632.

Accordingly, an embodiment of the present invention may be implemented as a computer-implemented method or as a non-transitory computer-readable medium on which computer-executable instructions are recorded. When executed by a processor, the computer-readable instructions may perform a method according to at least one aspect of the present invention.

As described above, the apparatus and method for selecting an inference module of a target device according to the present invention are not limited to the configurations and methods of the embodiments described above. Rather, all or part of the embodiments may be selectively combined so that various modifications can be made.

10: artificial neural network generation device
11: high-performance CPU
13: high-performance calculation accelerator
15: neural network training unit
20: artificial neural network inference device
21: low-performance CPU
23: low-performance calculation accelerator
25: neural network inference unit
30: training data
40: neural network model
50: actual data
60: inference result
100: inference module selection apparatus of the target device
110: calculation amount calculation unit
120: calculation cost calculation unit
130: inference module selection unit
140: storage unit
600: computer system
610: processor
620: bus
630: memory
631: ROM
632: RAM
640: user interface input device
650: user interface output device
660: storage
670: network interface
680: network

Claims

A method for selecting an inference module of a target device, performed by an apparatus for selecting an inference module of a target device, the method comprising:
calculating a calculation amount of an artificial neural network model;
calculating, based on the calculation amount, a calculation cost of a target device that performs inference using the artificial neural network model; and
selecting, based on the calculation cost of the target device, a module of the target device that performs inference using the artificial neural network model.

The method of claim 1,
wherein the selecting of the module of the target device comprises
selecting, based on the calculated calculation cost, at least one of a CPU and a calculation accelerator of the target device as the module of the target device that performs inference using the artificial neural network model.

The method of claim 2,
wherein the calculating of the calculation cost of the target device comprises
calculating the calculation cost including at least one of a CPU calculation cost, which is the calculation cost when the CPU of the target device performs inference, and a calculation accelerator cost, which is the calculation cost when the calculation accelerator of the target device performs inference.

The method of claim 3,
wherein the selecting of the module of the target device comprises
selecting at least one of the CPU and the calculation accelerator of the target device based on a result of comparing the CPU calculation cost with the calculation accelerator cost.

The method of claim 4,
wherein the selecting of the module of the target device comprises
selecting the CPU as the module of the target device that performs the inference when the CPU calculation cost is less than the calculation accelerator cost, and selecting the calculation accelerator as the module of the target device that performs the inference when the calculation accelerator cost is less than the CPU calculation cost.

The method of claim 3,
wherein the selecting of the module of the target device comprises
selecting at least one of the CPU and the calculation accelerator based on a result of comparing at least one of the CPU calculation cost and the calculation accelerator cost with a threshold.

The method of claim 1,
wherein the calculating of the calculation cost of the target device comprises
calculating the calculation cost, including at least one of a CPU calculation cost and a calculation accelerator cost, using information on the target device that was stored when an inference engine that performs the inference was installed on the target device.

The method of claim 7,
wherein the information on the target device
includes at least one of an average calculation time of the CPU of the target device, an average calculation time of the calculation accelerator of the target device, a simultaneous calculation capacity of the calculation accelerator, and a transfer speed between the calculation accelerator and host memory.

The method of claim 8,
wherein the calculating of the calculation cost of the target device comprises
calculating the CPU calculation cost based on the average calculation time of the CPU and the calculation amount.

The method of claim 8,
wherein the calculating of the calculation cost of the target device comprises
calculating the calculation accelerator cost based on at least one of the transfer speed between the calculation accelerator and the host memory, the simultaneous calculation capacity of the calculation accelerator, and the average calculation time of the calculation accelerator.

An apparatus for selecting an inference module of a target device, the apparatus comprising:
a calculation amount calculation unit that calculates a calculation amount of an artificial neural network model;
a calculation cost calculation unit that calculates, based on the calculation amount, a calculation cost of a target device that performs inference using the artificial neural network model; and
an inference module selection unit that selects, based on the calculation cost of the target device, a module of the target device that performs inference using the artificial neural network model.

The apparatus of claim 11,
wherein the inference module selection unit
selects, based on the calculated calculation cost, at least one of a CPU and a calculation accelerator of the target device as the module of the target device that performs inference using the artificial neural network model.

The apparatus of claim 12,
wherein the calculation cost calculation unit
calculates the calculation cost including at least one of a CPU calculation cost, which is the calculation cost when the CPU of the target device performs inference, and a calculation accelerator cost, which is the calculation cost when the calculation accelerator of the target device performs inference.

According to claim 3,
wherein the inference module selection unit
selects at least one of the CPU and the calculation accelerator of the target device based on a result of comparing the CPU calculation cost with the calculation accelerator cost.

The apparatus of claim 14,
wherein the inference module selection unit
selects the CPU as the module of the target device that performs the inference when the CPU calculation cost is less than the calculation accelerator cost, and selects the calculation accelerator as the module of the target device that performs the inference when the calculation accelerator cost is less than the CPU calculation cost.

The apparatus of claim 13,
wherein the inference module selection unit
selects at least one of the CPU and the calculation accelerator based on a result of comparing at least one of the CPU calculation cost and the calculation accelerator cost with a threshold.

The apparatus of claim 11,
further comprising a storage unit that stores information on the target device when an inference engine that performs the inference is installed on the target device,
wherein the calculation cost calculation unit
calculates the calculation cost, including at least one of a CPU calculation cost and a calculation accelerator cost, using the information on the target device.

The method of claim 17,
The storage unit,
Information of the target device including at least one of an average calculation time of the CPU of the target device, an average calculation time of the calculation accelerator of the target device, a simultaneous calculation capacity of the calculation accelerator, and a transfer speed between the calculation accelerator and host memory. Inference module selection device of the target device, characterized in that for storing.

The apparatus of claim 18,
wherein the calculation cost calculation unit
calculates the CPU calculation cost based on the average calculation time of the CPU and the calculation amount.

The apparatus of claim 18,
wherein the calculation cost calculation unit
calculates the calculation accelerator cost based on at least one of the transfer speed between the calculation accelerator and the host memory, the simultaneous calculation capacity of the calculation accelerator, and the average calculation time of the calculation accelerator.