KR20200023660A

KR20200023660A - Electronic device for controlling performance of at least one processor when providing inference service through deep learning model and operating method thereof

Info

Publication number: KR20200023660A
Application number: KR1020180094583A
Authority: KR
Inventors: 강우철
Original assignee: 인천대학교 산학협력단
Priority date: 2018-08-13
Filing date: 2018-08-13
Publication date: 2020-03-06
Also published as: KR102159953B1

Abstract

The present invention relates to an electronic device for controlling the performance of at least one processor when an inference service is provided through a deep learning model, and an operating method thereof. According to various embodiments of the present invention, the electronic device performing an arithmetic operation through a deep learning model includes: a layer management part performing an arithmetic operation on input data through a deep learning model when receiving the input data, and identifying a milestone identifier pre-allocated to at least one of a plurality of layers included in the deep learning model while performing the arithmetic operation on the input data; a performance confirmation part measuring and storing time consumed from the start of the arithmetic operation for the input data to the moment for the identification of the milestone identifier in response to the identification of the milestone, and then, obtaining a delay value based on the stored consumed time and a preset target time; a delay control part determining whether the obtained delay value is within a threshold range based on a preset value, and then, generating a control command for adjusting the performance of at least one processor based on a result of the determination; a resource management part changing the frequency of the processor in the electronic device in accordance with the generated control command; and an input/output part providing data calculated until a moment for the identification of a termination identifier as output data corresponding to the received input data in response to the identification of the termination identifier when the arithmetic operation for the input data is continued through the processor with the changed frequency.

Description

ELECTRONIC DEVICE FOR CONTROLLING PERFORMANCE OF AT LEAST ONE PROCESSOR WHEN PROVIDING INFERENCE SERVICE THROUGH DEEP LEARNING MODEL AND OPERATING METHOD THEREOF

The present invention relates to an electronic device that controls the performance of at least one processor when providing an inference service through a deep learning model, and to a method of operating the same.

Deep learning is a technique used to cluster or classify objects or data. Its use has expanded recently because it enables robust and accurate inference through multi-layer neural networks. For example, technologies for understanding visual scenes, driven by deep learning neural networks, are being actively applied to cyber-physical systems such as wearable devices for augmented reality, home automation devices, camera-based surveillance devices, and autonomous vehicles.

A neural network used for deep learning may consist of a plurality of layers arranged in sequence. Each layer may be configured to perform its own computation in the order of arrangement and then pass the result to the next layer.

Because inference through deep learning involves a large number of operations, an electronic device that processes data using deep learning needs a high-performance processor, a power supply capable of delivering sufficient power, and memory that provides sufficient storage space. As a result, the use of deep learning has been limited on mobile devices, whose resources such as performance, power supply, and storage space are constrained.

Recently, various methods for managing resources efficiently have been proposed so that deep learning inference can run smoothly on mobile devices as well. Examples include applying a hardware-based accelerator to the electronic device and compressing the deep learning model to reduce the amount of computation. These methods have made it possible to perform deep learning inference more smoothly on mobile and embedded devices.

The inference process may refer to the process of obtaining, through deep learning, an output corresponding to an arbitrary input. A deep learning model is an algorithm that can be used in the inference process; it may be used to define the structure of the plurality of layers constituting a neural network, or to define the parameters or weights of the operation performed in each layer.

According to an embodiment, the parameters or weights defined based on the deep learning model may be learned, that is, updated, through a training process. Because training requires a large amount of computation, it may be performed on a server such as a cloud server, and the trained deep learning model may then be provided to various mobile devices such as smartphones, surveillance cameras, smart speakers, and autonomous vehicles. The deep learning model provided to these devices can be used in an inference process that classifies data input from an application and predicts a result.

Meanwhile, inference may require relatively less computation than training, but the processor performance and available power of a mobile device are inevitably limited, so computation latency and power efficiency can be regarded as key performance metrics for inference. For example, when the result of a speech-to-text (STT) application or a translation application running on a mobile device is output through an inference process, or when the result of a decision used for autonomous driving is output through an inference process, each result may be required as quickly as possible. In this case, the mobile device may raise the performance of the processor so that the inference result is output with a short latency (e.g., 200 ms). At the same time, the amount of power the mobile device can supply must be considered: raising processor performance to achieve a short latency can consume a relatively large amount of power, so there is a limit to how far processor performance can be raised on a mobile device. Consequently, when inference computation is performed on a mobile device, latency and power consumption can be seen as being in a trade-off relationship, and the performance of the processor needs to be adjusted dynamically in consideration of the optimal latency required for each output result.

Furthermore, when a mobile device makes decisions using an inference process, having the result output at regular intervals according to a specified latency for every decision may be as important for the reliability of the results as having an accurate result output for every decision. For example, when a mobile device uses an inference process to make the decisions needed for autonomous driving, if results are output with a different latency for each decision, the time at which a result will be output is hard to predict, which can undermine the stability of autonomous driving. A deep learning model used in such an inference process may therefore be difficult to use for the decisions needed for autonomous driving.

The various embodiments disclosed in this document propose a method of dynamically adjusting processor performance so that, when an inference process is performed using a deep learning model, the time at which the inference result is output falls within a predictable range. For example, an electronic device according to an embodiment may calculate the optimal latency required for each output result and adjust the performance of at least one processor in the electronic device so that the result is output according to the calculated latency.

An electronic device that performs operations using a deep learning model according to various embodiments disclosed in this document may include: a layer management unit that, when input data is received, performs an operation on the input data using the deep learning model and, while the operation is being performed, identifies a milestone identifier pre-assigned to at least one of a plurality of layers included in the deep learning model; a performance checking unit that, in response to the milestone identifier being identified, measures and stores the time elapsed from the start of the operation on the input data to the point at which the milestone identifier was identified, and obtains a delay value using the stored elapsed time and a preset target time; a delay control unit that determines whether the obtained delay value is within a threshold range relative to a preset value and, based on the result of the determination, generates a control command for adjusting the performance of at least one processor; a resource management unit that changes the frequency of at least one processor in the electronic device according to the generated control command; and an input/output unit that, in response to a termination identifier being identified while the operation on the input data continues on the at least one processor whose frequency has been changed, provides the data computed up to the point at which the termination identifier was identified as output data corresponding to the received input data.

In addition, a method of controlling an electronic device that performs operations using a deep learning model according to various embodiments disclosed in this document may include: performing an operation on input data using the deep learning model when the input data is received; identifying, while the operation on the input data is being performed, a milestone identifier pre-assigned to at least one of a plurality of layers included in the deep learning model; in response to the milestone identifier being identified, measuring and storing the time elapsed from the start of the operation on the input data to the point at which the milestone identifier was identified; obtaining a delay value using the stored elapsed time and a preset target time; determining whether the obtained delay value is within a threshold range relative to a preset value; generating, based on the result of the determination, a control command for adjusting the performance of at least one processor; changing the frequency of at least one processor in the electronic device according to the generated control command; and, in response to a termination identifier being identified while the operation on the input data continues on the at least one processor whose frequency has been changed, providing the data computed up to the point at which the termination identifier was identified as output data corresponding to the input data.
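The method steps can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Layer` class, the `run` function, the `"up"`/`"down"` command strings, the `margin` parameter, and the `change_frequency` callback are all hypothetical names introduced here, and the delay value is assumed to be the ratio of elapsed time to target time with a preset value of 1.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Layer:
    op: Callable[[float], float]                 # per-layer operation (stand-in for real layer math)
    milestone_target_s: Optional[float] = None   # set when a milestone identifier is assigned (assumed encoding)
    is_end: bool = False                         # set when the termination identifier is assigned


def run(layers: List[Layer], x: float,
        change_frequency: Callable[[str], None],
        preset: float = 1.0, margin: float = 0.1,
        clock: Callable[[], float] = time.perf_counter) -> float:
    """Run layers in order, issuing a control command at each milestone."""
    start = clock()
    for layer in layers:
        x = layer.op(x)
        if layer.milestone_target_s is not None:
            elapsed = clock() - start                    # time since the operation started
            delay = elapsed / layer.milestone_target_s   # assumed ratio form of the delay value
            if abs(delay - preset) > margin:             # outside the threshold range
                change_frequency("up" if delay > preset else "down")
        if layer.is_end:                                 # termination identifier found
            break                                        # output = data computed so far
    return x
```

In a real device, `change_frequency` would call into the platform's frequency-scaling interface; here it is left as a callback so the control flow can be exercised with a fake clock.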

According to various embodiments disclosed in this document, an electronic device performing an inference process through a deep learning model may be configured so that the inference result is output within a predictable time range. Because the result is output within a predictable time range, the stability of the electronic device using the deep learning model and the reliability of the deep learning computation can be ensured.

In addition, according to various embodiments disclosed in this document, the electronic device may dynamically adjust the performance of at least one processor so that the inference result is output according to a specified latency. Because processor performance is no longer raised more than necessary, power consumption in the electronic device can be managed efficiently.

FIG. 1 is a diagram illustrating the configuration of an electronic device according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating the configuration of a runtime module provided in an electronic device according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating the configuration of a feedback circuit provided in an electronic device according to an embodiment of the present invention.
FIG. 4 is a diagram for describing a method of assigning a milestone identifier to a deep learning model according to an embodiment of the present invention.
FIG. 5 is a flowchart illustrating an operation of assigning a milestone identifier to a deep learning model in an electronic device according to an embodiment of the present invention.
FIG. 6 is a flowchart illustrating a method of controlling the performance of at least one processor when an electronic device according to an embodiment of the present invention performs an inference process through a deep learning model.

Those skilled in the art will readily understand that the various embodiments disclosed in this document are not presented to limit the present invention to specific embodiments, and that the components introduced through those embodiments are meant to include all modifications, equivalents, and substitutes falling within the spirit and technical scope of the present invention. In describing the drawings, unless otherwise defined, all terms used in this specification, including technical and scientific terms, may be interpreted as having the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention pertains. The objects and effects of the present invention, and the technical configurations for achieving them, will become clear through the embodiments described in detail together with the accompanying drawings. In describing the present invention, where a detailed description of a known function or configuration is judged likely to obscure the gist of the invention unnecessarily, that description may be omitted. The terms used below are defined in consideration of their structure, role, and function in the present invention, and may be interpreted differently from their conventional meaning depending on the intention or practice of users and operators.

Note that the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. The various embodiments disclosed in this document are provided so that the disclosure of the present invention is complete and so that those of ordinary skill in the art to which the invention pertains are fully informed of the scope of the invention. The present invention is defined only by the scope of the claims.

In this document, when a part is said to "include" a component, this means that it may further include other components, rather than excluding them, unless specifically stated otherwise. In the various embodiments of the present invention, each component, functional block, or means may consist of one or more sub-components. The electrical, electronic, and mechanical functions performed by each component may be implemented with various known devices or mechanical elements such as electronic circuits, integrated circuits, and application-specific integrated circuits (ASICs), and may be implemented separately or with two or more integrated into one.

Meanwhile, the blocks of the accompanying block diagrams and the steps of the flowcharts may be interpreted as computer program instructions that are loaded into a processor or memory of equipment capable of data processing, such as a general-purpose computer, a special-purpose computer, a portable notebook computer, or a network computer, and that perform specified functions. Because these computer program instructions may be stored in a memory provided in a computer device or in a computer-readable memory, the functions described in the blocks or steps may also be produced as an article of manufacture containing instruction means that perform them. In addition, each block or step may represent a module, a segment, or a portion of code that includes one or more executable instructions for executing the specified logical function(s). Note also that in some alternative embodiments the functions mentioned in the blocks or steps may be executed out of the stated order. For example, two blocks or steps shown in succession may be performed substantially simultaneously or in reverse order, and in some cases some blocks or steps may be omitted.

FIG. 1 is a diagram illustrating the configuration of an electronic device according to an embodiment of the present invention. According to various embodiments, the electronic device 100 may include at least one of an application 110, a runtime module 120, a processor 130, a sensor 140, a storage unit 150, and an input/output unit 160. Here, the runtime module 120 denotes a module that performs specific instructions, and the instructions performed through such a module may be understood as being performed by the processor 130 of the electronic device.

The application 110 of the electronic device 100 may refer to a program capable of performing a specific task, or a module in which that program is executed. According to an embodiment, the application 110 may invoke an inference process through a deep learning model, or may provide the result data obtained through the inference process to the user in a predetermined manner. The application 110 may be configured to invoke the inference process in response to input data being received, and the invoked inference process may be performed periodically or aperiodically through the runtime module 120 based on data received from the sensor 140.

According to an embodiment, the application 110 may call a specific deep learning model M with reference to preset quality of service (QoS) data. Here, QoS data is an indicator of service quality, and it may be used to guarantee, for example, a latency or data loss rate at or below a certain level on a network or a processor. For example, Q, which denotes the QoS data, may be defined as in Equation 1 below.
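Equation 1 is not reproduced in this text. Based only on the two symbols the description attributes to it (d, a requested response time, and C, a compression bound for the model M), one plausible reading is that Q is simply the pair of those two values. The tuple form below is an assumption, not a transcription of the original equation.

```latex
\[
  Q = (d,\; C)
\]
```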

In Equation 1, d may be a value representing the first response time, that is, the time within which output data is required when the inference process is performed through the deep learning model M. For example, the first response time may mean the time from the start of the inference process to its end. Also in Equation 1, C may be a value representing the compression bound for the deep learning model M.

According to an embodiment, when the deep learning model M is called based on the preset QoS data, the application 110 may pass the deep learning model M to the runtime module 120 together with the QoS data.

The runtime module 120 of the electronic device 100 may perform a deep learning operation using input data received through the input/output unit 160 and the deep learning model received from the application 110. For example, when a "cat picture" is received as input data and question data asking what the input data is has been identified, the runtime module 120 may perform an operation on the input data using the operation method defined in each of one or more of the layers included in the deep learning model, in order to generate answer data for the identified question data. The runtime module 120 may obtain the answer data "cat" as the result of the operation and deliver it to the application 110 as the inference result for the input data.

According to an embodiment, when QoS data and a deep learning model are received from the application 110, the runtime module 120 may check the QoS data and compress the deep learning model. For example, the runtime module 120 may identify, from the QoS data, the first response time within which output data is required. The runtime module 120 may also determine the total required time by summing the time required to perform the operation of each of the plurality of layers in the deep learning model. If the total required time is determined to exceed the first response time, the runtime module 120 may compress the deep learning model to reduce the time required for computation using the model.
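The compression check described above reduces to a sum and a comparison. The following Python sketch shows that decision only; the function names and the per-layer timings are hypothetical, and how the per-layer times are obtained (profiling, estimation) is not specified here.

```python
from typing import Sequence


def total_required_time(layer_times_s: Sequence[float]) -> float:
    """Sum the time each layer needs to perform its operation."""
    return sum(layer_times_s)


def needs_compression(layer_times_s: Sequence[float],
                      first_response_time_s: float) -> bool:
    """True when the summed layer time exceeds the requested response time d."""
    return total_required_time(layer_times_s) > first_response_time_s


# Illustrative values only: three layers against a 200 ms response time.
example_times = [0.05, 0.10, 0.08]
example_decision = needs_compression(example_times, 0.2)
```

With the illustrative values above the total is about 0.23 s, which exceeds 0.2 s, so the model would be compressed before inference.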

According to an embodiment, the runtime module 120 may check, at regular intervals, how fast the computation is progressing while it operates on the input data using the deep learning model, and may control the operating speed of the processor 130 based on the result so that the result data is output neither earlier nor later than the target time. For example, when compressing the deep learning model, the runtime module 120 may assign a first milestone identifier to at least one of the plurality of layers in the model. Thereafter, if the pre-assigned first milestone identifier is identified while the runtime module 120 is operating on the input data using the deep learning model, the runtime module 120 may measure a second response time, which is the time actually taken by the computation from the point at which the operation started, or at which a previous milestone identifier was identified, to the point at which the first milestone identifier was identified. At the same time, the runtime module 120 may identify a first target time, which is the time the computation should take over that same interval. According to various embodiments, the runtime module 120 may obtain, at a k-th time point, the ratio of the second response time to the first target time as a delay value, and the delay value at the k-th time point may be defined as in Equation 2 below.
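Equation 2 is not reproduced in this text. The description states that the delay value at the k-th time point is the ratio of the second response time to the first target time, so it can be written as follows. The symbols r(k) for the measured second response time and t(k) for the first target time are notation assumed here, not taken from the original.

```latex
\[
  \mathrm{delay}(k) = \frac{r(k)}{t(k)}
\]
```

For instance, if a milestone should be reached after 100 ms but is reached after 130 ms, the delay value is 1.3.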

If the delay value is greater than 1, it may be determined that the operation through the runtime module 120 is being performed slower than the target speed. Conversely, if the delay value is less than 1, it may be determined that the operation through the runtime module 120 is being performed faster than the target speed.

According to an embodiment, when it is determined that the operation is being performed slower than the target speed, the runtime module 120 may increase the operating speed of the processor 130 used for the operation. Likewise, when it is determined that the operation is being performed faster than the target speed, the runtime module 120 may decrease the operating speed of the processor 130 used for the operation.

The processor 130 of the electronic device 100 may process at least some of the operations performed by the runtime module 120. For example, the processor 130 may perform the operation of at least one of the plurality of layers in the deep learning model and then provide the result data to the runtime module 120. According to various embodiments, the processor 130 may be implemented in a form that includes the runtime module 120, in which case all instructions performed by the runtime module 120 may be understood as being performed by the processor 130.

According to an embodiment, the processor 130 may include one or more processors. For example, the processor 130 may include one or more central processing units (CPUs) and one or more graphics processing units (GPUs). The processor 130 may select the processing unit needed to perform an operation based on the type of each of the plurality of layers in the deep learning model, and may perform the operation using the selected processing unit.

According to an embodiment, when a control command corresponding to an increase in operating speed is received from the runtime module 120, the processor 130 may raise the frequency of the processor 130 in accordance with the received control command, thereby increasing the processing speed. Conversely, when a control command corresponding to a decrease in operating speed is received from the runtime module 120, the processor 130 may lower the frequency of the processor 130 in accordance with the received control command, thereby decreasing the processing speed.

The sensor 140 of the electronic device 100 may generate an electrical signal or data value corresponding to an internal operating state of the electronic device 100 (e.g., power or temperature) or to an external environmental state. For example, the sensor 140 may include at least one of a gesture sensor, a gyro sensor, a barometric pressure sensor, a magnetic sensor, an acceleration sensor, a grip sensor, a proximity sensor, a color sensor, an infrared (IR) sensor, a biometric sensor, a temperature sensor, a humidity sensor, and an illuminance sensor. According to an embodiment, the electronic device 100 may be mounted in an autonomous vehicle, perform an operation through the deep learning model using input data obtained from the sensor 140, generate output data as the result of that operation, and provide the output data to the autonomous vehicle.

The storage unit 150 of the electronic device 100 may store various data used by at least one component of the electronic device 100. Here, the data may include software and input data or output data for commands related to the software. The storage unit 150 may include a volatile memory or a nonvolatile memory. According to an embodiment, the storage unit 150 may store one or more deep learning models that can be used when performing an operation on input data, together with information about the plurality of layers included in those deep learning models.

The input/output unit 160 of the electronic device 100 may receive a command or data to be used by a component of the electronic device 100 from outside the electronic device 100 (e.g., from a user), or may provide such a command or data to the outside. The input/output unit 160 may be divided into an input unit and an output unit; the input unit may include a mouse, a keyboard, and a touch pad, and the output unit may include a display and a speaker.

FIG. 2 is a diagram illustrating the configuration of a runtime module provided in an electronic device according to an embodiment of the present invention. According to various embodiments, the electronic device 100 may include at least one of the runtime module 120, the processor 130, the storage unit 150, and the input/output unit 160. The runtime module 120 may include a QoS manager 200 and an execution manager 210. The QoS manager 200 may include a model compressor 201, a milestone identifier allocator 203, a performance checker 205, a delay controller 207, and a resource manager 209. The execution manager 210 may include a data preprocessor 211, a model manager 213, and a layer manager 215. The processor 130 may include a central processing unit 220 and a graphics processing unit 225.

According to an embodiment, the QoS manager 200 may compress the deep learning model or control the operating speed of the processor 130 based on QoS data. The execution manager 210 may perform an operation on the input data through the deep learning model, or may identify a milestone identifier assigned to at least some of the layers in the deep learning model.

The model compressor 201 of the QoS manager 200 may determine whether the deep learning model needs to be compressed, using d (the first response time within which output data is requested when inference is performed through the deep learning model) and C (the compression range for the deep learning model), both of which are identified from preset QoS data. For example, when the inference process is called, the model compressor 201 may check the memory usage of the deep learning model selected by the call and the amount of available memory in the electronic device 100, and may determine the amount of compression for the deep learning model according to the result of the check. Meanwhile, to prevent an unexpectedly large loss of inference accuracy, the model compressor 201 may use a stepwise compression method. For example, when compressing the deep learning model, the model compressor 201 may use an approximation based on singular value decomposition, in which the level of compression can be controlled by selecting an appropriate rank. Alternatively, the model compressor 201 may determine the level of compression by comparing the total time required for the operation through the deep learning model with the first response time. For example, when the total time required for the operation through the deep learning model is determined to be longer than the first response time, the model compressor 201 may calculate the difference between the total required time and the first response time, and may determine the compression level for the deep learning model according to the calculated difference. Once the compression level is determined, the model compressor 201 may generate a corresponding control command and transmit the generated control command to the model manager 213.
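
To illustrate why the rank acts as a compression knob, the following is a minimal sketch of the parameter arithmetic behind a low-rank (SVD-style) approximation of one weight matrix. The function names and the idea of deriving a parameter budget from available memory are assumptions made for illustration; the description does not specify how the rank is chosen.

```python
def dense_params(rows: int, cols: int) -> int:
    """Parameter count of an uncompressed rows x cols weight matrix."""
    return rows * cols


def low_rank_params(rows: int, cols: int, rank: int) -> int:
    """Parameter count after replacing W (rows x cols) with two factors
    of shapes (rows x rank) and (rank x cols), as a truncated SVD does
    once the singular values are folded into one factor."""
    return rank * (rows + cols)


def largest_rank_within_budget(rows: int, cols: int, budget_params: int):
    """Largest rank whose factorized form fits the budget (hypothetical helper).

    Returns None when even rank 1 does not fit.
    """
    rank = min(min(rows, cols), budget_params // (rows + cols))
    return rank if rank >= 1 else None


if __name__ == "__main__":
    rows, cols = 1024, 4096
    print(dense_params(rows, cols))                       # uncompressed size
    print(low_rank_params(rows, cols, 64))                # size at rank 64
    print(largest_rank_within_budget(rows, cols, 1_000_000))
```

A lower rank shrinks both memory use and multiply-accumulate work at the cost of approximation error, which is why lowering the rank in steps is one way to avoid a sudden accuracy drop.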

The milestone identifier allocator 203 of the QoS manager 200 may profile the compressed deep learning model, analyzing its execution state, communication state, and the like. According to an embodiment, the milestone identifier allocator 203 may measure the operating time of each of the plurality of layers in the compressed deep learning model. The milestone identifier allocator 203 may then use the measured operating times to generate a control command for assigning milestone identifiers to the layers located at each target time interval. For example, the milestone identifier allocator 203 may identify the layers that are operating at 100 ms intervals from the point at which the operation through the compressed deep learning model starts, and may generate a control command for assigning a milestone identifier to each of the identified layers. More specifically, the milestone identifier allocator 203 may generate a control command for assigning a first milestone identifier to a first layer that is operating when 100 ms has elapsed since the operation started, and a second milestone identifier to a second layer that is operating when a further 100 ms has elapsed. The milestone identifier allocator 203 may transmit the control command generated for assigning the milestone identifiers to the model manager 213.
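
The interval-based assignment described above can be sketched as follows. This is only an illustration under stated assumptions: the per-layer times are made-up profiling results, the function name is hypothetical, and a layer is taken to be "operating" at a boundary if its cumulative end time is the first to reach that boundary.

```python
from itertools import accumulate
from typing import List, Tuple


def assign_milestones(layer_times_ms: List[int], interval_ms: int) -> Tuple[List[int], int]:
    """Pick milestone layers from profiled per-layer times.

    Returns (indices of milestone layers, index of the last layer, which
    would carry the termination identifier).
    """
    if not layer_times_ms:
        raise ValueError("at least one profiled layer is required")
    end_times = list(accumulate(layer_times_ms))  # cumulative finish time per layer
    total = end_times[-1]
    milestones: List[int] = []
    idx = 0
    boundary = interval_ms
    while boundary < total:
        # Advance to the layer that is still running when the boundary elapses.
        while end_times[idx] < boundary:
            idx += 1
        if not milestones or milestones[-1] != idx:
            milestones.append(idx)  # one long layer gets at most one milestone
        boundary += interval_ms
    return milestones, len(layer_times_ms) - 1


if __name__ == "__main__":
    # Hypothetical profile: five layers, 300 ms in total.
    print(assign_milestones([40, 70, 50, 60, 80], 100))
```

With a 300 ms profile and a 100 ms interval, this yields two milestone layers plus the final layer for termination.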

After the assignment of milestone identifiers to the compressed deep learning model is complete, the data preprocessor 211 of the execution manager 210 may, in response to input data being received from outside, modify the format or content of the received input data so that it can be used for the operation through the deep learning model.

The model manager 213 of the execution manager 210 may compress the deep learning model, or assign a milestone identifier to at least one layer in the deep learning model, based on the control command received from the model compressor 201 or the control command received from the milestone identifier allocator 203.

When input data whose format or content has been modified is received from the data preprocessor 211, the layer manager 215 of the execution manager 210 may start the inference operation using the compressed deep learning model and the modified input data. For example, the layer manager 215 may carry out the operation using the formula corresponding to the type of each of the plurality of layers in the compressed deep learning model. The operation may proceed in the order in which the plurality of layers are arranged in the deep learning model.

According to an embodiment, the layer manager 215 may check whether a milestone identifier is assigned to the layer currently being computed. If the pre-assigned first milestone identifier is identified, the layer manager 215 may measure the time elapsed from the point at which the operation started to the point at which the first milestone identifier was identified, and may transmit first data on the measured time to the performance checker 205 of the QoS manager 200. If the layer manager 215 then continues the operation and identifies the pre-assigned second milestone identifier, it may measure the time elapsed from the point at which the first milestone identifier was identified to the point at which the second milestone identifier was identified, and may likewise transmit second data on the measured time to the performance checker 205 of the QoS manager 200. The layer manager 215 may repeat the cycle of operation, identifier check, elapsed-time measurement, and transfer of the measured value until a termination identifier is identified, and may end the operation in response to the termination identifier being identified. When the operation ends, the layer manager 215 may provide the data computed up to that point to the user through the input/output unit 160 as the output data corresponding to the input data.
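
The repeat cycle described above (compute, check for an identifier, measure, report) can be sketched as a simple loop. The `Layer` class, its fields, and the `report` callback are hypothetical stand-ins for the layer manager 215 and the performance checker 205; timing uses Python's standard `time.perf_counter`.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Optional


@dataclass
class Layer:
    forward: Callable[[Any], Any]        # the layer's computation
    milestone_id: Optional[int] = None   # None when no milestone is assigned
    is_termination: bool = False         # True for the layer carrying the termination identifier


def run_inference(layers: Iterable[Layer], data: Any,
                  report: Callable[[int, float], None]) -> Any:
    """Run layers in order, reporting per-segment elapsed time at each milestone."""
    segment_start = time.perf_counter()
    for layer in layers:
        data = layer.forward(data)
        if layer.milestone_id is not None:
            now = time.perf_counter()
            # Elapsed time since the start or since the previous milestone, in ms.
            report(layer.milestone_id, (now - segment_start) * 1000.0)
            segment_start = now
        if layer.is_termination:
            break  # output whatever has been computed up to this point
    return data


if __name__ == "__main__":
    net = [Layer(lambda x: x + 1),
           Layer(lambda x: x * 2, milestone_id=1),
           Layer(lambda x: x + 3, milestone_id=2, is_termination=True)]
    print(run_inference(net, 1, lambda mid, ms: print(f"milestone {mid}: {ms:.3f} ms")))
```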

When data on the time elapsed up to the identification of a milestone identifier is received from the layer manager 215, the performance checker 205 of the QoS manager 200 may use the received data to check the performance of at least one component in the electronic device 100. According to an embodiment, when first data on the time elapsed from the start of the operation to the identification of the first milestone identifier is received from the layer manager 215 in response to the pre-assigned first milestone identifier being identified, the performance checker 205 may obtain a delay value by comparing the elapsed time indicated by the first data with a preset target time. The delay value may be obtained according to Equation 2. For example, when the elapsed time indicated by the first data is measured as 120 ms and the target time is set to 100 ms, the delay value is calculated as 1.2 (= 120 ms / 100 ms). The delay value calculated in this way may be transmitted from the performance checker 205 to the delay controller 207.

The delay controller 207 of the QoS manager 200 may determine, based on the delay value received from the performance checker 205, whether the actual operation is proceeding faster or slower than the target speed. For example, when a delay value of 1.2 is received (that is, when the delay value is greater than 1), the delay controller 207 may determine that the actual operation is proceeding slower than the target speed. Conversely, when the delay value is less than 1, the delay controller 207 may determine that the actual operation is proceeding faster than the target speed.

According to an embodiment, when it is determined that the actual operation is proceeding slower than the target speed (the delay value is greater than 1), the delay controller 207 may generate a control command instructing that the performance of the processor 130 be increased. Conversely, when it is determined that the actual operation is proceeding faster than the target speed (the delay value is less than 1), the delay controller 207 may generate a control command instructing that the performance of the processor 130 be decreased. In generating a control command to increase or decrease the performance of the processor, the delay controller 207 may refer to preset threshold ranges such as those shown in Table 1 below.

[Table 1]
Upward adjustment:    1st-level upward adjustment   | 2nd-level upward adjustment   | 3rd-level upward adjustment
Downward adjustment:  1st-level downward adjustment | 2nd-level downward adjustment | 3rd-level downward adjustment

In Table 1, the first, second, and third levels are values related to the degree to which the performance of the processor is adjusted, and may be preset by the user. For example, a first-level upward (or downward) adjustment may mean raising (or lowering) the processor frequency by 5%. Alternatively, a first-level upward (or downward) adjustment may mean raising (or lowering) the processor frequency by 50 MHz. For example, when the delay value is measured as 1.2, as in the preceding example, the delay value is greater than 1, so an upward adjustment is needed, and the difference between 1 and the delay value is 0.2. In this case, the delay controller 207 may refer to Table 1, determine that the performance of the processor needs a second-level upward adjustment, and generate a corresponding control command.
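
The table lookup described above can be sketched as a small function. The threshold boundaries (0.1 and 0.3) are hypothetical: the only constraint taken from the text is that a deviation of 0.2 maps to the second level. The function name and return shape are likewise assumptions.

```python
from typing import Tuple


def adjustment_for(delay_value: float,
                   thresholds: Tuple[float, float] = (0.1, 0.3)) -> Tuple[str, int]:
    """Map a delay value to (direction, level).

    thresholds are assumed upper bounds on |1 - delay| for levels 1 and 2;
    anything larger is treated as level 3.
    """
    deviation = abs(1.0 - delay_value)
    if deviation == 0.0:
        return ("none", 0)               # exactly on target: no adjustment
    direction = "up" if delay_value > 1.0 else "down"
    if deviation <= thresholds[0]:
        level = 1
    elif deviation <= thresholds[1]:
        level = 2
    else:
        level = 3
    return (direction, level)


if __name__ == "__main__":
    print(adjustment_for(1.2))   # the example from the text: slower than target
    print(adjustment_for(0.95))  # slightly faster than target
```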

According to an embodiment, in generating a control command instructing that the performance of the processor 130 be increased (or decreased), the delay controller 207 may use Equation 3 below, which is approximated through a proportional-integral (PI) controller.

In Equation 3, speed(k) may denote the performance of the processor 130 required at time k, based on the data obtained before time k. For example, if the processor performance speed(k) at time k is 10% greater than the processor performance speed(k-1) at time k-1, the processor frequency should be increased by 10%. Conversely, if speed(k) is less than speed(k-1), the processor frequency should be decreased. Here, K_p and K_I are the experimentally obtained gains of the PI controller, and e(k) is an error value denoting the difference between the reference value and the delay value obtained at time k. For example, if the delay value obtained at time k is denoted tard(k) and the reference value is set to 1, e(k) may be 1 - tard(k).

According to various embodiments, K_p and K_I in Equation 3 may be obtained experimentally, as shown in Table 2 below.

[Table 2]
        K_p      K_I
CPU    -0.28    -0.42
GPU    -0.60    -0.60

According to Table 2, different gain values may be obtained depending on whether the processor 130 to be adjusted is a central processing unit (CPU) or a graphics processing unit (GPU).
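
To show how the gains in Table 2 and the error e(k) = 1 - tard(k) could drive a speed update, here is a minimal sketch of a discrete PI controller. Equation 3 itself is not reproduced in this text, so the incremental (velocity) form used below is an assumption, not the patent's formula; treating speed as a relative scale with 1.0 as the starting point is also an assumption. Only the gain values and the error definition come from the description.

```python
from dataclasses import dataclass

# Gains from Table 2 as (K_p, K_I). Both are negative, so a negative error
# (tard > 1, i.e. running late) produces a positive speed change.
GAINS = {"CPU": (-0.28, -0.42), "GPU": (-0.60, -0.60)}


@dataclass
class PIController:
    kp: float
    ki: float
    speed: float = 1.0       # relative performance; 1.0 is an assumed baseline
    prev_error: float = 0.0

    def update(self, delay_value: float) -> float:
        """Return speed(k) given the delay value tard(k) at milestone k."""
        error = 1.0 - delay_value  # e(k) = reference (1) - tard(k)
        # Assumed velocity-form PI update:
        # speed(k) = speed(k-1) + K_p * (e(k) - e(k-1)) + K_I * e(k)
        self.speed += self.kp * (error - self.prev_error) + self.ki * error
        self.prev_error = error
        return self.speed


if __name__ == "__main__":
    cpu = PIController(*GAINS["CPU"])
    for tard in (1.2, 1.1, 1.0):
        print(f"tard={tard:.2f} -> speed={cpu.update(tard):.3f}")
```

The ratio speed(k) / speed(k-1) is what the text then translates into a proportional frequency change.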

The resource manager 209 of the QoS manager 200 may control the performance of at least one component in the electronic device 100 according to the control command generated by the delay controller 207. For example, the resource manager 209 may adjust the performance of the processor 130 based on the control command received from the delay controller 207. In doing so, the resource manager 209 may adjust the performance of the processor using a tuning knob such as dynamic voltage/frequency scaling (DVFS).

Under the control of the resource manager 209, the processor 130 may adjust the performance (e.g., the frequency) of the central processing unit 220 or the graphics processing unit 225 included in the processor 130. When the layer manager 215 performs an operation according to the deep learning model, the processor 130 whose performance has been adjusted may provide a different operation speed than before the adjustment.

FIG. 3 is a diagram illustrating the configuration of a feedback circuit provided in an electronic device according to an embodiment of the present invention.

According to an embodiment, when a milestone identifier is identified at the k-th time point, the performance checker 205 may obtain the delay value tard(k). The delay controller 207 may then obtain e(k), the difference between the delay value and the reference value, which is set to 1. Based on the obtained e(k), the delay controller 207 may determine whether the performance of the processor 130 needs to be adjusted and, if so, by how much. For example, when the delay value is 1.2, the delay controller 207 may determine that the performance of the processor needs to be increased because the delay value is greater than 1. Since the magnitude of e(k) is 0.2 in this case, the delay controller 207 may finally determine that the performance of the processor needs a second-level upward adjustment.

The delay controller 207 may generate a control command for a second-level upward adjustment of the performance of the processor 130, and may transmit the generated control command to the DVFS manager 300.

In response to the control command received from the delay controller 207, the DVFS manager 300 may generate freq(k+1) as a control command for adjusting the frequency of at least one of the central processing unit 220 and the graphics processing unit 225 in the processor 130, and may transmit the generated control command to at least one of the central processing unit 220 and the graphics processing unit 225. The operation through the deep learning model may then continue at the adjusted performance.
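
One way to picture the step from a requested speed change to freq(k+1) is sketched below. Real processors expose a discrete set of operating frequencies, so the sketch scales the current frequency by speed(k) / speed(k-1) and then picks a supported value. The frequency list, the function name, and the round-up policy are assumptions for illustration; the description does not state how the DVFS manager 300 chooses among supported frequencies.

```python
from typing import Sequence


def next_frequency_khz(current_khz: int, prev_speed: float, new_speed: float,
                       available_khz: Sequence[int]) -> int:
    """Return the supported frequency to apply for the next segment.

    The target scales the current frequency by the requested relative speed
    change; the lowest supported frequency at or above the target is chosen,
    falling back to the maximum when the target exceeds every option.
    """
    if prev_speed <= 0 or not available_khz:
        raise ValueError("prev_speed must be positive and frequencies non-empty")
    target = current_khz * (new_speed / prev_speed)
    for freq in sorted(available_khz):
        if freq >= target:
            return freq
    return max(available_khz)


if __name__ == "__main__":
    # Hypothetical list of supported frequencies in kHz.
    steps = [600_000, 800_000, 1_000_000, 1_200_000, 1_400_000]
    print(next_frequency_khz(1_000_000, 1.0, 1.1, steps))  # 10% more speed requested
```

Rounding up favors meeting the target time over saving power; rounding to the nearest step would be the opposite trade-off.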

According to an embodiment, when a milestone identifier is identified at the (k+1)-th time point, the performance checker 205 may obtain the delay value tard(k+1). The delay controller 207 may then obtain e(k+1), the difference between the delay value and the reference value set to 1, and may repeat the operation of adjusting the performance of the processor 130.

FIG. 4 is a diagram for describing a method of assigning milestone identifiers to a deep learning model according to an embodiment of the present invention.

When a "cat photo" is received as input data 400, the electronic device 100 may perform an operation on the input data 400 using the deep learning model 420 and may obtain "Cat!" as the result data 410 of the operation. The deep learning model 420 may consist of a plurality of layers that perform different operations depending on their type, and the plurality of layers may be depicted as arranged in a preset order.

According to an embodiment, before the operation through the deep learning model is performed, the milestone identifier allocator 203 may measure the total time required for the operation through the deep learning model. For example, the milestone identifier allocator 203 may find that the total time required for the operation through the deep learning model is 300 ms. The milestone identifier allocator 203 may then assign a termination identifier to the last of the plurality of layers in the deep learning model.

In addition, the milestone identifier allocator 203 may assign a milestone identifier at each preset time interval, based on the total required time. For example, the milestone identifier allocator 203 may assign a milestone identifier to the layer located at each point corresponding to one third of the total required time. That is, the milestone identifier allocator 203 may assign a first milestone identifier to the first layer 430, located at the point corresponding to 1/3 of the total required time, and a second milestone identifier to the second layer 440, located at the point corresponding to 2/3 of the total required time. If the total required time is 300 ms, the first milestone identifier may be assigned to the first layer 430, which performs its operation 100 ms after the operation starts, and the second milestone identifier may be assigned to the second layer 440, which performs its operation 100 ms after the first milestone identifier is identified.

FIG. 5 is a flowchart illustrating an operation of allocating milestone identifiers to a deep learning model in an electronic device according to an embodiment of the present invention.

In operation 500, preset QoS data may be checked. The QoS data is an indicator of the quality of service requested by the user, and may be set and stored before the inference process through the deep learning model is performed.

In operation 510, the deep learning model may be compressed based on the checked QoS data.

In operation 520, the compressed deep learning model may be profiled to measure the operation time of each of the plurality of layers in the deep learning model.

In operation 530, at least one layer among the plurality of layers may be selected based on the measurement result, and a milestone identifier may be allocated to the selected at least one layer.

According to various embodiments, some of the operations disclosed in FIG. 5 may be omitted or repeated a plurality of times. In addition, each of the operations disclosed in FIG. 5 should be regarded as one embodiment, and no operation should be narrowly construed as depending on another operation.
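The profiling of operation 520 could look roughly like the sketch below. It assumes, purely for illustration, that each layer is a Python callable applied in sequence; the name `profile_layers`, the toy layers, and the averaging over several runs are assumptions, not details given in the source.

```python
import time


def profile_layers(layers, sample, repeats=5):
    """Return the mean wall-clock time (ms) of each layer over `repeats` runs."""
    totals_ms = [0.0] * len(layers)
    for _ in range(repeats):
        x = sample
        for i, layer in enumerate(layers):
            start = time.perf_counter()
            x = layer(x)  # output of one layer feeds the next
            totals_ms[i] += (time.perf_counter() - start) * 1000.0
    return [t / repeats for t in totals_ms]


# Toy stand-ins for real layers (hypothetical).
layers = [
    lambda v: [a * 2 for a in v],
    lambda v: [a + 1 for a in v],
    lambda v: [max(a, 0) for a in v],
]
times_ms = profile_layers(layers, list(range(1000)))
```

The resulting list of per-layer times is the kind of measurement on which the layer selection of operation 530 would be based.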

FIG. 6 is a flowchart illustrating a method of controlling the performance of at least one processor when an inference process is performed through a deep learning model in an electronic device according to an embodiment of the present invention.

In operation 600, when input data is received, an operation on the received input data may be performed using the deep learning model.

In operation 610, it may be determined whether a termination identifier is identified. If the termination identifier is identified, the operation through the deep learning model may end, and operation 660 may then be performed. If the termination identifier is not identified, operation 620 may be performed.

In operation 620, it may be determined whether a milestone identifier is identified. If a milestone identifier is identified, operations 630 to 650 may be performed. If no milestone identifier is identified, the operation through the deep learning model may continue according to operation 600.

In operation 630, the time elapsed from the start of the operation on the input data to the time point at which the milestone identifier is identified may be measured and stored.

In operation 640, a delay value may be obtained using the stored elapsed time and a preset target time.

In operation 650, the frequency of the at least one processor may be changed in response to the delay value not being within a threshold range around a preset value.

In operation 660, the data computed up to the time point at which the termination identifier is identified may be provided as output data corresponding to the input data.

According to various embodiments, some of the operations disclosed in FIG. 6 may be omitted or repeated a plurality of times. In addition, each of the operations disclosed in FIG. 6 should be regarded as one embodiment, and no operation should be narrowly construed as depending on another operation.
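The control flow of FIG. 6 can be sketched as a single loop over the layers. The sketch makes several assumptions that are not stated in the source: layers are callables, each milestone has its own target time (`targets_ms`), the delay value is computed as the ratio of elapsed time to target time, and the frequency change of operation 650 is delegated to a caller-supplied `on_delay` callback. A simulated clock is used so that the example is deterministic.

```python
def run_inference(layers, x, milestones, termination_layer, targets_ms, clock, on_delay):
    """Run the layers in order, checking identifiers after each one."""
    start = clock()
    for i, layer in enumerate(layers):
        x = layer(x)                          # operation 600: keep computing
        if i == termination_layer:            # operation 610: termination identifier
            return x                          # operation 660: provide output
        if i in milestones:                   # operation 620: milestone identifier
            elapsed_ms = clock() - start      # operation 630: measure elapsed time
            delay = elapsed_ms / targets_ms[i]  # operation 640 (ratio is an assumption here)
            on_delay(i, delay)                # operation 650: caller may change frequency
    return x


class FakeClock:
    """Simulated millisecond clock so the example does not depend on real timing."""
    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()


def make_layer(cost_ms):
    def layer(v):
        clock.now += cost_ms   # pretend the layer took cost_ms to run
        return v + 1
    return layer


# Hypothetical layer costs: the second layer runs slower than profiled.
layers = [make_layer(c) for c in [40, 80, 50, 50, 60, 40]]
seen = []
output = run_inference(
    layers, 0,
    milestones={1, 3}, termination_layer=5,
    targets_ms={1: 100.0, 3: 200.0},
    clock=clock,
    on_delay=lambda i, d: seen.append((i, d)),
)
```

In this run the first milestone is reached at 120 ms against a 100 ms target, so the callback receives a delay value above 1, which is the situation in which the processor frequency would be raised.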

In an electronic device that performs an operation using a deep learning model according to various embodiments disclosed herein, the layer manager may select, from among the plurality of layers, a first plurality of layers used to perform the operation on the input data, identify the type of each of the selected first plurality of layers, and perform the operation on the input data using the operation formula corresponding to the identified type of each of the first plurality of layers.

An electronic device that performs an operation using a deep learning model according to various embodiments disclosed herein may further include: a model compression unit that checks preset quality of service (QoS) data and compresses the deep learning model based on the checked QoS data; and a milestone identifier allocator that profiles the compressed deep learning model to measure the operation time of each of the plurality of layers in the compressed deep learning model, uses the measured operation times to select, from among the plurality of layers, one or more layers that operate at each preset time interval, and allocates a milestone identifier to each of the selected one or more layers.

In an electronic device that performs an operation using a deep learning model according to various embodiments disclosed herein, the performance checking unit may obtain, as the delay value, the ratio of the stored elapsed time to the preset target time.

In an electronic device that performs an operation using a deep learning model according to various embodiments disclosed herein, the delay controller may generate a first control command instructing that the performance of the at least one processor in the electronic device be adjusted downward, in response to determining that the obtained delay value is less than the preset value of 1 and outside the threshold range, and may generate a second control command instructing that the performance of the at least one processor in the electronic device be adjusted upward, in response to determining that the obtained delay value is greater than the preset value of 1 and outside the threshold range.
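The ratio-based delay value and the choice between the first and second control commands can be illustrated as follows. This is a minimal sketch: the names `Command`, `delay_value`, and `decide`, and the width of the threshold range (`tolerance=0.05`), are assumptions for illustration; the source fixes only the preset value of 1.

```python
from enum import Enum


class Command(Enum):
    NONE = 0         # delay value is within the threshold range
    SCALE_DOWN = 1   # first control command: lower processor performance
    SCALE_UP = 2     # second control command: raise processor performance


def delay_value(elapsed_ms, target_ms):
    """Ratio of the stored elapsed time to the preset target time."""
    return elapsed_ms / target_ms


def decide(delay, preset=1.0, tolerance=0.05):
    """Map a delay value to a control command (tolerance is a hypothetical width)."""
    if abs(delay - preset) <= tolerance:
        return Command.NONE
    return Command.SCALE_DOWN if delay < preset else Command.SCALE_UP


late = decide(delay_value(120.0, 100.0))    # behind the target
early = decide(delay_value(80.0, 100.0))    # ahead of the target
on_time = decide(delay_value(102.0, 100.0)) # within the threshold range
```

A delay value below 1 means the milestone was reached earlier than planned, so performance can be lowered; a value above 1 means the computation is running late, so performance is raised.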

In an electronic device that performs an operation using a deep learning model according to various embodiments disclosed herein, the delay controller may measure the degree to which the obtained delay value is outside the threshold range, determine, based on the measured degree, the amount by which the performance of the at least one processor is adjusted downward or upward, and generate the first control command or the second control command according to the determined amount.
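One way to turn the measured degree into an adjustment amount is a proportional rule, sketched below. The source does not specify the mapping, so everything here is an assumption: the proportional scaling with `gain`, the threshold width `tolerance`, the function name `next_frequency`, and the list of available frequencies, which is hypothetical and not read from any real device.

```python
def next_frequency(current_khz, delay, available_khz,
                   preset=1.0, tolerance=0.05, gain=1.0):
    """Pick the next processor frequency from the degree of deviation.

    The degree is how far the delay value lies outside the threshold range
    [preset - tolerance, preset + tolerance].
    """
    degree = abs(delay - preset) - tolerance
    if degree <= 0:
        return current_khz                      # within range: no change

    direction = 1 if delay > preset else -1     # late -> up, early -> down
    desired = current_khz * (1 + direction * gain * degree)

    steps = sorted(available_khz)
    if direction > 0:
        # Smallest available frequency that meets the desired value.
        higher = [f for f in steps if f >= desired]
        return higher[0] if higher else steps[-1]
    # Largest available frequency that does not exceed the desired value.
    lower = [f for f in steps if f <= desired]
    return lower[-1] if lower else steps[0]


# Hypothetical frequency table in kHz.
available_khz = [600_000, 900_000, 1_200_000, 1_500_000, 1_800_000]
up = next_frequency(1_200_000, 1.2, available_khz)
down = next_frequency(1_200_000, 0.8, available_khz)
same = next_frequency(1_200_000, 1.03, available_khz)
capped = next_frequency(1_800_000, 2.0, available_khz)
```

A larger deviation produces a larger requested change, and the result is clamped to the frequencies the processor actually offers.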

A method of controlling an electronic device that performs an operation using a deep learning model according to various embodiments disclosed herein may include: performing, when input data is received, an operation on the input data using the deep learning model; identifying, while the operation on the input data is performed, a milestone identifier pre-allocated to at least one layer among a plurality of layers included in the deep learning model; in response to the milestone identifier being identified, measuring and storing the time elapsed from the start of the operation on the input data to the time point at which the milestone identifier is identified; obtaining a delay value using the stored elapsed time and a preset target time; determining whether the obtained delay value is within a threshold range around a preset value; generating, based on the determination result, a control command for adjusting the performance of at least one processor; changing the frequency of the at least one processor in the electronic device according to the generated control command; and, in response to a termination identifier being identified while the operation on the input data continues on the at least one processor whose frequency has been changed, providing the data computed up to the time point at which the termination identifier is identified as output data corresponding to the input data.

In a method of controlling an electronic device that performs an operation using a deep learning model according to various embodiments disclosed herein, performing the operation on the input data may further include: selecting, from among the plurality of layers, a first plurality of layers used to perform the operation on the input data; identifying the type of each of the selected first plurality of layers; and performing the operation on the input data using the operation formula corresponding to the identified type of each of the first plurality of layers.

A method of controlling an electronic device that performs an operation using a deep learning model according to various embodiments disclosed herein may further include: checking preset quality of service (QoS) data and compressing the deep learning model based on the checked QoS data; profiling the compressed deep learning model to measure the operation time of each of the plurality of layers in the compressed deep learning model; selecting, using the measured operation times, one or more layers among the plurality of layers that operate at each preset time interval; and allocating a milestone identifier to each of the selected one or more layers.

In a method of controlling an electronic device that performs an operation using a deep learning model according to various embodiments disclosed herein, obtaining the delay value may further include obtaining, as the delay value, the ratio of the stored elapsed time to the preset target time.

In a method of controlling an electronic device that performs an operation using a deep learning model according to various embodiments disclosed herein, generating the control command may further include: generating a first control command instructing that the performance of the at least one processor in the electronic device be adjusted downward, in response to determining that the obtained delay value is less than the preset value of 1 and outside the threshold range; and generating a second control command instructing that the performance of the at least one processor in the electronic device be adjusted upward, in response to determining that the obtained delay value is greater than the preset value of 1 and outside the threshold range.

In a method of controlling an electronic device that performs an operation using a deep learning model according to various embodiments disclosed herein, generating the control command may further include: measuring the degree to which the obtained delay value is outside the threshold range; determining, based on the measured degree, the amount by which the performance of the at least one processor is adjusted downward or upward; and generating the first control command or the second control command according to the determined amount.

A method of controlling an electronic device that performs an operation using a deep learning model according to an embodiment of the present invention may be implemented as a computer program stored in a storage medium for execution in combination with a computer.

In addition, a method of controlling an electronic device that performs an operation using a deep learning model according to an embodiment of the present invention may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.

As described above, the present invention has been described with reference to specific matters such as particular components, and to limited embodiments and drawings, but these are provided only to aid a more general understanding of the present invention. The present invention is not limited to the above embodiments, and those of ordinary skill in the art to which the present invention pertains can make various modifications and variations from this description.

Therefore, the spirit of the present invention should not be limited to the described embodiments, and not only the claims set forth below but also everything equal to or equivalently modified from those claims falls within the scope of the spirit of the present invention.

Claims

An electronic device that performs an operation using a deep learning model, the electronic device comprising:
a layer manager configured to, when input data is received, perform an operation on the input data using the deep learning model and, while the operation on the input data is performed, identify a milestone identifier pre-allocated to at least one layer among a plurality of layers included in the deep learning model;
a performance checking unit configured to, in response to the milestone identifier being identified, measure and store the time elapsed from the start of the operation on the input data to the time point at which the milestone identifier is identified, and obtain a delay value using the stored elapsed time and a preset target time;
a delay controller configured to determine whether the obtained delay value is within a threshold range around a preset value, and to generate, based on the determination result, a control command for adjusting the performance of at least one processor;
a resource manager configured to change the frequency of the at least one processor in the electronic device according to the generated control command; and
an input/output unit configured to, in response to a termination identifier being identified while the operation on the input data continues on the at least one processor whose frequency has been changed, provide the data computed up to the time point at which the termination identifier is identified as output data corresponding to the received input data.

The electronic device of claim 1,
wherein the layer manager
selects, from among the plurality of layers, a first plurality of layers used to perform the operation on the input data, identifies the type of each of the selected first plurality of layers, and performs the operation on the input data using the operation formula corresponding to the identified type of each of the first plurality of layers.

The electronic device of claim 1, further comprising:
a model compression unit configured to check preset quality of service (QoS) data and compress the deep learning model based on the checked QoS data; and
a milestone identifier allocator configured to profile the compressed deep learning model to measure the operation time of each of the plurality of layers in the compressed deep learning model, select, using the measured operation times, one or more layers among the plurality of layers that operate at each preset time interval, and allocate a milestone identifier to each of the selected one or more layers.

The electronic device of claim 1,
wherein the performance checking unit
obtains, as the delay value, the ratio of the stored elapsed time to the preset target time.

The electronic device of claim 4,
wherein the delay controller
generates a first control command instructing that the performance of the at least one processor in the electronic device be adjusted downward, in response to determining that the obtained delay value is less than the preset value of 1 and outside the threshold range, and
generates a second control command instructing that the performance of the at least one processor in the electronic device be adjusted upward, in response to determining that the obtained delay value is greater than the preset value of 1 and outside the threshold range.

The electronic device of claim 5,
wherein the delay controller
measures the degree to which the obtained delay value is outside the threshold range, determines, based on the measured degree, the amount by which the performance of the at least one processor is adjusted downward or upward, and generates the first control command or the second control command according to the determined amount.

A method of controlling an electronic device that performs an operation using a deep learning model, the method comprising:
performing, when input data is received, an operation on the input data using the deep learning model;
identifying, while the operation on the input data is performed, a milestone identifier pre-allocated to at least one layer among a plurality of layers included in the deep learning model;
in response to the milestone identifier being identified, measuring and storing the time elapsed from the start of the operation on the input data to the time point at which the milestone identifier is identified;
obtaining a delay value using the stored elapsed time and a preset target time;
determining whether the obtained delay value is within a threshold range around a preset value;
generating, based on the determination result, a control command for adjusting the performance of at least one processor;
changing the frequency of the at least one processor in the electronic device according to the generated control command; and
in response to a termination identifier being identified while the operation on the input data continues on the at least one processor whose frequency has been changed, providing the data computed up to the time point at which the termination identifier is identified as output data corresponding to the input data.

The method of claim 7,
wherein performing the operation on the input data further comprises:
selecting, from among the plurality of layers, a first plurality of layers used to perform the operation on the input data;
identifying the type of each of the selected first plurality of layers; and
performing the operation on the input data using the operation formula corresponding to the identified type of each of the first plurality of layers.

The method of claim 7, further comprising:
checking preset quality of service (QoS) data and compressing the deep learning model based on the checked QoS data;
profiling the compressed deep learning model to measure the operation time of each of the plurality of layers in the compressed deep learning model;
selecting, using the measured operation times, one or more layers among the plurality of layers that operate at each preset time interval; and
allocating a milestone identifier to each of the selected one or more layers.

The method of claim 7,
wherein obtaining the delay value further comprises
obtaining, as the delay value, the ratio of the stored elapsed time to the preset target time.

The method of claim 10,
wherein generating the control command further comprises:
generating a first control command instructing that the performance of the at least one processor in the electronic device be adjusted downward, in response to determining that the obtained delay value is less than the preset value of 1 and outside the threshold range; and
generating a second control command instructing that the performance of the at least one processor in the electronic device be adjusted upward, in response to determining that the obtained delay value is greater than the preset value of 1 and outside the threshold range.

The method of claim 10,
wherein generating the control command further comprises:
measuring the degree to which the obtained delay value is outside the threshold range;
determining, based on the measured degree, the amount by which the performance of the at least one processor is adjusted downward or upward; and
generating the first control command or the second control command according to the determined amount.

A computer-readable recording medium having recorded thereon a program for causing a computer to perform the method of any one of claims 7 to 12.

A computer program stored in a storage medium for executing, in combination with a computer, the method of any one of claims 7 to 12.