KR20220025579A

KR20220025579A - System and method for providing inference service based on deep neural network

Info

Publication number: KR20220025579A
Application number: KR1020200106508A
Authority: KR
Inventors: 이창식
Original assignee: 한국전자통신연구원
Priority date: 2020-08-24
Filing date: 2020-08-24
Publication date: 2022-03-03
Also published as: KR102653006B1

Abstract

Provided is a method for providing an inference service based on a deep neural network. The method includes: performing, in response to an inference request for input data to the deep neural network, an inference operation up to a preset layer at an edge node; calculating, at the edge node, an entropy value based on a probability vector corresponding to each result value for the input data, using the result of that inference operation (hereinafter, the first inference result); comparing the entropy value with a plurality of preset threshold values at the edge node; and, based on the comparison, either providing as the result value a result value corresponding to the first inference result or an unknown inference result, or transmitting the first inference result to a cloud node. The cloud node receives the first inference result from the edge node and performs an inference operation, and the final second inference result is provided as the result value.

Description

System and method for providing inference service based on deep neural network

The present invention relates to a system and method for providing an inference service based on a deep neural network, and more particularly to a system and method that handle the computation result of a deep neural network so as to provide a resource-efficient inference service in an edge computing environment.

Inference services based on deep neural networks (DNNs) are used in many fields, such as medicine, computer vision, natural language processing, and autonomous driving, owing to their high accuracy.

In addition, edge computing architectures, which collect and process data close to the terminal, have emerged to provide DNN-based low-latency inference services.

However, because DNN computation requires a large amount of computation, storage space, and energy, it is difficult in practice to provide a stable DNN inference service on current edge devices.

An embodiment of the present invention provides a system and method for providing an inference service based on a deep neural network that, in a distributed inference structure between an edge terminal and a cloud terminal, enable resource-efficient processing according to the uncertainty of the edge terminal's computation result.

However, the technical problem to be solved by the present embodiment is not limited to the one described above, and other technical problems may exist.

As a technical means for solving the above technical problem, a method according to a first aspect of the present invention for providing an inference service, in a system that includes at least one edge node and a cloud node configured on the basis of a deep neural network, includes: performing, in response to an inference request for input data to the deep neural network, an inference operation up to a predefined layer of the edge node; calculating, at the edge node, an entropy value based on a probability vector corresponding to each result value for the input data, using the result of the inference operation (hereinafter, the first inference result); comparing, at the edge node, the entropy value with a plurality of preset threshold values; and, based on the comparison with the plurality of threshold values, providing as a result value either a result value corresponding to the first inference result or an unknown inference result, or transmitting the first inference result to the cloud node. The cloud node receives the first inference result transmitted from the edge node, performs an inference operation, and provides the final second inference result as a result value.

In some embodiments of the present invention, the comparing of the entropy value with the plurality of preset threshold values at the edge node may include comparing the entropy value with each of a first threshold value and a second threshold value, the second threshold value being set greater than the first threshold value.

In some embodiments of the present invention, the step of providing a result value corresponding to the first inference result or an unknown inference result, or transmitting the first inference result to the cloud node, based on the comparison with the plurality of threshold values, may include, when the entropy value is less than the first threshold value, providing, as the result value corresponding to the first inference result, the result value that has the largest probability in the probability vector for the input data.

Some embodiments of the present invention may further include, once the result value corresponding to the first inference result has been provided, the edge node performing an inference operation in response to an inference request for the next input data.

In some embodiments of the present invention, the step of providing a result value corresponding to the first inference result or an unknown inference result, or transmitting the first inference result to the cloud node, based on the comparison with the plurality of threshold values, may include transmitting the first inference result to the cloud node when the entropy value is greater than or equal to the first threshold value and less than the second threshold value.

In some embodiments of the present invention, the step of providing a result value corresponding to the first inference result or an unknown inference result, or transmitting the first inference result to the cloud node, based on the comparison with the plurality of threshold values, may include providing the unknown inference result as the result value when the entropy value exceeds the second threshold value.
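The three entropy ranges described in these embodiments can be sketched as a small routing function. This is only an illustrative sketch: the names `Action` and `route` are hypothetical, and the handling of an entropy value exactly equal to the second threshold is an assumption, since the text specifies only "less than" and "exceeds" for that boundary.

```python
from enum import Enum


class Action(Enum):
    LOCAL_RESULT = "local_result"   # return the edge node's own prediction
    FORWARD_TO_CLOUD = "forward"    # send the first inference result to the cloud node
    UNKNOWN = "unknown"             # return "unknown" without contacting the cloud node


def route(entropy: float, t1: float, t2: float) -> Action:
    """Choose one of the three outcomes from the entropy and two thresholds."""
    if not t1 < t2:
        raise ValueError("the second threshold must be greater than the first")
    if entropy < t1:
        return Action.LOCAL_RESULT
    if entropy < t2:
        return Action.FORWARD_TO_CLOUD
    # Assumption: entropy == t2 is treated like entropy > t2, because the
    # text does not say which side of the boundary equality falls on.
    return Action.UNKNOWN
```

For example, with `t1 = 0.5` and `t2 = 1.0`, an entropy of 0.7 would be forwarded to the cloud node, while an entropy of 1.5 would yield an unknown result at the edge node.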

Some embodiments of the present invention may further include, once the unknown inference result has been provided as the result value, the edge node performing an inference operation in response to an inference request for the next input data.

Some embodiments of the present invention may further include: preparing in advance, at a test node, input data for learning (hereinafter, learning data) and result values corresponding to the learning data; updating the second threshold value at the test node based on the result values obtained at the edge node and the cloud node for the learning data; transmitting the updated second threshold value to the edge node; and applying the updated second threshold value at the edge node.

In some embodiments of the present invention, the updating of the second threshold value at the test node based on the result values obtained at the edge node and the cloud node for the learning data may include: transmitting, by the test node, the learning data to the edge node; performing, at the edge node and in response to an inference request for the learning data, an inference operation up to the predefined layer of the edge node; calculating, at the edge node, an entropy value based on a probability vector corresponding to each result value for the learning data, using the result of the inference operation on the learning data (hereinafter, the third inference result); comparing, at the edge node, the entropy value with the plurality of preset threshold values; based on the comparison with the plurality of threshold values, providing as a result value either a result value corresponding to the third inference result or an unknown inference result, or transmitting the third inference result to the cloud node; and receiving, by the cloud node, the third inference result transmitted from the edge node, performing an inference operation, and providing the final fourth inference result as a result value.

In some embodiments of the present invention, the updating of the second threshold value at the test node based on the result values obtained at the edge node and the cloud node for the learning data may include updating the second threshold value such that the previous second threshold value is maintained when the result value received from the edge node is a result value corresponding to the third inference result, and such that the previous second threshold value is increased when the result value received from the edge node is the unknown inference result.

In some embodiments of the present invention, the updating of the second threshold value at the test node based on the result values obtained at the edge node and the cloud node for the learning data may include updating the second threshold value such that the previous second threshold value is increased when the result value received from the cloud node matches the result value prepared in advance, and such that the previous second threshold value is decreased when the result value received from the cloud node differs from the result value prepared in advance.

In some embodiments of the present invention, the updating of the second threshold value at the test node based on the result values obtained at the edge node and the cloud node for the learning data may include: applying a preset weight between 0 and 1 to the second threshold value; determining whether the result value received by the test node came from the edge node or from the cloud node; and, based on that determination, updating the second threshold value by adding to the weighted second threshold value an indicator that takes a positive value, zero, or a negative value.
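The update rules for the second threshold value can be sketched as follows. This is a hedged illustration, not the formula from the specification: the function names, the `step` parameter, the string labels `"edge"`, `"cloud"`, and `"unknown"`, and the form `weight * t2 + step * indicator` are all assumptions introduced here. Note that with a weight strictly below 1 and an indicator of 0 this assumed form shrinks the threshold slightly rather than holding it exactly, so the form actually intended may differ.

```python
def indicator(source: str, result, expected) -> int:
    """Return +1, 0, or -1 according to which node answered and how.

    source   -- "edge" or "cloud" (hypothetical labels for the answering node)
    result   -- the value that node returned; "unknown" marks an unknown result
    expected -- the result value prepared in advance for this learning sample
    """
    if source == "edge":
        # Edge answered on its own: keep the threshold. Edge said unknown: raise it.
        return 1 if result == "unknown" else 0
    if source == "cloud":
        # Cloud answer matched the prepared value: raise. Otherwise: lower.
        return 1 if result == expected else -1
    raise ValueError(f"unexpected source: {source!r}")


def update_second_threshold(t2: float, weight: float, ind: int, step: float = 0.05) -> float:
    """Assumed update form: weighted previous threshold plus a scaled indicator."""
    return weight * t2 + step * ind
```

A test node could call `indicator(...)` once per learning sample and feed the result into `update_second_threshold(...)` before sending the new value to the edge node.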

In addition, a system for providing an inference service based on a deep neural network according to a second aspect of the present invention includes: at least one edge node that performs, in response to an inference request for input data to the deep neural network, an inference operation up to a predefined layer, calculates an entropy value based on a probability vector corresponding to each result value for the input data using the result of the inference operation (hereinafter, the first inference result), and, based on a comparison of the entropy value with a plurality of preset threshold values, provides as a result value either a result value corresponding to the first inference result or an unknown inference result; a cloud node that receives the first inference result from the edge node depending on the comparison of the entropy value with the plurality of preset threshold values at the edge node, performs an inference operation, and provides the final second inference result as a result value; and a test node that updates the threshold values based on the result values obtained at the edge node and the cloud node for input data for learning that includes corresponding result values (hereinafter, learning data), and transmits the updated threshold values to the edge node.

In addition, a system for providing an inference service based on a deep neural network according to a third aspect of the present invention includes: at least one edge node that performs, in response to an inference request for input data to the deep neural network, an inference operation up to a predefined layer, calculates an entropy value based on a probability vector corresponding to each result value for the input data using the result of the inference operation (hereinafter, the first inference result), and, based on a comparison of the entropy value with a plurality of preset threshold values, provides as a result value either a result value corresponding to the first inference result or an unknown inference result; and a cloud node that receives the first inference result from the edge node depending on the comparison of the entropy value with the plurality of preset threshold values at the edge node, performs an inference operation, and provides the final second inference result as a result value.

In some embodiments of the present invention, the plurality of preset threshold values are a first threshold value and a second threshold value set greater than the first threshold value. When the entropy value is less than the first threshold value, the edge node may provide, as the result value corresponding to the first inference result, the result value that has the largest probability in the probability vector for the input data; when the entropy value is greater than or equal to the first threshold value and less than the second threshold value, the edge node may transmit the first inference result to the cloud node; and when the entropy value exceeds the second threshold value, the edge node may provide the unknown inference result as the result value.

Some embodiments of the present invention may further include a test node that updates the second threshold value based on the result values obtained at the edge node and the cloud node for input data for learning that includes corresponding result values (hereinafter, learning data), and transmits the updated second threshold value to the edge node, and the edge node may receive and apply the second threshold value.

In some embodiments of the present invention, the test node transmits the learning data to the edge node. The edge node performs, in response to an inference request for the learning data, an inference operation up to the predefined layer, calculates an entropy value based on a probability vector corresponding to each result value for the learning data using the result of the inference operation (hereinafter, the third inference result), and, based on a comparison of the entropy value with the plurality of preset threshold values, either provides as a result value a result value corresponding to the third inference result or an unknown inference result, or transmits the third inference result to the cloud node. The cloud node may receive the third inference result transmitted from the edge node, perform an inference operation, and provide the final fourth inference result as a result value.

In some embodiments of the present invention, the test node may update the second threshold value such that the previous second threshold value is maintained when the result value received from the edge node is a result value corresponding to the third inference result, and such that the previous second threshold value is increased when the result value received from the edge node is the unknown inference result.

In some embodiments of the present invention, the test node may update the second threshold value such that the previous second threshold value is increased when the result value received from the cloud node matches the result value prepared in advance, and such that the previous second threshold value is decreased when the result value received from the cloud node differs from the result value prepared in advance.

In some embodiments of the present invention, the test node may apply a preset weight between 0 and 1 to the second threshold value, determine whether the result value was received from the edge node or from the cloud node, and update the second threshold value by adding to the weighted second threshold value an indicator that takes a positive value, zero, or a negative value.

In addition, other methods and other systems for implementing the present invention, and a computer-readable recording medium storing a computer program for executing the method, may further be provided.

According to the present invention as described above, overhead such as unnecessary communication cost and power consumption at the edge node, and the computation cost of additional operations at the cloud node, can be reduced.

In addition, because the cloud node does not perform unnecessary additional operations, inference latency can be reduced.

In addition, even in a realistic situation where an edge node serves many terminals, input data arriving from multiple terminals can be processed resource-efficiently.

Furthermore, the invention does not depend on a specific distributed deep neural network architecture and can be applied to a variety of such architectures.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the following description.

FIG. 1 is a diagram for explaining the structure of a conventional convolutional neural network.
FIG. 2 is a diagram for explaining a system for providing an inference service based on a deep neural network according to an embodiment of the present invention.
FIG. 3 is a diagram for explaining the hardware of each node according to an embodiment of the present invention.
FIG. 4 is a flowchart of a method for providing an inference service based on a deep neural network according to an embodiment of the present invention.
FIGS. 5 and 6 are diagrams for explaining how a threshold value is updated through a test node.

The advantages and features of the present invention, and the methods of achieving them, will become apparent from the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. The embodiments are provided only so that the disclosure of the present invention is complete and so that those of ordinary skill in the art to which the present invention pertains are fully informed of its scope; the present invention is defined only by the scope of the claims.

The terminology used herein is for the purpose of describing the embodiments and is not intended to limit the present invention. In this specification, the singular also includes the plural unless the context specifically states otherwise. As used herein, "comprises" and/or "comprising" does not exclude the presence or addition of one or more components other than those stated. Like reference numerals refer to like elements throughout the specification, and "and/or" includes each of the recited elements and every combination of one or more of them. Although "first", "second", and the like are used to describe various elements, these elements are not limited by these terms. These terms are used only to distinguish one element from another. Accordingly, a first element mentioned below may also be a second element within the technical spirit of the present invention.

Unless otherwise defined, all terms (including technical and scientific terms) used herein have the meaning commonly understood by those of ordinary skill in the art to which the present invention belongs. In addition, terms defined in commonly used dictionaries are not to be interpreted in an idealized or excessive sense unless they are expressly and specifically defined.

FIG. 1 is a diagram for explaining the structure of a conventional convolutional neural network.

To address the problems that DNN computation poses in edge computing structures, such as the large amount of computation, storage space, and energy consumption, the distributed deep neural network (DDNN) structure has been proposed, in which the conventional DNN computation is distributed between a cloud node and an edge node.

In the DDNN structure, when input data is received (S10), the edge device computes only up to some layer of the DNN (that is, an exit point) (S20) and then calculates the entropy (uncertainty) based on the probability vector for the input data (S30).

The calculated entropy is compared with a predetermined threshold. If the entropy is low, the result value corresponding to the largest entry of the probability vector is returned as the inference result, and the subsequent layers are not computed. This is referred to as a local exit.

On the other hand, if the entropy is higher than the threshold and more precise computation is required (S40), the cloud node continues the processing (S50) and returns the final inference result (S60). This is referred to as a cloud exit.
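The entropy-based exit decision of the conventional DDNN structure can be illustrated with a short sketch. The text does not state which entropy formula is used, so unnormalized Shannon entropy with the natural logarithm is an assumption here, as are the function names `entropy` and `local_exit`.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector; zero entries are skipped."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def local_exit(probs, threshold):
    """Return the index of the most probable class for a local exit,
    or None when the entropy is too high and the cloud node must continue."""
    if entropy(probs) < threshold:
        return max(range(len(probs)), key=probs.__getitem__)
    return None
```

A confident vector such as `[0.9, 0.05, 0.05]` has low entropy and exits locally under a threshold of 0.5, whereas a flat vector such as `[0.4, 0.3, 0.3]` does not.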

In this case, the cloud node does not start computing from the first layer. Instead, it receives the intermediate result that the edge node computed up to the layer immediately before the exit point, and proceeds from the subsequent layer.
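The hand-off of an intermediate result can be sketched by treating the network as an ordered list of layer functions split at the exit point. This is a toy illustration under stated assumptions: `run_layers`, `split_at_exit`, and the list-of-callables representation are hypothetical and stand in for whatever model partitioning an actual deployment would use.

```python
from typing import Callable, List, Sequence, Tuple

Layer = Callable[[List[float]], List[float]]


def run_layers(layers: Sequence[Layer], x: List[float]) -> List[float]:
    """Apply the given layers in order and return the final activation."""
    for layer in layers:
        x = layer(x)
    return x


def split_at_exit(layers: Sequence[Layer], exit_index: int) -> Tuple[Sequence[Layer], Sequence[Layer]]:
    """Layers before exit_index run on the edge node; the rest run on the cloud node."""
    return layers[:exit_index], layers[exit_index:]


# Toy layers standing in for real network layers.
layers = [
    lambda v: [a + 1.0 for a in v],
    lambda v: [a * 2.0 for a in v],
    lambda v: [a - 3.0 for a in v],
]
edge_layers, cloud_layers = split_at_exit(layers, 2)
intermediate = run_layers(edge_layers, [1.0, 2.0])   # computed on the edge node
final = run_layers(cloud_layers, intermediate)       # cloud node resumes from here
```

Because the cloud node resumes from `intermediate`, the combined result equals running all layers in one place, while the edge layers are never recomputed.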

Introducing such a local exit shortens inference time at the edge node without a significant loss of inference accuracy.

However, in the prior art, whenever the entropy that the edge node calculates for a given input is higher than the threshold, the cloud node always performs additional computation. Transferring the intermediate result from the edge node to the cloud node then incurs overhead: unnecessary communication cost and power consumption, plus the computation cost of the additional operations at the cloud node.

In particular, in a structure where the exit point comes after most of the DNN's layers, passing through the few layers that remain after the exit point is very unlikely to improve the entropy of the inference result, so the unnecessary additional computation ends up causing high latency.

Moreover, in a realistic situation where one edge node serves many terminals, the input data arriving from those terminals must be processed resource-efficiently, so this problem is likely to become even more severe.

Therefore, a resource-efficient way of handling the case where the entropy of the inference result at the edge node is high is needed.

The system 100 and method for providing an inference service based on a deep neural network according to an embodiment of the present invention can provide a low-latency, resource-efficient inference service according to the entropy of the computation result of the edge node 110, in a distributed inference structure between the edge node 110 and the cloud node 120 based on a deep neural network.

FIG. 2 is a diagram for explaining the deep neural network-based inference service providing system 100 according to an embodiment of the present invention. FIG. 3 is a diagram for explaining the hardware of each node 110, 120, 130 according to an embodiment of the present invention.

The deep neural network-based inference service providing system 100 according to an embodiment of the present invention includes at least one edge node 110, a cloud node 120, and a test node 130.

In response to an inference request for input data supplied to it, the edge node 110 performs an inference operation up to the layer corresponding to a predefined exit point. From the result of that operation (hereinafter, the first inference result), it calculates an entropy value based on the probability vector corresponding to each result value for the input data.

Then, based on the result of comparing the entropy value with a plurality of preset threshold values, the edge node 110 provides, as the result value, either the result value corresponding to the first inference result or an unknown inference result.

The cloud node 120 receives the first inference result from the edge node 110, depending on the result of comparing the entropy value with the plurality of preset threshold values at the edge node 110. It then performs an inference operation based on the received first inference result and provides the final second inference result as the result value.

The test node 130 updates the preset threshold value applied to the edge node 110. It updates the threshold value based on the result values produced by the edge node 110 and the cloud node 120 for input data for learning that includes its result value (hereinafter, learning data), and transmits the updated threshold value to the edge node 110.

Meanwhile, in an embodiment of the present invention, the edge node 110, the cloud node 120, and the test node 130 may each be configured to include a communication module 210, a memory 220, and a processor 230 that executes a program stored in the memory 220, as shown in FIG. 3.

The communication module 210 is preferably configured as a wireless communication module, although this does not exclude a wired communication module. The wireless communication module may be implemented with wireless LAN (WLAN), Bluetooth, HDR WPAN, UWB, ZigBee, Impulse Radio, 60GHz WPAN, Binary-CDMA, wireless USB technology, wireless HDMI technology, and the like. The wired communication module may be implemented as a power line communication device, a telephone line communication device, cable home (MoCA), Ethernet, IEEE1294, an integrated wired home network, or an RS-485 control device.

The memory 220 collectively refers to non-volatile storage devices, which retain stored information even when power is not supplied, and volatile storage devices. For example, the memory 220 may include NAND flash memory such as a compact flash (CF) card, a secure digital (SD) card, a memory stick, a solid-state drive (SSD), and a micro SD card; magnetic computer storage devices such as a hard disk drive (HDD); and optical disc drives such as CD-ROM and DVD-ROM.

Hereinafter, a method performed by the deep neural network-based inference service providing system 100 according to an embodiment of the present invention will be described with reference to FIGS. 4 to 5.

FIG. 4 is a flowchart of a method for providing an inference service based on a deep neural network according to an embodiment of the present invention.

First, the edge node 110 performs an inference operation up to a predefined layer in response to an inference request for input data to the deep neural network (S110). That is, when the edge node 110 receives input data, it computes up to the layer corresponding to a predefined exit point.
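The split at the exit point can be pictured as slicing an ordered list of layers. The following is a minimal sketch under stated assumptions, not the implementation described in this document: the toy layers, the constant `EXIT_POINT`, and the helpers `run_edge` and `run_cloud` are hypothetical, plain Python lists stand in for tensors, and the classifier at the exit point that produces the probability vector is omitted.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]


def relu(x: Vector) -> Vector:
    """Element-wise ReLU on a plain list."""
    return [max(0.0, v) for v in x]


def scale(factor: float) -> Layer:
    """Return a toy layer that multiplies every element by `factor`."""
    return lambda x: [factor * v for v in x]


# Hypothetical four-layer network; a real DNN would use a tensor library.
LAYERS: List[Layer] = [scale(2.0), relu, scale(0.5), relu]
EXIT_POINT = 2  # assumption: the edge node runs layers [0, EXIT_POINT)


def run_edge(x: Vector, layers: Sequence[Layer] = LAYERS,
             exit_point: int = EXIT_POINT) -> Vector:
    """Run the layers that precede the exit point (edge-node side)."""
    for layer in layers[:exit_point]:
        x = layer(x)
    return x


def run_cloud(intermediate: Vector, layers: Sequence[Layer] = LAYERS,
              exit_point: int = EXIT_POINT) -> Vector:
    """Continue from the intermediate result (cloud-node side)."""
    x = intermediate
    for layer in layers[exit_point:]:
        x = layer(x)
    return x
```

Chaining `run_cloud(run_edge(x))` reproduces a full forward pass, so the only data that crosses the network in this sketch is the intermediate vector.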

Next, the edge node 110 calculates an entropy value from the first inference result, based on the probability vector corresponding to each result value for the input data (S120). The computation result for the input data is produced as a probability vector over the result values (labels) of the input data, and from it the edge node 110 calculates the entropy value, which represents the inference uncertainty.

Here, the entropy value lies between 0 and 1. A value closer to 0 means higher confidence in the inference result, and a value closer to 1 means lower confidence.
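The text states only that the entropy lies between 0 and 1 and does not give the formula. One common choice that has this range is the Shannon entropy of the probability vector divided by the logarithm of the number of classes. The sketch below assumes that normalization; the function name `normalized_entropy` is hypothetical.

```python
import math
from typing import Sequence


def normalized_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy of `probs`, divided by log(K) so it falls in [0, 1].

    Assumption: `probs` is a valid probability vector (non-negative,
    sums to 1). The log(K) normalization is this sketch's choice and is
    not taken from the source.
    """
    k = len(probs)
    if k < 2:
        return 0.0  # a single-class output carries no uncertainty
    # Terms with p == 0 contribute nothing (limit of p * log p is 0).
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(k)
```

Under this assumption a one-hot vector gives 0 (fully confident) and a uniform vector gives 1 (maximally uncertain), which matches the interpretation in the text.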

Next, the edge node 110 compares the entropy value with a plurality of preset threshold values (S130). In an embodiment of the present invention, the plurality of preset threshold values may be a first threshold value (T_low) and a second threshold value (T_high) set to be greater than the first threshold value.

Next, based on the result of the comparison with the plurality of threshold values, the edge node 110 provides either the result value corresponding to the first inference result or an unknown inference result as the result value. Depending on the comparison result, the edge node 110 may instead transmit the first inference result to the cloud node 120, which then performs an inference operation and provides the final second inference result as the result value.

Specifically, the edge node 110 operates as follows, according to the result of the comparison with the first threshold value and the second threshold value.

First, when the entropy value is less than the first threshold value (S141), the edge node 110 provides the result value that has the largest probability in the probability vector for the input data as the result value corresponding to the first inference result (S143).

An entropy value lower than the first threshold value means that confidence in the inference result is high, so the result value with the largest probability in the previously computed probability vector is returned as the result value corresponding to the first inference result.

This corresponds to the local exit described with reference to FIG. 1: computation after the exit point is not performed, and the edge node moves to a state for processing the next input data. That is, once the edge node 110 has provided the result value corresponding to the first inference result, it performs the inference operation for the inference request on the next input data.

Next, when the entropy value is greater than or equal to the first threshold value and less than the second threshold value (S151), the edge node 110 transmits the first inference result to the cloud node 120 (S153). This is the case in which additional computation is judged to be necessary, and the edge node 110 transmits the intermediate result computed up to the layer just before the exit point to the cloud node 120.

On receiving the first inference result from the edge node 110, the cloud node 120 performs an inference operation and provides the final second inference result as the result value (cloud exit).

Finally, when the entropy value exceeds the second threshold value (S161), the edge node 110 provides an unknown inference result as the result value (S163). In this case the first inference result at the edge node 110 is very uncertain, and even if it were transmitted to the cloud node 120 for additional computation, a highly reliable inference result would be extremely unlikely. The edge node 110 therefore discards the inference result obtained so far and returns the unknown result value.

After providing the unknown result value, the edge node 110 performs the inference operation for the inference request on the next input data.
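The three cases S141, S151, and S161 amount to a two-threshold routing decision. The sketch below is illustrative only; the names `Route` and `route` are hypothetical. The text assigns "less than the second threshold" to the cloud path and "exceeds the second threshold" to the unknown path, leaving the case of exact equality unspecified, so this sketch assumes that an entropy equal to `t_high` is treated as unknown.

```python
from enum import Enum
from typing import Optional, Sequence, Tuple


class Route(Enum):
    LOCAL_EXIT = "local_exit"  # answer at the edge node (S143)
    TO_CLOUD = "to_cloud"      # forward the intermediate result (S153)
    UNKNOWN = "unknown"        # give up without using the cloud (S163)


def route(entropy: float, probs: Sequence[float],
          t_low: float, t_high: float) -> Tuple[Route, Optional[int]]:
    """Return the routing decision and, for a local exit, the label index."""
    if not t_low < t_high:
        raise ValueError("t_high must be greater than t_low")
    if entropy < t_low:
        # Confident: return the label with the largest probability.
        label = max(range(len(probs)), key=lambda i: probs[i])
        return Route.LOCAL_EXIT, label
    if entropy < t_high:
        # Moderately uncertain: let the cloud node finish the computation.
        return Route.TO_CLOUD, None
    # Very uncertain (assumption: entropy == t_high also lands here).
    return Route.UNKNOWN, None
```

Only the middle band generates network traffic, which is where the claimed savings in communication and cloud computation come from.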

As described above, in an embodiment of the present invention, computation after the exit point is not performed when the inference result is very uncertain. This reduces the communication cost and power consumed in transmitting data to the cloud node 120, as well as the computation cost at the cloud node 120, so resources can be used more efficiently to process other input data.

FIGS. 5 and 6 are diagrams for explaining how the threshold value is updated through the test node 130.

In an embodiment, input data for learning and the result values corresponding to the learning data are prepared in advance at the test node 130 (S210).

Next, the test node 130 updates the second threshold value based on the result values produced by the edge node 110 and the cloud node 120 for the learning data (S220).

Specifically, the learning data prepared at the test node 130 is transmitted to the edge node 110 through a transmission path. The test node 130, the edge node 110, and the cloud node 120 are connected to one another through transmission paths, and the test node 130 and the edge node 110 are additionally connected through a control path over which the threshold value can be updated and applied.

Next, as with the inference operation on ordinary input data, the edge node 110 performs an inference operation up to its predefined layer in response to the inference request for the learning data.

Then, from the result of the inference operation on the learning data (hereinafter, the third inference result), the edge node 110 calculates an entropy value based on the probability vector corresponding to each result value for the learning data.

Next, the edge node 110 compares the entropy value with the plurality of preset threshold values and, based on the comparison result, provides either the result value corresponding to the third inference result or an unknown inference result as the result value. Alternatively, when the edge node 110 transmits the third inference result to the cloud node 120, the cloud node 120 performs an inference operation on the received third inference result and provides the final fourth inference result for the learning data as the result value.

When the inference operation on the learning data is completed at the edge node 110 or the cloud node 120, the threshold value is updated according to the formula below. The threshold value updated here may be the second threshold value (T_high).

[Formula]

In the formula above, α is the weight given to the previous second threshold value each time the second threshold value is updated. The test node 130 applies this preset weight (α), defined as a constant between 0 and 1, to the previous second threshold value.

The closer the weight (α) is to 1, the more weight is placed on the previous second threshold value, so the updated second threshold value changes slowly; the closer α is to 0, the more abruptly it changes.

I is an indicator for the result value received at the test node 130. The test node 130 checks whether the received result value came from the edge node 110 or from the cloud node 120 (S221).

Based on the result of that check, the second threshold value is updated by adding the indicator (I), which takes a positive value, 0, or a negative value, to the second threshold value to which the weight (α) has been applied.

In an embodiment, when the result value received from the edge node 110 is the result value corresponding to the third inference result (S222-N), the test node 130 updates the second threshold value with the indicator set to 0, so that the previous second threshold value is maintained (S224).

In contrast, when the result value received from the edge node 110 is the unknown result value (S222-Y), the test node 130 updates the second threshold value with the indicator set to a positive value (for example, +1), so that the previous second threshold value is increased (S223).

In another embodiment, when the result value received from the cloud node 120 matches the result value prepared in advance (S225-Y), the test node 130 updates the second threshold value with the indicator set to a positive value (for example, +1), so that the second threshold value is increased (S226).

In contrast, when the result value received from the cloud node 120 is an incorrect answer that differs from the result value prepared in advance (S225-N), the test node 130 updates the second threshold value with the indicator set to a negative value (for example, -1), so that the second threshold value is decreased (S227).

In other words, the second threshold value is increased when the result value for the learning data at the edge node 110 is an unknown inference result, or when the inference result value for the learning data at the cloud node 120 is correct. It is decreased when the inference result value for the learning data at the cloud node 120 is incorrect.
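The update rules above can be sketched as follows. The formula itself is not reproduced in this text, so the expression used here is an assumption chosen only to agree with the stated properties: an indicator of 0 leaves the threshold unchanged, a positive indicator raises it, a negative indicator lowers it, and a weight α closer to 1 makes the change smaller. The function names, the string tags `"edge"` and `"cloud"`, and the use of `None` for an unknown result are all hypothetical.

```python
from typing import Optional


def indicator(source: str, result: Optional[str], expected: str) -> int:
    """Return the indicator I for one learning sample.

    `source` is "edge" or "cloud"; `result` is the returned label, with
    None standing for the unknown inference result (sketch convention).
    """
    if source == "edge":
        # Edge answered locally -> keep (0); edge returned unknown -> raise (+1).
        return 1 if result is None else 0
    if source == "cloud":
        # Cloud correct -> raise (+1); cloud wrong -> lower (-1).
        return 1 if result == expected else -1
    raise ValueError(f"unexpected source: {source!r}")


def update_t_high(t_high: float, alpha: float, ind: int) -> float:
    """Assumed update: alpha * T + (1 - alpha) * (T + I).

    This equals T + (1 - alpha) * I, so I == 0 keeps T unchanged and a
    larger alpha yields a smaller step. It is NOT the source's formula.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return alpha * t_high + (1.0 - alpha) * (t_high + ind)
```

With `alpha = 0.9` and a threshold of 0.6, an indicator of +1 moves the threshold to about 0.7 and an indicator of -1 moves it to about 0.5. A practical version would probably also clamp the result so that it stays above T_low and within the entropy range, which this sketch leaves out.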

Thereafter, the test node 130 transmits the updated second threshold value to the edge node 110 through the control path (S230), and the edge node 110 applies the updated second threshold value (S240).

The test node 130 transmits the updated second threshold value to the edge node 110 through the control path at a preset period or whenever the second threshold value is updated. The edge node 110 applies the updated second threshold value and uses it, from the next input data onward, as the reference value for controlling the computation of the deep neural network.

Meanwhile, in the above description, steps S110 to S240 may be further divided into additional steps or combined into fewer steps, depending on the implementation of the present invention. Some steps may be omitted as necessary, and the order of the steps may be changed. In addition, even where not repeated here, the description given for FIGS. 1 to 3 also applies to the method of providing an inference service based on a deep neural network of FIGS. 4 to 6.

The method for providing an inference service based on a deep neural network according to an embodiment of the present invention described above may be implemented as a program (or application) and stored in a medium so as to be executed in combination with a server, which is hardware.

The program described above may include code written in a computer language such as C, C++, JAVA, or machine language that the processor (CPU) of the computer can read through the device interface of the computer, so that the computer reads the program and executes the methods implemented as the program. Such code may include functional code related to functions that define the functionality needed to execute the methods, and may include control code related to the execution procedure needed for the processor of the computer to execute that functionality in a predetermined order. The code may further include memory-reference code indicating at which location (address) in the internal or external memory of the computer the additional information or media needed for the processor to execute the functionality should be referenced. In addition, when the processor of the computer needs to communicate with another remote computer or server in order to execute the functionality, the code may further include communication-related code specifying how to communicate with the remote computer or server using the communication module of the computer, and what information or media to transmit and receive during communication.

The storage medium refers not to a medium that stores data for a short moment, such as a register, a cache, or a memory, but to a medium that stores data semi-permanently and can be read by a device. Specific examples of the storage medium include, but are not limited to, ROM, RAM, CD-ROM, magnetic tape, floppy disk, and optical data storage devices. That is, the program may be stored in various recording media on various servers that the computer can access, or in various recording media on the user's computer. The medium may also be distributed over computer systems connected through a network, so that computer-readable code is stored in a distributed manner.

The steps of a method or algorithm described in connection with an embodiment of the present invention may be implemented directly in hardware, in a software module executed by hardware, or in a combination of the two. A software module may reside in random access memory (RAM), read only memory (ROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, a hard disk, a removable disk, a CD-ROM, or any other form of computer-readable recording medium well known in the art to which the present invention pertains.

Although embodiments of the present invention have been described above with reference to the accompanying drawings, those skilled in the art to which the present invention pertains will understand that the present invention may be embodied in other specific forms without changing its technical spirit or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive.

100: inference service providing system
110: edge node
120: cloud node
130: test node
210: communication module
220: memory
230: processor

Claims

A method for providing an inference service in a system including at least one edge node and a cloud node configured based on a deep neural network, the method comprising:
performing, at the edge node, an inference operation up to a predefined layer of the edge node in response to an inference request for input data to the deep neural network;
calculating, at the edge node, an entropy value based on a probability vector corresponding to each result value for the input data, from a result of the inference operation (hereinafter, a first inference result);
comparing, at the edge node, the entropy value with a plurality of preset threshold values; and
based on a result of the comparison with the plurality of threshold values, providing a result value corresponding to the first inference result or an unknown inference result as a result value, or transmitting the first inference result to the cloud node,
wherein the cloud node receives the first inference result transmitted from the edge node, performs an inference operation, and provides a final second inference result as a result value.

The method of claim 1,
wherein comparing the entropy value with the plurality of preset threshold values at the edge node comprises comparing the entropy value with each of a first threshold value and a second threshold value among the plurality of preset threshold values, the second threshold value being set to be greater than the first threshold value.

The method of claim 2,
wherein providing the result value corresponding to the first inference result or the unknown inference result as a result value, or transmitting the first inference result to the cloud node, comprises:
when the entropy value is less than the first threshold value, providing the result value having the largest probability in the probability vector for the input data as the result value corresponding to the first inference result.

The method of claim 2, further comprising:
performing, by the edge node, an inference operation corresponding to an inference request for next input data once the result value corresponding to the first inference result has been provided.

The method of claim 2,
wherein providing the result value corresponding to the first inference result or the unknown inference result as a result value, or transmitting the first inference result to the cloud node, comprises:
when the entropy value is greater than or equal to the first threshold value and less than the second threshold value, transmitting the first inference result to the cloud node.

The method of claim 2,
wherein providing the result value corresponding to the first inference result or the unknown inference result as a result value, or transmitting the first inference result to the cloud node, comprises:
when the entropy value exceeds the second threshold value, providing the unknown inference result as a result value.

The method of claim 6, further comprising:
performing, by the edge node, an inference operation corresponding to an inference request for next input data once the unknown inference result has been provided as a result value.

The method of claim 2, further comprising:
preparing in advance, at a test node, input data for learning (hereinafter, learning data) and a result value corresponding to the learning data;
updating, at the test node, the second threshold value based on each result value at the edge node and the cloud node corresponding to the learning data;
transmitting the updated second threshold value to the edge node; and
applying the updated second threshold value to the edge node.

9. The method of claim 8,
Updating the second threshold value in the test node based on each result value in the edge node and the cloud node corresponding to the learning data includes:
transmitting, by the test node, the training data to the edge node;
performing an inference operation from the edge node to a predefined layer of the edge node in response to an inference request for the training data;
calculating an entropy value based on a probability vector corresponding to each result value of the training data as a result (hereinafter, a third inference result) according to the inference operation of the training data at the edge node;
comparing the entropy value with a plurality of preset threshold values at the edge node;
providing a result value corresponding to the third inference result or an inference result unknown as a result value based on a result of comparison with the plurality of threshold values, or transmitting the third inference result to the cloud node; and
Comprising the step of receiving, by the cloud node, the third reasoning result transmitted from the edge node, performing a reasoning operation, and providing a final fourth reasoning result as a result value,
A method of providing an inference service including at least one edge node and a cloud node configured based on a deep neural network.
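The entropy calculation recited above can be illustrated with a minimal sketch. The claim states only that the entropy value is based on the probability vector; the use of Shannon entropy with the natural logarithm, the defensive normalisation, and the function name `entropy` are assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector.

    Assumption: the claim does not fix the formula; Shannon entropy
    is used here as one plausible reading.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("probability vector must have positive mass")
    # Normalise defensively so the inputs need not sum exactly to 1.
    normalised = [p / total for p in probs]
    # Zero-probability terms are skipped (p * log p tends to 0 as p -> 0).
    return -sum(p * math.log(p) for p in normalised if p > 0)


# A peaked vector gives a low entropy; a uniform vector gives the maximum.
confident = entropy([0.97, 0.01, 0.01, 0.01])
uncertain = entropy([0.25, 0.25, 0.25, 0.25])
```

Under this reading, a low entropy value indicates that the edge node's result is concentrated on one result value, while a high entropy value indicates that the probability is spread across several result values.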

10. The method of claim 9,
wherein updating the second threshold value at the test node based on each result value at the edge node and the cloud node corresponding to the training data includes:
when the result value received from the edge node is a result value corresponding to the third inference result, updating the second threshold value so that the previous second threshold value is maintained; and
when the result value received from the edge node is the inference result unknown, updating the second threshold value so that the previous second threshold value is increased,
A method of providing an inference service including at least one edge node and a cloud node configured based on a deep neural network.

10. The method of claim 9,
wherein updating the second threshold value at the test node based on each result value at the edge node and the cloud node corresponding to the training data includes:
when the result value received from the cloud node matches the result value prepared in advance, updating the second threshold value so that the previous second threshold value is increased; and
when the result value received from the cloud node differs from the result value prepared in advance, updating the second threshold value so that the previous second threshold value is decreased,
A method of providing an inference service including at least one edge node and a cloud node configured based on a deep neural network.

10. The method of claim 9,
wherein updating the second threshold value at the test node based on each result value at the edge node and the cloud node corresponding to the training data includes:
assigning a preset weight between 0 and 1 to the second threshold value;
checking whether the result value received at the test node was received from the edge node or from the cloud node; and
based on a result of the check, updating the second threshold value by adding, to the weighted second threshold value, an indicator determined to be a positive value, 0, or a negative value,
A method of providing an inference service including at least one edge node and a cloud node configured based on a deep neural network.
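The weighted update recited above can be sketched as follows, reading the claim literally as new threshold = weight × previous threshold + indicator. Several points are assumptions not stated in the claims: the sign rules are taken from the separate dependent claims on edge-node and cloud-node results and combined here, the step size of 0.05 is arbitrary, the weight bounds are treated as inclusive, and the names `indicator`, `update_second_threshold`, and the `"UNKNOWN"` marker are hypothetical.

```python
def indicator(source, result, expected, unknown="UNKNOWN", step=0.05):
    """Return a positive, zero, or negative adjustment (hypothetical helper)."""
    if source == "edge":
        # Edge returned "inference result unknown": positive indicator.
        # Edge returned an ordinary result value: zero indicator.
        return step if result == unknown else 0.0
    if source == "cloud":
        # Cloud result matches the prepared result value: positive indicator.
        # Cloud result differs from it: negative indicator.
        return step if result == expected else -step
    raise ValueError(f"unexpected source: {source!r}")


def update_second_threshold(t2, weight, source, result, expected):
    """Literal reading of the claim: weighted threshold plus indicator."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie between 0 and 1")
    return weight * t2 + indicator(source, result, expected)


# Example: the cloud node answered incorrectly, so the threshold is lowered.
lowered = update_second_threshold(1.0, 1.0, "cloud", "dog", "cat")
```

Note that under this literal reading a weight below 1 shrinks the threshold even when the indicator is 0, so the "maintained" case holds exactly only for a weight of 1; how the weight and indicator are actually balanced is not specified in the claims.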

A deep neural network-based inference service providing system comprising:
at least one edge node that, in response to an inference request for input data to the deep neural network, performs an inference operation up to a predefined layer, calculates an entropy value based on a probability vector corresponding to each result value of the input data using a result of the inference operation (hereinafter, a first inference result), and provides, as a result value, a result value corresponding to the first inference result or an inference result unknown;
a cloud node that receives the first inference result from the edge node based on a result of comparing the entropy value with a plurality of preset threshold values at the edge node, performs an inference operation, and provides a final second inference result as a result value; and
a test node that updates the threshold value based on each result value at the edge node and the cloud node for input data for training that includes a corresponding result value (hereinafter, training data), and transmits the updated threshold value to the edge node,
Deep neural network-based inference service providing system.

A deep neural network-based inference service providing system comprising:
at least one edge node that, in response to an inference request for input data to the deep neural network, performs an inference operation up to a predefined layer, calculates an entropy value based on a probability vector corresponding to each result value of the input data using a result of the inference operation (hereinafter, a first inference result), and provides, as a result value, a result value corresponding to the first inference result or an inference result unknown; and
a cloud node that receives the first inference result from the edge node based on a result of comparing the entropy value with a plurality of preset threshold values at the edge node, performs an inference operation, and provides a final second inference result as a result value,
Deep neural network-based inference service providing system.

15. The system of claim 14,
wherein the plurality of preset threshold values are a first threshold value and a second threshold value set to be greater than the first threshold value,
when the entropy value is less than the first threshold value, the edge node provides, as the result value corresponding to the first inference result, the result value having the largest probability in the probability vector corresponding to each result value of the input data,
when the entropy value is greater than or equal to the first threshold value and less than the second threshold value, the edge node transmits the first inference result to the cloud node, and
when the entropy value exceeds the second threshold value, the edge node provides the inference result unknown as a result value,
Deep neural network-based inference service providing system.
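The three-way decision at the edge node described above can be sketched as follows. The names `Route` and `route` are hypothetical. The claim covers entropy values strictly greater than the second threshold value and does not state what happens when the entropy value equals it; treating equality as "unknown" is an assumption of this sketch.

```python
from enum import Enum


class Route(Enum):
    EDGE_RESULT = "edge_result"      # answer directly at the edge node
    SEND_TO_CLOUD = "send_to_cloud"  # forward the first inference result
    UNKNOWN = "unknown"              # report "inference result unknown"


def route(entropy_value, probs, t1, t2):
    """Choose how the edge node handles one input, given two thresholds."""
    if not t1 < t2:
        raise ValueError("second threshold must be greater than the first")
    if entropy_value < t1:
        # Confident: return the index of the most probable result value.
        best = max(range(len(probs)), key=probs.__getitem__)
        return Route.EDGE_RESULT, best
    if entropy_value < t2:
        # Moderately uncertain: let the cloud node finish the inference.
        return Route.SEND_TO_CLOUD, None
    # Assumption: entropy equal to t2 is also treated as unknown here.
    return Route.UNKNOWN, None


decision, label = route(0.1, [0.9, 0.05, 0.05], t1=0.3, t2=1.0)
```

With this arrangement, raising the second threshold value sends more borderline inputs to the cloud node, while lowering it causes more inputs to be reported as unknown without using cloud resources.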

16. The system of claim 15,
further comprising a test node that updates the second threshold value based on each result value at the edge node and the cloud node for input data for training that includes a corresponding result value (hereinafter, training data), and transmits the updated second threshold value to the edge node,
wherein the edge node receives and applies the updated second threshold value,
Deep neural network-based inference service providing system.

17. The system of claim 16,
wherein the test node transmits the training data to the edge node,
the edge node performs an inference operation up to a predefined layer in response to an inference request for the training data, calculates an entropy value based on a probability vector corresponding to each result value of the training data using a result of the inference operation (hereinafter, a third inference result), and either provides, as a result value, a result value corresponding to the third inference result or an inference result unknown, or transmits the third inference result to the cloud node, and
the cloud node receives the third inference result transmitted from the edge node, performs an inference operation, and provides a final fourth inference result as a result value,
Deep neural network-based inference service providing system.

18. The system of claim 17,
wherein the test node:
when the result value received from the edge node is a result value corresponding to the third inference result, updates the second threshold value so that the previous second threshold value is maintained; and
when the result value received from the edge node is the inference result unknown, updates the second threshold value so that the previous second threshold value is increased,
Deep neural network-based inference service providing system.

18. The system of claim 17,
wherein the test node:
when the result value received from the cloud node matches the result value prepared in advance, updates the second threshold value so that the previous second threshold value is increased; and
when the result value received from the cloud node differs from the result value prepared in advance, updates the second threshold value so that the previous second threshold value is decreased,
Deep neural network-based inference service providing system.

18. The system of claim 17,
wherein the test node assigns a preset weight between 0 and 1 to the second threshold value, checks whether a result value was received from the edge node or from the cloud node, and updates the second threshold value by adding, to the weighted second threshold value, an indicator determined to be a positive value, 0, or a negative value,
Deep neural network-based inference service providing system.