KR20220049855A

KR20220049855A - Apparatus and method for controlling distributed neural network

Info

Publication number: KR20220049855A
Application number: KR1020200133486A
Authority: KR
Inventors: 박종성; 정종훈; 양회석
Original assignee: 국방과학연구소; 아주대학교산학협력단
Priority date: 2020-10-15
Filing date: 2020-10-15
Publication date: 2022-04-22

Abstract

Provided is a method for controlling a distributed neural network, comprising: when a plurality of groups including a plurality of filters is received, classifying the plurality of filters into a number of parts smaller than the number of student networks; transmitting information about the same part to two or more student networks so that the two or more student networks are trained using one part; receiving results of the two or more student networks trained through the same part; and generating a final feature based on an average value of the received results of the two or more student networks. The method can thereby maintain a constant inference accuracy.

Description

Apparatus and method for controlling distributed neural network

The present disclosure relates to a distributed neural network and, more particularly, to a method and apparatus for controlling a distributed neural network that uses devices having a low data rate.

With the recent spread of Internet of Things (IoT) devices, a variety of wearable devices such as smart watches and smart goggles are being developed. Because of their low-power requirements, these wearable devices communicate at low data rates, for example through human body communication, which uses the human body as a medium, or through Bluetooth communication.

Meanwhile, as artificial intelligence technology advances, there is a growing need to run neural network-based applications, such as image classification and speech recognition, even on devices with low data rates such as the wearable devices described above.

To run neural network-based applications on devices with low data rates, techniques that run the neural network by offloading have been developed. In this approach, data is transmitted from the device to a cloud system that uses hardware such as high-performance CPUs, GPUs, and accelerators, the neural network is executed remotely, and the processing result is then transmitted from the cloud system back to the device.

However, with offloading, the inference speed of the neural network may drop depending on the state of the remote server and of the network connection, which can impair real-time performance. In addition, as the number of devices increases, communication becomes more frequent, which may cause additional delay and increase the power consumption of the devices. Accordingly, there is a need for a method of mounting and running a neural network on devices having a low data rate.

An object of the present embodiments is to provide a method and apparatus for controlling a distributed neural network in which student networks are mounted on devices having a low data rate.

The technical problems to be solved by the present embodiments are not limited to those described above, and other technical problems may be inferred from the following embodiments.

According to a first embodiment, a method for controlling a distributed neural network may include: when a plurality of groups including a plurality of filters is received, classifying the plurality of filters into a number of parts smaller than the number of student networks; transmitting information about the same part to two or more student networks in order to train the two or more student networks using one part; receiving results of the two or more student networks trained through the same part; and generating a final feature based on an average value of the received results of the two or more student networks.

According to a second embodiment, an apparatus for controlling a distributed neural network may include a memory that stores at least one instruction, and a processor that executes the at least one instruction to: when a plurality of groups including a plurality of filters is received, classify the plurality of filters into a number of parts smaller than the number of student networks; transmit information about the same part to two or more student networks in order to train the two or more student networks using one part; receive results of the two or more student networks trained through the same part; and generate a final feature based on an average value of the received results of the two or more student networks.

According to a third embodiment, a computer-readable recording medium is a non-transitory computer-readable recording medium on which a program for executing a method for controlling a distributed neural network on a computer is recorded, wherein the method may include: when a plurality of groups including a plurality of filters is received, classifying the plurality of filters into a number of parts smaller than the number of student networks; transmitting information about the same part to two or more student networks in order to train the two or more student networks using one part; receiving results of the two or more student networks trained through the same part; and generating a final feature based on an average value of the received results of the two or more student networks.

Specific details of other embodiments are included in the detailed description and the drawings.

The method and apparatus for controlling a distributed neural network according to the present disclosure can maintain a constant inference accuracy even when the communication of some of the devices on which student networks are mounted is unstable.

The effects of the invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the claims.

FIG. 1 is a diagram for explaining an example of a distributed neural network structure.
FIGS. 2A, 2B, and 3 are diagrams for explaining an example of a Network of Neural Networks (NONN)-based distributed neural network structure.
FIG. 4 is a diagram for explaining a distributed neural network structure according to an embodiment.
FIG. 5 is a flowchart for explaining a distributed neural network control method according to an embodiment.
FIGS. 6A and 6B are diagrams for explaining a distributed neural network control method according to another embodiment.
FIGS. 7A, 7B, 8A, and 8B are diagrams for explaining the performance of a distributed neural network according to an embodiment.
FIG. 9 is a block diagram for explaining a distributed neural network control apparatus according to an embodiment.

The terms used in the embodiments are, as far as possible, general terms that are currently in wide use, selected in consideration of their functions in the present disclosure, but they may vary depending on the intention of those skilled in the art, precedent, the emergence of new technology, and the like. In certain cases, terms have been arbitrarily selected by the applicant, and their meaning is described in detail in the corresponding part of the description. Therefore, the terms used in the present disclosure should be defined based on their meaning and on the overall content of the present disclosure, not simply on their names.

Throughout the specification, when a part is said to "include" a certain element, this means that it may further include other elements, not that it excludes them, unless specifically stated otherwise.

Throughout the specification, the expression "at least one of a, b, and c" may cover 'a alone', 'b alone', 'c alone', 'a and b', 'a and c', 'b and c', or 'all of a, b, and c'.

Embodiments of the present disclosure are described in detail below with reference to the accompanying drawings so that those of ordinary skill in the art to which the present disclosure pertains can readily practice them. However, the present disclosure may be implemented in many different forms and is not limited to the embodiments described herein.

Hereinafter, embodiments of the present disclosure are described in detail with reference to the drawings.

FIG. 1 is a diagram for explaining an example of a distributed neural network structure.

Referring to FIG. 1, the neural network 100 may extract knowledge through filters and generate a single student network 120 that has learned the final result of a teacher network 110, which has large memory and computation requirements. Here, the teacher network 110 may be a relatively high-performance network, and the student network 120 may be referred to as a sub-network that learns using part of the knowledge of the teacher network 110.

However, the student network 120 of FIG. 1 still requires a large amount of computation and memory, so it may be difficult to run on a single device with a low data rate. The student network 120 may therefore be distributed layer by layer across several devices 131 to 133. In this case, the second device 132 may learn using part of the knowledge of the student network 120 and the inference result of the first device 131. Likewise, the third device 133 may learn based on part of the knowledge of the student network 120 and the inference result of the second device 132, and may generate the final feature map.

That is, in the distributed neural network 100 of FIG. 1, data must be exchanged among all of the devices 131 to 133 to obtain a single inference result, so the required communication volume and communication frequency between devices may be high. In addition, if communication with any one of the first device 131 to the third device 133 is lost, it may be impossible to derive an inference result. Accordingly, it may be difficult to mount the distributed neural network 100 of FIG. 1 on devices having a low data rate, such as wearable devices.

FIGS. 2A, 2B, and 3 are diagrams for explaining an example of a NONN-based distributed neural network structure.

Another example of a distributed neural network is the Network of Neural Networks (NONN)-based distributed neural network. Referring to FIG. 2A, the NONN-based distributed neural network 200 may include a teacher network 210 and a plurality of student networks 221 to 223. The student networks 221 to 223 may each be trained using independent filters included in the teacher network 210, and each of the student networks 221 to 223 may be mounted on a separate device 231 to 233. The final inference result of the distributed neural network 200 may be generated based on the respective outputs of the separate devices 231 to 233.

Meanwhile, the way knowledge is transferred from the teacher network 210 to the student networks 221 to 223 can be described with reference to FIG. 2B. First, the filters 250 included in the final convolutional layer 240 of the teacher network 210, which has large memory usage and computation, may be classified into a plurality of groups. The filters 250 in the plurality of groups may then be classified again into a plurality of parts. The NONN-based distributed neural network 200 may pass the filters 250 of each part to a respective one of a plurality of student networks (a first student network and a second student network). Here, the student networks may all be independent of one another. The NONN-based distributed neural network 200 may then derive a result by concatenating the final results of the student networks. Although FIG. 2B shows the plurality of groups and the plurality of parts being generated in the teacher network 210, it will be apparent to those skilled in the art that the place where they are generated may vary.

When device communication is stable, the NONN-based distributed neural network 200 can provide high inference accuracy because it uses all of the knowledge of the teacher network 210. In addition, unlike the distributed neural network 100 of FIG. 1, it can still derive a result when the communication of a device hosting a student network is unstable. However, when the result learned by a device whose communication has been lost cannot be received, the NONN-based distributed neural network 200 uses only part of the knowledge of the teacher network 210, so the inference accuracy may drop significantly.

Meanwhile, to improve inference accuracy, the NONN-based distributed neural network 200 may classify the filters 250 included in the final convolutional layer 240 into a plurality of groups based on the Activation Hub (AH) rule.

According to an embodiment, the NONN-based distributed neural network 200 may classify the filters into a plurality of groups according to the AH rule so that the knowledge extracted through the filters 250 in the final convolutional layer 240 of the teacher network 210 is not biased toward a specific group. The AH rule is a rule for classifying filters 250 that have high importance for the same type of input. When the average output of the i-th filter in the final convolutional layer 240 of the teacher network 210 is a_i and the average output of the j-th filter is a_j, the AH rule may be expressed as Equation 1.

[Equation 1]

To classify the filters 250 of the final convolutional layer 240 into groups, a graph data structure may be created in which each filter 250 is a vertex and the edges between filters 250 are weighted by their AH values. The filters 250 may then be classified into a plurality of groups based on the AH values and the partitioning algorithm disclosed in Newman, Mark E. J., "Modularity and community structure in networks," Proceedings of the National Academy of Sciences 103.23 (2006): 8577-8582, and the resulting grouping may be referred to as a community structure.

When the filters 250 are classified into a plurality of groups based on the AH rule, filters that produce large outputs for the same input have a small AH value between them and may therefore be classified into different groups. For example, when several filters can each detect a dog among the objects in an image, these filters may be placed in different groups. This process can also be viewed as spreading the highly important filters across groups.

Meanwhile, since the number of filters in each group produced by the AH rule may differ, the NONN-based distributed neural network 200 may, as shown in FIG. 2B, classify the filters in the plurality of groups (first to fourth groups) again into parts (a first part and a second part). The filters may be classified so that the number of filters in each part is uniform. The NONN-based distributed neural network 200 may then train a student network using each part.

Meanwhile, in a NONN-based distributed neural network, when a device hosting a student network fails to communicate, the result of the student network trained through the corresponding subset of the teacher network's filters cannot be used, so the inference accuracy may drop.

Meanwhile, FIG. 3 is a diagram for explaining a NONN-based distributed neural network that trains four student networks.

If the filters in the final convolutional layer of the teacher network are classified into first to fifth groups as in FIG. 3 and student networks are to be mounted on four devices, the NONN-based distributed neural network may classify the filters into the same number of parts as there are student networks. In FIG. 3, the number of filters in each group, part, and student network is shown in parentheses.

Meanwhile, to make the number of filters in each part as uniform as possible, the filters in the fourth group (7 filters) and the filters in the fifth group (5 filters) may be classified into the fourth part.
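The balancing of whole groups across parts can be sketched as a greedy assignment. This is a minimal sketch and not the procedure of the disclosure: the function name `balance_groups_into_parts`, the largest-first heuristic, and the sizes of groups 1 to 3 are assumptions made for illustration. Only the sizes of the fourth group (7) and the fifth group (5) come from the text.

```python
def balance_groups_into_parts(group_sizes, num_parts):
    """Greedily assign whole groups to parts so filter counts stay close.

    group_sizes: dict mapping a group name to its number of filters.
    Returns a list of (group_names, total_filters) tuples, one per part.
    """
    parts = [[] for _ in range(num_parts)]
    loads = [0] * num_parts
    # Place the largest groups first; sorted() is stable, so ties keep input order.
    for name, size in sorted(group_sizes.items(), key=lambda kv: -kv[1]):
        target = loads.index(min(loads))  # least-loaded part, lowest index on ties
        parts[target].append(name)
        loads[target] += size
    return list(zip(parts, loads))


# Hypothetical sizes for groups 1-3; groups 4 and 5 use the counts from the text.
groups = {"group1": 10, "group2": 10, "group3": 10, "group4": 7, "group5": 5}
four_parts = balance_groups_into_parts(groups, 4)
two_parts = balance_groups_into_parts(groups, 2)
```

With these assumed sizes, the fourth and fifth groups end up together in the fourth part, as in the FIG. 3 example. Asking for two parts instead places groups 1 and 3 in one part and groups 2, 4, and 5 in the other.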

If, in the NONN-based distributed neural network configured as in FIG. 3, a communication error occurs in the device on which the first student network is mounted, the result learned based on the first part cannot be used, so elements 0 to 9 (310) of the final feature map take the value 0. Inference accuracy may therefore drop.
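The effect of a lost device on the baseline NONN feature map can be illustrated with a short sketch. The function name `concat_with_zero_fill`, the part widths other than the first, and the output values are hypothetical. The zero fill mirrors the statement that elements 0 to 9 become 0 when the device of the first student network fails.

```python
def concat_with_zero_fill(outputs, widths):
    """Concatenate per-student outputs; a missing output (None) becomes zeros.

    outputs: list with one feature list per student network, or None if the
             device hosting that student failed to communicate.
    widths:  number of feature-map elements each student contributes.
    """
    feature_map = []
    for out, width in zip(outputs, widths):
        feature_map.extend(out if out is not None else [0.0] * width)
    return feature_map


# Hypothetical widths: the first part contributes elements 0-9 as in the text.
widths = [10, 10, 10, 12]
outputs = [None, [0.5] * 10, [0.2] * 10, [0.9] * 12]  # student 1 is unreachable
final_map = concat_with_zero_fill(outputs, widths)
```

Because each part is held by exactly one student network in this baseline, a single failure blanks out a whole slice of the feature map.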

Meanwhile, the communication state of a wearable device may be unstable because the human body moves frequently and the device relies on wireless communication or human body communication. Therefore, for student networks to be mounted on devices with low data rates, including wearable devices, the distributed neural network needs to be controlled so that it is robust to communication errors.

FIG. 4 is a diagram for explaining a distributed neural network structure according to an embodiment.

The distributed neural network according to an embodiment may be based on NONN. Accordingly, the process of classifying the filters in the final convolutional layer of the teacher network into first to fifth groups may correspond to the classification process of FIGS. 2B and 3.

Meanwhile, the distributed neural network according to an embodiment may classify the filters into a number of parts smaller than the number of student networks. For example, when there are four devices and four student networks to be mounted on them, the number of parts may be two, which is smaller than four. The filters in the first to fifth groups may be divided between the first part and the second part so that the number of filters in each is as uniform as possible. Accordingly, the filters in the first group and the third group may be classified into the first part, and the filters in the second group, the fourth group, and the fifth group may be classified into the second part. When the number of parts is smaller than the number of devices in this way, each part is used to train multiple student networks, so even if any one device loses communication, the final feature map can still be generated from learning results that cover the filters of every part.

Referring to FIG. 4, the student networks that learn using the first part may be the first student network and the third student network, and the student networks that learn using the second part may be the second student network and the fourth student network.

The distributed neural network control method may then calculate the average of the results of the student networks trained using the same part. Referring to FIG. 4, a first part average, which is the average of the results of the first student network and the third student network, may be calculated, and a second part average, which is the average of the results of the second student network and the fourth student network, may be calculated.

Finally, the distributed neural network control method may generate the final feature map by concatenating the first part average and the second part average. In this case, even if a communication error occurs in the device on which the first student network is mounted, the final feature map can still reflect the learning result obtained with the filters of the first part through the third student network, so the change in the final feature map caused by the communication error may be small.
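The averaging and concatenation steps can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `average_and_concat`, the numeric outputs, and the choice to average over whichever results actually arrive (raising an error if a part has none) are not taken from the disclosure.

```python
def average_and_concat(results_by_part):
    """Average the outputs received for each part, then concatenate the averages.

    results_by_part: list (one entry per part) of lists of student outputs;
                     outputs from unreachable devices are simply absent.
    """
    feature_map = []
    for part_results in results_by_part:
        if not part_results:
            raise ValueError("no student output received for a part")
        count = len(part_results)
        # Element-wise mean over the student networks trained on this part.
        feature_map.extend(sum(vals) / count for vals in zip(*part_results))
    return feature_map


# Hypothetical outputs: students 1 and 3 share part 1, students 2 and 4 share part 2.
s1, s3 = [1.0, 2.0], [3.0, 4.0]
s2, s4 = [10.0, 20.0], [30.0, 40.0]
all_ok = average_and_concat([[s1, s3], [s2, s4]])
s1_lost = average_and_concat([[s3], [s2, s4]])
```

When the first student network is unreachable, the first slice of the feature map is taken from the third student network alone instead of collapsing to zeros, which is the robustness property the text describes.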

FIG. 5 is a flowchart for explaining a distributed neural network control method according to an embodiment.

In step S510, when a plurality of groups including a plurality of filters is received, the distributed neural network control method may classify the plurality of filters into a number of parts smaller than the number of student networks. Here, the number of parts may be determined based on the value obtained by dividing the number of student networks by 2. According to an embodiment, the method may determine the number of parts based on Equation 2.

[Equation 2]

$$N \le \left\lfloor \frac{M}{2} \right\rfloor$$

Here, N denotes the number of parts, M denotes the number of student networks, and $\lfloor \cdot \rfloor$ denotes rounding down (the floor). For example, when student networks are to be mounted on 5 devices, the number of student networks (M) is 5, and the number of parts (N) may be at most 2, which is 2.5 rounded down. In other words, with 5 student networks, the number of parts may be 1 or 2.
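The bound on the number of parts, namely that it is at most half the number of student networks rounded down, can be checked with a few lines of code. The function name `max_parts` and the error raised for fewer than two student networks are assumptions for illustration.

```python
def max_parts(num_students):
    """Upper bound on the number of parts: floor(M / 2)."""
    if num_students < 2:
        raise ValueError("at least two student networks are needed to share a part")
    return num_students // 2


# With 5 student networks, the admissible part counts are 1 and 2.
allowed = list(range(1, max_parts(5) + 1))
```

Keeping the part count at or below this bound means every part can be assigned to at least two student networks.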

Meanwhile, when the number of student networks trained through each part differs, that number may be determined based on the number of filters in each part. According to an embodiment, in a method for controlling a distributed neural network including N parts and M devices, the number S_i of student networks trained using the part containing the i-th largest number of filters may be obtained through Equation 3.

[Equation 3]

For example, suppose the number of student networks (M) is 5 and the number of parts (N) is 2, with 30 filters in the first part and 29 filters in the second part. For i = 1, the part with the most filters is the first part, so the number of student networks trained using the first part (S_1) may be 5/2 rounded up, that is, 3. For i = 2, the part with the second most filters is the second part, so the number of student networks trained using the second part (S_2) may be 5/2 rounded down, that is, 2. Accordingly, three student networks may learn through the first part and two student networks may learn through the second part.
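The worked example above can be reproduced with a small allocation sketch. The exact form of Equation 3 is not recoverable from this text, so the rule below is an assumption that merely agrees with the example: every part receives floor(M / N) student networks, and the M mod N parts with the most filters receive one more. The function name `students_per_part` is hypothetical.

```python
def students_per_part(num_students, part_filter_counts):
    """Return {part_name: number of student networks}, favoring larger parts.

    Assumed rule: every part gets floor(M / N) students, and the M mod N parts
    with the most filters get one extra, so the counts sum to M.
    """
    num_parts = len(part_filter_counts)
    base, extra = divmod(num_students, num_parts)
    # Rank parts from most to fewest filters (rank 0 corresponds to i = 1).
    ranked = sorted(part_filter_counts, key=part_filter_counts.get, reverse=True)
    return {name: base + (1 if rank < extra else 0) for rank, name in enumerate(ranked)}


# Values from the text: M = 5, first part has 30 filters, second part has 29.
allocation = students_per_part(5, {"part1": 30, "part2": 29})
```

For M = 5 and N = 2 this gives three student networks for the first part and two for the second, matching the round-up and round-down values in the example.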

In addition, step S510 may classify the filters such that the number of filters classified into each part is uniform.

In step S520, the distributed neural network control method may transmit information about the same part to two or more student networks in order to train the two or more student networks using one part.

In step S530, the distributed neural network control method may receive the results of the two or more student networks trained on the same part.

In step S540, the distributed neural network control method may generate a final feature based on an average value of the received results of the two or more student networks. The method of controlling a distributed network according to an embodiment may generate a final feature map by concatenating the average values of the student network results.
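As a rough illustration of step S540, the sketch below averages the results received for each part and concatenates the per-part averages. It assumes each student network returns a flat list of floats rather than a full feature map, and the name `final_feature` and the dictionary layout are hypothetical choices made for this example.

```python
from typing import Dict, List


def final_feature(results_by_part: Dict[int, List[List[float]]]) -> List[float]:
    """Average the student outputs of each part, then concatenate the averages.

    results_by_part maps a part index to the outputs received from the
    student networks trained on that part (assumed to have equal length).
    """
    feature: List[float] = []
    for part in sorted(results_by_part):
        outputs = results_by_part[part]
        if not outputs:
            raise ValueError(f"no result received for part {part}")
        # Element-wise mean over the student networks of this part.
        mean = [sum(column) / len(outputs) for column in zip(*outputs)]
        feature.extend(mean)
    return feature


received = {
    1: [[1.0, 3.0], [3.0, 5.0]],  # two student networks trained on part 1
    2: [[2.0], [4.0]],            # two student networks trained on part 2
}
print(final_feature(received))  # [2.0, 4.0, 3.0]
```

Because every part is covered by at least two student networks, the average for a part can still be formed from the remaining results if one of them does not arrive.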

Therefore, by reinforcing redundancy between parts, the method of controlling a distributed network of the present disclosure can maintain inference accuracy even if the results of some student networks are not delivered because of a communication failure of a specific device.

FIGS. 6A and 6B are diagrams for explaining a distributed neural network control method according to another embodiment.

A method of controlling a distributed neural network according to another embodiment may determine the student network to be mounted on each device based on information about the communication stability of each device. In this case, the method may determine the device on which a student network is to be mounted based on information about the overall inference accuracy of the neural network when communication with each student network fails.

Referring to FIG. 6A, each of the first to fifth student networks may be mounted on one of the first to fifth devices. The first list 610 of FIG. 6A lists the student networks in descending order of inference accuracy when each is used on its own, and the second list 620 lists the devices in descending order of communication stability. In other words, the third student network provides higher inference accuracy than any other student network used on its own, and the first device may be the device that provides the most stable data transmission and reception. Meanwhile, the inference accuracy of a student network used on its own may be obtained from experimental results such as those of FIGS. 7B and 8B, and information about the communication stability of each device may be obtained from the device manufacturer or a testing institution, but is not limited thereto.

In the method of controlling a distributed neural network according to an embodiment, each student network may be mounted on a device according to the order of the first list 610 and the order of the second list 620. In this case, the student network that provides the highest inference accuracy is mounted on the device that provides the most stable data transmission and reception, so the degradation of inference accuracy caused by the communication state can be reduced.
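The rank matching described for lists 610 and 620 can be sketched as follows. The accuracy and stability scores, the identifiers, and the function name `assign_by_rank` are all hypothetical and serve only to illustrate pairing the two sorted lists position by position.

```python
from typing import Dict


def assign_by_rank(student_accuracy: Dict[str, float],
                   device_stability: Dict[str, float]) -> Dict[str, str]:
    """Pair the k-th most accurate student network with the k-th most stable device."""
    if len(student_accuracy) != len(device_stability):
        raise ValueError("this sketch assumes one student network per device")
    # Counterpart of the first list 610: student networks by standalone accuracy.
    students = sorted(student_accuracy, key=student_accuracy.get, reverse=True)
    # Counterpart of the second list 620: devices by communication stability.
    devices = sorted(device_stability, key=device_stability.get, reverse=True)
    return dict(zip(devices, students))


# Hypothetical scores, not taken from the document.
placement = assign_by_rank(
    {"student1": 0.80, "student2": 0.70, "student3": 0.90},
    {"device1": 0.99, "device2": 0.90, "device3": 0.95},
)
print(placement)  # {'device1': 'student3', 'device3': 'student1', 'device2': 'student2'}
```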

In addition, a method of controlling a distributed neural network according to another embodiment may determine the device on which a student network is to be mounted based on the part used to train that student network.

Referring to FIG. 6B, the third list 630 lists the student networks trained on the first part in descending order of inference accuracy when each is used on its own, and the fourth list 640 lists the student networks trained on the second part in the same manner. In this case, it may be determined that student networks trained on the first part and student networks trained on the second part are mounted alternately on the devices included in the second list 620. For example, it may be determined that the first student network, trained on the first part, is mounted on the first device, and that the second student network, trained on the second part, is mounted on the second device.

That is, in the method of controlling a distributed neural network according to an embodiment, when P_n,x denotes the x-th element of the list of student networks trained on the n-th part, sorted in descending order of inference accuracy when each is used on its own, the set of student networks (D_i) to be mounted on the device with the i-th highest communication stability may be determined based on Equation 4.

[Equation 4]

According to Equation 4, the set of student networks (D_1) to be mounted on the device with the highest communication stability may include the student network (P_1,1) with the highest inference accuracy among the networks trained on the first part. When the devices on which the student networks are mounted are determined on this basis, the probability that communication errors occur simultaneously in student networks trained on the same part is minimized, so the degradation of the inference accuracy of the distributed neural network can be reduced even in an environment with poor communication conditions.
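One way to picture this cross-placement is the sketch below. Equation 4 is not reproduced in this text, so the rule used here is an assumption: the per-part ranked lists are interleaved round-robin (P_1,1, P_2,1, P_1,2, P_2,2, ...) and the result is paired with the devices in descending order of stability, one student network per device. The names `interleave_parts` and `cross_place` are hypothetical.

```python
from itertools import zip_longest
from typing import Dict, List


def interleave_parts(ranked_by_part: List[List[str]]) -> List[str]:
    """Round-robin over the per-part lists, each sorted by standalone accuracy."""
    order: List[str] = []
    for tier in zip_longest(*ranked_by_part):
        # A tier holds the x-th best student network of every part (None if exhausted).
        order.extend(student for student in tier if student is not None)
    return order


def cross_place(ranked_by_part: List[List[str]],
                devices_by_stability: List[str]) -> Dict[str, str]:
    """Assumed placement: one student network per device, parts alternating."""
    order = interleave_parts(ranked_by_part)
    if len(order) != len(devices_by_stability):
        raise ValueError("this sketch assumes one student network per device")
    return dict(zip(devices_by_stability, order))


lists = [["P_1,1", "P_1,2", "P_1,3"], ["P_2,1", "P_2,2"]]
print(cross_place(lists, ["dev1", "dev2", "dev3", "dev4", "dev5"]))
# {'dev1': 'P_1,1', 'dev2': 'P_2,1', 'dev3': 'P_1,2', 'dev4': 'P_2,2', 'dev5': 'P_1,3'}
```

Under this assumed rule, devices that are adjacent in the stability ranking hold student networks trained on different parts, which matches the example of the first and second devices given for FIG. 6B.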

That is, even when communication errors occur in some devices and only the results of some student networks are used, inference with accuracy similar to that of the existing teacher network is possible. In addition, because the distributed neural network according to an embodiment transmits only the final result of each student network, it can be mounted on devices with a low data transmission rate, including wearable devices.

FIGS. 7A, 7B, 8A, and 8B are diagrams for explaining the performance of a distributed neural network according to an embodiment.

To verify the performance of the distributed neural network control method according to an embodiment, a conventional NONN-based distributed neural network and the distributed neural network according to an embodiment were each trained using eight student networks. Then, for each of the eight devices on which the eight student networks were mounted, the output value of a device was set to 0 when a communication error occurred on that device, and the inference accuracy was measured. The experiments assuming communication errors were carried out for every possible combination of failures among the eight student networks.
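The failure cases of this experiment can be enumerated as in the sketch below, which lists every set of devices that could fail together and zeroes the outputs of the failed devices. The accuracy measurement itself is omitted because it depends on the trained networks; the function names and the flat-list output format are assumptions made for illustration.

```python
from itertools import combinations
from typing import Iterable, List, Sequence, Tuple


def failure_patterns(num_devices: int, num_failed: int) -> Iterable[Tuple[int, ...]]:
    """Every set of device indices that could fail together."""
    return combinations(range(num_devices), num_failed)


def mask_outputs(outputs: Sequence[Sequence[float]],
                 failed: Iterable[int]) -> List[List[float]]:
    """Set the output of each failed device to zeros, as in the experiment."""
    failed_set = set(failed)
    return [[0.0] * len(out) if i in failed_set else list(out)
            for i, out in enumerate(outputs)]


# With 8 devices there are C(8, 3) = 56 ways for exactly three to fail.
print(len(list(failure_patterns(8, 3))))  # 56
print(mask_outputs([[1.0, 2.0], [3.0, 4.0]], failed=[1]))  # [[1.0, 2.0], [0.0, 0.0]]
```

Summed over 0 to 8 failed devices, this gives 2^8 = 256 patterns in total.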

FIG. 7A is a graph showing the inference accuracy of a conventional NONN-based distributed neural network on the CIFAR-10 dataset, and FIG. 7B is a graph showing the inference accuracy of the distributed neural network according to an embodiment on the CIFAR-10 dataset. The teacher network included in the distributed neural networks of FIGS. 7A and 7B is WRN40-4.

Referring to FIGS. 7A and 7B, when three devices fail to communicate on the CIFAR-10 dataset, the distributed neural network according to an embodiment improves the average inference accuracy by 1.164% and the lowest inference accuracy by 11.22% compared with the conventional NONN-based distributed neural network. This improvement becomes more pronounced as the number of devices that fail to communicate increases. According to FIGS. 7A and 7B, when seven devices fail to communicate, the distributed neural network according to an embodiment improves the average inference accuracy by 18.741% and the lowest inference accuracy by 54.22% compared with the conventional NONN-based distributed neural network.

Meanwhile, FIG. 8A is a graph showing the inference accuracy of a NONN-based distributed neural network on the CIFAR-100 dataset, and FIG. 8B is a graph showing the inference accuracy of the distributed neural network according to an embodiment on the CIFAR-100 dataset. The teacher network included in the distributed neural networks of FIGS. 8A and 8B is WRN28-10.

Referring to FIGS. 8A and 8B, when three devices fail to communicate on the CIFAR-100 dataset, the distributed neural network according to an embodiment improves the average inference accuracy by 15.072% compared with the conventional NONN-based distributed neural network. In addition, when seven devices fail to communicate, the distributed neural network according to an embodiment improves the average inference accuracy by 18.742% compared with the conventional NONN-based distributed neural network.

Referring to FIGS. 7A, 7B, 8A, and 8B, the distributed neural network according to an embodiment is more robust to device communication failures than the conventional NONN-based distributed neural network in every case, on both the CIFAR-10 and CIFAR-100 datasets.

FIG. 9 is a block diagram illustrating an apparatus for controlling a distributed neural network according to an embodiment.

According to an embodiment, the distributed neural network control apparatus 900 may include a memory 910 and a processor 920. Only the components related to the present embodiment are shown for the distributed neural network control apparatus 900 in FIG. 9. Therefore, those of ordinary skill in the art related to the present embodiment will understand that other general-purpose components may be included in addition to the components shown in FIG. 9.

The memory 910 is hardware that stores various types of data processed in the distributed neural network control apparatus 900. For example, the memory 910 may store data that has been processed and data that is to be processed by the distributed neural network control apparatus 900. The memory 910 may store at least one instruction for the operation of the processor 920. The memory 910 may also store programs or applications to be run by the distributed neural network control apparatus 900. The memory 910 may include random access memory (RAM) such as dynamic random access memory (DRAM) or static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), CD-ROM, Blu-ray or other optical disk storage, a hard disk drive (HDD), a solid state drive (SSD), or flash memory.

The processor 920 may control the overall operation of the distributed neural network control apparatus 900 and process data and signals. The processor 920 may control the distributed neural network control apparatus 900 as a whole by executing at least one instruction or at least one program stored in the memory 910. The processor 920 may be implemented as a central processing unit (CPU), a graphics processing unit (GPU), an application processor (AP), or the like, but is not limited thereto.

When a plurality of groups including a plurality of filters are received, the processor 920 may classify the plurality of filters into a number of parts smaller than the number of student networks, transmit information about the same part to two or more student networks in order to train the two or more student networks using one part, receive the results of the two or more student networks trained on the same part, and generate a final feature based on an average value of the received results of the two or more student networks.

In this case, the number of parts may be determined based on a value obtained by dividing the number of student networks by 2, and when the number of student networks trained on each part differs, the number of student networks trained on each part may be determined based on the number of filters included in that part.

Meanwhile, the processor 920 may determine the device on which a student network is to be mounted based on information about the communication stability of the devices on which student networks are to be mounted. In this case, the processor 920 may determine the device on which a student network is to be mounted based on information about the overall inference accuracy of the neural network when communication with each student network fails. The processor 920 may also determine the device on which a student network is to be mounted based on the part used to train that student network.

Meanwhile, when classifying the plurality of filters into parts, the processor 920 may classify them such that the number of filters in each part is uniform. In addition, the distributed neural network may be NONN-based.

The processor according to the above-described embodiments may include a processor, a memory that stores and executes program data, permanent storage such as a disk drive, a communication port for communicating with an external device, and user interface devices such as a touch panel, keys, and buttons. Methods implemented as software modules or algorithms may be stored on a computer-readable recording medium as computer-readable code or program instructions executable on the processor. Here, computer-readable recording media include magnetic storage media (e.g., read-only memory (ROM), random-access memory (RAM), floppy disks, and hard disks) and optically readable media (e.g., CD-ROMs and digital versatile discs (DVDs)). The computer-readable recording medium may be distributed among network-connected computer systems so that computer-readable code is stored and executed in a distributed manner. The medium may be readable by a computer, stored in a memory, and executed on a processor.

The present embodiment may be represented by functional block configurations and various processing steps. These functional blocks may be implemented by any number of hardware and/or software configurations that perform specific functions. For example, an embodiment may employ integrated circuit configurations, such as memory, processing, logic, and look-up tables, that can execute various functions under the control of one or more microprocessors or other control devices. Just as components may be implemented as software programming or software elements, the present embodiment may be implemented in a programming or scripting language such as C, C++, Java, or assembler, including various algorithms implemented as combinations of data structures, processes, routines, or other programming constructs. Functional aspects may be implemented as algorithms running on one or more processors. In addition, the present embodiment may employ conventional techniques for electronic environment setting, signal processing, and/or data processing. Terms such as “mechanism”, “element”, “means”, and “configuration” may be used broadly and are not limited to mechanical and physical configurations. These terms may include the meaning of a series of software routines in connection with a processor or the like.

The above-described embodiments are merely examples, and other embodiments may be implemented within the scope of the claims set forth below.

Claims

A method of controlling a distributed neural network, the method comprising:
when a plurality of groups including a plurality of filters are received, classifying the plurality of filters into a number of parts smaller than the number of student networks;
transmitting information about the same part to two or more student networks in order to train the two or more student networks using one part;
receiving results of the two or more student networks trained on the same part; and
generating a final feature based on an average value of the received results of the two or more student networks.

The method of claim 1,
wherein the number of parts is determined based on a value obtained by dividing the number of student networks by 2.

The method of claim 1,
wherein, when the number of student networks trained on each part differs,
the number of student networks trained on each part is determined based on the number of filters included in each part.

The method of claim 1,
further comprising determining a device on which a student network is to be mounted, based on information about the communication stability of the device.

The method of claim 4,
wherein the determining of the device on which the student network is to be mounted comprises
determining the device on which the student network is to be mounted based on information about the overall inference accuracy of the neural network when communication of each student network fails.

The method of claim 4,
wherein the determining of the device on which the student network is to be mounted
is based on the part used to train the student network.

The method of claim 1,
wherein the classifying of the plurality of filters comprises
classifying the filters such that the number of filters classified into each part is uniform.

The method of claim 1,
wherein the distributed neural network is based on a Network of Neural Networks (NONN).

An apparatus for controlling a distributed neural network, the apparatus comprising:
a memory storing at least one instruction; and
a processor configured to, by executing the at least one instruction,
when a plurality of groups including a plurality of filters are received, classify the plurality of filters into a number of parts smaller than the number of student networks,
transmit information about the same part to two or more student networks in order to train the two or more student networks using one part,
receive results of the two or more student networks trained on the same part, and
generate a final feature based on an average value of the received results of the two or more student networks.

A non-transitory computer-readable recording medium having recorded thereon a program for executing, on a computer, a method of controlling a distributed neural network,
the method comprising:
when a plurality of groups including a plurality of filters are received, classifying the plurality of filters into a number of parts smaller than the number of student networks;
transmitting information about the same part to two or more student networks in order to train the two or more student networks using one part;
receiving results of the two or more student networks trained on the same part; and
generating a final feature based on an average value of the received results of the two or more student networks.