KR102163498B1

KR102163498B1 - Apparatus and method for pruning-retraining of neural network

Info

Publication number: KR102163498B1
Application number: KR1020180168135A
Authority: KR
Inventors: 양회석; 장정규; 최규식
Original assignee: 아주대학교산학협력단
Priority date: 2018-12-24
Filing date: 2018-12-24
Publication date: 2020-10-08
Also published as: KR20200078865A

Abstract

The present application relates to a method for pruning and retraining a neural network. The method may include: (a) performing pruning on nodes in the neural network; and (b) restoring at least some of the pruned inter-node weights in the pruned neural network and retraining the restored inter-node weights.

Description

Neural network pruning-retraining device and method {APPARATUS AND METHOD FOR PRUNING-RETRAINING OF NEURAL NETWORK}

The present application relates to an apparatus and method for pruning and retraining a neural network. In particular, when retraining a pruned neural network, the apparatus and method do not retrain every inter-node weight by retraining the whole network; instead, they restore some of the pruned inter-node weights and retrain only those restored weights.

Pruning in a neural network refers to the process of deleting connections between neurons in the network. Because pruning removes redundant connections that contribute little to the network, it can generally reduce cost substantially.

Pruning evaluates the importance of each connection between neurons in a trained neural network, deletes connections starting from the least important, and then restores the accuracy of the initially trained network through a retraining process. The criterion for deleting connections varies by method; a common approach is to set the weight value of the connection to 0.

FIGS. 1 and 2 are diagrams illustrating Dense-Sparse-Dense (DSD), a training technique that uses pruning, disclosed in the prior-art document [Han, Song, et al. "DSD: regularizing deep neural networks with dense-sparse-dense training flow." arXiv preprint arXiv:1607.04381 3.6 (ICLR 2017)]. In particular, FIG. 2 shows the DSD technique expressed as an algorithm.

Referring to FIGS. 1 and 2, the DSD technique disclosed in the prior-art document can be broadly divided into an Initial Dense Phase, a Sparse Phase, and a Final Dense Phase.

In brief, the Initial Dense Phase is ordinary neural network training with a standard optimizer. The Sparse Phase masks every weight whose value is smaller than a threshold, setting it to 0. The Final Dense Phase is a retraining process for fine-tuning, carried out to raise the accuracy lost to pruning.
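The masking performed in the Sparse Phase can be illustrated with a minimal sketch. The function name, the sample weights, and the threshold value below are hypothetical, and comparing weight magnitudes (rather than signed values) against the threshold is an assumption made for this illustration; the sketch is not taken from the cited document.

```python
def sparse_phase_mask(weights, threshold):
    """Zero every weight whose magnitude is below `threshold`.

    Returns the masked weights and the boolean keep-mask.
    Assumption: the comparison uses the absolute value of each weight.
    """
    mask = [abs(w) >= threshold for w in weights]
    masked = [w if keep else 0.0 for w, keep in zip(weights, mask)]
    return masked, mask


# Hypothetical weights and threshold, for illustration only.
weights = [0.8, -0.05, 0.3, 0.01, -0.6]
masked, mask = sparse_phase_mask(weights, threshold=0.1)
print(masked)
print(mask)
```

In this toy case the two small-magnitude weights are masked to 0 and the other three are kept unchanged.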

Compared with the original training process, a neural network trained with DSD can derive a more robust solution to a given problem because unimportant connections in the network are deleted. In addition, because the network is optimized for the dataset in use, benefits such as higher inference accuracy and prevention of overfitting can be obtained.

In DSD training, however, the neural network is pruned and then the entire pruned network is retrained, so the values of all inter-node weights in the network are retrained. A neural network trained with DSD is therefore limited in how far its parameter volume (parameter size) can be reduced. As a result, it requires a large memory space, has a slow inference speed (long inference time), and demands high power.

The present application is intended to solve the above problems of the prior art, and aims to provide an apparatus and method for pruning and retraining a neural network that enable the use of an adaptive neural network that takes power, accuracy, speed, memory space, parameter volume, and the like into account.

The present application is also intended to solve the above problems of the prior art by providing an apparatus and method for pruning and retraining a neural network that allow a neural network to be used selectively according to system performance or required conditions.

However, the technical problems to be solved by the embodiments of the present application are not limited to those described above, and other technical problems may exist.

As a technical means for achieving the above technical problems, a method for pruning and retraining a neural network according to an embodiment of the present application may include: (a) performing pruning on nodes in the neural network; and (b) restoring at least some of the pruned inter-node weights in the pruned neural network and retraining the restored inter-node weights.

In addition, step (b) may additionally restore at least some of the inter-node weights that remain among the pruned inter-node weights after excluding the at least some inter-node weights already restored.

In addition, step (b) may be performed repeatedly.

In addition, the method for pruning and retraining a neural network according to an embodiment of the present application may further include (c) generating a multi-phase neural network in consideration of the retrained restored inter-node weights.

In addition, the multi-phase neural network may include a first layer generated to include the pruned inter-node weights in the neural network and a second layer generated to include the restored inter-node weights.

In addition, the second layer may include a plurality of sub-layers. One of the sub-layers is generated to include the at least some restored inter-node weights. When step (b) is performed repeatedly, each remaining sub-layer is additionally generated, each time step (b) is repeated, to include the inter-node weights additionally restored in that repetition.

In addition, the plurality of sub-layers may be generated by applying a sparse matrix format.

In addition, the neural network in step (a) may be a trained convolutional neural network (CNN).

Meanwhile, an apparatus for pruning and retraining a neural network according to an embodiment of the present application may include: a pruning unit that performs pruning on nodes in the neural network; and a recovery retraining unit that restores at least some of the pruned inter-node weights in the pruned neural network and retrains the restored inter-node weights.

In addition, the recovery retraining unit may additionally restore at least some of the inter-node weights that remain among the pruned inter-node weights after excluding the at least some inter-node weights already restored.

In addition, the recovery retraining unit may repeatedly perform the process of restoring inter-node weights and retraining the restored inter-node weights.

In addition, the apparatus for pruning and retraining a neural network according to an embodiment of the present application may further include a generation unit that generates a multi-phase neural network in consideration of the retrained restored inter-node weights.

The above means for solving the problems are merely exemplary and should not be construed as limiting the present application. In addition to the exemplary embodiments described above, additional embodiments may exist in the drawings and the detailed description of the invention.

According to the above means for solving the problems, providing an apparatus and method for pruning and retraining a neural network makes it possible to use an adaptive neural network that takes power, accuracy, speed, memory space, parameter volume, and the like into account.

According to the above means for solving the problems, providing an apparatus and method for pruning and retraining a neural network makes it possible to use a neural network selectively according to system performance or required conditions.

However, the effects obtainable in the present application are not limited to those described above, and other effects may exist.

FIGS. 1 and 2 are diagrams illustrating Dense-Sparse-Dense (DSD), a training technique that uses pruning, disclosed in the prior-art document [Han, Song, et al. "DSD: regularizing deep neural networks with dense-sparse-dense training flow." arXiv preprint arXiv:1607.04381 3.6 (ICLR 2017)].
FIG. 3 is a block diagram showing a schematic configuration of an apparatus for pruning and retraining a neural network according to an embodiment of the present application.
FIG. 4 is a diagram illustrating the process of pruning and retraining a neural network performed by the apparatus for pruning and retraining a neural network according to an embodiment of the present application.
FIG. 5 is a diagram illustrating a multi-phase neural network generated by the apparatus for pruning and retraining a neural network according to an embodiment of the present application.
FIG. 6 is a diagram showing an example of the storage scheme of a Compressed Sparse Column (CSC) matrix, one type of sparse matrix.
FIG. 7 is an operation flowchart of a method for pruning and retraining a neural network according to an embodiment of the present application.

Hereinafter, embodiments of the present application are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present application pertains can readily practice them. The present application may, however, be implemented in various different forms and is not limited to the embodiments described herein. In the drawings, parts unrelated to the description are omitted in order to describe the present application clearly, and similar parts are given similar reference numerals throughout the specification.

Throughout this specification, when a part is said to be "connected" to another part, this includes not only the case where it is "directly connected" but also the case where it is "electrically connected" or "indirectly connected" with another element interposed between them.

Throughout this specification, when a member is said to be positioned "on", "above", "at the top of", "under", "below", or "at the bottom of" another member, this includes not only the case where the member is in contact with the other member but also the case where yet another member exists between the two members.

Throughout this specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

FIG. 3 is a block diagram showing a schematic configuration of an apparatus 100 for pruning and retraining a neural network according to an embodiment of the present application. FIG. 4 is a diagram illustrating the process of pruning and retraining a neural network performed by the apparatus 100. The circles shown in FIG. 4 schematically represent, as an example, the inter-node weights in the neural network.

Hereinafter, for convenience of description, the apparatus 100 for pruning and retraining a neural network according to an embodiment of the present application is referred to as the present apparatus 100.

Referring to FIGS. 3 and 4, the present apparatus 100 may include a pruning unit 110, a recovery retraining unit 120, and a generation unit 130.

The pruning unit 110 may perform pruning on nodes in the neural network 10 (S2). Before performing the pruning (S2), the present apparatus 100 may prepare the neural network 10 to be pruned (S1).

Here, the neural network 10 may be a trained neural network. That is, the neural network 10 may be a trained neural network obtained by training an initial (original, existing) neural network. As an example, the neural network 10 considered in the present application may be a neural network model trained on a specific image dataset. Accordingly, the neural network 10 may also be referred to herein as a trained neural network model, a trained model, or the like.

In addition, the neural network 10 considered in the present application may be, as an example, a trained convolutional neural network (CNN).

A convolutional neural network (CNN) is one type of artificial neural network (deep learning network). It is specialized mainly for image-related input data and shows excellent performance in classification and segmentation. A convolutional neural network applies the filters contained in a convolutional layer to an input image and outputs a feature map through the convolution operation. Because the feature map stores the important feature information of the image in compressed form, a convolutional neural network built as a deep neural network is effective for processing images and videos, which involve large amounts of data.

Although a trained convolutional neural network (CNN) is given as an example of the neural network 10, the present application is not limited thereto. Various neural networks that are already known or will be developed in the future, such as a recurrent neural network (RNN), may be applied as the neural network 10, including both trained and untrained neural networks.

The pruning unit 110 may perform pruning (S2, Initial Pruning) on the neural network 10 prepared in step S1. As a result, the pruned neural network 11 may include inter-node weights 2 that were pruned and inter-node weights 3 that were not pruned.

When the pruning unit 110 prunes the neural network 10, some of the inter-node weights 1 in the neural network 10 (that is, of all inter-node weights in the network) may be pruned (removed) as the pruned inter-node weights 2.

The pruning unit 110 may perform pruning on nodes in the neural network 10 such that a preset ratio of the inter-node weights 1 in the neural network 10 is pruned.

Here, the preset ratio may be set, as an example, by user input.

As an example, the preset ratio may be any ratio between 70% and 90% of the inter-node weights 1 in the neural network 10. The present application is not limited thereto, however, and the ratio may be set to various values.

In addition, the pruning unit 110 may perform pruning in consideration of the absolute values of the inter-node weights 1 in the neural network 10. As a specific example, when the absolute values of the inter-node weights 1 are sorted (for example, in ascending or descending order), the pruning unit 110 may prune the inter-node weights that fall within the preset ratio from the top (or bottom) of the sorted list.
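The ratio-based pruning described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the pruning unit 110: the function name `prune_by_ratio`, the flat list of weights, and the choice to prune the smallest-magnitude weights first are all assumptions introduced here (the text allows either end of the sorted list).

```python
def prune_by_ratio(weights, ratio):
    """Set a preset ratio of the weights to 0, smallest magnitude first.

    Returns the pruned weights and the sorted indices of the pruned weights.
    Assumption: weights with the smallest absolute value are pruned.
    """
    n_prune = round(len(weights) * ratio)
    # Indices ordered by ascending absolute value of the weight.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    pruned = [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]
    return pruned, sorted(pruned_idx)


# Hypothetical trained weights; 70% is within the 70%-90% range given as an example.
weights = [0.9, -0.1, 0.4, 0.05, -0.7, 0.2, -0.3, 0.02, 0.6, -0.15]
pruned, pruned_idx = prune_by_ratio(weights, ratio=0.7)
print(pruned)
print(pruned_idx)
```

With ten weights and a ratio of 0.7, seven weights are zeroed and the three largest-magnitude weights survive as the unpruned weights.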

Although the pruning unit 110 is described here as pruning at a preset ratio (%), the present application is not limited thereto, and pruning may instead be performed in other units, such as a preset number (count) of weights.

Once the pruning unit 110 has pruned the nodes in the neural network 10, the recovery retraining unit 120 may restore at least some of the pruned inter-node weights 2 in the pruned neural network 11 and retrain the at least some restored inter-node weights (Weight restoring & Retraining).

As with the pruning method described above, the recovery retraining unit 120 may restore at least some of the pruned inter-node weights 2 in consideration of a preset ratio, the absolute values of the weights, and the like. As an example, when the absolute values of the pruned inter-node weights 2 are sorted (for example, in ascending or descending order), the recovery retraining unit 120 may restore the inter-node weights that fall within the preset ratio from the top (or bottom) of the sorted list.
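The selection of weights to restore can be sketched in the same style. This is an illustration only: the function name `select_for_restore`, the sample values, and the choice to restore the largest-magnitude pruned weights first (ranked by their trained values before pruning) are assumptions, since the text leaves the sort direction open.

```python
def select_for_restore(original, pruned_idx, ratio):
    """Pick a preset ratio of the pruned weights to restore.

    Returns (restored_idx, remaining_idx), both sorted.
    Assumption: pruned weights are ranked by the magnitude of their
    original trained value, largest first.
    """
    n_restore = round(len(pruned_idx) * ratio)
    order = sorted(pruned_idx, key=lambda i: abs(original[i]), reverse=True)
    return sorted(order[:n_restore]), sorted(order[n_restore:])


# Hypothetical trained weights and the indices that were pruned from them.
original = [0.9, -0.1, 0.4, 0.05, -0.7, 0.2, -0.3, 0.02, 0.6, -0.15]
pruned_idx = [1, 2, 3, 5, 6, 7, 9]
restored_idx, remaining_idx = select_for_restore(original, pruned_idx, ratio=0.3)
print(restored_idx)
print(remaining_idx)
```

The indices left in `remaining_idx` stay pruned and are the candidates for any later, additional restoration.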

In addition, the recovery retraining unit 120 may additionally restore at least some of the inter-node weights that remain among the pruned inter-node weights after excluding the at least some inter-node weights already restored.

The recovery retraining unit 120 may repeatedly perform this process of restoring inter-node weights and retraining the restored inter-node weights. A detailed description follows.

After the pruning unit 110 has pruned the nodes in the neural network 10 (S2), the recovery retraining unit 120 may restore at least some inter-node weights 4 of the pruned inter-node weights 2 in the pruned neural network 11 and retrain the restored inter-node weights 4 (S3). Herein, the process of restoring at least some of the pruned inter-node weights and retraining the restored weights may be referred to as a recovery retraining process. Accordingly, step S3 may also be referred to as the first recovery retraining process.

When the recovery retraining process is repeated after the first recovery retraining process (S3) (that is, when the recovery retraining process is repeated once, counting from S3), the recovery retraining unit 120 may additionally restore at least some inter-node weights 5 of the remaining inter-node weights 2', which are the pruned inter-node weights 2 excluding the inter-node weights 4 restored in the first recovery retraining process, and may then retrain the additionally restored inter-node weights 5 (S4). Step S4, being the first repetition of the first recovery retraining process (S3), may also be referred to as the second recovery retraining process (S4).

When the recovery retraining process is repeated again after the second recovery retraining process (S4) (that is, when the recovery retraining process is repeated twice, counting from S3), the recovery retraining unit 120 may additionally restore at least some inter-node weights 6 of the remaining inter-node weights 2'', which are the pruned inter-node weights 2 excluding the inter-node weights 4 restored in the first recovery retraining process and the inter-node weights 5 restored in the second recovery retraining process, and may then retrain the additionally restored inter-node weights 6 (S5). Step S5, being the result of two repetitions (S4, S5) of the first recovery retraining process (S3), may also be referred to as the third recovery retraining process (S5).

In this way, the recovery retraining unit 120 may repeatedly perform the recovery retraining process. Each time the process is repeated, the recovery retraining unit 120 may additionally restore at least some of the inter-node weights that remain among the pruned inter-node weights 2 after excluding the weights restored in the one or more previous recovery retraining processes, and may retrain only the additionally restored inter-node weights.

By contrast, conventional DSD training prunes the neural network and then retrains the entire pruned network, so the values of all inter-node weights in the network are retrained. In other words, when conventional DSD training performs retraining after weight restoring, it retrains every inter-node weight in the network. With the recovery retraining unit 120 of the present apparatus 100, when pruning-restoring training is performed (in other words, when the pruned neural network is retrained), the present application does not retrain all inter-node weights in the pruned network as in the prior art; instead, retraining is limited to the inter-node weights that were restored from among the pruned inter-node weights.
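The repeated restore-and-retrain procedure described above can be illustrated with a toy sketch in which only the weights restored in the current phase are updated and all other weights stay frozen. Everything here is an assumption made for illustration: the single linear unit, the squared-error SGD update, the function names, the sample data, and the choice to restore each weight to its trained value before retraining. It is not the training procedure of the present apparatus 100.

```python
def predict(w, x):
    """Output of a single linear unit (toy stand-in for a network)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def retrain_only(w, trainable_idx, samples, lr=0.1, epochs=20):
    """SGD on squared error that updates only the weights in `trainable_idx`."""
    w = list(w)
    for _ in range(epochs):
        for x, y in samples:
            err = predict(w, x) - y
            for i in trainable_idx:
                w[i] -= lr * err * x[i]  # all other weights stay frozen
    return w


def restore_and_retrain(original, pruned_w, restore_groups, samples):
    """Run one recovery retraining phase per group; return weights after each phase."""
    w = list(pruned_w)
    snapshots = []
    for group in restore_groups:
        for i in group:
            w[i] = original[i]  # assumption: restore to the trained value
        w = retrain_only(w, group, samples)
        snapshots.append(list(w))
    return snapshots


# Hypothetical data: weights 1, 2 and 3 were pruned; weight 2 is restored in the
# first phase (cf. S3) and weight 1 in the second phase (cf. S4).
original = [0.9, -0.1, 0.4, 0.05]
pruned_w = [0.9, 0.0, 0.0, 0.0]
restore_groups = [[2], [1]]
samples = [([1.0, 0.5, 1.0, 0.0], 1.5), ([0.5, 1.0, 0.0, 1.0], 0.2)]
snapshots = restore_and_retrain(original, pruned_w, restore_groups, samples)
for phase, w in enumerate(snapshots, start=1):
    print(phase, w)
```

In this sketch the unpruned weight and the weight retrained in the first phase are left untouched by the second phase, which mirrors the statement that retraining is limited to the additionally restored weights.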

The generation unit 130 may generate a multi-phase neural network (Multi-phase Network) 20 in consideration of the restored inter-node weights retrained by the recovery retraining unit 120. The multi-phase neural network 20 can be understood more easily with reference to FIG. 5.

FIG. 5 is a diagram illustrating the multi-phase neural network 20 generated by the apparatus 100 for pruning and retraining a neural network according to an embodiment of the present application.

Referring to FIG. 5, the multi-phase neural network 20 may include a first layer 21 generated to include the pruned inter-node weights 2 in the neural network 10, and a second layer 22 generated to include the inter-node weights restored through the recovery retraining process performed by the recovery retraining unit 120.

Here, the first layer 21 may refer to a layer generated to include the pruned inter-node weights 2 in the neural network 10 with their values set to 0. In other words, the first layer 21 may refer to a layer generated to include the unpruned inter-node weights 3 in the pruned neural network 11 once pruning of the neural network 10 has been performed (S2). That is, the first layer 21 corresponds, as an example, to step S2 in FIG. 4 and includes the unpruned inter-node weights 3.

제2 레이어(22)는 복수의 서브 레이어(22a, 22b, 22c, …)를 포함할 수 있다.The second layer 22 may include a plurality of sub-layers 22a, 22b, 22c, ...).

복수의 서브 레이어(22a, 22b, 22c, …) 중 어느 하나의 서브 레이어(22a)는 회복 재훈련부(120)에 의한 1차 회복 재훈련 과정(즉, 단계S3)에서 재훈련된 회복된 적어도 일부의 노드간 가중치(4)를 포함하도록 생성되는 레이어(서브 레이어)일 수 있다. Any one of the plurality of sub-layers 22a, 22b, 22c, ...) is at least recovered retrained in the primary recovery retraining process (ie, step S3) by the recovery retraining unit 120 It may be a layer (sub-layer) generated to include some of the weights 4 between nodes.

Here, this sub-layer 22a may also be referred to as the first sub-layer 22a. Corresponding by way of example to step S3 in FIG. 4, it may refer to the layer (sub-layer) generated to include the at least some inter-node weights 4 recovered and retrained in the first recovery retraining process.

In addition, when the recovery retraining process by the recovery retraining unit 120 is performed repeatedly, the remaining sub-layers (22b, 22c, …) other than the sub-layer 22a may be sub-layers that are additionally generated, one per repetition, to include the inter-node weights additionally recovered in that repetition.

The remaining sub-layers (22b, 22c, …) may include a second sub-layer 22b, a third sub-layer 22c, and so on.

Here, the second sub-layer 22b, corresponding by way of example to step S4 in FIG. 4, may refer to the layer (sub-layer) generated to include the at least some inter-node weights 5 recovered and retrained in the second recovery retraining process (that is, the first repetition of the recovery retraining process). Likewise, the third sub-layer 22c, corresponding by way of example to step S5 in FIG. 4, may refer to the layer (sub-layer) generated to include the at least some inter-node weights 6 recovered and retrained in the third recovery retraining process (that is, the second repetition of the recovery retraining process).

In this way, each time the recovery retraining process by the recovery retraining unit 120 is repeated, a sub-layer may be added to the second layer 22 to include the inter-node weights additionally recovered and retrained in that repetition.

In other words, after the first sub-layer 22a is generated, the number of sub-layers in the second layer 22 may grow in proportion to the number of repetitions of the recovery retraining process. That is, the remaining sub-layers (22b, 22c, …) other than the first sub-layer 22a may be generated in a number proportional to the number of repetitions of the recovery retraining process.

Here, the plurality of sub-layers (22a, 22b, 22c, …) included in the second layer 22 may be generated by applying a sparse matrix format.

A sparse matrix is a matrix in which most of the values are 0. Examples of sparse matrix formats include CSR (Compressed Sparse Row), CSC (Compressed Sparse Column), and COO (Coordinate list). By way of example, FIG. 6 shows how a CSC (Compressed Sparse Column) matrix is stored.

Because a sparse matrix format, after a conversion step, stores only the non-zero values of a matrix, it requires less storage space than a dense matrix and allows faster operations between matrices.
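
As a rough illustration of the storage scheme described above, the following sketch converts a small dense matrix into the three arrays of the CSC format (non-zero values, their row indices, and column pointers) in plain Python. The function name `dense_to_csc` and the sample matrix are hypothetical and are not taken from FIG. 6.

```python
def dense_to_csc(dense):
    """Convert a dense matrix (list of rows) into CSC arrays.

    Returns (values, row_indices, col_pointers), where the non-zero entries
    of column j are values[col_pointers[j]:col_pointers[j + 1]].
    """
    n_rows = len(dense)
    n_cols = len(dense[0]) if n_rows else 0
    values, row_indices, col_pointers = [], [], [0]
    for j in range(n_cols):                # walk column by column
        for i in range(n_rows):
            if dense[i][j] != 0:           # store non-zero entries only
                values.append(dense[i][j])
                row_indices.append(i)
        col_pointers.append(len(values))   # marks the end of column j
    return values, row_indices, col_pointers


# Hypothetical 3x4 weight matrix in which most entries were pruned to 0.
dense = [
    [0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.0, -1.5],
    [0.25, 0.0, 0.0, 0.0],
]
values, row_indices, col_pointers = dense_to_csc(dense)
```

For a matrix this small the three arrays are barely shorter than the dense form; the saving only becomes significant for larger matrices with a high proportion of zeros.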

Accordingly, when generating the multi-phase neural network 20, the generation unit 130 may generate the plurality of sub-layers (22a, 22b, 22c, …) included in the second layer 22 by applying a sparse matrix format.

The generation unit 130 may generate multi-phase neural networks of a plurality of types. Here, the type of a multi-phase neural network generated by the generation unit 130 may be determined by the number of layers it includes.

Referring to FIG. 5, by way of example, the multi-phase neural networks generated by the generation unit 130 may be of four types.

Here, the first type of multi-phase neural network may refer to a neural network including the first layer 21. The second type may refer to a neural network including the first layer 21 and the first sub-layer 22a in the second layer 22. The third type may refer to a neural network including the first layer 21 and, in the second layer 22, the first sub-layer 22a and the second sub-layer 22b. The fourth type may refer to a neural network including the first layer 21 and, in the second layer 22, the first sub-layer 22a, the second sub-layer 22b, and the third sub-layer 22c.

In this way, the generation unit 130 may generate multi-phase neural networks of a plurality of types according to the number of sub-layers included in the second layer 22.

In addition, although not shown in the drawings, the apparatus 100 may include a multi-phase neural network providing unit (not shown).

The multi-phase neural network providing unit (not shown) may provide the multi-phase neural network generated by the generation unit 130. In particular, it may provide any one of the multi-phase neural networks of the plurality of types. In doing so, it may selectively provide one of them in consideration of the input system performance and/or requirements.

Here, the types of system performance and/or requirements may include, but are not limited to, memory space, inference speed, accuracy, and power. As another example, the types of system performance in particular may include a battery embedded system type, a connected embedded system type, a server system type, and the like.

To restate, when generating the multi-phase neural network 20, the generation unit 130 keeps fixed the values of the inter-node weights 3 that were not pruned in the pruned neural network 11, and may generate multi-phase neural networks of a plurality of types by adding, stage by stage, the recovered inter-node weights retrained in each subsequent recovery retraining pass, so that their number grows in proportion to the number of passes performed.

By way of example, let A denote the weights remaining after the initial pruning (S2) of the neural network 10, and let B, C, and D denote the inter-node weights recovered and retrained at each successive execution of the recovery retraining process. In this case, the generation unit 130 may generate, as the multi-phase neural networks of a plurality of types, a neural network including A, a neural network including A+B, a neural network including A+B+C, and a neural network including A+B+C+D.

As described above, A may denote the unpruned inter-node weights 3 remaining in the neural network pruned in step S2, B the inter-node weights 4 recovered and retrained in step S3, C the inter-node weights 5 recovered and retrained in step S4, and D the inter-node weights 6 recovered and retrained in step S5.

Accordingly, the neural network including A may correspond to the first type of multi-phase neural network, the neural network including A+B to the second type, the neural network including A+B+C to the third type, and the neural network including A+B+C+D to the fourth type.
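
The correspondence between A, B, C, D and the four types can be sketched as follows. The sketch assumes, purely for illustration, that each weight set is stored as a dictionary mapping a (row, column) position to a value; the function name `build_type` and all sample values are hypothetical.

```python
def build_type(base, sublayers, type_index):
    """Return the weights of the multi-phase network of the given type.

    base       -- weights that survived pruning (A), as {(row, col): value}
    sublayers  -- recovered and retrained weights per recovery pass [B, C, D, ...]
    type_index -- 1 gives A, 2 gives A+B, 3 gives A+B+C, and so on
    """
    if not 1 <= type_index <= len(sublayers) + 1:
        raise ValueError("type_index out of range")
    weights = dict(base)                      # copy, so A itself is never modified
    for sub in sublayers[:type_index - 1]:    # add B, C, ... in order
        overlap = weights.keys() & sub.keys()
        if overlap:
            # Each pass recovers only weights that are still pruned,
            # so the same position should never appear twice.
            raise ValueError(f"position recovered twice: {sorted(overlap)}")
        weights.update(sub)
    return weights


A = {(0, 0): 0.9, (1, 2): -0.7}      # survived pruning in S2
B = {(0, 1): 0.2}                    # recovered and retrained in S3
C = {(2, 2): -0.1}                   # recovered and retrained in S4
D = {(1, 0): 0.05, (2, 1): 0.03}     # recovered and retrained in S5

types = [build_type(A, [B, C, D], t) for t in (1, 2, 3, 4)]
sizes = [len(w) for w in types]      # number of non-zero weights per type
```

The number of non-zero weights grows from type to type, which is the source of the memory and speed trade-off discussed in connection with FIG. 5.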

Thereafter, the multi-phase neural network providing unit (not shown) may selectively provide any one of the multi-phase neural networks of the plurality of types (for example, the first to fourth types) according to the input system performance and/or requirements.

Here, referring to FIG. 5, as the number of layers in the generated multi-phase neural network increases (in other words, as the number of retrained recovered inter-node weights increases, or moving from the first type to the fourth type), the neural network may become denser. As the network becomes denser, the required memory space grows and accuracy and power consumption rise, while inference speed (in particular, forward inference speed) may fall. That is, there may be a trade-off between performance and accuracy, and the apparatus 100 may provide a pruning-recovery training (retraining) technique that takes into account the trade-offs among power, speed, accuracy, and the like in the neural network 10 (for example, a convolutional neural network).

As the proportion of pruned inter-node weights in the neural network 10 increases, the number of weights with value 0 in the neural network 10 increases, which can be expressed as an increase in the sparsity rate of the neural network 10. Accordingly, to address the problem that inference slows down and the required memory space grows as the neural network becomes denser, the generation unit 130 of the apparatus 100 may, when generating the multi-phase neural network 20, generate the layers it contains (in particular, the plurality of sub-layers included in the second layer) by applying a sparse matrix format.
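
The sparsity rate mentioned above can be computed as the fraction of zero-valued weights. The following is a minimal sketch; the function name `sparsity_rate` and the two sample matrices are hypothetical.

```python
def sparsity_rate(weights):
    """Fraction of entries equal to zero in a dense weight matrix (list of rows)."""
    total = sum(len(row) for row in weights)
    if total == 0:
        raise ValueError("empty weight matrix")
    zeros = sum(1 for row in weights for w in row if w == 0)
    return zeros / total


# Hypothetical weights right after pruning: 6 of 8 entries are zero.
pruned = [
    [0.9, 0.0, 0.0, 0.0],
    [0.0, 0.0, -0.7, 0.0],
]
# The same weights after one pruned weight has been recovered.
after_recovery = [
    [0.9, 0.2, 0.0, 0.0],
    [0.0, 0.0, -0.7, 0.0],
]

rate_pruned = sparsity_rate(pruned)
rate_recovered = sparsity_rate(after_recovery)
```

Recovering weights lowers the sparsity rate, so each additional sub-layer makes the network denser.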

That is, because a sparse matrix format is applied, the multi-phase neural network 20 generated by the apparatus 100 may offer faster inference (inference time, forward inference speed) and lower memory usage than the prior art. In other words, the apparatus 100 constructs the filters of the layers in the generated multi-phase neural network 20 (in particular, the plurality of sub-layers included in the second layer) in a sparse matrix format, so that a neural network to which the multi-phase neural network 20 is applied may, owing to the properties of sparse matrices, achieve faster forward inference and lower memory usage than a neural network using the basic (dense) matrix format. Here, the filter of a layer may refer to the set of parameters within that layer.
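
To illustrate why a sparse format can speed up forward inference, the following sketch multiplies a CSC-stored weight matrix by an input vector while touching only the non-zero entries. The function name `csc_matvec` and the sample arrays are hypothetical, and the actual speed-up in practice depends on the hardware and on how sparse the matrix is.

```python
def csc_matvec(values, row_indices, col_pointers, x, n_rows):
    """Compute y = W @ x for W stored in CSC form, visiting non-zeros only."""
    if len(col_pointers) != len(x) + 1:
        raise ValueError("x length must equal the number of columns")
    y = [0.0] * n_rows
    for j, xj in enumerate(x):
        # Non-zero entries of column j sit between two consecutive pointers.
        for k in range(col_pointers[j], col_pointers[j + 1]):
            y[row_indices[k]] += values[k] * xj
    return y


# CSC form of a hypothetical 3x4 matrix with three non-zero weights:
# W[2][0] = 0.25, W[0][1] = 0.5, W[1][3] = -1.5.
values = [0.25, 0.5, -1.5]
row_indices = [2, 0, 1]
col_pointers = [0, 1, 2, 2, 3]

x = [1.0, 2.0, 3.0, 4.0]
y = csc_matvec(values, row_indices, col_pointers, x, n_rows=3)
```

Here the product needs three multiplications instead of the twelve a dense 3x4 product would perform.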

The apparatus 100 thus enables pruning-recovery training that takes into account trade-offs among power, speed, accuracy, and the like in the neural network 10 (for example, a convolutional neural network). In other words, the present application may provide a pruning-recovery training technique that takes the trade-off between accuracy and speed/power into account. The technique proposed herein may also be referred to as a multi-phase pruning technique.

Meanwhile, an example of how the multi-phase neural network providing unit (not shown) selectively provides any one of the multi-phase neural networks of the plurality of types (for example, the first to fourth types) according to the input system performance and/or requirements is as follows.

For example, when a condition that the accuracy be anywhere from 0% to 24% is input as the system performance and/or requirement, the multi-phase neural network providing unit (not shown) may select and provide the first type of multi-phase neural network. When the input condition is an accuracy anywhere from 25% to 49%, it may select and provide the second type. When the input condition is an accuracy anywhere from 50% to 74%, it may select and provide the third type. When the input condition is an accuracy anywhere from 75% to 100%, it may select and provide the fourth type.
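
The example ranges above amount to a simple threshold lookup, which can be sketched as follows. The function name `select_type` is hypothetical. The text gives whole-number ranges only, so the handling of fractional values such as 24.5 (assigned here to the lower type) is an assumption of this sketch.

```python
def select_type(required_accuracy_percent):
    """Map a required accuracy (0 to 100, in percent) to a network type 1..4.

    The thresholds mirror the example ranges in the text above:
    0-24 gives type 1, 25-49 type 2, 50-74 type 3, 75-100 type 4.
    """
    if not 0 <= required_accuracy_percent <= 100:
        raise ValueError("accuracy must be between 0 and 100")
    if required_accuracy_percent < 25:
        return 1
    if required_accuracy_percent < 50:
        return 2
    if required_accuracy_percent < 75:
        return 3
    return 4


# Hypothetical requirements from three different target systems.
selected = [select_type(p) for p in (10, 49, 80)]
```

A providing unit following this scheme would hand out the sparsest network that still meets the stated accuracy requirement.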

That is, in the neural network 10 obtained by pruning a model trained on a specific image data set, the apparatus 100 may recover a preset proportion of the pruned inter-node weights 2. The apparatus 100 may then perform retraining only on the recovered inter-node weights. By retraining the recovered inter-node weights, the apparatus 100 may restore the accuracy of the neural network that was degraded by pruning.

In addition, in the apparatus 100, the values of the weights that were not pruned when pruning was performed on the neural network 10 (that is, the surviving weights 3) do not change during the recovery retraining process, so a neural network can be constructed by adding, separately for each execution (each repetition) of the recovery retraining process, the weights recovered in that execution.

In other words, the apparatus 100 keeps the values of the unpruned weights (that is, the surviving weights 3) fixed and, for each execution (each repetition) of the recovery retraining process, separately (stepwise) adds a layer containing the weights recovered and retrained in that execution, thereby generating neural networks of a plurality of types (multi-phase neural networks).
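
Keeping the surviving weights fixed while retraining only the recovered ones can be sketched as a masked gradient-descent step. This is an illustrative sketch only: the function name `retrain_step`, the learning rate, the gradients, and the assumption that a recovered weight restarts from its pre-pruning value are all hypothetical and are not specified in this passage.

```python
def retrain_step(weights, grads, recovered_mask, lr=0.1):
    """Apply one gradient-descent step to recovered weights only.

    weights, grads and recovered_mask are flat lists of equal length.
    Entries whose mask is False (surviving or still-pruned weights)
    are returned unchanged.
    """
    if not (len(weights) == len(grads) == len(recovered_mask)):
        raise ValueError("length mismatch")
    return [
        w - lr * g if is_recovered else w
        for w, g, is_recovered in zip(weights, grads, recovered_mask)
    ]


# Indices 0 and 3 survived pruning; index 1 has just been recovered
# (assumed here to restart from its pre-pruning value 0.05);
# index 2 is still pruned.
weights = [0.9, 0.05, 0.0, -0.7]
recovered_mask = [False, True, False, False]
grads = [0.5, -2.0, 1.0, 0.25]        # hypothetical loss gradients
updated = retrain_step(weights, grads, recovered_mask)
```

Because the surviving weights never change, the recovered weights of each pass can be stored as a separate sub-layer and added to or removed from the network without affecting the rest.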

In other words, to generate the multi-phase neural network 20, the apparatus 100 performs the recovery retraining process a plurality of times (repeatedly), and may organize the weights recovered in each pass into neural networks of several stages (that is, multi-phase neural networks of a plurality of types) as shown in FIG. 5. Each time recovered weights are added (each time a layer containing recovered weights is added, that is, as the number of layers in the multi-phase neural network increases), the sparsity rate decreases, so computation requires more memory space and forward inference takes longer. Accordingly, each time the recovery retraining process is performed, the apparatus 100 generates the weights recovered in that pass as a separate layer and adds it stepwise, so that multi-phase neural networks of a plurality of types can be constructed (generated) as shown in FIG. 5. The apparatus 100 can thereby provide the multi-phase neural networks of the plurality of types so that they are used adaptively, in accordance with the system performance and/or requirements (for example, requirements on memory or speed).

By constructing (generating) multi-phase neural networks of a plurality of types through the repeated recovery retraining process, the apparatus 100 can adaptively select and provide a neural network tailored to the system performance and/or requirements, such as power and accuracy.

Hereinafter, the operation flow of the present application is briefly described on the basis of the details given above.

FIG. 7 is an operation flowchart of a neural network pruning-retraining method according to an embodiment of the present application.

The neural network pruning-retraining method shown in FIG. 7 may be performed by the apparatus 100 described above. Accordingly, even where omitted below, the description given for the apparatus 100 applies equally to the description of the neural network pruning-retraining method.

Referring to FIG. 7, in step S110, pruning may be performed on nodes in the neural network.

Here, the neural network considered in step S110 may be, for example, a trained convolutional neural network.

Next, in step S120, at least some of the pruned inter-node weights in the neural network pruned in step S110 may be recovered, and retraining may be performed on the recovered inter-node weights.

Here, in step S120, at least some of the remaining pruned inter-node weights, excluding the at least some inter-node weights already recovered, may be additionally recovered.

Step S120 may be performed repeatedly.
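
The flow of step S110 followed by repeated executions of step S120 can be sketched as follows. Several details are assumptions made only for this sketch and are not stated in this passage: pruning by a magnitude threshold, applying the recovery ratio to the weights that are still pruned, recovering the weights with the largest pre-pruning magnitude first, and the caller-supplied `retrain` stand-in. All function names are hypothetical.

```python
def prune(weights, threshold):
    """S110: drop weights whose magnitude is below threshold (assumed criterion)."""
    kept = {i: w for i, w in enumerate(weights) if abs(w) >= threshold}
    pruned = [i for i in range(len(weights)) if i not in kept]
    return kept, pruned


def recover(original, still_pruned, ratio):
    """S120, recovery part: pick a share of the still-pruned positions.

    Positions with the largest original magnitude are chosen first;
    this ordering is an assumption of the sketch.
    """
    count = max(1, int(len(still_pruned) * ratio)) if still_pruned else 0
    ranked = sorted(still_pruned, key=lambda i: abs(original[i]), reverse=True)
    chosen = ranked[:count]
    remaining = [i for i in still_pruned if i not in chosen]
    return chosen, remaining


def prune_and_recover(original, threshold, ratio, passes, retrain):
    """Run S110 once, then S120 up to `passes` times; return (base, sublayers)."""
    base, still_pruned = prune(original, threshold)
    sublayers = []
    for _ in range(passes):
        chosen, still_pruned = recover(original, still_pruned, ratio)
        if not chosen:
            break                      # nothing left to recover
        # retrain() stands in for the actual retraining of recovered weights.
        sublayers.append({i: retrain(i, original[i]) for i in chosen})
    return base, sublayers


original = [0.9, 0.05, -0.3, 0.2, -0.7, 0.01, 0.1, -0.15]
base, sublayers = prune_and_recover(
    original, threshold=0.5, ratio=0.5, passes=2,
    retrain=lambda index, value: value,   # identity placeholder, no real training
)
```

Each entry of `sublayers` holds only weights that were still pruned before that pass, matching the statement that later repetitions recover weights from the remainder.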

In addition, although not shown in the drawings, the neural network pruning-retraining method according to an embodiment of the present application may include, after step S120, generating a multi-phase neural network in consideration of the retrained recovered inter-node weights.

Here, the multi-phase neural network may include a first layer generated to include the pruned inter-node weights in the neural network and a second layer generated to include the inter-node weights recovered in step S120 (that is, the inter-node weights recovered and retrained in step S120).

In addition, the second layer may include a plurality of sub-layers. Here, one sub-layer among the plurality of sub-layers may be a sub-layer generated to include the recovered at least some inter-node weights. When step S120 is performed repeatedly, the remaining sub-layers other than that one sub-layer may be sub-layers additionally generated, at each repetition of step S120, to include the inter-node weights additionally recovered in that repetition.

In addition, the plurality of sub-layers in the second layer included in the multi-phase neural network may be generated by applying a sparse matrix format.

In the above description, steps S110 and S120 may be further divided into additional steps or combined into fewer steps, depending on the implementation of the present application. In addition, some steps may be omitted as necessary, and the order of the steps may be changed.

The neural network pruning-retraining method according to an embodiment of the present application may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.

In addition, the neural network pruning-retraining method described above may also be implemented in the form of a computer program or application stored on a recording medium and executed by a computer.

The foregoing description of the present application is provided for illustration, and those of ordinary skill in the art to which the present application pertains will understand that it can readily be modified into other specific forms without changing its technical spirit or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and likewise components described as distributed may be implemented in a combined form.

The scope of the present application is defined by the claims set out below rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present application.

100: neural network pruning-retraining apparatus
110: pruning unit
120: recovery retraining unit
130: generation unit

Claims

A pruning-retraining method of a neural network, each step of which is performed by a computer-implemented neural network pruning-retraining apparatus, the method comprising:
(a) performing pruning on nodes in the neural network;
(b) recovering at least some of the pruned inter-node weights in the pruned neural network and performing retraining on the recovered inter-node weights; and
(c) generating a multi-phase neural network in consideration of the retrained recovered inter-node weights,
wherein the multi-phase neural network includes a first layer generated to include the pruned inter-node weights in the neural network and a second layer generated to include the recovered inter-node weights.

The method of claim 1,
wherein, in step (b), at least some of the remaining pruned inter-node weights, excluding the at least some inter-node weights already recovered, are additionally recovered.

The method of claim 2,
wherein step (b) is performed repeatedly.

(Deleted)

The method of claim 1,
wherein the second layer includes a plurality of sub-layers,
one sub-layer among the plurality of sub-layers is a sub-layer generated to include the recovered at least some inter-node weights, and
the remaining sub-layers among the plurality of sub-layers are sub-layers additionally generated, when step (b) is performed repeatedly, to include the inter-node weights additionally recovered at each repetition of step (b).

The method of claim 6,
wherein the plurality of sub-layers are generated by applying a sparse matrix format.

The method of claim 1,
wherein the neural network in step (a) is a trained convolutional neural network.

A neural network pruning-retraining apparatus, comprising:
a pruning unit that performs pruning on nodes in the neural network;
a recovery retraining unit that recovers at least some of the pruned inter-node weights in the pruned neural network and performs retraining on the recovered inter-node weights; and
a generation unit that generates a multi-phase neural network in consideration of the retrained recovered inter-node weights,
wherein the multi-phase neural network includes a first layer generated to include the pruned inter-node weights in the neural network and a second layer generated to include the recovered inter-node weights.

The apparatus of claim 9,
wherein the recovery retraining unit additionally recovers at least some of the remaining pruned inter-node weights, excluding the at least some inter-node weights already recovered.

The apparatus of claim 10,
wherein the recovery retraining unit repeatedly performs the process of recovering inter-node weights and retraining the recovered inter-node weights.

(Deleted)