KR102437193B1 - Apparatus and method for parallel deep neural networks trained by resized images with multiple scaling factors

Info

Publication number: KR102437193B1
Application number: KR1020200096078A
Authority: KR
Inventors: 원치선
Original assignee: 동국대학교 산학협력단
Priority date: 2020-07-31
Filing date: 2020-07-31
Publication date: 2022-08-30
Also published as: KR20220016402A

Abstract

The present invention relates to a parallel deep neural network apparatus that is trained on images from a global view and from local views at various scales. So that the networks can separately learn not only the overall shape of an object in an image but also fine local differences, training images are resized at different scales, and the deep neural networks trained on the images resized at each scale are connected in parallel. The image size can thereby be converted to scales within a specific range, and memory consumption can be minimized by sharing the upper layer of the base layer among the plurality of deep neural networks trained at the various scales.

Description

Apparatus and method for parallel deep neural networks trained with images scaled according to multiple magnifications

The present invention relates to deep neural network technology that can learn an object in an image from training images resized at various scales, such as a global view and a local view. More specifically, it relates to a technique that resizes training images at different scales and connects in parallel the deep neural networks trained on the images resized at each scale.

Various deep neural network architectures pre-trained on large numbers of still images exist, and these pre-trained convolutional neural network (CNN) architectures are being applied to many application areas through transfer learning, achieving dramatic performance improvements.

For example, pre-trained deep neural network architectures are applied, through transfer learning, to the fine-grained object classification problem, in which a basic object is classified into the sub-objects of various lineages that belong to it.

Existing approaches to sub-object classification raise classification accuracy by exploiting the object parts that all sub-objects have in common.

For example, for the problem of classifying the various breeds (sub-objects) of the basic object "dog", performance was improved by segmenting the object parts that dogs have in common, namely the head, legs, and tail, and training the neural network to compare these object parts.

However, not every object targeted by fine-grained object classification has segmentable object parts.

For example, classifying food is also a fine-grained object classification problem, but food images have no common object parts that can be segmented out and compared.

FIG. 1 is a view for explaining the problem of classifying sub-objects belonging to a basic object according to the related art.

Specifically, FIG. 1 shows example images related to the problem of classifying sub-objects belonging to a basic object according to the related art.

Referring to FIG. 1, image 100 shows a case where the basic object is a bird, and image 110 shows a case where the basic object is food.

In the case of image 100, fine-grained object classification is possible using the object parts common to the sub-objects, such as the beak, wings, and tail.

In contrast, image 110 shows a case where no such common object parts exist; it corresponds to the food classification problem described above.

That is, although the food and the dish holding it can be separated, food belonging to the same sub-object can be served in a variety of containers. The existing methods therefore cannot improve the performance of the deep neural network, and a new approach is needed.

US Patent No. 10115040, "CONVOLUTIONAL NEURAL NETWORK-BASED MODE SELECTION AND DEFECT CLASSIFICATION FOR IMAGE FUSION"
Korean Patent No. 10-1882704, "Electronic device and its control method"
Korean Patent Application Publication No. 10-2020-0066732, "Detection of intratumoral heterogeneity of molecular subtypes in pathology slide images using deep learning"
Korean Patent No. 10-2102161, "Method, apparatus, and computer program for extracting representative characteristics of an object in an image"

Y. Peng, X. He, and J. Zhao, "Object-part attention model for fine-grained image classification," IEEE Transactions on Image Processing, Vol. 27, No. 3, pp. 1487-1500, 2018.
L. Bossard, M. Guillaumin, and L. Van Gool, "Food-101 - Mining Discriminative Components with Random Forests," European Conference on Computer Vision (ECCV), pp. 446-461, Springer, 2014.
C. Wah, S. Branson, P. Welinder, P. Perona, and S. Belongie, "The Caltech-UCSD Birds-200-2011 Dataset," California Institute of Technology, CNS-TR-2011-001, 2011.

An object of the present invention is to improve object recognition performance by resizing training images at various scales and jointly using a neural network trained on the characteristics of an object as observed from a distance and a neural network trained on the detailed features of the object as observed up close, thereby learning the fine differences between sub-objects that can be distinguished at the various scales.

Another object of the present invention is to construct a multi-scale CNN by connecting in parallel the different deep neural networks trained on images resized at different scales, and to perform final object classification by resizing a test image at each scale and integrating the prediction scores obtained from the neural network for that scale.

Because the present invention uses a separate neural network for each scale, the trained parameters for every scale would have to be stored. To overcome this drawback, another object of the present invention is to minimize memory consumption by training the per-scale neural networks so that the upper layer of the object-level neural network can be shared at all other scales.

Another object of the present invention is to add randomness when resizing the input image, so that the result differs slightly each time the same image is resized. This provides a jittering effect that keeps the network from falling into a local minimum when its coefficients are updated and prevents overfitting during training.

According to an embodiment of the present invention, a parallel deep neural network apparatus may include: an image size converter that applies a plurality of different scales to a training image to resize the training image to a plurality of image conversion levels; and a scale neural network generator. The scale neural network generator generates a first scale neural network for a first level among the plurality of image conversion levels by connecting the upper layer of a neural network pre-trained on general images to a lower layer whose parameters are replaced according to the first level and performing transfer learning. It then generates the remaining scale neural networks for the remaining image conversion levels by connecting the upper layer of the first scale neural network to lower layers whose parameters are replaced according to each of the image conversion levels other than the first level and performing transfer learning.

The image size converter may determine, for the training image, an overall scale that can contain the entire object, and may randomly determine the plurality of different scales within the range of the determined overall scale.

The image size converter may generate an intermediate image by linearly converting the training image according to the determined plurality of different scales, and may resize the training image by randomly cropping it within the generated intermediate image for each of the determined scales.
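The two steps in the preceding paragraph (a linear conversion to an intermediate image, then a random crop) can be sketched in plain Python. This is only an illustration under stated assumptions: the image is a list of pixel rows, the linear conversion is taken to be an aspect-preserving nearest-neighbour resize that sets the shorter side to a chosen length, and the crop is a square of the network input size. The names `resize_shorter_side` and `random_crop` are hypothetical and do not come from the patent.

```python
import random

def resize_shorter_side(image, target):
    """Nearest-neighbour resize so the shorter side equals `target`,
    keeping the aspect ratio (assumed meaning of the linear conversion)."""
    height, width = len(image), len(image[0])
    factor = target / min(height, width)
    new_h = max(target, round(height * factor))
    new_w = max(target, round(width * factor))
    return [
        [image[min(height - 1, int(r / factor))][min(width - 1, int(c / factor))]
         for c in range(new_w)]
        for r in range(new_h)
    ]

def random_crop(image, size, rng):
    """Cut a size x size window at a random position inside `image`."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - size)    # randint is inclusive on both ends
    left = rng.randint(0, width - size)
    return [row[left:left + size] for row in image[top:top + size]]

rng = random.Random(0)
image = [[(r, c) for c in range(12)] for r in range(8)]  # toy 8 x 12 "image"
intermediate = resize_shorter_side(image, target=6)      # becomes 6 x 9
patch = random_crop(intermediate, size=4, rng=rng)       # 4 x 4 network input
```

Because the crop position is drawn anew on every call, resizing the same training image twice generally yields slightly different patches, which is the jittering effect the description attributes to the random resizing.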

The image size converter may determine the plurality of different scales by summing two values. The first value is obtained by subtracting the minimum of the width and height of the image to be input to the scale neural network from the minimum of the width and height of the training image at the overall scale, and then combining the result with a random number drawn from a normal distribution with mean 0 and standard deviation 0.5. The second value is obtained by combining a scaling factor, which determines the range of the scale, with half of the sum of the minimum of the width and height of the training image at the overall scale and the minimum of the width and height of the image to be input to the scale neural network.
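One possible reading of the preceding paragraph as a formula is given below. It assumes that each "combining" is a multiplication, which the text does not state explicitly. The symbols are introduced here for illustration and are not in the original: $W, H$ are the width and height of the training image at the overall scale, $w, h$ are the width and height of the image input to the scale neural network, $\alpha$ is the scaling factor, $n$ is the random number, and $s$ is the resulting scale.

```latex
s = \bigl(\min(W,H) - \min(w,h)\bigr)\, n
    \;+\; \alpha \cdot \frac{\min(W,H) + \min(w,h)}{2},
\qquad n \sim \mathcal{N}\!\left(0,\ 0.5^{2}\right)
```

Under this reading, the second term places the scale at a fraction $\alpha$ of the midpoint between the two shorter sides, and the first term adds zero-mean random jitter around that point.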

The scale neural network generator may share the parameters of the upper layer of the first scale neural network both as the base parameters for generating the remaining scale neural networks and as the parameters of the upper layers of the remaining scale neural networks.

The scale neural network generator may generate a second scale neural network among the remaining scale neural networks by connecting the upper layer of the first scale neural network to a lower layer whose parameters are replaced according to a second level among the plurality of image conversion levels and performing transfer learning. In the same way, it may generate a third scale neural network using a lower layer whose parameters are replaced according to a third level, and a last scale neural network using a lower layer whose parameters are replaced according to the last level, each time connecting that lower layer to the upper layer of the first scale neural network and performing transfer learning.
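The order in which the scale neural networks are generated can be sketched as follows. The sketch uses plain dictionaries in place of real layers, and `transfer_learn` is a hypothetical stand-in that performs no actual training. It also assumes that the shared upper layer is left unchanged while the remaining networks are trained, and that the first level is the object level; the level names and their order are illustrative.

```python
def transfer_learn(upper, level, freeze_upper):
    """Stand-in for transfer learning; a real version would run training here.

    When `freeze_upper` is True the very same upper-layer object is reused,
    otherwise a fine-tuned copy is returned (assumption of this sketch).
    """
    new_upper = upper if freeze_upper else dict(upper, tuned_on=level)
    lower = {"level": level}  # lower layer with parameters replaced for `level`
    return {"upper": new_upper, "lower": lower}

pretrained_upper = {"source": "general images"}
levels = ["object", "mid", "part"]  # first, second, ..., last level

# First scale network: starts from the upper layer pre-trained on general images.
first = transfer_learn(pretrained_upper, levels[0], freeze_upper=False)

# Remaining scale networks: each one starts from the FIRST network's upper layer.
networks = {levels[0]: first}
for level in levels[1:]:
    networks[level] = transfer_learn(first["upper"], level, freeze_upper=True)
```

After the loop, every entry of `networks` refers to the same upper-layer object, so only the lower layers differ from one scale to the next.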

When generating the remaining scale neural networks, the scale neural network generator may store the parameters of the upper layer of the first scale neural network and the parameters of the upper layers of the remaining scale neural networks as identical parameters.
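The memory saving from storing one set of upper-layer parameters can be illustrated with simple arithmetic. The function name and the parameter counts below are made up for illustration and are not taken from the patent.

```python
def stored_parameters(upper_size, lower_size, num_scales, share_upper):
    """Number of parameters that must be stored for `num_scales` networks."""
    if share_upper:
        # one copy of the upper layer plus one lower layer per scale
        return upper_size + num_scales * lower_size
    # every network keeps its own upper and lower layers
    return num_scales * (upper_size + lower_size)

# Illustrative sizes only: 20M upper-layer and 5M lower-layer parameters.
separate = stored_parameters(20_000_000, 5_000_000, num_scales=3, share_upper=False)
shared = stored_parameters(20_000_000, 5_000_000, num_scales=3, share_upper=True)
print(separate, shared)  # 75000000 35000000
```

With these illustrative numbers, sharing the upper layer cuts storage from 75 million to 35 million parameters, and the saving grows with the number of scales.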

The plurality of image conversion levels may include at least two of the following: an object level, corresponding to a scale at the size of the object in the training image; a part level, corresponding to a scale at the size of a part of the object; and a mid level, corresponding to a scale intermediate between the object level and the part level.

According to an embodiment of the present invention, the parallel deep neural network apparatus may further include a score calculator that inputs a test image, resized according to the plurality of image conversion levels, into the generated first scale neural network and the generated remaining scale neural networks to calculate, from each network, a score for each of a plurality of classes.

The image size converter may determine, for the test image, an overall scale that can contain the entire object; determine a plurality of different scales for the test image within the range of the determined overall scale; generate an intermediate image by linearly converting the test image according to the determined scales; and resize the test image within the generated intermediate image for each of the determined scales.

The image size converter may determine the plurality of different scales for the test image by combining a scaling factor, which determines the range of the scale, with half of the sum of the minimum of the width and height of the test image at the overall scale and the minimum of the width and height of the image to be input to the scale neural network.

According to an embodiment of the present invention, the parallel deep neural network apparatus may further include an image recognizer that connects the generated first scale neural network and the generated remaining scale neural networks in parallel and integrates the respectively calculated scores to recognize the class of the test image.

The image recognizer may recognize, as the class of the test image, the class corresponding to the largest of the final scores obtained by adding or multiplying the respectively calculated scores.
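The score integration described in the preceding paragraph can be sketched as below. The function name `fuse_scores`, the class names, and the score values are made up for illustration, and the sketch assumes every network reports a score for the same set of classes.

```python
import math

def fuse_scores(per_network_scores, mode="sum"):
    """Combine per-class scores from several scale networks and pick a class.

    per_network_scores: list of dicts mapping class name -> score,
    one dict per scale neural network.
    """
    classes = per_network_scores[0].keys()
    if mode == "sum":
        final = {c: sum(s[c] for s in per_network_scores) for c in classes}
    elif mode == "product":
        final = {c: math.prod(s[c] for s in per_network_scores) for c in classes}
    else:
        raise ValueError("mode must be 'sum' or 'product'")
    best = max(final, key=final.get)  # class with the largest final score
    return best, final

# Illustrative scores from two scale networks for one test image.
object_level = {"bibimbap": 0.5, "fried rice": 0.4, "ramen": 0.1}
part_level = {"bibimbap": 0.2, "fried rice": 0.7, "ramen": 0.1}
label_sum, _ = fuse_scores([object_level, part_level], mode="sum")
label_prod, _ = fuse_scores([object_level, part_level], mode="product")
```

In this example the object-level network alone would pick "bibimbap", but the strong part-level score for "fried rice" makes it the largest final score under both addition and multiplication.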

The class may include any one of a plurality of categories into which the test image can be classified.

According to an embodiment of the present invention, a method of operating a parallel deep neural network training apparatus may include: resizing, by an image size converter, a training image to a plurality of image conversion levels by applying a plurality of different scales to the training image; generating, by a scale neural network generator, a first scale neural network for a first level among the plurality of image conversion levels by connecting the upper layer of a neural network pre-trained on general images to a lower layer whose parameters are replaced according to the first level and performing transfer learning; and generating, by the scale neural network generator, the remaining scale neural networks for the remaining image conversion levels by connecting the upper layer of the first scale neural network to lower layers whose parameters are replaced according to each of the image conversion levels other than the first level and performing transfer learning.

Resizing the training image may include: determining, for the training image, an overall scale that can contain the entire object, and randomly determining the plurality of different scales within the range of the determined overall scale; and generating an intermediate image by linearly converting the training image according to the determined scales, and resizing the training image by randomly cropping it within the generated intermediate image for each of the determined scales.

Generating the remaining scale neural networks for the remaining image conversion levels may include: generating a second scale neural network among the remaining scale neural networks by connecting the upper layer of the first scale neural network to a lower layer whose parameters are replaced according to a second level among the plurality of image conversion levels and performing transfer learning; generating a third scale neural network in the same way using a lower layer whose parameters are replaced according to a third level; and generating a last scale neural network in the same way using a lower layer whose parameters are replaced according to the last level.

According to an embodiment of the present invention, the method of operating the parallel deep neural network apparatus may further include: calculating a score for each of a plurality of classes by inputting a test image, resized according to the plurality of image conversion levels, into the generated first scale neural network and the generated remaining scale neural networks; and recognizing the class of the test image by connecting the generated first scale neural network and the generated remaining scale neural networks in parallel and integrating the respectively calculated scores.

The present invention can improve object recognition performance by resizing training images at various scales and jointly using a neural network trained on the characteristics of an object as observed from a distance and a neural network trained on the detailed features of the object as observed up close, thereby learning the fine differences between sub-objects that can be distinguished at the various scales.

The present invention can construct a multi-scale CNN by connecting in parallel the different deep neural networks trained on images resized at different scales, and can perform final object classification by resizing a test image at each scale and integrating the prediction scores obtained from the neural network for that scale.

Because the present invention uses a separate neural network for each scale, the trained parameters for every scale would have to be stored. The present invention overcomes this drawback and can minimize memory consumption by training the per-scale neural networks so that the upper layer of the object-level neural network is shared at all other scales.

The present invention adds randomness when resizing the input image, so that the result differs slightly each time the same image is resized. This can provide a jittering effect that keeps the network from falling into a local minimum when its coefficients are updated and prevents overfitting during training.

FIG. 1 is a view for explaining the problem of classifying sub-objects belonging to a basic object according to the related art.
FIG. 2 is a diagram illustrating a parallel deep neural network apparatus according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating the structure of a plurality of neural networks trained on images resized at different scales in a parallel deep neural network apparatus according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating the image size converter of a parallel deep neural network apparatus according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating the structure of a plurality of neural networks trained on images resized at three scales in a parallel deep neural network apparatus according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating a parallel deep neural network apparatus according to an embodiment of the present invention.
FIGS. 7 and 8 are diagrams illustrating the neural network structure of a parallel deep neural network apparatus that recognizes a test image through a plurality of neural networks trained on images resized at different scales according to an embodiment of the present invention.
FIGS. 9 and 10 are diagrams illustrating the neural network structure of a parallel deep neural network apparatus that recognizes a test image through a plurality of neural networks trained on images resized at three scales according to an embodiment of the present invention.
FIGS. 11A and 11B are diagrams illustrating images resized by the image size converter of a parallel deep neural network apparatus according to an embodiment of the present invention.
FIGS. 12 and 13 are diagrams illustrating a method of operating a parallel deep neural network apparatus according to an embodiment of the present invention.

Hereinafter, various embodiments of this document are described with reference to the accompanying drawings.

The embodiments and the terms used in them are not intended to limit the technology described in this document to specific embodiments, and should be understood to include various modifications, equivalents, and/or alternatives of those embodiments.

In describing the various embodiments below, a detailed description of a related known function or configuration is omitted when it is judged that it could unnecessarily obscure the gist of the invention.

The terms used below are defined in consideration of their functions in the various embodiments, and may vary according to the intention or custom of users and operators. Their definitions should therefore be based on the content of this specification as a whole.

In the description of the drawings, similar reference numerals may be used for similar components.

A singular expression may include the plural unless the context clearly indicates otherwise.

In this document, expressions such as "A or B" or "at least one of A and/or B" may include all possible combinations of the items listed together.

Expressions such as "first" and "second" may modify the corresponding components regardless of order or importance; they are used only to distinguish one component from another and do not limit the components.

When a (e.g., first) component is referred to as being "(functionally or communicatively) connected" or "coupled" to another (e.g., second) component, the component may be connected to the other component directly or through yet another component (e.g., a third component).

In this specification, "configured to (or set to)" may be used interchangeably with, for example, "suitable for," "having the capacity to," "adapted to," "made to," "capable of," or "designed to," in hardware or in software, depending on the situation.

In some situations, the expression "a device configured to" may mean that the device is "capable of" an operation together with other devices or components.

예를 들면, 문구 "A, B, 및 C를 수행하도록 구성된(또는 설정된) 프로세서"는 해당 동작을 수행하기 위한 전용 프로세서(예: 임베디드 프로세서), 또는 메모리 장치에 저장된 하나 이상의 소프트웨어 프로그램들을 실행함으로써, 해당 동작들을 수행할 수 있는 범용 프로세서(예: CPU 또는 application processor)를 의미할 수 있다.For example, the phrase “a processor configured (or configured to perform) A, B, and C” refers to a dedicated processor (eg, an embedded processor) for performing the operations, or by executing one or more software programs stored in a memory device. , may refer to a general-purpose processor (eg, a CPU or an application processor) capable of performing corresponding operations.

또한, '또는' 이라는 용어는 배타적 논리합 'exclusive or' 이기보다는 포함적인 논리합 'inclusive or' 를 의미한다.Also, the term 'or' means 'inclusive or' rather than 'exclusive or'.

즉, 달리 언급되지 않는 한 또는 문맥으로부터 명확하지 않는 한, 'x가 a 또는 b를 이용한다' 라는 표현은 포함적인 자연 순열들(natural inclusive permutations) 중 어느 하나를 의미한다.That is, unless stated otherwise or clear from context, the expression 'x employs a or b' means any one of natural inclusive permutations.

이하 사용되는 '..부', '..기' 등의 용어는 적어도 하나의 기능이나 동작을 처리하는 단위를 의미하며, 이는 하드웨어나 소프트웨어, 또는, 하드웨어 및 소프트웨어의 결합으로 구현될 수 있다.Terms such as '.. unit' and '.. group' used below mean a unit for processing at least one function or operation, which may be implemented as hardware or software, or a combination of hardware and software.

FIG. 2 is a diagram illustrating a parallel deep neural network apparatus according to an embodiment of the present invention.

FIG. 2 illustrates the components of a parallel deep neural network apparatus that generates a plurality of scale neural networks by learning training images resized at a plurality of different scales, according to an embodiment of the present invention.

Referring to FIG. 2, the parallel deep neural network apparatus 200 according to an embodiment of the present invention includes an image size conversion unit 210 and a scale neural network generation unit 220.

The image size conversion unit 210 according to an embodiment of the present invention may resize a training image to a plurality of image conversion levels by applying a plurality of different scales to the training image.

For example, the image size conversion unit 210 may determine an overall scale at which the entire object in the training image can be included, and may randomly determine a plurality of different scales within the determined overall scale range.

That is, the image size conversion unit 210 may resize the training image, within the extent of the training image, into training images of different image conversion levels having different scales.

For example, the plurality of image conversion levels may include at least two of the following: an object level, corresponding to a scale of the size of the object in the training image; a part level, corresponding to a scale of the size of a part of the object; and a mid level, corresponding to a scale intermediate between the object level and the part level.

The image size conversion unit 210 according to an embodiment of the present invention may generate an intermediate image by linearly resizing the training image according to each of the plurality of different scales, and may resize the training image by randomly cropping within the generated intermediate image for each of the scales.

For example, the image size conversion unit 210 may set the plurality of different scales; how the scales are set is described further with reference to FIG. 4.

According to an embodiment of the present invention, the scale neural network generation unit 220 may generate a first scale neural network for a first level among the plurality of image conversion levels by connecting the upper layer of a neural network pre-trained on general images to a lower layer whose parameters are replaced according to the first level, and performing transfer learning.

In addition, the scale neural network generation unit 220 may generate the remaining scale neural networks for the remaining image conversion levels by connecting the upper layer of the first scale neural network to lower layers whose parameters are replaced according to each of the image conversion levels other than the first level, and performing transfer learning.

For example, the scale neural network generation unit 220 may share the parameters of the upper layer of the first scale neural network as base parameters for generating the remaining scale neural networks.

According to an embodiment of the present invention, the scale neural network generation unit 220 may generate a second scale neural network, among the remaining scale neural networks, by connecting the upper layer of the first scale neural network to a lower layer whose parameters are replaced according to a second level among the plurality of image conversion levels and performing transfer learning, and may generate a third scale neural network in the same way from a lower layer whose parameters are replaced according to a third level.

In addition, the scale neural network generation unit 220 may generate the last of the remaining scale neural networks by connecting the upper layer of the first scale neural network to a lower layer whose parameters are replaced according to the last level among the plurality of image conversion levels, and performing transfer learning.

According to an embodiment of the present invention, when generating the remaining scale neural networks, the scale neural network generation unit 220 may store the upper-layer parameters of the remaining scale neural networks as identical to the upper-layer parameters of the first scale neural network.

Because the present invention uses a separate neural network for each scale, trained parameters would otherwise have to be stored for every scale. To overcome this drawback, the scale-specific neural networks are trained so that the upper layer of the object-level neural network is shared by all other scales, which minimizes memory consumption.
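The memory argument above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each network's parameters are held in flat NumPy arrays; the names `build_scale_networks` and `stored_parameter_count` and the layer sizes are hypothetical and do not come from the specification.

```python
import numpy as np

def build_scale_networks(shared_upper, lower_shapes, rng):
    """Return one network per scale; every network references the same upper-layer array."""
    return [
        {"upper": shared_upper, "lower": rng.standard_normal(shape)}
        for shape in lower_shapes
    ]

def stored_parameter_count(networks):
    """Count parameters that must actually be stored, counting a shared array only once."""
    seen, total = set(), 0
    for net in networks:
        for params in net.values():
            if id(params) not in seen:
                seen.add(id(params))
                total += params.size
    return total

rng = np.random.default_rng(0)
upper = rng.standard_normal((1000,))  # hypothetical upper-layer parameters
nets = build_scale_networks(upper, [(100,), (100,), (100,)], rng)

shared_total = stored_parameter_count(nets)  # upper layer stored once
unshared_total = sum(n["upper"].size + n["lower"].size for n in nets)  # upper layer stored per scale
print(shared_total, unshared_total)  # prints "1300 3300"
```

With three scales, the shared arrangement stores the upper layer once plus three lower layers, whereas three independent networks would store the upper layer three times.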

For example, because the parallel deep neural network apparatus 200 includes a structure in which a plurality of scale neural networks are connected in parallel, it may also be referred to as a parallel scale neural network apparatus.

FIG. 3 is a diagram illustrating the structure of a plurality of neural networks trained with images resized at different scales in a parallel deep neural network apparatus according to an embodiment of the present invention.

Referring to FIG. 3, according to an embodiment of the present invention, the structure 300 of a plurality of neural networks trained with images resized at different scales may consist of an image size conversion unit 310, a pre-trained neural network 320, a neural network 330 to be trained at a first scale, a neural network 340 to be trained at a second scale, a neural network 350 to be trained at a third scale, a first scale neural network 360, a second scale neural network 370, and a third scale neural network 380.

According to an embodiment of the present invention, the image size conversion unit 310 may receive training images from a training dataset and resize them at a plurality of different scales.

For example, the image size conversion unit 310 may resize a training image to generate a first-scale training image 311, a second-scale training image 312, and a third-scale training image 313.

According to an embodiment of the present invention, the pre-trained neural network 320 is a neural network already trained on existing general images. It consists of an upper layer 321 at the front end, which contains the parameters learned from general images, and a lower layer 322, which can be replaced for transfer learning.

For example, the first-scale neural network learns from the overall shape information of the object in the image and serves as the deep neural network that forms the base of the entire parallel deep neural network. For this network, the image size conversion unit 310 resizes the training image at the object-level scale to generate the first-scale training image 311.

According to an embodiment of the present invention, the neural network 330 to be trained at the first scale takes the first-scale training image 311 as its input image, uses the parameters of the upper layer 321 as the parameters of its upper layer 331, changes the parameters of its lower layer 332 according to the first-scale training image 311, and performs transfer learning with the upper layer 331 and the lower layer 332 connected.

Here, the first scale neural network 360 may be generated through the transfer learning of the neural network 330 to be trained at the first scale.

For example, the first scale neural network 360 may include an upper layer 361 and a lower layer 362.

According to an embodiment of the present invention, the parameters of the upper layer 361 of the first scale neural network 360 may be used as the base network of the entire multi-scale neural network (multi-scale CNN).

According to an embodiment of the present invention, the neural network 340 to be trained at the second scale takes the second-scale training image 312 as its input image, uses the parameters of the upper layer 361 as the parameters of its upper layer 341, changes the parameters of its lower layer 342 according to the second-scale training image 312, and performs transfer learning with the upper layer 341 and the lower layer 342 connected.

Here, the second scale neural network 370 may be generated through the transfer learning of the neural network 340 to be trained at the second scale.

For example, the second scale neural network 370 may include an upper layer 371 and a lower layer 372.

According to an embodiment of the present invention, the neural network 350 to be trained at the third scale takes the third-scale training image 313 as its input image, uses the parameters of the upper layer 361 as the parameters of its upper layer 351, changes the parameters of its lower layer 352 according to the third-scale training image 313, and performs transfer learning with the upper layer 351 and the lower layer 352 connected.

Here, the third scale neural network 380 may be generated through the transfer learning of the neural network 350 to be trained at the third scale.

For example, the third scale neural network 380 may include an upper layer 381 and a lower layer 382, and the third scale neural network 380 may correspond to an N-th scale neural network.

For example, the transfer learning of the neural network 330 to be trained at the first scale, of the neural network 340 to be trained at the second scale, and of the neural network 350 to be trained at the third scale may each correspond to a fine-tuning process.

According to an embodiment of the present invention, the structure 300 may be extended to further successive scales in the same manner: the parameters of the upper layer 361 are used as they are, and the lower layer is trained with images resized at each scale, so that the structure is composed of neural networks trained at various scales.

For example, the first scale neural network 360 through the third scale neural network 380 may be connected in parallel and operated as a parallel deep neural network apparatus.

FIG. 4 is a diagram illustrating the image size conversion unit of a parallel deep neural network apparatus according to an embodiment of the present invention.

FIG. 4 illustrates a configuration in which the image size conversion unit of the parallel deep neural network apparatus according to an embodiment of the present invention resizes a training image or a test image at different scales.

Referring to FIG. 4, the image size conversion unit 400 according to an embodiment of the present invention includes a scale setting unit 410, a linear resizing unit 420, and a cropping unit 430.

According to an embodiment of the present invention, the scale setting unit 410 sets the different scales used for training on a training image or for testing a test image.

For example, the scale setting unit 410 sets a plurality of different scales so that the training image is resized at three scales and the training images resized at each scale are used to build scale neural networks for the three corresponding stages.

According to an embodiment of the present invention, the scale setting unit 410 may receive an original image (M × N) corresponding to a training image or a test image, determine a scale at which the entire object in the input original image can be included, and set scales randomly within that overall range.

For example, the linear resizing unit 420 generates an intermediate image (M_t × N_t) of intermediate size by linearly resizing the width and height of the original image (M × N) in proportion to the determined scale.

According to an embodiment of the present invention, the cropping unit 430 may randomly crop, from within the intermediate image (M_t × N_t), a final image (M_o × N_o) of the final image size, thereby producing a final image whose size equals the input size of the neural network used to learn the training images.
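The two-stage conversion described above (proportional resizing followed by a random crop) might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function names `linear_resize` and `random_crop` are hypothetical, the intermediate size is passed in as a shorter-side length in pixels, and nearest-neighbour sampling stands in for whatever interpolation an actual system would use.

```python
import numpy as np

def linear_resize(image, short_side):
    """Resize so the shorter side equals `short_side`, keeping the aspect ratio.
    Nearest-neighbour sampling is used only to keep this sketch dependency-free."""
    h, w = image.shape[:2]
    ratio = short_side / min(h, w)
    new_h, new_w = max(1, round(h * ratio)), max(1, round(w * ratio))
    rows = np.minimum((np.arange(new_h) / ratio).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / ratio).astype(int), w - 1)
    return image[rows][:, cols]

def random_crop(image, out_h, out_w, rng):
    """Cut an (out_h x out_w) patch at a uniformly random position inside `image`."""
    h, w = image.shape[:2]
    if h < out_h or w < out_w:
        raise ValueError("intermediate image is smaller than the crop size")
    top = rng.integers(0, h - out_h + 1)
    left = rng.integers(0, w - out_w + 1)
    return image[top:top + out_h, left:left + out_w]

rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(375, 500, 3), dtype=np.uint8)  # M x N original
intermediate = linear_resize(original, 300)        # M_t x N_t intermediate image
final = random_crop(intermediate, 224, 224, rng)   # M_o x N_o network input
print(intermediate.shape, final.shape)
```

Because the crop position is drawn anew on every call, repeated conversions of the same original yield slightly different final images.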

For example, depending on the range within which the scale setting unit 410 sets the scale, the final image (M_o × N_o) may be generated from various viewpoints and fields of view.

According to an embodiment of the present invention, the scale setting unit 410 may confine the scale to a specific range and may set the scale randomly within that confined range based on Equation 1 below.

For example, the scale setting unit 410 may set the scale based on Equation 1 when the smaller of the width and height of the original image (M × N) is greater than the smaller of the width and height of the final image (M_o × N_o).

[Equation 1]

In Equation 1, S denotes the scale, O_min denotes the smaller of the width and height of the original image (M × N), T_min denotes the smaller of the width and height of the final image (M_o × N_o), α denotes a random number drawn from a normal distribution with mean 0 and standard deviation 0.5, and β denotes a scaling factor that determines the range of the scale.

For example, the random number is drawn from a Gaussian distribution and takes a different value for each resizing attempt, which provides a jittering effect; the amount of jitter can be controlled by the standard deviation of the random number.

For example, because the mean of the Gaussian distribution is set to 0, the scale, which acts as the scaling coefficient, is mostly selected between the size of the original image (M × N) and the target size, that is, the size of the final image (M_o × N_o).

In this case, half of the size gap between the original image (M × N) and the final image (M_o × N_o) is covered by linear scaling, and the remaining half, down to the target size, is covered by randomly cropping a patch of the target size from within the linearly scaled image.

According to an embodiment of the present invention, the image size conversion unit 400 may determine the plurality of different scales by summing two values. The first value is obtained by subtracting the smaller of the width and height of the image to be input to the scale neural network from the smaller of the width and height of the training image at the overall scale, and combining the result with a random number drawn from a normal distribution with mean 0 and standard deviation 0.5. The second value is obtained by combining half of the sum of those two minimum values with the scaling factor that determines the range of the scale.
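The verbal description above can be written compactly. Assuming that "combining" denotes multiplication, and using the symbols defined for Equation 1, one plausible reading is the following; it is a reconstruction for orientation only, and the published equation may differ, for example by a normalization term.

```latex
S = \alpha\,(O_{\min} - T_{\min}) + \beta\,\frac{O_{\min} + T_{\min}}{2},
\qquad \alpha \sim \mathcal{N}\!\left(0,\ 0.5^{2}\right)
```

Under this reading, α = 0 and β = 1 place S at the midpoint of O_min and T_min, and one standard deviation of α spans exactly the interval from T_min to O_min, which is consistent with the statement that the scale is mostly selected between the original size and the target size.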

The above description illustrates a configuration for generating scale neural networks by resizing training images or test images.

According to another embodiment, a configuration for training and generating a deep neural network by selecting lines of a training image or test image is introduced in "Constrained Optimization for Image Reshaping with Soft Condition," published in IEEE Access on September 28, 2018.

FIG. 5 is a diagram illustrating the structure of a plurality of neural networks trained with images resized at three scales in a parallel deep neural network apparatus according to an embodiment of the present invention.

Referring to FIG. 5, according to an embodiment of the present invention, the structure 500 of a plurality of neural networks trained with images resized at different scales generates scale neural networks, each consisting of an upper layer and a lower layer, by training on an object-level image 510, a mid-level image 520, and a part-level image 530 converted from the training image.

For example, the object-level image 510, the mid-level image 520, and the part-level image 530 may be images whose scale range is confined by setting β in Equation 1 to 1, 1.25, and 1.5, respectively.
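To make the effect of the three β values concrete, the sketch below evaluates an assumed form of Equation 1, S = α(O_min − T_min) + β(O_min + T_min)/2, and treats S as the shorter-side length of the intermediate image. Both the formula and that interpretation are assumptions drawn from the verbal description, and the sizes 500 and 224, the function names, and the clamp to T_min are hypothetical choices made only for this illustration.

```python
import numpy as np

def center_scale(o_min, t_min, beta):
    """Scale with the random term switched off (alpha = 0)."""
    return beta * (o_min + t_min) / 2.0

def sample_scale(o_min, t_min, beta, rng, sigma=0.5):
    """Draw one jittered scale under the assumed reading of Equation 1."""
    alpha = rng.normal(0.0, sigma)  # Gaussian jitter, mean 0, std 0.5
    s = alpha * (o_min - t_min) + center_scale(o_min, t_min, beta)
    # Assumption not stated in the source: keep the intermediate image
    # at least as large as the crop so that cropping remains possible.
    return max(s, float(t_min))

levels = {"object": 1.0, "mid": 1.25, "part": 1.5}
o_min, t_min = 500, 224  # hypothetical original and target shorter sides

centers = {name: center_scale(o_min, t_min, beta) for name, beta in levels.items()}
print(centers)  # {'object': 362.0, 'mid': 452.5, 'part': 543.0}

rng = np.random.default_rng(0)
draws = [sample_scale(o_min, t_min, levels["part"], rng) for _ in range(5)]
```

A larger β enlarges the intermediate image, so a crop of fixed size covers a smaller portion of the object, which matches the progression from object level to part level.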

That is, the object-level image 510, the mid-level image 520, and the part-level image 530 may correspond to images obtained by resizing the training image according to the plurality of image conversion levels.

According to an embodiment of the present invention, for the object-level image 510, the neural network structure 500 linearly resizes the training image at the scale generated by fixing β = 1 in Equation 1 and then crops it to the final image size. When the resulting object-level image 510 is input to the training neural network and the neural network consisting of an upper layer 511 and a lower layer 512 is trained, the upper layer 513 and the lower layer 514 of the object-level scale deep neural network may be generated.

For example, the neural network structure 500 uses the parameters of the upper layer 513 of the object-level scale deep neural network as the parameters of the upper layer 521 for training the mid-level neural network.

According to an embodiment of the present invention, for the mid-level image 520, the neural network structure 500 linearly resizes the training image at the scale generated by fixing β = 1.25 in Equation 1 and crops it to the final image size. The resulting mid-level image 520 is input to train the neural network consisting of the upper layer 521 and a lower layer 522, which generates the upper layer 523 and the lower layer 524 of the mid-level scale neural network.

Finally, for the part-level image 530, the neural network structure 500 linearly resizes the training image at the scale generated by fixing β = 1.5 in Equation 1 and crops it to the final image size. The resulting part-level image 530 is input to train the neural network consisting of an upper layer 531 and a lower layer 532, which generates the upper layer 533 and the lower layer 534 of the part-level scale neural network.

According to an embodiment of the present invention, the neural network structure 500 introduces randomness into the resizing of the input image, so that repeated resizing of the same image yields slightly different results each time. This may provide a jittering effect that keeps the neural network coefficients from falling into a local minimum when they are updated and prevents overfitting during coefficient learning.

Accordingly, by introducing randomness when resizing the input image, the present invention ensures that the converted result differs slightly each time the same image is resized, and can thereby provide a jittering effect that avoids local minima when the neural network coefficients are updated and prevents overfitting during training.

도 6은 본 발명의 일실시예에 따른 병렬 심층 신경망 장치를 설명하는 도면이다.6 is a diagram illustrating a parallel deep neural network apparatus according to an embodiment of the present invention.

도 6은 도 2의 병렬 심층 신경망 장치를 이용하여 생성된 복수의 배율 신경망을 이용하여 테스트 영상을 인식하는 병렬 심층 신경망 장치를 예시한다.6 illustrates a parallel deep neural network device for recognizing a test image using a plurality of scaled neural networks generated using the parallel deep neural network device of FIG. 2 .

도 6을 참고하면, 병렬 심층 신경망 장치(600)는 영상 크기 변환부(610), 배율 신경망 생성부(620), 스코어 산출부(630) 및 영상 인식부(640)를 포함한다.Referring to FIG. 6 , the parallel deep neural network apparatus 600 includes an image size converter 610 , a scale neural network generator 620 , a score calculator 630 , and an image recognizer 640 .

본 발명의 일실시예에 따르면 영상 크기 변환부(610)는 테스트 영상에 대하여 객체 전체를 포함할 수 있는 전체 배율을 결정하고, 결정된 전체 배율 범위 내에서 테스트 영상을 위한 서로 다른 복수의 배율을 결정하고, 결정된 서로 다른 복수의 배율에 따라 테스트 영상을 선형적으로 변환하여 중간 영상을 생성하고, 생성된 중간 영상 내에서 결정된 서로 다른 복수의 배율 별로 상기 테스트 영상의 크기를 변환할 수 있다.According to an embodiment of the present invention, the image size converter 610 determines a total magnification that can include the entire object for the test image, and determines a plurality of different magnifications for the test image within the determined total magnification range. and linearly transforms the test image according to a plurality of determined different magnifications to generate an intermediate image, and converts the size of the test image for each of a plurality of different magnifications determined in the generated intermediate image.

일례로, 영상 크기 변환부(610) 전체 배율에 따른 상기 테스트 영상의 가로 및 세로 길이의 최소값에 배율 신경망에 입력될 영상의 가로 및 세로 길이의 최소값의 합에 대한 절반에 배율의 범위를 결정하기 위한 배율요소(scaling factor)를 결합하여 테스트 영상을 위한 서로 다른 복수의 배율을 결정할 수 있다.As an example, the image size converter 610 determines the range of magnification at half the sum of the minimum values of the horizontal and vertical lengths of the image to be input to the magnification neural network to the minimum values of the horizontal and vertical lengths of the test image according to the overall magnification. A plurality of different magnifications for the test image may be determined by combining a scaling factor for the test image.

즉, 영상 크기 변환부(610)는 상기 수학식 1을 이용하여 테스트 영상의 크기를 변환하기 위한 서로 다른 복수의 배율을 설정하고, 설정된 배율에 따라 테스트 영상의 크기를 변환할 수 있다.That is, the image size converter 610 may set a plurality of different magnifications for converting the size of the test image using Equation 1, and may convert the size of the test image according to the set magnification.

According to an embodiment of the present invention, the scale neural network generation unit 620 provides scale neural networks trained at a plurality of different scales, as described with reference to FIG. 2.

In addition, the scale neural network generation unit 620 may share the parameters of the upper layer of the first scale neural network as base parameters for generating the remaining scale neural networks, and also as the parameters of the upper layers of the remaining scale neural networks.

As an example, the score calculation unit 630 may input the test image, resized according to the plurality of image transformation levels, into the plurality of scale neural networks and calculate a score for each of a plurality of classes from each network.

For example, the plurality of scale neural networks may include the first scale neural network and the remaining scale neural networks other than the first.

For example, a class may be any one of a plurality of categories into which the test image is to be classified.

For example, classes can be designated arbitrarily and may include objects with distinguishable parts, such as airplanes, cars, and birds, as well as foods without distinguishable parts, such as rice noodles, jjamppong, seolleongtang, beef salad, Caesar salad, and Greek salad.

According to an embodiment of the present invention, the image recognition unit 640 may connect the first scale neural network and the remaining scale neural networks in parallel and integrate the scores calculated by each network to recognize the class of the test image.

For example, when the scores calculated by the parallel-connected scale neural networks are integrated, an operation such as multiplication or addition may be used.

As an example, the image recognition unit 640 may add or multiply the scores calculated by each network to obtain final scores, and recognize the class with the largest final score as the class of the test image.

For example, if the final class scores are 1 for airplane, 3 for bicycle, and 5 for jjamppong, the test image may be recognized as jjamppong.
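The score integration described above can be illustrated with a short sketch. The function names `fuse_scores` and `recognize` and the per-network scores are hypothetical; the per-network values are chosen only so that their sums reproduce the final scores in the example (1, 3, and 5).

```python
import math


def fuse_scores(per_network_scores, mode="sum"):
    """Combine class scores from several networks by addition or multiplication."""
    classes = per_network_scores[0].keys()
    if mode == "sum":
        return {c: sum(s[c] for s in per_network_scores) for c in classes}
    if mode == "product":
        return {c: math.prod(s[c] for s in per_network_scores) for c in classes}
    raise ValueError(f"unknown fusion mode: {mode}")


def recognize(final_scores):
    """Return the class whose final score is largest."""
    return max(final_scores, key=final_scores.get)


# Hypothetical scores from three scale neural networks.
scores = [
    {"airplane": 0, "bicycle": 1, "jjamppong": 2},
    {"airplane": 1, "bicycle": 1, "jjamppong": 1},
    {"airplane": 0, "bicycle": 1, "jjamppong": 2},
]
final = fuse_scores(scores, mode="sum")
print(final, recognize(final))
```

With multiplication, a single network that assigns a class a score near zero effectively vetoes that class, whereas addition lets the other networks compensate.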

Accordingly, the present invention resizes training images at various scales and jointly uses a neural network trained on the characteristics of an object observed from a distance together with a neural network trained on the detailed feature information of the object observed up close. By learning the subtle differences between sub-objects that are distinguishable at different scales, it can improve object recognition performance.

In addition, the present invention constructs a multi-scale neural network (multi-scale CNN) by connecting in parallel different deep neural networks trained with images resized according to different scales, resizes a test image at each scale, and performs the final object classification by integrating the prediction scores obtained from the neural network of the corresponding scale.

FIGS. 7 and 8 are diagrams illustrating the neural network structure of a parallel deep neural network apparatus that recognizes a test image through a plurality of neural networks trained with images resized at different scales, according to an embodiment of the present invention.

FIG. 7 illustrates the neural network structure of a parallel deep neural network apparatus that recognizes a test image through a plurality of neural networks trained with images resized at different scales, and FIG. 8 illustrates a neural network structure in which, to reduce memory consumption in the structure of FIG. 7, the upper layer of the first scale neural network is shared as the upper layer of the second scale neural network and the third scale neural network.

Referring to FIG. 7, the neural network structure 700 according to an embodiment of the present invention receives a test image, resizes it into a first-scale image 710, and processes the first-scale image 710 through the upper layer 711 and the lower layer 712 of the first scale neural network.

In addition, the neural network structure 700 receives the test image, resizes it into a second-scale image 720, and processes the second-scale image 720 through the upper layer 721 and the lower layer 722 of the second scale neural network.

In addition, the neural network structure 700 receives the test image, resizes it into a third-scale image 730, and processes the third-scale image 730 through the upper layer 731 and the lower layer 732 of the third scale neural network.

Next, at the output stage 740, the neural network structure 700 connects in parallel the results for the first-scale image 710 through the third-scale image 730, calculates a score for each class from each network, and then integrates the calculated scores to determine the class of the test image. Here, the determined class may be indicated by a class label.

Referring to FIG. 8, the neural network structure 800 according to an embodiment of the present invention receives a test image, resizes it into a first-scale image 810, and processes the first-scale image 810 through the upper layer 811 and the lower layer 812 of the first scale neural network.

In addition, the neural network structure 800 receives the test image, resizes it into a second-scale image 820, and processes the second-scale image 820 through the upper layer 821 and the lower layer 822 of the second scale neural network.

In addition, the neural network structure 800 receives the test image, resizes it into a third-scale image 830, and processes the third-scale image 830 through the upper layer 831 and the lower layer 832 of the third scale neural network.

Here, unlike the neural network structure 700, the neural network structure 800 applies the same parameters as the upper layer 811 to the upper layer 821 and the upper layer 831.

That is, unlike the neural network structure 700, the neural network structure 800 shares the upper layer 811 with the second scale neural network and the third scale neural network.

Next, at the output stage 840, the neural network structure 800 connects in parallel the results for the first-scale image 810 through the third-scale image 830, calculates a score for each class from each network, and then integrates the calculated scores to determine the class of the test image. Here, the determined class may be indicated by a class label.

Compared with the neural network structure 700, the neural network structure 800 according to an embodiment of the present invention does not need to store the upper layers of all the scale neural networks, so it can significantly reduce memory consumption while minimizing degradation of object recognition performance on test images.

FIGS. 9 and 10 are diagrams illustrating the neural network structure of a parallel deep neural network apparatus that recognizes a test image through a plurality of neural networks trained with images resized at three scales, according to an embodiment of the present invention.

FIG. 9 illustrates the neural network structure of a parallel deep neural network apparatus that recognizes a test image through a plurality of neural networks trained with images resized at three scales, and FIG. 10 illustrates a neural network structure in which, to reduce memory consumption in the structure of FIG. 9, the upper layer of the object-level scale neural network is shared as the upper layer of the intermediate-level scale neural network and the part-level scale neural network.

Referring to FIG. 9, the neural network structure 900 according to an embodiment of the present invention performs testing on a test image by connecting three scale neural networks in parallel.

According to an embodiment of the present invention, in the neural network structure 900, the test image is resized at the object-level, intermediate-level, and part-level scales, and the resulting object-level image 910, intermediate-level image 920, and part-level image 930 are input to the respective scale neural networks.

Unlike training images, test images do not require jittering, so α in Equation 1 above is fixed at 0 when setting the scales for resizing the test image.

According to an embodiment of the present invention, the neural network structure 900 inputs the object-level image 910, obtained by resizing the test image with α = 0 and β = 1, into the upper layer 911 and the lower layer 912 of the scale neural network trained at the object level, and outputs a score for each class at the output stage.

In addition, the neural network structure 900 inputs the intermediate-level image 920, obtained by resizing the test image with α = 0 and β = 1.25, into the upper layer 921 and the lower layer 922 of the scale neural network trained at the intermediate level, and outputs the per-class scores of the intermediate-level scale neural network at the output stage.

Finally, the neural network structure 900 inputs the part-level image 930, obtained by resizing the test image with α = 0 and β = 1.5, into the upper layer 931 and the lower layer 932 of the scale neural network trained at the part level, and outputs the per-class scores of the part-level scale neural network at the output stage.

The neural network structure 900 computes final scores by adding or multiplying, class by class, the scores of the neural networks at each level, and recognizes the class with the largest final score as the class of the test image.

However, the neural network structure 900 must store all three sets of neural network parameters trained in the three-scale multi-scale CNN structure, and therefore requires a correspondingly large amount of memory.

Referring to FIG. 10, the neural network structure 1000 shares the upper layer 1011 of the scale neural network trained at the object level as the upper layer 1021 and the upper layer 1031 of the scale neural networks trained at the intermediate level and the part level.

Accordingly, when storing the parameters of the plurality of scale neural networks, the neural network structure 1000 stores only the parameters of the upper layer 1011, the lower layer 1012, the lower layer 1022, and the lower layer 1032, which significantly reduces memory consumption.
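The storage saving can be made concrete with a small calculation. This is a sketch only: the parameter counts below are hypothetical placeholders rather than measured values for any particular architecture, and `stored_parameters` is an illustrative name.

```python
def stored_parameters(upper, lowers, share_upper):
    """Count the parameters that must be stored for one network per scale.

    upper:  parameter count of one upper layer (identical size at every scale)
    lowers: parameter counts of the per-scale lower layers
    """
    if share_upper:
        # One shared upper layer plus every scale-specific lower layer.
        return upper + sum(lowers)
    # A separate upper layer for every scale.
    return upper * len(lowers) + sum(lowers)


# Hypothetical sizes: a 20M-parameter upper layer and three 3M-parameter lower layers.
separate = stored_parameters(20_000_000, [3_000_000] * 3, share_upper=False)
shared = stored_parameters(20_000_000, [3_000_000] * 3, share_upper=True)
print(separate, shared)
```

Under these assumed sizes the shared structure stores 29M parameters instead of 69M; the saving grows with the number of scales and with the fraction of parameters held in the upper layer.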

According to an embodiment of the present invention, the neural network structure 1000 inputs the object-level image 1010, obtained by resizing the test image with α = 0 and β = 1, into the upper layer 1011 and the lower layer 1012 of the scale neural network trained at the object level, and outputs a score for each class at the output stage.

In addition, the neural network structure 1000 inputs the intermediate-level image 1020, obtained by resizing the test image with α = 0 and β = 1.25, into the upper layer 1021, which is shared from the scale neural network trained at the object level, and the lower layer 1022, and outputs the per-class scores of the intermediate-level scale neural network at the output stage.

Finally, the neural network structure 1000 inputs the part-level image 1030, obtained by resizing the test image with α = 0 and β = 1.5, into the upper layer 1031, which is shared from the scale neural network trained at the object level, and the lower layer 1032, and outputs the per-class scores of the part-level scale neural network at the output stage.

That is, the neural network structure 1000 of FIG. 10 can significantly reduce memory consumption while providing an object recognition rate on test images similar to that of the neural network structure 900 of FIG. 9.

The object recognition accuracy on test images of the neural network structure 900 and the neural network structure 1000 is compared in Tables 1, 2, and 3 for objects, such as food, that have no separable parts and no object boundaries. The data set is the UECFood256 image database for Table 1, the Food101 image database for Table 2, and the VireoFood172 image database for Table 3.

In addition, the neural network structure 900, the neural network structure 1000, Prior Art 1, Prior Art 2, Prior Art 3, and Prior Art 4 are all based on the same CNN (convolutional neural network) architecture, ResNet50.

According to Table 1, for Prior Art 1, Prior Art 2, and Prior Art 3, it is difficult to measure the final average accuracy when no common part object exists, as with food.

In contrast, for the neural network structure 900 corresponding to FIG. 9 of the present invention, final average accuracies of 67.48% and 57.70% are measured at the object level and the part level, respectively.

In addition, for the neural network structure 1000 corresponding to FIG. 10 of the present invention, final average accuracies of 67.48% and 62.02% are measured at the object level and the part level, respectively.

Accordingly, the present invention can recognize the class of a test image even when no common part object exists.

Comparing the final average accuracies of the neural network structure 900 and the neural network structure 1000, the object-level accuracy is identical and the part-level accuracy is comparable (57.70% versus 62.02%), so memory consumption can be reduced by sharing the upper layer of the object-level scale neural network as the upper layer of the part-level scale neural network.

Meanwhile, when the neural network structure 900 corresponding to FIG. 9 of the present invention fuses the object-level and part-level results, the fused result is 70.36% for Top-1, which is similar to the object level, and 90.71% for Top-5, which is higher than the object level.

In addition, when the neural network structure 1000 corresponding to FIG. 10 of the present invention fuses the object-level and part-level results, the fused result is 71.75% for Top-1, which is similar to the object level, and 91.49% for Top-5, which is higher than the object level.

Comparing the fused results of the neural network structure 900 and the neural network structure 1000, the fused result of the neural network structure 1000 is relatively higher.
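For readers unfamiliar with the Top1 and Top5 figures quoted above: Top-k accuracy counts a prediction as correct when the true class is among the k highest-scoring classes. A minimal sketch of that metric follows; the function name `top_k_accuracy` and the toy scores are hypothetical and unrelated to the values in the tables.

```python
def top_k_accuracy(score_rows, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for scores, label in zip(score_rows, labels):
        # Class names sorted by descending score, truncated to the top k.
        ranked = sorted(scores, key=scores.get, reverse=True)[:k]
        hits += label in ranked
    return hits / len(labels)


# Toy fused scores for four test images over three classes.
rows = [
    {"a": 3, "b": 2, "c": 1},
    {"a": 1, "b": 3, "c": 2},
    {"a": 2, "b": 1, "c": 3},
    {"a": 3, "b": 2, "c": 1},
]
labels = ["a", "c", "c", "b"]
top1 = top_k_accuracy(rows, labels, k=1)
top2 = top_k_accuracy(rows, labels, k=2)
print(top1, top2)
```

Because the Top-k set always contains the Top-1 prediction, Top-5 accuracy can never be lower than Top-1 accuracy on the same data.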

According to Table 2, for Prior Art 1, Prior Art 2, Prior Art 3, and Prior Art 4, it is difficult to measure the final average accuracy when no common part object exists, as with food.

In contrast, for the neural network structure 900 corresponding to FIG. 9 of the present invention, final average accuracies of 84.24% and 76.02% are measured at the object level and the part level, respectively.

In addition, for the neural network structure 1000 corresponding to FIG. 10 of the present invention, final average accuracies of 84.24% and 78.34% are measured at the object level and the part level, respectively.

Meanwhile, when the neural network structure 900 corresponding to FIG. 9 of the present invention fuses the object-level and part-level results, the fused result is 85.33% for Top-1, which is similar to the object level, and 96.96% for Top-5, which is higher than the object level.

In addition, when the neural network structure 1000 corresponding to FIG. 10 of the present invention fuses the object-level and part-level results, the fused result is 86.21% for Top-1, which is similar to the object level, and 97.19% for Top-5, which is higher than the object level.

According to Table 3, for Prior Art 1, it is difficult to measure the final average accuracy when no common part object exists, as with food.

In contrast, for the neural network structure 900 corresponding to FIG. 9 of the present invention, final average accuracies of 88.01% and 82.38% are measured at the object level and the part level, respectively.

In addition, for the neural network structure 1000 corresponding to FIG. 10 of the present invention, final average accuracies of 88.01% and 84.87% are measured at the object level and the part level, respectively.

Meanwhile, when the neural network structure 900 corresponding to FIG. 9 of the present invention fuses the object-level and part-level results, the fused result is 88.87% for Top-1, which is similar to the object level, and 98.27% for Top-5, which is higher than the object level.

In addition, when the neural network structure 1000 corresponding to FIG. 10 of the present invention fuses the object-level and part-level results, the fused result is 89.72% for Top-1, which is similar to the object level, and 98.40% for Top-5, which is higher than the object level.

Considering the experimental data in Tables 1, 2, and 3: because the present invention uses a plurality of neural networks, one per scale, it would otherwise have to store the trained parameters for every scale. To overcome this drawback, the per-scale neural networks are trained so that the upper layer of the object-level neural network can be shared across all other scales, which minimizes memory consumption.

In addition, unlike the prior art, the present invention can more accurately recognize objects, such as food, that have no separable parts and no object boundaries.

FIGS. 11A and 11B are diagrams illustrating images resized by the image resizing unit of the parallel deep neural network apparatus according to an embodiment of the present invention.

FIGS. 11A and 11B illustrate images in which the image resizing unit of the parallel deep neural network apparatus according to an embodiment of the present invention has resized the full level of a food image, which may be a training image or a test image, to the object level and the part level.

FIG. 11A shows images in which the parallel deep neural network apparatus according to an embodiment of the present invention has resized the full level 1100 of noodles in a bowl to the object level 1110 and the part level 1120.

For example, the full level 1100 may correspond to the original image before resizing.

FIG. 11B shows images in which the parallel deep neural network apparatus according to an embodiment of the present invention has resized the full level 1130 of a salad on a dish to the object level 1140 and the part level 1150.

For example, the full level 1130 may correspond to the original image before resizing.

Accordingly, the present invention can learn an object in an image from training images resized at various scales, such as a global view and a local view.

For example, the global view of an object may correspond to the full level 1100 and the full level 1130, and the local view, or sub-object, may correspond to the object level 1110, the part level 1120, the object level 1140, and the part level 1150.

For example, the object level 1110, the part level 1120, the object level 1140, and the part level 1150 may look slightly different on each attempt as a result of scale jittering and random cropping.
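The run-to-run variation caused by scale jittering and random cropping can be sketched as below. This is an assumption-laden illustration: the exact form of the jitter term in Equation 1 is not shown in this section, so the jitter is modeled here as a small random offset added to the scaling factor, and the crop is assumed to be a window of the network input size placed uniformly at random. All names and sizes are hypothetical.

```python
import random


def jittered_short_side(base, factor, max_jitter, rng):
    """Pick a resized short side with a random offset on the scaling factor (assumed form)."""
    offset = rng.uniform(-max_jitter, max_jitter)
    return round((factor + offset) * base)


def random_crop_box(img_w, img_h, crop_w, crop_h, rng):
    """Return (left, top, right, bottom) of a random crop inside the image."""
    if crop_w > img_w or crop_h > img_h:
        raise ValueError("crop window is larger than the image")
    left = rng.randint(0, img_w - crop_w)
    top = rng.randint(0, img_h - crop_h)
    return left, top, left + crop_w, top + crop_h


rng = random.Random(0)  # fixed seed so the sketch is repeatable
side = jittered_short_side(base=240, factor=1.5, max_jitter=0.1, rng=rng)
box = random_crop_box(side, side, 224, 224, rng)
print(side, box)
```

Setting `max_jitter` to 0 removes the scale variation, which corresponds to the fixed-scale treatment of test images.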

FIGS. 12 and 13 are diagrams illustrating a method of operating the parallel deep neural network apparatus according to an embodiment of the present invention.

FIG. 12 illustrates an operation in which the method of operating the parallel deep neural network apparatus according to an embodiment of the present invention generates a plurality of scale neural networks by learning training images resized at different scales.

Referring to FIG. 12, in step 1201, the method of operating the parallel deep neural network apparatus according to an embodiment of the present invention resizes a training image at a plurality of different scales.

That is, the method may apply a plurality of different scales to the training image to resize it to a plurality of image transformation levels.

In step 1202, the method generates a plurality of scale neural networks using the training images resized at the plurality of scales.

That is, the method may generate a first scale neural network for a first level by connecting the upper layer of a neural network pre-trained on general images to a lower layer whose parameters are replaced according to the first level among the plurality of image transformation levels, and performing transfer learning. It may then generate the remaining scale neural networks for the remaining image transformation levels by connecting the upper layer of the first scale neural network to lower layers whose parameters are replaced according to each of the remaining levels, and performing transfer learning.
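The two-stage generation order described above can be sketched as follows. The sketch is schematic: `train_fn` stands in for an actual transfer-learning routine and is hypothetical, and keeping the upper layer fixed in the second stage is an assumption made here so that every network ends up with identical upper-layer parameters; the text does not state how the upper layer is treated during training.

```python
def build_scale_networks(pretrained_upper, levels, train_fn):
    """Train the first level from a pre-trained upper layer, then reuse its upper layer."""
    first, *rest = levels
    # Stage 1: pre-trained upper layer + a new lower layer for the first level.
    upper, lower = train_fn(level=first, upper=pretrained_upper, freeze_upper=False)
    networks = {first: (upper, lower)}
    # Stage 2: every remaining level reuses the first network's upper layer
    # (assumed frozen) and trains only its own lower layer.
    for level in rest:
        _, level_lower = train_fn(level=level, upper=upper, freeze_upper=True)
        networks[level] = (upper, level_lower)
    return networks


def fake_train(level, upper, freeze_upper):
    """Stand-in for real training: tags the layers instead of updating weights."""
    new_upper = upper if freeze_upper else dict(upper, tuned_on=level)
    return new_upper, {"level": level}


nets = build_scale_networks({"name": "pretrained"}, ["object", "intermediate", "part"], fake_train)
print(nets["part"][0] is nets["object"][0])
```

Because every entry holds a reference to the same upper-layer object, only one copy of the upper-layer parameters needs to be stored alongside the per-level lower layers.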

도 13은 본 발명의 일실시예에 따른 병렬 심층 신경망 장치의 동작 방법이 서로 다른 배율로 크기 변환된 훈련 영상을 학습하여 생성된 복수의 배율 신경망을 이용하여 테스트 영상을 인식하는 동작을 예시한다.13 illustrates an operation of recognizing a test image using a plurality of magnification neural networks generated by learning size-converted training images at different magnifications by the operating method of the parallel deep neural network apparatus according to an embodiment of the present invention.

Referring to FIG. 13, in step 1301, the method of operating the parallel deep neural network device according to an embodiment of the present invention converts the size of a test image at a plurality of different magnifications.

That is, the method of operating the parallel deep neural network device according to an embodiment of the present invention may apply a plurality of different magnifications to the test image, thereby converting the size of the test image into a plurality of image conversion levels.

In step 1302, the method of operating the parallel deep neural network device according to an embodiment of the present invention may input the test image, according to its converted magnification, to the plurality of scaled neural networks generated using the training images converted at the plurality of magnifications, and calculate scores for a plurality of classes from each network.

That is, the method of operating the parallel deep neural network device according to an embodiment of the present invention may input the test images size-converted according to the plurality of image conversion levels to the first scaled neural network and the remaining scaled neural networks, and calculate a score for each of a plurality of classes from each network.

In step 1303, the method of operating the parallel deep neural network device according to an embodiment of the present invention may connect the plurality of scaled neural networks in parallel and integrate the respectively calculated scores to recognize the class of the test image.

That is, the method of operating the parallel deep neural network device according to an embodiment of the present invention may connect the first scaled neural network and the remaining scaled neural networks in parallel, and recognize the class of the test image based on a final score calculated by adding or multiplying the scores respectively calculated in step 1302.
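The score integration of steps 1302 and 1303 can be illustrated with a short sketch. It assumes that each scaled neural network outputs one score per class in the same class order; the function names `fuse_scores` and `recognize` and the example class names are hypothetical.

```python
import math


def fuse_scores(per_network_scores, mode="sum"):
    """Combine per-class scores from several networks element-wise."""
    if not per_network_scores:
        raise ValueError("at least one score vector is required")
    n_classes = len(per_network_scores[0])
    if any(len(s) != n_classes for s in per_network_scores):
        raise ValueError("all score vectors must have the same length")
    # zip(*...) groups the scores that belong to the same class.
    if mode == "sum":
        return [sum(col) for col in zip(*per_network_scores)]
    if mode == "product":
        return [math.prod(col) for col in zip(*per_network_scores)]
    raise ValueError("mode must be 'sum' or 'product'")


def recognize(per_network_scores, class_names, mode="sum"):
    """Return the class with the largest final score, plus the final scores."""
    final = fuse_scores(per_network_scores, mode)
    best = max(range(len(final)), key=final.__getitem__)
    return class_names[best], final


# Illustrative scores from three scaled networks for three classes.
scores = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.4, 0.3],
]
label, final = recognize(scores, ["a", "b", "c"], mode="sum")
```

Summation behaves like averaging the networks, while multiplication penalizes a class more strongly when any single network assigns it a low score.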

The device described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable array (FPA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, a single processing device is sometimes described, but a person of ordinary skill in the art will appreciate that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. The software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. The software may be distributed over networked computer systems and stored or executed in a distributed manner. The software and data may be stored in one or more computer-readable recording media.

Although the embodiments have been described above with reference to a limited set of drawings, a person of ordinary skill in the art can make various modifications and variations based on the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or if components of the described systems, structures, devices, circuits, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.

100: parallel deep neural network device
110: image size conversion unit 120: scaled neural network generation unit

Claims

A parallel deep neural network device comprising:
an image size conversion unit configured to convert the size of a training image into a plurality of image conversion levels by applying a plurality of different magnifications to the training image; and
a scaled neural network generation unit configured to generate a first scaled neural network for a first level among the plurality of image conversion levels by connecting the upper layer of a neural network pre-trained on general images to a lower layer whose parameters are replaced according to the first level and performing transfer learning, and to generate remaining scaled neural networks for the remaining image conversion levels by connecting the upper layer of the first scaled neural network to lower layers whose parameters are replaced according to the respective remaining image conversion levels other than the first level and performing transfer learning.

The device of claim 1,
wherein the image size conversion unit determines an overall magnification at which the entire object in the training image can be included, and randomly determines the plurality of different magnifications within the range of the determined overall magnification.

The device of claim 2,
wherein the image size conversion unit generates intermediate images by linearly transforming the training image according to the determined plurality of different magnifications, and converts the size of the training image by randomly cropping, within each generated intermediate image, the training image for the corresponding magnification.

The device of claim 2,
wherein the image size conversion unit determines the plurality of different magnifications by adding (i) half the sum of the minimum of the horizontal and vertical lengths of the training image at the overall magnification and the minimum of the horizontal and vertical lengths of the image to be input to the scaled neural network, and (ii) a value generated by combining a random number, a scaling factor that determines the range of magnifications, and the value obtained by subtracting the minimum of the horizontal and vertical lengths of the image to be input to the scaled neural network from the minimum of the horizontal and vertical lengths of the training image at the overall magnification.

The device of claim 1,
wherein the scaled neural network generation unit uses the parameters of the upper layer of the first scaled neural network as base parameters for generating the remaining scaled neural networks, so that they are shared as the parameters of the upper layers of the remaining scaled neural networks.

The device of claim 1,
wherein the scaled neural network generation unit generates a second scaled neural network among the remaining scaled neural networks by connecting the upper layer of the first scaled neural network to a lower layer whose parameters are replaced according to a second level among the plurality of image conversion levels and performing transfer learning, generates a third scaled neural network among the remaining scaled neural networks by connecting the upper layer of the first scaled neural network to a lower layer whose parameters are replaced according to a third level among the plurality of image conversion levels and performing transfer learning, and generates a last scaled neural network among the remaining scaled neural networks by connecting the upper layer of the first scaled neural network to a lower layer whose parameters are replaced according to a last level among the plurality of image conversion levels and performing transfer learning.

The device of claim 6,
wherein, when the scaled neural network generation unit generates the remaining scaled neural networks, the parameters of the upper layer of the first scaled neural network and the parameters of the upper layers of the remaining scaled neural networks are stored identically.

The device of claim 1,
wherein the plurality of image conversion levels include at least two of: an object level corresponding to a magnification of the size of an object in the training image; a part level corresponding to a magnification of the size of a part of the object; and a mid level corresponding to a magnification intermediate between the object level and the part level.

The device of claim 1,
further comprising a score calculation unit configured to input test images size-converted according to the plurality of image conversion levels to the generated first scaled neural network and the generated remaining scaled neural networks, and to calculate scores for a plurality of classes from each network.

The device of claim 9,
wherein the image size conversion unit determines an overall magnification at which the entire object in the test image can be included, determines a plurality of different magnifications for the test image within the range of the determined overall magnification, generates intermediate images by linearly transforming the test image according to the determined plurality of different magnifications, and converts the size of the test image for each of the determined magnifications within the generated intermediate images.

The device of claim 10,
wherein the image size conversion unit determines the plurality of different magnifications for the test image using (i) half the sum of the minimum of the horizontal and vertical lengths of the test image at the overall magnification and the minimum of the horizontal and vertical lengths of the image to be input to the scaled neural network, and (ii) a scaling factor that determines the range of magnifications.

The device of claim 9,
further comprising an image recognition unit configured to recognize the class of the test image by connecting the generated first scaled neural network and the generated remaining scaled neural networks in parallel and integrating the respectively calculated scores.

The device of claim 12,
wherein the image recognition unit recognizes, as the class of the test image, the class corresponding to the largest of the final scores calculated by adding or multiplying the respectively calculated scores.

The device of claim 12,
wherein the class is any one of a plurality of categories into which the test image is to be classified.

A method of operating a parallel deep neural network device, the method comprising:
converting, in an image size conversion unit, the size of a training image into a plurality of image conversion levels by applying a plurality of different magnifications to the training image;
generating, in a scaled neural network generation unit, a first scaled neural network for a first level among the plurality of image conversion levels by connecting the upper layer of a neural network pre-trained on general images to a lower layer whose parameters are replaced according to the first level and performing transfer learning; and
generating, in the scaled neural network generation unit, remaining scaled neural networks for the remaining image conversion levels by connecting the upper layer of the first scaled neural network to lower layers whose parameters are replaced according to the respective remaining image conversion levels other than the first level and performing transfer learning.

The method of claim 15,
wherein converting the size of the training image comprises:
determining an overall magnification at which the entire object in the training image can be included, and randomly determining the plurality of different magnifications within the range of the determined overall magnification; and
generating intermediate images by linearly transforming the training image according to the determined plurality of different magnifications, and converting the size of the training image by randomly cropping, within each generated intermediate image, the training image for the corresponding magnification.

The method of claim 15,
wherein generating the remaining scaled neural networks for the remaining image conversion levels comprises:
generating a second scaled neural network among the remaining scaled neural networks by connecting the upper layer of the first scaled neural network to a lower layer whose parameters are replaced according to a second level among the plurality of image conversion levels and performing transfer learning;
generating a third scaled neural network among the remaining scaled neural networks by connecting the upper layer of the first scaled neural network to a lower layer whose parameters are replaced according to a third level among the plurality of image conversion levels and performing transfer learning; and
generating a last scaled neural network among the remaining scaled neural networks by connecting the upper layer of the first scaled neural network to a lower layer whose parameters are replaced according to a last level among the plurality of image conversion levels and performing transfer learning.

The method of claim 15, further comprising:
inputting test images size-converted according to the plurality of image conversion levels to the generated first scaled neural network and the generated remaining scaled neural networks, and calculating scores for a plurality of classes from each network; and
recognizing the class of the test image by connecting the generated first scaled neural network and the generated remaining scaled neural networks in parallel and integrating the respectively calculated scores.