KR102120669B1

KR102120669B1 - Image classification apparatus using feedback method for reducing uncertainty of image classification result in neural net architectures and method thereof

Info

Publication number: KR102120669B1
Application number: KR1020180121227A
Authority: KR
Inventors: 원치선
Original assignee: 동국대학교 산학협력단
Priority date: 2018-10-11
Filing date: 2018-10-11
Publication date: 2020-06-10
Also published as: KR20200045031A

Abstract

The present invention relates to a method and apparatus that improve classification accuracy. Before the final classification is made at the output of a neural network, the uncertainty is checked from the per-class probability values; if the uncertainty is high, the original image is preprocessed in a different way to generate a new image, different from the previous one, which is then input to the neural network. This process is repeated until the uncertainty of the accumulated value, which combines the network output for the new input image with the previous outputs, becomes sufficiently low, so that the final class is assigned with reduced uncertainty.

Description

An image classification apparatus and method employing a feedback structure for reducing the uncertainty of image classification results in a neural network structure

The present invention relates to an image processing apparatus and method, and more specifically to an apparatus and method that, in order to reduce the uncertainty of a neural network classification result, preprocess a given original image in various ways and use the neural network outputs obtained repeatedly from those variants for the final classification.

Recently, the use of neural network structures with deep layers, such as convolutional neural networks (CNNs), has rapidly improved the performance of image classification and recognition. Transfer learning is widely used with deep networks such as AlexNet, VGG-19, and DenseNet that have been pre-trained on large amounts of image data: the trained coefficients of the front layers are used as they are, and the rear layers are retrained with image data from a specific field. Because most of these pre-trained networks accept an input image of fixed size, an image preprocessing step is needed to convert original images of various sizes to the network's input size. When the size of the original image differs from the input size, this resizing step unavoidably loses or distorts part of the image. As a result, a network that receives an image with missing or distorted information may produce an uncertain decision or an error.

Meanwhile, the number of values output by the last fully-connected layer of the neural network equals the number of classes, and each value can be used as the probability that the input belongs to the corresponding class. That is, the input is finally assigned to the class with the largest probability among all output terminals. If the probability of the chosen class is far larger than those of all other classes, the classification has low uncertainty and high certainty. Conversely, if the uncertainty of the output probabilities is high, the probability of the chosen class is not markedly larger than those of the other classes, and the classification may be unreliable. In that case the input is still assigned to the class with the largest probability, but a classification error is likely.

The present invention aims to provide an image classification apparatus and method that, before outputting the final classification result from the probability values obtained at the output of the neural network, compute the entropy of those output probabilities to check the uncertainty, and lower the uncertainty of the final classification result as far as possible.

According to one aspect of the present invention, there is provided an image classification apparatus using a neural network structure, comprising: a pre-trained neural network that is used to classify image data and outputs, through a plurality of output terminals, probability values for class classification as the output for an input image; a first image preprocessor that performs image preprocessing to adjust the size of an input original image to the input size required by the neural network; an uncertainty calculator that calculates the uncertainty of the class classification result from the probability values output through the output terminals of the neural network; a feedback determiner that requests feedback for re-performing the class classification when, according to the uncertainty calculator, the uncertainty of the class classification result exceeds a predetermined reference value; and a second image preprocessor that, when the feedback determiner determines that class classification through the neural network must be re-performed, preprocesses the original image by a method different from the one used in the first image preprocessor, so that the result can be input to the neural network.

Here, the first image preprocessor may perform linear resize-cropping preprocessing: it linearly resizes the original image, preserving its aspect ratio, to the size closest to the input size required by the neural network, and then crops the resized image about its center so that it matches the input size of the neural network.
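The geometry of this two-step resize and center crop can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name and the (height, width) ordering are hypothetical, and "closest to the input size" is assumed to mean the smallest aspect-preserving scale at which the resized image still covers the network input.

```python
def resize_then_center_crop_box(n1, n2, m1, m2):
    """Return the resized size (N1', N2') and the center-crop box for an n1 x n2 image.

    Sizes are (height, width); the box is (top, left, bottom, right) in the
    resized image and always has size m1 x m2.
    """
    # Assumption: use the smallest scale at which the resized image covers m1 x m2.
    scale = max(m1 / n1, m2 / n2)
    r1 = max(m1, round(n1 * scale))  # resized height N1'
    r2 = max(m2, round(n2 * scale))  # resized width N2'
    top = (r1 - m1) // 2             # center the crop window vertically
    left = (r2 - m2) // 2            # and horizontally
    return (r1, r2), (top, left, top + m1, left + m2)
```

For a landscape image only the left and right margins are cropped away, which is exactly the information loss that the feedback structure later tries to recover.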

Here, each time the feedback determiner issues a feedback request, the second image preprocessor may generate a new image that fits the input size of the neural network by using a preprocessing method different from any applied previously.

Here, the second image preprocessor may perform the linear resize-cropping preprocessing while setting the scale of the linear resizing and the cropping position differently each time the number of feedback requests increases.
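One way to vary the scale and crop position with the feedback count n is a fixed schedule, sketched below. The text does not specify which scales or positions are used, so the enlargement factors, the center-plus-corners anchor set, and the name `preprocessing_plan` are all hypothetical choices made for illustration.

```python
def preprocessing_plan(n, n1, n2, m1, m2):
    """Return (resized_size, crop_box) for feedback count n >= 1.

    Sizes are (height, width); the box is (top, left, bottom, right).
    """
    # Hypothetical schedule: every factor is > 1 so no plan repeats the
    # initial "scale 1, center crop" preprocessing.
    scales = (1.15, 1.3, 1.5)
    # Hypothetical crop anchors as fractions of the free margin: center, then corners.
    anchors = ((0.5, 0.5), (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0))
    idx = n - 1
    factor = scales[(idx // len(anchors)) % len(scales)]
    fy, fx = anchors[idx % len(anchors)]
    base = max(m1 / n1, m2 / n2)              # covering scale, aspect ratio preserved
    r1 = max(m1, round(n1 * base * factor))   # resized height
    r2 = max(m2, round(n2 * base * factor))   # resized width
    top = round((r1 - m1) * fy)
    left = round((r2 - m2) * fx)
    return (r1, r2), (top, left, top + m1, left + m2)
```

With three factors and five anchors the schedule yields fifteen different inputs before it wraps around, which is more than enough if the feedback count is capped at a small threshold.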

Here, while adjusting the image to the input size of the neural network by the linear resize-cropping technique, the first image preprocessor and the second image preprocessor may also apply image effects that improve image quality or emphasize edges.

Here, the feedback determiner may stop the feedback for re-performing the class classification when the number of feedback requests exceeds a predetermined threshold, even if the uncertainty of the class classification result exceeds the reference value.

Here, when a new output value is obtained from the neural network in response to a feedback request, the uncertainty calculator may update the accumulated output value to the average that includes the new output value, and then recalculate the uncertainty of the class classification result from the updated values.
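The accumulation step can be sketched as a running mean. The text only says that the new output is added to the accumulated value and averaged; the sketch below assumes an equal-weight mean over all outputs seen so far, and the function name `accumulate` is hypothetical.

```python
def accumulate(y, x, n):
    """Fold the new network output x into the accumulated probabilities y.

    n is the feedback count: n == 0 means x comes from the initial
    preprocessing, so y is simply initialized to x.
    """
    if n == 0 or y is None:
        return list(x)
    # Assumption: y is the mean of the n earlier outputs, so the updated
    # value is the equal-weight mean of all n + 1 outputs.
    return [(n * yk + xk) / (n + 1) for yk, xk in zip(y, x)]
```

Because each network output sums to one, the mean of such outputs also sums to one, so the accumulated vector can still be read as a probability distribution.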

Here, among the neural network outputs, comprising the initially obtained output and the outputs for images generated by new preprocessing in response to feedback requests, the uncertainty calculator may discard the outputs with high uncertainty and use only the outputs with low uncertainty in the uncertainty calculation.
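A possible form of this selection is sketched below. The text does not state how "high" and "low" uncertainty outputs are separated, so keeping a fixed number of lowest-entropy outputs is an assumption, as are the names `entropy` and `mean_of_confident_outputs`.

```python
import math

def entropy(p):
    """Shannon entropy (natural log) of a probability vector; 0*log(0) is treated as 0."""
    return -sum(pk * math.log(pk) for pk in p if pk > 0.0)

def mean_of_confident_outputs(outputs, keep):
    """Average only the `keep` lowest-entropy outputs (assumed selection rule)."""
    selected = sorted(outputs, key=entropy)[:max(1, keep)]
    num_classes = len(selected[0])
    return [sum(o[k] for o in selected) / len(selected) for k in range(num_classes)]
```

The averaged vector returned here would then be passed to the uncertainty measure in place of the mean over all outputs.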

Here, to measure the uncertainty of the neural network output, the uncertainty calculator may calculate an entropy value or the ratio between the largest and the second-largest probability values.
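Both measures can be sketched in a few lines. The text does not fix the logarithm base or the direction of the ratio; the sketch assumes the natural logarithm and divides the second-largest probability by the largest, so that for both measures a larger value means higher uncertainty. The function names are hypothetical.

```python
import math

def entropy_uncertainty(p):
    """Shannon entropy of the class probabilities; 0 when one class has probability 1."""
    return -sum(pk * math.log(pk) for pk in p if pk > 0.0)

def top_two_ratio(p):
    """Second-largest probability divided by the largest (assumed direction).

    Lies in [0, 1] for a probability vector with at least two classes;
    values near 1 mean the two leading classes are nearly tied.
    """
    first, second = sorted(p, reverse=True)[:2]
    return second / first
```

The entropy looks at the whole distribution, while the ratio only compares the two leading classes, so the two measures need different reference values.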

According to another aspect of the present invention, there is provided an image classification method in an image classification apparatus using a neural network structure, comprising: a first image preprocessing step of performing image preprocessing to adjust the size of an input original image to the input size required by the neural network; a step in which the neural network outputs, through a plurality of output terminals, probability values for class classification as the output for the image input to the neural network; a step of calculating the uncertainty of the class classification result from the probability values output through the output terminals of the neural network; a step of requesting feedback for re-performing the class classification when, according to the calculation result, the uncertainty of the class classification result exceeds a predetermined reference value; and a second image preprocessing step of, when the feedback request requires class classification through the neural network to be re-performed, preprocessing the original image by a method different from the one used in the first image preprocessing step, so that the result can be input to the neural network.

According to the image classification apparatus and method of an embodiment of the present invention, when the uncertainty of the output for an image classification result is high, this fact is fed back to the input stage so that the original image is preprocessed in various ways. This diversifies the images applied to the neural network, and the outputs for these inputs are accumulated to reach the final decision. As a result, the information in the original image is reflected more fully, and a classification result with improved accuracy can be expected.

FIG. 1 is an overall configuration diagram of a neural network (CNN) classifier image processing system with a feedback structure according to an embodiment of the present invention.
FIG. 2 is a detailed configuration diagram of the image preprocessor that is applied first to a given image according to an embodiment of the present invention.
FIG. 3 is a detailed configuration diagram of the image preprocessor that performs image preprocessing in various ways when feedback occurs according to an embodiment of the present invention.
FIG. 4 is a detailed configuration diagram of the switch that selects image preprocessing (1) or image preprocessing (2) depending on whether feedback has occurred according to an embodiment of the present invention.
FIG. 5 is a detailed configuration diagram of the uncertainty calculator that accumulates the output probability values of the neural network and obtains the uncertainty of the output by computing the entropy according to an embodiment of the present invention.
FIG. 6 is a detailed configuration diagram of the feedback determiner that decides whether to try a new preprocessing method or to make the final classification from the current accumulated output of the neural network according to an embodiment of the present invention.
FIG. 7 is a detailed configuration diagram of the final classifier that assigns the final class according to an embodiment of the present invention.

The present invention can be modified in various ways and can have various embodiments; specific embodiments are illustrated in the drawings and described in detail in the detailed description. This is not intended to limit the present invention to specific embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes that fall within its spirit and technical scope.

In describing the present invention, a detailed description of related known technology is omitted where it is judged that it would unnecessarily obscure the subject matter of the invention. In addition, ordinal terms used in this specification (for example, first, second) are merely identifiers that distinguish one component from another.

In addition, throughout the specification, when one component is described as being "connected" or "coupled" to another component, the one component may be directly connected or coupled to the other, but unless stated otherwise it should be understood that the connection may also be made through yet another component in between.

In addition, throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless stated otherwise. Terms such as "unit" and "module" used in the specification denote a unit that processes at least one function or operation, which may be implemented as one or more pieces of hardware or software, or as a combination of hardware and software.

FIGS. 1 to 7 are diagrams for explaining an image classification apparatus, and its method, that employ a feedback structure for reducing the uncertainty of image classification results in a neural network structure. Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 is an overall configuration diagram of a neural network (CNN) classifier image processing system with a feedback structure according to an embodiment of the present invention. Referring to FIG. 1, if the size of the input original image differs from the input size of the pre-trained neural network (30), the first image preprocessor (10) performs image preprocessing (1), resizing the original image to the network's input size. This preprocessing differs from image preprocessing (2), which the second image preprocessor (60) performs when the feedback determiner (50) requests feedback; the image selection switch (20) passes the selected preprocessed image to the input of the neural network (30). The uncertainty calculator (40) then outputs an uncertainty value (e) that indicates whether, among the probability values at the K class terminals of the network output, the value of one class is overwhelmingly larger than the others by a predetermined criterion. Based on e, if the uncertainty exceeds a predetermined level and the number of feedback iterations (n) does not exceed a predetermined value, the feedback determiner (50) requests feedback, and classification is retried with an image (2) preprocessed in a new way by the second image preprocessor (60). When, after repeating this process, the uncertainty falls below its reference value or the number of feedback iterations (n) exceeds its reference value, the final classifier (70) determines the class from the class probability values obtained so far.
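The flow of FIG. 1 can be summarized in a short sketch. Everything named here is hypothetical: `network`, `first_preprocess`, and `second_preprocess` stand in for blocks (30), (10), and (60) and are passed in as plain callables, `t_e` and `t_n` are the uncertainty and feedback-count reference values, entropy with the natural logarithm is assumed as the uncertainty measure, and the accumulation is assumed to be an equal-weight running mean.

```python
import math

def entropy(p):
    """Shannon entropy of a probability vector; 0*log(0) is treated as 0."""
    return -sum(pk * math.log(pk) for pk in p if pk > 0.0)

def classify_with_feedback(image, network, first_preprocess, second_preprocess, t_e, t_n):
    """Return (class index, accumulated probabilities, number of feedback rounds)."""
    y = list(network(first_preprocess(image)))   # initial pass, n = 0
    n = 0
    # Stop when n > t_n or e < t_e, as described for the feedback determiner;
    # otherwise request another round with a differently preprocessed image.
    while not (n > t_n or entropy(y) < t_e):
        n += 1
        x = network(second_preprocess(image, n))
        y = [(n * yk + xk) / (n + 1) for yk, xk in zip(y, x)]  # assumed running mean
    best = max(range(len(y)), key=y.__getitem__)  # final classifier: largest y(k)
    return best, y, n
```

Because the count is checked before it is incremented, this reading of the stopping rule allows up to `t_n + 1` feedback rounds; a stricter cap would compare with `n >= t_n` instead.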

FIG. 2 is a detailed configuration diagram of the image preprocessor (10) that is applied first to a given image according to an embodiment of the present invention. Referring to FIG. 2, one embodiment of image preprocessing (1) of FIG. 1 is the widely used two-step linear resize-cropping method: an original image of size N1xN2 is linearly resized (11) at a fixed ratio, preserving its aspect ratio, to a size N1'xN2' close to the network's input size, and the resized image is then cropped (12) about its center to the size M1xM2 of the network input (K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, arXiv preprint arXiv:1409.1556v6 (ICLR 2015), 2015).

FIG. 3 is a detailed configuration diagram of the image preprocessor (60) that performs image preprocessing in various ways when feedback occurs according to an embodiment of the present invention. FIG. 3 shows an example implementation of image preprocessing (2) of FIG. 1. Depending on the value of the variable n, which counts the feedback iterations, the image is first linearly resized at different scales (see, for example, blocks (61) and (63)), and each resized image is then cropped about various positions (see, for example, blocks (62) and (64)). The resulting images all have the same size M1xM2, but their contents differ, which yields a variety of input images. Image processing techniques such as sharpening the image or emphasizing edges (see, for example, block (66)) may also be applied at this stage.

FIG. 4 is a detailed configuration diagram of the switch (20) that selects image preprocessing (1) or image preprocessing (2) depending on whether feedback has occurred according to an embodiment of the present invention. The switch (20) uses the variable n, which counts the feedback iterations, to select either the image (2) preprocessed in response to a feedback request or the initial preprocessed image (1) obtained from the original image.

FIG. 5 is a detailed configuration diagram of the uncertainty calculator (40) that accumulates the output probability values of the neural network and obtains the uncertainty of the output by computing the entropy according to an embodiment of the present invention. Referring to FIG. 5, if the current neural network output (probability) values x(1), x(2), …, x(K) come from the initial preprocessed image (1) (that is, n=0), they are assigned as y(k)=x(k); otherwise, for an image preprocessed in response to a feedback request, the current output x(k) is added to the accumulated y(k) and y(k) is updated to the resulting average (see, for example, block (41)). The next step is to judge whether, among the class probabilities y(1), y(2), …, y(K) obtained in this way, the probability of one class is markedly larger than all the others (low uncertainty) or no class has such a prominent probability (high uncertainty). As one example of such a judgment, the entropy is computed (see, for example, block (42)), and the value e of Equation 1, which represents the uncertainty, is output.
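Equation 1 itself is not reproduced in this text. Assuming it is the standard Shannon entropy of the accumulated probabilities y(1), …, y(K), with the logarithm base left unspecified and the convention 0·log 0 = 0, it would take the following form, where the bounds follow from the properties of entropy rather than from the source:

```latex
e = -\sum_{k=1}^{K} y(k)\,\log y(k), \qquad 0 \le e \le \log K
```

Under this reading, e = 0 when a single class has probability 1 (no uncertainty), and e = log K when all K classes are equally likely (maximum uncertainty).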

The description above has focused on the embodiment of FIG. 5. In an embodiment of the present invention, however, the uncertainty calculator (40) may also, among the neural network outputs comprising the initially obtained output and the outputs for images generated by new preprocessing in response to feedback requests, discard the outputs with high uncertainty and use only the outputs with low uncertainty in the uncertainty calculation.

In addition, although the description above computes an entropy value to measure the uncertainty of the neural network output, various other methods may be employed, such as using the ratio between the largest and the second-largest probability values.

FIG. 6 is a detailed configuration diagram of the feedback determiner (50) that decides whether to try a new preprocessing method or to make the final classification from the current accumulated output of the neural network according to an embodiment of the present invention. Referring to FIG. 6, if the variable n, which counts the feedback requests so far, is greater than the preset reference value Tn, or the uncertainty e is less than the reference value Te, n is set to 0 to signal that no further feedback will be requested. Otherwise, additional feedback is requested and n is increased by 1.
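The decision rule of FIG. 6 is small enough to state directly in code. The function name `feedback_decision` and the parameter names `t_n` and `t_e` (standing for Tn and Te) are hypothetical; the comparisons follow the description above.

```python
def feedback_decision(n, e, t_n, t_e):
    """Return the next value of the feedback counter n.

    0 signals "no more feedback, go to the final classifier";
    any other value is the incremented count for the next feedback round.
    """
    if n > t_n or e < t_e:
        return 0       # count limit exceeded, or uncertainty already low enough
    return n + 1       # request one more round with a new preprocessing
```

Note that n = 0 doubles as the initial state and as the stop signal, so a caller needs to distinguish the two by context, for example by checking the returned value immediately after each call.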

FIG. 7 is a detailed configuration diagram of the final classifier (70) that assigns the final class according to an embodiment of the present invention. Referring to FIG. 7, when n=0 indicates that there are no further feedback requests, the final classification is performed: among the probability values y(1), y(2), …, y(K) accumulated so far, the class with the largest probability is chosen as the final class k*.

The image classification method according to the present invention described above can be implemented as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes every kind of recording medium that stores data readable by a computer system, for example read-only memory (ROM), random access memory (RAM), magnetic tape, magnetic disk, flash memory, and optical data storage devices. The computer-readable recording medium may also be distributed over computer systems connected through a computer network, and stored and executed as code readable in a distributed manner.

Although the present invention has been described above with reference to its embodiments, those of ordinary skill in the art will readily understand that the invention can be modified and changed in various ways without departing from the spirit and scope of the invention set forth in the claims below.

Claims

An image classification apparatus using a neural network structure, comprising:
a pre-trained neural network that is used to classify image data and outputs, through a plurality of output terminals, probability values for class classification as the output for an input image;
a first image preprocessor that performs image preprocessing to adjust the size of an input original image to the input size required by the neural network;
an uncertainty calculator that calculates the uncertainty of the class classification result from the probability values output through the output terminals of the neural network;
a feedback determiner that requests feedback for re-performing the class classification when, according to the calculation result of the uncertainty calculator, the uncertainty of the class classification result exceeds a predetermined reference value; and
a second image preprocessor that, when the feedback determiner determines that class classification through the neural network must be re-performed, preprocesses the original image by a method different from the image preprocessing method used in the first image preprocessor, so that the result can be input to the neural network.

The image classification apparatus according to claim 1,
wherein the second image preprocessor,
whenever a feedback request occurs through the feedback determiner, generates a new image matching the input size of the neural network using a preprocessing method different from any preprocessing method applied previously.

The image classification apparatus according to claim 2,
wherein the second image preprocessor
performs image preprocessing by a linear scaling-cropping technique, setting the scale of the linear scaling and the cropping position differently each time the number of feedback requests increases.
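
As an illustration only, the linear scaling-cropping technique with a different scale and cropping position per feedback request might be parameterized as in the following Python sketch. The function name `scale_crop_params`, the scale step of 0.125 per feedback request, and the center-then-corners position cycle are assumptions made for this example; the claims do not specify any particular schedule.

```python
def scale_crop_params(width, height, input_size, feedback_count):
    """Return (scaled_w, scaled_h, left, top) for one preprocessing pass.

    The original image (width x height) is scaled linearly so that its
    shorter side is at least `input_size`, then a square window of
    `input_size` pixels is cropped at offset (left, top).
    """
    # Assumed schedule: each feedback request enlarges the image a bit more.
    scale_factor = 1.0 + 0.125 * feedback_count
    short_side = min(width, height)
    target_short = round(input_size * scale_factor)
    scaled_w = max(input_size, round(width * target_short / short_side))
    scaled_h = max(input_size, round(height * target_short / short_side))

    # Room left around the crop window after scaling.
    slack_x = scaled_w - input_size
    slack_y = scaled_h - input_size

    # Assumed position cycle: center first, then the four corners.
    anchors = [(0.5, 0.5), (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    ax, ay = anchors[feedback_count % len(anchors)]
    return scaled_w, scaled_h, round(slack_x * ax), round(slack_y * ay)


if __name__ == "__main__":
    # First pass (feedback_count 0) and three feedback passes for a 640x480 image.
    for k in range(4):
        print(k, scale_crop_params(640, 480, 224, k))
```

Because the scale grows with every request, the crop window covers a different region of the original image on each pass even when the position cycle repeats.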

The image classification apparatus according to claim 3,
wherein the first image preprocessor and the second image preprocessor,
in the process of adjusting the image to the input size of the neural network by the linear scaling-cropping technique, apply an image special effect for image quality improvement or edge enhancement.

The image classification apparatus according to claim 1,
wherein the feedback determiner,
when the number of feedback requests exceeds a predetermined threshold number, stops the feedback for re-executing the class classification even if the uncertainty of the class classification result exceeds the reference value.
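
The decision made by the feedback determiner can be sketched in Python as follows. The function name `should_request_feedback` and its parameter names are hypothetical, and treating the threshold as the maximum number of requests allowed (so that feedback stops once `feedback_count` reaches `max_feedback`) is an assumption about the boundary case, which the claim leaves open.

```python
def should_request_feedback(uncertainty, reference, feedback_count, max_feedback):
    """Decide whether the class classification should be re-executed.

    uncertainty    -- uncertainty of the current classification result
    reference      -- predetermined reference value for the uncertainty
    feedback_count -- number of feedback requests issued so far
    max_feedback   -- predetermined threshold number of requests (assumed cap)
    """
    # Stop once the request budget is used up, however uncertain the result is.
    if feedback_count >= max_feedback:
        return False
    # Otherwise request feedback only while the uncertainty is too high.
    return uncertainty > reference
```

For example, with `reference=0.5` and `max_feedback=3`, an uncertainty of 0.9 triggers feedback on the first three calls and is accepted as final afterwards.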

The image classification apparatus according to claim 1,
wherein the uncertainty calculator,
when a new output value is derived from the neural network in response to the feedback request, updates the previously accumulated output values to an average value and recalculates the uncertainty of the class classification result based on the updated values.
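
One way to keep such an average without storing every earlier output is an incremental mean, sketched below in Python. The function name `update_mean` and the choice of a plain arithmetic mean over all passes are assumptions for illustration; the claim only states that the accumulated values are updated to an average value.

```python
def update_mean(mean, count, new_probs):
    """Fold one more output vector into the mean of `count` earlier ones.

    mean      -- per-class average of the outputs seen so far
    count     -- how many outputs that average already covers (>= 1)
    new_probs -- per-class probabilities from the latest feedback pass
    """
    # Incremental form of the arithmetic mean: m + (p - m) / (count + 1).
    return [m + (p - m) / (count + 1) for m, p in zip(mean, new_probs)]


if __name__ == "__main__":
    first = [0.4, 0.6]           # output for the initially preprocessed image
    second = [0.8, 0.2]          # output after one feedback request
    print(update_mean(first, 1, second))
```

The uncertainty is then recalculated on the returned averaged vector rather than on the latest output alone.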

The image classification apparatus according to claim 6,
wherein the uncertainty calculator,
among the neural network output values for the images generated by new preprocessing in response to feedback requests, together with the initially obtained neural network output values, does not use output values with high uncertainty and selects only output values with low uncertainty for use in calculating the uncertainty.

The image classification apparatus according to claim 6,
wherein the uncertainty calculator
calculates an entropy value, or the ratio of the largest probability value to the second-largest probability value, to measure the uncertainty of the output values of the neural network.
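
The two measures named in this claim can be computed as in the following Python sketch. The function names `entropy` and `top2_ratio` are chosen for this example, and the natural logarithm is an assumption, since the claim does not fix the base. Note that the two measures point in opposite directions: a high entropy indicates high uncertainty, whereas a ratio close to 1 indicates high uncertainty and a large ratio indicates a confident result.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    # Terms with p == 0 contribute nothing and are skipped to avoid log(0).
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def top2_ratio(probs):
    """Ratio of the largest probability to the second-largest one."""
    first, second = sorted(probs, reverse=True)[:2]
    # A zero runner-up means the result is as certain as it can be.
    return first / second if second > 0.0 else math.inf


if __name__ == "__main__":
    confident = [0.90, 0.05, 0.05]
    ambiguous = [0.40, 0.35, 0.25]
    print(entropy(confident), top2_ratio(confident))
    print(entropy(ambiguous), top2_ratio(ambiguous))
```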

An image classification method in an image classification apparatus using a neural network structure, comprising:
A first image preprocessing step of performing image preprocessing to adjust the size of an input original image to the input size required by the neural network;
A step of outputting, from the neural network, probability values for class classification through a plurality of output terminals as output result values for the input image input to the neural network;
A step of calculating the uncertainty of a class classification result based on the probability values output through each output terminal of the neural network;
A step of requesting feedback for re-executing the class classification when, according to the calculation result, the uncertainty of the class classification result exceeds a predetermined reference value; and
A second image preprocessing step of, when the class classification must be re-executed through the neural network according to the feedback request, performing on the original image an image preprocessing different from the image preprocessing method performed in the first image preprocessing step, so that the result can be input to the neural network.
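
The steps of the method can be tied together as in the following minimal Python sketch, given under stated assumptions rather than as the claimed implementation. The name `classify_with_feedback`, the callables `network` and `preprocessors`, and the use of entropy over a running average are illustrative choices; the averaging is borrowed from the apparatus claims, and the loop simply ends when the hypothetical list of preprocessing methods is exhausted.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def classify_with_feedback(image, network, preprocessors, reference):
    """Classify `image`, re-running with new preprocessing while uncertain.

    network       -- callable: preprocessed image -> list of class probabilities
    preprocessors -- callables; the first is the first preprocessing step,
                     the rest are alternatives used on feedback requests
    reference     -- predetermined reference value for the entropy
    Returns (class index, averaged probabilities, number of passes).
    """
    if not preprocessors:
        raise ValueError("at least one preprocessing method is required")
    mean = None
    passes = 0
    for preprocess in preprocessors:
        passes += 1
        probs = network(preprocess(image))
        if mean is None:
            mean = list(probs)
        else:
            # Running average over all passes so far.
            mean = [m + (p - m) / passes for m, p in zip(mean, probs)]
        # No feedback is requested once the uncertainty is low enough.
        if entropy(mean) <= reference:
            break
    label = max(range(len(mean)), key=mean.__getitem__)
    return label, mean, passes


if __name__ == "__main__":
    # Toy stand-ins: strings act as "images", a lookup table acts as the network.
    table = {
        "cat:resize": [0.40, 0.35, 0.25],
        "cat:crop1": [0.90, 0.05, 0.05],
        "cat:crop2": [0.95, 0.03, 0.02],
    }
    steps = [lambda s: s + ":resize", lambda s: s + ":crop1", lambda s: s + ":crop2"]
    print(classify_with_feedback("cat", table.__getitem__, steps, reference=1.0))
```

In the toy run, the first pass is too uncertain, so one feedback pass with a different preprocessing is made before the final class is chosen from the averaged probabilities.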