KR102119687B1

KR102119687B1 - Learning Apparatus and Method of Image

Info

Publication number: KR102119687B1
Application number: KR1020200025785A
Authority: KR
Inventors: 이근신; 이종선; 전문구; 이윤관
Original assignee: 엔에이치네트웍스 주식회사; 광주과학기술원
Priority date: 2020-03-02
Filing date: 2020-03-02
Publication date: 2020-06-05

Abstract

The present invention relates to an apparatus and a method of learning an image. The apparatus includes: an auto-encoder for learning input data by extracting important features from the input data through an encoder when the input data is inputted through a neural network and restoring the extracted important features through a decoder; a classifier network that receives the important features from the encoder and classifies a current video image according to weather conditions; and a loss calculator configured to calculate a first loss value generated when a restored improved image is obtained after the input data passes through the encoder and the decoder through the neural network, a second loss value for essential preservation information to be preserved when restored after the input data passes through the encoder and the decoder, a third loss value generated when the input data passes through the encoder and the classifier network, and a fourth loss value for essential preservation information to be preserved during classification in the classifier network, update weights of the encoder and decoder by backpropagating the calculated first and second loss values, update the weight of the classifier network by backpropagating the third loss value, and update the weight of the encoder by backpropagating the fourth loss value.

Description

Video image learning apparatus and method {Learning Apparatus and Method of Image}

The present invention relates to a video image learning apparatus and method, and in particular to a video image learning apparatus and method in which a single model can be trained even when the inputs cover a variety of weather conditions such as rain, snow, fine dust, and fog, and which improves image enhancement by minimizing the loss incurred during image restoration while preserving essential information as much as possible.

In general, it is easy for humans to interpret what they see in context. Because people have learned a great deal about objects and scenes through experience, when they see something moving in the middle of a lake they can narrow the possibilities down to a fish, a boat, or the like, and they can recognize that if it is a person, that person is probably in danger.

Artificial intelligence (AI) technology that applies this concept has recently been developed and is now used in practice in smart video surveillance systems, autonomous vehicles, medical imaging, and other fields.

Among autonomous-vehicle (or intelligent video surveillance system) technologies, computer vision that detects objects with a camera already achieves high performance using deep learning. However, its operating environment is limited to clear conditions (i.e., clear weather), and it does not work properly in bad weather.

To compensate for poor object detection in bad weather, image enhancement techniques exist that improve an image by removing rain, snow, fog, and the like from it. Conventionally, however, a separate, independent enhancement algorithm is needed for each weather condition, which is costly.

In addition, even an image that needs enhancement contains image information that must be preserved. Conventional enhancement damages this essential information, so the resulting image is incomplete.

Korean Registered Patent No. 10-1748780
Korean Registered Patent No. 10-1998027

Accordingly, the present invention is intended to solve the above problems. Its object is to provide a video image learning apparatus and method in which a single model can be trained even when the inputs cover a variety of weather conditions such as rain, snow, fine dust, and fog, and which improves image enhancement by minimizing the loss incurred during image restoration while preserving essential information as much as possible.

To achieve the above object, the present invention includes: an autoencoder that, when input data is input through a neural network, extracts important features from the input data through an encoder and restores the extracted features through a decoder, thereby learning the input data; a classifier network that receives the important features from the encoder and classifies the current video image by weather condition; and a loss calculator that calculates a first loss value generated when the input data passes through the encoder and the decoder to yield a restored, enhanced image, a second loss value for essential preservation information that must be preserved when the input data is restored through the encoder and the decoder, a third loss value generated when the input data passes through the encoder and the classifier network, and a fourth loss value for essential preservation information that must be preserved during class classification in the classifier network, and that then backpropagates the first and second loss values to update the weights of the encoder and the decoder, backpropagates the third loss value to update the weights of the classifier network, and backpropagates the fourth loss value to update the weights of the encoder.

According to the present invention, the classifier network is trained according to Equation 1 below.

[Equation 1]

Here, θ_f is the feature parameter of the encoder function and θ_h is the feature parameter of the classifier network function; the expectation is taken over the input data probability distribution p(x), with the input value sampled from that distribution; L_c is the cross-entropy loss function; and Y is the weather-condition class label.

According to the present invention, the first to fourth loss values are expressed as in Equation 2.

[Equation 2]

Here, θ_f is the feature parameter of the encoder function, θ_g is the feature parameter of the decoder function, and θ_h is the feature parameter of the classifier network function. The expectation is taken over the input data probability distribution p(x); P_x(X) is the probability distribution of the low-quality data; and L_c is the cross-entropy loss function. The remaining symbols denote the clear image corresponding to the sampled input, the value sampled from the input data probability distribution, the bias level of that sampled value, and the bias value predicted from an auxiliary distribution Q. L_B is the cross-entropy for bias classification, and α, β, and γ are balancing coefficients with values from 0.05 to 0.2.

According to the present invention, the relationship among the first to fourth loss values is expressed as in Equation 3.

[Equation 3]

Here, T(x) denotes the overall weight delivered to the autoencoder and the classifier network.

A video image learning method according to an embodiment of the present invention includes: extracting, by an encoder, important features from input data, and restoring and outputting, by a decoder, the important features extracted by the encoder; calculating, by a loss calculator, a first loss value generated when the input data is restored after passing through the encoder and the decoder; calculating, by the loss calculator, a second loss value for essential preservation information that must be preserved when the input data is restored after passing through the encoder and the decoder; updating, by the autoencoder, the weights of the encoder and the decoder by backpropagating the first and second loss values; classifying, by a classifier network, the current input data by weather condition according to the important features delivered from the encoder; calculating, by the loss calculator, a third loss value generated when the input data passes through the encoder and the classifier network and is classified by weather condition; calculating, by the loss calculator, a fourth loss value for essential preservation information that must be preserved when the input data is classified by weather condition; updating, by the autoencoder, the weights of the classifier network by backpropagating the third loss value; and updating, by the autoencoder, the weights of the encoder by backpropagating the fourth loss value.

According to the present invention, a first loss value generated during restoration in the autoencoder, a second loss value for essential preservation information that must be preserved during that restoration, a third loss value generated when the important features are classified into weather-condition classes, and a fourth loss value for essential preservation information that must be preserved during that classification are calculated, and the weights are updated by backpropagating these loss values to the autoencoder and the classifier network. As a result, a single model can be trained even when a variety of weather conditions are given. When image-quality obstructions such as rain or fog are removed, trees, roads, and the like that are removed along with them can be restored from the essential preservation information, so the image can be enhanced to a clear picture even in bad weather. Image enhancement also remains effective when weather conditions that were not considered during initial training are input.

FIG. 1 is a diagram showing a video image learning apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram showing a method of obtaining an important feature map through convolution.
FIG. 3 is a diagram showing a video image learning method according to an embodiment of the present invention.
FIG. 4 is a diagram showing the image enhancement effect on an input image.

Hereinafter, preferred embodiments that a person of ordinary skill in the art to which the present invention pertains can readily carry out are described in detail with reference to the accompanying drawings. However, in describing the operating principle of the preferred embodiments, detailed descriptions of related well-known functions or configurations are omitted where they would unnecessarily obscure the gist of the invention.

FIG. 1 is a diagram showing a video image learning apparatus according to an embodiment of the present invention.

Referring to FIG. 1, a video image learning apparatus according to an embodiment of the present invention includes an autoencoder 10, a classifier network 20, and a loss calculator 30.

The autoencoder 10 learns the training data input through the neural network. When a video image (hereinafter, "input data") is input through the neural network, the autoencoder 10 extracts important features from the input data through the encoder 12, then restores and outputs the extracted features through the decoder 14. The autoencoder 10 is trained so that the output data approaches the input data.

When a video image is input through the neural network, the autoencoder 10 extracts an important feature map in the encoder 12 by convolution, as shown in FIG. 2, and restores the extracted features. Training proceeds with the weights updated by the backpropagation algorithm according to the loss values calculated by the loss calculator 30 described below (that is, with the updated weights applied to the important feature map during extraction and restoration).
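As a rough illustration of the convolution step in FIG. 2, the sketch below slides a small kernel over a grayscale image and collects the responses into a feature map. The function name `conv2d_valid`, the 3x3 kernel, and the toy image are assumptions made for illustration; the patent does not specify kernel sizes, strides, or padding.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    Like most deep-learning layers, this computes a cross-correlation:
    the kernel is not flipped before the sliding dot product.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


# Toy 4x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical 3x3 kernel that responds to left-to-right intensity increases.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
feature_map = conv2d_valid(image, kernel)  # 2x2 map of edge responses
```

In a trained encoder the kernel values are learned weights rather than hand-picked ones, and many kernels run in parallel to produce a multi-channel feature map.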

The classifier network (bias network) 20 receives the important feature map from the encoder 12 of the autoencoder 10 and classifies the weather condition of the current video image.

In other words, the classifier network 20 classifies the important feature map delivered from the encoder 12 into one of a plurality of classes such as clear, rain, snow, and fine dust.

The classifier network 20 consists of three convolution layers and two fully connected layers, and is trained according to Equation 1 below.

[Equation 1]

Here, the expectation is taken over the input data probability distribution p(x).
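Equation 1 trains the classifier with a cross-entropy loss L_c against the weather-condition class label Y. The sketch below computes such a loss for a single sample from raw classifier scores. The class list, the `softmax` and `cross_entropy` helpers, and the example scores are assumptions for illustration, not values from the patent.

```python
import math

CLASSES = ["clear", "rain", "snow", "fine_dust"]  # assumed class list


def softmax(scores):
    """Convert raw scores to probabilities (shifted by the max for numerical stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(scores, label_index):
    """Negative log-probability that the classifier assigns to the true class."""
    return -math.log(softmax(scores)[label_index])


# Hypothetical classifier scores that favor "rain".
scores = [0.2, 2.5, 0.1, -1.0]
loss_if_rain = cross_entropy(scores, CLASSES.index("rain"))
loss_if_snow = cross_entropy(scores, CLASSES.index("snow"))
```

The loss is small when the true label is the class the classifier already favors and large otherwise, which is what drives the weight update of the classifier network.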

The loss calculator 30 calculates four values: a first loss value generated when the input data passes through the encoder 12 and the decoder 14 to yield a restored, enhanced image (i.e., the reconstruction loss of the encoder and decoder); a second loss value for essential preservation information that must be preserved when the input data is restored through the encoder 12 and the decoder 14 (i.e., an information-theoretic optimization term for preserving image information); a third loss value generated when the input data passes through the encoder 12 and the classifier network 20 (i.e., the classification loss of the classifier network 20); and a fourth loss value for essential preservation information that must be preserved during class classification in the classifier network 20 (i.e., an information-theoretic optimization term for removing the influence of bias).

The first to fourth loss values are given by Equation 2.

[Equation 2]

Here, the symbols of Equation 2 include the clear image (i.e., the target image) corresponding to the sampled input, the value sampled from the input data probability distribution (e.g., the input image), and the bias level of that sampled value (i.e., its weather-condition class label).

In Equation 2, smaller first and third loss values are better, in order to minimize information loss, and larger second and fourth loss values are better, in order to preserve the essential preservation information. So that screen obstructions such as rain and fog are removed as much as possible while backgrounds such as trees and roads are preserved as much as possible, the video image learning apparatus of the present invention uses the overall weight (i.e., the objective function) of Equation 3, which combines the first to fourth loss values.

[Equation 3]

Here, T(x) denotes the overall weight (i.e., the objective function) delivered to the autoencoder and the classifier network.

After calculating the first to fourth loss values, the loss calculator 30 backpropagates the first and second loss values to update the weights of the encoder 12 and the decoder 14 of the autoencoder 10, backpropagates the third loss value to update the weights of the classifier network 20, and backpropagates the fourth loss value to update the weights of the encoder 12.
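The routing described above, in which each loss value updates only some of the weights, can be sketched with a toy scalar model. Everything here is an assumption for illustration: the one-parameter encoder `a`, decoder `b`, and classifier `c`, the squared-error losses, and the finite-difference gradients stand in for the real networks and for backpropagation. The second and fourth loss values are left out to keep the sketch short.

```python
def step(params, loss_fn, names, lr=0.05, eps=1e-6):
    """One gradient-descent step on `loss_fn` that changes only the parameters in `names`."""
    base = loss_fn(params)
    updated = dict(params)
    for name in names:
        bumped = dict(params)
        bumped[name] += eps
        grad = (loss_fn(bumped) - base) / eps  # forward finite difference
        updated[name] = params[name] - lr * grad
    return updated


x, target, label = 1.0, 2.0, 1.0  # toy input, clear-image target, class target


def reconstruction_loss(p):
    """Stand-in for the first loss value: decoder(encoder(x)) against the target."""
    return (p["b"] * p["a"] * x - target) ** 2


def classification_loss(p):
    """Stand-in for the third loss value: classifier(encoder(x)) against the label."""
    return (p["c"] * p["a"] * x - label) ** 2


params = {"a": 0.5, "b": 0.5, "c": 0.5}
after_recon = step(params, reconstruction_loss, ["a", "b"])  # encoder and decoder only
after_cls = step(after_recon, classification_loss, ["c"])    # classifier only
```

Restricting each step to a named subset of parameters is one simple way to express "this loss updates only these weights"; deep-learning frameworks achieve the same effect with separate optimizers or by freezing parameters.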

Meanwhile, the loss calculator 30 sets the fourth and second loss values, which are information-theoretic losses, as follows.

First, when the classifier network 20 recognizes class information such as clear, rain, snow, or fine dust, it can be regarded as learning bias information that is unnecessary for image enhancement. If the encoder 12 learns the weather-condition information, which is bias information, it may learn information that responds only to specific weather rather than the features needed for image enhancement. When the encoder 12 learns such information, it also propagates to the decoder 14, so the entire main framework may end up learning bias information. Therefore, the information theory of Equation 4 is applied when learning the bias information of Equation 1.

[Equation 4]

Here, I is the mutual information of information theory, and Y is the weather-condition class label.

As Equation 4 shows, the mutual information is greater than 0 when the two terms are related to each other and close to 0 when they are independent of each other. Because the weather information and the image enhancement network should be unrelated, I in Equation 4 must be trained toward 0.
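The behavior of mutual information described here can be checked numerically on a small discrete example. In the sketch below, the joint probability tables, the idea of a binarized encoder feature, and the helper name `mutual_information` are assumptions for illustration only.

```python
import math


def mutual_information(joint):
    """Mutual information (in nats) of two discrete variables from their joint probability table."""
    row_marginals = [sum(row) for row in joint]
    col_marginals = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # terms with zero probability contribute nothing
                mi += p * math.log(p / (row_marginals[i] * col_marginals[j]))
    return mi


# Rows: weather label (clear, rain). Columns: a binarized encoder feature.
independent = [[0.25, 0.25], [0.25, 0.25]]  # feature says nothing about the weather
dependent = [[0.5, 0.0], [0.0, 0.5]]        # feature reveals the weather exactly

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
```

The independent table yields zero mutual information, while the fully dependent table yields log 2 nats, the largest value possible for a binary label. Training the encoder toward the first situation is the stated goal for the weather information.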

Accordingly, the loss function (the fourth loss value) used in training the classifier network 20 is given by Equation 5.

[Equation 5]

Here, θ_f is the feature parameter of the encoder function, θ_g is the feature parameter of the decoder function, the expectation is taken over the input data probability distribution p(x), L_c is the cross-entropy loss function, X_GT is the actual target value, γ is a balancing coefficient (0.05 to 0.2), and b(X) is the classification class.

Equation 5 can be expressed as Equation 6. However, Equation 6 requires the posterior distribution P(b(X)|f(X)), which makes direct minimization very difficult, so it is preferable to impose the constraint using an auxiliary distribution Q, as in Equation 7.

[Equation 6]

Here, H(·) is the marginal entropy, H(·|·) is the conditional entropy, and b(X) is the classification class.
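For reference, the standard identity that relates mutual information to the marginal and conditional entropies named above is shown below. Writing it with b(X) and f(X) as the two variables is an assumption based on the posterior P(b(X)|f(X)) mentioned in the text; it is not a transcription of Equation 6.

```latex
I\bigl(b(X);\, f(X)\bigr) = H\bigl(b(X)\bigr) - H\bigl(b(X) \mid f(X)\bigr)
```

Because the marginal entropy of the class label does not depend on the encoder, driving this quantity down with respect to the encoder amounts to driving the conditional entropy up, and the conditional entropy is the term that involves the posterior.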

[Equation 7]

Here, θ_f is the feature parameter of the encoder function, the expectation is taken over the input data probability distribution p(x), Q is the auxiliary distribution, the bias level is that of the sampled input (i.e., its weather-condition class label), and b(X) is the classification class.
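One common way to use an auxiliary distribution Q in place of an intractable posterior is the bound below, which holds for any Q because the gap between the two sides is a Kullback-Leibler divergence. It is a general information-theoretic fact offered as context, with P denoting the true posterior and the expectation taken over the joint distribution of b(X) and f(X); whether Equation 7 takes exactly this form is an assumption.

```latex
H\bigl(b(X) \mid f(X)\bigr)
  = -\,\mathbb{E}\bigl[\log P\bigl(b(X) \mid f(X)\bigr)\bigr]
  \le -\,\mathbb{E}\bigl[\log Q\bigl(b(X) \mid f(X)\bigr)\bigr]
```

Equality holds when Q matches the posterior, which is why a learned classifier can serve as Q: the better it predicts the bias label from the features, the tighter the bound.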

Meanwhile, even when information has been damaged by weather conditions, the essential information that must be preserved should not be altered. The loss calculator 30 therefore sets the second loss value by applying the information theory of Equation 8.

[Equation 8]

Here, I is the mutual information of information theory, and X is the input data.

Because the essential preservation information represented by the second loss value must be kept as large as possible, Equation 8 is to be maximized, in contrast to Equation 4.

Accordingly, the loss function (i.e., the second loss value) used in training the autoencoder 10 becomes Equation 9.

[Equation 9]

Here, θ_f is the feature parameter of the encoder function, θ_g is the feature parameter of the decoder function, the expectation is taken over the input data probability distribution p(x), L_c is the cross-entropy loss function, the sampled value is drawn from the input data probability distribution (e.g., the input image), α is the balancing coefficient between the first and second loss values (0.05 to 0.2), I is the mutual information of information theory, and X is the input data.

The loss function of Equation 9 is then set as the second loss value of Equation 2 through the same optimization process as in Equations 5 to 7.

FIG. 3 is a diagram showing a video image learning method according to an embodiment of the present invention.

As shown in FIG. 3, in the video image learning method according to an embodiment of the present invention, when video image data is first input through the neural network, important features are extracted from the input data by the encoder 12 and then restored and output by the decoder 14 (S110).

The loss calculator 30 then calculates the first loss value generated when the input data passes through the autoencoder 10 and the second loss value for the essential preservation information that must be preserved when the input data passes through the autoencoder 10 (S120).

Next, the loss calculator 30 backpropagates the first and second loss values to update the weights of the encoder 12 and the decoder 14 (that is, the weights are applied to the extracted important feature map and the restoration map) (S130).

Meanwhile, as the input data passes through the autoencoder 10, the important features extracted by the encoder 12 are delivered to the classifier network 20. The loss calculator 30 calculates the third loss value generated when the classifier network 20 classifies the weather condition of the current video image according to those features, and the fourth loss value for the essential preservation information that must be preserved when the classifier network 20 classifies video images by weather condition (S140).

Next, the loss calculator 30 backpropagates the third loss value to update the weights of the classifier network 20 and backpropagates the fourth loss value to update the weights of the encoder 12 (S150).

As described above, the video image learning apparatus and method according to an embodiment of the present invention calculate the first and second loss values generated when the autoencoder 10 extracts and restores the important features, and the third and fourth loss values generated in the classifier network 20, and then backpropagate these loss values to the autoencoder 10 and the classifier network 20 to update their weights. A single model can therefore be trained for a variety of weather conditions. When image-quality obstructions such as rain or fog are removed, trees, roads, and the like that are removed along with them can be restored from the essential preservation information, so the image can be enhanced to a clear picture even in bad weather.

In addition, when a rainfall image or a fine-dust image is input, as in FIG. 4, after training only on images of some environments (for example, clear or cloudy), conventional approaches cannot effectively remove the rain or fine dust. In the present invention, in addition to the conventionally considered first loss value, the information-theoretic loss values (the second and fourth loss values) and the weather classification loss value of the classifier network 20 are calculated and backpropagated to update the weights of the autoencoder 10 and the classifier network 20. Rain and fine-dust information that was not seen during training can therefore be removed effectively, and the image is enhanced effectively.

As described above, the detailed description covers preferred embodiments of the present invention, but these merely illustrate the best mode of the invention and do not limit it. A person of ordinary skill in the art to which the present invention pertains can, of course, make various modifications and imitations without departing from the scope of the technical idea of the invention. Accordingly, the scope of rights of the present invention should not be limited to the described embodiments but should be determined by the claims below and their equivalents.

10: auto-encoder 12: encoder
14: decoder 20: classifier network
30: loss calculation unit

Claims

A video image learning apparatus comprising:
an auto-encoder that, when input data is input through a neural network, learns the input data by extracting important features from the input data through an encoder and restoring the extracted important features through a decoder;
a classifier network that receives the important features from the encoder and classifies a current video image according to weather conditions; and
a loss calculation unit that calculates a first loss value generated when the input data passes through the encoder and the decoder through the neural network to obtain a restored, improved image, a second loss value for essential preservation information to be preserved when the input data is restored through the encoder and the decoder, a third loss value generated when the input data passes through the encoder and the classifier network, and a fourth loss value for essential preservation information to be preserved during classification in the classifier network, and that then updates the weights of the encoder and the decoder by back-propagating the calculated first and second loss values, updates the weight of the classifier network by back-propagating the third loss value, and updates the weight of the encoder by back-propagating the fourth loss value.

The video image learning apparatus according to claim 1,
wherein the classifier network is trained according to Equation 1 below.
[Equation 1]

Here, θ_f is a feature parameter of the encoder function and θ_h is a feature parameter of the classifier network function.

The video image learning apparatus according to claim 1,
wherein the first to fourth loss values are expressed by Equation 2.
[Equation 2]

Here, θ_f is a feature parameter of the encoder function, θ_g is a feature parameter of the decoder function, and θ_h is a feature parameter of the classifier network function. The remaining symbols, which are not reproduced in this text, denote the clear video image, a value sampled from the input data probability distribution, and the bias level.

The video image learning apparatus according to claim 3,
wherein the relationship among the first to fourth loss values is expressed by Equation 3.
[Equation 3]

Here, T(x) denotes the total weight delivered to the auto-encoder and the classifier network.

A video image learning method using an auto-encoder, the method comprising:
extracting, by an encoder, important features from input data, and restoring and outputting, by a decoder, the important features extracted by the encoder;
calculating, in a loss calculation unit, a first loss value generated when the input data is restored through the encoder and the decoder;
calculating, in the loss calculation unit, a second loss value for essential preservation information to be preserved when the input data is restored through the encoder and the decoder;
updating, by the auto-encoder, the weights of the encoder and the decoder by back-propagating the first and second loss values;
classifying, in a classifier network, the current input data according to weather conditions based on the important features transmitted from the encoder;
calculating, in the loss calculation unit, a third loss value generated when the input data is classified according to weather conditions by passing through the encoder and the classifier network;
calculating, in the loss calculation unit, a fourth loss value for essential preservation information to be preserved when the input data is classified according to weather conditions;
updating, by the auto-encoder, the weight of the classifier network by back-propagating the third loss value; and
updating, by the auto-encoder, the weight of the encoder by back-propagating the fourth loss value.

The video image learning method according to claim 5,
wherein the first to fourth loss values are expressed by Equation 2.
[Equation 2]

The symbols of Equation 2, which are not reproduced in this text, denote the clear video image, a value sampled from the input data probability distribution, and the bias level.

The video image learning method according to claim 6,
wherein the relationship among the first to fourth loss values is expressed by Equation 3.
[Equation 3]