KR20100097858A

KR20100097858A - Super-resolution using example-based neural networks

Info

Publication number: KR20100097858A
Application number: KR1020090016715A
Authority: KR
Inventors: 권동민; 조성원; 김재민; 정선태; 강호석
Original assignee: 홍익대학교 산학협력단; 조성원; 김재민; 정선태
Priority date: 2009-02-27
Filing date: 2009-02-27
Publication date: 2010-09-06

Abstract

PURPOSE: A method for enlarging an image to high resolution using an example-based neural network model is provided, in which information about the high-frequency component is approximated by a backpropagation neural network. CONSTITUTION: The image is divided into a low-frequency component image, a mid-frequency component image from which very low frequencies have been removed, and a high-frequency component image. The high-frequency component image is obtained as the pixel-wise difference between the high-resolution original image and the low-frequency component image.

Description

High-Resolution Images Using Example-based Neural Networks {Super-Resolution Using Example-based Neural Networks}

Super-resolution techniques, including image interpolation, have been steadily developed as a way to obtain high-resolution images from low-resolution images without additional equipment. Such resolution enhancement is one of the key technologies that determine picture quality in High Definition Television (HDTV), which has recently been the subject of active research and development. High-resolution image enlargement is useful in many still-image and video applications. Examples include military surveillance devices, which extract information about enemy activity from images taken at high altitude above the ground, and medical imaging equipment, which must support accurate diagnosis. Surveillance systems likewise need high-resolution images for license plate recognition and face recognition. Research therefore considers both improving the quality of images obtained from the dedicated high-resolution equipment used for such special purposes, so that more accurate information can be extracted, and improving the quality of images obtained from low-resolution acquisition devices.

Various image interpolation techniques exist for producing a high-resolution image by enlarging a low-resolution image. They have been studied since the early days of image processing research, and new methods continue to be proposed. Basic image interpolation techniques are divided into linear and nonlinear techniques. Widely used linear techniques include nearest neighbor interpolation, bilinear interpolation, and bicubic interpolation. Nearest neighbor and bilinear interpolation are simple and computationally cheap, but they produce a severe blocking effect. Bicubic interpolation reduces the blocking effect, but it introduces blurring and ringing effects in edge regions. As advances in computing have made heavy computation fast, nonlinear interpolation methods are now being actively studied. Nonlinear methods focus on edge regions, to which the human visual system is sensitive. Nonlinear filters and Markov random field models are examples of nonlinear interpolation.

The present invention proposes a technique for producing a high-resolution image: a neural network model built through a learning process on example images finds similarities with the low-resolution input image, predicts the high-frequency component, and adds it to the interpolated image. Estimating the high-frequency information needed for a sharp, clear image is very difficult, because that information is not visible in the input image, that is, in the low-resolution image. For this reason, this work constructs a neural network model that can run in real time and has excellent restoration performance.

The present invention provides a method of obtaining a high-quality image from a single image. The high-frequency information that is lost when the input image is enlarged is estimated through the learning capability of a neural network and the backpropagation algorithm. The estimated information is then combined with the image obtained by enlarging the original image with a conventional interpolation method, which yields the high-resolution image. In other words, the input image is first enlarged by a conventional interpolation method, information from the enlarged image is fed to the neural network model as input, the desired information is obtained at the output layer, and that output is combined with the interpolated image to produce the high-resolution image.

The present invention provides a method that obtains high resolution when an image is enlarged, using the backpropagation algorithm of a neural network. Compared with the results of conventional interpolation methods, the method effectively restores edge regions, to which human vision is sensitive, so the image becomes sharper and the superior performance could be confirmed. In addition to the visual improvement, the resulting image was also confirmed to be closer to the original image than those of the other methods in terms of image quality, as shown by the PSNR values used as an objective measure. Experiments also showed that the proposed method can be applied in various fields, not only to high-resolution restoration during image enlargement. In the blurred-image correction experiment, the results were good even though the existing model was used unchanged.

The present invention provides a method of obtaining a high-quality image from a single image. The high-frequency information that is lost when the input image is enlarged is estimated through the learning capability of a neural network and the backpropagation algorithm. The estimated information is then combined with the image obtained by enlarging the original image with a conventional interpolation method, which yields the high-resolution image. The overall configuration of the proposed high-resolution restoration method is shown in FIG. 1. The input image is first enlarged by a conventional interpolation method, information from the enlarged image is fed to the neural network model as input, the desired information is obtained at the output layer, and that output is combined with the interpolated image to produce the high-resolution image. This configuration applies, of course, only after the neural network has been trained.

The proposed method is divided broadly into two stages. First, the image is decomposed into three parts in the frequency domain. When an input image is enlarged by interpolation, detail is lost and the result is effectively a low-resolution image, so the first part is a low-frequency component image that carries similar information. Because very low frequency information has little influence on the estimation of image detail, the second part is a mid-frequency component image, from which the very low frequency components have been removed by a Laplacian filter. The third part is the high-frequency component image, which is the desired output. The high-frequency component image is obtained as the pixel-wise difference between the high-resolution original image and the low-frequency component image already obtained. Throughout, information is used in the form of patches, as proposed in the Example-Based method, so that neighborhood information (nearest neighbors) can be used. Second, the neural network is trained on the image information of the decomposed frequency bands, and the trained network is then used to produce the desired result.

A. Frequency-based decomposition of the image

The frequency-based decomposition is needed in order to predict the high-frequency detail that is lost when an image is enlarged, so that a high-resolution image can be produced. Rather than using the whole image at once, the pixel information in the image is divided into patches, for the sake of accuracy and execution speed. All images are divided into patches in both the training stage and the actual reconstruction stage.

(1) Low-frequency patch generation

To obtain patches that contain only the low-resolution information of a high-resolution image, the image is reduced to a size smaller than the original and then enlarged back to the original size by interpolation. This discards the high-frequency information that the original image contained. The resulting image is called the low-frequency image, and it is divided into 3x3 patches. This step gives the training data a resolution similar to that of the input images that the network will later receive. The original image in FIG. 2 is a high-resolution image used for training.
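The shrink-then-enlarge step can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how the image is reduced or which interpolation is used at this stage, so block averaging, bilinear enlargement, and the factor of 2 are choices made here, and the function names are hypothetical.

```python
def downscale(img, f):
    """Shrink by an integer factor f using block averaging (assumed method)."""
    h, w = len(img) // f, len(img[0]) // f
    return [[sum(img[y * f + dy][x * f + dx]
                 for dy in range(f) for dx in range(f)) / (f * f)
             for x in range(w)] for y in range(h)]


def bilinear_upscale(img, out_h, out_w):
    """Enlarge img to out_h x out_w with bilinear interpolation."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(out_h):
        # Map the output pixel centre back to source coordinates, clamped.
        sy = min(max((y + 0.5) * h / out_h - 0.5, 0.0), h - 1.0)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for x in range(out_w):
            sx = min(max((x + 0.5) * w / out_w - 0.5, 0.0), w - 1.0)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def low_frequency_image(img, f=2):
    """Reduce and re-enlarge so that fine detail is discarded."""
    return bilinear_upscale(downscale(img, f), len(img), len(img[0]))
```

Applied to a fine checkerboard, the result is a flat gray image of the same size, which shows that the pixel-scale detail has been removed while the overall brightness is kept.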

(2) Mid-frequency patch generation

Very low frequency pixel information is of little importance in training the neural network, so a Laplacian filter is used to remove it. The resulting image, which has lost both its low-frequency and its high-frequency information, is called the mid-frequency image. It is also divided into 3x3 patches. The input patches are taken in this mid-frequency form both during training and during actual reconstruction. The image in FIG. 3 is the result of applying the filter to the low-frequency component image produced above.
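A minimal sketch of the filtering step, assuming a 3x3 four-neighbor Laplacian kernel and replicated borders. The source names only "a Laplacian filter", so the kernel coefficients, the border handling, and the function name are assumptions.

```python
# Assumed kernel: the common 4-neighbour discrete Laplacian.
LAPLACIAN = [[0, -1, 0],
             [-1, 4, -1],
             [0, -1, 0]]


def mid_frequency_image(low, kernel=LAPLACIAN):
    """Filter the low-frequency image with a 3x3 kernel, replicating borders."""
    h, w = len(low), len(low[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    yy = min(max(y + ky - 1, 0), h - 1)  # clamp to the image
                    xx = min(max(x + kx - 1, 0), w - 1)
                    acc += kernel[ky][kx] * low[yy][xx]
            out[y][x] = acc
    return out
```

Because the kernel coefficients sum to zero, any constant (very low frequency) region maps to zero, which is the behavior the paragraph above relies on.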

(3) High-frequency patch generation

The image containing the high-frequency information that the input image lacks is obtained as the difference between the high-resolution original image and the low-frequency component image obtained above. This image is called the high-frequency image, and it is what the neural network is trained, through the backpropagation algorithm, to reproduce. In FIG. 4, image (i) shows the actual difference between the high-resolution example image and its low-resolution version. Because the pixel values are small, they are hard to inspect when displayed directly, so they need to be normalized for viewing. As image (ii) shows, normalization makes pixel values that were hard to see with the naked eye clearly visible. The proposed method itself, however, uses the actual values of (i), not the normalized ones.
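The difference image and its display-only normalization might look like the sketch below. The source does not say how the normalization is done; min-max scaling to the range 0 to 255 is an assumption, and both function names are hypothetical.

```python
def high_frequency_image(original, low):
    """Pixel-wise difference: original minus its low-frequency version."""
    return [[o - l for o, l in zip(row_o, row_l)]
            for row_o, row_l in zip(original, low)]


def normalize_for_display(img, lo=0.0, hi=255.0):
    """Min-max scaling for viewing only (assumed normalization scheme)."""
    flat = [v for row in img for v in row]
    mn, mx = min(flat), max(flat)
    if mx == mn:
        return [[lo for _ in row] for row in img]
    scale = (hi - lo) / (mx - mn)
    return [[lo + (v - mn) * scale for v in row] for row in img]
```

Only the unnormalized output of `high_frequency_image` would serve as a training target; the normalized copy corresponds to image (ii) and exists purely for inspection.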

B. Construction of the backpropagation neural network model

(1) Structure of the neural network model for image restoration

The backpropagation algorithm used in the neural network model for image restoration is a learning algorithm for multi-layer feedforward neural networks, and it is a form of supervised learning. A neural network generally has three or more layers: an input layer, an output layer, and intermediate layers. Its behavior depends heavily on this structure, and because there are as yet no clear guidelines for choosing it, the network has to be configured through considerable trial and error.

The present invention uses a five-layer structure consisting of an input layer, an output layer, and three intermediate layers. The input layer takes a vector of 25 values, one for each element of a 5x5 patch. The larger patch supplies more information around the center pixel, with the aim of improving the performance of the network. The output layer has nine outputs, the elements of the 3x3 high-frequency patch that corresponds to the center pixel of the input patch. That is, in FIG. 5, M for the input layer is 25, N is 9, and the number of hidden layers is 3. The patches used at the input layer and the patches used at the output layer must share the same spatial coordinates. The trained network can then estimate the matching high-frequency information when it receives an input image whose high-frequency information is unknown.
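The 25-input, 9-output layout with three hidden layers can be sketched as below. The hidden-layer width (16), the sigmoid hidden activation, the linear output layer, and the random initialization range are all assumptions, since the source gives only the number of layers and the input and output sizes. The function names are hypothetical.

```python
import math
import random


def patch(img, cy, cx, size):
    """Flatten the size x size patch centred on (cy, cx) into a vector."""
    r = size // 2
    return [img[y][x]
            for y in range(cy - r, cy + r + 1)
            for x in range(cx - r, cx + r + 1)]


def init_network(sizes, seed=0):
    """One (weights, biases) pair per pair of consecutive layers."""
    rng = random.Random(seed)
    return [([[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
              for _ in range(n_out)], [0.0] * n_out)
            for n_in, n_out in zip(sizes, sizes[1:])]


def forward(net, x):
    """Feedforward pass: sigmoid hidden layers, linear output (assumed)."""
    for i, (weights, biases) in enumerate(net):
        z = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        x = z if i == len(net) - 1 else [1.0 / (1.0 + math.exp(-t)) for t in z]
    return x


# Input layer M = 25, three hidden layers (width 16 is an assumption), N = 9.
net = init_network([25, 16, 16, 16, 9])
```

The 5x5 input patch and the 3x3 output patch are taken around the same centre pixel, so `patch(mid, cy, cx, 5)` and `patch(high, cy, cx, 3)` form one training pair.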

(2) Training process of the neural network model

The backpropagation algorithm used in the neural network model for image restoration requires a training process that adjusts the weights and biases so as to minimize the error between the desired output and the actual output of the network for a given input. As shown in FIG. 6, training uses the example images and the frequency-decomposed images derived from them. The input layer receives patch values from the mid-frequency component image described above. The output layer uses patch values from the high-frequency component image, which is the difference between the training example image and its low-frequency version. To improve the performance of the model, the image is further divided into regions where the high-frequency component is strong and regions where it is not, and the two are trained separately. The division uses the difference between the input image and the image obtained from it by the same procedure as the low-frequency conversion described above. The elements of the input vector are pixel values, and in some cases the pixel values in regions with strong high-frequency content differ little from those in regions without it. In regions with little high-frequency content the output pixel values should be close to zero, but when the network was trained without dividing the regions, the outputs there took other values, which acted as error. Training the two kinds of region separately reduced this error.
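One way to divide patch positions into the two groups is sketched below. The source says only that the difference between the image and its low-frequency version is used; the mean absolute difference over a 3x3 neighborhood and the threshold value are assumptions, and the function name is hypothetical.

```python
def split_centres_by_detail(detail, size=3, threshold=2.0):
    """Split patch centres by mean absolute detail in the surrounding patch.

    detail is the pixel-wise difference between an image and its
    low-frequency version; threshold is an illustrative value.
    """
    r = size // 2
    strong, weak = [], []
    for cy in range(r, len(detail) - r):
        for cx in range(r, len(detail[0]) - r):
            vals = [abs(detail[y][x])
                    for y in range(cy - r, cy + r + 1)
                    for x in range(cx - r, cx + r + 1)]
            group = strong if sum(vals) / len(vals) > threshold else weak
            group.append((cy, cx))
    return strong, weak
```

Each group of centres would then supply the training pairs for its own training run, so that flat regions are not pulled away from near-zero outputs by edge regions.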

Using the patch values at the input layer and at the output layer, the weights and biases are adjusted so that the network becomes robust and minimizes the error between the desired output and its actual output for a given input.

The algorithm of the proposed neural network model consists of initializing the weight vector, updating the weight vector, and generating a new weight vector. The training algorithm is as follows.

Step 0. Initialize the weights.

Step 1. Propagate the input vector through the network (feedforward propagation).

Step 2. Compute the error at the output layer.

Step 3. Compute the error at the intermediate layers (backward propagation of error).

Step 4. Update the weights.

Step 5. Repeat Steps 1-4 until convergence.
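Steps 0 to 5 can be sketched as a small online backpropagation routine. This is not the source's implementation: the squared-error loss, sigmoid hidden units, linear outputs, initialization range, learning rate, iteration limit, and stopping tolerance are all illustrative assumptions, and the function names are hypothetical.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def init(sizes, seed=0):
    """Step 0: random weights; the last entry of each row is the bias."""
    rng = random.Random(seed)
    return [[[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
             for _ in range(n_out)] for n_in, n_out in zip(sizes, sizes[1:])]


def forward(net, x):
    """Step 1: feedforward pass, keeping every layer's activations."""
    acts = [list(x)]
    for i, layer in enumerate(net):
        z = [sum(w * a for w, a in zip(row, acts[-1])) + row[-1] for row in layer]
        acts.append(z if i == len(net) - 1 else [sigmoid(t) for t in z])
    return acts


def train_step(net, x, target, lr):
    acts = forward(net, x)
    # Step 2: error at the (linear) output layer.
    delta = [y - t for y, t in zip(acts[-1], target)]
    loss = 0.5 * sum(d * d for d in delta)
    for i in range(len(net) - 1, -1, -1):
        layer, a_in = net[i], acts[i]
        if i > 0:
            # Step 3: propagate the error back through the pre-update weights.
            prev = [sum(layer[k][j] * delta[k] for k in range(len(layer)))
                    * a_in[j] * (1.0 - a_in[j]) for j in range(len(a_in))]
        # Step 4: gradient-descent update of weights and bias.
        for k, row in enumerate(layer):
            for j in range(len(a_in)):
                row[j] -= lr * delta[k] * a_in[j]
            row[-1] -= lr * delta[k]
        if i > 0:
            delta = prev
    return loss


def train(net, samples, lr=0.1, max_iters=1000, tol=1e-8):
    """Step 5: repeat Steps 1-4 until the summed loss falls below tol."""
    total = float("inf")
    for _ in range(max_iters):
        total = sum(train_step(net, x, t, lr) for x, t in samples)
        if total < tol:
            break
    return total
```

For the patch-based model, `samples` would hold pairs of a 25-element mid-frequency vector and a 9-element high-frequency vector, and `init` would be called with five layer sizes beginning with 25 and ending with 9.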

As the training algorithm shows, the error is computed and the weights are updated through repeated iteration. In this work the number of iterations was set to 200,000, and training was set to stop early if a predefined threshold was reached during the iterations. The threshold was set to [value missing in the source text], and [symbol missing in the source text] was set to 0.3.

C. High-resolution image generation through frequency synthesis

To produce the high-resolution image, the input image is combined with the high-frequency component image estimated by the proposed neural network model. The enlarged input image is added to the high-frequency image, which carries the detail lost in enlargement, and the sum is the final image with improved resolution. The input image is enlarged with bilinear interpolation, a conventional interpolation method.

In FIG. 7, (i) is the Lena input image enlarged two times by interpolation, and (ii) is the normalized image of the high-frequency component estimated by the actual neural network model. The high-resolution image (iii) is obtained as the sum of images (i) and (ii).
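The synthesis step can be sketched as follows. The source does not describe how overlapping 3x3 output patches are merged or whether the sum is clipped, so averaging overlaps and clipping to the range 0 to 255 are assumptions, and the function names are hypothetical.

```python
def assemble_high_freq(pred, h, w, size=3):
    """Merge predicted patches into one image, averaging where they overlap.

    pred maps a patch centre (cy, cx) to its size*size predicted values.
    """
    r = size // 2
    acc = [[0.0] * w for _ in range(h)]
    cnt = [[0] * w for _ in range(h)]
    for (cy, cx), vals in pred.items():
        for i, v in enumerate(vals):
            y, x = cy - r + i // size, cx - r + i % size
            if 0 <= y < h and 0 <= x < w:
                acc[y][x] += v
                cnt[y][x] += 1
    return [[acc[y][x] / cnt[y][x] if cnt[y][x] else 0.0
             for x in range(w)] for y in range(h)]


def synthesize(interpolated, high_freq, lo=0.0, hi=255.0):
    """Add the estimated detail to the enlarged image and clip (assumed)."""
    return [[min(max(a + b, lo), hi) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(interpolated, high_freq)]
```

The sum uses the unnormalized network outputs; a normalized detail image such as (ii) is suitable for display, not for the addition.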

D. Experiments and results confirming the effect of the present invention

The restoration performance was evaluated in experiments comparing the proposed high-resolution restoration method, which uses the neural network model for image restoration, with existing methods. First, the result images of conventional interpolation methods and of the proposed method were compared, and the quality of the result images was evaluated using PSNR as an objective measure. Next, the method was compared experimentally with the Example-Based method, which has shown the best performance in prior research on high resolution. Finally, as an application of the proposed method, an experiment was carried out on correcting a blurred image rather than on high-resolution restoration during enlargement.

(1) Comparison with conventional interpolation methods

FIG. 8 and FIG. 9 compare the result images of the proposed method with those of the conventional nearest neighbor interpolation (NN), bilinear interpolation (Bilinear), and cubic convolution interpolation (Cubic) methods, and the comparison of PSNR values, an objective measure, shows that the quality of the result image is improved. For the PSNR measurement, a 500x500 original image was reduced to 125x125, one quarter of its size in each dimension, and then enlarged four times by each of the interpolation methods.

Compared with the conventional interpolation methods, blurring is visibly reduced and the image is sharper and clearer. This results from adding back to the interpolated image the detail of the original image that interpolation loses. In other words, the proposed neural network model estimates the desired detail well. The table shows that image quality improved as well as visual appearance. PSNR (Peak Signal-to-Noise Ratio) was used as the objective measure of performance.
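The source names PSNR as its measure but does not give the formula. A minimal sketch of the standard definition, 10 log10(peak^2 / MSE), assuming 8-bit images with a peak value of 255 (an assumption, since the bit depth is not stated):

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    sq = [(a - b) ** 2
          for row_a, row_b in zip(reference, test)
          for a, b in zip(row_a, row_b)]
    mse = sum(sq) / len(sq)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```

A higher value means the reconstructed image is numerically closer to the original, which is how the tables in this section should be read.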

Image | NN | Bilinear | Cubic | Proposed method
Fruits | 29.74 | 30.35 | 30.89 | 30.97
Baboon | 20.74 | 20.95 | 21.20 | 21.31

(2) Comparison with the Example-Based method

Among prior studies on high resolution, the Example-Based method shows excellent performance, so its result images are compared with those of the proposed method on the test images. For this experiment, the one-pass algorithm proposed by William T. Freeman, one of the estimation algorithms of the Example-Based method, was implemented directly and used.

Both the Example-Based method and the proposed method render edge regions, to which human vision is sensitive, more effectively than the conventional interpolation methods, and so produce much sharper and clearer images. The Example-Based method, however, gave lower PSNR values than the conventional interpolation methods. This appears to be because the edges of images with much high-frequency content are rendered strongly, which looks good visually but yields values that are not as close to the original image.

Image | Example-Based method | Proposed method
Fruits | 29.27 | 33.36
Baboon | 20.49 | 24.25

As FIG. 10 shows, the Example-Based method produces an image that looks sharper than that of the proposed method, but a comparison with the original image shows that the overall brightness has increased. Its values therefore differ somewhat from the original image, which appears to explain the low PSNR. The proposed method gives visually similar performance to the Example-Based method while producing values closer to the original image, and so achieves better figures. There was also a very large difference in execution speed. The Example-Based method was very slow, taking roughly 60 minutes or more, whereas the proposed method took less than 2 seconds even for four-times enlargement. The proposed algorithm therefore has the major advantage that it can be used in real time.

Task | Example-Based method | Proposed method
4x enlargement of a 150x150 image | about 65 minutes | within 2 seconds

(3) Blurred-image correction experiment

Besides high-resolution restoration during image enlargement, the proposed method can be applied in various other fields. As one example, an experiment was carried out in which a blurred image, rather than an image to be enlarged, was given as input and corrected.

As the result images in FIG. 11 show, the method performed well not only in high-resolution restoration during enlargement but also in correcting blurred images. This experiment applied the neural network model used for high-resolution restoration without modification, so even better results are expected if the training process and other elements are adapted to blurred-image correction.

FIG. 1 is a block diagram of the proposed method.

FIG. 2 shows the original image and the low-frequency image.

FIG. 3 shows the mid-frequency image.

FIG. 4 shows the high-frequency image.

FIG. 5 shows the structure of the neural network for image restoration.

FIG. 6 shows the training process of the neural network model for image restoration.

FIG. 7 shows the generation of the high-resolution image.

FIG. 8 shows the experimental results for the Fruits image.

FIG. 9 shows the experimental results for the Baboon image.

FIG. 10 shows the results of the comparison with the Example-Based method.

FIG. 11 shows the results of the blurred-image correction experiment.

Claims

The neural network model described in the invention, and the ideas and algorithms for high-resolution image enlargement and blurred-image correction.