KR20220102811A

KR20220102811A - Apparatus and method for reconstructing single image super-resolution

Info

Publication number: KR20220102811A
Application number: KR1020210005192A
Authority: KR
Inventors: 김응태; 우희조; 심지우
Original assignee: 한국공학대학교산학협력단
Priority date: 2021-01-14
Filing date: 2021-01-14
Publication date: 2022-07-21

Abstract

A single image super-resolution reconstruction device according to the present invention includes: an image input unit that receives a low-resolution image and provides it to a super-resolution network; and an image reconstruction unit that generates a high-resolution image by enlarging the low-resolution image by a predefined magnification using the predefined super-resolution network. Here, the super-resolution network may be configured such that a residual dense channel attention block (RDCAB), which includes a residual dense structure and a channel attention structure, is recursively repeated a predetermined number of times.

Description

Single image super-resolution restoration apparatus and method

The present invention relates to a deep-learning-based single image super-resolution technology that can generate (or restore) a high-resolution image from a low-resolution image.

Single image super-resolution is a technique for generating (or restoring) a high-resolution image from a low-resolution image. As high-resolution displays at the Ultra-High Definition (UHD) level have recently become widespread, interest is growing in super-resolution algorithms that can convert lower-resolution images at the Full High Definition (FHD) level into high-resolution images.

Representative methods for increasing the resolution of a single image can be broadly classified into interpolation, reconstruction, and deep learning methods. Interpolation generates new pixels at positions that carry no information by using the values of the given surrounding pixels; specific examples include nearest-neighbor interpolation, linear interpolation, bilinear interpolation, and higher-order interpolation. Interpolation has the advantage of a small amount of computation, but its low-pass filter characteristic blurs the image as a whole, so the quality of fine details may be degraded. Reconstruction methods restore high-frequency components better than interpolation, but require a relatively large amount of computation.

Recently, with advances in deep learning, deep-learning-based super-resolution algorithms have shown higher performance than conventional image processing methods, and a variety of networks continue to be proposed and actively studied. Deep-learning-based super-resolution has developed along several lines, including linear, residual learning, recursive, dense, and attention techniques.

The Residual Dense Network (RDN), a representative densely connected network structure, is well suited to extracting image feature maps by using information from all preceding layers. RDN achieves excellent performance by reusing the various learned features, which eases the flow of information. However, because RDN stacks many residual dense blocks, it has a large number of parameters, so training the network takes a long time, and the large storage requirement makes it difficult to apply to mobile systems.

Accordingly, a new super-resolution technique that can solve these problems needs to be developed.

Korean Patent Registration No. 10-2059529

The present invention has been devised to solve the problems described above, and aims to provide a deep-learning-based single image super-resolution technology that achieves a fast processing speed through a greatly reduced amount of computation while also delivering excellent performance.

A single image super-resolution restoration apparatus according to an aspect of the present invention includes: an image input unit that receives a low-resolution image and provides it to a super-resolution network; and an image restoration unit that generates a high-resolution image by enlarging the low-resolution image by a predefined magnification using a predefined super-resolution network. Here, the super-resolution network may be configured such that a Residual Dense Channel Attention Block (RDCAB), which includes a residual dense structure and a channel attention structure, is recursively repeated a predefined number of times.

In an embodiment, the super-resolution network may be configured by connecting in series: a feature extraction unit including first and second convolutional layers; a recursive block including the RDCAB and a Concat layer; a convolutional bottleneck layer; a third convolutional layer; and a restoration unit.

In an embodiment, the feature extraction unit may be configured such that the first and second convolutional layers each extract a predefined number of feature maps, the output of the first convolutional layer is input to the second convolutional layer and to the last layer, and the output of the second convolutional layer is input to the recursive block.

In an embodiment, the recursive block may be configured such that the RDCAB is recursively repeated a predefined number of times and the features output at each repetition are concatenated by the Concat layer.

In an embodiment, the RDCAB may be configured by connecting the residual dense structure and the channel attention structure in series, wherein the residual dense structure includes a predefined plurality of convolutional layers and a Concat layer, and the channel attention structure includes a global average pooling layer and a sigmoid function.

In an embodiment, the restoration unit may perform an upscaling process using an Efficient Sub-Pixel Convolutional Neural Network (ESPCN) and restore the features produced by the upscaling process into an RGB image through a convolution.

According to another aspect of the present invention, a single image super-resolution restoration method is performed in a single image super-resolution restoration apparatus. The method includes: receiving, by an image providing unit, a low-resolution image and providing it to a super-resolution network; and generating, by an image restoration unit, a high-resolution image by enlarging the low-resolution image by a predefined magnification using a predefined super-resolution network, wherein the super-resolution network may be configured such that a Residual Dense Channel Attention Block (RDCAB) including a residual dense structure and a channel attention structure is recursively repeated a predefined number of times.

Compared with existing super-resolution techniques such as SRCNN and VDSR, the present invention exhibits a fast processing speed while also achieving excellent restoration performance.

FIG. 1 is a reference diagram for explaining the super-resolution network according to the present invention.
FIG. 2 is a reference diagram for explaining the RDCAB structure according to the present invention.
FIG. 3 is a reference diagram for explaining the single image super-resolution restoration apparatus according to the present invention.
FIGS. 4 to 7 are reference diagrams for explaining the performance of the single image super-resolution restoration technique according to the present invention.

Since the present invention can be modified in various ways and can have various embodiments, specific embodiments are illustrated in the drawings and described in detail in the detailed description. However, this is not intended to limit the present invention to specific embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes that fall within its spirit and technical scope. In describing the present invention, a detailed description of related known technology is omitted where it is judged that it could obscure the gist of the invention.

Terms such as first and second may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another.

The terms used in the present application are used only to describe specific embodiments and are not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise. In the present application, terms such as "include" or "have" are intended to specify that a feature, number, step, operation, component, part, or combination thereof described in the specification is present, and should be understood not to exclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof. Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

1. Introduction

As described above, various networks have been proposed for super-resolution techniques that apply deep learning.

To solve this problem, the present applicant proposes the RDCAB (Residual Dense Channel Attention Block)-RecursiveSRNet single image super-resolution network, a single image super-resolution technique that adopts the recursive approach of DRRN (Image Super-Resolution via Deep Recursive Residual Network) and uses residual dense connections and channel attention.

In the present invention, the initial feature information is passed to the last layer, and later layers are able to make good use of the input information of earlier layers. To this end, the residual dense structure is combined into a single block with the channel attention technique, which uses the entire area of the image to determine the importance of each feature map, and this block is used recursively. This keeps the performance of the residual dense network as far as possible while using fewer parameters, which greatly reduces the amount of computation. Compared with RDN, the network according to the present invention can deliver similar performance at a processing speed 1.18 times faster, and has about 10 times fewer parameters, which greatly reduces the amount of computation.

Hereinafter, the super-resolution network according to the present invention and the single image super-resolution technique using it are described in more detail.

2. Conventional super-resolution techniques

Various algorithms have been proposed for single image super-resolution restoration.

SRCNN (Image Super-Resolution Using Deep Convolutional Networks), the first technique to apply deep learning to super-resolution, uses only three convolutions. It outperformed methods that do not use deep learning and demonstrated that deep learning can be applied to super-resolution restoration. However, SRCNN requires a large amount of computation because of its pre-upsampling structure, and its accuracy is limited because it has few layers.

VDSR (Accurate Image Super-Resolution Using Very Deep Convolutional Networks) improved performance by using residual learning to build a deep network that learns only high-frequency information. Its drawback is that earlier feature information is used only at the last layer and not at the other layers.

DRRN pointed out that although deeper networks can perform well, the number of parameters grows substantially with each added layer, which makes hyperparameter tuning difficult. DRRN uses global and local residual learning together with a recursive block composed of multiple residual units to build a deep network without additional parameters.

SRDenseNet (Image Super-Resolution Using Dense Skip Connections) applies the dense connections of DenseNet to super-resolution. By connecting the features of all layers, it addresses the vanishing gradient problem, in which gradients converge to zero as depth increases, and also improves restoration performance.

To keep the computation of the residual blocks from growing as the size of the convolutions increases, RDN combines conventional residual learning with the dense connections used in SRDenseNet, and extracts image feature maps using information from all preceding layers.

3. Single image super-resolution restoration technology according to the present invention

FIG. 1 is a reference diagram for explaining the super-resolution network according to the present invention. Referring to FIG. 1, the super-resolution network according to the present invention can be viewed as an RDCAB-RecursiveSRNet structure in which the RDCAB is used recursively to increase the depth of the network without additional parameters. In the single image super-resolution restoration method according to the present invention, a recursive scheme is applied to reduce the number of parameters, and the residual dense structure and the channel attention technique are applied to achieve high performance. The single image super-resolution restoration method using RDCAB-RecursiveSRNet may consist of a feature extraction step, a feature extraction step through the recursive block, and an upscaling and restoration step.

3.1. Feature extraction step

The super-resolution network according to the present invention (RDCAB-RecursiveSRNet) has a post-upsampling structure to avoid an increase in computation. When a low-resolution image is input, the first and second convolutions each extract a predefined number of feature maps (for example, 32 or more, preferably 64). The output of the first convolution is used for the global residual learning at the end of the network, and the output of the second convolution is used for the local residual learning of the RDCAB.
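
The saving from a post-upsampling structure can be illustrated with a rough multiply-accumulate count. The sketch below is only an illustration: the image size, the 64-channel 3x3 layer, and the helper name `conv_macs` are assumptions chosen for the example, not values taken from the specification.

```python
def conv_macs(height, width, in_ch, out_ch, kernel):
    """Multiply-accumulate operations of one stride-1 convolution layer."""
    return height * width * in_ch * out_ch * kernel * kernel


scale = 4
lr_height, lr_width = 270, 480  # hypothetical low-resolution input size

# Post-upsampling: the convolution runs on the low-resolution grid.
post = conv_macs(lr_height, lr_width, 64, 64, 3)
# Pre-upsampling: the same layer would run on the enlarged grid.
pre = conv_macs(lr_height * scale, lr_width * scale, 64, 64, 3)

print(pre // post)  # 16, i.e. scale squared
```

For a magnification factor r, every layer placed before the upscaling step processes r x r fewer pixel positions than it would after upscaling.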

3.2. Feature extraction step through the recursive block

To reduce the number of parameters as far as possible while maintaining high performance, the super-resolution network according to the present invention (RDCAB-RecursiveSRNet) is configured so that the RDCAB (Residual Dense Channel Attention Block), which consists of a residual dense structure (RD) and a channel attention technique (CA), is used recursively. The residual dense structure (RD) uses information from all layers through a contiguous memory structure, local feature fusion, and local residual learning, and the channel attention technique (CA) reweights the channels according to their relative importance.

FIG. 2 is a reference diagram for explaining the RDCAB structure according to the present invention. Referring to FIG. 2, the RDCAB according to the present invention may include a predefined number (for example, eight) of 3x3 convolutions and a 1x1 convolution, which is the bottleneck layer proposed by Szegedy et al. A larger number of convolutional layers, such as twelve, is advantageous for performance but increases training time, so eight are used as in FIG. 2. The input of each layer is the concatenation of the outputs of the preceding layers, the number of output channels of each layer may be set to 64, and the growth rate may be set to 64 to keep performance as high as possible. As shown in FIG. 2, the RDCAB is repeated for the set number of recursions, and the RDCAB features produced at each recursion are concatenated and passed once more through a 1x1 bottleneck layer, which reduces the number of channels to 64 and thereby reduces the amount of computation.
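
The channel widths implied by the dense structure can be tallied with a few lines of arithmetic. This is a sketch under stated assumptions: it assumes that each 3x3 layer sees the 64-channel block input concatenated with the outputs of all earlier layers (as in RDN), it counts weights only and ignores biases, and the helper names are hypothetical. It is not intended to reproduce the parameter totals reported in the tables.

```python
def dense_structure_channels(num_layers=8, base=64, growth=64):
    """Input width of each 3x3 layer and of the closing 1x1 bottleneck."""
    layer_inputs = [base + k * growth for k in range(num_layers)]
    bottleneck_input = base + num_layers * growth
    return layer_inputs, bottleneck_input


def conv_weights(in_ch, out_ch, kernel):
    """Number of weights in one convolution layer (bias excluded)."""
    return kernel * kernel * in_ch * out_ch


layer_inputs, bottleneck_input = dense_structure_channels()
dense_weights = sum(conv_weights(c, 64, 3) for c in layer_inputs)
bottleneck_weights = conv_weights(bottleneck_input, 64, 1)

print(layer_inputs)        # [64, 128, 192, 256, 320, 384, 448, 512]
print(bottleneck_input)    # 576
print(dense_weights)       # 1327104
print(bottleneck_weights)  # 36864
```

Under these assumptions the 1x1 bottleneck is what brings the 576 concatenated channels back to 64 before the next stage.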

Here, to exploit information from the entire spatial area, the channel attention (CA) technique, which assigns a weight to each channel according to its relative importance, may be performed using global average pooling and a sigmoid; the number of input channels is divided by a reduction ratio of 16 and then multiplied back to its original value. Global residual learning may then be performed with the output of the network's first convolution.
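
A minimal NumPy sketch of channel attention as described above: global average pooling, a reduction by a ratio of 16, an expansion back to the original width, and a sigmoid that produces one weight per channel. The ReLU between the two projections, the channels-last layout, the use of plain matrices in place of 1x1 convolutions, and the function name are assumptions made for this illustration.

```python
import numpy as np


def channel_attention(x, w_down, w_up):
    """x: (H, W, C); w_down: (C, C // ratio); w_up: (C // ratio, C)."""
    squeezed = x.mean(axis=(0, 1))                    # global average pooling -> (C,)
    hidden = np.maximum(squeezed @ w_down, 0.0)       # reduce channels; ReLU is assumed
    weights = 1.0 / (1.0 + np.exp(-(hidden @ w_up)))  # sigmoid -> per-channel weight
    return x * weights                                # rescale each channel


rng = np.random.default_rng(0)
channels, ratio = 64, 16
x = rng.standard_normal((8, 8, channels))
w_down = rng.standard_normal((channels, channels // ratio)) * 0.1
w_up = rng.standard_normal((channels // ratio, channels)) * 0.1

y = channel_attention(x, w_down, w_up)
print(y.shape)  # (8, 8, 64)
```

Because the sigmoid output lies strictly between 0 and 1, each channel is attenuated by a learned amount rather than amplified.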

3.3. Upscaling and restoration step

The upscaling step may be performed in the manner proposed by the conventional Efficient Sub-Pixel Convolutional Neural Network (ESPCN): through sub-pixel convolution, a channel-wise shuffle rearranges a tensor of size H x W x (C x r x r) into rH x rW x C, where r is the magnification factor. In the final restoration step, the features produced by the upscaling step may be restored into an RGB image through a 3x3 convolution.
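
The rearrangement from H x W x (C x r x r) to rH x rW x C can be sketched in NumPy as follows. The ordering of the sub-pixel offsets within the channel axis is an assumption here (frameworks differ on it), and the function name `pixel_shuffle` is used only for this illustration.

```python
import numpy as np


def pixel_shuffle(x, r):
    """Rearrange (H, W, C*r*r) into (r*H, r*W, C).

    Assumed channel layout: index (i * r + j) * C + c holds the value for
    sub-pixel offset (i, j) of output channel c.
    """
    h, w, crr = x.shape
    if crr % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    c = crr // (r * r)
    x = x.reshape(h, w, r, r, c)      # split channels into (i, j, c)
    x = x.transpose(0, 2, 1, 3, 4)    # reorder to (h, i, w, j, c)
    return x.reshape(h * r, w * r, c)


features = np.arange(2 * 3 * 12, dtype=np.float32).reshape(2, 3, 12)
upscaled = pixel_shuffle(features, r=2)
print(upscaled.shape)  # (4, 6, 3)
```

No arithmetic is performed in the shuffle itself; it only moves values, so the total number of elements is unchanged.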

3.4. Configuration of the single image super-resolution restoration apparatus according to the present invention

The following describes a single image super-resolution restoration apparatus that performs the single image super-resolution restoration technique according to the present invention described above. FIG. 3 is a reference diagram for explaining the single image super-resolution restoration apparatus according to the present invention.

Referring to FIG. 3, the single image super-resolution restoration apparatus 100 includes an image input unit 110 and an image restoration unit 120, and corresponds to a computing device that performs the single image super-resolution restoration technique described above.

The image input unit 110 receives a low-resolution image and provides it to the super-resolution network.

The image restoration unit 120 uses the super-resolution network according to the present invention to generate a high-resolution image by enlarging the low-resolution image by a predefined magnification.

Here, the super-resolution network according to the present invention (RDCAB-RecursiveSRNet) may be configured such that the RDCAB (FIG. 2), which includes the residual dense structure (RD) and the channel attention structure (CA), is recursively repeated a predefined number of times (see FIG. 1).

In an embodiment, as shown in FIG. 1, the super-resolution network according to the present invention (RDCAB-RecursiveSRNet) may be configured by connecting in series: a feature extraction unit including first and second convolutional (Conv) layers; a recursive block including the RDCAB and a Concat layer; a convolutional bottleneck layer (Conv 1x1); a third convolutional layer (Conv); and a restoration unit (UpScale & Reconstruction).

In an embodiment, as shown in FIG. 1, the feature extraction unit may be configured such that the first and second convolutional layers each extract a predefined number of feature maps, the output of the first convolutional layer is input to the second convolutional layer and to the last layer, and the output of the second convolutional layer is input to the recursive block.

In an embodiment, the recursive block may be configured such that the RDCAB is recursively repeated a predefined number of times and the features output at each repetition are concatenated by the Concat layer.

In an embodiment, as shown in FIG. 2, the RDCAB may be configured by connecting the residual dense structure and the channel attention structure in series, wherein the residual dense structure includes a predefined plurality of convolutional layers and a Concat layer, and the channel attention structure includes a global average pooling layer and a sigmoid function.

In an embodiment, the restoration unit (UpScale & Reconstruction) may perform an upscaling process using an Efficient Sub-Pixel Convolutional Neural Network (ESPCN) and restore the features produced by the upscaling process into an RGB image through a convolution.
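
The data flow of the recursive block in the embodiments above (one block reused several times, per-recursion outputs concatenated, then a 1x1 bottleneck) can be sketched as follows. This is an illustration under assumptions: `toy_block` is a hypothetical stand-in for the RDCAB, each recursion is assumed to take the previous recursion's output as its input, the 1x1 convolution is written as a per-pixel matrix product, and the recursion count of 9 is only an example value.

```python
import numpy as np


def recursive_block(x, shared_block, recursions):
    """Apply one shared block repeatedly and concatenate every output."""
    outputs = []
    state = x
    for _ in range(recursions):
        state = shared_block(state)   # same weights on every pass
        outputs.append(state)
    return np.concatenate(outputs, axis=-1)


def conv1x1(x, weight):
    """A 1x1 convolution on channels-last data is a per-pixel matrix product."""
    return x @ weight


rng = np.random.default_rng(0)
channels, recursions = 64, 9
block_weight = rng.standard_normal((channels, channels)) * 0.05


def toy_block(features):
    # Hypothetical stand-in for RDCAB: a shared linear map with a local residual.
    return features + np.tanh(features @ block_weight)


x = rng.standard_normal((8, 8, channels))
stacked = recursive_block(x, toy_block, recursions)
fusion_weight = rng.standard_normal((recursions * channels, channels)) * 0.05
fused = conv1x1(stacked, fusion_weight)

print(stacked.shape, fused.shape)  # (8, 8, 576) (8, 8, 64)
```

Increasing `recursions` deepens the computation without adding weights to the shared block; only the bottleneck that consumes the concatenated features grows.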

4. Comparative experiment

The following describes the results of a comparative experiment that verifies the performance of the single image super-resolution restoration method using the super-resolution network according to the present invention (RDCAB-RecursiveSRNet). In the experiment, the 800 images of DIV2K were used as the training data set, and Set5, which consists of five images, was used as the validation set. For training, five random 32x32 patches were extracted from each input image and passed through random horizontal flipping, vertical flipping, and 90-degree rotation, so that a total of 4,000 images were used for training. To check the results, Set5, Set14, BSD100, and Urban100 were used as test sets, and the method was compared with existing methods at 2x, 3x, and 4x magnification using the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). L1 loss was used as the loss function because it gave better image restoration performance than L2 loss. The hardware used for the experiment was a GTX 1660 Super GPU and an Intel i5-9400F CPU.
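
The patch sampling and augmentation described above might look like the sketch below. Several details are assumptions: each flip or rotation is applied with probability 0.5, the function name is hypothetical, and the pairing of each low-resolution patch with its high-resolution counterpart is omitted for brevity.

```python
import numpy as np


def sample_augmented_patches(image, rng, patches_per_image=5, size=32):
    """Cut random square patches and apply random flips and a 90-degree rotation."""
    height, width = image.shape[:2]
    patches = []
    for _ in range(patches_per_image):
        top = int(rng.integers(0, height - size + 1))
        left = int(rng.integers(0, width - size + 1))
        patch = image[top:top + size, left:left + size]
        if rng.random() < 0.5:
            patch = patch[:, ::-1]     # horizontal flip
        if rng.random() < 0.5:
            patch = patch[::-1, :]     # vertical flip
        if rng.random() < 0.5:
            patch = np.rot90(patch)    # 90-degree rotation
        patches.append(np.ascontiguousarray(patch))
    return patches


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(48, 64, 3), dtype=np.uint8)  # placeholder image
patches = sample_augmented_patches(image, rng)

total_training_patches = 800 * 5   # 800 DIV2K images, five patches each
print(len(patches), patches[0].shape, total_training_patches)  # 5 (32, 32, 3) 4000
```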

Tables 1 and 2 compare the peak signal-to-noise ratio, the structural similarity, and the number of parameters of the existing networks and the network according to the present invention on each test set. They show that the single image super-resolution restoration method using the super-resolution network according to the present invention (RDCAB-RecursiveSRNet) performs better than the existing methods other than RDN. As shown in Table 1, on the BSD100 test set at 2x magnification, RDN obtained 32.34 dB and the restoration method according to the present invention obtained 32.23 dB, which is 0.11 dB lower; on the Urban100 test set at 4x magnification, RDN obtained 26.61 dB and the restoration method according to the present invention obtained 26.18 dB, which is 0.43 dB lower. However, as shown in Table 2, RDN has 22M parameters whereas the network according to the present invention has 2.1M, so the network according to the present invention has about 10 times fewer parameters.
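
To put the decibel gaps quoted above in context, the sketch below computes PSNR and converts a PSNR gap into the corresponding ratio of mean squared errors. It assumes 8-bit images with a peak value of 255; the helper `mse_ratio_for_gap` is a hypothetical name introduced for this example, and the evaluation protocol actually used in the experiment (color space, border cropping) is not specified here.

```python
import numpy as np


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in decibels."""
    reference = np.asarray(reference, dtype=np.float64)
    restored = np.asarray(restored, dtype=np.float64)
    mse = np.mean((reference - restored) ** 2)
    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))


def mse_ratio_for_gap(gap_db):
    """Factor by which the MSE is larger when PSNR is lower by gap_db decibels."""
    return 10.0 ** (gap_db / 10.0)


reference = np.full((4, 4), 100.0)
restored = reference + 1.0             # every pixel off by one gray level

print(round(psnr(reference, restored), 2))   # 48.13
print(round(mse_ratio_for_gap(0.11), 4))     # about 1.0257
print(round(mse_ratio_for_gap(0.43), 4))     # about 1.1041
print(round(22 / 2.1, 1))                    # 10.5, the parameter ratio from Table 2
```

Read this way, a 0.11 dB gap corresponds to roughly 2.6% more mean squared error, and a 0.43 dB gap to roughly 10% more.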

도 4는 상기 표 1의 4배 확대 배율에 따른 Set5 테스트 셋의 성능을 파라미터의 수로 기존의 네트워크들과 비교한 것이며, 도 3를 참조하면, 본 발명의 네트워크가 본 기존의 네트워크들 보다 더 우수한 성능을 나타내며, RDN보다는 더 적은 수의 파라미터 수로 비슷한 성능을 나타냄을 확인 할 수 있다.4 is a comparison of the performance of the Set5 test set according to the 4x magnification of Table 1 with the existing networks in terms of the number of parameters. Referring to FIG. 3 , the network of the present invention is superior to the existing networks It shows performance, and it can be confirmed that it shows similar performance with fewer parameters than RDN.

Table 3 compares the total time taken to train each network and the execution time when processing the Set14 test set images at 4x magnification. The RDN network takes about 25 hours to train, whereas the network according to the present invention takes about 14 hours. The network according to the present invention scores 0.15 dB lower, but its processing speed is 1.22 times faster than that of the existing network, and on average it is 1.18 times faster across all test sets at 4x magnification.

Table 4 compares the number of parameters and the performance for different numbers of recursions on the Urban100 test set. Here, RB denotes the number of recursive blocks and RDB denotes the number of times the RDCAB is used recursively. Setting the number of recursions to 9 results in about 24,576 more parameters than setting it to 3, but the performance is 0.21 dB higher.
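Because a recursively applied block reuses the same weights, adding recursions mainly enlarges the layer that fuses the concatenated outputs. The sketch below shows one reading that happens to reproduce the figure of 24,576. It assumes 64 feature channels and a 1x1 fusing convolution, neither of which is stated in this paragraph, so it should be taken as an illustration rather than as the disclosed configuration. The function name `fusion_weights` is hypothetical.

```python
def fusion_weights(num_recursions, channels=64, kernel=1):
    """Weight count of a convolution that fuses concatenated recursion outputs.

    ASSUMPTIONS (not stated in the paragraph above): each recursion emits
    `channels` feature maps, the fused output also has `channels` maps, and
    the kernel is `kernel` x `kernel`. Biases are ignored because their
    number depends only on the output channels, not on the recursion count.
    """
    in_channels = num_recursions * channels  # Concat along the channel axis
    return in_channels * channels * kernel * kernel


extra = fusion_weights(9) - fusion_weights(3)
print(extra)  # (9 - 3) * 64 * 64
```

Under these assumptions the difference is (9 - 3) x 64 x 64 = 24,576 weights, which is small compared with the 2.1M total and is consistent with the shared weights of the recursive block contributing no additional parameters.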

Table 5 compares performance depending on whether the channel concentration technique and global residual learning, both of which are included in the network according to the present invention, are used. On the Set5 test set at 4x magnification, applying all of the techniques yields better performance than omitting any one of them.

FIGS. 5 to 7 show the results of comparing the network according to the present invention with the existing networks at different magnifications (2x in FIG. 5, 3x in FIG. 6, and 4x in FIG. 7). Bicubic interpolation produces blurred results at every magnification because of its low-frequency characteristics. At 2x magnification, the existing networks connect the letters 'y' and 'o' with a line and show degradation, whereas the network according to the present invention keeps 'y' and 'o' separate. At 3x magnification, the existing networks restore the window frames that divide the upper-left windows as bent or unrecognizable frames, whereas the network according to the present invention restores the frames that divide the five windows. At 4x magnification, the existing networks restore stripes running opposite to the actual diagonal direction, producing a checkerboard pattern, whereas the network according to the present invention restores the diagonal lines precisely.

The single image super-resolution restoration method according to the present invention described above may be implemented as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes all types of recording devices in which data readable by a computer system is stored. Examples of the computer-readable recording medium include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices. The computer-readable recording medium may also be distributed across computer systems connected through a network, so that the computer-readable code is stored and executed in a distributed manner.

The preferred embodiments of the present invention described above are disclosed for purposes of illustration. Those of ordinary skill in the art will be able to make various modifications, changes, and additions within the spirit and scope of the present invention, and such modifications, changes, and additions should be regarded as falling within the scope of the following claims.

100: single image super-resolution restoration device
110: image input unit
120: image restoration unit

Claims

1. A single image super-resolution restoration device comprising:
an image input unit that receives a low-resolution image and provides it to a super-resolution network; and
an image restoration unit that generates a high-resolution image magnified by a predefined magnification by using the predefined super-resolution network,
wherein the super-resolution network is configured such that an RDCAB (Residual Dense Channel Attention Block), which includes a residual dense structure and a channel concentration structure, is recursively repeated a predetermined number of times.

2. The device of claim 1, wherein the super-resolution network comprises, connected in series:
a feature extraction unit including first and second convolutional layers; a recursive block including the RDCAB and Concat layers; a convolutional bottleneck; a third convolutional layer; and a restoration unit.

3. The device of claim 2, wherein, in the feature extraction unit,
each of the first and second convolutional layers extracts a predefined number of feature maps,
the output of the first convolutional layer is input to the second convolutional layer and to the last layer,
and the output of the second convolutional layer is input to the recursive block.

4. The device of claim 3, wherein, in the recursive block,
the RDCAB is recursively repeated a predefined number of times, and the features output at each repetition are connected by a Concat layer.

5. The device of claim 4, wherein, in the RDCAB,
the residual dense structure and the channel concentration structure are connected in series,
the residual dense structure includes a predefined plurality of convolutional layers and Concat layers,
and the channel concentration structure includes a global average pooling layer and a sigmoid function.

6. The device of claim 2, wherein the restoration unit
performs an up-scaling process using an ESPCN (Efficient Sub-Pixel Convolutional Neural Network), and the features obtained through the up-scaling process are reconstructed into an RGB image through a convolution.

7. A single image super-resolution restoration method performed in a single image super-resolution restoration device, the method comprising:
receiving, by an image providing unit, a low-resolution image and providing it to a super-resolution network; and
generating, by an image restoration unit, a high-resolution image magnified by a predefined magnification by using the predefined super-resolution network,
wherein the super-resolution network is configured such that an RDCAB (Residual Dense Channel Attention Block), which includes a residual dense structure and a channel concentration structure, is recursively repeated a predetermined number of times.
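For illustration only, the structural elements recited in the claims (a channel concentration step built from global average pooling and a sigmoid, a recursively repeated block whose outputs are joined by a Concat step, and sub-pixel up-scaling in the manner of ESPCN) can be sketched in plain Python. This is a simplified sketch under stated assumptions, not the disclosed network: all convolutions, the residual dense structure, and the bottleneck are omitted, the gate has no learned weights, and every function name is hypothetical. Feature maps are represented as a list of channels, each channel being a list of rows.

```python
import math


def channel_attention(features):
    """Scale each channel by sigmoid(global average of that channel).

    Simplification: no learned layers sit between the pooling and the sigmoid.
    """
    out = []
    for channel in features:
        mean = sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        gate = 1.0 / (1.0 + math.exp(-mean))
        out.append([[value * gate for value in row] for row in channel])
    return out


def recursive_block(features, block, num_recursions):
    """Apply the same block repeatedly and concatenate every intermediate output."""
    collected = []
    current = features
    for _ in range(num_recursions):
        current = block(current)    # the same function (shared weights) each time
        collected.extend(current)   # Concat along the channel axis
    return collected


def pixel_shuffle(features, r):
    """Rearrange C*r*r channels of size HxW into C channels of size (H*r)x(W*r)."""
    c_out = len(features) // (r * r)
    h, w = len(features[0]), len(features[0][0])
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for y in range(h * r):
            for x in range(w * r):
                src = c * r * r + (y % r) * r + (x % r)
                out[c][y][x] = features[src][y // r][x // r]
    return out


# Toy input: 4 channels of size 2x2 (values are arbitrary).
low_res = [[[1.0, 2.0], [3.0, 4.0]] for _ in range(4)]
fused = recursive_block(low_res, channel_attention, num_recursions=3)
upscaled = pixel_shuffle(fused, r=2)
print(len(fused), len(upscaled), len(upscaled[0]), len(upscaled[0][0]))
```

In this toy run, three recursions over four channels give twelve concatenated channels, and a 2x sub-pixel rearrangement turns them into three channels of twice the height and width. In the claimed device, a convolutional bottleneck and a third convolutional layer would sit between the recursive block and the restoration unit.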