KR102423047B1

KR102423047B1 - Pre-Processing Apparatus And Method For Super-Resolution Device Implemented In Hardware

Info

Publication number: KR102423047B1
Application number: KR1020200153527A
Authority: KR
Inventors: 정성욱; 이수민; 주성환
Original assignee: 연세대학교 산학협력단
Priority date: 2020-11-17
Filing date: 2020-11-17
Publication date: 2022-07-19
Also published as: KR20220067142A

Abstract

The present invention provides a pre-processing apparatus and method for a super-resolution device that is implemented as an artificial neural network including N sequentially connected convolutional layers, each having a plurality of filters of a predetermined size, and that performs image upscaling. The pre-processing apparatus includes: an integrated padding size calculation unit that calculates an integrated padding area size representing the total amount by which an image input to the super-resolution device is reduced by image shrinking in the convolutional layers; an integrated padding unit that performs integrated padding by expanding the input image to be upscaled, in a single step, by half the integrated padding area size in every direction along its periphery; and a padding area filling unit that fills, in a predetermined manner, the pixel value of each pixel of the padding area, which is the area expanded by the integrated padding. This simplifies the data flow of a super-resolution device implemented in hardware, making it easier to design and capable of high-speed operation, and allows the information in the edge region of the input image to be sufficiently reflected.

Description

Pre-Processing Apparatus And Method For Super-Resolution Device Implemented In Hardware

The present invention relates to a pre-processing apparatus and method for a super-resolution device, and more particularly to a pre-processing apparatus and method for a super-resolution device implemented in hardware.

With the development of artificial neural networks, they are being actively used in various fields. Artificial neural networks are currently applied in a wide range of fields such as image processing, object classification, object detection, speech recognition, and natural language processing, and their range of application continues to expand.

Meanwhile, recent advances in imaging technology have increased the demand for high-resolution images, and upscaling technology that converts a low-resolution image into an ultra-high-resolution image is attracting attention accordingly. Various algorithms have been proposed for image upscaling, but the super-resolution technique, which performs the computation with an artificial neural network, is known to perform better than algorithms based on simple arithmetic operations.

FIG. 1 shows an example of a super-resolution device composed of an artificial neural network, and FIG. 2 is a diagram for explaining the change in image size caused by the convolution operation of a filter.

FIG. 1 shows an FSRCNN (Fast Super-Resolution Convolutional Neural Network), which is widely used in super-resolution techniques.

In super-resolution techniques, the artificial neural network is based on a CNN (Convolutional Neural Network), which performs convolution operations. However, a CNN requires a large amount of memory and computation to improve image quality, and its upscaling speed is slow. For this reason, the SRCNN (Super-Resolution Convolutional Neural Network), which has a simple structure with a small number of convolutional layers, and the FSRCNN shown in FIG. 1 have been proposed and are now mainly used. Because the structure of SRCNN and FSRCNN is not complex, they can run quickly even on user terminals such as TVs and smartphones.

Referring to FIG. 1, the FSRCNN includes a plurality of sequentially connected convolutional layers (Conv.Layer 1 to Conv.Layer 5), each having a plurality of filters of a predetermined size, and sequentially extracts feature maps from the input image. The feature map output from the final convolutional layer (Conv.Layer 5) is then remapped to obtain an output image upscaled to twice the resolution of the input image.

Here, the filters of each convolutional layer (Conv.Layer 1 to Conv.Layer 5) perform a convolution operation with the input image or with the feature map output from the previous layer.

If the filter is larger than 1 × 1, image shrinking occurs, as shown in FIG. 2: owing to the nature of the convolution operation, the output image is smaller than the input image. Since most artificial neural networks include a plurality of sequentially connected convolutional layers, the input image keeps shrinking as it passes through each convolutional layer. In FIG. 1, the first, third, and fifth of the five convolutional layers (Conv.Layer 1, Conv.Layer 3, and Conv.Layer 5) have filters of size 5 × 5, 3 × 3, and 3 × 3, respectively, and therefore cause image shrinking.
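The cumulative shrinking described above can be traced with a short Python sketch. The function names are hypothetical, stride 1 and no padding are assumed, and the second and fourth layers are taken to use 1 × 1 filters, which the text implies but does not state.

```python
def conv_output_length(l_in: int, kernel: int, stride: int = 1) -> int:
    """Output length of an unpadded ("valid") convolution along one axis."""
    if kernel > l_in:
        raise ValueError("filter is longer than the input")
    return (l_in - kernel) // stride + 1


def trace_lengths(l_in: int, kernels: list) -> list:
    """Image length after each layer when the layers run in sequence."""
    lengths = [l_in]
    for k in kernels:
        lengths.append(conv_output_length(lengths[-1], k))
    return lengths


# Filter lengths for the five layers of FIG. 1. Layers 2 and 4 are
# assumed to be 1 x 1; the text only gives layers 1, 3 and 5.
FIG1_KERNELS = [5, 1, 3, 1, 3]

print(trace_lengths(32, FIG1_KERNELS))  # [32, 28, 28, 26, 26, 24]
```

Under these assumptions a 32-pixel side loses 8 pixels by the time it leaves the fifth layer, with the 1 × 1 layers contributing nothing.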

An artificial neural network that performs object classification or object detection is configured to extract object features from an image, so image shrinking is not a problem there. A super-resolution technique, by contrast, is configured to output an image whose resolution is twice that of the input image, so image shrinking, which reduces the image size, must be prevented.

In addition, when a convolution operation is performed between an image and a filter larger than 1 × 1, the nature of the convolution means that relatively fewer operations are performed on the edge region of the image than on other regions. As a result, the artificial neural network cannot sufficiently reflect the edge information of the image.
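The imbalance between edge and interior pixels can be made concrete by counting, for one image row, how many filter positions touch each pixel. This is a minimal sketch with a hypothetical helper name, assuming an unpadded stride-1 convolution.

```python
def coverage_counts(length: int, kernel: int) -> list:
    """Number of filter positions that include each pixel of a 1-D row
    under an unpadded, stride-1 convolution."""
    counts = [0] * length
    for start in range(length - kernel + 1):  # every valid filter position
        for offset in range(kernel):
            counts[start + offset] += 1
    return counts


print(coverage_counts(8, 3))  # [1, 2, 3, 3, 3, 3, 2, 1]
```

The outermost pixel takes part in a single filter position while interior pixels take part in three, which is the under-representation of edge information that the paragraph describes.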

To prevent the image shrinking caused by convolution with the filters in the convolutional layers of an artificial neural network, and to avoid the difference in the amount of computation between the edge region and the remaining region described above, padding techniques were introduced. Padding extends the image by adding pixels filled with specified data outside its edges, by as much as the image would be reduced.

The padding technique generally used is zero-padding, in which pixels with a data value of 0 are added outside the edges of the image.

However, although zero-padding prevents the image shrinking that reduces the image size, each added pixel has a data value of 0 that carries no image information, so the information of the edge region is still not sufficiently reflected.

Padding techniques can also be divided, according to when the padding is applied, into pre-processing padding (pre-padding) and post-processing padding (post-padding), as shown in FIG. 3.

FIG. 3 is a diagram for explaining conventional padding techniques for an artificial neural network.

In FIG. 3, (a) shows the pre-processing padding technique and (b) shows the post-processing padding technique.

Referring to (a), the pre-processing padding technique pads the input image (or feature map) applied to each convolutional layer in advance, taking into account the amount by which it will be reduced by image shrinking, and then convolves the padded image with the filter. The output image (or feature map) computed by each convolutional layer thus keeps the same size as the input image (or feature map).

In contrast, as shown in (b), the post-processing padding technique convolves the input image (or feature map) applied to each convolutional layer with the filter as it is to obtain an output image (or feature map), and then pads the obtained output by the amount it was reduced by image shrinking, so that the final image keeps the same size as the input image (or feature map).

However, the post-processing padding technique shown in (b) aggravates the long-dependency phenomenon, in which the image becomes more blurred as the number of convolutional layers on which padding is performed increases. For this reason, pre-processing padding, in which padding is performed immediately before each convolutional layer, is currently preferred.

When an artificial neural network is implemented in software, each layer performs its own operation only after all operations of the previous layer have finished. Even with pre-processing padding, therefore, neither the allocation of storage for the data enlarged by padding nor the handling of the data flow of the whole network becomes more complex. When the network is implemented in hardware, however, constraints such as the number of available operators and the memory capacity mean that each convolutional layer does not convolve all pixels of its input image (or feature map) at once, but processes it locally in units of a size the hardware can accommodate. Consequently, whenever a convolutional layer convolves the input image (or feature map) with a filter, it must always determine, case by case, whether the region currently being computed falls within the region subject to pre-processing padding. This greatly increases the complexity of the hardware implementation.

Korean Laid-Open Patent Publication No. 10-2019-0062305 (published June 5, 2019)

An object of the present invention is to provide a pre-processing apparatus and method for a super-resolution device that simplify the data flow of a super-resolution device implemented in hardware, so that the device is easy to design and capable of high-speed operation.

Another object of the present invention is to provide a pre-processing apparatus and method for a super-resolution device that allow the information in the edge region of the input image to be sufficiently reflected.

To achieve the above objects, a pre-processing apparatus according to an embodiment of the present invention is a pre-processing apparatus for a super-resolution device that is implemented as an artificial neural network including N sequentially connected convolutional layers, each having a plurality of filters of a predetermined size, and that performs image upscaling. The apparatus includes: an integrated padding size calculation unit that calculates an integrated padding area size representing the total amount by which an image input to the super-resolution device is reduced by image shrinking in the convolutional layers; an integrated padding unit that performs integrated padding by expanding the input image to be upscaled, in a single step, by half the integrated padding area size in every direction along its periphery; and a padding area filling unit that fills, in a predetermined manner, the pixel value of each pixel of the padding area, which is the area expanded by the integrated padding.

The padding area filling unit may first fill the corresponding region of the padding area with the pixel values of a predetermined number of rows and columns at each edge of the input image, and fill the non-corresponding region with a pixel value of 0.

The padding area filling unit may copy the pixel values of the pixels in the single row and single column located at each edge of the input image, and repeatedly fill the corresponding padding area with the copied row and column pixel values.

The padding area filling unit may copy, at each edge of the input image, rows and columns in a number corresponding to the size of the padding area, and fill the corresponding padding area with the pixel values of the copied rows and columns.

The padding area filling unit may copy, at each edge of the input image, rows and columns in a number corresponding to the size of the padding area, and fill the padding area with the pixel values of the copied rows and columns in reversed order, so that they form a pattern symmetric to the corresponding pixel region.

The integrated padding size calculation unit may calculate the size of the integrated padding area according to an equation based on the number (α) of convolutional layers having filters larger than 1 × 1, the length (K_i) of the filter included in each convolutional layer, the input image length (L_in), and the output image length (L_out).

The super-resolution device may be implemented in hardware including a plurality of operators and memory.

To achieve the above objects, a pre-processing method according to another embodiment of the present invention is a pre-processing method for a super-resolution device that is implemented as an artificial neural network including N sequentially connected convolutional layers, each having a plurality of filters of a predetermined size, and that performs image upscaling. The method includes: calculating an integrated padding area size representing the total amount by which an image input to the super-resolution device is reduced by image shrinking in the convolutional layers; performing integrated padding by expanding the input image to be upscaled, in a single step, by half the integrated padding area size in every direction along its periphery; and filling, in a predetermined manner, the pixel value of each pixel of the padding area, which is the area expanded by the integrated padding.

Accordingly, the pre-processing apparatus and method for a super-resolution device according to embodiments of the present invention calculate in advance the total size that must be padded across the whole super-resolution device implemented as an artificial neural network, apply integrated pre-processing padding of that size in a single step, and then supply the result as the input of the super-resolution device. Because the individual convolutional layers of the super-resolution device no longer need to consider padding, the data storage allocation scheme and the data flow can be simplified. The hardware structure can therefore be simplified, which reduces cost and improves operation speed. In addition, filling the integrated pre-processing padding area with the values of pixels located in an edge region of a predetermined size in the input image allows the feature information of the edge region to be reflected at a level similar to that of the remaining region.

FIG. 1 shows an example of a super-resolution device composed of an artificial neural network.
FIG. 2 is a diagram for explaining the change in image size caused by the convolution operation of a filter.
FIG. 3 is a diagram for explaining conventional padding techniques for an artificial neural network.
FIG. 4 shows a schematic structure of a pre-processing apparatus for a super-resolution device according to an embodiment of the present invention.
FIG. 5 is a diagram for explaining the concept of integrated padding of an input image by the integrated padding unit of FIG. 4.
FIG. 6 is a diagram for explaining how the padding area filling unit of FIG. 4 fills the padding area with pixel values.
FIG. 7 is a diagram comparing the operation of the conventional pre-processing padding technique with that of the integrated pre-processing padding technique of the present embodiment.
FIG. 8 is a graph comparing the performance of the super-resolution device according to the padding method.
FIG. 9 shows a pre-processing method for a super-resolution device according to an embodiment of the present invention.

To fully understand the present invention, its operational advantages, and the objects achieved by practicing it, reference should be made to the accompanying drawings, which illustrate preferred embodiments of the present invention, and to the contents described in those drawings.

Hereinafter, the present invention is described in detail by describing preferred embodiments with reference to the accompanying drawings. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described. Parts irrelevant to the description are omitted for clarity, and the same reference numerals in the drawings denote the same members.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. In addition, terms such as "... unit", "...er", "module", and "block" used in the specification denote a unit that processes at least one function or operation, which may be implemented in hardware, in software, or in a combination of hardware and software.

FIG. 4 shows a schematic structure of a pre-processing apparatus for a super-resolution device according to an embodiment of the present invention, FIG. 5 is a diagram for explaining the concept of integrated padding of an input image by the integrated padding unit of FIG. 4, and FIG. 6 is a diagram for explaining how the padding area filling unit of FIG. 4 fills the padding area with pixel values.

Referring to FIG. 4, the pre-processing apparatus 100 of this embodiment receives an input image IN, pads the input image IN in consideration of the amount by which it is reduced by image shrinking in the super-resolution device 200, and passes the resulting padded image PIN to the super-resolution device 200, which is implemented as an artificial neural network.

The pre-processing apparatus 100 includes an integrated padding size calculation unit 110, an integrated padding unit 120, and a padding area filling unit 130.

The integrated padding size calculation unit 110 calculates the amount by which the image is reduced by image shrinking in the super-resolution device 200, that is, the size of the padding area by which the input image IN must be extended.

To this end, the integrated padding size calculation unit 110 first identifies, in the super-resolution device 200 implemented as an artificial neural network with N convolutional layers, the number (α) of convolutional layers having filters larger than 1 × 1, together with the filter size and the stride size (s) of each of those layers.

It then checks the image size (L_in) input to each of the α identified convolutional layers and determines the image size (L_out) that each layer outputs. In an artificial neural network, the input and output of each convolutional layer are treated as two- or three-dimensional matrices, but in a hardware implementation they are stored in physical memory as one-dimensional vectors. The input image size (L_in) and the output image size (L_out) may therefore also be called the input length and the output length. Likewise, the filter size (K_n) may also be called the filter length.

When K_n denotes the filter length of the n-th of the α convolutional layers having filters larger than 1 × 1, the output length (L_out) corresponding to the input length (L_in) of the n-th convolutional layer can be calculated by Equation 1.
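The equation image is not reproduced in this text. Assuming the standard relation for an unpadded convolution with stride s, a form consistent with the symbols defined above would be the following; it is a reconstruction under that assumption, not the patent's verbatim Equation 1.

$$ L_{out} = \frac{L_{in} - K_n}{s} + 1 $$

With s = 1 this reduces to L_out = L_in - (K_n - 1), so a layer with filter length K_n removes K_n - 1 pixels along each dimension.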

Extending Equation 1 to each of the α convolutional layers gives the expression in Equation 2.

If the stride size (s) is assumed to be 1 in all α convolutional layers, the amount (ΔL) by which the image is reduced by image shrinking in the super-resolution device 200 can be calculated as in Equation 3.
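The image of Equation 3 is also missing from this text. Under the stated assumption s = 1, each of the α layers removes K_n - 1 pixels per dimension, so a reconstruction consistent with the surrounding definitions (an assumption, not the verbatim equation) would be:

$$ \Delta L = L_{in} - L_{out} = \sum_{n=1}^{\alpha} \left( K_n - 1 \right) $$

Here L_in and L_out are taken as the length at the input of the first of the α layers and at the output of the last. For filter lengths 5, 3, and 3, as given for FIG. 1, this yields ΔL = 4 + 2 + 2 = 8.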

The size (ΔL) calculated according to Equation 3 is, conversely, the amount by which the pre-processing apparatus 100 must extend the input image IN in a single integrated step, and is therefore referred to here as the integrated padding size.
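As a rough illustration of the calculation described above, the following Python sketch sums the per-layer shrink under the stride-1 assumption. The function names are hypothetical, and the filter list takes the sizes given for FIG. 1 (5, 3, 3) while assuming that the second and fourth layers use 1 × 1 filters.

```python
def integrated_padding_size(filter_lengths):
    """Total shrink (delta L) per dimension, assuming stride 1 everywhere.

    A filter of length K removes K - 1 pixels; 1 x 1 filters remove none.
    """
    return sum(k - 1 for k in filter_lengths if k > 1)


def padding_per_side(filter_lengths):
    """Half of delta L, i.e. the margin added on each side of the image."""
    delta = integrated_padding_size(filter_lengths)
    if delta % 2:
        # Odd totals (even-length filters) are outside this sketch's scope.
        raise ValueError("delta L must be even to split equally")
    return delta // 2


# Assumed filter lengths for the five layers of FIG. 1.
fig1_filters = [5, 1, 3, 1, 3]
print(integrated_padding_size(fig1_filters))  # 8
print(padding_per_side(fig1_filters))         # 4
```

The calculation depends only on the network structure, not on the image content, so in this sketch it would need to run only once per network configuration.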

Once the integrated padding size calculation unit 110 has calculated the integrated padding size (ΔL), the integrated padding unit 120 pads the input image IN by the calculated integrated padding size (ΔL) all at once. As shown in FIG. 5, the integrated padding unit 120 pads the two-dimensional input image IN by half the integrated padding size (ΔL/2) in every outward direction, because the convolutional layers shrink the input image uniformly from all sides.
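The one-step expansion by ΔL/2 on every side can be sketched in Python as follows. The function name and the list-of-rows image representation are assumptions made for illustration; the new pixels receive a placeholder value because choosing their contents is described as a separate filling step.

```python
def integrated_pad(image, delta_l, fill=0):
    """Extend a 2-D image (list of rows) by delta_l // 2 on every side.

    The new pixels hold `fill` as a placeholder; choosing their values
    is left to a separate filling step.
    """
    pad = delta_l // 2
    width = len(image[0]) + 2 * pad
    padded = [[fill] * width for _ in range(pad)]          # top margin
    for row in image:
        padded.append([fill] * pad + list(row) + [fill] * pad)
    padded.extend([fill] * width for _ in range(pad))      # bottom margin
    return padded


image = [[1, 2, 3],
         [4, 5, 6]]
padded = integrated_pad(image, delta_l=4)
print(len(padded), len(padded[0]))  # 6 7
```

Each dimension grows by ΔL in total, so a network that shrinks its input by ΔL would return an output map of the original size.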

The padding area filling unit 130 fills each pixel of the padding area, which was extended in one step by the integrated padding unit 120, with a pixel value in a predetermined manner to obtain the padded image PIN.

The padding area filling unit 130 may fill all pixels of the padding area with a pixel value of 0 according to the zero-padding technique. As described above, however, zero-padding has the limitation that the super-resolution device 200 cannot sufficiently reflect the information of the edge region of the input image IN.

Accordingly, this embodiment proposes other techniques by which the padding area filling unit 130 fills the padding area, so that the super-resolution device 200 can sufficiently reflect the information of the edge region of the input image IN.

In FIG. 6, (a) shows the zero-padding technique, which fills the padding area with a pixel value of 0, and (b) to (d) show the three different filling techniques proposed in this embodiment: (b) the edge copy technique, (c) the border copy technique, and (d) the border mirror technique.

In FIG. 6, the red hatched region is the part of the padding area filled with a pixel value of 0. The blue hatched regions are the part of the input image containing the pixel values used for filling and the part of the padding area filled with those pixel values.

As shown in (a) of FIG. 6, in the zero-padding technique every pixel of the padding area has a pixel value of 0 regardless of the input image IN, so the padding area consists only of the red hatched region, with no blue hatched region.

In the edge copy technique of (b), by contrast, the pixel values of the pixels on the edge line of each side of the input image IN are copied line by line and extended outward. That is, as shown in (b), the padding area is filled by repeatedly copying and pasting the pixel values of the outermost rows and outermost columns of the input image IN. Because the padding area surrounds the input image IN, however, it has more pixels along the x-axis and y-axis than the input image IN. Even when the edge line of each side is copied outward, therefore, the four regions of the padding area that lie diagonally off the corners of the input image remain unfilled. The edge copy technique fills the pixels of these corner regions, which the outermost edge lines do not reach, with a pixel value of 0, as in zero-padding.
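The edge copy filling described above can be sketched in Python as follows. The function name and the list-of-rows image representation are assumptions for illustration, and `pad` stands for the per-side padding width (ΔL/2).

```python
def edge_copy_pad(image, pad):
    """Pad a 2-D image by repeating its outermost row/column outward.

    Corner regions, which no edge line reaches, are left at 0.
    """
    h, w = len(image), len(image[0])
    out = [[0] * (w + 2 * pad) for _ in range(h + 2 * pad)]
    for y in range(h):                       # copy the original image
        for x in range(w):
            out[y + pad][x + pad] = image[y][x]
    for p in range(pad):
        for x in range(w):
            out[p][x + pad] = image[0][x]                # top strip
            out[h + pad + p][x + pad] = image[h - 1][x]  # bottom strip
        for y in range(h):
            out[y + pad][p] = image[y][0]                # left strip
            out[y + pad][w + pad + p] = image[y][w - 1]  # right strip
    return out


for row in edge_copy_pad([[1, 2], [3, 4]], pad=2):
    print(row)
# [0, 0, 1, 2, 0, 0]
# [0, 0, 1, 2, 0, 0]
# [1, 1, 1, 2, 2, 2]
# [3, 3, 3, 4, 4, 4]
# [0, 0, 3, 4, 0, 0]
# [0, 0, 3, 4, 0, 0]
```

Note that this differs from the common "replicate" padding mode of image libraries, which also fills the corners with the nearest corner pixel.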

In the boundary copy technique of (c), as in the edge copy technique, pixel values are copied from the input image IN into the padding area on the outer side of each side; here, however, as many rows or columns as the padding area is wide are copied from each side. When, as in (c), the padding area extends outward from each side of the input image IN by four rows or columns, the boundary copy technique copies the pixel values of the four rows or columns at the edge of each side and fills the four rows or columns of the padding area with them in the order in which they were copied. In other words, a border region of predetermined size is copied from the input image IN and placed unchanged into the padding area. As in the edge copy technique, the pixels of the corner regions that are not filled with pixel values of the input image IN are set to 0.
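A minimal sketch of the boundary copy fill follows, again for a single-channel image. The name `boundary_copy_pad` and the parameter `p` (padding width per side) are assumptions made for illustration, and the sketch assumes `p` does not exceed the image height or width.

```python
import numpy as np

def boundary_copy_pad(image: np.ndarray, p: int) -> np.ndarray:
    """Pad a 2-D image by p pixels per side using boundary copy (hypothetical helper).

    The p rows or columns nearest each side are copied into the adjacent
    padding band in their original order; the corner blocks stay 0.
    """
    h, w = image.shape
    if p > min(h, w):
        raise ValueError("p must not exceed the image height or width")
    out = np.zeros((h + 2 * p, w + 2 * p), dtype=image.dtype)
    out[p:p + h, p:p + w] = image
    out[:p, p:p + w] = image[:p, :]          # top band: first p rows, same order
    out[p + h:, p:p + w] = image[h - p:, :]  # bottom band: last p rows, same order
    out[p:p + h, :p] = image[:, :p]          # left band: first p columns
    out[p:p + h, p + w:] = image[:, w - p:]  # right band: last p columns
    return out

image = np.arange(1, 17).reshape(4, 4)
padded = boundary_copy_pad(image, 2)
```

Because the copied rows keep their order, the image border pattern simply repeats in the padding band rather than being reflected.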

The boundary mirror technique of (d), like the boundary copy technique, copies the pixel values of as many rows or columns as the padding area is wide from each side of the input image IN and fills the padding area on the outer side of that side with them. The boundary copy technique fills the padding area in the order in which the rows or columns were copied from the input image IN, so the same border region repeats. The boundary mirror technique instead reverses the order of the copied rows or columns before filling the padding area. As a result, the rows or columns of the input image IN and the corresponding padding area have pixel values that are symmetric about each side of the input image IN. In the boundary mirror technique as well, the pixels of the corner regions that are not filled with pixel values of the input image IN are set to 0.
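The boundary mirror fill differs from boundary copy only in reversing the copied strip. The sketch below illustrates this for a single-channel image; `boundary_mirror_pad` and `p` are hypothetical names introduced here, and `p` is assumed not to exceed the image height or width.

```python
import numpy as np

def boundary_mirror_pad(image: np.ndarray, p: int) -> np.ndarray:
    """Pad a 2-D image by p pixels per side using boundary mirror (hypothetical helper).

    The p rows or columns nearest each side are copied in reversed order,
    so the padding band is a reflection about that side; corners stay 0.
    """
    h, w = image.shape
    if p > min(h, w):
        raise ValueError("p must not exceed the image height or width")
    out = np.zeros((h + 2 * p, w + 2 * p), dtype=image.dtype)
    out[p:p + h, p:p + w] = image
    out[:p, p:p + w] = image[:p, :][::-1]             # top band, rows reversed
    out[p + h:, p:p + w] = image[h - p:, :][::-1]     # bottom band, rows reversed
    out[p:p + h, :p] = image[:, :p][:, ::-1]          # left band, columns reversed
    out[p:p + h, p + w:] = image[:, w - p:][:, ::-1]  # right band, columns reversed
    return out

image = np.arange(1, 17).reshape(4, 4)
padded = boundary_mirror_pad(image, 2)
```

The side bands match what `np.pad(image, p, mode="symmetric")` produces; the difference is that `np.pad` also reflects into the corners, whereas the text specifies zero-filled corners.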

Unlike the zero padding technique, the edge copy, boundary copy, and boundary mirror techniques shown in (b) to (d) all fill the padding area with pixel values from the actual edge region of the input image IN, so that the information in the edge region of the input image IN is sufficiently reflected in the convolution operations of the super-resolution device 200.

The super-resolution device 200 is implemented as an artificial neural network that includes a plurality of sequentially connected convolutional layers, each having a plurality of filters of a predetermined size, and that sequentially extracts feature maps from an input image. It outputs an output image OUT with a predetermined resolution higher than that of the input image IN. The super-resolution device 200 may be implemented as an FSRCNN as in FIG. 1, but may also be implemented as an SRCNN or another type of artificial neural network.

That is, in this embodiment, unlike conventional devices, the super-resolution device 200 does not receive the input image IN directly. Instead, it receives the padded image PIN pre-processed by the pre-processing device 100 and, in the manner learned in advance, obtains from the padded image PIN an output image OUT with a higher resolution than the input image IN and outputs it.

Here, the super-resolution device 200 performs its convolution operations without taking into account the successive size reduction caused by the image shrink that occurs in the convolutional layers according to filter size. That is, it performs no pre-processing padding or post-processing padding on a per-layer basis.

Instead, the pre-processing device 100 calculates the total size reduction caused by image shrink in the convolutional layers of the super-resolution device 200 and, as shown in FIG. 5, applies integrated padding of the calculated size to the input image IN in a single pass beforehand. In other words, the pre-processing device 100 obtains a padded image PIN in which the image shrink occurring in the super-resolution device 200 has been compensated in advance on the input image IN, and passes the padded image PIN to the super-resolution device 200. The super-resolution device 200 can then obtain an output image OUT of the required size by performing only its specified operations, with no padding at each convolutional layer to compensate for image shrink.
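The effect of padding once up front can be checked with a small NumPy experiment. This is a sketch under stated assumptions, not the patent's network: the filter lengths `[5, 1, 3]` are hypothetical, every layer is an unpadded stride-1 convolution, uniform averaging kernels stand in for trained weights, and zero fill is used for simplicity.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def valid_conv2d(x: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Unpadded, stride-1 2-D convolution (cross-correlation form)."""
    windows = sliding_window_view(x, kernel.shape)   # (H-K+1, W-K+1, K, K)
    return np.einsum("ijkl,kl->ij", windows, kernel)

filter_lengths = [5, 1, 3]                      # hypothetical layer filter sizes
delta_l = sum(k - 1 for k in filter_lengths)    # total shrink: 4 + 0 + 2 = 6

image = np.random.default_rng(0).random((16, 16))
padded = np.pad(image, delta_l // 2)            # one zero-filled pad of 3 per side

features = padded
for k in filter_lengths:
    # No per-layer padding: each layer shrinks the map by k - 1.
    features = valid_conv2d(features, np.full((k, k), 1.0 / (k * k)))

print(padded.shape, features.shape)
```

The feature map shrinks from 22 to 18 to 18 to 16 pixels per side, so the final map has the same size as the unpadded input even though no layer performs padding.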

FIG. 7 is a diagram comparing the operation of the conventional pre-processing padding technique with that of the integrated pre-processing padding technique of this embodiment.

In FIG. 7, (a) shows the conventional pre-processing padding technique, and (b) shows the integrated pre-processing padding technique according to this embodiment.

As shown in FIG. 7(a), a conventional super-resolution device pads at every convolutional layer, so the padding that adjusts the feature map size must be carried out in several separate passes between the sequentially connected convolutional layers. Moreover, to perform a convolution operation on a feature map whose size has been changed by padding, the positional relationship between the pixels of the original feature map and the pixels of the extended padding area must be taken into account at all times. This problem arises from the way the input image or feature map is stored in the memory of a super-resolution device implemented in hardware, together with the limited number of arithmetic units available. Building a super-resolution device in hardware so that it always accounts for this positional relationship not only makes the design very complex but also increases the number of stages where bottlenecks can occur, which is likely to slow operation.

In contrast, in this embodiment shown in (b), the super-resolution device receives the padded image PIN that the pre-processing device 100 has already padded in an integrated manner, so it need not perform padding itself. It simply performs its specified operations, without the changes to the memory storage scheme or operation order that a padding area would otherwise require. The hardware structure can therefore be made very simple, and operating speed can be improved.

For convenience of description, the pre-processing device 100 and the super-resolution device 200 are shown here as separate devices, but the pre-processing device 100 of this embodiment may be included in the super-resolution device 200.

FIG. 8 is a set of graphs comparing the performance of the super-resolution device under different padding schemes.

In FIG. 8, (a) to (c) compare the average peak signal-to-noise ratio (PSNR) of output images OUT obtained with different data sets, and (d) to (f) compare the structural similarity index (SSIM). In each of (a) to (f), 1 denotes the conventional scheme that performs pre-processing padding at each convolutional layer, and 2 to 5 denote the integrated padding scheme with the zero padding, edge copy, boundary copy, and boundary mirror techniques, respectively.
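For reference, PSNR is conventionally computed from the mean squared error between a reference image and a test image. The sketch below uses the standard definition; the peak value of 255 (8-bit images) and the function name `psnr` are assumptions, since the text does not state which peak value, colour channel, or data sets were used for FIG. 8.

```python
import numpy as np

def psnr(reference: np.ndarray, test: np.ndarray, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    ref = np.asarray(reference, dtype=np.float64)
    tst = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - tst) ** 2)
    if mse == 0:
        return float("inf")      # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

reference = np.zeros((4, 4))
off_by_one = np.ones((4, 4))     # every pixel differs by 1, so MSE = 1
value = psnr(reference, off_by_one)
```

A higher PSNR means the output image is closer to the reference; an average over a data set would simply be the mean of this value over its image pairs.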

Referring to FIG. 8, the edge copy, boundary copy, and boundary mirror techniques outperform the zero padding technique and, although the results vary with the data set, perform similarly to the conventional pre-processing padding scheme.

Therefore, the pre-processing device of this embodiment can also be applied when the super-resolution device is implemented in software, and the pre-processing device itself may be implemented in software. Its advantage is greater, however, when the super-resolution device is implemented in hardware rather than software: it can then greatly improve the hardware efficiency of the super-resolution device while maintaining performance.

FIG. 9 shows a pre-processing method for a super-resolution device according to an embodiment of the present invention.

The pre-processing method of FIG. 9 is described with reference to FIGS. 4 to 6. First, the size (ΔL) of the integrated padding area by which the input image IN is to be expanded is calculated, taking into account the size reduction caused by image shrink in the super-resolution device 200 (S10). For a super-resolution device 200 implemented as an artificial neural network with N convolutional layers, the size of the integrated padding area can be calculated from the number (α) of convolutional layers with filters larger than 1 × 1, the length (K_n) of the filters in each convolutional layer, the stride size (s), the input image length (L_in), and the output image length (L_out).
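As a rough illustration of step S10, the sketch below computes the total shrink for the simplest case only: unpadded convolutions with stride 1, where a filter of length K reduces the image length by K - 1. This is an assumption made for illustration and is not the patent's own formula, which also involves the stride (s) and the input and output image lengths and is not reproduced in this text. The function name and the example filter lengths are hypothetical.

```python
from typing import Sequence, Tuple

def integrated_padding_size(filter_lengths: Sequence[int]) -> Tuple[int, int]:
    """Return (alpha, delta_l) assuming stride-1, unpadded convolutions.

    alpha   : number of layers whose filter is larger than 1 x 1
    delta_l : total reduction in image length, the sum of (K - 1) over those layers
    """
    shrinking = [k for k in filter_lengths if k > 1]
    alpha = len(shrinking)
    delta_l = sum(k - 1 for k in shrinking)
    return alpha, delta_l

# Hypothetical network: 5x5, 1x1, 3x3, 3x3 and 1x1 filters.
alpha, delta_l = integrated_padding_size([5, 1, 3, 3, 1])
per_side = delta_l // 2   # amount to extend on each side of the image
```

With these example values, three layers shrink the image, the total shrink is 8 pixels, and the image would be extended by 4 pixels on each side. Layers with 1 × 1 filters contribute nothing, which is why only the α layers with larger filters need to be counted.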

Once the size (ΔL) of the integrated padding area has been calculated, the input image IN is expanded outward in all directions by half that size (ΔL/2), so that the integrated padding is applied in a single pass (S20).
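Step S20 only enlarges the canvas; the pixel values of the new area are assigned afterwards. A minimal sketch, assuming a single-channel image and an even ΔL so that ΔL/2 is a whole number of pixels (the function name `expand_canvas` and the boolean mask are illustrative additions, not part of the patent):

```python
import numpy as np

def expand_canvas(image: np.ndarray, delta_l: int):
    """Extend a 2-D image by delta_l / 2 pixels on every side (hypothetical helper).

    Returns the enlarged canvas (padding area initialised to 0) and a boolean
    mask that is True for padding-area pixels still to be filled.
    """
    if delta_l % 2:
        raise ValueError("delta_l is assumed to be even in this sketch")
    p = delta_l // 2
    h, w = image.shape
    canvas = np.zeros((h + delta_l, w + delta_l), dtype=image.dtype)
    canvas[p:p + h, p:p + w] = image
    is_padding = np.ones(canvas.shape, dtype=bool)
    is_padding[p:p + h, p:p + w] = False
    return canvas, is_padding

canvas, is_padding = expand_canvas(np.ones((4, 6)), delta_l=8)
```

The mask makes explicit which pixels belong to the padding area and therefore still need values from the chosen filling scheme.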

Then the pixels of the padding area extended around the input image IN are filled with pixel values according to a predetermined filling scheme (S30).

Here, the pixel values of the padding area may be filled by any one of the edge copy, boundary copy, and boundary mirror techniques. In the edge copy technique, the pixel values of the single row or single column at each edge of the input image IN are copied and pasted repeatedly into the corresponding part of the padding area.

In the boundary copy technique, the pixel values of the rows and columns at each edge of the input image IN, numbering half the integrated padding size (ΔL/2), are copied and pasted into the corresponding part of the padding area in the same pattern.

Finally, in the boundary mirror technique, the pixel values of the rows and columns at each edge of the input image IN, numbering half the integrated padding size (ΔL/2), are copied, reversed so that they form a pattern symmetric to the source region, and pasted into the corresponding part of the padding area.

In the edge copy, boundary copy, and boundary mirror techniques, the pixels of the corner regions that have not been filled with pixel values are set to 0.

Once every pixel of the padding area extended around the input image IN has been filled and the padded image PIN has been obtained, the padded image PIN is passed to the super-resolution device (S40). The super-resolution device can then obtain an upscaled super-resolution image by performing its predetermined operations, including convolution, without taking into account the successive size reduction caused by the image shrink that occurs in the convolutional layers according to filter size.

The method according to the present invention may be implemented as a computer program stored on a medium for execution on a computer. Here, the computer-readable medium may be any available medium accessible by a computer and may include all computer storage media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data, and may include ROM (read-only memory), RAM (random access memory), CD (compact disc)-ROM, DVD (digital video disc)-ROM, magnetic tape, floppy disks, optical data storage devices, and the like.

The present invention has been described with reference to the embodiments shown in the drawings, but these are merely illustrative, and those of ordinary skill in the art will understand that various modifications and other equivalent embodiments are possible.

Accordingly, the true scope of technical protection of the present invention should be determined by the technical spirit of the appended claims.

100: pre-processing device 110: integrated padding size calculation unit
120: integrated padding unit 130: padding area filling unit
200: super-resolution device

Claims

A pre-processing device for a super-resolution device that is implemented as an artificial neural network including N sequentially connected convolutional layers, each having a plurality of filters of a predetermined size, and that performs image upscaling, the pre-processing device comprising:
an integrated padding size calculation unit that calculates the size of an integrated padding area representing the total amount by which an image input to the super-resolution device is reduced by the image shrink of the convolutional layers;
an integrated padding unit that performs integrated padding by expanding the input image to be upscaled, at once and in all directions along its periphery, by half the size of the integrated padding area; and
a padding area filling unit that fills, in a predetermined manner, the pixel value of each pixel of the padding area, which is the area extended by the integrated padding,
wherein the integrated padding size calculation unit calculates the size of the integrated padding area from the number (α) of convolutional layers having filters larger than 1 × 1, the length (K_i) of the filters included in each convolutional layer, the input image length (L_in), and the output image length (L_out), according to the formula

[formula not reproduced in the source text].

The pre-processing device of claim 1, wherein the padding area filling unit first fills the corresponding areas of the padding area with the pixel values of a predetermined number of rows and columns at each edge of the input image, and fills the non-corresponding areas with a pixel value of 0.

The pre-processing device of claim 2, wherein the padding area filling unit copies the pixel values of the pixels in the one row and one column located at each edge of the input image, and repeatedly fills the corresponding padding area with the copied row and column pixel values.

The pre-processing device of claim 2, wherein the padding area filling unit copies, at each edge of the input image, rows and columns of a size corresponding to the size of the padding area, and fills the padding area with the pixel values of the copied rows and columns.

The pre-processing device of claim 2, wherein the padding area filling unit copies, at each edge of the input image, rows and columns of a size corresponding to the size of the padding area, and fills the padding area by inverting the copied rows and columns so that their pixel values form a symmetric pattern in the corresponding pixel area.

(Deleted)

The pre-processing device of claim 1, wherein the super-resolution device is implemented in hardware including a plurality of arithmetic units and a memory.

A pre-processing method for a super-resolution device that is implemented as an artificial neural network including N sequentially connected convolutional layers, each having a plurality of filters of a predetermined size, and that performs image upscaling, the pre-processing method comprising:
calculating the size of an integrated padding area representing the total amount by which an image input to the super-resolution device is reduced by the image shrink of the convolutional layers;
performing integrated padding by expanding the input image to be upscaled, at once and in all directions along its periphery, by half the size of the integrated padding area; and
filling, in a predetermined manner, the pixel value of each pixel of the padding area, which is the area extended by the integrated padding,
wherein the calculating of the size of the integrated padding area calculates the size of the integrated padding area from the number (α) of convolutional layers having filters larger than 1 × 1, the length (K_i) of the filters included in each convolutional layer, the input image length (L_in), and the output image length (L_out), according to the formula

[formula not reproduced in the source text].

The pre-processing method of claim 8, wherein the filling of the pixel value comprises:
filling the corresponding areas of the padding area with the pixel values of a predetermined number of rows and columns at each edge of the input image; and
filling the non-corresponding areas with a pixel value of 0.

The pre-processing method of claim 9, wherein the filling of the corresponding areas comprises:
copying the pixel values of the pixels in the one row and one column located at each edge of the input image; and
repeatedly filling the corresponding padding area with the copied row and column pixel values.

The pre-processing method of claim 9, wherein the filling of the corresponding areas comprises:
copying, at each edge of the input image, rows and columns of a size corresponding to the size of the padding area; and
filling the padding area with the pixel values of the copied rows and columns.

The pre-processing method of claim 9, wherein the filling of the corresponding areas comprises:
copying, at each edge of the input image, rows and columns of a size corresponding to the size of the padding area; and
filling the padding area by inverting the copied rows and columns so that their pixel values form a symmetric pattern in the corresponding pixel area.

(Deleted)

The pre-processing method of claim 8, wherein the super-resolution device is implemented in hardware including a plurality of arithmetic units and a memory.