KR20240071720A

KR20240071720A - Method and apparatus for improving the resolution of images

Info

Publication number: KR20240071720A
Application number: KR1020220153538A
Authority: KR
Inventors: 권기상; 권일환; 배창현; 조성호; 권영준
Original assignee: 삼성에스디에스 주식회사
Priority date: 2022-11-16
Filing date: 2022-11-16
Publication date: 2024-05-23

Abstract

An image resolution improvement method and system are provided that generate a first feature map by inputting a first image into a feature generator and generate a second image by inputting the first feature map into an image generator. Here, the second image has a higher resolution than the first image, and the image generator includes at least one deconvolution layer that performs up-sampling.

Description

Method and apparatus for improving the resolution of images

The present disclosure relates to a method and apparatus for improving the resolution of an image. More specifically, it relates to a method of generating a high-resolution (HR) image from a low-resolution (LR) image, and to a system to which the method is applied.

Because high-resolution images and videos are large, they may be transmitted at a lower resolution depending on network traffic conditions.

Up-sampling a low-resolution image or video to a high-resolution one can cause problems such as partial image quality degradation and overall blurriness. To overcome these problems, various interpolation methods based on image or video processing, as well as deep learning-based super-resolution techniques, are used.

However, when a low-resolution image or video is upsampled beyond a certain ratio, quality improvement is limited even with deep learning-based super-resolution: text becomes less readable and some regions of the image or video appear blurry.

Accordingly, there is a need for a technique that can effectively improve the resolution of an image or video even when a low-resolution image or video is upsampled beyond a certain ratio.

US Patent No. 113280905 (published April 9, 2020)

The technical problem to be solved by the present disclosure is to provide a method and apparatus for improving image resolution that can generate a high-resolution image from a low-resolution input image.

Another technical problem to be solved by the present disclosure is to provide a method and apparatus for improving image resolution that can resolve the partial image quality degradation that may occur when a low-resolution image is upsampled beyond a certain ratio.

Yet another technical problem to be solved by the present disclosure is to provide a method and apparatus that can produce an image or video with improved resolution compared to images or videos generated by conventional interpolation methods based on image or video processing, or by deep learning-based super-resolution techniques.

The technical problems of the present disclosure are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood by those skilled in the art from the description below.

A method for improving image resolution according to an embodiment of the present disclosure for solving the above technical problems may include generating a first feature map by inputting a first image into a feature generator, and generating a second image by inputting the first feature map into an image generator. Here, the second image may have a higher resolution than the first image, and the image generator may include at least one deconvolution layer that performs up-sampling.

In one embodiment, the feature generator may include a feature extractor that extracts an intermediate feature map from the first image, and a feature enhancer that outputs the first feature map by enhancing the intermediate feature map so that it reflects a feature map extracted from a ground-truth (correct-answer) image. Here, the ground-truth image may be a higher-resolution image than the first image.

In one embodiment, the feature enhancer may include a plurality of ResNet blocks.

In one embodiment, the feature extractor may include an upscaling block, a downscaling block, and a feature aggregator. Here, the upscaling block may increase the scale of an input feature map, the downscaling block may decrease the scale of an input feature map, and the feature aggregator may aggregate the feature map output from the upscaling block with the feature map output from the downscaling block.

In one embodiment, the upscaling block may be implemented as a combination of convolution layers and deconvolution layers, and the number of deconvolution layers included in the upscaling block may be greater than the number of convolution layers.

In one embodiment, the downscaling block may be implemented as a combination of convolution layers and deconvolution layers, and the number of deconvolution layers included in the downscaling block may be smaller than the number of convolution layers.

In one embodiment, the image generator may further include a convolution layer that receives the first feature map, and the at least one deconvolution layer may be configured to perform upsampling based on the feature map output from the convolution layer.

In one embodiment, the method may further include calculating an adversarial loss value using an image discriminator, and updating the feature generator and the image generator using the adversarial loss value.

In one embodiment, the method may further include calculating a first loss value between the second image and a ground-truth image, and updating the feature generator and the image generator using the first loss value.

In one embodiment, the first loss value may be one of mean squared error (MSE), perceptual loss, and total variation loss.
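As a minimal sketch of two of the candidate first-loss functions named above, the following Python computes the mean squared error and a total variation value for small grayscale images. The list-of-rows image layout, the function names, and the choice of the anisotropic (absolute-difference) form of total variation are assumptions made for illustration; the disclosure does not specify them. Perceptual loss is omitted because it depends on a pretrained network.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel values (assumed layout)


def mse_loss(pred: Image, target: Image) -> float:
    """Mean squared error over all pixels of two equally sized images."""
    diffs = [(p - t) ** 2
             for pred_row, target_row in zip(pred, target)
             for p, t in zip(pred_row, target_row)]
    return sum(diffs) / len(diffs)


def total_variation_loss(img: Image) -> float:
    """Anisotropic total variation: sum of absolute neighbor differences."""
    h, w = len(img), len(img[0])
    vertical = sum(abs(img[y + 1][x] - img[y][x])
                   for y in range(h - 1) for x in range(w))
    horizontal = sum(abs(img[y][x + 1] - img[y][x])
                     for y in range(h) for x in range(w - 1))
    return vertical + horizontal
```

MSE compares the generated image against the ground-truth image pixel by pixel, whereas total variation is computed on the generated image alone and penalizes abrupt changes between neighboring pixels.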

In one embodiment, updating the image generator may include, only when a preset condition is satisfied, calculating a second loss value and updating the image generator using the second loss value. Here, the second loss value may be an L1 loss value.

In one embodiment, the preset condition may be that the peak signal-to-noise ratio (PSNR) of the second image exceeds a reference value.

In one embodiment, the preset condition may be that the first loss value is less than or equal to a reference value.

In one embodiment, the preset condition may be that the rate of decrease of the first loss value is less than or equal to a reference value.

In one embodiment, the preset condition may be that the rate of decrease of the PSNR of the second image is greater than or equal to a reference value.
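The four alternative preset conditions above can be sketched as simple threshold checks. The following Python is illustrative only: the function names, the condition labels, the definition of the rate of decrease as a relative change between two successive measurements, and the assumption that pixel values lie in [0, 1] (so the peak value is 1.0) are not taken from the disclosure.

```python
import math


def psnr(mse: float, max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    if mse == 0.0:
        return math.inf  # the two images are identical
    return 10.0 * math.log10(max_value ** 2 / mse)


def reduction_rate(previous: float, current: float) -> float:
    """Relative decrease from `previous` to `current` (positive if the value fell)."""
    return (previous - current) / previous


def preset_condition_met(kind: str, value: float, reference: float) -> bool:
    """Evaluate one of the four alternative conditions (labels are hypothetical)."""
    if kind == "psnr_exceeds":             # PSNR of the second image > reference
        return value > reference
    if kind == "loss_at_most":             # first loss value <= reference
        return value <= reference
    if kind == "loss_reduction_at_most":   # decrease rate of first loss <= reference
        return value <= reference
    if kind == "psnr_reduction_at_least":  # decrease rate of PSNR >= reference
        return value >= reference
    raise ValueError(f"unknown condition: {kind}")
```

For instance, `preset_condition_met("loss_reduction_at_most", reduction_rate(prev_loss, loss), 0.01)` would trigger the second-loss update once the first loss has nearly stopped improving; the 0.01 threshold is a made-up value.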

In one embodiment, updating the image generator may include updating the weight parameters of the image generator while the feature generator is frozen.
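Freezing can be pictured as excluding one parameter group from the optimizer step. The sketch below uses plain gradient descent on named parameter lists; the group names, the `sgd_step` helper, and the use of plain SGD are assumptions for illustration, since the disclosure does not name an optimizer.

```python
from typing import Dict, FrozenSet, List


def sgd_step(params: Dict[str, List[float]],
             grads: Dict[str, List[float]],
             lr: float,
             frozen: FrozenSet[str] = frozenset()) -> Dict[str, List[float]]:
    """Return updated parameters; groups listed in `frozen` are left unchanged."""
    updated = {}
    for name, values in params.items():
        if name in frozen:
            updated[name] = list(values)  # frozen: copy without applying gradients
        else:
            updated[name] = [v - lr * g for v, g in zip(values, grads[name])]
    return updated
```

Calling `sgd_step(params, grads, lr, frozen=frozenset({"feature_generator"}))` changes only the image generator's weights, which mirrors the embodiment above.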

In one embodiment, updating the image generator may include training the image generator based on a weight assigned to the second loss value. Here, the second loss value may be an L1 loss value. In addition, a first weight assigned to the second loss value at a first time point may be lower than a second weight assigned to the second loss value at a second time point after the first time point.
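One way to realize a weight that is lower at an earlier time point than at a later one is a ramp over training steps. The sketch below assumes a linear ramp from 0.1 to 1.0 and flat pixel sequences; the schedule shape, the endpoint values, and all function names are hypothetical, as the disclosure only requires that the later weight be higher.

```python
from typing import Sequence


def l1_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error between two equally long pixel sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def l1_weight(step: int, total_steps: int,
              start: float = 0.1, end: float = 1.0) -> float:
    """Linearly increase the weight from `start` to `end` over training."""
    progress = min(max(step, 0), total_steps) / total_steps
    return start + (end - start) * progress


def weighted_second_loss(pred: Sequence[float], target: Sequence[float],
                         step: int, total_steps: int) -> float:
    """Second (L1) loss scaled by the weight for the current step."""
    return l1_weight(step, total_steps) * l1_loss(pred, target)
```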

In one embodiment, the first image may be a frame image of a video conference video received in real time.

An image resolution improvement system according to another embodiment of the present disclosure for solving the above technical problems may include a processor and a memory storing instructions, wherein the instructions, when executed by the processor, cause the processor to perform an operation of generating a first feature map by inputting a first image into a feature generator and an operation of generating a second image by inputting the first feature map into an image generator. Here, the second image may have a higher resolution than the first image, and the image generator may include at least one deconvolution layer that performs up-sampling.

According to this embodiment, user satisfaction and usability can be improved by providing an image or video with better resolution than an image or video upsampled by the prior art.

According to this embodiment, the per-frame quality of an image or video can be effectively improved because the feature map is updated using a plurality of convolution layers, deconvolution layers, and ResNet blocks.

FIG. 1 is an exemplary diagram of images generated by a conventional image resolution improvement method and by an image resolution improvement method according to an embodiment of the present disclosure.
FIG. 2 is a flowchart illustrating an image resolution improvement method according to an embodiment of the present disclosure.
FIG. 3 is an exemplary diagram of the configuration of a feature generator, which may be referenced in some embodiments of the present disclosure.
FIG. 4 is an exemplary diagram for explaining some of the components shown in FIG. 3.
FIG. 5 is an exemplary diagram for explaining a feature map supplementation process performed by a feature extractor, which may be referenced in some embodiments of the present disclosure.
FIG. 6 is an exemplary diagram for explaining some of the operations shown in FIG. 2.
FIG. 7 is a flowchart for explaining the inference process of an image resolution improvement model, which may be referenced in some embodiments of the present disclosure.
FIG. 8 is an exemplary diagram of the configuration of an image discriminator, which may be referenced in some embodiments of the present disclosure.
FIG. 9 is a flowchart illustrating a method of updating an image resolution improvement model according to another embodiment of the present disclosure.
FIG. 10 is a detailed flowchart of some of the operations described with reference to FIG. 9.
FIG. 11 is an exemplary diagram for explaining some of the operations shown in FIG. 9.
FIG. 12 is a flowchart for explaining the training process of an image resolution improvement model using an image discriminator, which may be referenced in some embodiments of the present disclosure.
FIG. 13 is a flowchart for explaining the training process of an image resolution improvement model using a ground-truth image, which may be referenced in some embodiments of the present disclosure.
FIG. 14 is a hardware configuration diagram of a computing system according to some embodiments of the present disclosure.

Hereinafter, preferred embodiments of the present disclosure are described in detail with reference to the accompanying drawings. The advantages and features of the present invention, and the methods of achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings. However, the technical idea of the present invention is not limited to the following embodiments and may be implemented in various different forms. The following embodiments are provided only to make the technical idea of the present invention complete and to fully inform those of ordinary skill in the art to which the present invention pertains of the scope of the invention, and the technical idea of the present invention is defined only by the scope of the claims.

In describing the present disclosure, detailed descriptions of related known configurations or functions are omitted where they could obscure the gist of the present invention.

Unless otherwise defined, the terms used in the following embodiments (including technical and scientific terms) have the meanings commonly understood by those of ordinary skill in the art to which the present disclosure pertains, although these meanings may vary with the intentions of practitioners in the field, precedent, the emergence of new technologies, and the like. The terms used in the present disclosure are for describing the embodiments and are not intended to limit the scope of the present disclosure.

Singular expressions used in the following embodiments include the plural unless the context clearly specifies the singular. Likewise, plural expressions include the singular unless the context clearly specifies the plural.

In addition, terms such as first, second, A, B, (a), and (b) used in the following embodiments serve only to distinguish one component from another, and do not limit the nature, sequence, or order of the components.

Hereinafter, various embodiments of the present disclosure are described in detail with reference to the accompanying drawings.

FIG. 1 is an exemplary diagram of images generated by a conventional image resolution improvement method and by an image resolution improvement method according to an embodiment of the present disclosure.

The server program of a video conferencing system may encode video streams at various resolutions according to network transmission conditions, and may select and transmit the stream with the appropriate resolution. For example, for an original video frame at 720p, the server program may encode it into 360p or 180p according to network transmission conditions, transmitting at 720p when network transmission is smooth and at 360p or 180p when it is not.

In this case, when the server transmits a stream at a resolution lower than the client program's target resolution, the receiving client program may up-sample the low-resolution stream to the target resolution and display it on the screen.
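For reference, conventional client-side upsampling of this kind can be as simple as pixel repetition. The sketch below uses nearest-neighbor interpolation on a grayscale image stored as a list of rows; the interpolation method, the integer scale factor, and the function name are assumptions, since the text does not say which method a client uses.

```python
from typing import List


def upsample_nearest(img: List[List[float]], factor: int) -> List[List[float]]:
    """Enlarge an image by repeating every pixel `factor` times in both axes."""
    return [[px for px in row for _ in range(factor)]
            for row in img
            for _ in range(factor)]
```

Scaling a 180p frame to 720p corresponds to `factor=4`. Because each output pixel is only a copy of an input pixel, no detail is recovered, which is consistent with the readability problems described for the upsampled image.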

As shown in FIG. 1, when network transmission is smooth, a high-resolution original image 1a may be transmitted. When network transmission is somewhat degraded, for example because too many users are connected at the same time, a low-resolution image 1b may be transmitted. Because the quality of the low-resolution image 1b is too low to display directly on the client program's screen, an image 1c obtained by upsampling the low-resolution image may be displayed instead.

However, in the upsampled image 1c, the readability of text may suffer because some letters (e) are distorted, some letters (f, a) overlap, and some letters are smeared.

In contrast, the image resolution improvement method according to some embodiments of the present disclosure, described below, can generate an image 1d with a higher resolution than the upsampled image 1c.

Hereinafter, an image resolution improvement method according to some embodiments of the present disclosure is described with reference to FIGS. 2 to 13. For convenience of explanation, this specification focuses on improving the resolution of a specific image; however, the present disclosure is not limited thereto, and the resolution of a specific video may also be improved by the methods described below.

FIG. 2 is a flowchart illustrating an image resolution improvement method according to an embodiment of the present disclosure. This is only a preferred embodiment for achieving the purpose of the present disclosure, and some steps may of course be added or removed as needed.

As shown in FIG. 2, the image resolution improvement method according to an embodiment of the present disclosure begins at step S100, in which a first feature map is generated by inputting a first image into a feature generator. Here, the first image may be an image transmitted at low resolution because network transmission was not smooth. The first image may also be a frame image of a video conference video received in real time.

The feature generator may include a feature extractor that extracts an intermediate feature map from the first image, and a feature enhancer that outputs the first feature map by enhancing the intermediate feature map so that it reflects a feature map extracted from a ground-truth (correct-answer) image. Here, the ground-truth image may be the image as it existed before the first image was transmitted. That is, the ground-truth image may have a higher resolution than the first image.

In addition, the feature enhancer may include a plurality of ResNet blocks, and the feature extractor may include an upscaling block, a downscaling block, and a feature aggregator. Here, the upscaling block may increase the scale of an input feature map, the downscaling block may decrease the scale of an input feature map, and the feature aggregator may aggregate the feature map output from the upscaling block with the feature map output from the downscaling block. This is described in detail below with reference to FIGS. 3 and 4.

Referring again to FIG. 2, in step S200, a second image may be generated by inputting the first feature map into an image generator. Here, the image generator may include at least one deconvolution layer that performs upsampling. The image generator may further include a convolution layer that receives the first feature map.
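To illustrate how a deconvolution (transposed convolution) layer upsamples, the following sketch applies a one-dimensional transposed convolution in plain Python. The 1-D setting, the fixed kernel, and the stride of 2 are simplifications chosen for illustration; an actual image generator would use learned two-dimensional kernels.

```python
from typing import List, Sequence


def conv_transpose1d(x: Sequence[float], kernel: Sequence[float],
                     stride: int = 2) -> List[float]:
    """Transposed convolution: each input sample scatters a scaled copy of the kernel.

    Output length is (len(x) - 1) * stride + len(kernel).
    """
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, value in enumerate(x):
        for j, weight in enumerate(kernel):
            out[i * stride + j] += value * weight
    return out


upsampled = conv_transpose1d([1.0, 2.0, 3.0], [1.0, 1.0], stride=2)
print(upsampled)  # [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
```

With a stride of 2 and a two-tap kernel the output is twice as long as the input. Unlike fixed interpolation, the kernel weights are trainable, so the layer can learn how to fill in the new samples.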

Meanwhile, the feature generator and the image generator may be updated using a loss calculated between the second image and the ground-truth image, and may also be updated using a loss calculated with an image discriminator. Updating the feature generator and the image generator based on these losses allows the resolution of the image to be improved. This is described in detail below with reference to FIGS. 8 to 13.

FIG. 3 is an exemplary diagram of the configuration of a feature generator, which may be referenced in some embodiments of the present disclosure.

As shown in FIG. 3, the feature generator 30 may include a feature extractor 10 that extracts an intermediate feature map F1 from the first image LR, and a feature enhancer 20 that outputs the first feature map F2 by enhancing the intermediate feature map F1 so that it reflects a feature map extracted from the ground-truth image HR. Here, the feature extractor 10 may include an upscaling block 12, a downscaling block 13, and a feature aggregator 14, and the feature enhancer 20 may include a plurality of ResNet blocks 21 and 22.

First, the convolution block 11 included in the feature extractor 10 may generate a feature map for the first image using a plurality of convolution layers. Here, a convolution block may be a network stage composed of two or more convolution layers. The feature map generated for the first image may then be input in turn to the upscaling block 12, the downscaling block 13, and the upscaling block 12, and a plurality of feature maps may be generated from the respective blocks. The generated feature maps may then be aggregated by the feature aggregator 14. Here, the aggregation may be, for example, concatenation.
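The data flow through the feature extractor can be followed by tracking feature-map shapes. The sketch below is an assumption-laden illustration: the (channels, height, width) layout, the 64-channel width, and the scale factor of 2 are hypothetical, and because the text does not say how maps of different scale are reconciled before aggregation, the sketch concatenates only maps that share a spatial size.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int]  # (channels, height, width), an assumed layout


def upscale(shape: Shape, factor: int = 2) -> Shape:
    """Shape after an upscaling block (block 12)."""
    c, h, w = shape
    return (c, h * factor, w * factor)


def downscale(shape: Shape, factor: int = 2) -> Shape:
    """Shape after a downscaling block (block 13)."""
    c, h, w = shape
    return (c, h // factor, w // factor)


def concat_channels(shapes: List[Shape]) -> Shape:
    """Shape after channel-wise concatenation of equally sized maps."""
    _, h, w = shapes[0]
    if any(s[1:] != (h, w) for s in shapes):
        raise ValueError("maps must share height and width to be concatenated")
    return (sum(s[0] for s in shapes), h, w)


def feature_extractor_shapes(lr_shape: Shape, channels: int = 64) -> List[Shape]:
    """Shapes after conv block 11, then blocks 12, 13, and 12 in turn."""
    f0 = (channels, lr_shape[1], lr_shape[2])  # convolution block 11
    u1 = upscale(f0)                           # upscaling block 12
    d1 = downscale(u1)                         # downscaling block 13
    u2 = upscale(d1)                           # upscaling block 12 again
    return [f0, u1, d1, u2]
```

For a 3-channel 180x320 input, the two upscaled maps are both 64x360x640, and concatenating them yields a 128-channel map of the same spatial size.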

In this process, deconvolution may generate low-frequency feature maps that can be lost during convolution, and conversely, convolution may generate high-frequency feature maps that can be blurred during deconvolution. The feature map may then be enhanced through back projection using the generated low-frequency and high-frequency feature maps.

Meanwhile, the feature layers in each block may include a convolution layer that can compress a feature map and a deconvolution layer that can expand a feature map, and PReLU (Parametric ReLU) may be used throughout as the activation function that activates the output of each feature layer so that the parameter weights can be updated. Since those skilled in the art are already familiar with PReLU, a detailed description is omitted.
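For readers less familiar with it, PReLU passes positive inputs through unchanged and multiplies negative inputs by a learnable slope. The sketch below shows the scalar form; the default slope of 0.25 is only a commonly used initial value assumed here, not a value given in the disclosure.

```python
def prelu(x: float, alpha: float = 0.25) -> float:
    """Parametric ReLU: x for x > 0, otherwise alpha * x (alpha is learned)."""
    return x if x > 0 else alpha * x
```

Because `alpha` is trained together with the other weights, negative responses are scaled rather than zeroed as in plain ReLU.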

FIG. 4 is an exemplary diagram for explaining some of the components shown in FIG. 3. More specifically, FIG. 4 is an enlarged view of the upscaling block and the downscaling block included in the feature extractor shown in FIG. 3.

As shown in FIG. 4, the upscaling block 12 may be a block in which a deconvolution layer 12a, a convolution layer 12b, and a deconvolution layer 12a are arranged in sequence. When the upscaling block 12 receives a feature map F4, it may generate a feature map F5, which is an enlarged version of the feature map F4.

Likewise, the downscaling block 13 may be a block in which a convolution layer 13b, a deconvolution layer 13a, and a convolution layer 13b are arranged in sequence. When the downscaling block 13 receives a feature map F6, it may generate a feature map F7, which is a reduced version of the feature map F6.

Meanwhile, the combination of convolution layers and deconvolution layers that make up the upscaling block 12 and the downscaling block 13 is not limited to what is shown in FIG. 4, and the upscaling block 12 and the downscaling block 13 may be configured in various combinations.

That is, the upscaling block may be implemented as a combination of convolution layers and deconvolution layers, and in particular may be implemented so that the number of deconvolution layers in the upscaling block is greater than the number of convolution layers. Likewise, the downscaling block may be implemented as a combination of convolution layers and deconvolution layers, and in particular may be implemented so that the number of deconvolution layers in the downscaling block is smaller than the number of convolution layers.
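The link between the layer counts and the direction of scaling can be checked with a small calculation. The sketch below assumes that every deconvolution layer multiplies the spatial scale by a common factor and every convolution layer divides it by the same factor; that uniform factor (2 here) and the string labels are assumptions, not details from the disclosure.

```python
from fractions import Fraction
from typing import Sequence


def net_scale(layers: Sequence[str], factor: int = 2) -> Fraction:
    """Overall spatial scale change of a block built from 'conv'/'deconv' layers."""
    scale = Fraction(1)
    for layer in layers:
        if layer == "deconv":
            scale *= factor   # deconvolution expands the feature map
        elif layer == "conv":
            scale /= factor   # strided convolution compresses the feature map
        else:
            raise ValueError(f"unknown layer type: {layer}")
    return scale


print(net_scale(["deconv", "conv", "deconv"]))  # 2   (upscaling block 12)
print(net_scale(["conv", "deconv", "conv"]))    # 1/2 (downscaling block 13)
```

Under this assumption, any block with more deconvolution layers than convolution layers enlarges its input overall, and any block with fewer reduces it, which matches the two arrangements of FIG. 4.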

FIG. 5 is an exemplary diagram for explaining a feature map supplementation process performed by the feature extractor, which may be referred to in some embodiments of the present disclosure.

As shown in FIG. 5, the deconvolution layers 12a and 13a included in the upscaling block 12 or the downscaling block 13 may generate feature maps for a high-resolution image, and the convolution layers 12b and 13b may generate feature maps for a low-resolution image. The feature map may then be supplemented through a process of back-projecting, at each stage, the difference between the feature map for the low-resolution image and the feature map for the high-resolution image. Furthermore, the feature maps generated through the back-projection process may be concatenated, passed through a convolution layer, and then used as the input to the feature enhancer. Since those skilled in the art will already be familiar with back-projection and concatenation, a detailed description thereof is omitted.
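As a rough illustration of the back-projection idea, the sketch below works on a one-dimensional signal and replaces the learned layers with fixed stand-ins: `up` plays the role of a deconvolution layer and `down` the role of a convolution layer. Both functions, the `gain` parameter, and the sign convention of the residual are assumptions made for illustration only; in a trained network these operations are learned.

```python
def up(x, gain=0.9):
    """Stand-in for a deconvolution layer: 2x nearest-neighbour upsampling.
    A gain below 1 makes the upsampler deliberately imperfect."""
    return [gain * v for v in x for _ in range(2)]


def down(x):
    """Stand-in for a convolution layer: average each pair of samples."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]


def up_projection(low):
    high0 = up(low)                                  # first high-resolution estimate
    low0 = down(high0)                               # project it back to low resolution
    residual = [a - b for a, b in zip(low, low0)]    # low-resolution reconstruction error
    high1 = up(residual)                             # lift the error to high resolution
    return [a + b for a, b in zip(high0, high1)]     # corrected high-resolution estimate
```

For the input `[1.0, 2.0]`, the first estimate is `[0.9, 0.9, 1.8, 1.8]`, and the back-projected correction moves it to approximately `[0.99, 0.99, 1.98, 1.98]`, closer to the values an ideal upsampler would give.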

Referring again to FIG. 3, the feature enhancer 20 included in the feature generator 30 may include a plurality of ResNet blocks 21 and 22.

In this case, the ResNet blocks 21 and 22 may be configured to supplement the feature map through back-projection using a ground-truth image. More specifically, the ResNet deconvolution block 21 and the ResNet convolution block 22 may supplement the feature map by back-projecting the difference between the feature map output from each block and the ground-truth image. In addition, the image resolution may be further improved by using the concatenated feature map as an input to the feature extractor 10.

Meanwhile, the improvement in image resolution may be measured by the peak signal-to-noise ratio (PSNR). When the process of inputting the first image into the feature extractor 10 to generate an intermediate feature map, and inputting the generated intermediate feature map into the feature enhancer 20 to generate the first feature map, is performed repeatedly, the image resolution may improve and the PSNR value may also increase. Since those skilled in the art will already be familiar with PSNR, a detailed description thereof is omitted.
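For reference, PSNR is conventionally defined as 10·log10(MAX² / MSE), where MAX is the largest possible pixel value. The following minimal sketch assumes images stored as nested lists of pixel values in the range 0 to 255; the function name and the default `max_value` are illustrative choices.

```python
import math


def psnr(image, reference, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    flat_a = [v for row in image for v in row]
    flat_b = [v for row in reference for v in row]
    mse = sum((a - b) ** 2 for a, b in zip(flat_a, flat_b)) / len(flat_a)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value indicates that the generated image is closer to the reference, so a rising PSNR during training is consistent with improving resolution.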

FIG. 6 is an exemplary diagram for explaining some of the operations shown in FIG. 2. More specifically, FIG. 6 shows the image generator in addition to the configuration of the feature generator shown in FIG. 3.

As shown in FIG. 6, the feature map generated through the feature extractor 10 and the feature enhancer 20 may be input to the image generator 40, and a second image may be generated as a result. In this case, the second image may be an image with a higher resolution than the first image input to the feature extractor 10.

The image generator 40 may include at least one deconvolution layer (not shown) that performs up-sampling. The image generator 40 may further include a convolution layer (not shown) that receives the feature map generated through the feature extractor 10 and the feature enhancer 20.

The image generator 40 according to an embodiment of the present disclosure may have a structure similar to that of a fully convolutional network (FCN), a neural network composed of a plurality of convolution layers. However, whereas an FCN up-samples the feature map in its last stage using bilinear interpolation, the image generator 40 up-samples the feature map in its last stage using a deconvolution layer, and can thereby generate an image with further improved resolution.
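The difference between the two up-sampling approaches can be illustrated in one dimension. A transposed convolution scatters each input sample into the output through a kernel, and that kernel is learned during training. The sketch below is a minimal, unpadded implementation; the fixed kernel used in the usage note is only an assumption chosen to show that linear interpolation is one special case of what the layer can represent.

```python
def transposed_conv1d(x, kernel, stride):
    """Unpadded 1-D transposed convolution (deconvolution)."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, value in enumerate(x):
        for j, weight in enumerate(kernel):
            # Each input sample contributes a scaled copy of the kernel.
            out[i * stride + j] += value * weight
    return out
```

With `x = [1.0, 2.0, 3.0]`, `kernel = [0.5, 1.0, 0.5]`, and `stride = 2`, the result is `[0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 1.5]`: the interior samples match linear interpolation, while a trained kernel is free to take other values.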

Meanwhile, for the feature map concatenated in the feature enhancer 20, a loss with respect to the ground-truth image may be calculated through the image generator 40, and the image generator 40 may be updated based on the calculated loss. The update of the image generator 40 according to some embodiments of the present disclosure is described in detail later with reference to FIGS. 10 and 11.

FIG. 7 is a flowchart for explaining the inference process of an image resolution improvement model, which may be referred to in some embodiments of the present disclosure.

As shown in FIG. 7, a low-resolution image input to the feature generator 30 may be output as a high-resolution image through the feature generator 30 and the image generator 40. More specifically, as a result of inputting the low-resolution first image into the feature extractor 10, an intermediate feature map may be generated through a process of enlarging or compressing the feature map for the first image. Next, as a result of inputting the intermediate feature map into the feature enhancer 20, a first feature map may be generated through a process of supplementing the intermediate feature map. Finally, as a result of inputting the first feature map into the image generator 40, a high-resolution second image may be generated through a process of supplementing and up-sampling the first feature map.
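The three-stage inference flow described above can be sketched as a simple function composition. The three callables are passed in as parameters because their internals are the learned networks; the `toy_*` functions are hypothetical placeholders that only make the sketch executable and are not part of the disclosure.

```python
def super_resolve(first_image, feature_extractor, feature_enhancer, image_generator):
    """Low-resolution image -> intermediate feature map -> first feature map -> second image."""
    intermediate_feature_map = feature_extractor(first_image)
    first_feature_map = feature_enhancer(intermediate_feature_map)
    return image_generator(first_feature_map)


def toy_identity(feature_map):
    """Placeholder for the feature extractor 10 or feature enhancer 20."""
    return [row[:] for row in feature_map]


def toy_generator(feature_map):
    """Placeholder for the image generator 40: 2x nearest-neighbour up-sampling."""
    return [[v for v in row for _ in range(2)] for row in feature_map for _ in range(2)]
```

Calling `super_resolve([[1, 2], [3, 4]], toy_identity, toy_identity, toy_generator)` returns a 4×4 image, twice the width and height of the input.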

An image resolution improvement method according to some embodiments of the present disclosure has been described above with reference to FIGS. 2 to 7. According to this method, a high-resolution image may be generated through a process of changing the size of the feature map using a plurality of convolution layers and deconvolution layers and a process of supplementing the feature map using a plurality of ResNet blocks. Accordingly, the frame-by-frame quality of an image or video transmitted at low resolution because of network conditions can be effectively improved, and user satisfaction and usability can be enhanced.

Hereinafter, a method of training an image resolution improvement model according to another embodiment of the present disclosure is described with reference to FIGS. 8 to 13. The image resolution improvement model may include a feature generator and an image generator, and in some cases may further include an image discriminator.

FIG. 8 is an exemplary diagram of the configuration of an image discriminator, which may be referred to in some embodiments of the present disclosure.

As shown in FIG. 8, the image resolution improvement model according to an embodiment of the present disclosure may include a feature extractor 10, a feature enhancer 20, an image generator 40, and an image discriminator 50. Since the feature extractor 10, the feature enhancer 20, and the image generator 40 have been described with reference to FIGS. 2 to 7, a detailed description thereof is omitted.

The image discriminator 50 may determine whether the high-resolution second image generated by the image generator 40 is a real (ground-truth) image or a fake image, and may calculate an adversarial loss value. The feature extractor 10, the feature enhancer 20, and the image generator 40 may then be updated using the calculated adversarial loss value. The adversarial loss value may be calculated using the formula below.
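The formula referred to here is not reproduced in this text. As a point of reference only, the sketch below uses the standard GAN formulation, in which the discriminator outputs a probability that its input is a real image. This is an assumption; the exact expression used in the disclosure may differ. The function names and the small constant `eps` are illustrative.

```python
import math


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Standard GAN discriminator loss: -[log D(HR) + log(1 - D(SR))]."""
    return -(math.log(d_real + eps) + math.log(1.0 - d_fake + eps))


def generator_adversarial_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss: -log D(SR)."""
    return -math.log(d_fake + eps)
```

Here `d_real` is the discriminator output for a ground-truth image and `d_fake` its output for a generated second image. The generator loss falls as the discriminator is increasingly fooled, that is, as `d_fake` approaches 1.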

Since those skilled in the art will already be familiar with super-resolution algorithms based on GANs (generative adversarial networks), a detailed description thereof is omitted.

FIG. 9 is a flowchart showing a method of updating an image resolution improvement model according to another embodiment of the present disclosure. However, this is merely a preferred embodiment for achieving the purpose of the present disclosure, and some steps may of course be added or omitted as needed.

As shown in FIG. 9, the image resolution improvement method according to another embodiment of the present disclosure may include step S300 of calculating a first loss value between the ground-truth image and the second image generated according to the image resolution improvement method described with reference to FIG. 2, and step S400 of updating the feature generator and the image generator using the first loss value. In this case, the ground-truth image may be an image with a higher resolution than the second image.

The first loss value may be one of a mean square error (MSE), a perceptual loss, and a total variation loss.

As can be seen in the formula below, the MSE loss is a loss value obtained by squaring the difference between the high-resolution image generated by the image resolution improvement model and the ground-truth image, summing the squared differences, and taking the average.

As can be seen in the formula below, the perceptual loss is similar to the MSE loss described above. It is a loss value calculated as the Euclidean distance between the feature maps obtained by passing the high-resolution image generated by the image resolution improvement model and the ground-truth image through a pre-trained VGG-19.
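The formulas referred to for these two losses are not reproduced in this text. Commonly used forms that are consistent with the verbal descriptions above are given here as an assumption. The symbols are introduced for illustration: $I^{SR}$ is the generated second image, $I^{HR}$ is the ground-truth image, $N$ is the number of pixels, and $\phi(\cdot)$ is the feature map taken from a layer of the pre-trained VGG-19.

```latex
\mathcal{L}_{\mathrm{MSE}} = \frac{1}{N} \sum_{i=1}^{N} \left( I^{SR}_{i} - I^{HR}_{i} \right)^{2}
\qquad
\mathcal{L}_{\mathrm{perceptual}} = \left\lVert \phi\!\left(I^{SR}\right) - \phi\!\left(I^{HR}\right) \right\rVert_{2}
```

The two losses have the same structure; the perceptual loss simply measures the distance in VGG-19 feature space rather than in pixel space. Implementations often square and normalize the feature-space distance, which changes only the scale of the loss.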

As can be seen in the formula below, the total variation loss is a loss value calculated using the differences between a reference pixel (i, j) and its adjacent pixels (i, j+1) and (i+1, j).
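A minimal sketch of a total variation loss built from exactly those neighbour differences follows. It assumes the anisotropic form that sums absolute differences; squared differences are an equally common variant, and the text does not say which one is used.

```python
def total_variation_loss(image):
    """Sum of absolute differences between each pixel (i, j) and its
    right neighbour (i, j+1) and lower neighbour (i+1, j)."""
    height, width = len(image), len(image[0])
    total = 0.0
    for i in range(height):
        for j in range(width):
            if j + 1 < width:
                total += abs(image[i][j + 1] - image[i][j])
            if i + 1 < height:
                total += abs(image[i + 1][j] - image[i][j])
    return total
```

A constant image has zero total variation, so minimizing this term discourages pixel-level noise in the generated image.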

Meanwhile, in the process of updating the feature generator and the image generator using the mean square error (MSE), the perceptual loss, the adversarial loss, and the total variation loss, each of these losses may be used with a different weight in training the feature generator and the image generator. For example, the first loss values may be used for training the feature generator and the image generator with the following weights.
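The specific weights are not reproduced in this text, so the sketch below only shows the general shape of such a weighted combination. The function name, the dictionary keys, and the numeric weights in the usage note are hypothetical and are not taken from the disclosure.

```python
def combined_first_loss(losses, weights):
    """Weighted sum of the individual loss terms, keyed by loss name."""
    return sum(weights[name] * value for name, value in losses.items())
```

For example, with hypothetical weights `{"mse": 1.0, "perceptual": 0.5, "adversarial": 0.25, "tv": 0.125}` and loss values `{"mse": 2.0, "perceptual": 4.0, "adversarial": 8.0, "tv": 16.0}`, each term contributes 2.0 and the combined loss is 8.0.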

Since those skilled in the art will already be familiar with the mean square error (MSE), the perceptual loss, and the total variation loss, a detailed description thereof is omitted.

FIG. 10 is a detailed flowchart of some of the operations described with reference to FIG. 9.

As shown in FIG. 10, step S400 of updating the feature generator and the image generator using the first loss value may include, only when a preset condition is satisfied, step S410 of calculating a second loss value and step S420 of updating the image generator using the second loss value. The second loss value is an L1 loss value, which may be calculated from the L1 distance between the second image generated by the image generator and the ground-truth image, as in the formula below.
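The L1 distance between two images is the sum of absolute pixel differences. The sketch below averages it over the pixels, which is a common normalization and an assumption here, since the formula itself is not reproduced in this text.

```python
def l1_loss(image, reference):
    """Mean absolute difference between two equally sized images."""
    flat_a = [v for row in image for v in row]
    flat_b = [v for row in reference for v in row]
    return sum(abs(a - b) for a, b in zip(flat_a, flat_b)) / len(flat_a)
```

Unlike the MSE, this loss grows linearly with the size of each pixel error.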

Step S420 of updating the image generator using the L1 loss according to an embodiment of the present disclosure may be performed only when a preset condition is satisfied. For example, as described above with reference to FIG. 9, the image resolution improvement model consisting of the feature extractor, the feature enhancer, and the image generator may be updated using the first loss value, and only when the preset condition is satisfied may the image generator alone, among the components of the image resolution improvement model, be additionally updated using the L1 loss value. The preset condition may be that the difference between the second image generated by the updated image resolution improvement model and the ground-truth image is less than or equal to a reference value. The preset conditions are described in detail later with reference to FIG. 11.

The reason step S420 of updating the image generator using the L1 loss is performed only when the preset condition is satisfied is that, if end-to-end learning of the image generator proceeds from the first epoch using all five types of loss values (L1 loss, MSE, perceptual loss, adversarial loss, and TV loss), the L1 loss may converge to the overall loss. Accordingly, the update of the image generator using the L1 loss may be performed after the change in the loss has become gradual.

That is, the feature extractor, the feature enhancer, and the image generator may continue to be trained with the combination of the four types of first loss values as in [Equation 5] above, and the image generator may be updated when the preset condition is satisfied. The update of the image generator using the L1 loss may be performed with the weight shown in the formula below.

As in the above formula, the weight of the L1 loss may be adjusted by multiplying it by an arbitrary value. When the edge contrast of objects in a video or image is sharp (e.g., black text on a white background), the error or loss may be calculated to be large. In that case, the arbitrary value may be set as small as about 0.001. Conversely, when the edge contrast of objects is gentle (e.g., a background and foreground of similar colors, or a gradient region), the error or loss may be calculated to be small, so a value of about 0.01 may be applied. In other words, the proportion in which the L1 loss is reflected may be adjusted adaptively according to the magnitude of the error or loss.
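One simple way to realize such an adaptive weight is a threshold on the measured L1 loss. The two weight values come from the text above; the `threshold` parameter and the function name are hypothetical, and the text does not state how the switch between the two values is made.

```python
def adaptive_l1_weight(l1_value, threshold=0.1,
                       high_contrast_weight=0.001, low_contrast_weight=0.01):
    """Pick a smaller weight when the L1 loss is large (sharp edges)
    and a larger weight when it is small (gentle edges)."""
    if l1_value > threshold:
        return high_contrast_weight
    return low_contrast_weight
```

The weighted term added to the training objective would then be `adaptive_l1_weight(l1) * l1`, which keeps the L1 contribution from dominating on high-contrast content.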

Accordingly, by additionally updating the image generator using the second loss value (L1 loss) after the feature extractor, the feature enhancer, and the image generator have been sufficiently updated using the first loss values (MSE, perceptual loss, TV loss) and the adversarial loss, a more refined, higher-resolution image can be generated.

Meanwhile, in the process of updating the image generator using the L1 loss, the weight parameters of the feature generator and the image generator may be updated simultaneously, or only the weight parameters of the image generator may be updated while the feature generator is frozen.
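The two options can be sketched with a plain gradient-descent step in which frozen parameter groups are simply skipped. The parameter names, scalar parameters, and learning rate are illustrative assumptions; a real model would hold tensors of weights.

```python
def sgd_step(params, grads, lr, frozen=()):
    """One gradient-descent step; parameters named in `frozen` are left unchanged."""
    return {
        name: value if name in frozen else value - lr * grads[name]
        for name, value in params.items()
    }
```

Passing `frozen=("feature_generator",)` updates only the image generator, while the default empty tuple updates both components together.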

In addition, in the process of updating the image generator using the L1 loss, the image generator may be updated based on a weight assigned to the L1 loss. More specifically, a first weight assigned to the L1 loss at a first time point may be lower than a second weight assigned to the L1 loss at a second time point, where the first time point precedes the second time point.
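A weight that is lower at an earlier time point and higher at a later one can be obtained with any non-decreasing schedule. The linear ramp below is one hypothetical choice; the step counts and the start and end weights are illustrative parameters, not values prescribed by the disclosure.

```python
def l1_weight_schedule(step, start_step, ramp_steps,
                       start_weight=0.001, end_weight=0.01):
    """L1 weight that is zero before start_step, then rises linearly
    from start_weight to end_weight over ramp_steps steps."""
    if step < start_step:
        return 0.0
    progress = min(1.0, (step - start_step) / ramp_steps)
    return start_weight + progress * (end_weight - start_weight)
```

With `start_step=10` and `ramp_steps=10`, the weight at step 12 is smaller than the weight at step 18, matching the ordering of the first and second weights described above.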

FIG. 11 is an exemplary diagram for explaining some of the operations shown in FIG. 9. More specifically, FIG. 11 illustrates the preset conditions that serve as criteria for updating the image generator using the L1 loss.

First, the preset condition may be that the PSNR (peak signal-to-noise ratio) value of the high-resolution second image generated by the image generator exceeds a reference value. That is, as in the graph shown in the upper left of FIG. 11, the L1 loss may be calculated at the time point (t1) at which the PSNR value of the second image exceeds the reference value, and the image generator may be updated using the calculated L1 loss.

Second, the preset condition may be that the first loss value calculated between the high-resolution second image generated by the image generator and the ground-truth image is less than or equal to a reference value. The first loss value may be one of the MSE, perceptual loss, and TV loss values. That is, as in the graph shown in the upper right of FIG. 11, the L1 loss may be calculated at the time point (t2) at which the first loss value calculated between the second image and the ground-truth image is less than or equal to the reference value, and the image generator may be updated using the calculated L1 loss.

Third, the preset condition may be that the rate of decrease of the first loss value calculated between the high-resolution second image generated by the image generator and the ground-truth image is less than or equal to a reference value. The first loss value may be one of the MSE, perceptual loss, and TV loss values. That is, as in the graph shown in the lower left of FIG. 11, the L1 loss may be calculated at the time point at which the rate of decrease of the first loss value calculated at a plurality of time points (t3, t4) is less than or equal to the reference value, and the image generator may be updated using the calculated L1 loss.

Fourth, the preset condition may be that the rate of decrease of the PSNR (peak signal-to-noise ratio) of the high-resolution second image generated by the image generator is greater than or equal to a reference value. That is, as in the graph shown in the upper left of FIG. 11, the L1 loss may be calculated at the time point at which the rate of decrease of the PSNR of the second image calculated at a plurality of time points (t5, t6) is greater than or equal to the reference value, and the image generator may be updated using the calculated L1 loss.
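The four conditions above can be sketched as a single check over the recorded training history. The function and parameter names are hypothetical, and computing each rate of decrease relative to the previous recorded value is an assumption, since the text does not define how the rate is measured.

```python
def l1_stage_conditions(psnr_history, loss_history,
                        psnr_ref, loss_ref, loss_drop_ref, psnr_drop_ref):
    """Evaluate the four example conditions on the latest recorded values."""
    psnr_now, loss_now = psnr_history[-1], loss_history[-1]
    result = {
        "psnr_exceeds_ref": psnr_now > psnr_ref,      # first condition
        "loss_at_or_below_ref": loss_now <= loss_ref,  # second condition
        "loss_drop_at_or_below_ref": False,            # third condition
        "psnr_drop_at_or_above_ref": False,            # fourth condition
    }
    if len(loss_history) >= 2 and loss_history[-2] > 0:
        loss_drop = (loss_history[-2] - loss_now) / loss_history[-2]
        result["loss_drop_at_or_below_ref"] = loss_drop <= loss_drop_ref
    if len(psnr_history) >= 2 and psnr_history[-2] > 0:
        psnr_drop = (psnr_history[-2] - psnr_now) / psnr_history[-2]
        result["psnr_drop_at_or_above_ref"] = psnr_drop >= psnr_drop_ref
    return result
```

The function reports each condition separately, so a caller can choose which of them should trigger the additional L1-based update in a given embodiment.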

FIG. 12 is a flowchart for explaining the training process of an image resolution improvement model using an image discriminator, which may be referred to in some embodiments of the present disclosure.

As shown in FIG. 12, the image discriminator 50 may calculate an adversarial loss value for the second image (SR image) generated by the image generator 40. If the adversarial loss value has not converged to a reference error, the feature extractor 10, the feature enhancer 20, and the image generator 40 may continue to be trained using the adversarial loss value. If the adversarial loss value has converged to the reference error, training of the image resolution improvement model may be terminated.

FIG. 13 is a flowchart for explaining the training process of an image resolution improvement model using a ground-truth image, which may be referred to in some embodiments of the present disclosure.

As shown in FIG. 13, a first loss value (MSE loss, perceptual loss, or total variation loss) between the second image (SR image) generated by the image generator 40 and the ground-truth image (HR image) may be calculated. If the first loss value has not converged to a reference error, the feature extractor 10, the feature enhancer 20, and the image generator 40 may continue to be trained using the first loss value. If the first loss value has converged to the reference error, training of the image resolution improvement model may be terminated.

Meanwhile, the image generator 40 may be additionally trained only when a preset condition is satisfied during the training of the image resolution improvement model using the first loss value. When the preset condition is satisfied, a second loss value (L1 loss) between the second image (SR image) generated by the image generator 40 and the ground-truth image (HR image) may be calculated, and the image generator may be trained using the second loss value. The cases in which the preset condition is satisfied have been described above with reference to FIG. 11, so a description thereof is omitted.
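The overall training flow can be sketched as a per-epoch plan that decides, from the first loss value, whether to continue training, add the L1-based update, or stop. Treating "converged to the reference error" as "less than or equal to the reference error" is a simplifying assumption. The `l1_ready` callback stands in for whichever preset condition is used, and all names are hypothetical.

```python
def plan_training(first_losses, reference_error, l1_ready):
    """Return (epoch, action) pairs for a sequence of per-epoch first loss values."""
    plan = []
    for epoch, loss in enumerate(first_losses):
        if loss <= reference_error:
            plan.append((epoch, "stop"))  # first loss has reached the reference error
            break
        # Train the whole model with the first loss; add the L1 update of the
        # image generator only once the preset condition holds.
        action = "first_loss+l1" if l1_ready(loss) else "first_loss"
        plan.append((epoch, action))
    return plan
```

For the loss sequence `[1.0, 0.5, 0.2, 0.05]`, a reference error of `0.1`, and the hypothetical condition `lambda loss: loss <= 0.3`, the plan is two epochs of `"first_loss"`, one epoch of `"first_loss+l1"`, and then `"stop"`.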

A method of training an image resolution improvement model according to another embodiment of the present disclosure has been described above with reference to FIGS. 8 to 13. According to this training method, an image with a higher resolution can be generated by additionally training the image generator based on the L1 loss only when a preset condition is satisfied. Accordingly, the frame-by-frame quality of an image or video transmitted at low resolution because of network conditions can be effectively improved, and user satisfaction and usability can be enhanced.

FIG. 14 is a hardware configuration diagram of an image resolution improvement system according to some embodiments of the present disclosure. The image resolution improvement system 1000 shown in FIG. 14 may include one or more processors 1100, a system bus 1600, a communication interface 1200, a memory 1400 into which a computer program 1500 executed by the processor 1100 is loaded, and a storage 1300 that stores the computer program 1500.

The processor 1100 controls the overall operation of each component of the image resolution improvement system 1000. The processor 1100 may perform operations for at least one application or program for executing the methods/operations according to various embodiments of the present disclosure. The memory 1400 stores various data, instructions, and/or information. The memory 1400 may load one or more computer programs 1500 from the storage 1300 in order to execute the methods/operations according to various embodiments of the present disclosure. The bus 1600 provides communication between the components of the image resolution improvement system 1000. The communication interface 1200 supports Internet communication of the image resolution improvement system 1000. The storage 1300 may store the one or more computer programs 1500 non-transitorily. The computer program 1500 may include one or more instructions in which the methods/operations according to various embodiments of the present disclosure are implemented. When the computer program 1500 is loaded into the memory 1400, the processor 1100 may perform the methods/operations according to various embodiments of the present disclosure by executing the one or more instructions.

In some embodiments, the image resolution improvement system 1000 described with reference to FIG. 14 may be configured using one or more physical servers included in a server farm, based on cloud technology such as virtual machines. In this case, at least some of the processor 1100, the memory 1400, and the storage 1300 among the components shown in FIG. 14 may be virtual hardware, and the communication interface 1200 may also be composed of virtualized networking elements such as a virtual switch.

Various embodiments of the present disclosure and the effects of those embodiments have been described above with reference to FIGS. 1 to 14. The effects according to the technical idea of the present disclosure are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the description below.

The technical idea of the present disclosure described so far may be implemented as computer-readable code on a computer-readable medium. The computer program recorded on the computer-readable recording medium may be transmitted to another computing device over a network such as the Internet and installed on that computing device, and may thereby be used on that computing device.

Although operations are shown in the drawings in a specific order, this should not be understood to mean that the operations must be performed in the specific order shown or in sequential order, or that all illustrated operations must be performed, to obtain the desired results. In certain situations, multitasking and parallel processing may be advantageous. Although embodiments of the present disclosure have been described above with reference to the attached drawings, those of ordinary skill in the art to which the present disclosure pertains will understand that the present invention may be implemented in other specific forms without changing its technical idea or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. The scope of protection of the present invention should be interpreted according to the claims below, and all technical ideas within a scope equivalent thereto should be construed as falling within the scope of rights of the technical idea defined by the present disclosure.

Claims

An image resolution improvement method performed by a computing device, the method comprising:
generating a first feature map by inputting a first image into a feature generator; and
generating a second image by inputting the first feature map into an image generator,
wherein the second image has a higher resolution than the first image, and
wherein the image generator includes at least one deconvolution layer that performs up-sampling.
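As an illustration of how a deconvolution (transposed convolution) layer can up-sample, the following is a minimal pure-Python sketch. The function name `transposed_conv2d`, the stride of 2, and the fixed all-ones kernel are assumptions made for illustration only; the claim does not specify them, and a trained layer would learn its kernel weights.

```python
def transposed_conv2d(image, kernel, stride=2):
    """Up-sample a 2D list with a transposed convolution (no padding).

    Each input pixel scatters a scaled copy of the kernel into the output,
    so the output grows to (h - 1) * stride + kernel_height rows.
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out_h = (h - 1) * stride + kh
    out_w = (w - 1) * stride + kw
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(h):
        for j in range(w):
            for a in range(kh):
                for b in range(kw):
                    out[i * stride + a][j * stride + b] += image[i][j] * kernel[a][b]
    return out


# Hypothetical 2x2 "low-resolution" input and a fixed 2x2 kernel (assumption).
low_res = [[1.0, 2.0], [3.0, 4.0]]
kernel = [[1.0, 1.0], [1.0, 1.0]]
high_res = transposed_conv2d(low_res, kernel, stride=2)  # 4x4 output
```

With this particular kernel and stride, the operation reduces to nearest-neighbor replication; learned kernels allow the layer to produce sharper results than a fixed interpolation rule.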

The method of claim 1, wherein the feature generator comprises:
a feature extractor that extracts an intermediate feature map from the first image; and
a feature enhancer that outputs the first feature map by enhancing the intermediate feature map so as to reflect a feature map extracted from a correct answer image,
wherein the correct answer image has a higher resolution than the first image.

The method of claim 2,
wherein the feature enhancer includes a plurality of ResNet blocks.

The method of claim 2,
wherein the feature extractor includes an upscaling block, a downscaling block, and a feature aggregator,
wherein the upscaling block increases the scale of an input feature map,
wherein the downscaling block reduces the scale of an input feature map, and
wherein the feature aggregator aggregates the feature map output from the upscaling block and the feature map output from the downscaling block.

The method of claim 4,
wherein the upscaling block is implemented based on a combination of convolution layers and deconvolution layers, and
wherein the number of deconvolution layers included in the upscaling block is greater than the number of convolution layers.

The method of claim 4,
wherein the downscaling block is implemented based on a combination of convolution layers and deconvolution layers, and
wherein the number of deconvolution layers included in the downscaling block is less than the number of convolution layers.

The method of claim 1,
wherein the image generator further includes a convolution layer that receives the first feature map, and
wherein the at least one deconvolution layer is configured to perform up-sampling on the feature map output from the convolution layer.

The method of claim 1, further comprising:
calculating an adversarial loss value using an image discriminator; and
updating the feature generator and the image generator using the adversarial loss value.

The method of claim 1, further comprising:
calculating a first loss value between the second image and a correct answer image; and
updating the feature generator and the image generator using the first loss value,
wherein the correct answer image has a higher resolution than the second image.

The method of claim 9,
wherein the first loss value is one of a mean square error (MSE), a perceptual loss, and a total variation loss.

The method of claim 9, wherein updating the image generator comprises:
calculating a second loss value only when a preset condition is satisfied; and
updating the image generator using the second loss value,
wherein the second loss value is an L1 loss value.

The method of claim 11,
wherein the preset condition includes that a peak signal-to-noise ratio (PSNR) value of the second image exceeds a reference value.

The method of claim 11,
wherein the preset condition includes that the first loss value is less than or equal to a reference value.

The method of claim 11,
wherein the preset condition includes that a reduction rate of the first loss value is less than or equal to a reference value.

The method of claim 11,
wherein the preset condition includes that a reduction rate of the PSNR of the second image is greater than or equal to a reference value.

The method of claim 9,
wherein updating the image generator comprises updating weight parameters of the image generator while the feature generator is frozen.

The method of claim 9,
wherein updating the image generator comprises updating the image generator based on a weight assigned to a second loss value,
wherein the second loss value is an L1 loss value, and
wherein a first weight assigned to the second loss value at a first time point is lower than a second weight assigned to the second loss value at a second time point after the first time point.

The method of claim 1,
wherein the first image is a frame image of a video conference video received in real time.

An image resolution improvement system comprising:
a processor; and
a memory storing instructions,
wherein the instructions, when executed by the processor, cause the processor to perform:
an operation of generating a first feature map by inputting a first image into a feature generator; and
an operation of generating a second image by inputting the first feature map into an image generator,
wherein the second image has a higher resolution than the first image, and
wherein the image generator includes at least one deconvolution layer that performs up-sampling.