KR20230047687A

KR20230047687A - Apparatus for Restoring Image

Info

Publication number: KR20230047687A
Application number: KR1020210130813A
Authority: KR
Inventors: 신재섭; 류성걸; 손세훈; 김형덕; 김효성
Original assignee: 주식회사 픽스트리
Priority date: 2021-10-01
Filing date: 2021-10-01
Publication date: 2023-04-10
Also published as: KR102644142B1

Abstract

Disclosed is an image restoration apparatus. The embodiment provides an image restoration apparatus that restores images by implementing, in a single integrated artificial neural network, the ability to discriminate: single distortions occurring in both synthetic low-quality images (ranging from mosaics to low-level noise) and real low-quality images; compound distortions made up of numerous combinations and intensities; real distortions; distortions not learned during training; and whether or not restoration is possible.

Description

Image Restoration Apparatus {Apparatus for Restoring Image}

An embodiment of the present invention relates to an image restoration apparatus.

The contents described below merely provide background information related to the present embodiment and do not constitute prior art.

In general, techniques for restoring a low-resolution image to a high-resolution image are classified by the number of input images used for restoration or by the restoration technique. By the number of input images, they are divided into single-image super-resolution techniques and continuous-image super-resolution techniques.

Single-image super-resolution is generally faster than continuous-image super-resolution, but the quality of the restored image is lower because less information is available for restoration.

Continuous-image super-resolution uses various features extracted from a plurality of consecutively acquired images, so the restored image quality is better than with single-image super-resolution. However, the algorithm is complex and computationally heavy, which makes real-time processing difficult.

By restoration technique, there are methods based on interpolation, on edge information, on frequency characteristics, and on machine learning such as deep learning. Interpolation-based methods are fast but blur edges.

Edge-based methods are also fast and can restore an image while preserving edge sharpness, but when the edge direction is estimated incorrectly the result may contain visually prominent restoration errors.

Frequency-based methods use high-frequency components and, like edge-based methods, can restore an image while preserving edge sharpness, but they produce ringing artifacts near boundaries. Finally, example-based methods and machine-learning methods such as deep learning give the best restored image quality but are very slow.

As described above, among the existing high-resolution restoration techniques, continuous-image super-resolution can be applied where a digital zoom function based on conventional interpolation is needed, and it provides better image quality than interpolation-based restoration. However, because of their computational complexity, few existing super-resolution techniques can be applied to electro-optical equipment, which has limited resources and requires real-time processing.

Existing single-image super-resolution techniques that can run in real time suffer a large performance drop compared with continuous-image techniques when the image must be enlarged beyond a preset magnification.

An object of the present embodiment is to provide an image restoration apparatus that restores images by implementing, in a single integrated artificial neural network, the ability to discriminate: single distortions occurring in both synthetic low-quality images (ranging from mosaics to low-level noise) and real low-quality images; compound distortions made up of numerous combinations and intensities; real distortions; distortions not learned during training; and whether restoration is possible or impossible.

According to one aspect of the present embodiment, there is provided an image restoration network comprising: an inference network block in which a plurality of neural networks, each having independently learned image restoration for one of a plurality of image scales, are configured, and in which the image restoration results of the lower-scale neural networks are used as additional information for image restoration by the current-scale neural network; and a network module that, when the image restoration results of the plurality of lower-scale neural networks are used as additional information, applies a different weight to each scale in consideration of the trade-off between restoring image distortion and preserving image content; wherein image restoration is performed by progressively connecting the inference network block and the network module from the lowest scale to the highest scale.

According to another aspect of the present embodiment, there is provided an image restoration apparatus comprising: PD^{k-1} (k=1) (Pretrained Decoder), which receives only the image downscaled to the input scale of its stage (I^{k-1}_{content}, k=1), applies its trained parameters, and outputs the first scale-level upscaled restored image (I^{k}_{detail}, k=1), upscaled by a preset factor; a plurality of DAF modules (DAF^{k-1}, k>1), which output the optimal DAF output information of the current stage (I^{k-1}_{da-content}, k>1) based on available information including the DAF output images of the lower stages scaled to the input scale of the current stage (I^{L→k-1}_{da-content}, k>1, (k-1)>L≥0) (where, for k=2, I^{0→1}_{da-content} = I^{0→1}_{content}), the image downscaled to the input scale of the current stage (I^{k-1}_{content}, k>1), and the restored output image of the previous stage's PD (I^{k-1}_{detail}, k>1); and PD^{k-1} (k>1) (Pretrained Decoder), which receives the DAF output information (I^{k-1}_{da-content}, k>1) and the PD output of the previous stage as inputs and outputs the k-th scale-level upscaled restored image (I^{k}_{detail}, k>1).

As described above, according to the present embodiment, images can be restored by implementing, in a single integrated artificial neural network, the ability to discriminate: single distortions occurring in both synthetic low-quality images (ranging from mosaics to low-level noise) and real low-quality images; compound distortions made up of numerous combinations and intensities; real distortions; distortions not learned during training; and whether restoration is possible or impossible.

FIG. 1 is a diagram illustrating an image processing method in the PPD form according to the present embodiment.
FIG. 2 is a diagram illustrating an image processing method in the DAF-applied structure according to the present embodiment.
FIGS. 3A and 3B are diagrams illustrating the image processing method of the DAF^{k} module according to the present embodiment.

Hereinafter, the present embodiment is described in detail with reference to the accompanying drawings.

FIG. 1 shows the configuration of an artificial intelligence network for a PPD-type image restoration apparatus according to a first embodiment, together with its training method.

The artificial intelligence network for the image restoration apparatus of the present invention takes the form of a PPD (Pretrained Progressive Decoder), in which a plurality of PDs (Pretrained Decoders), each trained independently for one image scale, are connected in a cascade. An image restoration apparatus applying this network and its training method feeds the inference result produced by each PD for its image scale into the next connected PD.

[PPD (Inference) Network Structure]

In other words, the inference network block of stage k (PD^{k-1}, I^{k-1}_{detail}, I^{k-1}_{content} (or DAF^{k-1} in the second embodiment)) and the scale level k of that stage's inference result image (I^{k}_{detail}) are defined; a PD and, in the second embodiment, a DAF are configured at each stage; and the stages are connected sequentially, each lower stage to the next higher stage. Parameter inflating may be applied to the PD of each stage so that it can accept and process multiple inputs. At the lowest stage, however, only a single input (the input image scaled to that stage's scale) may be applied to the PD, so parameter inflating need not be applied to the lowest-stage PD.

That is, for k>1 (any stage other than the lowest), the result inferred by the previous stage's PD (I^{k-1}_{detail}) is input to the PD of the current stage (PD^{k-1}). Through parameter inflating, that PD accepts two inputs (two is an example; there may be more): the inference result of the previous PD (I^{k-1}_{detail}) and I^{k-1}_{content}, the input image scaled to that stage's input scale (or, in the case of FIG. 2, the output of that stage's DAF, DAF^{k-1}). From these inputs it infers the upscaled scale-level-k image (I^{k}_{detail}) for that stage.

The plurality of PDs (Pretrained Decoders) according to the present embodiment comprise PD^{k-1} (k=1) and PD^{k-1}_I (k>1). An image restoration apparatus applying the artificial intelligence network of the present invention has a PPD (Pretrained Progressive Decoder) structure in which PD^{k-1} (k=1) through PD^{k-1}_I (k>1) are connected in a cascade. In this structure, the inference result of each PD for its scale level is input to the PD of the next higher stage. The output of the highest-stage PD of the PPD network therefore yields the restored image at the highest scale level, upscaled relative to the input image.

PD^{k-1} (k=1) receives only the image downscaled to the input scale of its stage (I^{k-1}_{content}, k=1), applies its trained parameters, and outputs the first scale-level upscaled restored image (I^{k}_{detail}, k=1), upscaled by a preset factor.

PD^{k-1}_I (k>1) receives the image downscaled to the input scale of its stage (I^{k-1}_{content}, k>1) and the restored output image of the previous PD (I^{k-1}_{detail}, k>1). It applies its trained parameters to these two inputs and outputs the k-th scale-level upscaled restored image (I^{k}_{detail}, k>1), upscaled by a preset factor.

The operation of PD^0, which corresponds to PD^{k-1} (k=1), is described below by way of example.

PD^0 belongs to the first stage and receives only the 0th scale-level image (I^0_{content}), downscaled to that scale level. PD^0 applies its trained parameters to I^0_{content} and outputs the first scale-level restored image (I^1_{detail}), upscaled by a preset factor.

The operation of PD^1_I, PD^2_I, PD^3_I, and PD^4_I, which correspond to PD^{k-1}_I (k>1), is described below by way of example.

PD^1_I belongs to the second stage and receives the image downscaled to that stage's input scale (I^1_{content}) and the first scale-level restored image (I^1_{detail}). It applies its trained parameters to these inputs and outputs the second scale-level restored image (I^2_{detail}), upscaled by a preset factor.

PD^2_I belongs to the third stage and receives the image downscaled to that stage's input scale (I^2_{content}) and the second scale-level restored image (I^2_{detail}). It applies its trained parameters to these inputs and outputs the third scale-level restored image (I^3_{detail}), upscaled by a preset factor.

PD^3_I belongs to the fourth stage and receives the image downscaled to that stage's input scale (I^3_{content}) and the third scale-level restored image (I^3_{detail}). It applies its trained parameters to these inputs and outputs the fourth scale-level restored image (I^4_{detail}), upscaled by a preset factor.

PD^4_I belongs to the fifth stage and receives the image downscaled to that stage's input scale (I^4_{content}) and the fourth scale-level restored image (I^4_{detail}). It applies its trained parameters to these inputs and outputs the fifth scale-level restored image (I^5_{detail}), upscaled by a preset factor.

In the present embodiment, the fifth scale-level restored image (I^5_{detail}) is used as the output of the image upscaling restoration apparatus to which the artificial intelligence network and training method of the present invention are applied.
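
The data flow of the five-stage cascade can be illustrated with a minimal sketch. It is not the disclosed network: the nearest-neighbour scaler, the factor of 2 per stage, and the function names (resize_nearest, placeholder_pd, run_ppd) are assumptions made for illustration, and placeholder_pd merely averages its inputs where a real PD would be a trained decoder. Only the wiring (each stage consuming I^{k-1}_{content} and I^{k-1}_{detail} and producing I^{k}_{detail}) follows the description.

```python
from typing import List, Optional, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel values


def resize_nearest(img: Image, height: int, width: int) -> Image:
    """Nearest-neighbour resize; a stand-in for whatever scaler is really used."""
    src_h, src_w = len(img), len(img[0])
    return [
        [img[r * src_h // height][c * src_w // width] for c in range(width)]
        for r in range(height)
    ]


def placeholder_pd(content: Image, detail: Optional[Image], factor: int) -> Image:
    """Hypothetical PD: averages its inputs, then upscales by `factor`.

    A real PD is a trained decoder; this only reproduces the input/output sizes.
    """
    if detail is not None:
        content = [
            [(a + b) / 2.0 for a, b in zip(row_c, row_d)]
            for row_c, row_d in zip(content, detail)
        ]
    h, w = len(content), len(content[0])
    return resize_nearest(content, h * factor, w * factor)


def run_ppd(image: Image, base_hw: Tuple[int, int],
            num_stages: int = 5, factor: int = 2) -> Image:
    """Run the cascade: stage k consumes I^{k-1}_content and I^{k-1}_detail."""
    detail: Optional[Image] = None  # the first stage (PD^0) has no previous PD output
    h, w = base_hw
    for _ in range(num_stages):
        content = resize_nearest(image, h, w)             # I^{k-1}_content
        detail = placeholder_pd(content, detail, factor)  # I^{k}_detail
        h, w = h * factor, w * factor
    return detail
```

With num_stages=5 and factor=2 the final output is 32 times the base size in each dimension; the last value of detail plays the role of I^5_{detail}.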

[PPD Network Training Method]

The plurality of PDs (Pretrained Decoders) according to the present embodiment are first trained independently, one per image scale. They are then connected in a cascade in the PPD form, adding higher-stage PDs one at a time starting from the lowest stage. Each time a PD is added for a stage, fine-tuning is performed on the added PD only; PDs whose fine-tuning was completed at earlier stages are not trained again.

The full training network consists of a plurality of PDs, comprising PD^{k-1} (k=1) and PD^{k-1}_I (k>1). The subscript I indicates that parameter inflating has been applied to that PD. Parameter inflating is a method for expanding the input channels of a pretrained network to accept multiple inputs. It consists of copying the pretrained input channels multiple times, concatenating the copies, and adjusting the scale of the parameters in the concatenated channels.
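
The three steps of parameter inflating (copy, concatenate, rescale) can be sketched on the weights of a 1x1 convolution. The function names and the 1/n rescaling rule are assumptions for illustration; the description states only that the scale of the concatenated parameters is adjusted, not how. The 1/n rule is chosen here because it makes the inflated layer reproduce the pretrained output when all inputs are identical.

```python
from typing import List

Matrix = List[List[float]]  # weight[out_channel][in_channel] of a 1x1 convolution


def inflate_input_weights(weight: Matrix, num_inputs: int) -> Matrix:
    """Copy the input-channel weights `num_inputs` times, concatenate, rescale."""
    scale = 1.0 / num_inputs  # assumed rescaling rule, not taken from the source
    # List repetition concatenates `num_inputs` copies along the input-channel axis.
    return [[w * scale for w in row] * num_inputs for row in weight]


def apply_1x1(weight: Matrix, pixel: List[float]) -> List[float]:
    """Apply a 1x1 convolution to the channel vector of a single pixel."""
    return [sum(w * x for w, x in zip(row, pixel)) for row in weight]
```

For a pretrained layer with 3 input channels inflated to 2 inputs, the inflated layer expects 6 input channels, that is, the channels of both input images stacked.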

The image restoration network of the present invention configures one stage per image scale level and connects PD^{k-1} (k=1) through PD^{k-1}_I (k>1) stage by stage in a cascade.

Parameter inflating may be applied to PD^{k-1}_I (k>1) in the image restoration network so that it can accept a plurality of inputs.

To train a network with the PPD structure, the independently trained PD^{k-1} (k=1) through PD^{k-1}_I (k>1) are connected sequentially and progressively, and the PD of each stage is fine-tuned as it is connected, so that the whole network is trained. In other words, in a network structure such as the PPD it is difficult to train the PDs of all scale-level stages at once. To solve this problem, the per-stage PDs that make up the network are connected one after another, and for each newly added PD its initial pretrained parameters are fine-tuned to fit the network it is being added to, which completes the training of the PD newly connected for that stage. PDs that were already connected and fine-tuned at earlier stages are not trained; only the added PD is trained.
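
The freezing schedule described above can be sketched as follows. The Stage class and progressive_tuning function are hypothetical names, and the optimisation itself is left as a placeholder; the sketch shows only which stage is trainable at each step.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stage:
    name: str
    trainable: bool = False
    tuned: bool = False


def progressive_tuning(stage_names: List[str]) -> List[List[str]]:
    """Return, for each step, the names of the stages updated during that step."""
    connected: List[Stage] = []
    history: List[List[str]] = []
    for name in stage_names:
        for stage in connected:
            stage.trainable = False  # already fine-tuned: keep frozen
        new_stage = Stage(name, trainable=True)
        connected.append(new_stage)
        # Placeholder: a real implementation would optimise the trainable
        # parameters here, running the whole connected cascade forward.
        history.append([s.name for s in connected if s.trainable])
        new_stage.tuned = True
    return history
```

Calling progressive_tuning(["PD0", "PD1_I", "PD2_I"]) yields one single-element list per step, reflecting that exactly one PD is updated each time a stage is added.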

The scales could each be trained independently and the results simply connected afterwards. In the present invention, however, the PDs pretrained independently for each scale are connected stage by stage, and each newly connected PD is fine-tuned on the basis of the network built up to and including that PD, so that the networks for the successive scale levels are connected sequentially.

In this way the network parameters of PD^{k-1} (k=1) and PD^{k-1}_I (k>1) are each learned and set. In the PPD, the PD of each stage takes as input the available image information, such as the original scaled to that stage's input scale and the restored image upscaled by the previous stage's PD. It is trained with, as the target image, the initial clean original image scaled to the output scale level of that stage's PD. When training the artificial intelligence network of the present invention, input images with various distortions applied may be used as the original input, and the network must be trained so that, given such a distorted image as input, it outputs the initial clean original image, that is, the original input image without distortion.

PD^{k-1} (k=1) receives as input only the image scaled to the input scale of its stage (I^{k-1}_{content}, k=1), and is fine-tuned using as the target image the initial clean original image scaled to the scale level of its output, the k-th (k=1) scale-level restored image (I^{k}_{detail}, k=1).

PD^{k-1}_I (k>1) receives as input the image downscaled to the input scale of its stage (I^{k-1}_{content}, k>1) and the output scale-level restored image of the previous stage's PD (I^{k-1}_{detail}, k>1), and is fine-tuned using as the target image the initial clean original image scaled to the scale level of its output, the k-th (k>1) scale-level restored image (I^{k}_{detail}, k>1).

PD^{k}_I, the next stage, receives as input the image scaled to the input scale of its stage (I^{k}_{content}, k>1) and the output scale-level restored image of the current stage PD^{k-1}_I (k>1), namely I^{k}_{detail} (k>1), and is fine-tuned using as the target image the initial clean original image scaled to the scale level of its output, the (k+1)-th scale-level restored image (I^{k+1}_{detail}, k>1).
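
The relationship between each stage's input scale and its training target scale can be made concrete with a small calculation. The base size and the upscaling factor of 2 are assumptions (the description says only "a preset factor"), and training_shapes is a hypothetical helper name.

```python
from typing import Dict, List, Tuple


def training_shapes(base_hw: Tuple[int, int], num_stages: int,
                    factor: int = 2) -> List[Dict[str, object]]:
    """List, per stage k, the input size and the size of the clean target image."""
    shapes: List[Dict[str, object]] = []
    h, w = base_hw
    for k in range(1, num_stages + 1):
        shapes.append({
            "stage": k,
            "input_hw": (h, w),                     # size of I^{k-1}_content and I^{k-1}_detail
            "target_hw": (h * factor, w * factor),  # clean original scaled to level k
        })
        h, w = h * factor, w * factor
    return shapes
```

Under these assumptions the target size of stage k equals the input size of stage k+1, which is why the output of one PD can be fed directly into the next.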

FIG. 2 shows the configuration of an artificial intelligence network to which DAF is applied according to a second embodiment, together with its training method.

[PPD and DAF (Inference) Network Structure]

The artificial intelligence network of the present invention according to the second embodiment additionally applies a DAF (Domain Aware Fusion) module at each stage of the structure of the first embodiment (see FIG. 2).

That is, the second embodiment adds a DAF in front of the PD of each stage in the structure of the first embodiment. In the first embodiment, one of the inputs of each stage's PD is the input image scaled to that stage's input scale. In the second embodiment this image is not fed to the PD directly. Instead, a DAF module is placed at that input position; it uses the various information available at that stage, including the input image scaled to that stage's input scale, to generate the optimal information for obtaining that stage's PD output, and this information is then used as the input of that stage's PD.

That is, the plurality of DAF modules (DAF^{k-1}, k>1) perform the DAF of the current stage based on available information such as the DAF output images of the lower stages scaled to the input scale of the current stage (I^{L→k-1}_{da-content}, k>1, (k-1)>L≥0) (where, for k=2, I^{0→1}_{da-content} = I^{0→1}_{content}), the image downscaled to the input scale of the current stage (I^{k-1}_{content}, k>1), and the restored output image of the previous stage's PD (I^{k-1}_{detail}, k>1). The result is the optimal DAF output information (I^{k-1}_{da-content}, k>1), from which the optimal output of the current stage's PD can be obtained.
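
The index ranges in the preceding paragraph can be checked by listing the inputs of DAF^{k-1} for a given k. The helper below is a hypothetical illustration that works on symbol names only; it applies the k=2 substitution exactly as stated and makes no assumption about how L=0 is handled for larger k.

```python
from typing import List


def daf_inputs(k: int) -> List[str]:
    """Name the inputs of DAF^{k-1} for stage k (defined only for k > 1)."""
    if k <= 1:
        raise ValueError("DAF^{k-1} is only defined for k > 1")
    # Lower-stage DAF outputs rescaled to this stage's input scale: (k-1) > L >= 0.
    rescaled = [f"I^{{{L}->{k - 1}}}_da-content" for L in range(k - 1)]
    if k == 2:
        # Stated special case: I^{0->1}_da-content = I^{0->1}_content.
        rescaled = ["I^{0->1}_content"]
    return rescaled + [f"I^{{{k - 1}}}_content", f"I^{{{k - 1}}}_detail"]
```

For k=4, for example, the list contains three rescaled lower-stage entries (L = 0, 1, 2) followed by I^{3}_content and I^{3}_detail, so the number of DAF inputs grows by one with each stage.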

[DAF Structure and Function]

The DAF of the second embodiment is a module that generates the information needed to obtain the optimal upscaled restored image from the PD of its stage, and modules of various structures may be used. The structure and function shown in FIG. 3 are described as one example.

At each stage, the DAF module receives as input information the input image scaled to that stage's input scale, the DAF outputs of the lower stages scaled to that stage's input scale, and the restored output image of the previous stage's PD.

The DAF module of each stage computes the difference between the restored output image of the previous stage's PD and each of the other inputs: the input image scaled to that stage's input scale, and the DAF outputs of the lower stages scaled to that stage's input scale.

The DAF module applies a group convolution to each difference and feeds the results to a softmax so that the output values sum to 1. It then multiplies each softmax output by the restored output image of the previous stage's PD, sums the products to generate the DAF output information (I^{k-1}_{da-content}, k>1), and inputs it to the PD connected next.
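
The difference and softmax steps can be sketched per pixel as follows. This is a simplified illustration under stated assumptions: the learned group convolution is replaced by a hypothetical fixed scoring function (the negative absolute difference), the names softmax and daf_weights are invented here, and the final multiply-and-sum step is omitted because its exact form depends on the structure shown in FIG. 3.

```python
import math
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel values


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax; the outputs sum to 1."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def daf_weights(candidates: List[Image], pd_output: Image) -> List[Image]:
    """Per-pixel softmax weights, one weight map per candidate image."""
    h, w = len(pd_output), len(pd_output[0])
    weights = [[[0.0] * w for _ in range(h)] for _ in candidates]
    for r in range(h):
        for c in range(w):
            # Difference between each candidate and the previous PD output.
            diffs = [cand[r][c] - pd_output[r][c] for cand in candidates]
            # Hypothetical stand-in for the learned group convolution.
            scores = [-abs(d) for d in diffs]
            for i, p in enumerate(softmax(scores)):
                weights[i][r][c] = p
    return weights
```

With this stand-in scoring, a candidate that agrees with the PD output at a pixel receives a larger weight there, and the weights across candidates sum to 1 at every pixel, matching the normalisation described above.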

The DAF output information produced in this way allows the PD output of the lower stages to retain the features of the actual input image, without weakening them, as it passes through the successive stages. In other words, the DAF module can be designed, trained, and applied to perform whatever function the network of the present invention requires; as in the example of FIG. 3, it can also serve to keep the output of the inference network of the present invention on its intended course.

[Training Method for the PPD and DAF Networks]

The network training method of the second embodiment uses a progressive scheme, like the training method of the first embodiment. The difference from the first embodiment is that a DAF is added, so that during stage-by-stage training the DAF of each stage is trained together with the PD of that stage.

The plurality of PDs (Pretrained Decoders) according to the second embodiment are first trained independently, one per image scale. They are then connected in cascade in the PPD form, with the PD and DAF of each higher stage attached in order, starting from the lowest stage. Each time the PD and DAF of a stage are added, tuning training is performed on the added PD and DAF only; PDs and DAFs whose tuning training was completed in an earlier stage are not trained again.

The overall training network consists of a plurality of PDs and DAFs. The PDs comprise PD^{k-1} (k=1) and PD^{k-1}_I (k>1), and the DAFs comprise DAF^{k-1} (k>1). The subscript I indicates that parameter inflating has been applied to that PD. Parameter inflating is a method for expanding the input channels of a pretrained network into multiple copies. It consists of copying the pretrained input channels multiple times, concatenating the copies, and adjusting the scale of the parameters in the concatenated channels.
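The three steps of parameter inflating (copy, concatenate, rescale) can be illustrated with a minimal sketch. The specification does not give the scale adjustment, so the factor 1/copies used here is an assumption, chosen so that the inflated layer reproduces the original output when every copy receives the same input. The function names and the 1x1-convolution weight layout `[out][in]` are also hypothetical.

```python
def inflate_input_channels(weight, copies):
    """Copy the input-channel weights `copies` times, concatenate them,
    and rescale by 1/copies (assumed scale adjustment)."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    inflated = []
    for row in weight:  # one row per output channel
        new_row = []
        for _ in range(copies):
            # Copy step and concatenation step, with the assumed rescaling.
            new_row.extend(w / copies for w in row)
        inflated.append(new_row)
    return inflated


def apply_1x1(weight, x):
    """Apply a 1x1 convolution at a single pixel: y[o] = sum_i weight[o][i] * x[i]."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


pretrained = [[1.0, 2.0], [3.0, 4.0]]  # 2 output channels, 2 input channels
pixel = [0.5, -1.0]

inflated = inflate_input_channels(pretrained, copies=2)
original_out = apply_1x1(pretrained, pixel)
# Feeding the same pixel to both copies of the inflated input.
inflated_out = apply_1x1(inflated, pixel + pixel)
```

With this choice of scaling, an inflated PD starts from the same behaviour as the pretrained PD whenever its two inputs agree, which gives the tuning training a sensible starting point.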

To train a network of the PPD structure, the independently trained PD^{k-1} (k=1) through PD^{k-1}_I (k>1) and the initialized DAF^{k-1} (k>1) of each stage are connected progressively, one stage at a time. Tuning training of each stage's PD and training of each stage's DAF are carried out as the stages are connected, so that the whole network is trained. In a network structure such as the PPD, it is difficult to train the PDs and DAFs of all stages at once. To address this, the PD and DAF of each stage, which are the components of the network, are connected in sequence. For each newly added PD, tuning training adjusts its existing initially trained parameters to fit the extended network; for each newly added DAF, training fits its initialized parameters to the extended network. This completes the training of the PD and DAF newly connected for that stage. The PDs and DAFs that were connected and trained up to the previous stage are not trained; only the added PD and DAF are.
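The connect-then-freeze schedule described above can be sketched as follows. This is an illustration of the ordering only: the `Module` class, the `trainable` flag, and the string labels are hypothetical, and the actual optimisation step is omitted.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    trainable: bool = True


def progressive_schedule(num_stages):
    """Return, for each stage k, the names of the modules trained at that stage."""
    network = []
    trained_per_stage = []
    for k in range(1, num_stages + 1):
        # Freeze everything that was connected and trained in earlier stages.
        for module in network:
            module.trainable = False
        # Stage 1 adds only PD^0; later stages add DAF^{k-1} and PD^{k-1}_I.
        if k == 1:
            added = [Module("PD^0")]
        else:
            added = [Module(f"DAF^{k - 1}"), Module(f"PD^{k - 1}_I")]
        network.extend(added)
        # A real training loop would update only the trainable modules here.
        trained_per_stage.append([m.name for m in network if m.trainable])
    return trained_per_stage


schedule = progressive_schedule(3)
```

For three stages, the schedule trains PD^0 alone first, then DAF^1 with PD^1_I, then DAF^2 with PD^2_I, with all earlier modules frozen each time.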

The training for the plurality of scales could be performed independently for each scale and the results simply connected afterwards. In the present invention, however, the PDs trained independently in advance for each scale are connected stage by stage. Based on the network configuration up to and including the newly connected PD and DAF, tuning training is performed on the newly connected PD and training is performed on the additionally connected DAF, so that the per-stage networks are connected sequentially.

In this way, the network parameters of PD^{k-1} (k=1), PD^{k-1}_I (k>1), and DAF^{k-1} (k>1) are each learned and set.

In the PPD-structure network of the second embodiment, the PD of each stage takes as input the output image of the previous-stage PD and the DAF output information of its own stage. Its training target is the original clean image scaled to match the output scale of the PD at that scale level.
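A minimal sketch of how per-stage training targets might be derived from the clean original is given below. The specification only says that the clean image is scaled to each PD's output scale; the factor of 2 between stages, the box-average downscaling, and the function names are all assumptions made for illustration.

```python
def box_downscale(image, factor):
    """Downscale a 2D grayscale image (list of rows) by averaging factor x factor blocks."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image size must be divisible by the factor")
    out = []
    for r in range(0, height, factor):
        row = []
        for c in range(0, width, factor):
            block = [image[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def stage_targets(clean, num_stages, step=2):
    """Map each stage k to the clean image scaled to that stage's output scale.

    Assumes the last stage outputs the full resolution and each earlier
    stage is smaller by `step` (an assumed preset multiple).
    """
    targets = {}
    for k in range(1, num_stages + 1):
        factor = step ** (num_stages - k)
        targets[k] = clean if factor == 1 else box_downscale(clean, factor)
    return targets


clean = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
targets = stage_targets(clean, num_stages=3)
```

In this toy case the 4x4 clean image yields a 1x1 target for stage 1, a 2x2 target for stage 2, and the unmodified image for stage 3.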

The DAF of the stage is trained at the same time. Available information, such as the upscaled restored image output by the previous-stage PD and the DAF outputs of lower stages scaled to the DAF input scale of the current stage, is applied as input to the DAF of the stage, and the DAF's output is applied as input to the PD of the stage. The DAF is trained together with that PD so that the PD's output becomes the training target image, namely the original clean image scaled to the corresponding scale level.

PD^{k-1} (k=1) receives as input only the image scaled to the input scale of its stage (I^{k-1}_content, k=1). Tuning training is performed using, as the target image, the original clean image scaled to the scale level of the k-th (k=1) scale level restored image (I^k_detail, k=1), which is the output of PD^{k-1} (k=1).

DAF^{k-1} (k>1) can use as input the various information available from the stages below its own. For example, as in FIG. 2, it can use all available information, such as: the image obtained by scaling I^0_content to the input scale of the current stage; the DAF output information of all lower stages before the current stage (the DAF^{k-1} output information for all k with current stage > k > 1), scaled to the input scale of the current stage; the original input image downscaled to the input scale of the current stage (I^{k-1}_content, k>1); and the restored image output by the previous-stage PD (I^{k-1}_detail, k>1). DAF^{k-1} (k>1) is trained so that, given this input information, it outputs the optimal image information that enables PD^{k-1} (k>1) of the corresponding stage to generate the training target image.

PD^{k-1}_I (k>1) receives as input the output information of DAF^{k-1} (k>1) and the restored image output by the previous-stage PD (I^{k-1}_detail, k>1). Tuning training is performed using, as the target image, the original clean image scaled to the scale level of the k-th (k>1) scale level upscaled restored image (I^k_detail, k>1), which is the output of PD^{k-1}_I (k>1).
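The input and output relations of the stages described above can be summarized as follows. This is a sketch for orientation: the symbols T^k, scale_k, and I_clean are introduced here for the training target and are not notation from the specification, the braces denote the set of lower-stage DAF outputs rescaled to the stage's input scale, and no particular loss function is implied.

```latex
\begin{aligned}
I^{1}_{detail} &= PD^{0}\bigl(I^{0}_{content}\bigr) \\
I^{k-1}_{da\text{-}content} &= DAF^{k-1}\bigl(I^{k-1}_{content},\ \{I^{L\to k-1}_{da\text{-}content}\},\ I^{k-1}_{detail}\bigr), && k>1 \\
I^{k}_{detail} &= PD^{k-1}_{I}\bigl(I^{k-1}_{da\text{-}content},\ I^{k-1}_{detail}\bigr), && k>1 \\
T^{k} &= \operatorname{scale}_{k}\bigl(I_{clean}\bigr)
\end{aligned}
```

At each stage k, the newly added modules are trained so that I^{k}_{detail} approaches T^{k}, the clean original scaled to that stage's output scale.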

The image restoration apparatus according to the present embodiment includes a training unit.

The training unit trains each of the plurality of PDs (Pretrained Decoders) independently according to the image scale.

The training unit performs training while connecting the plurality of PDs and the plurality of DAFs stage by stage, from the lower scale level to the highest scale level. It ensures that PDs and DAFs already trained in an earlier stage are not trained again.

The training unit selects, stage by stage, the PD used to obtain the image output of each scale level from among the plurality of PDs. For the PD used in each stage, it performs the initial training independently for each scale of the input image. It then trains stage by stage, starting from the lower scale level, on a structure in which the PD trained for each scale level is connected to the DAF placed in front of it. The training unit performs progressive training: after the next stage is connected to the network trained so far, the earlier stages are neither trained nor updated while the newly added stage is trained.

For every PD other than the one used at the first scale level, the training unit applies parameter inflating so that the PD can be trained with both the output of the previous-stage PD and the output of the DAF of its own stage as input.

The above description merely illustrates the technical idea of the present embodiment, and those of ordinary skill in the art to which the present embodiment pertains will be able to make various modifications and variations without departing from its essential characteristics. Accordingly, the present embodiments are intended to explain, not to limit, the technical idea of the present embodiment, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the present embodiment should be construed according to the claims below, and all technical ideas within a scope equivalent thereto should be construed as falling within the scope of rights of the present embodiment.

Claims

An image restoration network comprising:
an inference network block in which a plurality of neural networks are configured, each independently trained for image restoration at its own image scale, and in which the image restoration results of lower-scale neural networks are used as additional information for image restoration by the current-scale neural network; and
a network module configured to apply different weights for each scale, in consideration of the trade-off between restoring image distortion and preserving image content, when the image restoration results of the plurality of lower-scale neural networks are used as additional information;
wherein image restoration is performed by progressively connecting the inference network block and the network module from the lowest scale to the highest scale.

An image restoration apparatus comprising:
a PD^{k-1} (k=1) (Pretrained Decoder) that outputs a first scale level upscaled restored image (I^k_detail, k=1); and
a plurality of DAF modules (DAF^{k-1}, k>1) that output the optimal DAF output information of the current stage (I^{k-1}_{da-content}, k>1) based on available information including: images obtained by scaling the DAF output images of the lower stages to the input scale of the corresponding stage (I^{L→k-1}_{da-content}, k>1, (k-1)>L≥0; provided that, when k=2, I^{0→1}_{da-content} = I^{0→1}_{content}); an image downscaled to the input scale of the corresponding stage (I^{k-1}_content, k>1); and the restored image output by the previous-stage PD (I^{k-1}_detail, k>1).

The image restoration apparatus of claim 2, further comprising:
a PD^{k-1}_I (k>1) that receives as input the output (I^{k-1}_{da-content}, k>1) of the DAF module (DAF^{k-1}, k>1) of the corresponding stage and the restored image (I^{k-1}_detail, k>1) output by the previous-stage PD, and that, by applying the result of artificial-intelligence training to these inputs, outputs a k-th scale level upscaled restored image (I^k_detail, k>1) upscaled by a preset multiple.

The image restoration apparatus of claim 3, having a PPD (Pretrained Progressive Decoder) form in which the PD^{k-1}_I (k>1) are sequentially connected in cascade after the independently trained PD^{k-1} (k=1).

The image restoration apparatus of claim 2, wherein each of the plurality of DAF modules (DAF^{k-1}, k>1) receives as input at each stage: the image downscaled to the input scale of the corresponding stage (I^{k-1}_content, k=1); images obtained by scaling the optimal DAF output information (I^{k-1}_{da-content}, k>1) of the lower stages to the input scale of the corresponding stage; and the restored image output by the previous-stage PD.

The image restoration apparatus of claim 5, wherein the plurality of DAF modules (DAF^{k-1}, k>1) compute the respective difference values between, on the one hand, the image downscaled to the input scale of the corresponding stage (I^{k-1}_content, k=1) and the images obtained by scaling the optimal DAF output information (I^{k-1}_{da-content}, k>1) of the lower stages to the input scale of the corresponding stage, and, on the other hand, the restored image output by the previous-stage PD.

The image restoration apparatus of claim 6, wherein the plurality of DAF modules (DAF^{k-1}, k>1) apply a group convolution to each of the difference values, input the results to a softmax so that the output values sum to 1, multiply each softmax output by the restored image output by the previous-stage PD, sum the products to generate the optimal DAF output information (I^{k-1}_{da-content}, k>1), and input that information to the connected PD.