KR20210141303A

KR20210141303A - A method and an apparatus for extracting a noise of a compressed image and a method and an apparatus for reducing a noise of a compressed image

Info

Publication number: KR20210141303A
Application number: KR1020200158043A
Authority: KR
Inventors: 박재연; 안일준; 이태미
Original assignee: 삼성전자주식회사
Priority date: 2020-05-15
Filing date: 2020-11-23
Publication date: 2021-11-23

Abstract

Provided is a method for reducing noise of a compressed image. A first image is obtained by applying, to a compressed image of an original image, a convolution filter that sequentially performs down-convolution and up-convolution corresponding to the down-convolution. A second image is obtained by subtracting the first image from the compressed image. Noise including high-frequency information and compression artifacts of the compressed image is obtained from the second image. A third image is obtained by applying a DNN for removing compression artifacts to the noise. The compressed image is then restored by summing the first image and the third image.

Description

A method and an apparatus for extracting a noise of a compressed image and a method and an apparatus for reducing a noise of a compressed image

The present disclosure relates to a method and an apparatus for extracting and reducing noise of a compressed image. More specifically, the present disclosure relates to a method and an apparatus for extracting and reducing, based on artificial intelligence (AI), noise that includes compression artifacts generated when an image is compressed.

An image is encoded by a codec conforming to a predetermined data compression standard, for example, a Moving Picture Expert Group (MPEG) standard, and is then stored on a recording medium in the form of a bitstream or transmitted through a communication channel.

Compression artifacts may occur during image compression. Accordingly, there is a need to improve the quality of a compressed image by reducing such compression artifacts.

A technical objective of the method for extracting noise of a compressed image, the method for reducing noise of a compressed image, and the apparatuses according to an embodiment is to improve the quality of a compressed image. To this end, structural information that is highly likely to be restorable and a high-frequency noise region that is highly likely to be damaged are separated from the compressed image based on AI, and the compression artifacts generated during image compression are reduced.

To achieve the above technical objective, a method of extracting noise of a compressed image proposed in the present disclosure may include: obtaining a first image by applying, to a compressed image of an original image, a convolution filter that sequentially performs down-convolution and up-convolution corresponding to the down-convolution; obtaining a second image by subtracting the first image from the compressed image; and obtaining, from the second image, noise including high-frequency information and compression artifacts of the compressed image.

To achieve the above technical objective, a method of reducing noise of a compressed image proposed in the present disclosure may include: obtaining a first image by applying, to a compressed image of an original image, a convolution filter that sequentially performs down-convolution and up-convolution corresponding to the down-convolution; obtaining a second image by subtracting the first image from the compressed image; obtaining, from the second image, noise including high-frequency information and compression artifacts of the compressed image; obtaining a third image by applying a DNN for removing compression artifacts to the noise to remove the compression artifacts; and restoring the compressed image by summing the first image and the third image.

To achieve the above technical objective, an apparatus for extracting noise of a compressed image proposed in the present disclosure includes: a memory storing one or more instructions; and a processor executing the one or more instructions stored in the memory. The processor may obtain a first image by applying, to a compressed image of an original image, a convolution filter that sequentially performs down-convolution and up-convolution corresponding to the down-convolution, obtain a second image by subtracting the first image from the compressed image, and obtain, from the second image, noise including high-frequency information and compression artifacts of the compressed image.

To achieve the above technical objective, an apparatus for reducing noise of a compressed image proposed in the present disclosure includes: a memory storing one or more instructions; and a processor executing the one or more instructions stored in the memory. The processor may obtain a first image by applying, to a compressed image of an original image, a convolution filter that sequentially performs down-convolution and up-convolution corresponding to the down-convolution, obtain a second image by subtracting the first image from the compressed image, obtain, from the second image, noise including high-frequency information and compression artifacts of the compressed image, obtain a third image by applying a DNN for removing compression artifacts to the noise to remove the compression artifacts, and restore the compressed image by summing the first image and the third image.

Using a convolution filter that includes down-convolution and up-convolution operations, together with subtraction and summation operations, the compressed image is separated into an image containing low-frequency and intermediate-frequency information and an image having noise that contains high-frequency information and compression artifacts, so that the noise is extracted. The image having the noise is then input to a DNN for removing compression artifacts to remove the compression artifacts, and the result is summed again with the image containing low-frequency and intermediate-frequency information. In this way, the quality of the compressed image may be improved.

A brief description of each drawing is provided so that the drawings cited in this specification can be more fully understood.
FIG. 1 is a diagram for explaining a method of extracting and reducing noise of a compressed image according to an embodiment.
FIG. 2 is an exemplary diagram illustrating the down-convolution and up-convolution operations performed in a convolution filter.
FIG. 3 is an exemplary diagram illustrating a DNN for removing compression artifacts.
FIG. 4 is a block diagram illustrating the configuration of an apparatus for extracting noise of a compressed image according to an embodiment.
FIG. 5 is a block diagram illustrating the configuration of an apparatus for reducing noise of a compressed image according to an embodiment.
FIG. 6 is an exemplary diagram for explaining a method of training the convolution filter.
FIG. 7 is an exemplary diagram for explaining a method of training the AI-based DNN for reducing artifacts.
FIG. 8 is a flowchart illustrating a method of extracting noise of a compressed image according to an embodiment.
FIG. 9 is a flowchart illustrating a method of reducing noise of a compressed image according to an embodiment.

Because the present disclosure may be modified in various ways and may have various embodiments, specific embodiments are illustrated in the drawings and described in detail in the detailed description. However, this is not intended to limit the present disclosure to particular embodiments, and the present disclosure should be understood to include all modifications, equivalents, and substitutes falling within the spirit and technical scope of the various embodiments.

In describing the embodiments, detailed descriptions of related known technologies are omitted when it is determined that they may unnecessarily obscure the gist of the present disclosure. In addition, numbers (for example, first, second, and so on) used in the description are merely identifiers for distinguishing one component from another.

In addition, in this specification, when one component is referred to as being "connected" or "coupled" to another component, the one component may be directly connected or directly coupled to the other component. Unless there is a description specifically to the contrary, however, it should be understood that the one component may also be connected or coupled via another component in between.

In addition, for components expressed in this specification as '~unit', 'module', and the like, two or more components may be combined into one component, or one component may be divided into two or more components according to more finely divided functions. Each of the components described below may, in addition to its own main function, additionally perform some or all of the functions of other components, and some of the main functions of each component may of course be performed exclusively by another component.

In addition, in this specification, an 'image' or a 'picture' may denote a still image, a moving picture composed of a plurality of consecutive still images (or frames), or a video.

In addition, in this specification, a 'deep neural network (DNN)' is a representative example of an artificial neural network model that simulates brain nerves, and is not limited to an artificial neural network model that uses a specific algorithm.

In addition, in this specification, a 'parameter' is a value used in the computation of each layer constituting a neural network and may include, for example, a weight used when an input value is applied to a predetermined computation formula. A parameter may be expressed in matrix form. A parameter is a value set as a result of training and may be updated through separate training data as needed.

In addition, in this specification, a 'convolution filter' means a trained filter used to obtain an image having only the intermediate-frequency and low-frequency information of a compressed image by sequentially performing down-convolution and up-convolution on the compressed image. It is a kind of DNN in that two convolution operations are performed and the filter is trained. A 'DNN for AI compression artifact removal' means a DNN used to remove the compression artifacts from an image having the high-frequency information and compression artifacts of a compressed image.

In addition, in this specification, 'filter setting information' is information related to the convolution filter and includes the parameters described above. The convolution filter may be configured using the filter setting information.

In addition, in this specification, 'DNN setting information' is information related to the elements constituting a DNN and includes the parameters described above. The DNN for AI compression artifact removal may be configured using the DNN setting information.

In addition, in this specification, an 'original image' means an image that is the target of image compression, a 'compressed image' means an image generated as a result of compressing the original image, and a 'first image' means an image obtained by applying the convolution filter, in which down-convolution and up-convolution are sequentially performed. A 'second image' means an image obtained by subtracting the first image from the compressed image, a 'third image' means an image obtained by AI compression artifact removal applied to the second image in the compression artifact removal process, and a 'restored image' means an image obtained by summing the first image and the third image.

In addition, in this specification, 'subtraction' means element-wise subtraction, in which pixel values are subtracted pixel by pixel, and 'summation' means element-wise summation, in which the pixel values of the images are added pixel by pixel.

In addition, in this specification, a 'compression artifact' means an error caused by damage to the areas around edges and to textures during image compression. Typical compression artifacts include block artifacts and ringing artifacts.

FIG. 1 is a diagram for explaining a method of extracting and reducing noise of a compressed image according to an embodiment.

A compressed image generated by compressing an original image contains noise, including compression artifacts produced during compression. Accordingly, a way to effectively separate and extract this noise and to effectively reduce the compression artifacts included in the noise is needed.

A compressed image includes a region containing structural information that is highly likely to be restorable (low-frequency band information and intermediate-frequency band information) and a high-frequency noise region that is highly likely to be damaged (high-frequency band information and compression artifacts).

As shown in FIG. 1, according to an embodiment of the present disclosure, the compressed image 105 is applied to the convolution filter 110, which sequentially performs down-convolution 115 and up-convolution 125, to obtain the first image 135, and the first image 135 is subtracted 120 from the compressed image 105 to obtain the second image 145. The second image 145 is then input to the DNN 130 for reducing artifacts to obtain the third image 155, and the first image 135 and the third image 155 are summed 140 to obtain the restored image 165.

The convolution filter 110 is a filter that sequentially performs the down-convolution 115 and the up-convolution 125 corresponding to the down-convolution 115, and is trained to take the compressed image 105 as input and obtain the first image 135 having the low-frequency and intermediate-frequency information of the compressed image 105. That is, applying the convolution filter 110 yields the first image 135 having the low-frequency and intermediate-frequency information of the compressed image 105, and subtracting the first image 135 from the compressed image 105 yields the second image 145 having the high-frequency information and compression artifacts of the compressed image 105. The convolution operations of the convolution filter are described later with reference to FIG. 2, and the training method of the convolution filter is described later with reference to FIG. 6.

The DNN 130 for reducing artifacts is trained to take as input the second image 145, which has the high-frequency information and compression artifacts of the compressed image 105, and to obtain the third image 155 in which the compression artifacts are reduced. That is, applying the DNN 130 for reducing artifacts yields the third image 155 with reduced compression artifacts, and summing the first image 135 and the third image 155 yields the restored image 165, in which the compression artifacts of the compressed image 105 are reduced. The structure of the DNN for reducing artifacts is described later with reference to FIG. 3, and its training method is described later with reference to FIG. 7.
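The data flow of FIG. 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `box_blur` is a hypothetical stand-in for the trained convolution filter 110, and `shrink` is a hypothetical stand-in for the trained DNN 130. Neither is the trained component described in this disclosure; only the order of the operations (filter, subtract, DNN, sum) follows the text.

```python
def box_blur(image):
    """Hypothetical stand-in for convolution filter 110: 3x3 mean with edge replication."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            row.append(sum(vals) / 9.0)
        out.append(row)
    return out


def shrink(residual, factor=0.5):
    """Hypothetical stand-in for DNN 130: merely attenuates the residual."""
    return [[factor * v for v in row] for row in residual]


def subtract(a, b):
    """Element-wise subtraction of two equally sized images."""
    return [[pa - pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


def add(a, b):
    """Element-wise summation of two equally sized images."""
    return [[pa + pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


def restore(compressed, low_pass=box_blur, artifact_dnn=shrink):
    first = low_pass(compressed)          # low and intermediate frequencies (135)
    second = subtract(compressed, first)  # high frequencies + artifacts (145)
    third = artifact_dnn(second)          # residual after artifact reduction (155)
    restored = add(first, third)          # restored image (165)
    return restored, first, second, third
```

Because the second image is defined as the compressed image minus the first image, the split itself loses nothing: if the DNN stand-in returned its input unchanged, the restored image would equal the compressed image up to floating-point rounding. Any improvement therefore comes entirely from what the DNN does to the second image.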

The above-described method of extracting noise that includes compression artifacts and reducing the compression artifacts may be performed multiple times. It may also be performed for each block unit of an image, or multiple times for each block unit of an image. In addition, whenever compression artifacts need to be removed after image compression has been performed, the above-described method of extracting noise that includes compression artifacts and reducing the compression artifacts may be applied at any stage.
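Block-wise application can be sketched as follows. The helper name `apply_blockwise`, the square block shape, and the requirement that the image dimensions be multiples of the block size are assumptions made for this illustration; the disclosure does not specify how blocks are chosen.

```python
def apply_blockwise(image, block, fn):
    """Apply fn to every block x block tile of a 2-D image and reassemble the result.

    Assumes the image height and width are multiples of `block` and that fn
    returns a tile of the same size as its input.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for top in range(0, h, block):
        for left in range(0, w, block):
            tile = [row[left:left + block] for row in image[top:top + block]]
            result = fn(tile)
            for dy, row in enumerate(result):
                out[top + dy][left:left + block] = row
    return out


def repeat(fn, times):
    """Return a function that applies fn `times` times in sequence."""
    def repeated(tile):
        for _ in range(times):
            tile = fn(tile)
        return tile
    return repeated
```

Here `fn` would be the per-block noise extraction and artifact reduction procedure, and `repeat(fn, n)` corresponds to performing it several times on each block.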

If artifacts are reduced based on AI without first separating the compressed image into an image containing low-frequency and intermediate-frequency information and an image containing high-frequency information and compression artifacts, the compression artifacts that should be reduced cannot be distinguished from the information that should be preserved, so the compression artifacts alone cannot be reduced. In contrast, as described above with reference to FIG. 1, a trained convolution filter is used to separate the compressed image into an image containing low-frequency and intermediate-frequency information and an image containing compression artifacts and the high-frequency information associated with them. A trained DNN for reducing artifacts is then used to remove the compression artifacts from the image containing high-frequency information and compression artifacts. In this way, the reduction can be concentrated on the compression artifacts that were generated during image compression and were not present in the original image.

FIG. 2 is an exemplary diagram illustrating the down-convolution and up-convolution operations performed in the convolution filter.

As shown in FIG. 2, the down-convolution 115 is a convolution operation that applies a 3x3 down filter kernel 220 to a 4x4 input feature map 210 according to a stride and thereby obtains a 2x2 output feature map 230 of reduced size. The up-convolution 125 is in a transposed relationship with the down-convolution 115. It is a convolution operation that applies a 3x3 up filter kernel 250, corresponding to the down filter kernel 220, to the reduced 2x2 input feature map 240 according to the stride and thereby obtains a 4x4 output feature map 260 whose size is expanded back to the original size.

The down-convolution 115 produces the output feature map 230, which is smaller than the input feature map 210, so input information corresponding to the high-frequency region is lost to the extent that pixels are removed. The subsequent up-convolution 125 then enlarges the feature map again to restore the number of pixels.

That is, the input feature map 210, which carries large-dimensional information, is reduced to a small dimension by a down-convolution operation using a trained filter kernel, yielding the small-dimensional output feature map 230 in which high-frequency information is lost. The small-dimensional input feature map 240 is then expanded to a large dimension by an up-convolution operation using a trained filter kernel corresponding to the down-convolution operation, yielding the output feature map 260 of the original large dimension.
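The 4x4 to 2x2 to 4x4 size change of FIG. 2 can be reproduced with a small pure-Python sketch. The stride of 2 and zero padding of 1 are assumptions chosen so that a 3x3 kernel gives exactly these sizes; the disclosure only says the kernels are applied "according to a stride". The kernel values would come from training and are arbitrary here.

```python
def down_conv(x, k, stride=2, pad=1):
    """Strided convolution (cross-correlation) of a square map x with kernel k."""
    n, ks = len(x), len(k)
    p = n + 2 * pad
    xp = [[0.0] * p for _ in range(p)]          # zero-padded copy of x
    for i in range(n):
        for j in range(n):
            xp[i + pad][j + pad] = x[i][j]
    m = (p - ks) // stride + 1                  # output size
    return [[sum(k[a][b] * xp[i * stride + a][j * stride + b]
                 for a in range(ks) for b in range(ks))
             for j in range(m)] for i in range(m)]


def up_conv(y, k, out_size, stride=2, pad=1):
    """Transposed convolution: scatter each sample of y through kernel k."""
    m, ks = len(y), len(k)
    p = out_size + 2 * pad
    canvas = [[0.0] * p for _ in range(p)]
    for i in range(m):
        for j in range(m):
            for a in range(ks):
                for b in range(ks):
                    canvas[i * stride + a][j * stride + b] += k[a][b] * y[i][j]
    # Drop the padding border to return to the original spatial size.
    return [row[pad:pad + out_size] for row in canvas[pad:pad + out_size]]


x = [[float(4 * i + j + 1) for j in range(4)] for i in range(4)]  # 4x4 input
down_kernel = [[1 / 9.0] * 3 for _ in range(3)]                   # placeholder weights
up_kernel = [[1 / 4.0] * 3 for _ in range(3)]                     # placeholder weights
small = down_conv(x, down_kernel)             # 2x2 feature map
large = up_conv(small, up_kernel, out_size=4) # 4x4 feature map
```

With the same kernel in both functions, `up_conv` is the exact transpose (adjoint) of `down_conv` as a linear map, which is one way to read the "transposed relationship" in the text. In a trained filter the two kernels may well be learned separately.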

Accordingly, the first image 135, which contains the low-frequency and intermediate-frequency information of the compressed image 105 input to the convolution filter, is obtained.

According to an embodiment, the convolution filter may be a DNN composed of one convolution layer that performs down-convolution and one convolution layer that performs up-convolution.

Instead of a fixed filter that reduces and enlarges size by interpolation, a trained convolution filter is used that sequentially performs a size-reducing convolution operation and a size-enlarging transposed convolution. As a result of the subtraction operation that subtracts the convolution-filtered image from the compressed image, an image of the noise region containing high-frequency information and compression artifacts can be effectively separated from the compressed image. The separated noise-region image can then be applied to the DNN for removing compression artifacts, described later with reference to FIG. 3, to effectively reduce the compression artifacts.

The convolution operations described with reference to FIG. 2 are only an example, and the present disclosure is not limited thereto.

FIG. 3 is an exemplary diagram illustrating a DNN for removing compression artifacts.

As shown in FIG. 3, the second image 145, which has noise containing the high-frequency information and compression artifacts of the compressed image 105, is input to the first convolution layer 305. The label 3X3X4 on the first convolution layer 305 in FIG. 3 indicates that convolution is performed on one input image using four filter kernels of size 3 x 3. As a result of the convolution, four feature maps are generated by the four filter kernels. Each feature map represents distinct characteristics of the second image 145. For example, each feature map may represent a vertical-direction characteristic, a horizontal-direction characteristic, an edge characteristic, a characteristic exhibited by compression artifacts, or the like, of the second image 145.

The feature maps generated by the convolution may be input to a batch normalization layer, which computes the mean and standard deviation of each feature and normalizes per batch. Because batch normalization normalizes and learns in units of mini-batches, the learning speed is improved. Here, a batch means a portion of the data.
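As a minimal sketch of the normalization step, the following function normalizes each feature over a mini-batch using its mean and standard deviation. It treats each sample as a flat list of feature values, and it omits the learnable scale and shift that batch normalization layers usually carry; both simplifications, as well as the small constant `eps`, are assumptions of this illustration rather than details given in the disclosure.

```python
import math


def batch_norm(batch, eps=1e-5):
    """Normalize each feature (column) over the mini-batch (rows).

    batch: list of samples, each a list of feature values.
    eps:   small constant that avoids division by zero.
    """
    n = len(batch)
    n_feat = len(batch[0])
    means = [sum(sample[f] for sample in batch) / n for f in range(n_feat)]
    variances = [sum((sample[f] - means[f]) ** 2 for sample in batch) / n
                 for f in range(n_feat)]
    return [[(sample[f] - means[f]) / math.sqrt(variances[f] + eps)
             for f in range(n_feat)]
            for sample in batch]


mini_batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = batch_norm(mini_batch)  # each column now has mean ~0 and variance ~1
```

After normalization the two features are on the same scale even though the second one was originally ten times larger, which is the property that tends to make training faster and more stable.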

The feature maps output from the first convolution layer 305 are input to the first batch normalization layer 310 and then to the first activation layer 315.

As another example, batch normalization is optional, and the feature maps output from the first convolution layer may be input directly to the first activation layer 315 without batch normalization.

The first activation layer 315 may impart a non-linear characteristic to each feature map. The first activation layer 315 may include a sigmoid function, a Tanh function, a ReLU (Rectified Linear Unit) function, or the like, but is not limited thereto.
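The three activation functions named above can be written with the Python standard library as follows. The helper `activate`, which applies a function to every sample of a feature map, is a name introduced only for this illustration.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tanh(x):
    """Hyperbolic tangent, output in (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """Rectified Linear Unit: passes positive values, zeroes out the rest."""
    return max(0.0, x)


def activate(feature_map, fn):
    """Apply an activation function to every sample of a 2-D feature map."""
    return [[fn(v) for v in row] for row in feature_map]


feature_map = [[-1.5, 0.0], [0.5, 2.0]]
gated = activate(feature_map, relu)  # negative samples become 0.0
```

With ReLU, a sample is either passed on unchanged or set to zero, which matches the description of samples being activated or deactivated before they reach the next layer.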

Imparting a non-linear characteristic in the first activation layer 315 means changing and outputting some sample values of the feature maps that are the output of the first convolution layer 305, or of the first convolution layer 305 followed by the first batch normalization layer 310. The change is made by applying the non-linear characteristic.

The first activation layer 315 determines whether to pass the sample values of the feature maps on to the second convolution layer 320. For example, some sample values of the feature maps are activated by the first activation layer 315 and passed to the second convolution layer 320, while other sample values are deactivated by the first activation layer 315 and are not passed to the second convolution layer 320. Among the characteristics of the second image 145 represented by the feature maps, those corresponding to compression artifacts are deactivated, and those not corresponding to compression artifacts are emphasized by the first activation layer 315.

The feature maps output from the first activation layer 315 are input to the second convolution layer 320. The notation 3X3X1 shown for the second convolution layer 320 in FIG. 3 indicates convolution processing that produces one output image using one filter kernel of size 3 x 3. The second convolution layer 320 is the layer that outputs the final image and generates one output using one filter kernel. The output of the second convolution layer 320 is input to the second batch normalization layer 325 and then to the second activation layer 330.

As another example, batch normalization is optional: the feature maps output from the second convolution layer may be input directly to the second activation layer without batch normalization.

According to an example of the present disclosure, the second activation layer 330 may output the third image 155, in which the compression artifacts included in the second image 145 have been removed.

FIG. 3 shows the artifact-reduction DNN 130 as including two convolution layers 305 and 320 and two activation layers 315 and 330, and optionally two batch normalization layers 310 and 325. This is only an example; the numbers of convolution layers, activation layers, and batch normalization layers may vary by implementation. Also, depending on the implementation, the artifact-reduction DNN 130 may be implemented as a recurrent neural network (RNN), which means changing the CNN structure of the artifact-reduction DNN 130 according to the example of the present disclosure into an RNN structure.

As described later with reference to FIG. 7, the DNN for removing compression artifacts takes as input the second image, which carries noise including the high-frequency information of the compressed image and the compression artifacts. It is trained to obtain a third image in which the high-frequency information that was present in the original image is preserved as much as possible and the compression artifacts that were not present in the original image are removed, and it therefore removes compression artifacts effectively. The first image, which has the low-frequency and intermediate-frequency information of the compressed image, is then summed with the third image, yielding a reconstructed image in which only the compression artifacts of the compressed image are reduced and whose quality is improved to be close to that of the original image.

The convolution process described with reference to FIG. 3 is only an example, and the disclosure is not limited thereto.

FIG. 4 is a block diagram illustrating the configuration of an apparatus for extracting noise of a compressed image according to an embodiment.

Referring to FIG. 4, an apparatus 400 for extracting noise of a compressed image according to an embodiment includes a convolution filter 410, a subtraction unit 420, and a filter setting unit 430.

The convolution filter 410, the subtraction unit 420, and the filter setting unit 430 may be implemented by a single processor. In this case, they may be implemented as a dedicated processor, or as a combination of software and a general-purpose processor such as an application processor (AP), a central processing unit (CPU), or a graphics processing unit (GPU). A dedicated processor may include a memory for implementing an embodiment of the present disclosure, or a memory processing unit for using an external memory.

The convolution filter 410, the subtraction unit 420, and the filter setting unit 430 may also be implemented by a plurality of processors. In this case, they may be implemented as a combination of dedicated processors, or as a combination of software and a plurality of general-purpose processors such as APs, CPUs, or GPUs. In one embodiment, the convolution filter 410 is implemented by a first processor, the subtraction unit 420 by a second processor different from the first processor, and the filter setting unit 430 by a third processor different from the first and second processors.

The filter setting unit 430 may provide the convolution filter 410 with trained filter setting information corresponding to the compressed image 105, based on a predetermined criterion.

The filter setting unit 430 may store the filter setting information for obtaining the first image 135 corresponding to the compressed image 105.

The filter setting information has been trained to obtain the first image 135, which has the low-frequency and intermediate-frequency information of the compressed image 105.

The filter setting information may include the number of filter kernels, the size of the filter kernels, the parameters of the filter kernels, and the like.

The convolution filter 410 sequentially performs down-convolution and up-convolution according to the filter setting information provided by the filter setting unit 430.

Using the filter setting information provided by the filter setting unit 430, the convolution filter 410 takes the compressed image 105 as input and outputs the first image 135.

The subtraction unit 420 subtracts, from the compressed image 105, the first image output by applying the convolution filter 410.

As a result of the subtraction, the subtraction unit 420 outputs the second image 145, which carries noise including the high-frequency information of the compressed image and the compression artifacts.
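The data flow of the convolution filter and the subtraction unit can be sketched on a one-dimensional signal. This is only an illustration under stated assumptions: the kernels `[0.5, 0.5]` and `[1.0, 1.0]`, the stride of 2, and the function names are hypothetical, whereas the disclosure uses trained kernels on two-dimensional images.

```python
def down_conv(signal, kernel, stride):
    # Strided convolution: the output is shorter than the input.
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(0, len(signal) - k + 1, stride)]

def up_conv(signal, kernel, stride):
    # Transposed (up) convolution: each input sample spreads a scaled
    # copy of the kernel into the longer output.
    k = len(kernel)
    out = [0.0] * ((len(signal) - 1) * stride + k)
    for i, v in enumerate(signal):
        for j in range(k):
            out[i * stride + j] += v * kernel[j]
    return out

def extract_noise(compressed, down_kernel, up_kernel, stride=2):
    # First image: down-convolution followed by up-convolution.
    first = up_conv(down_conv(compressed, down_kernel, stride), up_kernel, stride)
    # Second image: compressed image minus first image.
    second = [c - f for c, f in zip(compressed, first)]
    return first, second

# Hypothetical signal: a flat region with one sharp sample.
compressed = [10, 10, 10, 14, 10, 10]
first, second = extract_noise(compressed, [0.5, 0.5], [1.0, 1.0])
```

With these illustrative kernels the first image keeps the smooth component and the second image isolates the sharp deviation, which is the separation the convolution filter is trained to achieve.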

FIG. 5 is a block diagram illustrating the configuration of an apparatus for reducing noise of a compressed image according to an embodiment.

Referring to FIG. 5, an apparatus 500 for reducing noise of a compressed image according to an embodiment includes a noise extraction unit 510 and an image restoration unit 550. The noise extraction unit 510 includes a convolution filter 520, a subtraction unit 530, and a filter setting unit 540, and the image restoration unit 550 may include an artifact-reduction DNN 560, a summing unit 570, and an AI setting unit 580.

Although FIG. 5 shows the noise extraction unit 510 and the image restoration unit 550 as separate devices, they may be implemented by a single processor. In this case, they may be implemented as a dedicated processor, or as a combination of software and a general-purpose processor such as an application processor (AP), a central processing unit (CPU), or a graphics processing unit (GPU). A dedicated processor may include a memory for implementing an embodiment of the present disclosure, or a memory processing unit for using an external memory.

The noise extraction unit 510 and the image restoration unit 550 may also be implemented by a plurality of processors. In this case, they may be implemented as a combination of dedicated processors, or as a combination of software and a plurality of general-purpose processors such as APs, CPUs, or GPUs. In one embodiment, the convolution filter 520 and the subtraction unit 530 are implemented by a first processor, the artifact-reduction DNN 560 and the summing unit 570 by a second processor different from the first processor, and the filter setting unit 540 and the AI setting unit 580 by a third processor different from the first and second processors.

The filter setting unit 540 may provide the convolution filter 520 with trained filter setting information corresponding to the compressed image 105, based on a predetermined criterion.

The filter setting unit 540 may store the filter setting information for obtaining the first image 135 corresponding to the compressed image 105.

The convolution filter 520 sequentially performs down-convolution and up-convolution according to the filter setting information provided by the filter setting unit 540.

Using the filter setting information provided by the filter setting unit 540, the convolution filter 520 takes the compressed image 105 as input and outputs the first image 135.

The subtraction unit 530 subtracts, from the compressed image 105, the first image output by applying the convolution filter 520.

As a result of the subtraction, the subtraction unit 530 outputs the second image 145, which carries noise including the high-frequency information of the compressed image and the compression artifacts.

The AI setting unit 580 provides the artifact-reduction DNN 560 with trained DNN setting information corresponding to the second image 145, according to a predetermined criterion.

The AI setting unit 580 may store the DNN setting information for obtaining the third image 155, in which the compression artifacts included in the second image 145 are reduced.

The DNN setting information has been trained to obtain the third image 155 by removing the compression artifacts from the second image 145, which has the high-frequency information of the compressed image 105 and the compression artifacts.

The DNN setting information may include information on at least one of the number of convolution layers included in the artifact-reduction DNN 560, the number of filter kernels per convolution layer, and the parameters of each filter kernel.

Using the DNN setting information provided by the AI setting unit 580, the artifact-reduction DNN 560 takes the second image 145 as input and outputs the third image 155, in which the compression artifacts included in the second image 145 have been removed.

The summing unit 570 sums the first image output by the convolution filter 520 of the noise extraction unit 510 and the third image output by the artifact-reduction DNN 560, and outputs the reconstructed image 165. Because the first image has the low-frequency and intermediate-frequency information of the compressed image and the third image has the high-frequency information of the compressed image, the summed reconstructed image is the compressed image with the compression artifacts removed. Accordingly, the apparatus 500 for reducing noise of a compressed image according to an embodiment obtains the reconstructed image 165, in which the compression artifacts generated during image compression and contained in the compressed image 105 have been removed.
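The overall flow of the apparatus 500 (filter, subtract, reduce artifacts, sum) can be sketched as follows. The function names and the stand-ins `pair_average`, `identity`, and `zero_out` are hypothetical placeholders for the trained convolution filter and the trained DNN; they only make the data flow checkable.

```python
def reduce_noise(compressed, conv_filter, artifact_dnn):
    first = conv_filter(compressed)                      # low/intermediate frequencies
    second = [c - f for c, f in zip(compressed, first)]  # high frequencies + artifacts
    third = artifact_dnn(second)                         # artifacts reduced
    return [f + t for f, t in zip(first, third)]         # reconstructed image

def pair_average(signal):
    # Stand-in for the trained convolution filter: averages each pair of
    # samples and repeats the average. Assumes an even-length signal.
    out = []
    for i in range(0, len(signal), 2):
        m = (signal[i] + signal[i + 1]) / 2
        out += [m, m]
    return out

def identity(second):
    # Stand-in DNN that removes nothing.
    return list(second)

def zero_out(second):
    # Stand-in DNN that removes everything, including useful detail.
    return [0.0 for _ in second]

compressed = [10.0, 10.0, 10.0, 14.0, 10.0, 10.0]
unchanged = reduce_noise(compressed, pair_average, identity)
smoothed = reduce_noise(compressed, pair_average, zero_out)
```

The two stand-ins mark the extremes: a DNN that removes nothing returns the compressed image, and one that removes everything returns only the first image. A trained artifact-reduction DNN is intended to fall between them, keeping the high-frequency detail while removing the artifacts.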

FIG. 6 is an exemplary diagram for explaining a method of training the convolution filter.

In FIG. 6, the original training image 605 is the image subjected to image compression 610, the compressed training image 615 is the image obtained by compressing the original training image 605, and the first training image 635 is the image obtained by applying, to the compressed training image 615, the convolution filter 620 that sequentially performs down-convolution and up-convolution.

The original training image 605 includes a still image or a video composed of a plurality of frames. In one embodiment, the original training image 605 may include a luminance image extracted from a still image or from a video composed of a plurality of frames. In one embodiment, the original training image 605 may also include a patch image extracted from a still image or from a video composed of a plurality of frames. When the original training image 605 consists of a plurality of frames, the compressed training image 615, the first training image 635, and the low-frequency and intermediate-frequency training image 630 also consist of a plurality of frames. When the plurality of frames of the original training image 605 undergo image compression 610, a plurality of frames of the compressed training image 615 are obtained, and a plurality of frames of the first training image 635 may be obtained through the convolution filter 620.

Any one of the MPEG-2, H.264, MPEG-4, HEVC, VC-1, VP8, VP9, and AV1 codecs may be used for image compression of the original training image 605.

Referring to FIG. 6, separately from the first training image 635 being output by the convolution filter 620 with the compressed training image 615 as input, a low-frequency and intermediate-frequency training image 630 is obtained by filtering the compressed training image 615 through a legacy frequency band filter 625. The cutoff frequency of the legacy frequency band filter 625 may vary according to the characteristics of the compressed training image. Specifically, the convolution filter is used to separate the compressed image into low-frequency and intermediate-frequency information, which includes structural information with a high likelihood of being restored, and high-frequency information and artifacts, which are high-frequency noise with a high probability of damage. The cutoff frequency of the legacy frequency band filter may therefore vary depending on the compressed training image.
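As a rough illustration of how a conventional filter can produce the low-frequency and intermediate-frequency training target, the sketch below applies a fixed smoothing kernel with edge replication to a one-dimensional signal. The disclosure does not specify the form of the legacy frequency band filter 625; the kernel `(0.25, 0.5, 0.25)` and the name `low_pass` are assumptions, and a wider kernel would correspond to a lower cutoff frequency.

```python
def low_pass(signal, kernel=(0.25, 0.5, 0.25)):
    # Fixed (non-trained) smoothing filter with edge replication.
    r = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - r, 0), n - 1)  # clamp index at the borders
            acc += w * signal[idx]
        out.append(acc)
    return out

# Hypothetical compressed training signal with a single sharp spike.
compressed_training = [0.0, 0.0, 4.0, 0.0, 0.0]
low_mid_target = low_pass(compressed_training)
```

The spike is spread and attenuated, so the target keeps the smooth structure and drops the sharp component that the second image is meant to carry.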

The low-frequency and intermediate-frequency training image 630 is an image, obtained through the legacy frequency band filter 625, that contains only the low-frequency and intermediate-frequency information of the compressed training image. It is obtained for training the convolution filter 620.

Before training begins, the convolution filter 620 may be set with predetermined filter setting information. As training proceeds, filter loss information 640 may be determined.

The filter loss information 640 may be determined based on a result of comparing the first training image 635 with the low-frequency and intermediate-frequency training image 630.

The filter loss information 640 may include at least one of an L1-norm value, an L2-norm value, a Structural Similarity (SSIM) value, a Peak Signal-To-Noise Ratio-Human Vision System (PSNR-HVS) value, a Multiscale SSIM (MS-SSIM) value, a Variance Inflation Factor (VIF) value, and a Video Multimethod Assessment Fusion (VMAF) value for the difference between the first training image 635 and the low-frequency and intermediate-frequency training image 630. The filter loss information 640 indicates how similar the first training image 635 is to the low-frequency and intermediate-frequency training image 630. The smaller the filter loss information 640, the more similar the first training image 635 is to the low-frequency and intermediate-frequency training image 630.

The convolution filter 620 may update its parameters so that the filter loss information 640 is reduced or minimized.

The final loss information of the convolution filter 620 may be determined as in Equation 1 below.

[Equation 1]

LossCF = a * (filter loss information)

In Equation 1, LossCF represents the final loss information to be reduced or minimized for training the convolution filter 620, and a may correspond to a predetermined weight.

That is, the convolution filter 620 updates its parameters in the direction in which LossCF of Equation 1 decreases. When the parameters of the convolution filter 620 are updated according to the LossCF derived during training, the first training image 635 obtained with the updated parameters differs from the first training image 635 of the previous training iteration.
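The parameter update described above can be illustrated with a toy gradient-descent sketch. It assumes a mean-squared (L2-type) difference as the filter loss information and replaces the convolution filter with a hypothetical one-parameter filter that scales every sample by `w`; the learning rate `lr`, the data, and all function names are illustrative assumptions, not the training procedure of the disclosure.

```python
def loss_cf(first, target, a=1.0):
    # LossCF = a * (filter loss information); here the loss information
    # is the mean squared difference between the two images.
    n = len(first)
    return a * sum((f - t) ** 2 for f, t in zip(first, target)) / n

def apply_filter(w, compressed):
    # Toy one-parameter "filter": scales every sample by w.
    return [w * c for c in compressed]

def train_step(w, compressed, target, a=1.0, lr=0.1):
    # Gradient of LossCF with respect to w, then one descent step.
    n = len(compressed)
    grad = a * 2.0 * sum((w * c - t) * c for c, t in zip(compressed, target)) / n
    return w - lr * grad

compressed = [1.0, 2.0, 3.0]   # hypothetical compressed training samples
target = [0.5, 1.0, 1.5]       # hypothetical low/intermediate-frequency target
w = 1.0                        # predetermined initial setting
loss_before = loss_cf(apply_filter(w, compressed), target)
for _ in range(20):
    w = train_step(w, compressed, target)
loss_after = loss_cf(apply_filter(w, compressed), target)
```

Each step changes `w`, so the filter output changes from one iteration to the next while the loss decreases, mirroring the statement that the first training image differs between training iterations.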

FIG. 7 is an exemplary diagram for explaining a method of training the AI artifact-reduction DNN.

In FIG. 7, the original training image 605 is the image against which the reconstructed training image 730 is compared, and the compressed training image 615 is the image obtained by compressing the original training image 605. The first training image 635 is the image obtained by applying, to the compressed training image 615, the convolution filter 620 whose training was completed through the training process of FIG. 6. The second training image 710 is the image obtained by subtracting (705) the first training image 635 from the compressed training image 615, the third training image 720 is the image output by the AI artifact-reduction DNN 715 with the second training image 710 as input, and the reconstructed training image 730 is the image obtained by summing (725) the first training image 635 and the third training image 720.

Here, the first training image 635 is obtained by applying the already trained convolution filter 620 to the compressed training image 615. This is because the convolution filter 620 is trained to effectively separate the first image 135, which has the low-frequency and intermediate-frequency information, from the second image 145, which has the high-frequency information and the compression artifacts, whereas the AI artifact-reduction DNN 715 is trained to effectively remove the compression artifacts of the second image 145. That is, the convolution filter 620 and the AI artifact-reduction DNN 715 must be trained independently.

The original training image 605 includes a still image or a video composed of a plurality of frames. In one embodiment, the original training image 605 may include a luminance image extracted from a still image or from a video composed of a plurality of frames. In one embodiment, the original training image 605 may also include a patch image extracted from a still image or from a video composed of a plurality of frames. When the original training image 605 consists of a plurality of frames, the compressed training image 615, the first training image 635, the second training image 710, and the third training image 720 also consist of a plurality of frames. When the plurality of frames of the original training image 605 undergo image compression 610, a plurality of frames of the compressed training image 615 are obtained; a plurality of frames of the first training image 635 are obtained through the trained convolution filter 620; a plurality of frames of the second training image 710 are obtained by subtracting (705) the first training image 635 from the compressed training image 615; a plurality of frames of the third training image 720 are obtained through the AI artifact-reduction DNN 715; and a plurality of frames of the reconstructed training image 730 are obtained by summing (725) the first training image 635 and the third training image 720.

Before training begins, the AI artifact-reduction DNN 715 may be set with predetermined DNN setting information. As training proceeds, reduction loss information 740 may be determined.

The reduction loss information 740 may be determined based on a result of comparing the original training image 605 with the reconstructed training image 730.

The reduction loss information 740 may include at least one of an L1-norm value, an L2-norm value, a Structural Similarity (SSIM) value, a Peak Signal-To-Noise Ratio-Human Vision System (PSNR-HVS) value, a Multiscale SSIM (MS-SSIM) value, a Variance Inflation Factor (VIF) value, and a Video Multimethod Assessment Fusion (VMAF) value for the difference between the original training image 605 and the reconstructed training image 730. The reduction loss information 740 indicates how similar the reconstructed training image 730 is to the original training image 605. The smaller the reduction loss information 740, the more similar the reconstructed training image 730 is to the original training image 605. That is, a reconstructed training image is obtained in which the compression artifacts generated by compressing the original training image are reduced.

The AI artifact-reduction DNN 715 may update its parameters so that the reduction loss information 740 is reduced or minimized.

The final loss information of the AI artifact-reduction DNN 715 may be determined as in Equation 2 below.

[Equation 2]

LossAR = b * (reduction loss information)

In Equation 2, LossAR represents the final loss information to be reduced or minimized for training the AI artifact-reduction DNN 715, and b may correspond to a predetermined weight.

That is, the AI artifact-reduction DNN 715 updates its parameters in the direction in which LossAR of Equation 2 decreases. When the parameters of the AI artifact-reduction DNN 715 are updated according to the LossAR derived during training, the third training image 720 obtained with the updated parameters differs from the third training image 720 of the previous training iteration.
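For concreteness, the two training objectives can be written out for one possible choice of loss information, the L1 norm. The symbols are assumptions introduced here for illustration and do not appear in the disclosure: $x_o$ is the original training image 605, $x_c$ the compressed training image 615, $x_{lm}$ the low-frequency and intermediate-frequency training image 630, $F_\phi$ the convolution filter 620 with parameters $\phi$, and $G_\theta$ the AI artifact-reduction DNN 715 with parameters $\theta$.

```latex
\begin{aligned}
\mathrm{Loss}_{CF}(\phi) &= a \,\bigl\lVert F_\phi(x_c) - x_{lm} \bigr\rVert_1 \\
\mathrm{Loss}_{AR}(\theta) &= b \,\bigl\lVert x_o - \bigl( F_\phi(x_c) + G_\theta\bigl(x_c - F_\phi(x_c)\bigr) \bigr) \bigr\rVert_1
\end{aligned}
```

Under this reading, the first objective is minimized over $\phi$ alone, and the second is then minimized over $\theta$ with $\phi$ held fixed, which corresponds to the independent training of the convolution filter and the DNN described with reference to FIG. 7.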

FIG. 8 is a flowchart illustrating a method of extracting noise of a compressed image according to an embodiment.

In step S810, the apparatus 400 for extracting noise of a compressed image obtains a first image by applying, to a compressed image of an original image, a convolution filter that sequentially performs down-convolution and up-convolution corresponding to the down-convolution.

According to an embodiment, the first image may include intermediate-frequency information and low-frequency information of the compressed image.

According to an embodiment, the convolution filter may be trained to take the compressed image as input and obtain the first image including the intermediate-frequency information and low-frequency information of the compressed image.

According to an embodiment, the down-convolution may be a convolution operation trained to reduce the size of the compressed image.

According to an embodiment, the up-convolution may be a convolution operation trained to enlarge the down-convolved compressed image to the original size of the compressed image.

According to an embodiment, the down-convolution and the up-convolution may be transposes of each other.
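One reading of the transpose relationship is that the up-convolution is the transposed convolution of the down-convolution, sharing the same kernel and stride. The sketch below checks the defining property of a transpose, namely that the inner product of down(x) with y equals the inner product of x with up(y). The kernel, stride, and test vectors are arbitrary illustrative values, and this interpretation is an assumption rather than a statement of the disclosure.

```python
def down_conv(signal, kernel, stride):
    # Strided convolution (reduces the length).
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(0, len(signal) - k + 1, stride)]

def up_conv(signal, kernel, stride):
    # Transposed convolution with the same kernel (restores the length).
    k = len(kernel)
    out = [0] * ((len(signal) - 1) * stride + k)
    for i, v in enumerate(signal):
        for j in range(k):
            out[i * stride + j] += v * kernel[j]
    return out

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

kernel, stride = [1, 2], 2
x = [1, 2, 3, 4]   # full-size signal
y = [3, -1]        # reduced-size signal

lhs = dot(down_conv(x, kernel, stride), y)  # <down(x), y>
rhs = dot(x, up_conv(y, kernel, stride))    # <x, up(y)>
```

Both inner products agree, which is what it means for the two operations to be transposes of each other when viewed as matrices. It also shows why the up-convolution maps the reduced size back to the original size.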

In step S830, the apparatus 400 for extracting noise of a compressed image obtains a second image by subtracting the first image from the compressed image.

In step S850, the apparatus 400 for extracting noise of a compressed image obtains, from the second image, noise including the high-frequency information of the compressed image and the compression artifacts.

FIG. 9 is a flowchart illustrating a method of reducing noise of a compressed image, according to an embodiment.

In operation S910, the apparatus 500 for reducing noise of a compressed image obtains a first image by applying, to a compressed image of an original image, a convolution filter that sequentially performs a down-convolution and an up-convolution corresponding to the down-convolution.

In operation S930, the apparatus 500 for reducing noise of a compressed image obtains a second image by subtracting the first image from the compressed image.

In operation S950, the apparatus 500 for reducing noise of a compressed image obtains, from the second image, noise including high-frequency information and compression artifacts of the compressed image.

In operation S970, the apparatus 500 for reducing noise of a compressed image obtains a third image, from which the compression artifacts have been removed, by applying a DNN for removing compression artifacts to the noise.

According to an embodiment, the DNN for removing compression artifacts may be trained to receive the high-frequency information and the compression artifacts of the compressed image as inputs and to reduce the compression artifacts.

According to an embodiment, the DNN for removing compression artifacts may include a convolution layer and an activation layer.

According to an embodiment, the DNN for removing compression artifacts may further include a batch normalization layer.
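The layer types named above can be sketched in a few lines. This is only an illustration of how a convolution layer, an optional batch normalization layer, and an activation layer compose; it works on a batch of single-channel 1-D signals, the kernels `k1` and `k2` are hypothetical untrained values with odd length, ReLU is an assumed choice of activation, and the batch normalization omits the learned scale and shift. The disclosure does not specify the number of layers, the kernel sizes, or the activation function.

```python
import math


def conv1d_same(x, kernel, bias=0.0):
    """1-D convolution with zero padding so the output length equals len(x)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [bias + sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
            for i in range(len(x))]


def relu(x):
    """Activation layer: clamp negative values to zero."""
    return [max(0.0, v) for v in x]


def batch_norm(batch, eps=1e-5):
    """Normalize with the mean and variance of the whole single-channel batch."""
    values = [v for x in batch for v in x]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [[(v - mean) / math.sqrt(var + eps) for v in x] for x in batch]


def artifact_dnn(batch, k1, k2, use_batch_norm=True):
    hidden = [conv1d_same(x, k1) for x in batch]    # convolution layer
    if use_batch_norm:
        hidden = batch_norm(hidden)                 # optional batch normalization layer
    hidden = [relu(x) for x in hidden]              # activation layer
    return [conv1d_same(x, k2) for x in hidden]     # output convolution layer


noise_batch = [[-3.0, -1.0, 1.0, 3.0], [0.0, 2.0, -2.0, 0.0]]
k1 = [0.25, 0.5, 0.25]   # hypothetical weights
k2 = [0.0, 1.0, 0.0]     # hypothetical weights
third = artifact_dnn(noise_batch, k1, k2)
```

The output has the same shape as the input, which is what allows the third image to be added to the first image afterwards.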

In operation S990, the apparatus 500 for reducing noise of a compressed image restores the compressed image by summing the first image and the third image.
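The data flow of operations S910 to S990 can be summarized in a short sketch. The function `reduce_noise` and its two callable parameters are hypothetical names; `pair_average` and `soft_threshold` are simple stand-ins chosen only so the example runs, and they are not the trained convolution filter or the trained DNN of the disclosure. Signals are 1-D lists of even length to keep the example small.

```python
import math


def reduce_noise(compressed, conv_filter, artifact_dnn):
    first = conv_filter(compressed)                        # S910: first image
    second = [c - f for c, f in zip(compressed, first)]    # S930: second image
    noise = second                                         # S950: assumed to be used as-is
    third = artifact_dnn(noise)                            # S970: third image
    return [f + t for f, t in zip(first, third)]           # S990: restored image


def pair_average(x):
    """Stand-in for the convolution filter: replace each adjacent pair by its mean."""
    out = []
    for i in range(0, len(x), 2):
        m = (x[i] + x[i + 1]) / 2
        out += [m, m]
    return out


def soft_threshold(noise, t=1.0):
    """Stand-in for the DNN: shrink every residual toward zero by t."""
    return [math.copysign(max(abs(v) - t, 0.0), v) for v in noise]


compressed = [10.0, 12.0, 50.0, 58.0]
restored = reduce_noise(compressed, pair_average, soft_threshold)
```

Because the restored image is the first image plus the processed noise, an `artifact_dnn` that returns its input unchanged reproduces the compressed image exactly, and one that returns all zeros yields the first image alone.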

Meanwhile, the above-described embodiments of the present disclosure may be written as programs or instructions executable on a computer, and the written programs or instructions may be stored in a medium.

The medium may store the computer-executable programs or instructions continuously, or may store them temporarily for execution or download. The medium may be any of various recording or storage means in the form of a single piece of hardware or a combination of several pieces of hardware; it is not limited to a medium directly connected to a particular computer system and may be distributed over a network. Examples of the medium include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and ROM, RAM, flash memory, and the like, configured to store program instructions. Other examples include recording or storage media managed by app stores that distribute applications, or by sites and servers that supply or distribute various other software.

Meanwhile, a model related to the DNN described above may be implemented as a software module. When implemented as a software module (for example, a program module including instructions), the DNN model may be stored in a computer-readable recording medium.

In addition, the DNN model may be integrated in the form of a hardware chip and become part of the above-described AI decoding apparatus 200 or AI encoding apparatus 600. For example, the DNN model may be manufactured as a dedicated hardware chip for artificial intelligence, or as part of an existing general-purpose processor (for example, a CPU or an application processor) or a dedicated graphics processor (for example, a GPU).

In addition, the DNN model may be provided in the form of downloadable software. A computer program product may include a product in the form of a software program (for example, a downloadable application) that is distributed electronically through a manufacturer or an electronic market. For electronic distribution, at least part of the software program may be stored in a storage medium or may be generated temporarily. In this case, the storage medium may be a storage medium of a server of the manufacturer or the electronic market, or of a relay server.

The technical idea of the present disclosure has been described above in detail with reference to preferred embodiments; however, it is not limited to those embodiments, and various modifications and changes may be made by those of ordinary skill in the art within the scope of the technical idea of the present disclosure.

Claims

1. A method of extracting noise from a compressed image, the method comprising:
obtaining a first image by applying, to a compressed image of an original image, a convolution filter that sequentially performs a down-convolution and an up-convolution corresponding to the down-convolution;
obtaining a second image by subtracting the first image from the compressed image; and
obtaining, from the second image, noise including high-frequency information and compression artifacts of the compressed image.

2. The method of claim 1,
wherein the first image includes intermediate-frequency information and low-frequency information of the compressed image.

3. The method of claim 1,
wherein the convolution filter is trained to receive the compressed image as an input and obtain the first image including intermediate-frequency information and low-frequency information of the compressed image.

4. The method of claim 3,
wherein the down-convolution is a convolution operation trained to reduce the size of the compressed image.

5. The method of claim 3,
wherein the up-convolution is a convolution operation trained to enlarge the down-convolved compressed image to the original size of the compressed image.

6. The method of claim 1,
wherein the down-convolution and the up-convolution are transposes of each other.

7. A method of reducing noise of a compressed image, the method comprising:
obtaining a first image by applying, to a compressed image of an original image, a convolution filter that sequentially performs a down-convolution and an up-convolution corresponding to the down-convolution;
obtaining a second image by subtracting the first image from the compressed image;
obtaining, from the second image, noise including high-frequency information and compression artifacts of the compressed image;
obtaining a third image, from which the compression artifacts have been removed, by applying a DNN for removing compression artifacts to the noise; and
restoring the compressed image by summing the first image and the third image.

8. The method of claim 7,
wherein the DNN for removing compression artifacts is trained to receive the high-frequency information and the compression artifacts of the compressed image as inputs and to reduce the compression artifacts.

9. The method of claim 7,
wherein the DNN for removing compression artifacts includes a convolution layer and an activation layer.

10. The method of claim 9,
wherein the DNN for removing compression artifacts further includes a batch normalization layer.

11. The method of claim 7,
wherein the first image includes intermediate-frequency information and low-frequency information of the compressed image.

12. The method of claim 7,
wherein the convolution filter is trained to obtain the first image including intermediate-frequency information and low-frequency information of the compressed image.

13. The method of claim 7,
wherein the down-convolution is a convolution operation trained to reduce the size of the compressed image.

14. The method of claim 7,
wherein the up-convolution is a convolution operation trained to enlarge the down-convolved compressed image to the original size of the compressed image.

15. The method of claim 7,
wherein the down-convolution and the up-convolution are transposes of each other.

16. An apparatus for extracting noise from a compressed image, the apparatus comprising:
a memory storing one or more instructions; and
a processor configured to execute the one or more instructions stored in the memory,
wherein the processor is configured to:
obtain a first image by applying, to a compressed image of an original image, a convolution filter that sequentially performs a down-convolution and an up-convolution corresponding to the down-convolution;
obtain a second image by subtracting the first image from the compressed image; and
obtain, from the second image, noise including high-frequency information and compression artifacts of the compressed image.

17. An apparatus for reducing noise of a compressed image, the apparatus comprising:
a memory storing one or more instructions; and
a processor configured to execute the one or more instructions stored in the memory,
wherein the processor is configured to:
obtain a first image by applying, to a compressed image of an original image, a convolution filter that sequentially performs a down-convolution and an up-convolution corresponding to the down-convolution;
obtain a second image by subtracting the first image from the compressed image;
obtain, from the second image, noise including high-frequency information and compression artifacts of the compressed image;
obtain a third image, from which the compression artifacts have been removed, by applying a DNN for removing compression artifacts to the noise; and
restore the compressed image by summing the first image and the third image.