KR102508428B1

KR102508428B1 - Method and apparatus for encoding and decoding video

Info

Publication number: KR102508428B1
Application number: KR1020180018803A
Authority: KR
Inventors: 이상윤; 김태오; 박창현; 박영오; 김찬열; 전선영; 최광표
Original assignee: 삼성전자주식회사; 연세대학교 산학협력단
Priority date: 2018-02-14
Filing date: 2018-02-14
Publication date: 2023-03-09
Also published as: KR20190098634A

Abstract

An encoding device is disclosed. The encoding device includes a communication unit, a memory storing a noise model, and a processor that obtains, based on the noise model, a prediction block for encoding that corresponds to a current block of an input image, quantizes a difference signal obtained from the difference between the current block and the prediction block, encodes the quantized difference signal to obtain an encoded video signal, and outputs a stream including the encoded video signal and noise metadata corresponding to the prediction block to a decoding device through the communication unit.

Description

Method and apparatus for encoding and decoding video

The present invention relates to encoding and decoding methods and apparatuses, and more particularly, to encoding and decoding methods and apparatuses that take image noise into account.

Recently, high-resolution images such as UHD (Ultra High Definition) images have become widespread, and advances in electronics have improved the post-processing performance available for images.

Providing high-resolution images requires highly efficient image compression, and research and development in this area is actively under way.

However, existing image compression technology has focused solely on research and development of encoding and decoding algorithms, without considering the post-processing performance of the electronic device that handles the image.

Accordingly, there is a need to improve processing speed and performance by taking the post-processing performance of the electronic device into account when encoding an image and by omitting unnecessary encoding and decoding steps.

The present invention addresses this need, and an object of the present invention is to provide encoding and decoding methods and apparatuses that perform encoding in consideration of the post-processing of an electronic device.

To achieve the above object, an encoding device according to an embodiment of the present invention includes a communication unit, a memory storing a noise model, and a processor that obtains, based on the noise model, a prediction block for encoding that corresponds to a current block of an input image, quantizes a difference signal obtained from the difference between the current block and the prediction block, encodes the quantized difference signal to obtain an encoded video signal, and outputs a stream including the encoded video signal and noise metadata corresponding to the prediction block to a decoding device through the communication unit, wherein the noise model is a model in which a noise post-processing process has been learned based on a plurality of sample images.

Here, the noise model may include noise pattern information according to at least one of the CU (Coding Unit) size, the coding index, and the bit rate of the coding coefficients, which differ among the plurality of sample images, and the processor may obtain the prediction block corresponding to the current block based on the noise pattern information.

Here, the processor may obtain a prediction block that retains, among the noise information included in the current block, the noise information corresponding to the noise pattern information, and the noise metadata may include the noise information corresponding to the noise pattern information.

In addition, the noise pattern information may include at least one of information on processable noise learned from each of the plurality of sample images and information on the strength of the processable noise, and the information on processable noise may include at least one of ringing noise, flickering noise, blocking noise, and blurring noise.

In addition, when information on the decoding device is received through the communication unit, the processor may obtain the prediction block corresponding to the current block based on the noise model and the information on the decoding device.

In addition, the processor may obtain the noise model by applying a CNN (Convolution Neural Network) to the plurality of sample images.

Meanwhile, an electronic device according to an embodiment of the present disclosure includes a communication unit and a processor that, when an encoded video signal and noise metadata are received from an encoding device through the communication unit, decodes the encoded video signal and performs post-processing on the decoded video signal based on the noise metadata. The encoded video signal is a video signal encoded, based on a noise model, with preset noise information of the original image retained, and the noise model is a model in which a noise post-processing process has been learned based on a plurality of sample images. The processor performs the post-processing by filtering, from the decoded video signal, the noise corresponding to the preset noise information included in the noise metadata.

Here, the noise metadata may include at least one of information on noise that can be processed in the noise post-processing process and information on the strength of the processable noise, and the processor may perform, based on the noise metadata, at least one of ringing noise filtering, flickering noise filtering, blocking noise filtering, and blurring noise filtering on the decoded image.

In addition, the electronic device may further include a display, and the processor may provide the post-processed image through the display.

Meanwhile, an encoding method of an encoding device storing a noise model, according to an embodiment of the present disclosure, includes obtaining, based on the noise model, a prediction block for encoding that corresponds to a current block of an input image; quantizing a difference signal obtained from the difference between the current block and the prediction block; encoding the quantized difference signal to obtain an encoded video signal; and outputting a stream including the encoded video signal and noise metadata corresponding to the prediction block to a decoding device, wherein the noise model is a model in which a noise post-processing process has been learned based on a plurality of sample images.

Here, the noise model may include noise pattern information according to at least one of the CU (Coding Unit) size, the coding index, and the bit rate of the coding coefficients, which differ among the plurality of sample images, and the obtaining of the prediction block may include obtaining the prediction block corresponding to the current block based on the noise pattern information.

In addition, the obtaining of the prediction block may include obtaining a prediction block that retains, among the noise information included in the current block, the noise information corresponding to the noise pattern information, and the noise metadata may include the noise information corresponding to the noise pattern information.

In addition, the obtaining of the prediction block may include, when information on the decoding device is received, obtaining the prediction block corresponding to the current block based on the noise model and the information on the decoding device.

In addition, the method may further include obtaining the noise model by applying a CNN (Convolution Neural Network) to the plurality of sample images.

Meanwhile, a decoding method of an electronic device according to an embodiment of the present disclosure includes receiving an encoded video signal and noise metadata from an encoding device; decoding the encoded video signal; and performing post-processing on the decoded video signal based on the noise metadata. The encoded video signal is a video signal encoded, based on a noise model, with preset noise information of the original image retained, and the noise model is a model in which a noise post-processing process has been learned based on a plurality of sample images. The performing of the post-processing may include filtering, from the decoded video signal, the noise corresponding to the preset noise information included in the noise metadata.

Here, the noise metadata may include at least one of information on noise that can be processed in the noise post-processing process and information on the strength of the processable noise, and the performing of the post-processing may include performing, based on the noise metadata, at least one of ringing noise filtering, flickering noise filtering, blocking noise filtering, and blurring noise filtering on the decoded image.

In addition, the method may further include providing the post-processed image through a display.
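The post-processing step of the decoding method above can be sketched in Python as a dispatch over the noise types named in the metadata. This is only an illustrative sketch: the dictionary layout of the metadata (noise type mapped to strength), the function name `post_process`, and the filter registry are assumptions, and the disclosure does not specify the filters themselves.

```python
from typing import Callable, Dict, List

Image = List[List[float]]                # decoded picture as rows of samples (assumed)
Filter = Callable[[Image, float], Image]  # (image, strength) -> filtered image


def post_process(decoded: Image,
                 noise_metadata: Dict[str, float],
                 filters: Dict[str, Filter]) -> Image:
    """Apply one filter per noise type listed in the (hypothetical) metadata.

    noise_metadata maps a noise type such as "ringing" or "blocking" to the
    strength signalled by the encoder. Types without a registered filter are
    skipped, since the device cannot process them.
    """
    image = decoded
    for noise_type, strength in noise_metadata.items():
        noise_filter = filters.get(noise_type)
        if noise_filter is not None:
            image = noise_filter(image, strength)
    return image
```

A caller would register one filter per noise type that the device supports, for example `{"ringing": deringing_filter, "blocking": deblocking_filter}`, where both filter names are placeholders.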

According to the various embodiments of the present invention described above, noise that can be filtered by the post-processing of the electronic device is left as it is, which minimizes the time and resources required for encoding and decoding and improves performance and image quality.

FIG. 1 is a block diagram for explaining an encoding device according to an embodiment of the present disclosure.
FIG. 2 is a block diagram showing the configuration of an encoding device, provided to aid understanding of the present disclosure.
FIG. 3 is a block diagram for explaining a noise model according to an embodiment of the present disclosure.
FIG. 4 is a block diagram for explaining an electronic device according to an embodiment of the present disclosure.
FIG. 5 is a block diagram showing the configuration of an electronic device, provided to aid understanding of the present disclosure.
FIG. 6 is a flowchart for explaining an encoding method of an encoding device according to an embodiment of the present disclosure.
FIG. 7 is a flowchart for explaining a decoding method of an electronic device according to an embodiment of the present disclosure.

The terms used in this specification are briefly described first, and the present disclosure is then described in detail.

The terms used in the embodiments of the present disclosure are, as far as possible, general terms in wide current use, selected in consideration of their functions in the present disclosure. They may nevertheless vary with the intention of those skilled in the art, precedent, or the emergence of new technology. In certain cases a term has been chosen arbitrarily by the applicant, and its meaning is then given in detail in the relevant part of the description. The terms used in the present disclosure should therefore be defined by their meaning and by the overall content of the present disclosure, not simply by their names.

The embodiments of the present disclosure may be modified in various ways and may take various forms, so specific embodiments are illustrated in the drawings and described in detail in the detailed description. This is not intended to limit the scope to the specific embodiments, and the disclosure should be understood to cover all modifications, equivalents, and substitutes that fall within the disclosed spirit and technical scope. In describing the embodiments, a detailed description of related known technology is omitted where it could obscure the subject matter.

Terms such as "first" and "second" may be used to describe various components, but the components are not limited by these terms. The terms are used only to distinguish one component from another.

Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, terms such as "comprise" or "consist of" specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and do not preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

In the present disclosure, a "module" or "unit" performs at least one function or operation and may be implemented in hardware, in software, or in a combination of the two. In addition, a plurality of "modules" or "units" may be integrated into at least one module and implemented by at least one processor (not shown), except for a "module" or "unit" that needs to be implemented in specific hardware.

Hereinafter, embodiments of the present disclosure are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present disclosure pertains can readily practice them. The present disclosure may, however, be implemented in many different forms and is not limited to the embodiments described herein. To describe the present disclosure clearly, parts of the drawings unrelated to the description are omitted, and similar parts are given similar reference numerals throughout the specification.

FIG. 1 is a block diagram for explaining an encoding device according to an embodiment of the present disclosure.

As shown in FIG. 1, the encoding device 100 includes a communication unit 110, a memory 120, and a processor 130.

The encoding device 100 is a device that encodes an image and converts it into another signal form. Here, an image consists of a plurality of frames, and each frame may include a plurality of pixels. For example, the encoding device 100 may be a device for compressing an unprocessed original image. However, the encoding device 100 is not limited thereto and may also be a device that converts an already encoded image into another signal form.

In particular, the encoding device 100 may perform encoding by dividing each frame of the image into a plurality of blocks. The encoding device 100 may perform encoding block by block through temporal or spatial prediction, transformation, quantization, filtering, entropy encoding, and the like.
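As a rough illustration of the block partitioning described above, the following Python sketch splits a frame into square blocks in raster order. The 2D-list frame representation, the function name `split_into_blocks`, and the fixed block size are assumptions made for illustration; the disclosure does not specify a partitioning scheme.

```python
from typing import List

Frame = List[List[int]]  # rows of pixel values (assumed representation)


def split_into_blocks(frame: Frame, block_size: int) -> List[Frame]:
    """Split a frame into block_size x block_size blocks in raster order.

    Blocks on the right and bottom edges are smaller when the frame
    dimensions are not multiples of block_size.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    height = len(frame)
    width = len(frame[0]) if height else 0
    blocks = []
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            block = [row[left:left + block_size]
                     for row in frame[top:top + block_size]]
            blocks.append(block)
    return blocks
```

Each returned block would then go through prediction, transformation, quantization, and entropy encoding in turn.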

According to an embodiment, the encoding device 100 includes the communication unit 110.

The communication unit 110 is a component that communicates with various types of external devices using various communication methods. For example, the communication unit 110 may communicate with an external device using wired/wireless LAN, WAN, Ethernet, Bluetooth, Zigbee, IEEE 1394, Wi-Fi, or PLC (Power Line Communication).

In particular, the encoding device 100 may receive an image through the communication unit 110 and may output the encoded video signal to a decoding device through the communication unit 110. For example, the communication unit 110 may transmit a stream (or bitstream) including noise metadata and the encoded video signal to the decoding device 200. The noise metadata is described in detail later.

The memory 120 may store a noise model. Here, the noise model may be a model in which a noise post-processing process has been learned based on a plurality of sample images.

In an embodiment, the noise model may be a model trained by machine learning on a plurality of sample images that each have a different CU (Coding Unit) size, coding index, and coding coefficient. The noise model may include information on noise that can be processed by post-processing of an image and information on the strength of that processable noise. For example, consider an image that contains blocking artifacts, ringing artifacts, and the like. When encoding an image, the encoding device 100 may process or remove artifacts, noise, and the like. The encoding device 100 according to an embodiment of the present disclosure may instead, based on the information on noise that can be processed during post-processing of the image, encode the image without processing or removing some of the noise.

The noise model may include noise pattern information according to at least one of the CU (Coding Unit) size, the coding index, and the bit rate of the coding coefficients of an image, and the noise pattern information may include at least one of information on processable noise learned from each of the plurality of sample images and information on the strength of the processable noise. For example, the noise model may be a model obtained by CNN (Convolution Neural Network) training on the plurality of sample images. Here, a CNN is a multilayer neural network with a special connection structure designed for tasks such as speech processing and image processing.

According to an embodiment, the information on processable noise may include at least one of ringing noise, flickering noise, blocking noise, and blurring noise.

The memory 120 stores various data, such as an O/S (Operating System) software module for driving the encoding device 100, an encoding algorithm, the noise model, and a CNN training module.

The memory 120 may be implemented as internal memory included in the processor 130, such as ROM or RAM, or as memory separate from the processor 130. In the latter case, the memory 120 may be implemented as memory embedded in the encoding device 100 or as memory detachable from the encoding device 100, depending on the purpose of the stored data. For example, data for driving the encoding device 100 may be stored in memory embedded in the encoding device 100, and data for extended functions of the encoding device 100 may be stored in memory detachable from the encoding device 100. Memory embedded in the encoding device 100 may be implemented as non-volatile memory, volatile memory, a hard disk drive (HDD), a solid state drive (SSD), or the like, and memory detachable from the encoding device 100 may be implemented as a memory card (for example, a micro SD card or USB memory), external memory connectable to a USB port (for example, USB memory), or the like.

The processor 130 controls the overall operation of the encoding device 100.

According to an embodiment, the processor 130 may be implemented as a digital signal processor (DSP), a microprocessor, or a TCON (Time Controller). However, the processor 130 is not limited thereto and may include, or be defined as, one or more of a central processing unit (CPU), an MCU (Micro Controller Unit), an MPU (Micro Processing Unit), a controller, an application processor (AP), a communication processor (CP), or an ARM processor. In addition, the processor 130 may be implemented as an SoC (System on Chip) or LSI (Large Scale Integration) with a built-in processing algorithm, or in the form of an FPGA (Field Programmable Gate Array).

The processor 130 according to an embodiment may obtain, based on the noise model stored in the memory 120, a prediction block for encoding that corresponds to the current block of the input image.

The noise model may include noise pattern information according to at least one of the CU (Coding Unit) size, the coding index, and the bit rate of the coding coefficients. For example, the noise model may include, as noise pattern information for each CU size, at least one of information on the noise that can be processed during post-processing and information on the strength of that processable noise.

The noise model may be a model in which a noise post-processing process has been learned based on a plurality of sample images that differ in CU (Coding Unit) size, coding index, and bit rate of the coding coefficients.

The processor 130 may obtain noise pattern information according to at least one of the CU (Coding Unit) size, the coding index, and the bit rate of the coding coefficients of the current block of the input image. Here, the noise pattern information may include at least one of information on the noise that can be processed during post-processing of the current block and information on the strength of that processable noise. For example, if the current block contains at least one of ringing noise, flickering noise, blocking noise, and blurring noise, the noise pattern information may indicate, for each type of noise, whether it can be processed during post-processing and up to what strength.
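One way to picture this lookup is a table from coding parameters to noise pattern information. The Python sketch below is an assumption-laden stand-in: in the disclosure the noise model is learned (for example by a CNN), whereas here it is replaced by a plain dictionary, and the key layout `(cu_size, coding_index, bitrate_class)`, the name `lookup_noise_pattern`, and all numeric values are hypothetical.

```python
from typing import Dict, Tuple

# Noise type -> maximum strength assumed removable by post-processing.
NoisePattern = Dict[str, float]
# Hypothetical key: (CU size, coding index, bit-rate class).
NoiseModel = Dict[Tuple[int, int, str], NoisePattern]


def lookup_noise_pattern(model: NoiseModel,
                         cu_size: int,
                         coding_index: int,
                         bitrate_class: str) -> NoisePattern:
    """Return the noise pattern information for the current block.

    An empty pattern means no noise is assumed to be processable, so the
    encoder would not leave any noise for the decoder to filter.
    """
    return dict(model.get((cu_size, coding_index, bitrate_class), {}))


# Placeholder model contents; the values are made up for illustration.
EXAMPLE_MODEL: NoiseModel = {
    (16, 0, "low"): {"ringing": 0.5, "blocking": 0.4},
    (32, 0, "low"): {"blurring": 0.3},
}
```

Returning a copy keeps callers from mutating the stored model by accident.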

The processor 130 may obtain, based on the noise model, a prediction block for encoding that corresponds to the current block of the input image, and may quantize the difference signal obtained from the difference between the current block and the prediction block.
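The difference-and-quantize step can be sketched as follows. This is a simplified illustration, not the codec's actual procedure: the transform stage that normally precedes quantization is omitted, a uniform scalar quantizer with rounding to the nearest level is assumed, and the names `residual` and `quantize` and the `step` parameter are hypothetical.

```python
from typing import List

Block = List[List[int]]


def residual(current: Block, prediction: Block) -> Block:
    """Difference signal: current block minus prediction block, per sample."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, prediction)]


def quantize(block: Block, step: int) -> Block:
    """Uniform scalar quantization with rounding half away from zero."""
    if step <= 0:
        raise ValueError("step must be positive")

    def level(value: int) -> int:
        sign = -1 if value < 0 else 1
        return sign * ((abs(value) + step // 2) // step)

    return [[level(v) for v in row] for row in block]
```

A larger `step` discards more of the difference signal, which lowers the bit rate at the cost of more distortion.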

The processor 130 according to an embodiment may encode the quantized difference signal to obtain an encoded video signal, and may output the encoded video signal and the noise metadata corresponding to the prediction block to the decoding device.
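The disclosure does not define the syntax of the stream that carries the noise metadata together with the encoded video signal. Purely as an illustration, the Python sketch below uses a hypothetical container: a 4-byte big-endian length, a JSON-encoded metadata header, and then the encoded payload. The function names and the layout are assumptions.

```python
import json
import struct
from typing import Dict, Tuple


def pack_stream(noise_metadata: Dict[str, float], encoded_video: bytes) -> bytes:
    """Prefix the encoded video signal with length-delimited noise metadata."""
    header = json.dumps(noise_metadata, sort_keys=True).encode("utf-8")
    return struct.pack(">I", len(header)) + header + encoded_video


def unpack_stream(stream: bytes) -> Tuple[Dict[str, float], bytes]:
    """Recover the noise metadata and the encoded video signal."""
    (header_len,) = struct.unpack(">I", stream[:4])
    header = stream[4:4 + header_len]
    return json.loads(header.decode("utf-8")), stream[4 + header_len:]
```

A real bitstream would more likely carry such metadata in a codec-specific side channel; the length-prefixed header is used here only because it is easy to parse.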

여기서, 예측 블록에 대응되는 노이즈 메타데이터는 노이즈 패턴 정보에 기초하여 처리 가능한 노이즈에 대한 정보 및 처리 가능한 노이즈의 강도에 대한 정보를 포함할 수 있다. Here, the noise metadata corresponding to the prediction block may include information on noise that can be processed based on noise pattern information and information on the intensity of noise that can be processed.

For example, the processor 130 may obtain noise pattern information according to at least one of the coding unit (CU) size, the coding index, and the bit rate of the coding coefficients of the current block. The noise pattern information includes at least one of information on the noise that can be processed during post-processing of the current block and information on the intensity of noise that can be processed. The processor 130 may then obtain the prediction block based on the obtained noise pattern information. The noise metadata may include, based on the obtained noise pattern information or the prediction block, at least one of information on the noise that can be processed during post-processing of the current block and information on the intensity of noise that can be processed.

For example, the processor 130 may obtain information (or noise pattern information) on whether flickering noise can be processed and at what intensity, according to at least one of the CU size, the coding index, and the bit rate of the coding coefficients of the current block, and may obtain the prediction block and the noise metadata based on the obtained information.

Here, the prediction block may be a block that retains, among the noise information included in the current block, the noise information corresponding to the noise pattern information. For example, if the noise pattern information corresponding to the current block indicates that the decoding device can process ringing noise during post-processing, the encoding device 100 may perform encoding based on a prediction block that retains the ringing noise as is. Accordingly, the encoding device 100 may perform encoding in consideration of the noise that the decoding device can process and the noise intensity that it can process. Because the encoding device 100 encodes the input image while leaving intact the noise that the decoding device can process, it can reduce the encoding time and use the resources required for encoding efficiently. The prediction block and the encoding process are described in detail with reference to FIG. 2.

When information on the decoding device is received through the communication unit 110, the processor 130 according to an embodiment of the present disclosure may obtain the prediction block corresponding to the current block based on the noise model and the information on the decoding device.

The noise model may be a model trained on a plurality of sample images without considering the post-processing performance of any particular decoding device. Accordingly, the noise model may include, as noise pattern information, information on the noise and the noise intensity that a general, unspecified decoding device can process during post-processing. The encoding device 100 according to an embodiment of the present disclosure may also receive information on the decoding device and obtain the prediction block based on the received information and the noise model. Here, the information on the decoding device may include information on the noise and the noise intensity that the decoding device can process during post-processing.

Meanwhile, FIG. 1 schematically illustrates the components of the encoding device 100, taking as an example a case in which it is a device having functions such as a communication function and a control function. Accordingly, depending on the embodiment, some of the components shown in FIG. 1 may be omitted or changed, and other components may be added.

FIG. 2 is a block diagram showing the configuration of the encoding device 100 to aid understanding of the present disclosure.

As shown in FIG. 2, the encoding device 100 includes a noise model 211, a motion estimation unit 211, a motion compensation unit 212, an intra prediction unit 220, a switch 215, a subtractor 225, a transform unit 230, a quantization unit 240, an entropy encoding unit 250, an inverse quantization unit 260, an inverse transform unit 270, an adder 275, a filter unit 280, and a reference image buffer 290.

The encoding device 100 may perform encoding by dividing each frame of the input image into a plurality of blocks. The encoding device 100 may perform encoding block by block through temporal or spatial prediction, transform, quantization, filtering, entropy encoding, and the like.

Prediction means generating a prediction block similar to the current block to be encoded. Here, the unit of the block to be encoded may be a coding tree unit (CTU), a coding unit (CU), a prediction unit (PU), or a transform unit (TU). However, these terms are used only for convenience in describing the present disclosure, and the disclosure is not limited to them.

Prediction is divided into temporal prediction and spatial prediction. Temporal prediction means inter-picture prediction. The encoding device 100 may store some reference pictures that are highly correlated with the picture currently being encoded and use them to perform inter-picture prediction. That is, the encoding device 100 may generate a prediction block from a reference picture that was encoded and then decoded at an earlier time. In this case, the encoding device 100 is said to operate in the inter mode, one of the coding indexes.

When operating in the inter mode, the motion estimation unit 211 may search the reference image stored in the reference image buffer 290 for the block having the highest temporal correlation with the current block. The motion estimation unit 211 may also interpolate the reference image and search the interpolated image for the block having the highest temporal correlation with the current block.

Here, the reference image buffer 290 is a space for storing reference images. The reference image buffer 290 is used only when inter-picture prediction is performed, and may store some reference images that are highly correlated with the image currently being encoded. A reference image may be an image generated by sequentially transforming, quantizing, inverse-quantizing, inverse-transforming, and filtering the differential blocks described later. That is, a reference image may be an image that has been encoded and then decoded.

The motion compensation unit 212 may generate a prediction block based on motion information for the block that the motion estimation unit 211 found to have the highest temporal correlation with the current block. Here, the motion information may include a motion vector, a reference picture index, and the like.
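The search performed by the motion estimation unit 211 and the block copy performed by the motion compensation unit 212 can be sketched as follows. This is a minimal illustration only: the exhaustive search window, the sum of absolute differences (SAD) used as a stand-in for "temporal correlation", and the names `sad`, `crop`, and `motion_search` are assumptions made for this example and are not taken from the disclosure.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def crop(frame, top, left, size):
    """Return the size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def motion_search(current, reference, top, left, size, search_range):
    """Full search: try every displacement within +/- search_range and keep
    the one with the lowest SAD (assumed similarity measure).

    Returns (motion_vector, prediction_block, cost), where motion_vector
    is (dy, dx) relative to the position of the current block.
    """
    target = crop(current, top, left, size)
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference image.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            cost = sad(target, crop(reference, y, x, size))
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    cost, motion_vector = best
    # Motion compensation: copy the best-matching reference block.
    prediction = crop(reference,
                      top + motion_vector[0], left + motion_vector[1], size)
    return motion_vector, prediction, cost
```

Practical encoders typically replace the exhaustive search with faster search patterns and may search an interpolated reference for sub-pixel motion, as the text notes; the sketch keeps only the integer-pixel case.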

Spatial prediction means intra-picture prediction. The intra prediction unit 220 may generate prediction values for the current block by performing spatial prediction from encoded neighboring pixels within the current image. In this case, the encoding device 100 is said to operate in the intra mode.

The inter mode or the intra mode may be determined per coding unit (CU). Here, a coding unit may include at least one prediction unit. Once the mode is determined, the position of the switch 215 may be changed to correspond to the mode.

Meanwhile, the reference image that is decoded after encoding in temporal prediction is an image to which filtering has been applied, whereas the neighboring pixels that are decoded after encoding in spatial prediction may be pixels to which filtering has not been applied.

The subtractor 225 may generate a differential signal (residual signal or residual block) by taking the difference between the target block and the prediction block obtained from temporal or spatial prediction. The differential signal may be a block from which much of the redundancy has been removed by prediction, but because prediction is not perfect, it may still contain information that must be encoded.

The transform unit 230 may transform the differential signal obtained after intra- or inter-picture prediction in order to remove spatial redundancy, and may output transform coefficients in the frequency domain. The unit of the transform is the transform unit (TU), which is determined independently of the prediction unit. For example, a frame containing a plurality of differential signals may be divided into a plurality of TUs independently of the prediction units, and the transform unit 230 may perform the transform for each TU. The division into TUs may be determined by bit rate optimization.

The transform unit 230 may perform the transform so as to concentrate the energy of each TU in a particular frequency region. For example, the transform unit 230 may perform a discrete cosine transform (DCT)-based transform on each TU to concentrate the data in the low-frequency region. Alternatively, the transform unit 230 may perform a discrete Fourier transform (DFT)-based transform or a discrete sine transform (DST)-based transform.
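The energy concentration described above can be illustrated with a small sketch. It assumes a one-dimensional orthonormal DCT-II rather than the two-dimensional block transform an actual codec would use, and the function name `dct_1d` is hypothetical.

```python
import math


def dct_1d(samples):
    """Orthonormal DCT-II of a list of samples (illustrative, O(n^2))."""
    n = len(samples)
    coefficients = []
    for k in range(n):
        # Scaling that makes the transform orthonormal (energy preserving).
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(s * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, s in enumerate(samples))
        coefficients.append(scale * total)
    return coefficients


# A smooth ramp: nearly all of its energy lands in the two lowest-frequency
# coefficients, so the remaining ones are cheap to quantize coarsely.
ramp = [1, 2, 3, 4, 5, 6, 7, 8]
coeffs = dct_1d(ramp)
low_frequency_share = (coeffs[0] ** 2 + coeffs[1] ** 2) / sum(c * c for c in coeffs)
```

Because the transform is orthonormal, the sum of squared coefficients equals the sum of squared samples, so `low_frequency_share` directly measures how much of the signal energy the low-frequency coefficients carry.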

The quantization unit 240 may quantize the transform coefficients, approximating them by a predetermined number of representative values. That is, the quantization unit 240 may map input values within a specific range to a single representative value. In this process, high-frequency signals that people do not perceive well are removed, and information may be lost.

The quantization unit 240 may use either uniform or non-uniform quantization, depending on the probability distribution of the input data or the purpose of the quantization. For example, the quantization unit 240 may use uniform quantization when the probability distribution of the input data is uniform. Alternatively, the quantization unit 240 may use non-uniform quantization when the probability distribution of the input data is not uniform.
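As a minimal sketch of the uniform case, the following maps every input in one interval to a single representative value. The step size `step` and the function names are assumptions for illustration; a non-uniform quantizer would use intervals of unequal width instead.

```python
import math


def quantize(coefficient, step):
    """Return the index (level) of the interval containing the coefficient."""
    return math.floor(coefficient / step + 0.5)


def dequantize(level, step):
    """Return the representative value of a quantization level."""
    return level * step


# With step = 8, every input from 12 to 19 is represented by 16.
representatives = {dequantize(quantize(v, 8), 8) for v in range(12, 20)}
```

The reconstruction error of this quantizer is at most half a step, which is the information loss the text refers to.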

The entropy encoding unit 250 may reduce the amount of data by variably assigning code lengths to the symbols of the data input from the quantization unit 240 according to each symbol's probability of occurrence. That is, the entropy encoding unit 250 may generate a stream by representing the input data, based on a probability model, as a variable-length bit string of 0s and 1s.

For example, the entropy encoding unit 250 may represent the input data by assigning a small number of bits to symbols with a high probability of occurrence and a large number of bits to symbols with a low probability of occurrence. Accordingly, the size of the bit string of the input data may be reduced, and the compression performance of video encoding may be improved.
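The relationship between symbol probability and code length can be sketched with Huffman coding, one common variable length code. The symbol counts below are invented for illustration, and `huffman_code_lengths` is a hypothetical helper that computes only the code lengths, not the bit patterns.

```python
import heapq


def huffman_code_lengths(frequencies):
    """Return {symbol: code length in bits} for a Huffman code.

    frequencies maps each symbol to its occurrence count. With a single
    symbol the returned length is 0, since no bits are needed to tell
    symbols apart.
    """
    # Heap entries: (count, tie_breaker, symbols in this subtree).
    heap = [(count, index, [symbol])
            for index, (symbol, count) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    lengths = {symbol: 0 for symbol in frequencies}
    next_id = len(heap)
    while len(heap) > 1:
        count_a, _, symbols_a = heapq.heappop(heap)
        count_b, _, symbols_b = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every symbol inside them.
        for symbol in symbols_a + symbols_b:
            lengths[symbol] += 1
        heapq.heappush(heap, (count_a + count_b, next_id, symbols_a + symbols_b))
        next_id += 1
    return lengths


# Hypothetical counts of quantized levels: zero is by far the most frequent.
counts = {0: 60, 1: 25, -1: 10, 2: 5}
lengths = huffman_code_lengths(counts)
total_bits = sum(counts[s] * lengths[s] for s in counts)
```

In this example the most frequent level receives a 1-bit code and the rarest levels receive 3-bit codes, so the 100 symbols take 155 bits instead of the 200 bits a fixed 2-bit code would need.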

The entropy encoding unit 250 may perform entropy encoding by variable length coding, such as Huffman coding or Exponential-Golomb coding, or by arithmetic coding.
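For reference, an order-0 Exponential-Golomb code for non-negative integers can be sketched as below. The string-of-bits representation and the function names are choices made for readability in this example, not part of the disclosure.

```python
def exp_golomb_encode(value):
    """Order-0 Exp-Golomb code of a non-negative integer, as a bit string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    binary = bin(value + 1)[2:]          # binary form of value + 1
    prefix = "0" * (len(binary) - 1)     # as many zeros as extra bits follow
    return prefix + binary


def exp_golomb_decode(bits):
    """Decode a single order-0 Exp-Golomb codeword from a bit string."""
    zeros = 0
    while bits[zeros] == "0":
        zeros += 1
    return int(bits[zeros:2 * zeros + 1], 2) - 1


codewords = [exp_golomb_encode(v) for v in range(5)]
```

Small values receive short codewords (`0` maps to the single bit `1`), which suits data in which small magnitudes dominate.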

The inverse quantization unit 260 and the inverse transform unit 270 may receive the quantized transform coefficients and perform inverse quantization followed by inverse transform, respectively, to generate a reconstructed differential signal.

The adder 275 may generate a reconstructed block by adding the reconstructed differential signal to the prediction block obtained from temporal or spatial prediction.

Meanwhile, because adjacent blocks are quantized with different quantization parameters during encoding, degradation in which block boundaries become visible may occur. This phenomenon is called blocking artifacts, and it is one of the important factors in evaluating picture quality. A filtering process may be performed to reduce this degradation.

The filter unit 280 may apply at least one of a deblocking filter, sample adaptive offset (SAO), and an adaptive loop filter (ALF) to the reconstructed image. The filtered reconstructed image may be stored in the reference image buffer 290 and used as a reference image.

The encoding device 100 according to an embodiment of the present disclosure may also obtain, based on the noise model 211, the prediction block corresponding to the current block from a reference image on which the filtering process for reducing degradation has not been performed.

For example, blocking artifacts (or blocking noise) may be processed or removed when the decoding device post-processes the image. Accordingly, the encoding device 100 may obtain, based on the noise model 211, a prediction block in which blocking artifacts are present.

The encoding device 100 may obtain the prediction block corresponding to the current block from a reference image on which filtering for noise, degradation, and the like has not been performed. Because the encoding device 100 can obtain the prediction signal and output the stream without performing the filtering process, the encoding time, the resources required for encoding, and the like can be managed efficiently.

Here, the stream may include the noise metadata and the encoded video signal. The encoded video signal may be the signal obtained by entropy coding the quantized differential signal.

The noise metadata may include noise information corresponding to the noise pattern information based on the prediction block. For example, the noise metadata may include information on the noise, and on the noise intensity, that can be processed when the encoded video signal is decoded and the decoded signal is post-processed, determined based on at least one of the CU size, the coding index, and the bit rate of the coding coefficients of the current block.

Meanwhile, the functions and operations of each of the motion estimation unit 211, the motion compensation unit 212, the intra prediction unit 220, the switch 215, the subtractor 225, the transform unit 230, the quantization unit 240, the entropy encoding unit 250, the inverse quantization unit 260, the inverse transform unit 270, the adder 275, and the filter unit 280 may of course be functions and operations performed by the processor 130.

FIG. 3 is a block diagram for explaining a noise model according to an embodiment of the present disclosure.

Referring to FIG. 3, the plurality of sample images may differ from one another in coding unit (CU), quantization parameter (QP), coding indexes such as the inter/intra mode, the bit rate of the coding coefficients, and so on.

The encoding device 100 may learn the noise post-processing process based on the plurality of sample images. From the learning result, it may obtain, for each type of noise, information on whether the noise can be processed and on the processing intensity.

For example, the encoding device 100 may obtain information on whether each of ringing noise, flickering noise, blocking noise, and blurring noise can be processed, on the processing intensity, and so on. Here, the information obtained by the encoding device 100 may include at least one of information on the noise that a decoding device performing decoding, or an external electronic device, can process when it post-processes the image, and information on the intensity of noise that can be processed. This information is not limited to a specific electronic device; it may refer to information, derived from the learning result, on the noise that an unspecified electronic device can filter when post-processing an image.

Based on the learning result, the encoding device 100 may obtain a prediction block according to at least one of the CU size, the coding index, and the bit rate of the coding coefficients of the current block. Here, the obtained prediction block may be one that leaves intact the noise information that, according to the learning result, can be processed or removed during post-processing. As another example, if the current block contains noise information that, according to the learning result, cannot be processed or removed during post-processing, the encoding device 100 may filter that noise and then obtain the encoded video signal.
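The retain-or-filter decision described above might be sketched as follows. The disclosure does not specify how capability is represented, so this example assumes that the learning result is available as a maximum processable strength per noise type and that a simple threshold comparison is used; `split_noise` and both dictionaries are hypothetical.

```python
def split_noise(detected, processable):
    """Split the noise found in the current block into two groups.

    detected:    {noise type: strength measured in the current block}
    processable: {noise type: maximum strength post-processing can handle}
                 (assumed form of the learning result)

    Returns (retained, to_filter): noise left in place for the decoder to
    handle, and noise the encoder should filter itself.
    """
    retained, to_filter = {}, {}
    for noise_type, strength in detected.items():
        limit = processable.get(noise_type)
        if limit is not None and strength <= limit:
            retained[noise_type] = strength
        else:
            to_filter[noise_type] = strength
    return retained, to_filter


retained, to_filter = split_noise(
    detected={"ringing": 0.3, "blocking": 0.9, "blurring": 0.2},
    processable={"ringing": 0.5, "blocking": 0.6},
)
```

Under these assumptions, the `retained` dictionary is also a natural candidate for the content of the noise metadata, since it names exactly the noise that was left for post-processing.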

The encoding device 100 according to an embodiment of the present disclosure may obtain the prediction block and obtain noise metadata that includes the noise information that can be processed or removed during post-processing.

The noise metadata may be transmitted to the decoding device in a stream together with the encoded video signal. The decoding device may perform post-processing (or filtering) on the decoded video signal based on the noise metadata.

The encoding device 100 according to an embodiment of the present disclosure may apply a convolutional neural network (CNN) to the plurality of sample images and store a noise model according to the learning result. Here, a CNN is a multilayer neural network with a special connection structure designed for speech processing, image processing, and the like. In particular, a CNN can filter an image in various ways by preprocessing its pixels and can recognize the characteristics of the image.

FIG. 4 is a block diagram for explaining an electronic device according to an embodiment of the present disclosure.

As shown in FIG. 4, the electronic device 400 includes a communication unit 410 and a processor 420.

The electronic device 400 may be implemented as any of various types of devices capable of decoding an encoded video signal. For example, it may be implemented as a set-top box, a Blu-ray player, or the like. However, it is not limited to these, and it may of course be implemented as a display device capable of displaying the decoded video signal.

The communication unit 410 may communicate with the encoding device 100. Specifically, the communication unit 410 may receive a stream from the encoding device 100. In particular, the communication unit 410 may receive the encoded video signal and the noise metadata.

The communication unit 410 may communicate with the encoding device 100 using wired/wireless LAN, WAN, Ethernet, Bluetooth, Zigbee, IEEE 1394, Wi-Fi, power line communication (PLC), or the like.

The electronic device 400 according to an embodiment may communicate with an external electronic device having an output unit such as a speaker or a display, and may transmit the decoded video signal through the communication unit 410 to that external electronic device.

When the encoded video signal and the noise metadata are received from the encoding device 100 through the communication unit 410, the processor 420 may decode the encoded video signal and post-process the decoded video signal based on the noise metadata. Here, the encoded video signal may be a video signal that the encoding device 100 encoded, based on the noise model, with predetermined noise information of the original image left intact.

In particular, the processor 420 may perform the post-processing by filtering out of the decoded video signal the noise that corresponds to the predetermined noise information included in the noise metadata.

Here, the noise metadata may include at least one of information on the noise that can be processed in the noise post-processing process and information on the intensity of noise that can be processed. Based on the noise metadata, the processor 420 may perform at least one of ringing noise filtering, flickering noise filtering, blocking noise filtering, and blurring noise filtering on the decoded video signal.
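One way to picture metadata-driven post-processing is a dispatch table that applies only the filters named in the noise metadata. Everything in this sketch is an assumption made for illustration: the metadata is modeled as a dictionary from noise type to strength, the signal is a one-dimensional list of samples, and `smooth` is a simple stand-in rather than an actual ringing or deblocking filter.

```python
def smooth(samples, strength):
    """Stand-in filter: blend each sample with its 3-tap neighborhood mean.

    strength = 0 leaves the samples unchanged; strength = 1 replaces each
    sample with the mean of itself and its two neighbors (edges repeated).
    """
    out = []
    last = len(samples) - 1
    for i, value in enumerate(samples):
        left = samples[max(i - 1, 0)]
        right = samples[min(i + 1, last)]
        mean = (left + value + right) / 3
        out.append((1 - strength) * value + strength * mean)
    return out


# Hypothetical registry of the filters this decoder implements.
FILTERS = {"ringing": smooth, "blocking": smooth}


def post_process(samples, noise_metadata):
    """Apply, in metadata order, each filter the decoder has available."""
    for noise_type, strength in noise_metadata.items():
        noise_filter = FILTERS.get(noise_type)
        if noise_filter is not None:   # skip noise types with no filter
            samples = noise_filter(samples, strength)
    return samples


# A step edge such as a visible block boundary, softened at full strength.
softened = post_process([0.0, 0.0, 10.0, 10.0], {"blocking": 1.0})
```

Keeping the filters in a table makes the decoder's capabilities explicit, which is also the kind of information the text says a decoding device could report back to the encoder.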

The display (not shown) may be implemented as any of various types of display, such as a liquid crystal display (LCD), an organic light emitting diode (OLED) display, liquid crystal on silicon (LCoS), or digital light processing (DLP). However, it is not limited to these, and it may of course be implemented as any type of display capable of displaying an image.

In particular, the electronic device 400 may post-process the decoded video signal and output the post-processed video signal through the display.

Meanwhile, FIG. 4 schematically illustrates the components of the electronic device 400, taking as an example a case in which it is a device having functions such as a communication function and a control function. Accordingly, depending on the embodiment, some of the components shown in FIG. 4 may be omitted or changed, and other components may be added.

FIG. 5 is a block diagram showing the configuration of the electronic device 400 to aid understanding of the present disclosure.

As shown in FIG. 5, the electronic device 400 includes an entropy decoding unit 510, an inverse quantization unit 520, an inverse transform unit 530, an adder 535, an intra prediction unit 540, a motion compensation unit 550, a switch 555, a filter unit 560, and a reference image buffer 570.

The electronic device 400 may reconstruct the image by decoding the encoded video signal received from the encoding device 100. The electronic device 400 may perform decoding block by block through entropy decoding, inverse quantization, inverse transform, filtering, and the like.

The entropy decoding unit 510 may entropy-decode the input stream to generate quantized transform coefficients. The entropy decoding method may be the inverse of the method used by the entropy encoding unit 250 of FIG. 2.

The inverse quantization unit 520 may receive the quantized transform coefficients and perform inverse quantization. That is, through the operations of the quantization unit 240 and the inverse quantization unit 520, an input value within a specific range is changed to a single reference value within that range, and in this process an error equal to the difference between the input value and that reference value may occur.

The inverse transform unit 530 inverse-transforms the data output from the inverse quantization unit 520, and may do so by applying in reverse the method used by the transform unit 230. The inverse transform unit 530 may perform the inverse transform to generate a reconstructed differential signal.

The adder 535 may generate a reconstructed block by adding the reconstructed differential signal to the prediction block. Here, the prediction block may be a block generated in the inter mode or the intra mode.

When operating in the inter mode, the motion compensation unit 550 may receive from the encoding device 100 the motion information for the current block to be decoded, and may generate a prediction block based on the received motion information. Here, the motion compensation unit 550 may generate the prediction block from a reference image stored in the reference image buffer 570. The motion information may include a motion vector for the block having the highest temporal correlation with the target block, a reference picture index, and the like.

Here, the reference image buffer 570 may store some reference images that are highly correlated with the image currently being decoded. A reference image may be an image generated by filtering the reconstructed blocks described above. That is, a reference image may be an image that has been encoded and then decoded.

When operating in intra mode, the intra prediction unit 540 may generate a prediction value for the current block by performing spatial prediction from encoded neighboring pixels in the current image.
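As one concrete instance of spatial prediction, the sketch below uses DC prediction, in which the block is filled with the rounded mean of its top and left neighbors. The disclosure does not name a specific intra mode, so the choice of DC mode and the function name are assumptions.

```python
from typing import List, Sequence


def intra_dc_predict(top_neighbors: Sequence[int], left_neighbors: Sequence[int],
                     size: int) -> List[List[int]]:
    """Fill a size x size block with the rounded mean of the neighboring samples."""
    neighbors = list(top_neighbors) + list(left_neighbors)
    # Integer mean with rounding to nearest.
    dc = (sum(neighbors) + len(neighbors) // 2) // len(neighbors)
    return [[dc] * size for _ in range(size)]


intra_predicted = intra_dc_predict([100, 102, 104, 106], [98, 99, 101, 102], size=4)
```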

Meanwhile, the position of the switch 555 may be changed according to the mode of the current block.

The filter unit 560 may apply at least one of a deblocking filter, sample adaptive offset (SAO), and an adaptive loop filter (ALF) to the reconstructed image. The filtered reconstructed image may be stored in the reference image buffer 570 and used as a reference image.

Meanwhile, the filter unit 560 may also perform filtering based on the noise metadata received from the encoding device 100. For example, in obtaining a prediction block based on the noise model 211, the encoding device 100 may obtain the prediction block and the noise metadata according to at least one of the coding unit (CU) size of the current block, a coding index, and the bit rate of a coding coefficient. Because the noise pattern information includes information on the noise that the electronic device 400 can process when post-processing the decoded video signal and on the intensity of that noise, the filter unit 560 may, based on the noise metadata, filter the noise that was not processed (or removed) by the encoding device 100.

As described above, the encoding device 100 may compress an input image through the encoding process and transmit the compressed image to the electronic device 400. The electronic device 400 may reconstruct the image by decoding the compressed image.

FIG. 6 is a flowchart illustrating an encoding method of an encoding device according to an embodiment of the present disclosure.

In the encoding method, shown in FIG. 6, of an encoding device in which a noise model is stored, a prediction block for encoding that corresponds to a current block of an input image is obtained based on the noise model (S610).

Subsequently, quantization is performed on a differential signal obtained based on the difference between the current block and the prediction block (S620).

Subsequently, the quantized differential signal is encoded to obtain an encoded video signal (S630).

Subsequently, a stream including the encoded video signal and the noise metadata corresponding to the prediction block is output to the decoding device (S640).
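Steps S610 to S640 can be sketched end to end as follows. This is a toy outline under stated assumptions, not the disclosed implementation: blocks are flattened to 1-D lists, `toy_predictor` is a hypothetical stand-in for the noise-model-based predictor, the metadata key is made up, and entropy coding is replaced by a plain copy.

```python
from typing import Callable, Dict, List, Tuple

Block = List[int]  # a block flattened to a 1-D list of samples, for brevity
Predictor = Callable[[Block], Tuple[Block, Dict[str, float]]]


def encode_block(current: Block, predict: Predictor, step: int) -> Dict[str, object]:
    # S610: obtain the prediction block and its noise metadata.
    prediction, noise_metadata = predict(current)
    # S620: quantize the differential signal between current and prediction block.
    residual = [c - p for c, p in zip(current, prediction)]
    quantized = [round(r / step) for r in residual]
    # S630: "encode" the quantized residual (a real encoder would entropy-code it).
    encoded = list(quantized)
    # S640: bundle the encoded signal and the noise metadata into one stream.
    return {"encoded": encoded, "noise_metadata": noise_metadata}


def toy_predictor(block: Block) -> Tuple[Block, Dict[str, float]]:
    """Hypothetical stand-in for the noise model: a flat block at the mean value."""
    mean = sum(block) // len(block)
    return [mean] * len(block), {"blocking": 0.5}


stream = encode_block([101, 107, 93, 107], toy_predictor, step=4)
```

Passing the predictor in as a function keeps the four steps visible while leaving open how the noise model actually produces the prediction block.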

Here, the noise model may be obtained by learning a noise post-processing process based on a plurality of sample images.

The noise model according to an embodiment of the present disclosure may include noise pattern information according to at least one of a coding unit (CU) size, a coding index, and a bit rate of a coding coefficient, which differ for each of the plurality of sample images. In step S610 of obtaining the prediction block, the prediction block corresponding to the current block may be obtained based on the noise pattern information.

In addition, in step S610 of obtaining the prediction block, a prediction block may be obtained that retains, among the noise information included in the current block, the noise information corresponding to the noise pattern information. Here, the noise metadata may include the noise information corresponding to the noise pattern information.

Here, the noise pattern information may include at least one of information on processable noise learned based on each of the plurality of sample images and information on the intensity of the processable noise, and the information on processable noise may include at least one of ringing noise, flickering noise, blocking noise, and blurring noise.
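One way to represent such noise pattern information is sketched below. The class and field names are hypothetical, and treating the intensity as an upper limit that the post-processor can handle is an interpretation made for illustration, not something the disclosure specifies.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class NoiseType(Enum):
    RINGING = "ringing"
    FLICKERING = "flickering"
    BLOCKING = "blocking"
    BLURRING = "blurring"


@dataclass
class NoisePatternInfo:
    # Processable noise types mapped to the highest intensity assumed to be processable.
    processable: Dict[NoiseType, float]

    def can_process(self, noise_type: NoiseType, intensity: float) -> bool:
        """Return True if this noise type is listed and the intensity is within its limit."""
        limit = self.processable.get(noise_type)
        return limit is not None and intensity <= limit


info = NoisePatternInfo(processable={NoiseType.RINGING: 0.6, NoiseType.BLOCKING: 0.8})
```

For example, `info.can_process(NoiseType.RINGING, 0.5)` is true, whereas a blurring query is false because that type is not listed.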

In step S610 of obtaining the prediction block according to an embodiment, when information on the decoding device is received, the prediction block corresponding to the current block may be obtained based on the noise model and the information on the decoding device.

The encoding method according to an embodiment may further include obtaining the noise model by applying a convolutional neural network (CNN) to the plurality of sample images.

FIG. 7 is a flowchart illustrating a decoding method of an electronic device according to an embodiment of the present disclosure.

According to the decoding method of the electronic device shown in FIG. 7, when an encoded video signal and noise metadata are received from an encoding device, the encoded video signal is decoded (S710).

Subsequently, post-processing is performed on the decoded video signal based on the noise metadata (S720).

Here, the encoded video signal is a video signal encoded, based on a noise model, in a state in which preset noise information of the original image is retained, and the noise model is obtained by learning a noise post-processing process based on a plurality of sample images. In step S720 of performing the post-processing, the post-processing may be performed by filtering, from the decoded video signal, noise corresponding to the preset noise information included in the noise metadata.

Here, the noise metadata may include at least one of information on noise that can be processed in the noise post-processing process and information on the intensity of the processable noise. In step S720 of performing the post-processing, at least one of ringing noise filtering, flickering noise filtering, blocking noise filtering, and blurring noise filtering may be performed on the decoded image based on the noise metadata.
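The metadata-driven post-processing of step S720 can be sketched as a dispatch over noise types. The metadata layout (noise type mapped to intensity), the `FILTERS` table, and the simple smoothing used as a blocking-noise filter are all assumptions for illustration; the image is a 1-D list of samples for brevity.

```python
from typing import Callable, Dict, List

Image = List[float]
Filter = Callable[[Image, float], Image]


def deblock(image: Image, strength: float) -> Image:
    """Placeholder blocking-noise filter: blend each sample with a 3-tap local mean."""
    out = []
    for i, sample in enumerate(image):
        left = image[max(i - 1, 0)]
        right = image[min(i + 1, len(image) - 1)]
        smoothed = (left + sample + right) / 3
        out.append((1 - strength) * sample + strength * smoothed)
    return out


# Hypothetical table of the filters this device supports, keyed by noise type.
FILTERS: Dict[str, Filter] = {"blocking": deblock}


def post_process(decoded: Image, noise_metadata: Dict[str, float]) -> Image:
    """Apply, in metadata order, every supported filter named in the noise metadata."""
    image = list(decoded)
    for noise_type, strength in noise_metadata.items():
        noise_filter = FILTERS.get(noise_type)
        if noise_filter is not None:  # skip noise types this device cannot process
            image = noise_filter(image, strength)
    return image


decoded = [10.0, 10.0, 40.0, 40.0]
result = post_process(decoded, {"blocking": 1.0, "ringing": 0.3})
```

In this toy run the step between the two flat halves is smoothed, while the ringing entry is ignored because no ringing filter is registered.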

The decoding method according to an embodiment of the present disclosure may further include providing the post-processed image through a display.

Meanwhile, the various embodiments described above may be implemented in a recording medium readable by a computer or a similar device, using software, hardware, or a combination thereof. In some cases, the embodiments described herein may be implemented by the processor itself. In a software implementation, embodiments such as the procedures and functions described herein may be implemented as separate software modules. Each of the software modules may perform one or more of the functions and operations described herein.

Meanwhile, computer instructions for performing the processing operations according to the various embodiments of the present disclosure described above may be stored in a non-transitory computer-readable medium. When executed by a processor, the computer instructions stored in such a non-transitory computer-readable medium may cause a specific device to perform the processing operations according to the various embodiments described above.

A non-transitory computer-readable medium is a medium that stores data semi-permanently and can be read by a device, as opposed to a medium that stores data for a short time, such as a register, cache, or memory. Specific examples of non-transitory computer-readable media include a CD, DVD, hard disk, Blu-ray disc, USB, memory card, and ROM.

Although preferred embodiments of the present disclosure have been shown and described above, the present disclosure is not limited to the specific embodiments described. Various modifications may be made by those of ordinary skill in the technical field to which the present disclosure pertains without departing from the gist of the present disclosure as claimed in the claims, and such modifications should not be understood separately from the technical idea or perspective of the present disclosure.

100: encoding device 110: communication unit
120: memory 130: processor
400: electronic device 410: communication unit
420: processor

Claims

An encoding device comprising:
a communication unit;
a memory in which a noise model is stored; and
a processor configured to obtain, based on the noise model, a prediction block for encoding corresponding to a current block of an input image,
perform quantization on a differential signal obtained based on a difference between the current block and the prediction block,
encode the quantized differential signal to obtain an encoded video signal, and
output a stream including the encoded video signal and noise metadata corresponding to the prediction block to a decoding device through the communication unit,
wherein the noise model is obtained by learning a noise post-processing process based on a plurality of sample images.

The encoding device of claim 1,
wherein the noise model includes noise pattern information according to at least one of a coding unit (CU) size, a coding index, and a bit rate of a coding coefficient, which differ for each of the plurality of sample images, and
wherein the processor obtains the prediction block corresponding to the current block based on the noise pattern information.

The encoding device of claim 2,
wherein the processor obtains a prediction block that retains, among noise information included in the current block, noise information corresponding to the noise pattern information, and
wherein the noise metadata includes the noise information corresponding to the noise pattern information.

The encoding device of claim 2,
wherein the noise pattern information includes at least one of information on processable noise learned based on each of the plurality of sample images and information on an intensity of the processable noise, and
wherein the information on the processable noise includes at least one of ringing noise, flickering noise, blocking noise, and blurring noise.

The encoding device of claim 1,
wherein the processor, when information on the decoding device is received through the communication unit, obtains the prediction block corresponding to the current block based on the noise model and the information on the decoding device.

The encoding device of claim 1,
wherein the processor obtains the noise model by applying a convolutional neural network (CNN) to the plurality of sample images.

An electronic device comprising:
a communication unit; and
a processor configured to, when an encoded video signal and noise metadata are received from an encoding device through the communication unit, decode the encoded video signal and perform post-processing on the decoded video signal based on the noise metadata,
wherein the encoded video signal is a video signal encoded, based on a noise model, in a state in which preset noise information of an original image is retained,
wherein the noise model is obtained by learning a noise post-processing process based on a plurality of sample images, and
wherein the processor performs the post-processing by filtering, from the decoded video signal, noise corresponding to the preset noise information included in the noise metadata.

The electronic device of claim 7,
wherein the noise metadata includes at least one of information on noise that can be processed in the noise post-processing process and information on an intensity of the processable noise, and
wherein the processor performs, on the decoded image and based on the noise metadata, at least one of ringing noise filtering, flickering noise filtering, blocking noise filtering, and blurring noise filtering.

The electronic device of claim 7,
further comprising a display,
wherein the processor provides the post-processed image through the display.

An encoding method of an encoding device in which a noise model is stored, the method comprising:
obtaining, based on the noise model, a prediction block for encoding corresponding to a current block of an input image;
performing quantization on a differential signal obtained based on a difference between the current block and the prediction block;
encoding the quantized differential signal to obtain an encoded video signal; and
outputting a stream including the encoded video signal and noise metadata corresponding to the prediction block to a decoding device,
wherein the noise model is obtained by learning a noise post-processing process based on a plurality of sample images.

The encoding method of claim 10,
wherein the noise model includes noise pattern information according to at least one of a coding unit (CU) size, a coding index, and a bit rate of a coding coefficient, which differ for each of the plurality of sample images, and
wherein the obtaining of the prediction block comprises obtaining the prediction block corresponding to the current block based on the noise pattern information.

The encoding method of claim 11,
wherein the obtaining of the prediction block comprises obtaining a prediction block that retains, among noise information included in the current block, noise information corresponding to the noise pattern information, and
wherein the noise metadata includes the noise information corresponding to the noise pattern information.

The encoding method of claim 11,
wherein the noise pattern information includes at least one of information on processable noise learned based on each of the plurality of sample images and information on an intensity of the processable noise, and
wherein the information on the processable noise includes at least one of ringing noise, flickering noise, blocking noise, and blurring noise.

The encoding method of claim 10,
wherein the obtaining of the prediction block comprises, when information on the decoding device is received, obtaining the prediction block corresponding to the current block based on the noise model and the information on the decoding device.

The encoding method of claim 10,
further comprising obtaining the noise model by applying a convolutional neural network (CNN) to the plurality of sample images.

A decoding method of an electronic device, the method comprising:
receiving an encoded video signal and noise metadata from an encoding device;
decoding the encoded video signal; and
performing post-processing on the decoded video signal based on the noise metadata,
wherein the encoded video signal is a video signal encoded, based on a noise model, in a state in which preset noise information of an original image is retained,
wherein the noise model is obtained by learning a noise post-processing process based on a plurality of sample images, and
wherein the performing of the post-processing comprises filtering, from the decoded video signal, noise corresponding to the preset noise information included in the noise metadata.

The decoding method of claim 16,
wherein the noise metadata includes at least one of information on noise that can be processed in the noise post-processing process and information on an intensity of the processable noise, and
wherein the performing of the post-processing comprises performing, on the decoded image and based on the noise metadata, at least one of ringing noise filtering, flickering noise filtering, blocking noise filtering, and blurring noise filtering.

The decoding method of claim 16,
further comprising providing the post-processed image through a display.