KR20210061298A

KR20210061298A - Method and apparatus for image processing and perceptual quality enhancement based on perceptual characteristic

Info

Publication number: KR20210061298A
Application number: KR1020200155854A
Authority: KR
Inventors: 김종호; 조승현; 정세윤; 고현석; 권형진; 김동현; 김연희; 이주영; 이태진; 최진수
Original assignee: 한국전자통신연구원
Priority date: 2019-11-19
Filing date: 2020-11-19
Publication date: 2021-05-27

Abstract

Disclosed are a method and an apparatus for image processing, such as the encoding and decoding of images. The image processing apparatus performs quantization on an input image based on a just noticeable difference (JND), and performs detection of a perceptually sensitive region, detection of an image quality degradation region, determination of a perceptual degradation region, and enhancement of perceptual quality within the input image. Detection of the perceptually sensitive region uses at least one of randomness, masking characteristics, and attention characteristics. Detection of the image quality degradation region uses at least one of boundary strength information, JND level information, and edge information. Enhancement of perceptual quality uses at least one of adjustment of quantization parameter values and machine-learning-based denoising.

Description

Method and apparatus for image processing and perceptual quality enhancement based on perceptual characteristics {METHOD AND APPARATUS FOR IMAGE PROCESSING AND PERCEPTUAL QUALITY ENHANCEMENT BASED ON PERCEPTUAL CHARACTERISTIC}

The following embodiments relate to an image processing method and apparatus, and more particularly to a method and apparatus for compressing images and enhancing perceptual quality based on perceptual characteristics.

Through the continued development of the information and communication industry, broadcast services with High Definition (HD) resolution have spread worldwide. As a result, many users have become accustomed to high-resolution, high-quality images and/or videos.

To meet users' demand for high image quality, many organizations are accelerating the development of next-generation imaging devices. User interest has grown not only in High Definition TV (HDTV) and Full HD (FHD) TV but also in Ultra High Definition (UHD) TV, which has at least four times the resolution of FHD TV. As this interest grows, image encoding/decoding technology for images with higher resolution and quality is required.

To encode and decode high-resolution, high-quality images, an image encoding/decoding apparatus and method may use inter prediction, intra prediction, transform and quantization, and entropy coding techniques. Inter prediction may be a technique for predicting the values of pixels in a target picture using a temporally preceding picture and/or a temporally following picture. Intra prediction may be a technique for predicting the values of pixels in a target picture using information about pixels within the target picture. Transform and quantization may be techniques for compacting the energy of a residual signal. Entropy coding may be a technique that assigns short codes to symbols that occur frequently and long codes to symbols that occur rarely.
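The entropy coding principle described above (short codes for frequent symbols, long codes for rare ones) can be illustrated with a minimal sketch. Huffman coding is used here only as one well-known scheme with this property; the text does not name a specific entropy coder, and the function name `huffman_code_lengths` and the sample string are illustrative assumptions.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code.

    Illustrative only: Huffman coding is an assumed stand-in for the
    unspecified entropy coder in the text.
    """
    counts = Counter(symbols)
    if not counts:
        return {}
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): 1}
    # Heap entries: (total count, unique tie-breaker, {symbol: depth so far}).
    heap = [(c, i, {s: 0}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, d1 = heapq.heappop(heap)
        c2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]


# 'a' occurs most often and receives the shortest code.
lengths = huffman_code_lengths("aaaaaaaabbbccd")
```

In this sketch the most frequent symbol ends up closest to the root of the code tree, so its code is the shortest, which is the behavior the paragraph attributes to entropy coding.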

In image encoding, the concept of a Just Noticeable Difference (JND), or Just Noticeable Threshold, is used.

A JND may represent the point at which a person perceives a change or stimulus. In general, a JND may mean the point at which 50% of people perceive the change.

For the JND used in image encoding, only the first JND position, which considers only the perceptual quality difference between the original image and the target image, is modeled and used. Accordingly, when a quantization value exceeds the threshold defined by the first JND during image encoding, the benefit of JND-based image compression may be negligible. In addition, once the viewer starts to perceive quality degradation, there may be almost no gain obtainable through the JND, and thus almost no coding bits that can be used to improve image quality.

An embodiment may provide a method and an apparatus for image processing based on perceptual characteristics.

An embodiment may improve the compression ratio of an image without degrading perceptual quality by removing perceptual redundancy in the image compression process, and may enhance the perceptual quality of the compressed image.

An embodiment may provide a method and apparatus for performing image quantization using multi-level JND intervals and multi-level JND thresholds.

According to one aspect, there is provided an image processing method comprising: performing quantization based on a Just Noticeable Difference (JND) on an input image; and performing enhancement of perceptual quality on the input image.

In addition, other methods, apparatuses, and systems for implementing the present invention, and a computer-readable recording medium recording a computer program for executing the method, are further provided.

A method and apparatus for image processing based on perceptual characteristics are provided.

A method and apparatus for performing image quantization using multi-level JND intervals and multi-level JND thresholds are provided.

FIG. 1 is a block diagram showing the configuration of an embodiment of an encoding apparatus to which the present invention is applied.
FIG. 2 is a block diagram showing the configuration of an embodiment of a decoding apparatus to which the present invention is applied.
FIG. 3 is a structural diagram of an encoding apparatus according to an embodiment.
FIG. 4 is a structural diagram of a decoding apparatus according to an embodiment.
FIG. 5 illustrates a JND position and a JND interval according to an example.
FIG. 6 shows quantization using a JND threshold according to an example.
FIG. 7 shows a JND threshold in the frequency domain according to an example.
FIG. 8 is a flowchart of an image processing method according to an embodiment.
FIG. 9 shows multi-level JND positions according to an example.
FIG. 10 shows a first JND threshold according to frequency coefficient position according to an example.
FIG. 11 shows a second JND threshold according to frequency coefficient position according to an example.
FIG. 12 shows a training stage of a machine-learning-based network according to an example.
FIG. 13 shows identification of a JND interval using a machine-learning-based network according to an example.
FIG. 14 shows 8x8 DCT coefficients of an input image according to an example.
FIG. 15 shows a perceptual characteristic analysis of an input image according to an example.
FIG. 16 shows an example of a JND threshold determined according to the perceptual characteristic analysis.
FIG. 17 shows another example of a JND threshold according to the perceptual characteristic analysis.
FIG. 18 shows the luminance and weights of blocks of a picture according to an example.
FIG. 19 shows the block types and weights of blocks of a picture according to an example.
FIG. 20 illustrates a first JND threshold in the frequency domain according to an example.
FIG. 21 illustrates a second JND threshold in the frequency domain according to an example.
FIG. 22 shows additional quantization using a JND threshold according to an example.
FIG. 23 shows quantization values according to frequency coefficient position according to an example.
FIG. 24 shows 8x8 DCT coefficients of a reconstructed image according to an example.
FIG. 25 shows a process of enhancing perceptual quality according to an example.
FIG. 26 shows pixel-level detection of randomness according to an example.
FIG. 27 shows block-level detection of randomness according to an example.
FIG. 28 shows a model of a blocking artifact across a block boundary according to an example.
FIG. 29 shows a process of determining a perceptual degradation region according to an example.

The present invention may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail in the detailed description. However, this is not intended to limit the present invention to specific embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes falling within its spirit and technical scope.

The detailed description of exemplary embodiments below refers to the accompanying drawings, which show specific embodiments by way of example. These embodiments are described in sufficient detail to enable those skilled in the art to practice them. It should be understood that the various embodiments differ from one another but need not be mutually exclusive. For example, a specific shape, structure, or characteristic described herein in connection with one embodiment may be implemented in another embodiment without departing from the spirit and scope of the present invention. It should also be understood that the position or arrangement of individual components within each disclosed embodiment may be changed without departing from the spirit and scope of the embodiment. Accordingly, the following detailed description is not to be taken in a limiting sense, and the scope of the exemplary embodiments is limited only by the appended claims, together with the full range of equivalents to which those claims are entitled.

Like reference numerals in the drawings refer to the same or similar functions across the various aspects. The shapes and sizes of elements in the drawings may be exaggerated for clarity.

In the present invention, terms such as "first" and "second" may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be named a second component, and similarly a second component may be named a first component. The term "and/or" includes a combination of a plurality of related listed items or any one of the plurality of related listed items.

When a component is referred to as being "connected" or "coupled" to another component, the two components may be directly connected or coupled to each other, but it should be understood that another component may exist between them. When a component is referred to as being "directly connected" or "directly coupled" to another component, it should be understood that no other component exists between the two.

The components shown in the embodiments of the present invention are illustrated independently to represent distinct characteristic functions; this does not mean that each component consists of separate hardware or a single software unit. That is, the components are listed separately for convenience of description. At least two components may be combined into a single component, or one component may be divided into multiple components that perform its function. Such integrated and separated embodiments are also included in the scope of the present invention as long as they do not depart from its essence.

In addition, describing an exemplary embodiment as "including" a specific configuration does not exclude configurations other than that configuration, and means that additional configurations may be included in the practice of the exemplary embodiments or within the scope of their technical idea.

The terms used in the present invention are used only to describe specific embodiments and are not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise. In the present invention, terms such as "comprise" or "have" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof. That is, describing the present invention as "including" a specific configuration does not exclude other configurations, and means that additional configurations may also be included in the practice of the present invention or within the scope of its technical idea.

Hereinafter, embodiments are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art can readily practice them. In describing the embodiments, a detailed description of a related known configuration or function is omitted when it is judged that it could obscure the subject matter of this specification. In addition, the same reference numerals are used for the same components in the drawings, and duplicate descriptions of the same components are omitted.

Hereinafter, an image may mean a single picture constituting a video, or may denote the video itself. For example, "encoding and/or decoding of an image" may mean "encoding and/or decoding of a video", or may mean "encoding and/or decoding of one of the images constituting a video".

Hereinafter, the terms "video" and "motion picture" may be used with the same meaning and may be used interchangeably.

Hereinafter, a target image may be an encoding target image that is to be encoded and/or a decoding target image that is to be decoded. A target image may also be an input image input to an encoding apparatus or an input image input to a decoding apparatus.

Hereinafter, the terms "image", "picture", "frame", and "screen" may be used with the same meaning and may be used interchangeably.

Hereinafter, a target block may be an encoding target block that is to be encoded and/or a decoding target block that is to be decoded. A target block may also be the current block that is currently being encoded and/or decoded. For example, the terms "target block" and "current block" may be used with the same meaning and may be used interchangeably.

Hereinafter, the terms "block" and "unit" may be used with the same meaning and may be used interchangeably. Alternatively, "block" may denote a specific unit.

Hereinafter, the terms "region" and "segment" may be used interchangeably.

Hereinafter, a specific signal may be a signal representing a specific block. For example, an original signal may be a signal representing a target block. A prediction signal may be a signal representing a prediction block. A residual signal may be a signal representing a residual block.

In the embodiments, each specified piece of information, data, flag, element, attribute, and the like may have a value. The value "0" of information, data, a flag, an element, an attribute, and the like may represent logical false or a first predefined value. In other words, the value "0", false, logical false, and the first predefined value may be used interchangeably. The value "1" of information, data, a flag, an element, an attribute, and the like may represent logical true or a second predefined value. In other words, the value "1", true, logical true, and the second predefined value may be used interchangeably.

When a variable such as i or j is used to denote a row, column, or index, its value may be an integer of 0 or greater, or an integer of 1 or greater. In other words, in the embodiments, rows, columns, indexes, and the like may be counted from 0 or from 1.

The terms used in the embodiments are described below.

Encoder: An apparatus that performs encoding.

Decoder: An apparatus that performs decoding.

Unit: A unit may denote a unit of image encoding and decoding. The terms "unit" and "block" may be used with the same meaning and may be used interchangeably.

- A unit may be an MxN array of samples, where M and N may each be a positive integer. A unit may often mean a two-dimensional array of samples.

- In image encoding and decoding, a unit may be a region generated by partitioning one image. In other words, a unit may be a specified region within one image. One image may be partitioned into a plurality of units. Alternatively, when one image is partitioned into subdivided parts and encoding or decoding is performed on a partitioned part, a unit may mean that partitioned part.

- In image encoding and decoding, predefined processing may be performed on a unit according to the type of the unit.

- Depending on function, the types of unit may be classified into Macro Unit, Coding Unit (CU), Prediction Unit (PU), Residual Unit, Transform Unit (TU), and the like. Alternatively, depending on function, a unit may mean a block, a macroblock, a Coding Tree Unit, a Coding Tree Block, a Coding Unit, a Coding Block, a Prediction Unit, a Prediction Block, a Residual Unit, a Residual Block, a Transform Unit, a Transform Block, and the like.

- To distinguish it from a block, a unit may mean information that includes a luma component block, the corresponding chroma component blocks, and the syntax elements for each block.

- A unit may have various sizes and shapes. In particular, the shape of a unit may include not only a square but also any geometric figure that can be represented in two dimensions, such as a rectangle, a trapezoid, a triangle, or a pentagon.

- Unit information may include at least one of the type of the unit, the size of the unit, the depth of the unit, the encoding order of the unit, and the decoding order of the unit. For example, the type of a unit may indicate one of CU, PU, residual unit, TU, and the like.

- One unit may be further partitioned into sub-units that are smaller than the unit.

Depth: Depth may mean the degree to which a unit has been partitioned. The unit depth may also indicate the level at which the unit exists when units are represented as a tree structure.

- Unit partition information may include a depth for the unit. The depth may indicate the number of times and/or the degree to which the unit has been partitioned.

- In a tree structure, the root node may be regarded as having the shallowest depth and a leaf node as having the deepest depth.

- One unit may be hierarchically partitioned into a plurality of sub-units, each having depth information, based on a tree structure. In other words, a unit and a sub-unit generated by partitioning that unit may correspond to a node and a child node of that node, respectively. Each partitioned sub-unit may have a depth. Because the depth indicates the number of times and/or the degree to which a unit has been partitioned, the partition information of a sub-unit may also include information about the size of the sub-unit.

- In a tree structure, the topmost node may correspond to the initial, unpartitioned unit. The topmost node may be called the root node. The topmost node may have the minimum depth value; in this case, it may have a depth of level 0.

- A node with a depth of level 1 may represent a unit generated by partitioning the initial unit once. A node with a depth of level 2 may represent a unit generated by partitioning the initial unit twice.

- A node with a depth of level n may represent a unit generated by partitioning the initial unit n times.

- A leaf node may be the lowest node and may be a node that cannot be partitioned further. The depth of a leaf node may be the maximum level. For example, the predefined value of the maximum level may be 3.

- QT depth may indicate the depth for quad partitioning. BT depth may indicate the depth for binary partitioning. TT depth may indicate the depth for ternary partitioning.
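The relationship between partitioning and depth described above can be sketched as a small tree. This is a minimal illustration that assumes quad partitioning of square units only; the class name `Unit`, its fields, and the 64x64 starting size are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Unit:
    """A square unit in a partition tree (hypothetical illustration)."""
    x: int
    y: int
    size: int
    depth: int = 0  # the root (unpartitioned) unit has depth 0
    children: List["Unit"] = field(default_factory=list)

    def quad_split(self) -> None:
        """Partition this unit into four equal sub-units one level deeper."""
        half = self.size // 2
        self.children = [
            Unit(self.x + dx, self.y + dy, half, self.depth + 1)
            for dy in (0, half)
            for dx in (0, half)
        ]

    def leaves(self) -> Iterator["Unit"]:
        """Yield the units that are not partitioned any further."""
        if not self.children:
            yield self
        else:
            for child in self.children:
                yield from child.leaves()


root = Unit(0, 0, 64)          # depth 0: the initial unit
root.quad_split()              # four depth-1 units of size 32
root.children[0].quad_split()  # the top-left unit becomes four depth-2 units of size 16
```

With quad partitioning alone, a unit at depth n has a side length equal to the root size divided by 2^n, which is one way the depth can carry information about sub-unit size.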

Sample: A sample may be the base unit constituting a block. Depending on the bit depth (Bd), a sample may be expressed as a value from 0 to 2^Bd - 1.

- A sample may be a pixel or a pixel value.

- Hereinafter, the terms "pixel", "picture element", and "sample" may be used with the same meaning and may be used interchangeably.
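As a quick check of the sample range stated above, the following sketch computes the minimum and maximum sample values for a given bit depth. The function name `sample_range` is an illustrative assumption.

```python
from typing import Tuple


def sample_range(bit_depth: int) -> Tuple[int, int]:
    """Return (min, max) sample values for a bit depth Bd: 0 to 2^Bd - 1."""
    if bit_depth < 1:
        raise ValueError("bit depth must be a positive integer")
    return 0, (1 << bit_depth) - 1


range_8bit = sample_range(8)    # (0, 255)
range_10bit = sample_range(10)  # (0, 1023)
```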

Coding Tree Unit (CTU): A CTU may consist of one luma component (Y) coding tree block and the two chroma component (Cb, Cr) coding tree blocks associated with that luma component coding tree block. A CTU may also mean these blocks together with the syntax elements for each of the blocks.

- Each coding tree unit may be partitioned using one or more partitioning schemes, such as Quad Tree (QT), Binary Tree (BT), and Ternary Tree (TT), to form sub-units such as coding units, prediction units, and transform units.

- As in the partitioning of an input image, CTU may be used as a term for a pixel block that is a processing unit in the decoding and encoding of an image.
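The three partitioning schemes named above can be compared by the sub-unit sizes they produce. The sketch below is an assumption-laden illustration: the text lists QT, BT, and TT but gives no split ratios here, so equal halves are assumed for binary splits and the 1:2:1 ratio used by VVC is assumed for ternary splits. The function name `split_sizes` and the mode labels are hypothetical.

```python
from typing import List, Tuple


def split_sizes(width: int, height: int, mode: str) -> List[Tuple[int, int]]:
    """Return the (width, height) of each sub-unit for one split.

    Assumed ratios (not given in the text): halves for QT/BT, 1:2:1 for TT.
    "_H" splits along a horizontal line, "_V" along a vertical line.
    """
    if mode == "QT":
        return [(width // 2, height // 2)] * 4
    if mode == "BT_H":
        return [(width, height // 2)] * 2
    if mode == "BT_V":
        return [(width // 2, height)] * 2
    if mode == "TT_H":
        return [(width, height // 4), (width, height // 2), (width, height // 4)]
    if mode == "TT_V":
        return [(width // 4, height), (width // 2, height), (width // 4, height)]
    raise ValueError(f"unknown split mode: {mode}")


quad = split_sizes(32, 32, "QT")       # four 16x16 sub-units
binary = split_sizes(32, 32, "BT_V")   # two 16x32 sub-units
ternary = split_sizes(32, 32, "TT_V")  # 8x32, 16x32, 8x32 sub-units
```

For dimensions divisible by four, each split preserves the total area of the parent unit, so the sub-units tile the parent exactly.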

코딩 트리 블록(Coding Tree Block; CTB): 코딩 트리 블록은 Y 코딩 트리 블록, Cb 코딩 트리 블록, Cr 코딩 트리 블록 중 어느 하나를 지칭하기 위한 용어로 사용될 수 있다.Coding Tree Block (CTB): A coding tree block may be used as a term for referring to any one of a Y coding tree block, a Cb coding tree block, and a Cr coding tree block.

주변 블록(neighbor block): 주변 블록은 대상 블록에 인접한 블록을 의미할 수 있다. 주변 블록은 재구축된 주변 블록을 의미할 수도 있다.Neighbor block: A neighboring block may mean a block adjacent to the target block. The neighboring block may mean a reconstructed neighboring block.

- Hereinafter, the terms "neighbor block" and "adjacent block" may be used with the same meaning and may be used interchangeably.

공간적 주변 블록(spatial neighbor block): 공간적 주변 블록은 대상 블록에 공간적으로 인접한 블록일 수 있다. 주변 블록은 공간적 주변 블록을 포함할 수 있다.Spatial neighbor block: The spatial neighboring block may be a block spatially adjacent to the target block. The neighboring blocks may include spatial neighboring blocks.

- 대상 블록 및 공간적 주변 블록은 대상 픽처 내에 포함될 수 있다.-The target block and the spatial neighboring block may be included in the target picture.

- 공간적 주변 블록은 대상 블록에 경계가 맞닿은 블록 또는 대상 블록으로부터 소정의 거리 내에 위치한 블록을 의미할 수 있다.-Spatial neighboring blocks may refer to blocks whose boundaries are in contact with the target block or blocks located within a predetermined distance from the target block.

- 공간적 주변 블록은 대상 블록의 꼭지점에 인접한 블록을 의미할 수 있다. 여기에서, 대상 블록의 꼭지점에 인접한 블록이란, 대상 블록에 가로로 인접한 이웃 블록에 세로로 인접한 블록 또는 대상 블록에 세로로 인접한 이웃 블록에 가로로 인접한 블록일 수 있다.-Spatial neighboring block may mean a block adjacent to the vertex of the target block. Here, the block adjacent to the vertex of the target block may be a block vertically adjacent to a neighboring block horizontally adjacent to the target block or a block horizontally adjacent to a neighboring block vertically adjacent to the target block.

시간적 주변 블록(temporal neighbor block): 시간적 주변 블록은 대상 블록에 시간적으로 인접한 블록일 수 있다. 주변 블록은 시간적 주변 블록을 포함할 수 있다.Temporal neighbor block: The temporal neighbor block may be a block that is temporally adjacent to the target block. The neighboring blocks may include temporal neighboring blocks.

- 시간적 주변 블록은 콜 블록(co-located block; col block)을 포함할 수 있다.-Temporal neighboring blocks may include co-located blocks (col blocks).

- 콜 블록은 이미 재구축된 콜 픽처(co-located picture; col picture) 내의 블록일 수 있다. 콜 블록의 콜 픽처 내에서의 위치는 대상 블록의 대상 픽처 내의 위치에 대응할 수 있다. 또는, 콜 블록의 콜 픽처 내에서의 위치는 대상 블록의 대상 픽처 내의 위치와 동일할 수 있다. 콜 픽처는 참조 픽처 리스트에 포함된 픽처일 수 있다.-The collocated block may be a block in a co-located picture (col picture) that has already been reconstructed. The position of the collocated block within the collocated picture may correspond to the position within the target picture of the target block. Alternatively, the position of the collocated block in the collocated picture may be the same as the position in the target picture of the target block. The collocated picture may be a picture included in the reference picture list.

- 시간적 주변 블록은 대상 블록의 공간적 주변 블록에 시간적으로 인접한 블록일 수 있다.-The temporal neighboring block may be a block that is temporally adjacent to the spatial neighboring block of the target block.

예측 유닛(prediction unit): 인터 예측, 인트라 예측, 인터 보상(compensation), 인트라 보상 및 움직임 보상 등의 예측에 대한 기반 단위를 의미할 수 있다.Prediction unit: may mean a base unit for prediction such as inter prediction, intra prediction, inter compensation, intra compensation, and motion compensation.

- 하나의 예측 유닛은 더 작은 크기를 갖는 복수의 파티션(partition)들 또는 하위 예측 유닛들로 분할될 수도 있다. 복수의 파티션들 또한 예측 또는 보상의 수행에 있어서의 기반 단위일 수 있다. 예측 유닛의 분할에 의해 생성된 파티션 또한 예측 유닛일 수 있다.-One prediction unit may be divided into a plurality of partitions or sub prediction units having a smaller size. The plurality of partitions may also be a base unit for performing prediction or compensation. A partition generated by division of the prediction unit may also be a prediction unit.

예측 유닛 파티션(prediction unit partition): 예측 유닛이 분할된 형태를 의미할 수 있다.Prediction unit partition: This may mean a form in which a prediction unit is divided.

재구축된 이웃 유닛(reconstructed neighboring unit): 재구축된 이웃 유닛은 대상 유닛의 주변에 이미 복호화되어 재구축된 유닛일 수 있다.Reconstructed neighboring unit: The reconstructed neighboring unit may be a unit that has already been decoded and reconstructed around the target unit.

- 재구축된 이웃 유닛은 대상 유닛에 대한 공간적(spatial) 인접 유닛 또는 시간적(temporal) 인접 유닛일 수 있다.-The reconstructed neighboring unit may be a spatial neighboring unit or a temporal neighboring unit to the target unit.

- 재구축된 공간적 주변 유닛은 대상 픽처 내의 유닛이면서 부호화 및/또는 복호화를 통해 이미 재구축된 유닛일 수 있다.-The reconstructed spatial neighboring unit may be a unit within the target picture and already reconstructed through encoding and/or decoding.

- 재구축된 시간적 주변 유닛은 참조 영상 내의 유닛이면서 부호화 및/또는 복호화를 통해 이미 재구축된 유닛일 수 있다. 재구축된 시간적 주변 유닛의 참조 영상 내에서의 위치는 대상 유닛의 대상 픽처 내에서의 위치와 같거나, 대상 유닛의 대상 픽처 내에서의 위치에 대응할 수 있다.-The reconstructed temporal neighboring unit may be a unit in the reference image and already reconstructed through encoding and/or decoding. The position of the reconstructed temporal neighboring unit in the reference image may be the same as the position in the target picture of the target unit or may correspond to the position in the target picture of the target unit.

파라미터 세트(parameter set): 파라미터 세트는 비트스트림 내의 구조(structure) 중 헤더(header) 정보에 해당할 수 있다. 예를 들면, 파라미터 세트는 비디오 파라미터 세트(video parameter set), 시퀀스 파라미터 세트(sequence parameter set), 픽처 파라미터 세트(picture parameter set) 및 적응 파라미터 세트(adaptation parameter set) 등을 포함할 수 있다.Parameter set: The parameter set may correspond to header information among structures in the bitstream. For example, the parameter set may include a video parameter set, a sequence parameter set, a picture parameter set, and an adaptation parameter set.

또한, 파라미터 세트는 슬라이스(slice) 헤더 정보 및 타일 헤더 정보를 포함할 수 있다.In addition, the parameter set may include slice header information and tile header information.

Rate-distortion optimization: The encoding apparatus may use rate-distortion optimization to provide high coding efficiency by using a combination of the size of the coding unit, the prediction mode, the size of the prediction unit, the motion information, the size of the transform unit, and the like.

- 율-왜곡 최적화 방식은 상기의 조합들 중에서 최적의 조합을 선택하기 위해 각 조합의 율-왜곡 비용(rate-distortion cost)을 계산할 수 있다. 율-왜곡 비용은 아래의 수식 1을 이용하여 계산될 수 있다. 일반적으로 상기 율-왜곡 비용이 최소가 되는 조합이 율-왜곡 최적화 방식에 있어서의 최적의 조합으로 선택될 수 있다.-The rate-distortion optimization method can calculate the rate-distortion cost of each combination in order to select an optimal combination among the above combinations. Rate-distortion cost can be calculated using Equation 1 below. In general, a combination in which the rate-distortion cost is minimized may be selected as an optimal combination in the rate-distortion optimization method.

[수식 1][Equation 1]

- D는 왜곡을 나타낼 수 있다. D는 변환 유닛 내에서 원래의 변환 계수들 및 재구축된 변환 계수들 간의 차이 값들의 제곱들의 평균(mean square error)일 수 있다.-D can represent distortion. D may be a mean square error of the difference values between the original transform coefficients and the reconstructed transform coefficients in the transform unit.

- R은 율을 나타낼 수 있다. R은 관련된 문맥 정보를 이용한 비트 율을 나타낼 수 있다.-R can represent a rate. R may represent a bit rate using related context information.

- λ는 라그랑지안 승수(Lagrangian multiplier)를 나타낼 수 있다. R은 예측 모드, 움직임 정보 및 코드된 블록 플래그(coded block flag) 등과 같은 코딩 파라미터 정보뿐만 아니라, 변환 계수의 부호화에 의해 발생하는 비트도 포함할 수 있다.-λ can represent Lagrangian multiplier. R may include not only coding parameter information such as prediction mode, motion information, and coded block flag, but also bits generated by encoding transform coefficients.

- 부호화 장치는 정확한 D 및 R을 계산하기 위해 인터 예측 및/또는 인트라 예측, 변환, 양자화, 엔트로피 부호화, 역양자화, 역변환 등의 과정들을 수행할 수 있다. 이러한 과정들은 부호화 장치에서의 복잡도를 크게 증가시킬 수 있다.-The encoding apparatus may perform processes such as inter prediction and/or intra prediction, transformation, quantization, entropy encoding, inverse quantization, and inverse transformation in order to calculate accurate D and R. These processes can greatly increase the complexity of the encoding device.
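The selection described above can be sketched as follows. Equation 1 itself is not reproduced in this text, so the sketch assumes the conventional Lagrangian form J = D + λ·R, which is consistent with the definitions of D, R and λ given above. The candidate dictionaries and the function names are hypothetical.

```python
from typing import Dict, List


def rd_cost(distortion: float, rate_bits: float, lam: float) -> float:
    """Rate-distortion cost in the assumed conventional form J = D + lambda * R."""
    return distortion + lam * rate_bits


def select_best(candidates: List[Dict], lam: float) -> Dict:
    """Return the candidate combination with the minimum rate-distortion cost."""
    return min(candidates, key=lambda c: rd_cost(c["D"], c["R"], lam))


# Hypothetical candidates: D is a distortion measure, R is a bit count.
candidates = [
    {"name": "intra", "D": 100.0, "R": 20},
    {"name": "inter", "D": 60.0, "R": 50},
]
```

With these made-up numbers, a small λ favors the low-distortion candidate and a larger λ favors the low-rate candidate, which is the trade-off the multiplier is meant to control.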

비트스트림(bitstream): 비트스트림은 부호화된 영상 정보를 포함하는 비트의 열을 의미할 수 있다.Bitstream: A bitstream may mean a sequence of bits including encoded image information.

파라미터 세트(parameter set): 파라미터 세트는 비트스트림 내의 구조(structure) 중 헤더(header) 정보에 해당할 수 있다.Parameter set: The parameter set may correspond to header information among structures in the bitstream.

- 파라미터 세트는 비디오 파라미터 세트(video parameter set), 시퀀스 파라미터 세트(sequence parameter set), 픽처 파라미터 세트(picture parameter set) 및 적응 파라미터 세트(adaptation parameter set) 중 적어도 하나를 포함할 수 있다. 또한, 파라미터 세트는 슬라이스(slice) 헤더의 정보 및 타일(tile) 헤더의 정보를 포함할 수도 있다.-The parameter set may include at least one of a video parameter set, a sequence parameter set, a picture parameter set, and an adaptation parameter set. In addition, the parameter set may include information on a slice header and information on a tile header.

파싱(parsing): 파싱은 비트스트림을 엔트로피 복호화하여 구문 요소(syntax element)의 값을 결정하는 것을 의미할 수 있다. 또는, 파싱은 엔트로피 복호화 자체를 의미할 수 있다.Parsing: Parsing may mean determining a value of a syntax element by entropy decoding a bitstream. Alternatively, parsing may mean entropy decoding itself.

심볼(symbol): 부호화 대상 유닛 및/또는 복호화 대상 유닛의 구문 요소, 코딩 파라미터(coding parameter) 및 변환 계수(transform coefficient) 등 중 적어도 하나를 의미할 수 있다. 또한, 심볼은 엔트로피 부호화의 대상 또는 엔트로피 복호화의 결과를 의미할 수 있다.Symbol: It may mean at least one of a syntax element, a coding parameter, a transform coefficient, and the like of a coding target unit and/or a decoding target unit. Also, the symbol may mean an object of entropy encoding or a result of entropy decoding.

참조 픽처(reference picture): 참조 픽처는 인터 예측 또는 움직임 보상을 위하여 유닛이 참조하는 영상을 의미할 수 있다. 또는, 참조 픽처는 인터 예측 또는 움직임 보상을 위해 대상 유닛이 참조하는 참조 유닛을 포함하는 영상일 수 있다.Reference picture: A reference picture may mean an image referenced by a unit for inter prediction or motion compensation. Alternatively, the reference picture may be an image including a reference unit referenced by the target unit for inter prediction or motion compensation.

이하, 용어 "참조 픽처" 및 "참조 영상"은 동일한 의미로 사용될 수 있으며, 서로 교체되어 사용될 수 있다.Hereinafter, the terms "reference picture" and "reference image" may be used with the same meaning, and may be used interchangeably.

참조 픽처 리스트(reference picture list): 참조 픽처 리스트는 인터 예측 또는 움직임 보상에 사용되는 하나 이상의 참조 영상들을 포함하는 리스트일 수 있다.Reference picture list: The reference picture list may be a list including one or more reference pictures used for inter prediction or motion compensation.

- The types of reference picture lists may include List Combined (LC), List 0 (L0), List 1 (L1), List 2 (L2), List 3 (L3), and the like.

- 인터 예측에는 하나 이상의 참조 픽처 리스트들이 사용될 수 있다.-One or more reference picture lists may be used for inter prediction.

Inter prediction indicator: The inter prediction indicator may indicate the direction of inter prediction for the target unit. Inter prediction may be one of uni-directional prediction, bi-directional prediction, and the like. Alternatively, the inter prediction indicator may indicate the number of reference images used when generating the prediction unit of the target unit. Alternatively, the inter prediction indicator may indicate the number of prediction blocks used for inter prediction or motion compensation for the target unit.

참조 픽처 색인(reference picture index): 참조 픽처 색인은 참조 픽처 리스트에서 특정 참조 영상을 지시하는 색인일 수 있다.Reference picture index: The reference picture index may be an index indicating a specific reference picture in the reference picture list.

움직임 벡터(Motion Vector; MV): 움직임 벡터는 인터 예측 또는 움직임 보상에서 사용되는 2차원의 벡터일 수 있다. 움직임 벡터는 대상 영상 및 참조 영상 간의 오프셋을 의미할 수 있다.Motion Vector (MV): The motion vector may be a two-dimensional vector used for inter prediction or motion compensation. The motion vector may mean an offset between the target image and the reference image.

- For example, an MV may be expressed in the form (mv_x, mv_y). mv_x may represent the horizontal component, and mv_y may represent the vertical component.

탐색 영역(search range): 탐색 영역은 인터 예측 중 MV에 대한 탐색이 이루어지는 2차원의 영역일 수 있다. 예를 들면, 탐색 영역의 크기는 MxN일 수 있다. M 및 N은 각각 양의 정수일 수 있다.Search range: The search region may be a two-dimensional region in which MV is searched during inter prediction. For example, the size of the search area may be MxN. Each of M and N may be a positive integer.

움직임 벡터 후보(motion vector candidate): 움직임 벡터 후보는 움직임 벡터를 예측할 때 예측 후보인 블록 혹은 예측 후보인 블록의 움직임 벡터를 의미할 수 있다. Motion vector candidate: A motion vector candidate may mean a block as a prediction candidate or a motion vector of a block as a prediction candidate when predicting a motion vector.

- 움직임 벡터 후보는 움직임 벡터 후보 리스트에 포함될 수 있다.-The motion vector candidate may be included in the motion vector candidate list.

움직임 벡터 후보 리스트(motion vector candidate list): 움직임 벡터 후보 리스트는 하나 이상의 움직임 벡터 후보들을 이용하여 구성된 리스트를 의미할 수 있다.Motion vector candidate list: The motion vector candidate list may mean a list constructed by using one or more motion vector candidates.

움직임 벡터 후보 색인(motion vector candidate index): 움직임 벡터 후보 색인은 움직임 벡터 후보 리스트 내의 움직임 벡터 후보를 가리키는 지시자를 의미할 수 있다. 또는, 움직임 벡터 후보 색인은 움직임 벡터 예측기(motion vector predictor)의 색인(index)일 수 있다.Motion vector candidate index: The motion vector candidate index may mean an indicator indicating a motion vector candidate in the motion vector candidate list. Alternatively, the motion vector candidate index may be an index of a motion vector predictor.

Motion information: Motion information may mean information including at least one of a motion vector, a reference picture index, and an inter prediction indicator, as well as reference picture list information, a reference image, a motion vector candidate, a motion vector candidate index, a merge candidate, a merge index, and the like.

변환 유닛(transform unit): 변환 유닛은 변환, 역변환, 양자화, 역양자화, 변환 계수 부호화 및 변환 계수 복호화 등과 같은 잔차 신호(residual signal) 부호화 및/또는 잔차 신호 복호화에 있어서의 기본 유닛일 수 있다. 하나의 변환 유닛은 더 작은 크기의 복수의 변환 유닛들로 분할될 수 있다.Transform unit: The transform unit may be a basic unit in residual signal encoding and/or residual signal decoding such as transform, inverse transform, quantization, inverse quantization, transform coefficient encoding and transform coefficient decoding. One transform unit may be divided into a plurality of transform units having a smaller size.

스케일링(scaling): 스케일링은 변환 계수 레벨에 인수를 곱하는 과정을 의미할 수 있다. Scaling: Scaling may mean a process of multiplying a transform coefficient level by a factor.

- 변환 계수 레벨에 대한 스케일링의 결과로서, 변환 계수가 생성될 수 있다. 스케일링은 역양자화(dequantization)로 칭해질 수도 있다.-As a result of scaling on the transform coefficient level, a transform coefficient may be generated. Scaling may also be referred to as dequantization.

양자화 파라미터(Quantization Parameter; QP): 양자화 파라미터는 양자화에서 변환 계수에 대해 변환 계수 레벨(transform coefficient level)을 생성할 때 사용되는 값을 의미할 수 있다. 또는, 양자화 파라미터는 역양자화에서 변환 계수 레벨을 스케일링(scaling)함으로써 변환 계수를 생성할 때 사용되는 값을 의미할 수도 있다. 또는, 양자화 파라미터는 양자화 스탭 크기(step size)에 매핑된 값일 수 있다.Quantization Parameter (QP): The quantization parameter may mean a value used when generating a transform coefficient level for a transform coefficient in quantization. Alternatively, the quantization parameter may mean a value used when generating a transform coefficient by scaling a transform coefficient level in inverse quantization. Alternatively, the quantization parameter may be a value mapped to a quantization step size.

델타 양자화 파라미터(delta quantization parameter): 델타 양자화 파라미터는 예측된 양자화 파라미터 및 대상 유닛의 양자화 파라미터의 차분(differential) 값을 의미한다.Delta quantization parameter: The delta quantization parameter means a differential value between a predicted quantization parameter and a quantization parameter of a target unit.
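The relationship between the quantization parameter, the step size, quantization, and scaling can be sketched as below. The specification only says that the QP maps to a quantization step size; the mapping used here, in which the step size doubles every 6 QP values as in H.264/HEVC-style codecs, is an assumption made for illustration. The sign convention for the delta QP (target QP minus predicted QP) and all function names are likewise assumptions.

```python
def qp_to_step(qp: int) -> float:
    """Assumed QP-to-step mapping: the step size doubles every 6 QP values."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeff: float, qp: int) -> int:
    """Map a transform coefficient to a transform coefficient level."""
    step = qp_to_step(qp)
    magnitude = int(abs(coeff) / step + 0.5)  # round half up on the magnitude
    return magnitude if coeff >= 0 else -magnitude


def scale(level: int, qp: int) -> float:
    """Scaling (dequantization): multiply the level by the step size."""
    return level * qp_to_step(qp)


def target_qp(predicted_qp: int, delta_qp: int) -> int:
    """Recover the target unit's QP, assuming delta = target - predicted."""
    return predicted_qp + delta_qp
```

Quantization is lossy: a coefficient of -7.0 at QP 10 becomes level -4, and scaling that level yields -8.0 rather than the original value.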

스캔(scan): 스캔은 유닛, 블록 또는 행렬 내의 계수들의 순서를 정렬하는 방법을 의미할 수 있다. 예를 들면, 2차원 배열을 1차원 배열 형태로 정렬하는 것을 스캔이라고 칭할 수 있다. 또는, 1차원 배열을 2차원 배열 형태로 정렬하는 것도 스캔 또는 역 스캔(inverse scan)이라고 칭할 수 있다.Scan: Scan may refer to a method of arranging the order of coefficients in a unit, block, or matrix. For example, arranging a two-dimensional array into a one-dimensional array may be referred to as a scan. Alternatively, arranging the one-dimensional array in the form of a two-dimensional array may also be referred to as a scan or an inverse scan.
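A scan and its inverse can be sketched as follows. The diagonal order used here is chosen only for illustration and is not claimed to be the scan order of this specification or of any particular codec. The function names are hypothetical.

```python
from typing import List, Tuple


def diagonal_order(height: int, width: int) -> List[Tuple[int, int]]:
    """List (row, col) positions anti-diagonal by anti-diagonal, top row first."""
    positions = [(r, c) for r in range(height) for c in range(width)]
    return sorted(positions, key=lambda p: (p[0] + p[1], p[0]))


def scan(block: List[List[int]]) -> List[int]:
    """Arrange a two-dimensional block into a one-dimensional list."""
    order = diagonal_order(len(block), len(block[0]))
    return [block[r][c] for r, c in order]


def inverse_scan(values: List[int], height: int, width: int) -> List[List[int]]:
    """Arrange a one-dimensional list back into a two-dimensional block."""
    block = [[0] * width for _ in range(height)]
    for value, (r, c) in zip(values, diagonal_order(height, width)):
        block[r][c] = value
    return block
```

Because both directions use the same position order, `inverse_scan(scan(block), h, w)` returns the original block.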

변환 계수(transform coefficient): 변환 계수는 부호화 장치에서 변환을 수행함에 따라 생성된 계수 값일 수 있다. 또는, 변환 계수는 복호화 장치에서 엔트로피 복호화 및 역양자화 중 적어도 하나를 수행함에 따라 생성된 계수 값일 수 있다. Transform coefficient: The transform coefficient may be a coefficient value generated by performing transformation in the encoding apparatus. Alternatively, the transform coefficient may be a coefficient value generated by performing at least one of entropy decoding and inverse quantization in the decoding apparatus.

- 변환 계수 또는 잔차 신호에 양자화를 적용함으로써 생성된 양자화된 레벨 또는 양자화된 변환 계수 레벨 또한 변환 계수의 의미에 포함될 수 있다.-A quantized level generated by applying quantization to a transform coefficient or a residual signal or a quantized transform coefficient level may also be included in the meaning of the transform coefficient.

양자화된 레벨(quantized level): 양자화된 레벨은 부호화 장치에서 변환 계수 또는 잔차 신호에 양자화를 수행함으로써 생성된 값을 의미할 수 있다. 또는, 양자화된 레벨은 복호화 장치에서 역양자화를 수행함에 있어서 역양자화의 대상이 되는 값을 의미할 수도 있다.Quantized level: The quantized level may mean a value generated by quantizing a transform coefficient or a residual signal in an encoding apparatus. Alternatively, the quantized level may mean a value that is an object of inverse quantization in performing inverse quantization in the decoding apparatus.

- 변환 및 양자화의 결과인 양자화된 변환 계수 레벨도 양자화된 레벨의 의미에 포함될 수 있다.-The quantized transform coefficient level resulting from transform and quantization may also be included in the meaning of the quantized level.

Non-zero transform coefficient: A non-zero transform coefficient may mean a transform coefficient having a non-zero value or a transform coefficient level having a non-zero value. Alternatively, a non-zero transform coefficient may mean a transform coefficient whose magnitude is not 0 or a transform coefficient level whose magnitude is not 0.

양자화 행렬(quantization matrix): 양자화 행렬은 영상의 주관적 화질 혹은 객관적 화질을 향상시키기 위해서 양자화 과정 또는 역양자화 과정에서 이용되는 행렬을 의미할 수 있다. 양자화 행렬은 스케일링 리스트(scaling list)라고도 칭해질 수 있다.Quantization matrix: The quantization matrix may mean a matrix used in a quantization process or an inverse quantization process in order to improve subjective or objective quality of an image. The quantization matrix may also be referred to as a scaling list.

양자화 행렬 계수(quantization matrix coefficient): 양자화 행렬 계수는 양자화 행렬 내의 각 원소(element)를 의미할 수 있다. 양자화 행렬 계수는 행렬 계수(matrix coefficient)라고도 칭해질 수 있다.Quantization matrix coefficient: The quantization matrix coefficient may mean each element in the quantization matrix. The quantization matrix coefficient may also be referred to as a matrix coefficient.

디폴트 행렬(default matrix): 기본 행렬은 부호화 장치 및 복호화 장치에서 기정의된 양자화 행렬일 수 있다.Default matrix: The default matrix may be a quantization matrix predefined by an encoding device and a decoding device.

비 디폴트 행렬(non-default matrix): 비 디폴트 행렬은 부호화 장치 및 복호화 장치에서 기정의되어 있지 않은 양자화 행렬일 수 있다. 비 디폴트 행렬은 부호화 장치로부터 복호화 장치로 시그널링될 수 있다.Non-default matrix: The non-default matrix may be a quantization matrix that is not predefined in an encoding device and a decoding device. The non-default matrix may be signaled from the encoding device to the decoding device.

시그널링: 시그널링은 정보가 부호화 장치로부터 복호화 장치로 전송되는 것을 나타낼 수 있다. 또는, 시그널링은 정보를 비트스트림 또는 기록 매체 내에 포함시키는 것을 의미할 수 있다. 부호화 장치에 의해 시그널링된 정보는 복호화 장치에 의해 사용될 수 있다.Signaling: Signaling may indicate that information is transmitted from an encoding device to a decoding device. Alternatively, signaling may mean including information in a bitstream or a recording medium. Information signaled by the encoding device may be used by the decoding device.

도 1은 본 발명이 적용되는 부호화 장치의 일 실시예에 따른 구성을 나타내는 블록도이다.1 is a block diagram showing a configuration according to an embodiment of an encoding apparatus to which the present invention is applied.

부호화 장치(100)는 인코더, 비디오 부호화 장치 또는 영상 부호화 장치일 수 있다. 비디오는 하나 이상의 영상들을 포함할 수 있다. 부호화 장치(100)는 비디오의 하나 이상의 영상들을 순차적으로 부호화할 수 있다.The encoding device 100 may be an encoder, a video encoding device, or an image encoding device. A video may include one or more images. The encoding apparatus 100 may sequentially encode one or more images of a video.

Referring to FIG. 1, the encoding apparatus 100 may include an inter prediction unit 110, an intra prediction unit 120, a switch 115, a subtractor 125, a transform unit 130, a quantization unit 140, an entropy encoding unit 150, an inverse quantization unit 160, an inverse transform unit 170, an adder 175, a filter unit 180, and a reference picture buffer 190.

부호화 장치(100)는 인트라 모드 및/또는 인터 모드를 사용하여 대상 영상에 대한 부호화를 수행할 수 있다.The encoding apparatus 100 may encode a target image using an intra mode and/or an inter mode.

또한, 부호화 장치(100)는 대상 영상에 대한 부호화를 통해 부호화의 정보를 포함하는 비트스트림을 생성할 수 있고, 생성된 비트스트림을 출력할 수 있다. 생성된 비트스트림은 컴퓨터 판독가능한 기록 매체에 저장될 수 있고, 유/무선 전송 매체를 통해 스트리밍될 수 있다.Also, the encoding apparatus 100 may generate a bitstream including encoding information through encoding on a target image, and may output the generated bitstream. The generated bitstream may be stored in a computer-readable recording medium, and may be streamed through a wired/wireless transmission medium.

예측 모드로서, 인트라 모드가 사용되는 경우, 스위치(115)는 인트라로 전환될 수 있다. 예측 모드로서, 인터 모드가 사용되는 경우, 스위치(115)는 인터로 전환될 수 있다.When an intra mode is used as the prediction mode, the switch 115 may be switched to intra. When the inter mode is used as the prediction mode, the switch 115 may be switched to inter.

부호화 장치(100)는 대상 블록에 대한 예측 블록을 생성할 수 있다. 또한, 부호화 장치(100)는 예측 블록이 생성된 후, 대상 블록 및 예측 블록의 차분(residual)을 부호화할 수 있다.The encoding apparatus 100 may generate a prediction block for a target block. Also, after the prediction block is generated, the encoding apparatus 100 may encode a residual between the target block and the prediction block.

예측 모드가 인트라 모드인 경우, 인트라 예측부(120)는 대상 블록의 주변에 있는, 이미 부호화/복호화된 블록의 픽셀을 참조 샘플로서 이용할 수 있다. 인트라 예측부(120)는 참조 샘플을 이용하여 대상 블록에 대한 공간적 예측을 수행할 수 있고, 공간적 예측을 통해 대상 블록에 대한 예측 샘플들을 생성할 수 있다.When the prediction mode is an intra mode, the intra prediction unit 120 may use a pixel of an already coded/decoded block adjacent to the target block as a reference sample. The intra prediction unit 120 may perform spatial prediction for the target block using the reference sample, and may generate prediction samples for the target block through spatial prediction.

인터 예측부(110)는 움직임 예측부 및 움직임 보상부를 포함할 수 있다.The inter prediction unit 110 may include a motion prediction unit and a motion compensation unit.

When the prediction mode is the inter mode, the motion prediction unit may search the reference image for the region that best matches the target block during the motion prediction process, and may use the found region to derive a motion vector between the target block and the found region.

참조 영상은 참조 픽처 버퍼(190)에 저장될 수 있으며, 참조 영상에 대한 부호화 및/또는 복호화가 처리되었을 때 참조 픽처 버퍼(190)에 저장될 수 있다.The reference picture may be stored in the reference picture buffer 190, and may be stored in the reference picture buffer 190 when the reference picture is encoded and/or decoded.

움직임 보상부는 움직임 벡터를 이용하는 움직임 보상을 수행함으로써 대상 블록에 대한 예측 블록을 생성할 수 있다. 여기에서, 움직임 벡터는 인터 예측에 사용되는 2차원 벡터일 수 있다. 또한 움직임 벡터는 대상 영상 및 참조 영상 간의 오프셋(offset)을 나타낼 수 있다.The motion compensation unit may generate a prediction block for the target block by performing motion compensation using a motion vector. Here, the motion vector may be a 2D vector used for inter prediction. In addition, the motion vector may represent an offset between the target image and the reference image.
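The motion prediction and motion compensation steps above can be sketched with an integer-precision full search. The matching criterion (sum of absolute differences), the square search window, and all names are assumptions made for illustration; the specification does not fix any of them, and fractional motion vectors with interpolation are left out.

```python
from typing import List, Tuple

Frame = List[List[int]]


def sad(cur: Frame, ref: Frame, bx: int, by: int, rx: int, ry: int, size: int) -> int:
    """Sum of absolute differences between a current block and a reference block."""
    return sum(abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))


def full_search(cur: Frame, ref: Frame, bx: int, by: int,
                size: int, search_range: int) -> Tuple[Tuple[int, int], int]:
    """Return ((mv_x, mv_y), cost) of the best match inside the search window."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall outside the reference image.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return (best[1], best[2]), best[0]


def motion_compensate(ref: Frame, bx: int, by: int,
                      mv: Tuple[int, int], size: int) -> Frame:
    """Build the prediction block by copying the region the motion vector points to."""
    dx, dy = mv
    return [[ref[by + dy + j][bx + dx + i] for i in range(size)] for j in range(size)]


# Synthetic example: the current image is the reference shifted by (2, 1).
ref_img = [[10 * y + x for x in range(8)] for y in range(8)]
cur_img = [[ref_img[min(y + 1, 7)][min(x + 2, 7)] for x in range(8)] for y in range(8)]
mv, cost = full_search(cur_img, ref_img, bx=2, by=2, size=2, search_range=2)
```

In this synthetic case the search recovers the offset between the two images, and `motion_compensate` reproduces the target block exactly.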

When the motion vector has a non-integer value, the motion prediction unit and the motion compensation unit may generate a prediction block by applying an interpolation filter to a partial region of the reference image. To perform inter prediction or motion compensation, it may be determined, on a CU basis, which of a skip mode, a merge mode, an advanced motion vector prediction (AMVP) mode, and a current picture reference mode is used as the method of motion prediction and motion compensation for a PU included in the CU, and inter prediction or motion compensation may be performed according to the determined mode.

감산기(125)는 대상 블록 및 예측 블록의 차분인 잔차 블록(residual block)을 생성할 수 있다. 잔차 블록은 잔차 신호로 칭해질 수도 있다.The subtractor 125 may generate a residual block, which is a difference between the target block and the prediction block. The residual block may also be referred to as a residual signal.

잔차 신호는 원 신호 및 예측 신호 간의 차이(difference)를 의미할 수 있다. 또는, 잔차 신호는 원신호 및 예측 신호 간의 차이를 변환(transform)하거나 양자화하거나 또는 변환 및 양자화함으로써 생성된 신호일 수 있다. 잔차 블록은 블록 단위에 대한 잔차 신호일 수 있다.The residual signal may mean a difference between the original signal and the predicted signal. Alternatively, the residual signal may be a signal generated by transforming or quantizing, or transforming and quantizing a difference between the original signal and the predicted signal. The residual block may be a residual signal for each block.
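As a minimal sketch of the subtractor and of the corresponding reconstruction step, the residual block can be computed as a per-sample difference. The function names are hypothetical, and the blocks are assumed to have equal dimensions.

```python
from typing import List

Block2D = List[List[int]]


def residual_block(target: Block2D, prediction: Block2D) -> Block2D:
    """Per-sample difference between the target block and the prediction block."""
    return [[t - p for t, p in zip(t_row, p_row)]
            for t_row, p_row in zip(target, prediction)]


def reconstruct(prediction: Block2D, residual: Block2D) -> Block2D:
    """Add the residual back to the prediction (the role of an adder)."""
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, residual)]
```

Without transform and quantization in between, adding the residual to the prediction recovers the target block exactly; any loss in a real encoder comes from the later quantization step.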

변환부(130)는 잔차 블록에 대해 변환(transform)을 수행하여 변환 계수를 생성할 수 있고, 생성된 변환 계수(transform coefficient)를 출력할 수 있다. 여기서, 변환 계수는 잔차 블록에 대한 변환을 수행함으로써 생성된 계수 값일 수 있다.The transform unit 130 may transform the residual block to generate a transform coefficient, and may output the generated transform coefficient. Here, the transform coefficient may be a coefficient value generated by performing transform on the residual block.

In performing the transform, the transform unit 130 may use one of a plurality of predefined transform methods.

The plurality of predefined transform methods may include a Discrete Cosine Transform (DCT), a Discrete Sine Transform (DST), a Karhunen-Loeve Transform (KLT)-based transform, and the like.

잔차 블록에 대한 변환을 위해 사용되는 변환 방법은 대상 블록 및/또는 주변 블록에 대한 코딩 파라미터들 중 적어도 하나에 따라 결정될 수 있다. 예를 들면, 변환 방법은 PU에 대한 인터 예측 모드, PU에 대한 인트라 예측 모드, TU의 크기 및 TU의 형태 중 적어도 하나에 기반하여 결정될 수 있다. 또는, 변환 방법을 지시하는 변환 정보가 부호화 장치(100)로부터 복호화 장치(200)로 시그널링될 수도 있다.The transform method used for transforming the residual block may be determined according to at least one of coding parameters for the target block and/or the neighboring block. For example, the transformation method may be determined based on at least one of an inter prediction mode for a PU, an intra prediction mode for a PU, a size of a TU, and a shape of a TU. Alternatively, transformation information indicating a transformation method may be signaled from the encoding device 100 to the decoding device 200.

변환 스킵(transform skip) 모드가 적용되는 경우, 변환부(130)는 잔차 블록에 대한 변환을 생략할 수도 있다.When the transform skip mode is applied, the transform unit 130 may omit the transform of the residual block.
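One of the transform methods named above, the DCT, can be sketched for a square residual block. This uses the floating-point orthonormal DCT-II as an illustrative choice; practical codecs typically use integer approximations, and the specification does not say which variant is intended. The helper names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def dct_matrix(n: int) -> Matrix:
    """Orthonormal DCT-II basis matrix of size n x n."""
    return [[(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m: Matrix) -> Matrix:
    return [list(row) for row in zip(*m)]


def dct2(block: Matrix) -> Matrix:
    """Forward 2D transform: C * X * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs: Matrix) -> Matrix:
    """Inverse 2D transform: C^T * Y * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)
```

For a flat block, all energy lands in the single DC coefficient, which is why the transform helps compress smooth residuals.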

변환 계수에 양자화를 적용함으로써 양자화된 변환 계수 레벨(transform coefficient level) 또는 양자화된 레벨이 생성될 수 있다. 이하, 실시예들에서는 양자화된 변환 계수 레벨 및 양자화된 레벨도 변환 계수로 칭해질 수 있다.By applying quantization to a transform coefficient, a quantized transform coefficient level or a quantized level may be generated. Hereinafter, in embodiments, a quantized transform coefficient level and a quantized level may also be referred to as transform coefficients.

양자화부(140)는 변환 계수를 양자화 파라미터에 맞춰 양자화함으로써 양자화된 변환 계수 레벨(quantized transform coefficient level) 또는 양자화된 레벨을 생성할 수 있다. 양자화부(140)는 생성된 양자화된 변환 계수 레벨 또는 생성된 양자화된 레벨을 출력할 수 있다. 이때, 양자화부(140)에서는 양자화 행렬을 사용하여 변환 계수를 양자화할 수 있다.The quantization unit 140 may generate a quantized transform coefficient level or a quantized level by quantizing a transform coefficient according to a quantization parameter. The quantization unit 140 may output the generated quantized transform coefficient level or the generated quantized level. In this case, the quantization unit 140 may quantize the transform coefficient using a quantization matrix.

The entropy encoding unit 150 may generate a bitstream by performing entropy encoding according to a probability distribution, based on the values calculated by the quantization unit 140 and/or the coding parameter values calculated during the encoding process. The entropy encoding unit 150 may output the generated bitstream.

엔트로피 부호화부(150)는 영상의 픽셀에 관한 정보 및 영상의 복호화를 위한 정보에 대한 엔트로피 부호화를 수행할 수 있다. 예를 들면, 영상의 복호화를 위한 정보는 구문 요소(syntax element) 등을 포함할 수 있다. The entropy encoder 150 may perform entropy encoding on information about pixels of an image and information for decoding an image. For example, information for decoding an image may include a syntax element or the like.

엔트로피 부호화가 적용되는 경우, 높은 발생 확률을 갖는 심볼에 적은 수의 비트가 할당될 수 있고, 낮은 발생 확률을 갖는 심볼에 많은 수의 비트가 할당될 수 있다. 이러한 할당을 통해 심볼이 표현됨에 따라, 부호화의 대상인 심볼들에 대한 비트열(bitstring)의 크기가 감소될 수 있다. 따라서, 엔트로피 부호화를 통해서 영상 부호화의 압축 성능이 향상될 수 있다. When entropy coding is applied, a small number of bits may be allocated to a symbol having a high probability of occurrence, and a large number of bits may be allocated to a symbol having a low probability of occurrence. As symbols are represented through such allocation, the size of a bitstring for symbols to be encoded may be reduced. Accordingly, compression performance of image encoding may be improved through entropy encoding.
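The bit allocation principle described above can be illustrated with Huffman coding, which is used here only as a familiar example of a variable-length code and is not named in the specification. The function name is hypothetical.

```python
import heapq
from collections import Counter
from typing import Dict, Hashable, Iterable


def huffman_code_lengths(symbols: Iterable[Hashable]) -> Dict[Hashable, int]:
    """Return the Huffman code length, in bits, for each distinct symbol."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    # Heap entries: (count, unique tie-breaker, symbols in this subtree).
    heap = [(count, idx, [sym]) for idx, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freq, 0)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, s1 = heapq.heappop(heap)
        c2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every symbol they contain.
        for sym in s1 + s2:
            lengths[sym] += 1
        heapq.heappush(heap, (c1 + c2, next_id, s1 + s2))
        next_id += 1
    return lengths


message = "aaaaaaaabbbbccd"
lengths = huffman_code_lengths(message)
total_bits = sum(lengths[s] for s in message)
```

In this example the most frequent symbol gets a 1-bit code and the rarest symbols get 3-bit codes, so the 15-symbol message needs 25 bits instead of the 30 bits required by a fixed 2-bit code.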

In addition, the entropy encoding unit 150 may use an encoding method such as exponential Golomb coding, Context-Adaptive Variable Length Coding (CAVLC), or Context-Adaptive Binary Arithmetic Coding (CABAC) for entropy encoding. For example, the entropy encoding unit 150 may perform entropy encoding using a Variable Length Coding/Code (VLC) table. For example, the entropy encoding unit 150 may derive a binarization method for a target symbol. In addition, the entropy encoding unit 150 may derive a probability model for a target symbol/bin. The entropy encoding unit 150 may also perform arithmetic encoding using the derived binarization method, probability model, and context model.
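Of the methods listed above, exponential Golomb coding is simple enough to sketch in a few lines. The sketch covers the order-0 code for unsigned integers and represents bits as a string of "0" and "1" characters for readability; both choices, and the function names, are assumptions made for illustration.

```python
from typing import Tuple


def exp_golomb_encode(value: int) -> str:
    """Order-0 exponential Golomb code of a non-negative integer."""
    if value < 0:
        raise ValueError("value must be non-negative")
    binary = bin(value + 1)[2:]            # binary form of value + 1
    return "0" * (len(binary) - 1) + binary  # prefix of leading zeros


def exp_golomb_decode(bits: str) -> Tuple[int, str]:
    """Decode one codeword; return (value, remaining bits).

    Assumes the input starts with a well-formed codeword.
    """
    zeros = 0
    while bits[zeros] == "0":
        zeros += 1
    end = 2 * zeros + 1
    return int(bits[zeros:end], 2) - 1, bits[end:]
```

Small values receive short codewords, which suits syntax elements whose small values are the most probable.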

엔트로피 부호화부(150)는 양자화된 변환 계수 레벨을 부호화하기 위해 변환 계수 스캐닝(transform coefficient scanning) 방법을 통해 2차원의 블록의 형태(form)의 계수를 1차원의 벡터의 형태로 변경할 수 있다.The entropy encoder 150 may change a coefficient of a form of a two-dimensional block into a form of a one-dimensional vector through a transform coefficient scanning method in order to encode the quantized transform coefficient level.

코딩 파라미터는 부호화 및/또는 복호화를 위해 요구되는 정보일 수 있다. 코딩 파라미터는 부호화 장치(100)에서 부호화되어 부호화 장치(100)로부터 복호화 장치로 전달되는 정보를 포함할 수 있고, 부호화 혹은 복호화 과정에서 유추될 수 있는 정보를 포함할 수 있다. 예를 들면, 복호화 장치로 전달되는 정보로서, 구문 요소가 있다.The coding parameter may be information required for encoding and/or decoding. The coding parameter may include information that is encoded by the encoding device 100 and transmitted from the encoding device 100 to the decoding device, and may include information that can be inferred during an encoding or decoding process. For example, as information transmitted to the decoding device, there is a syntax element.

코딩 파라미터(coding parameter)는 구문 요소와 같이 부호화 장치에서 부호화되고, 부호화 장치로부터 복호화 장치로 시그널링되는 정보(또는, 플래그, 인덱스 등)뿐만 아니라, 부호화 과정 또는 복호화 과정에서 유도되는 정보를 포함할 수 있다. 또한, 코딩 파라미터는 영상을 부호화하거나 복호화함에 있어서 요구되는 정보를 포함할 수 있다. 예를 들면, 유닛/블록의 크기, 유닛/블록의 깊이, 유닛/블록의 분할 정보, 유닛/블록의 분할 구조, 유닛/블록이 쿼드 트리 형태로 분할되는지 여부를 나타내는 정보, 유닛/블록이 이진 트리 형태로 분할되는지 여부를 나타내는 정보, 이진 트리 형태의 분할 방향(가로 방향 또는 세로 방향), 이진 트리 형태의 분할 형태(대칭 분할 또는 비대칭 분할), 유닛/블록이 삼진 트리 형태로 분할되는지 여부를 나타내는 정보, 삼진 트리 형태의 분할 방향(가로 방향 또는 세로 방향), 예측 방식(인트라 예측 또는 인터 예측), 인트라 예측 모드/방향, 참조 샘플 필터링 방법, 예측 블록 필터링 방법, 예측 블록 경계 필터링 방법, 필터링의 필터 탭, 필터링의 필터 계수, 인터 예측 모드, 움직임 정보, 움직임 벡터, 참조 픽처 색인, 인터 예측 방향, 인터 예측 지시자, 참조 픽처 리스트, 참조 영상, 움직임 벡터 예측기, 움직임 벡터 예측 후보, 움직임 벡터 후보 리스트, 머지 모드를 사용하는지 여부를 나타내는 정보, 머지 후보, 머지 후보 리스트, 스킵(skip) 모드를 사용하는지 여부를 나타내는 정보, 보간 필터의 종류, 보간 필터의 필터 탭, 보간 필터의 필터 계수, 움직임 벡터 크기, 움직임 벡터 표현 정확도, 변환 종류, 변환 크기, 1차 변환을 사용하는지 여부를 나타내는 정보, 추가(2차) 변환을 사용하는지 여부를 나타내는 정보, 1차 변환 인덱스, 2차 변환 인덱스, 잔차 신호의 유무를 나타내는 정보, 코드된 블록 패턴(coded block pattern), 코드된 블록 플래그(coded block flag), 양자화 파라미터, 양자화 행렬, 인트라-루프 필터에 대한 정보, 인트라-루프 필터를 적용하는지 여부에 대한 정보, 인트라-루프 필터의 계수, 인트라-루프의 필터 탭, 인트라 루프 필터의 모양(shape)/형태(form), 디블록킹 필터를 적용하는지 여부를 나타내는 정보, 디블록킹 필터 계수, 디블록킹 필터 탭, 디블록킹 필터 강도, 디블록킹 필터 모양/형태, 적응적 샘플 오프셋을 적용하는지 여부를 나타내는 정보, 적응적 샘플 오프셋 값, 적응적 샘플 오프셋 카테고리, 적응적 샘플 오프셋 종류, 적응적 루프-내(in-loop) 필터를 적용하는지 여부, 적응적 루프-내 필터 계수, 적응적 루프-내 필터 탭, 적응적 루프-내 필터 모양/형태, 이진화/역이진화 방법, 문맥 모델, 문맥 모델 결정 방법, 문맥 모델 업데이트 방법, 레귤러 모드를 수행하는지 여부, 바이패스 모드를 수행하는지 여부, 문맥 빈, 바이패스 빈, 변환 계수, 변환 계수 레벨, 변환 계수 레벨 스캐닝 방법, 영상의 디스플레이/출력 순서, 슬라이스 식별 정보, 슬라이스 타입, 슬라이스 분할 정보, 타일 식별 정보, 타일 타입, 타일 분할 정보, 픽처 타입, 비트 심도, 휘도 신호에 대한 정보 및 색차 신호에 대한 정보 중 적어도 하나의 값, 조합된 형태 또는 통계가 코딩 파라미터에 포함될 수 있다. 예측 방식은 인트라 예측 모드 및 인터 예측 모드 중 하나의 예측 모드를 나타낼 수 있다.A coding parameter may include not only information (or a flag, an index, etc.) that, like a syntax element, is encoded by the encoding device and signaled from the encoding device to the decoding device, but also information derived during the encoding process or the decoding process. In addition, the coding parameter may include information required for encoding or decoding an image. For example, at least one value, a combined form, or statistics of the following may be included in the coding parameter: the size of a unit/block, the depth of a unit/block, partition information of a unit/block, the partition structure of a unit/block, information indicating whether a unit/block is partitioned in a quad-tree form, information indicating whether a unit/block is partitioned in a binary-tree form, the partition direction of the binary-tree form (horizontal or vertical), the partition type of the binary-tree form (symmetric or asymmetric), information indicating whether a unit/block is partitioned in a ternary-tree form, the partition direction of the ternary-tree form (horizontal or vertical), the prediction method (intra prediction or inter prediction), intra prediction mode/direction, reference sample filtering method, prediction block filtering method, prediction block boundary filtering method, filter tap of filtering, filter coefficient of filtering, inter prediction mode, motion information, motion vector, reference picture index, inter prediction direction, inter prediction indicator, reference picture list, reference image, motion vector predictor, motion vector prediction candidate, motion vector candidate list, information indicating whether merge mode is used, merge candidate, merge candidate list, information indicating whether skip mode is used, type of interpolation filter, filter tap of interpolation filter, filter coefficient of interpolation filter, motion vector size, motion vector representation accuracy, transform type, transform size, information indicating whether a primary transform is used, information indicating whether an additional (secondary) transform is used, primary transform index, secondary transform index, information indicating the presence or absence of a residual signal, coded block pattern, coded block flag, quantization parameter, quantization matrix, information about an intra-loop filter, information about whether an intra-loop filter is applied, coefficients of the intra-loop filter, filter tap of the intra-loop filter, shape/form of the intra-loop filter, information indicating whether a deblocking filter is applied, deblocking filter coefficient, deblocking filter tap, deblocking filter strength, deblocking filter shape/form, information indicating whether an adaptive sample offset is applied, adaptive sample offset value, adaptive sample offset category, adaptive sample offset type, whether an adaptive in-loop filter is applied, adaptive in-loop filter coefficient, adaptive in-loop filter tap, adaptive in-loop filter shape/form, binarization/inverse binarization method, context model, context model determination method, context model update method, whether regular mode is performed, whether bypass mode is performed, context bin, bypass bin, transform coefficient, transform coefficient level, transform coefficient level scanning method, display/output order of images, slice identification information, slice type, slice partition information, tile identification information, tile type, tile partition information, picture type, bit depth, information about a luminance signal, and information about a chrominance signal. The prediction method may indicate one of the intra prediction mode and the inter prediction mode.

잔차 신호는 원 신호 및 예측 신호 간의 차분(difference)을 나타낼 수 있다. 또는, 잔차 신호는 원신호 및 예측 신호 간의 차분을 변환(transform)함으로써 생성된 신호일 수 있다. 또는, 잔차 신호는 원 신호 및 예측 신호 간의 차분을 변환 및 양자화함으로써 생성된 신호일 수 있다. 잔차 블록은 블록에 대한 잔차 신호일 수 있다.The residual signal may represent a difference between the original signal and the predicted signal. Alternatively, the residual signal may be a signal generated by transforming a difference between the original signal and the predicted signal. Alternatively, the residual signal may be a signal generated by transforming and quantizing a difference between the original signal and the predicted signal. The residual block may be a residual signal for the block.

여기서, 플래그 또는 인덱스를 시그널링(signaling)한다는 것은 부호화 장치(100)에서는 플래그 또는 인덱스에 대한 엔트로피 부호화(entropy encoding)를 수행함으로써 생성된 엔트로피 부호화된 플래그 또는 엔트로피 부호화된 인덱스를 비트스트림(Bitstream)에 포함시키는 것을 의미할 수 있고, 복호화 장치(200)에서는 비트스트림으로부터 추출된 엔트로피 부호화된 플래그 또는 엔트로피 부호화된 인덱스에 대한 엔트로피 복호화(entropy decoding)를 수행함으로써 플래그 또는 인덱스를 획득하는 것을 의미할 수 있다.Here, signaling a flag or an index may mean that the encoding device 100 includes, in a bitstream, an entropy-encoded flag or an entropy-encoded index generated by performing entropy encoding on the flag or index, and may mean that the decoding device 200 obtains the flag or index by performing entropy decoding on the entropy-encoded flag or entropy-encoded index extracted from the bitstream.

부호화 장치(100)에 의해 인터 예측을 통한 부호화가 수행되기 때문에, 부호화된 대상 영상은 이후에 처리되는 다른 영상(들)에 대하여 참조 영상으로서 사용될 수 있다. 따라서, 부호화 장치(100)는 부호화된 대상 영상을 다시 재구축 또는 복호화할 수 있고, 재구축 또는 복호화된 영상을 참조 영상으로서 참조 픽처 버퍼(190)에 저장할 수 있다. 복호화를 위해 부호화된 대상 영상에 대한 역양자화 및 역변환이 처리될 수 있다.Since encoding through inter prediction is performed by the encoding apparatus 100, the encoded target image may be used as a reference image for other image(s) to be processed later. Accordingly, the encoding apparatus 100 may reconstruct or decode the encoded target image again, and may store the reconstructed or decoded image in the reference picture buffer 190 as a reference image. For this decoding, inverse quantization and inverse transformation may be performed on the encoded target image.

양자화된 레벨은 역양자화부(160)에서 역양자화될(inversely quantized) 수 있고, 역변환부(170)에서 역변환될(inversely transformed) 수 있다. 역양자화 및/또는 역변환된 계수는 가산기(175)를 통해 예측 블록과 합해질 수 있다, 역양자화 및/또는 역변환된 계수와 예측 블록을 합함으로써 재구축된(reconstructed) 블록이 생성될 수 있다. 여기서, 역양자화 및/또는 역변환된 계수는 역양자화(dequantization) 및 역변환(inverse-transformation) 중 적어도 하나 이상이 수행된 계수를 의미할 수 있고, 재구축된 잔차 블록을 의미할 수 있다.The quantized level may be inversely quantized in the inverse quantization unit 160 and may be inversely transformed in the inverse transform unit 170. The inverse quantized and/or inverse transformed coefficient may be summed with the prediction block through the adder 175, and a reconstructed block may be generated by summing the inverse quantized and/or inverse transformed coefficient and the prediction block. Here, the inverse quantized and/or inverse transformed coefficient may mean a coefficient in which at least one of dequantization and inverse-transformation has been performed, and may mean a reconstructed residual block.
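The reconstruction path described above can be sketched as follows. This is a simplified illustration under stated assumptions: a uniform scalar quantizer with a single step size, no inverse transform (the text allows that only one of inverse quantization and inverse transformation is performed), and clipping of the result to the sample range. The names `dequantize`, `reconstruct`, and `step` are hypothetical.

```python
def dequantize(levels, step):
    """Inverse quantization with a uniform step size (assumed quantizer model)."""
    return [[level * step for level in row] for row in levels]

def reconstruct(residual, prediction, bit_depth=8):
    """Add the reconstructed residual block to the prediction block.

    Clipping to [0, 2**bit_depth - 1] is an assumption of this sketch.
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]

levels = [[1, -2], [0, 3]]              # quantized levels (illustrative values)
prediction = [[250, 5], [100, 100]]     # prediction block (illustrative values)
residual = dequantize(levels, step=4)   # plays the role of the reconstructed residual block
block = reconstruct(residual, prediction)
```

Here `residual` corresponds to the output of the inverse quantization unit 160, and the addition in `reconstruct` corresponds to the adder 175.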

재구축된 블록은 필터부(180)를 거칠 수 있다. 필터부(180)는 디블록킹 필터(deblocking filter), 샘플 적응적 오프셋(Sample Adaptive Offset; SAO) 및 적응적 루프 필터(Adaptive Loop Filter; ALF) 중 적어도 하나 이상을 재구축된 블록 또는 재구축된 픽처에 적용할 수 있다. 필터부(180)는 루프-내(in-loop) 필터로 칭해질 수도 있다.The reconstructed block may pass through the filter unit 180. The filter unit 180 may apply at least one of a deblocking filter, a Sample Adaptive Offset (SAO), and an Adaptive Loop Filter (ALF) to the reconstructed block or the reconstructed picture. The filter unit 180 may also be referred to as an in-loop filter.

디블록킹 필터는 블록들 간의 경계에서 발생한 블록 왜곡을 제거할 수 있다. 디블록킹 필터를 적용할지 여부를 판단하기 위해, 블록에 포함된 몇 개의 열 또는 행에 포함된 픽셀(들)에 기반하여 대상 블록에 디블록킹 필터를 적용할지 여부가 판단될 수 있다.The deblocking filter may remove block distortion occurring at the boundary between blocks. In order to determine whether to apply the deblocking filter, it may be determined whether to apply the deblocking filter to the target block based on the pixel(s) included in several columns or rows included in the block.

대상 블록에 디블록킹 필터를 적용하는 경우, 적용되는 필터는 요구되는 디블록킹 필터링의 강도에 따라 다를 수 있다. 말하자면, 서로 다른 필터들 중 디블록킹 필터링의 강도에 따라 결정된 필터가 대상 블록에 적용될 수 있다. 대상 블록에 디블록킹 필터가 적용되는 경우, 요구되는 디블록킹 필터링의 강도에 따라 강한 필터(strong filter) 및 약한 필터(weak filter) 중 하나의 필터가 대상 블록에 적용될 수 있다.When the deblocking filter is applied to the target block, the applied filter may be different according to the required strength of the deblocking filtering. In other words, a filter determined according to the strength of the deblocking filtering among different filters may be applied to the target block. When the deblocking filter is applied to the target block, one of a strong filter and a weak filter may be applied to the target block according to the required strength of the deblocking filtering.

또한, 대상 블록에 수직 방향 필터링 및 수평 방향 필터링이 수행되는 경우, 수평 방향 필터링 및 수직 방향 필터링이 병행으로 처리될 수 있다.In addition, when vertical filtering and horizontal filtering are performed on the target block, horizontal filtering and vertical filtering may be processed in parallel.
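The decision between no filtering, a weak filter, and a strong filter can be sketched on a single row of samples crossing a vertical block boundary. The thresholds `beta` and `flat_threshold`, the decision rule, and the filter taps below are all assumptions chosen for illustration; they are not taken from this document or from any particular standard.

```python
def deblock_boundary(p, q, beta=8, flat_threshold=2):
    """Filter the two samples nearest a block boundary on one row.

    p: samples on one side of the boundary, p[0] nearest to it.
    q: samples on the other side, q[0] nearest to it.
    Returns (new_p, new_q, mode), where mode is "none", "weak", or "strong".
    """
    p, q = list(p), list(q)
    step = abs(p[0] - q[0])                          # jump across the boundary
    activity = abs(p[1] - p[0]) + abs(q[1] - q[0])   # local variation on both sides
    if step == 0 or step >= beta:
        # No discontinuity, or a jump large enough to be treated as a real edge.
        return p, q, "none"
    if activity <= flat_threshold:
        # Flat on both sides: blocking is most visible, so smooth more strongly.
        new_p0 = (p[1] + 2 * p[0] + q[0] + 2) // 4
        new_q0 = (p[0] + 2 * q[0] + q[1] + 2) // 4
        p[0], q[0] = new_p0, new_q0
        return p, q, "strong"
    # Textured area: pull the two boundary samples slightly toward each other.
    diff = q[0] - p[0]
    delta = diff // 4 if diff >= 0 else -((-diff) // 4)
    p[0] += delta
    q[0] -= delta
    return p, q, "weak"
```

Because each row (or column) is filtered from its own samples only, rows along a vertical edge can be processed independently of one another in this sketch.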

SAO는 코딩 에러에 대한 보상을 위해 픽셀의 픽셀 값에 적정한 오프셋(offset)을 더할 수 있다. SAO는 디블록킹이 적용된 영상에 대해, 픽셀의 단위로 원본 영상 및 디블록킹이 적용된 영상 간의 차이에 대하여 오프셋을 사용하는 보정을 수행할 수 있다. 영상에 대한 오프셋 보정을 수행하기 위해, 영상에 포함된 픽셀들을 일정한 수의 영역들로 구분한 후, 구분된 영역들 중 오프셋이 수행될 영역을 결정하고 결정된 영역에 오프셋을 적용하는 방법이 사용될 수 있고, 영상의 각 픽셀의 에지 정보를 고려하여 오프셋을 적용하는 방법이 사용될 수 있다.SAO may add an appropriate offset to the pixel value of a pixel to compensate for a coding error. For an image to which deblocking has been applied, SAO may perform, in units of pixels, correction using an offset for the difference between the original image and the deblocked image. To perform offset correction on an image, a method of dividing the pixels included in the image into a certain number of regions, determining the regions to which an offset is to be applied among the divided regions, and applying the offset to the determined regions may be used, or a method of applying an offset in consideration of edge information of each pixel of the image may be used.
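One reading of the first offset method is a band-style correction in which pixels are grouped by intensity range and only selected groups receive an offset. The sketch below follows that reading; the number of bands (32), the grouping by intensity rather than by spatial position, and the function name `sao_band_offset` are assumptions of this illustration.

```python
def sao_band_offset(pixels, band_offsets, bit_depth=8, num_bands=32):
    """Add a per-band offset to each pixel and clip to the valid sample range.

    band_offsets maps a band index to its offset; bands not listed are left unchanged.
    """
    max_val = (1 << bit_depth) - 1
    band_width = (max_val + 1) // num_bands   # 8 intensity levels per band for 8-bit samples
    return [
        [min(max(p + band_offsets.get(p // band_width, 0), 0), max_val) for p in row]
        for row in pixels
    ]

deblocked = [[10, 20], [100, 255]]      # illustrative deblocked samples
offsets = {1: 2, 31: 3}                 # offsets chosen for bands 1 and 31 only
corrected = sao_band_offset(deblocked, offsets)
```

An encoder would typically choose the offsets so that the corrected samples move closer to the original image; how those offsets are chosen is outside the scope of this sketch.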

ALF는 재구축된 영상 및 원래의 영상을 비교한 값에 기반하여 필터링을 수행할 수 있다. 영상에 포함된 픽셀들을 소정의 그룹들로 분할한 후, 각 분할된 그룹에 적용될 필터가 결정될 수 있고, 그룹 별로 차별적으로 필터링이 수행될 수 있다. 휘도 신호에 대하여, 적응적 루프 필터를 적용할지 여부에 관련된 정보는 CU 별로 시그널링될 수 있다. 각 블록에 적용될 ALF 의 모양 및 필터 계수는 블록 별로 다를 수 있다. 또는, 블록의 특징과는 무관하게, 고정된 형태의 ALF가 블록에 적용될 수 있다.The ALF may perform filtering based on a value obtained by comparing the reconstructed image and the original image. After dividing the pixels included in the image into predetermined groups, a filter to be applied to each divided group may be determined, and filtering may be performed differentially for each group. For the luminance signal, information related to whether to apply the adaptive loop filter may be signaled for each CU. The shape and filter coefficient of ALF to be applied to each block may be different for each block. Alternatively, regardless of the characteristics of the block, a fixed ALF may be applied to the block.

필터부(180)를 거친 재구축된 블록 또는 재구축된 영상은 참조 픽처 버퍼(190)에 저장될 수 있다. 필터부(180)를 거친 재구축된 블록은 참조 픽처의 일부일 수 있다. 말하자면, 참조 픽처는 필터부(180)를 거친 재구축된 블록들로 구성된 재구축된 픽처일 수 있다. 저장된 참조 픽처는 이후 인터 예측에 사용될 수 있다.The reconstructed block or reconstructed image that has passed through the filter unit 180 may be stored in the reference picture buffer 190. The reconstructed block that has passed through the filter unit 180 may be a part of the reference picture. In other words, the reference picture may be a reconstructed picture composed of reconstructed blocks that have passed through the filter unit 180. The stored reference picture can be used for inter prediction later.

도 2는 본 발명이 적용되는 복호화 장치의 일 실시예에 따른 구성을 나타내는 블록도이다.FIG. 2 is a block diagram showing a configuration according to an embodiment of a decoding apparatus to which the present invention is applied.

복호화 장치(200)는 디코더, 비디오 복호화 장치 또는 영상 복호화 장치일 수 있다.The decoding device 200 may be a decoder, a video decoding device, or an image decoding device.

도 2를 참조하면, 복호화 장치(200)는 엔트로피 복호화부(210), 역양자화부(220), 역변환부(230), 인트라 예측부(240), 인터 예측부(250), 스위치(245), 가산기(255), 필터부(260) 및 참조 픽처 버퍼(270)를 포함할 수 있다.Referring to FIG. 2, the decoding apparatus 200 may include an entropy decoding unit 210, an inverse quantization unit 220, an inverse transform unit 230, an intra prediction unit 240, an inter prediction unit 250, a switch 245, an adder 255, a filter unit 260, and a reference picture buffer 270.

복호화 장치(200)는 부호화 장치(100)에서 출력된 비트스트림을 수신할 수 있다. 복호화 장치(200)는 컴퓨터 판독가능한 기록 매체에 저장된 비트스트림을 수신할 수 있고, 유/무선 전송 매체를 통해 스트리밍되는 비트스트림을 수신할 수 있다.The decoding apparatus 200 may receive a bitstream output from the encoding apparatus 100. The decoding apparatus 200 may receive a bitstream stored in a computer-readable recording medium, and may receive a bitstream streamed through a wired/wireless transmission medium.

복호화 장치(200)는 비트스트림에 대하여 인트라 모드 및/또는 인터 모드의 복호화를 수행할 수 있다. 또한, 복호화 장치(200)는 복호화를 통해 재구축된 영상 또는 복호화된 영상을 생성할 수 있고, 생성된 재구축된 영상 또는 복호화된 영상을 출력할 수 있다.The decoding apparatus 200 may perform intra mode and/or inter mode decoding on a bitstream. In addition, the decoding apparatus 200 may generate a reconstructed image or a decoded image through decoding, and may output the generated reconstructed image or a decoded image.

예를 들면, 복호화에 사용되는 예측 모드에 따른 인트라 모드 또는 인터 모드로의 전환은 스위치(245)에 의해 이루어질 수 있다. 복호화에 사용되는 예측 모드가 인트라 모드인 경우 스위치(245)가 인트라 모드로 전환될 수 있다. 복호화에 사용되는 예측 모드가 인터 모드인 경우 스위치(245)가 인터 모드로 전환될 수 있다.For example, switching to an intra mode or an inter mode according to a prediction mode used for decoding may be performed by the switch 245. When the prediction mode used for decoding is the intra mode, the switch 245 may be switched to the intra mode. When the prediction mode used for decoding is the inter mode, the switch 245 may be switched to the inter mode.

복호화 장치(200)는 입력된 비트스트림을 복호화함으로써 재구축된 잔차 블록(reconstructed residual block)을 획득할 수 있고, 예측 블록을 생성할 수 있다. 재구축된 잔차 블록 및 예측 블록이 획득되면, 복호화 장치(200)는 재구축된 잔차 블록 및 예측 블록을 더함으로써 복호화의 대상이 되는 재구축된 블록을 생성할 수 있다.The decoding apparatus 200 may obtain a reconstructed residual block by decoding the input bitstream, and may generate a prediction block. When the reconstructed residual block and the prediction block are obtained, the decoding apparatus 200 may generate a reconstructed block to be decoded by adding the reconstructed residual block and the prediction block.

엔트로피 복호화부(210)는 비트스트림에 대한 확률 분포에 기초하여 비트스트림에 대한 엔트로피 복호화를 수행함으로써 심볼들을 생성할 수 있다. 생성된 심볼들은 양자화된 변환 계수 레벨(quantized transform coefficient level) 형태의 심볼을 포함할 수 있다. 여기에서, 엔트로피 복호화 방법은 상술된 엔트로피 부호화 방법과 유사할 수 있다. 예를 들면, 엔트로피 복호화 방법은 상술된 엔트로피 부호화 방법의 역과정일 수 있다.The entropy decoder 210 may generate symbols by performing entropy decoding on a bitstream based on a probability distribution on the bitstream. The generated symbols may include a symbol in the form of a quantized transform coefficient level. Here, the entropy decoding method may be similar to the entropy encoding method described above. For example, the entropy decoding method may be a reverse process of the entropy encoding method described above.

엔트로피 복호화부(210)는 양자화된 변환 계수 레벨을 복호화하기 위해 변환 계수 스캐닝 방법을 통해 1차원의 벡터의 형태의 계수를 2차원의 블록의 형태로 변경할 수 있다.The entropy decoder 210 may change a coefficient in the form of a one-dimensional vector into a form of a two-dimensional block through a transform coefficient scanning method in order to decode the quantized transform coefficient level.

예를 들면, 우상단 대각 스캔을 이용하여 블록의 계수들을 스캔함으로써 계수들이 2차원 블록 형태로 변경될 수 있다. 또는, 블록의 크기 및/또는 인트라 예측 모드에 따라 우상단 대각 스캔, 수직 스캔 및 수평 스캔 중 어떤 스캔이 사용될 것인지가 결정될 수 있다.For example, the coefficients may be changed into a 2D block shape by scanning the coefficients of the block using the upper right diagonal scan. Alternatively, it may be determined which of the upper-right diagonal scan, vertical scan, and horizontal scan will be used according to the size of the block and/or the intra prediction mode.
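The conversion between a two-dimensional block and a one-dimensional vector can be sketched for the up-right diagonal scan. The traversal assumed here walks each anti-diagonal from its bottom-left end to its top-right end, starting at the top-left corner of the block; the exact order used by a given codec, and the function names below, are assumptions of this illustration.

```python
def up_right_diagonal_order(size):
    """Positions (row, col) of a size x size block in up-right diagonal order."""
    order = []
    for d in range(2 * size - 1):                     # d = row + col of the anti-diagonal
        for row in range(min(d, size - 1), -1, -1):   # bottom-left toward top-right
            col = d - row
            if col < size:
                order.append((row, col))
    return order

def scan_block(block):
    """Encoder side: 2-D block of coefficients to 1-D vector."""
    return [block[r][c] for r, c in up_right_diagonal_order(len(block))]

def unscan_vector(vector, size):
    """Decoder side: 1-D vector of coefficients back to a 2-D block."""
    block = [[0] * size for _ in range(size)]
    for value, (r, c) in zip(vector, up_right_diagonal_order(size)):
        block[r][c] = value
    return block
```

Applying `unscan_vector` to the output of `scan_block` restores the original block, which is the property the decoder relies on.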

양자화된 계수는 역양자화부(220)에서 역양자화될 수 있다. 역양자화부(220)는 양자화된 계수에 대한 역양자화를 수행함으로써 역양자화된 계수를 생성할 수 있다. 또한, 역양자화된 계수는 역변환부(230)에서 역변환될 수 있다. 역변환부(230)는 역양자화된 계수에 대한 역변환을 수행함으로써 재구축된 잔차 블록을 생성할 수 있다. 양자화된 계수에 대한 역양자화 및 역변환이 수행된 결과로서, 재구축된 잔차 블록이 생성될 수 있다. 이때, 역양자화부(220)는 재구축된 잔차 블록을 생성함에 있어서 양자화된 계수에 양자화 행렬을 적용할 수 있다.The quantized coefficient may be inverse quantized by the inverse quantization unit 220. The inverse quantization unit 220 may generate an inverse quantized coefficient by performing inverse quantization on the quantized coefficient. In addition, the inverse quantized coefficient may be inversely transformed by the inverse transform unit 230. The inverse transform unit 230 may generate a reconstructed residual block by performing an inverse transform on an inverse quantized coefficient. As a result of performing inverse quantization and inverse transformation on the quantized coefficients, a reconstructed residual block may be generated. In this case, the inverse quantization unit 220 may apply a quantization matrix to the quantized coefficients in generating the reconstructed residual block.

인트라 모드가 사용되는 경우, 인트라 예측부(240)는 대상 블록 주변의 이미 복호화된 블록의 픽셀 값을 이용하는 공간적 예측을 수행함으로써 예측 블록을 생성할 수 있다.When the intra mode is used, the intra prediction unit 240 may generate a prediction block by performing spatial prediction using pixel values of an already decoded block around the target block.

인터 예측부(250)는 움직임 보상부를 포함할 수 있다. 또는, 인터 예측부(250)는 움직임 보상부로 명명될 수 있다.The inter prediction unit 250 may include a motion compensation unit. Alternatively, the inter prediction unit 250 may be referred to as a motion compensation unit.

인터 모드가 사용되는 경우, 움직임 보상부는 움직임 벡터 및 참조 픽처 버퍼(270)에 저장된 참조 영상을 이용하는 움직임 보상을 수행함으로써 예측 블록을 생성할 수 있다.When the inter mode is used, the motion compensation unit may generate a prediction block by performing motion compensation using a motion vector and a reference image stored in the reference picture buffer 270.

움직임 보상부는 움직임 벡터가 정수가 아닌 값을 가진 경우, 참조 영상 내의 일부 영역에 대해 보간 필터를 적용할 수 있고, 보간 필터가 적용된 참조 영상을 사용하여 예측 블록을 생성할 수 있다. 움직임 보상부는 움직임 보상을 수행하기 위해 CU를 기준으로 CU에 포함된 PU를 위해 사용되는 움직임 보상 방법이 스킵 모드, 머지 모드, AMVP 모드 및 현재 픽처 참조 모드 중 어떤 모드인가를 결정할 수 있고, 결정된 모드에 따라 움직임 보상을 수행할 수 있다.When the motion vector has a non-integer value, the motion compensation unit may apply an interpolation filter to a partial region of the reference image and may generate a prediction block using the reference image to which the interpolation filter has been applied. In order to perform motion compensation, the motion compensation unit may determine, on a CU basis, which of a skip mode, a merge mode, an AMVP mode, and a current picture reference mode is the motion compensation method used for a PU included in the CU, and may perform motion compensation according to the determined mode.
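Motion compensation with a non-integer motion vector can be sketched as follows. Bilinear interpolation is used here only because it is short; it is a stand-in, and practical codecs generally use longer interpolation filters. The function names, the clamping of coordinates at the picture border, and the sample values are assumptions of this illustration.

```python
def sample_bilinear(ref, y, x):
    """Bilinearly interpolated sample of ref at fractional position (y, x).

    Coordinates are clamped to the picture area (an assumption of this sketch).
    """
    h, w = len(ref), len(ref[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    fy, fx = y - y0, x - x0                     # fractional parts
    top = ref[y0][x0] * (1 - fx) + ref[y0][x1] * fx
    bottom = ref[y1][x0] * (1 - fx) + ref[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def motion_compensate(ref, top, left, height, width, mv_y, mv_x):
    """Prediction block for the block at (top, left), displaced by (mv_y, mv_x)."""
    return [
        [sample_bilinear(ref, top + r + mv_y, left + c + mv_x) for c in range(width)]
        for r in range(height)
    ]

reference = [[0, 10, 20], [30, 40, 50], [60, 70, 80]]   # illustrative reference picture
integer_pred = motion_compensate(reference, 0, 0, 2, 2, 0, 0)
half_pel_pred = motion_compensate(reference, 0, 0, 2, 2, 0.5, 0.5)
```

With an integer motion vector the prediction block is a plain copy of reference samples; with a half-sample vector each predicted sample is a weighted average of its four neighbours.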

재구축된 잔차 블록 및 예측 블록은 가산기(255)를 통해 더해질 수 있다. 가산기(255)는 재구축된 잔차 블록 및 예측 블록을 더함으로써 재구축된 블록을 생성할 수 있다.The reconstructed residual block and prediction block may be added through an adder 255. The adder 255 may generate a reconstructed block by adding the reconstructed residual block and the prediction block.

재구축된 블록은 필터부(260)를 거칠 수 있다. 필터부(260)는 디블록킹 필터, SAO 및 ALF 중 적어도 하나를 재구축된 블록 또는 재구축된 영상에 적용할 수 있다. 재구축된 영상은 재구축된 블록을 포함하는 픽처일 수 있다.The reconstructed block may pass through the filter unit 260. The filter unit 260 may apply at least one of the deblocking filter, SAO, and ALF to the reconstructed block or the reconstructed image. The reconstructed image may be a picture including a reconstructed block.

필터부(260)를 거친 재구축된 영상은 부호화 장치(100)에 의해 출력될 수 있으며, 부호화 장치(100)에 의해 사용될 수 있다.The reconstructed image that has passed through the filter unit 260 may be output by the encoding device 100 and may be used by the encoding device 100.

필터부(260)를 거친 재구축된 영상은 참조 픽처 버퍼(270)에 참조 픽처로서 저장될 수 있다. 필터부(260)를 거친 재구축된 블록은 참조 픽처의 일부일 수 있다. 말하자면, 참조 픽처는 필터부(260)를 거친 재구축된 블록들로 구성된 영상일 수 있다. 저장된 참조 픽처는 이후 인터 예측을 위해 사용될 수 있다.The reconstructed image that has passed through the filter unit 260 may be stored as a reference picture in the reference picture buffer 270. The reconstructed block that has passed through the filter unit 260 may be a part of the reference picture. In other words, the reference picture may be an image composed of reconstructed blocks that have passed through the filter unit 260. The stored reference picture can then be used for inter prediction.

도 3은 일 실시예에 따른 부호화 장치의 구조도이다.FIG. 3 is a structural diagram of an encoding apparatus according to an embodiment.

부호화 장치(300)는 전술된 부호화 장치(100)에 대응할 수 있다.The encoding device 300 may correspond to the encoding device 100 described above.

부호화 장치(300)는 버스(390)를 통하여 서로 통신하는 처리부(310), 메모리(330), 사용자 인터페이스(User Interface; UI) 입력 디바이스(350), UI 출력 디바이스(360) 및 저장소(storage)(340)를 포함할 수 있다. 또한, 부호화 장치(300)는 네트워크(399)에 연결되는 통신부(320)를 더 포함할 수 있다.The encoding device 300 may include a processing unit 310, a memory 330, a user interface (UI) input device 350, a UI output device 360, and a storage 340 that communicate with each other through a bus 390. In addition, the encoding apparatus 300 may further include a communication unit 320 connected to the network 399.

처리부(310)는 중앙 처리 장치(Central Processing Unit; CPU), 메모리(330) 또는 저장소(340)에 저장된 프로세싱(processing) 명령어(instruction)들을 실행하는 반도체 장치일 수 있다. 처리부(310)는 적어도 하나의 하드웨어 프로세서일 수 있다.The processing unit 310 may be a central processing unit (CPU) or a semiconductor device that executes processing instructions stored in the memory 330 or the storage 340. The processing unit 310 may be at least one hardware processor.

처리부(310)는 부호화 장치(300)로 입력되거나, 부호화 장치(300)에서 출력되거나, 부호화 장치(300)의 내부에서 사용되는 신호, 데이터 또는 정보의 생성 및 처리를 수행할 수 있고, 신호, 데이터 또는 정보에 관련된 검사, 비교 및 판단 등을 수행할 수 있다. 말하자면, 실시예에서 데이터 또는 정보의 생성 및 처리와, 데이터 또는 정보에 관련된 검사, 비교 및 판단은 처리부(310)에 의해 수행될 수 있다.The processing unit 310 may generate and process signals, data, or information that are input to the encoding device 300, output from the encoding device 300, or used inside the encoding device 300, and may perform inspection, comparison, and determination related to the signals, data, or information. That is, in the embodiment, the generation and processing of data or information, and the inspection, comparison, and determination related to the data or information, may be performed by the processing unit 310.

처리부(310)는 인터 예측부(110), 인트라 예측부(120), 스위치(115), 감산기(125), 변환부(130), 양자화부(140), 엔트로피 부호화부(150), 역양자화부(160), 역변환부(170), 가산기(175), 필터부(180) 및 참조 픽처 버퍼(190)를 포함할 수 있다.The processing unit 310 may include an inter prediction unit 110, an intra prediction unit 120, a switch 115, a subtractor 125, a transform unit 130, a quantization unit 140, an entropy encoding unit 150, an inverse quantization unit 160, an inverse transform unit 170, an adder 175, a filter unit 180, and a reference picture buffer 190.

인터 예측부(110), 인트라 예측부(120), 스위치(115), 감산기(125), 변환부(130), 양자화부(140), 엔트로피 부호화부(150), 역양자화부(160), 역변환부(170), 가산기(175), 필터부(180) 및 참조 픽처 버퍼(190) 중 적어도 일부는 프로그램 모듈들일 수 있으며, 외부의 장치 또는 시스템과 통신할 수 있다. 프로그램 모듈들은 운영 체제, 응용 프로그램 모듈 및 기타 프로그램 모듈의 형태로 부호화 장치(300)에 포함될 수 있다.At least some of the inter prediction unit 110, the intra prediction unit 120, the switch 115, the subtractor 125, the transform unit 130, the quantization unit 140, the entropy encoding unit 150, the inverse quantization unit 160, the inverse transform unit 170, the adder 175, the filter unit 180, and the reference picture buffer 190 may be program modules and may communicate with an external device or system. The program modules may be included in the encoding apparatus 300 in the form of an operating system, an application program module, and other program modules.

프로그램 모듈들은 물리적으로는 여러 가지 공지의 기억 장치 상에 저장될 수 있다. 또한, 이러한 프로그램 모듈 중 적어도 일부는 부호화 장치(300)와 통신 가능한 원격 기억 장치에 저장될 수도 있다.Program modules may be physically stored on various known storage devices. In addition, at least some of these program modules may be stored in a remote storage device capable of communicating with the encoding device 300.

프로그램 모듈들은 일 실시예에 따른 기능 또는 동작을 수행하거나, 일 실시예에 따른 추상 데이터 유형을 구현하는 루틴(routine), 서브루틴(subroutine), 프로그램, 오브젝트(object), 컴포넌트(component) 및 데이터 구조(data structure) 등을 포괄할 수 있지만, 이에 제한되지는 않는다.The program modules may encompass, but are not limited to, routines, subroutines, programs, objects, components, and data structures that perform functions or operations according to an embodiment or implement abstract data types according to an embodiment.

프로그램 모듈들은 부호화 장치(300)의 적어도 하나의 프로세서(processor)에 의해 수행되는 명령어(instruction) 또는 코드(code)로 구성될 수 있다.The program modules may be composed of instructions or codes executed by at least one processor of the encoding apparatus 300.

처리부(310)는 인터 예측부(110), 인트라 예측부(120), 스위치(115), 감산기(125), 변환부(130), 양자화부(140), 엔트로피 부호화부(150), 역양자화부(160), 역변환부(170), 가산기(175), 필터부(180) 및 참조 픽처 버퍼(190)의 명령어 또는 코드를 실행할 수 있다.The processing unit 310 may execute the instructions or code of the inter prediction unit 110, the intra prediction unit 120, the switch 115, the subtractor 125, the transform unit 130, the quantization unit 140, the entropy encoding unit 150, the inverse quantization unit 160, the inverse transform unit 170, the adder 175, the filter unit 180, and the reference picture buffer 190.

저장부는 메모리(330) 및/또는 저장소(340)를 나타낼 수 있다. 메모리(330) 및 저장소(340)는 다양한 형태의 휘발성 또는 비휘발성 저장 매체일 수 있다. 예를 들면, 메모리(330)는 롬(ROM)(331) 및 램(RAM)(332) 중 적어도 하나를 포함할 수 있다.The storage unit may represent the memory 330 and/or the storage 340. The memory 330 and the storage 340 may be various types of volatile or nonvolatile storage media. For example, the memory 330 may include at least one of a ROM 331 and a RAM 332.

저장부는 부호화 장치(300)의 동작을 위해 사용되는 데이터 또는 정보를 저장할 수 있다. 실시예에서, 부호화 장치(300)가 갖는 데이터 또는 정보는 저장부 내에 저장될 수 있다.The storage unit may store data or information used for the operation of the encoding device 300. In an embodiment, data or information of the encoding device 300 may be stored in the storage unit.

예를 들면, 저장부는 픽처, 블록, 리스트, 움직임 정보, 인터 예측 정보 및 비트스트림 등을 저장할 수 있다.For example, the storage unit may store pictures, blocks, lists, motion information, inter prediction information, and bitstreams.

부호화 장치(300)는 컴퓨터에 의해 독출(read)될 수 있는 기록 매체를 포함하는 컴퓨터 시스템에서 구현될 수 있다.The encoding apparatus 300 may be implemented in a computer system including a recording medium that can be read by a computer.

기록 매체는 부호화 장치(300)가 동작하기 위해 요구되는 적어도 하나의 모듈을 저장할 수 있다. 메모리(330)는 적어도 하나의 모듈을 저장할 수 있고, 적어도 하나의 모듈이 처리부(310)에 의하여 실행되도록 구성될 수 있다.The recording medium may store at least one module required for the encoding apparatus 300 to operate. The memory 330 may store at least one module, and at least one module may be configured to be executed by the processing unit 310.

부호화 장치(300)의 데이터 또는 정보의 통신과 관련된 기능은 통신부(320)를 통해 수행될 수 있다.A function related to communication of data or information of the encoding device 300 may be performed through the communication unit 320.

예를 들면, 통신부(320)는 비트스트림을 후술될 복호화 장치(400)로 전송할 수 있다.For example, the communication unit 320 may transmit the bitstream to the decoding device 400 to be described later.

도 4은 일 실시예에 따른 복호화 장치의 구조도이다.FIG. 4 is a structural diagram of a decoding apparatus according to an embodiment.

복호화 장치(400)는 전술된 복호화 장치(200)에 대응할 수 있다.The decoding device 400 may correspond to the decoding device 200 described above.

복호화 장치(400)는 버스(490)를 통하여 서로 통신하는 처리부(410), 메모리(430), 사용자 인터페이스(User Interface; UI) 입력 디바이스(450), UI 출력 디바이스(460) 및 저장소(storage)(440)를 포함할 수 있다. 또한, 복호화 장치(400)는 네트워크(499)에 연결되는 통신부(420)를 더 포함할 수 있다.The decoding apparatus 400 may include a processing unit 410, a memory 430, a user interface (UI) input device 450, a UI output device 460, and a storage 440 that communicate with each other through a bus 490. In addition, the decoding apparatus 400 may further include a communication unit 420 connected to the network 499.

처리부(410)는 중앙 처리 장치(Central Processing Unit; CPU), 메모리(430) 또는 저장소(440)에 저장된 프로세싱(processing) 명령어(instruction)들을 실행하는 반도체 장치일 수 있다. 처리부(410)는 적어도 하나의 하드웨어 프로세서일 수 있다.The processing unit 410 may be a central processing unit (CPU) or a semiconductor device that executes processing instructions stored in the memory 430 or the storage 440. The processing unit 410 may be at least one hardware processor.

처리부(410)는 복호화 장치(400)로 입력되거나, 복호화 장치(400)에서 출력되거나, 복호화 장치(400)의 내부에서 사용되는 신호, 데이터 또는 정보의 생성 및 처리를 수행할 수 있고, 신호, 데이터 또는 정보에 관련된 검사, 비교 및 판단 등을 수행할 수 있다. 말하자면, 실시예에서 데이터 또는 정보의 생성 및 처리와, 데이터 또는 정보에 관련된 검사, 비교 및 판단은 처리부(410)에 의해 수행될 수 있다.The processing unit 410 may generate and process signals, data, or information that are input to the decoding device 400, output from the decoding device 400, or used inside the decoding device 400, and may perform inspection, comparison, and determination related to the signals, data, or information. That is, in the embodiment, the generation and processing of data or information, and the inspection, comparison, and determination related to the data or information, may be performed by the processing unit 410.

처리부(410)는 엔트로피 복호화부(210), 역양자화부(220), 역변환부(230), 인트라 예측부(240), 인터 예측부(250), 스위치(245), 가산기(255), 필터부(260) 및 참조 픽처 버퍼(270)를 포함할 수 있다.The processing unit 410 may include an entropy decoding unit 210, an inverse quantization unit 220, an inverse transform unit 230, an intra prediction unit 240, an inter prediction unit 250, a switch 245, an adder 255, a filter unit 260, and a reference picture buffer 270.

엔트로피 복호화부(210), 역양자화부(220), 역변환부(230), 인트라 예측부(240), 인터 예측부(250), 스위치(245), 가산기(255), 필터부(260) 및 참조 픽처 버퍼(270) 중 적어도 일부는 프로그램 모듈들일 수 있으며, 외부의 장치 또는 시스템과 통신할 수 있다. 프로그램 모듈들은 운영 체제, 응용 프로그램 모듈 및 기타 프로그램 모듈의 형태로 복호화 장치(400)에 포함될 수 있다.At least some of the entropy decoding unit 210, the inverse quantization unit 220, the inverse transform unit 230, the intra prediction unit 240, the inter prediction unit 250, the switch 245, the adder 255, the filter unit 260, and the reference picture buffer 270 may be program modules and may communicate with an external device or system. The program modules may be included in the decoding device 400 in the form of an operating system, an application program module, and other program modules.

프로그램 모듈들은 물리적으로는 여러 가지 공지의 기억 장치 상에 저장될 수 있다. 또한, 이러한 프로그램 모듈 중 적어도 일부는 복호화 장치(400)와 통신 가능한 원격 기억 장치에 저장될 수도 있다.Program modules may be physically stored on various known storage devices. In addition, at least some of these program modules may be stored in a remote storage device capable of communicating with the decoding device 400.

프로그램 모듈들은 복호화 장치(400)의 적어도 하나의 프로세서(processor)에 의해 수행되는 명령어(instruction) 또는 코드(code)로 구성될 수 있다.The program modules may be composed of instructions or codes executed by at least one processor of the decoding apparatus 400.

처리부(410)는 엔트로피 복호화부(210), 역양자화부(220), 역변환부(230), 인트라 예측부(240), 인터 예측부(250), 스위치(245), 가산기(255), 필터부(260) 및 참조 픽처 버퍼(270)의 명령어 또는 코드를 실행할 수 있다.The processing unit 410 may execute the instructions or code of the entropy decoding unit 210, the inverse quantization unit 220, the inverse transform unit 230, the intra prediction unit 240, the inter prediction unit 250, the switch 245, the adder 255, the filter unit 260, and the reference picture buffer 270.

저장부는 메모리(430) 및/또는 저장소(440)를 나타낼 수 있다. 메모리(430) 및 저장소(440)는 다양한 형태의 휘발성 또는 비휘발성 저장 매체일 수 있다. 예를 들면, 메모리(430)는 롬(ROM)(431) 및 램(RAM)(432) 중 적어도 하나를 포함할 수 있다.The storage unit may represent the memory 430 and/or the storage 440. The memory 430 and the storage 440 may be various types of volatile or nonvolatile storage media. For example, the memory 430 may include at least one of a ROM 431 and a RAM 432.

저장부는 복호화 장치(400)의 동작을 위해 사용되는 데이터 또는 정보를 저장할 수 있다. 실시예에서, 복호화 장치(400)가 갖는 데이터 또는 정보는 저장부 내에 저장될 수 있다.The storage unit may store data or information used for the operation of the decoding apparatus 400. In an embodiment, data or information of the decoding apparatus 400 may be stored in the storage unit.

복호화 장치(400)는 컴퓨터에 의해 독출(read)될 수 있는 기록 매체를 포함하는 컴퓨터 시스템에서 구현될 수 있다.The decoding apparatus 400 may be implemented in a computer system including a recording medium that can be read by a computer.

기록 매체는 복호화 장치(400)가 동작하기 위해 요구되는 적어도 하나의 모듈을 저장할 수 있다. 메모리(430)는 적어도 하나의 모듈을 저장할 수 있고, 적어도 하나의 모듈이 처리부(410)에 의하여 실행되도록 구성될 수 있다.The recording medium may store at least one module required for the decoding apparatus 400 to operate. The memory 430 may store at least one module, and at least one module may be configured to be executed by the processing unit 410.

복호화 장치(400)의 데이터 또는 정보의 통신과 관련된 기능은 통신부(420)를 통해 수행될 수 있다.A function related to communication of data or information of the decoding device 400 may be performed through the communication unit 420.

예를 들면, 통신부(420)는 부호화 장치(300)로부터 비트스트림을 수신할 수 있다.For example, the communication unit 420 may receive a bitstream from the encoding device 300.

인지 특성에 기반한 영상 압축Image compression based on cognitive characteristics

후술될 실시예에서는 사람의 인지 특성을 이용하여 영상 부호화에 있어서 인지적으로 불필요한 인지 중복성을 감소시키는 방식을 통해 인지 화질의 저하 없이 압축율을 향상시키는 방법이 설명된다. 또한, 후술될 실시예에서는 인지 중복성을 감소시킴으로써 절약된 코딩 비트(coding bit)를 이용하여 인지 화질을 향상시키는 방법이 설명된다.In an embodiment to be described later, a method of improving the compression rate without deteriorating the perceived image quality by reducing cognitively unnecessary cognitive redundancy in image encoding by using human cognitive characteristics will be described. In addition, in an embodiment to be described later, a method of improving the perceived image quality using coding bits saved by reducing cognitive redundancy will be described.

실시예에서는 원본 영상 및 대상 영상 간의 차이를 정의하는 제1 JND 뿐만 아니라 영상의 압축 과정에서 생성되는 재구축된 영상 및 대상 영상 간의 차이를 정의할 수 있는 다양한 레벨들의 JND들이 정의될 수 있다. 이러한 정의에 따라, 영상의 압축 과정에서 재구축된 영상이 JND 레벨 1에서 정의된 임계치를 넘어서는 경우에도 다른 JND 레벨에서 정의된 JND 임계치를 사용함으로써 압축률이 향상될 수 있다.In an embodiment, not only a first JND defining a difference between the source image and the target image, but also JNDs of various levels capable of defining a difference between the reconstructed image generated in the process of compressing the image and the target image may be defined. According to this definition, even when an image reconstructed in the process of compressing an image exceeds the threshold defined at JND level 1, the compression rate can be improved by using the JND threshold defined at another JND level.

실시예에서는 인지 중복성을 감소시킴으로써 절약된 코딩 비트를 이용하여 인지 화질이 향상될 수 있다.In an embodiment, cognitive quality may be improved by using the saved coding bits by reducing cognitive redundancy.

또한, 실시예에 따르면 영상의 압축 과정에서 인지 중복성 제거를 통해 인지 화질의 저하 없이 압축률이 향상될 수 있다.In addition, according to an embodiment, the compression rate may be improved without deteriorating the perceived image quality by removing cognitive redundancy in the process of compressing an image.

도 5는 일 예에 따른 JND 포지션 및 JND 구간을 설명한다.FIG. 5 illustrates a JND position and a JND interval according to an example.

일반적으로 영상의 화질의 왜곡에 대한 사람의 인지 시각 특성에 따르면, 도 5에서 도시된 것과 같이 특정한 비트레이트 구간(range) 내에서는 왜곡 값이 변하여도 사람은 동일한 왜곡 또는 동일한 화질의 영상으로 인지한다.In general, according to the human visual perception of image quality distortion, as shown in FIG. 5, within a specific bitrate range a person perceives the image as having the same distortion or the same quality even if the distortion value changes.

JND 포지션(position)은 일반적으로 사람이 비교 대상과의 차이를 느끼기 시작하는 지점을 나타낸다.The JND position generally represents the point at which a person begins to feel the difference from the object being compared.

제1 JND 포지션은 사람이 원본 영상에 비해 차이 또는 변화를 느끼기 시작하는 지점을 의미한다.The first JND position refers to a point at which a person starts to feel a difference or change compared to the original image.

제2 JND 포지션은 사람이 제1 JND 포지션에서의 영상에 비해 다시 차이 또는 변화를 느끼는 지점을 의미한다.The second JND position refers to a point at which a person feels a difference or change again compared to the image at the first JND position.

말하자면, 제1 JND 포지션 및 제2 JND 포지션 등은 사람이 인지 왜곡의 변화를 느끼는 지점을 의미할 수 있다.In other words, the first JND position and the second JND position may mean a point at which a person feels a change in cognitive distortion.

JND 구간은 왜곡 값이 변하더라도 사람은 동일한 영상이라고 인식하는 왜곡 값의 구간일 수 있다. 말하자면, 하나의 JND 구간 내에서는 영상의 비트레이트나 왜곡 값이 변하더라도 사람은 동일한 화질의 영상으로 인지한다.The JND section may be a section of a distortion value that a person recognizes as the same image even if the distortion value changes. In other words, even if the bit rate or distortion value of the image changes within one JND section, a person perceives it as an image of the same quality.

JND 포지션은 영상의 특성에 따라 다르게 나타난다. 제1 JND 포지션에 대한 모델링이 널리 이루어진 바 있다.The JND position appears differently depending on the characteristics of the video. Modeling for the first JND position has been widely conducted.

도 6은 일 예에 따른 JND 임계치를 이용한 양자화를 나타낸다.FIG. 6 shows quantization using a JND threshold according to an example.

JND 임계치는 원본 영상 또는 비교 대상 영상을 특정 JND 포지션까지 이르게 하는 왜곡 값을 의미할 수 있다.The JND threshold may mean a distortion value that causes the original image or the comparison target image to reach a specific JND position.

JND 임계치는 입력 영상의 공간 영역 또는 주파수 영역에서 특정 왜곡 값이 감해지거나 가해졌을 경우, 대부분의 사람들이 인지 화질의 차이를 느끼지 못하는 크기의 왜곡 값일 수 있다.The JND threshold may be the magnitude of distortion that, when subtracted from or added to the input image in the spatial or frequency domain, leaves most people unable to perceive a difference in image quality.

예를 들어, 도 6에서 도시된 것과 같이, 영상에 제1 JND 임계치에 해당하는 값이 감해지거나 가해졌을 경우, 영상의 왜곡이 제1 JND 포지션에 이를 수 있다. 말하자면, 제1 JND 임계치는 영상으로부터 감해지거나, 영상에 가해졌을 때 영상의 왜곡이 제1 JND 포지션에 도달하게 하는 값일 수 있다.For example, as illustrated in FIG. 6, when a value corresponding to the first JND threshold is subtracted or applied to the image, distortion of the image may reach the first JND position. That is, the first JND threshold may be a value that causes the distortion of the image to reach the first JND position when it is subtracted from the image or applied to the image.

즉, 도 6에서 도시된 것과 같이, 영상에 제1 JND 임계치보다 더 작은 값이 감해지거나 가해졌을 경우, 영상에 대해 전통 방식으로 측정된 왜곡 값은 변할 수 있지만, 영상의 인지 왜곡 값은 변하지 않을 수 있다.That is, as shown in FIG. 6, when a value smaller than the first JND threshold is subtracted from or added to the image, the distortion value measured for the image in a traditional manner may change, but the perceived distortion of the image may not change.

예를 들면, 전통 방식은 절대 차이 합(Sum of Absolute Difference; SAD) 및 평균 제곱 오차(Mean Squared Error; MSE) 등을 포함할 수 있다.For example, the traditional measures may include the sum of absolute differences (SAD) and the mean squared error (MSE).

JND 임계치는 영상에 대한 인지 특성에 따라 변할 수 있다. 압축에 사용되는 JND 임계치 모델은 주로 입력 영상으로부터 특정 왜곡을 빼서 정보(말하자면, 인지 중복성)를 감소시키는 방식으로 만들어지며, 주파수 영역에서 주파수 값의 크기를 JND 임계치만큼 감소시켜 영상의 화질은 유지하면서 부호화된 영상을 위한 부호화 비트를 감소시키는 방법이 사용될 수 있다.The JND threshold may vary depending on the cognitive characteristics of the image. A JND threshold model used for compression is typically built to reduce information (that is, perceptual redundancy) by subtracting a specific distortion from the input image. A method may be used that reduces the magnitude of frequency values in the frequency domain by the JND threshold, thereby reducing the coding bits for the encoded image while maintaining image quality.
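The frequency-domain suppression described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function name `suppress_coefficients`, the per-coefficient threshold matrix, and the choice to clamp a coefficient to zero when its magnitude does not exceed the threshold are all assumptions.

```python
def suppress_coefficients(coeffs, jnd_thresholds):
    """Reduce the magnitude of each frequency coefficient by its JND threshold.

    Assumption: a coefficient whose magnitude does not exceed the threshold
    is set to zero rather than allowed to change sign.
    """
    suppressed = []
    for coeff_row, thr_row in zip(coeffs, jnd_thresholds):
        out_row = []
        for c, t in zip(coeff_row, thr_row):
            magnitude = abs(c) - t
            if magnitude <= 0.0:
                out_row.append(0.0)
            else:
                out_row.append(magnitude if c > 0 else -magnitude)
        suppressed.append(out_row)
    return suppressed


# Hypothetical 2x2 coefficient block and per-position JND thresholds.
block = [[52.0, -3.0], [4.5, -0.5]]
thresholds = [[2.0, 4.0], [4.0, 6.0]]
print(suppress_coefficients(block, thresholds))
```

Coefficients that fall to zero no longer need to be coded, which is where the bit saving in this sketch comes from.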

도 7은 일 예에 따른 주파수 영역에서의 JND 임계치를 나타낸다.FIG. 7 shows a JND threshold in the frequency domain according to an example.

도 7은 주파수 영역에서의 JND 임계치들을 예시한다. 영상에서 영상의 주파수 계수에 대응하는 JND 임계치가 감해지더라도 시청자는 입력 영상에서의 인지 화질 변화를 거의 느낄 수 없다.FIG. 7 illustrates JND thresholds in the frequency domain. Even if the JND threshold corresponding to each frequency coefficient of the image is subtracted from the image, the viewer can hardly perceive a change in the quality of the input image.

전술된 것과 같이, 영상 압축에서는 일반적으로 원본 영상 및 압축된 영상 간의 인지 화질 차이를 최소화하는 제1 JND 임계치 만이 사용될 수 있다.As described above, in image compression, in general, only the first JND threshold that minimizes the difference in perceived quality between the original image and the compressed image may be used.

영상 압축 과정에서의 양자화 파라미터(Quantization Parameter; QP)의 값이 제1 JND 임계치를 넘는 경우, 재구축된 영상의 화질 및 원본 영상의 화질 간의 차이가 발생하게 되고, 제1 JND 임계치의 효용이 없어질 수 있다.When the value of the quantization parameter (QP) in the image compression process exceeds the first JND threshold, a difference arises between the quality of the reconstructed image and the quality of the original image, and the first JND threshold may lose its usefulness.

이러한 경우에도, 재구축된 영상의 화질이 최종적인 영상 화질의 기준이 된다면, 재구축된 영상의 화질이 속하는 JND 구간이 도출되고, 도출된 JND 구간에 대한 JND 임계치를 사용하는 추가의 양자화가 영상의 압축에 적용될 수 있다. 이러한 압축을 통해 압축율이 향상될 수 있다.Even in this case, if the quality of the reconstructed image serves as the reference for the final image quality, the JND interval to which the quality of the reconstructed image belongs may be derived, and additional quantization using the JND threshold for the derived JND interval may be applied to the compression of the image. Such compression can improve the compression ratio.

실시예들에서는 사람의 인지 특성을 이용하여 제1 JND 임계치 및 제1 JND 포지션뿐만 아니라 제2 레벨 및 제3 레벨과 같은 다양한 레벨들의 JND 임계치 및 JND 포지션을 모델링하는 방법과, 입력 영상 및 재구축된 영상의 JND 구간을 결정하는 방법이 제시된다.In the embodiments, a method of using human cognitive characteristics to model JND thresholds and JND positions of various levels, such as a second level and a third level, in addition to the first JND threshold and the first JND position, and a method of determining the JND interval of an input image and a reconstructed image, are presented.

또한, 실시예들에서는 압축된 영상에 대한 인지 화질의 향상이 설명된다.In addition, in the embodiments, an improvement in perceived quality of a compressed image is described.

실시예들에서는, 인지 민감 영역 및 화질 열화 영역의 검출을 통해, 인지 화질의 향상이 요구되는 영역이 결정되며, 인지 민감 영역 및 화질 열화 영역이 나뉘고, 각 영역들을 결정하는 데 있어 요구되는 다양한 인지 요소들이 설명된다.In the embodiments, a region requiring improvement of perceived image quality is determined through detection of a cognitively sensitive region and an image quality deterioration region. The cognitively sensitive region and the image quality deterioration region are distinguished, and the various cognitive factors required to determine each region are described.

실시예들에서는, 인지 화질을 향상시키는 기법을 적용하는 방법이 설명된다. JND 양자화를 통해 절약된 코딩 비트를 이용하여 인지 화질이 향상될 수 있다. 부호화에 있어서 인지 화질을 향상시키는 방법들이 설명된다.In the embodiments, a method of applying a technique for improving perceived image quality is described. The perceived image quality can be improved by using the coding bits saved through JND quantization. Methods for improving the perceived image quality in encoding are described.

도 8은 일 실시예에 따른 영상 처리 방법의 흐름도이다.FIG. 8 is a flowchart of an image processing method according to an embodiment.

실시예의 영상 처리 방법은 인지 특성에 기반한 영상 부호화 방법 또는 인지 특성에 기반한 영상 압축 방법으로 간주될 수 있다.The image processing method of the embodiment may be regarded as an image encoding method based on cognitive characteristics or an image compression method based on cognitive characteristics.

처리부(310)는 양자화부(140)일 수 있다.The processing unit 310 may be a quantization unit 140.

단계들(810 및 820)에서, 처리부(310)는 입력 영상에 대하여 JND에 기반한 양자화를 수행할 수 있다.In steps 810 and 820, the processing unit 310 may perform JND-based quantization on the input image.

단계(810)에서, 처리부(310)는 다중 레벨 JND 임계치들 및 다중 레벨 JND 구간들에 대한 모델링을 수행할 수 있다.In step 810, the processor 310 may perform modeling on the multilevel JND thresholds and the multilevel JND intervals.

다중 레벨 JND 구간들은 제1 JND뿐만 아니라 제2 JND 및 제3 JND 등과 같은 복수의 JND들에 대한 복수의 JND 구간들을 의미할 수 있다.The multi-level JND intervals may mean a plurality of JND intervals for a plurality of JNDs, such as a second JND and a third JND, as well as the first JND.

다중 레벨 JND 임계치들은 복수의 JND들에 대한 복수의 구간들 또는 JND 포지션들에 이르게 하는 복수의 임계치들을 의미할 수 있다.The multilevel JND thresholds may mean a plurality of thresholds leading to a plurality of intervals or JND positions for a plurality of JNDs.

JND 포지션 및 JND 구간은 주관적 화질 평가를 통해 사람들이 화질의 열화를 인지하는 시점을 토대로 결정될 수 있고, JND 임계치는 JND 포지션이나 JND 구간에 속하는 영상 및 원본 영상 간의 차이를 모델링하는 형태로 만들어질 수 있다.The JND position and the JND interval may be determined based on the point at which people perceive deterioration of image quality in a subjective image quality evaluation, and the JND threshold may be constructed by modeling the difference between the original image and an image belonging to a JND position or JND interval.

실시예에서, JND 임계치는 영상에 대한 사람의 인지 특성을 고려하여 결정될 수 있고, JND 임계치의 크기는 영상의 특성에 따라서 변할 수 있다.In an embodiment, the JND threshold may be determined in consideration of a person's perception characteristic of an image, and the size of the JND threshold may vary according to the characteristics of the image.

일 실시예에서, 처리부(310)는 1) 주관적 화질 평가 및 수학적 모델링 및 2) 기계 학습 중 적어도 하나를 이용하여 다중 레벨 JND 임계치들 및 다중 레벨 JND 구간들에 대한 모델링을 수행함으로써 JND 모델을 결정할 수 있다.In an embodiment, the processing unit 310 may determine a JND model by modeling the multi-level JND thresholds and the multi-level JND intervals using at least one of 1) subjective image quality evaluation and mathematical modeling, and 2) machine learning.

일 실시예에서, 처리부(310)는 주관적 화질 평가 및 수학적 모델링을 이용하여 다중 레벨 JND 임계치들 및 다중 레벨 JND 구간들에 대한 모델링을 수행함으로써 JND 모델을 결정할 수 있다.In an embodiment, the processor 310 may determine the JND model by performing modeling on multilevel JND thresholds and multilevel JND intervals using subjective image quality evaluation and mathematical modeling.

일 실시예에서, 처리부(310)는 기계 학습을 이용하여 다중 레벨 JND 임계치들 및 다중 레벨 JND 구간들에 대한 모델링을 수행함으로써 JND 모델을 결정할 수 있다.In an embodiment, the processor 310 may determine the JND model by performing modeling on multilevel JND thresholds and multilevel JND intervals using machine learning.

단계(820)에서, 처리부(310)는 결정된 JND 모델을 사용하여 입력 영상에 대한 감축(suppression)을 수행할 수 있다.In step 820, the processor 310 may perform a suppression on the input image using the determined JND model.

단계(820)에서, 처리부(310)는 입력 영상의 인지 특성을 분석할 수 있다.In step 820, the processing unit 310 may analyze the cognitive characteristics of the input image.

입력 영상은 재구축된 영상일 수 있다.The input image may be a reconstructed image.

단계(820)에서, 처리부(310)는 입력 영상 및 원본 영상 간의 차이 값을 도출할 수 있다.In step 820, the processor 310 may derive a difference value between the input image and the original image.

일 실시예에서, 처리부(310)는 입력 영상의 픽셀들 및 원본 영상의 픽셀들 간의 차이 값들의 합을 이용하여 입력 영상 및 원본 영상 간의 차이 값을 계산할 수 있다.In an embodiment, the processor 310 may calculate a difference value between the input image and the original image by using a sum of difference values between pixels of the input image and pixels of the original image.

일 실시예에서, 처리부(310)는 입력 영상의 픽셀들 및 원본 영상의 픽셀들 간의 차이 값들의 가중치가 부여된 합(weighted-sum)을 이용하여 입력 영상 및 원본 영상 간의 차이 값을 계산할 수 있다.In an embodiment, the processing unit 310 may calculate the difference value between the input image and the original image using a weighted sum of the difference values between the pixels of the input image and the pixels of the original image.

처리부(310)는 도출된 입력 영상 및 원본 영상 간의 차이 값을 사용하여 입력 영상이 단계(810)에서 모델링된 다중 레벨 JND 구간들 및 다중 레벨 JND 임계치들 중 어느 JND 구간 및 어느 JND 임계치에 해당하는지를 결정할 수 있다.Using the derived difference value between the input image and the original image, the processing unit 310 may determine which JND interval and which JND threshold, among the multi-level JND intervals and multi-level JND thresholds modeled in step 810, the input image corresponds to.

입력 영상에 대한 JND 구간을 결정함에 있어서, 입력 영상의 인지 특성에 따라서 입력 영상에 해당하는 JND 구간 및 JND 임계치가 달라질 수 있다.In determining the JND section for the input image, the JND section and the JND threshold corresponding to the input image may vary according to the cognitive characteristics of the input image.

예를 들면, 2 개의 복원 영상들의 원본 영상과의 차이 값들이 동일한 경우에도, 입력 영상들의 인지 특성들에 따라 입력 영상들에 대응하는 JND 구간들 및 JND 임계치들이 서로 다를 수 있음.For example, even when two reconstructed images have the same difference values from the original image, JND intervals and JND thresholds corresponding to the input images may be different according to cognitive characteristics of the input images.
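The interval lookup described above can be sketched as a search over an ascending list of JND thresholds. The function name `jnd_interval`, the threshold values, and the convention that a difference equal to a threshold already belongs to the next interval are illustrative assumptions.

```python
from bisect import bisect_right


def jnd_interval(difference, thresholds):
    """Return the 1-based JND interval that contains `difference`.

    `thresholds` is an ascending list [T1, T2, ...]. A difference below T1
    falls in interval 1; a difference of at least T1 but below T2 falls in
    interval 2, and so on (boundary convention assumed).
    """
    return bisect_right(thresholds, difference) + 1


# Hypothetical thresholds for two images with different cognitive characteristics:
# a heavily textured image tolerates more distortion than a flat one.
textured_thresholds = [30.0, 60.0, 90.0]
flat_thresholds = [10.0, 25.0, 45.0]

same_difference = 20.0
print(jnd_interval(same_difference, textured_thresholds))
print(jnd_interval(same_difference, flat_thresholds))
```

In this sketch the same difference value lands in the first interval for the textured image and in the second interval for the flat image, which mirrors the case described above where equal difference values map to different JND intervals.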

일 실시예에서, 처리부(310)는 입력 영상의 대비 민감도 특성을 사용하여 입력 영상에 대응하는 JND 구간 및 JND 임계치를 결정할 수 있다.In an embodiment, the processor 310 may determine a JND section and a JND threshold corresponding to the input image by using the contrast sensitivity characteristic of the input image.

일 실시예에서, 처리부(310)는 입력 영상의 마스킹 특성을 사용하여 입력 영상에 대응하는 JND 구간 및 JND 임계치를 결정할 수 있다.In an embodiment, the processor 310 may determine a JND section and a JND threshold corresponding to the input image by using the masking characteristic of the input image.

일 실시예에서, 처리부(310)는 입력 영상의 주의 및 집중 특성을 사용하여 입력 영상에 대응하는 JND 구간 및 JND 임계치를 결정할 수 있다.In an embodiment, the processor 310 may determine a JND section and a JND threshold corresponding to the input image by using the attention and concentration characteristics of the input image.

일 실시예에서, 처리부(310)는 입력 영상의 경계(edge) 및 텍스처 정보를 사용하여 입력 영상에 대응하는 JND 구간 및 JND 임계치를 결정할 수 있다.In an embodiment, the processor 310 may determine a JND section and a JND threshold corresponding to the input image using edge and texture information of the input image.

일 실시예에서, 처리부(310)는 결정된 JND 모델을 이용하여 감축을 수행함에 있어서, 원본인 입력 영상에 대한 전처리를 통한 감축 및 잔차 영상에 대한 감축 중 적어도 하나를 이용할 수 있다.In an embodiment, in performing reduction using the determined JND model, the processor 310 may use at least one of reduction through pre-processing of an original input image and reduction of a residual image.

일 실시예에서, 처리부(310)는 결정된 JND 모델을 이용하여 감축을 수행함에 있어서, 원본인 입력 영상에 대한 전처리를 통한 감축을 이용할 수 있다.In an embodiment, in performing the reduction using the determined JND model, the processor 310 may use the reduction through pre-processing of the original input image.

일 실시예에서, 처리부(310)는 결정된 JND 모델을 이용하여 감축을 수행함에 있어서, 잔차 영상에 대한 감축을 이용할 수 있다.In an embodiment, in performing the reduction using the determined JND model, the processing unit 310 may use reduction of the residual image.

처리부(310)는 결정된 JND 임계치를 이용하여 입력 영상에 대한 양자화를 수행할 수 있다.The processing unit 310 may perform quantization on the input image using the determined JND threshold.

상기의 양자화는 재구축된 영상인 입력 영상에 대한 추가 양자화일 수 있다.The above quantization may be additional quantization for an input image that is a reconstructed image.

추가 양자화를 수행함에 따라 입력 영상의 왜곡 값은 입력 영상의 다음의 레벨의 JND 포지션으로 이동할 수 있고, 이러한 이동을 통해 인지 화질은 유지하면서 비트레이트는 감소될 수 있다.As additional quantization is performed, the distortion value of the input image may be moved to the JND position of the next level of the input image, and through this movement, the bit rate may be reduced while maintaining the perceived image quality.

전술된 양자화로 생성된 재구축된 영상에 대한 복호화가 복호화 장치(200)에 의해 수행될 수 있다. 또한, 전술된 양자화에 대응하는 역양자화가 복호화 장치(200)의 역양자화부(220)에 의해 이루어질 수 있다.The decoding of the reconstructed image generated by the above-described quantization may be performed by the decoding apparatus 200. In addition, inverse quantization corresponding to the above-described quantization may be performed by the inverse quantization unit 220 of the decoding apparatus 200.

일 실시예에서, 처리부(310)는 결정된 JND 모델을 이용하여 감축을 수행함에 있어서, 잔차 신호에 대해 감축을 수행할 수 있다.In an embodiment, in performing reduction using the determined JND model, the processor 310 may perform reduction on a residual signal.

단계(830, 840 및 850)에서, 처리부(310)는 입력 영상에 대하여 인지 화질의 향상을 수행할 수 있다.In steps 830, 840, and 850, the processing unit 310 may improve the perceived quality of the input image.

단계(830)에서, 처리부(310)는 입력 영상 내에서 인지 민감 영역을 검출할 수 있다.In step 830, the processor 310 may detect a cognitive sensitive region in the input image.

일 실시예에서, 처리부(310)는 인지 민감 영역을 검출함에 있어서, 1) 무작위성(randomness), 2) 마스킹 특성 및 3) 주의 및 집중 특성 중 적어도 하나를 이용할 수 있다.In an embodiment, the processor 310 may use at least one of 1) randomness, 2) masking characteristics, and 3) attention and concentration characteristics in detecting a cognitive sensitive region.

일 실시예에서, 처리부(310)는 인지 민감 영역을 검출함에 있어서, 무작위성을 이용할 수 있다.In an embodiment, the processing unit 310 may use randomness in detecting a cognitive sensitive region.

일 실시예에서, 처리부(310)는 인지 민감 영역을 검출함에 있어서, 마스킹 특성을 이용할 수 있다.In an embodiment, the processing unit 310 may use a masking characteristic in detecting a cognitive sensitive region.

일 실시예에서, 처리부(310)는 인지 민감 영역을 검출함에 있어서, 주의 및 집중 특성을 이용할 수 있다.In an embodiment, the processing unit 310 may use attention and concentration characteristics in detecting a cognitive sensitive region.
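As one illustration of a randomness cue, the sketch below scores a block by the Shannon entropy of its pixel values and flags low-entropy blocks as cognitively sensitive. The entropy measure, the threshold value, and the direction of the rule (low randomness implies high sensitivity) are assumptions made for illustration. The text does not specify how randomness is measured.

```python
import math
from collections import Counter


def block_entropy(block):
    """Shannon entropy (in bits) of the pixel values in a 2-D block."""
    values = [pixel for row in block for pixel in row]
    total = len(values)
    counts = Counter(values)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def is_sensitive(block, entropy_threshold=1.0):
    """Assumed rule: a block with little randomness hides distortion poorly."""
    return block_entropy(block) < entropy_threshold


flat_block = [[5, 5], [5, 5]]    # no randomness
noisy_block = [[1, 2], [3, 4]]   # every pixel differs
print(is_sensitive(flat_block), is_sensitive(noisy_block))
```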

단계(840)에서, 처리부(310)는 입력 영상 내에서 화질 열화 영역을 검출할 수 있다.In step 840, the processing unit 310 may detect an image quality deterioration region in the input image.

단계(840)에서, 처리부(310)는 1) 경계 강도(Boundary Strength; BS) 정보, 2) JND 레벨 정보 및 3) 에지(edge) 정보 중 적어도 하나를 이용하여 화질 열화 영역을 검출할 수 있다.In step 840, the processor 310 may detect the image quality deterioration region using at least one of 1) boundary strength (BS) information, 2) JND level information, and 3) edge information. .

일 실시예에서, 처리부(310)는 BS 정보를 이용하여 화질 열화 영역을 검출할 수 있다.In an embodiment, the processing unit 310 may detect an image quality deterioration region using BS information.

일 실시예에서, 처리부(310)는 JND 레벨 정보를 이용하여 화질 열화 영역을 검출할 수 있다.In an embodiment, the processing unit 310 may detect an image quality deterioration region using the JND level information.

일 실시예에서, 처리부(310)는 에지 정보를 이용하여 화질 열화 영역을 검출할 수 있다.In an embodiment, the processing unit 310 may detect an image quality deterioration region using edge information.

단계(850)에서, 처리부(310)는 입력 영상의 인지 열화 영역을 결정할 수 있다.In step 850, the processor 310 may determine a perceived deterioration region of the input image.

단계(860)에서, 처리부(310)는 입력 영상의 인지 화질의 향상을 수행할 수 있다.In step 860, the processor 310 may improve the perceived image quality of the input image.

일 실시예에서, 처리부(310)는 1) QP 값의 조정 및 2) 기계 학습에 기반한 노이즈 제거 중 적어도 하나를 이용하여 인지 화질의 향상을 수행할 수 있다.In an embodiment, the processor 310 may improve the perceived image quality by using at least one of 1) adjusting a QP value and 2) removing noise based on machine learning.

일 실시예에서, 처리부(310)는 QP 값의 조정을 이용하여 인지 화질의 향상을 수행할 수 있다.In an embodiment, the processor 310 may improve the perceived image quality by using the adjustment of the QP value.

일 실시예에서, 처리부(310)는 기계 학습에 기반한 노이즈 제거를 이용하여 인지 화질의 향상을 수행할 수 있다.In an embodiment, the processor 310 may improve cognitive image quality using noise removal based on machine learning.
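The QP-based improvement can be sketched as a per-block QP map. Several points here are assumptions, not statements from the text: the perceptually degraded region is taken as the intersection of the sensitive mask and the deteriorated mask, the QP offset of -3 is arbitrary, and the QP range 0 to 51 is the range used by codecs such as H.264/AVC and HEVC for 8-bit video.

```python
def adjust_qp(base_qp, sensitive, degraded, qp_offset=-3, qp_min=0, qp_max=51):
    """Build a per-block QP map, lowering QP where quality should improve.

    Assumption: a block needs improvement only if it is both cognitively
    sensitive and detected as deteriorated.
    """
    improved_qp = max(qp_min, min(qp_max, base_qp + qp_offset))
    qp_map = []
    for sens_row, degr_row in zip(sensitive, degraded):
        qp_map.append([
            improved_qp if (s and d) else base_qp
            for s, d in zip(sens_row, degr_row)
        ])
    return qp_map


# Hypothetical 2x2 block masks.
sensitive_mask = [[True, True], [False, False]]
degraded_mask = [[True, False], [True, False]]
print(adjust_qp(32, sensitive_mask, degraded_mask))
```

A lower QP spends more bits on the selected blocks, which is one way the coding bits saved by JND quantization could be reinvested.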

도 9는 일 예에 따른 다중 레벨 JND 포지션들을 나타낸다.FIG. 9 shows multi-level JND positions according to an example.

도 8을 참조하여 전술된 단계(810)의 다중 레벨 JND 임계치들 및 다중 레벨 JND 구간들에 대한 모델링이 설명된다.Modeling of the multi-level JND thresholds and multi-level JND intervals of the above-described step 810 will be described with reference to FIG. 8.

단계(810)에서, 처리부(310)는 JND 구간, JND 포지션 및 JND 임계치의 모델링을 수행할 수 있다. 상기의 모델링은 1) 주관적 화질 평가 및 수학적 모델링을 사용하는 방법 및 2) 기계 학습을 사용하는 방법으로 나뉠 수 있다.In step 810, the processing unit 310 may perform modeling of the JND interval, the JND position, and the JND threshold. The above modeling can be divided into 1) a method using subjective image quality evaluation and mathematical modeling, and 2) a method using machine learning.

주관적 화질 평가 및 수학적 모델링을 사용하는 방법How to use subjective picture quality assessment and mathematical modeling

처리부(310)는 원본 영상의 열화의 크기를 키워가면서 다양한 영상들을 생성할 수 있다. 처리부(310)는 생성된 영상들에 대한 주관적 화질 평가를 통해 원본 영상에 대응하는 다중 레벨 JND 포지션들의 각 JND 포지션에 해당하는 영상을 도출할 수 있다.The processing unit 310 may generate various images while increasing the size of the deterioration of the original image. The processor 310 may derive an image corresponding to each JND position of the multi-level JND positions corresponding to the original image through subjective image quality evaluation of the generated images.

예를 들면, 다중 레벨 JND 포지션들은 제1 JND 포지션, 제2 JND 포지션 및 제3 JND 포지션을 포함할 수 있다.For example, the multi-level JND positions may include a first JND position, a second JND position, and a third JND position.

처리부(310)는 각각의 JND 포지션에 해당하는 영상 및 원본 영상과의 차이의 분석을 통해 실제로 각각의 JND 포지션에 이르는 차이 값을 결정할 수 있다.The processing unit 310 may actually determine a difference value reaching each JND position through analysis of the difference between the image corresponding to each JND position and the original image.

예를 들면, JND 포지션에 대한 차이 값은 원본 영상 및 JND 포지션에 해당하는 복원 영상 간의 공간 영역 및 주파수 영역에서의 차이 값들을 단순히 합한 값일 수 있다. 또한, 이러한 차이 값들의 합이 JND 포지션 및 JND 구간에 이르게 하는 JND 임계치로 정의될 수 있다.For example, the difference value for the JND position may be simply a sum of difference values in the spatial domain and the frequency domain between the original image and the reconstructed image corresponding to the JND position. In addition, the sum of these difference values may be defined as a JND threshold leading to a JND position and a JND interval.

영상 A 및 영상 B 간의 공간 영역에서의 차이 값들의 합은 아래의 수식 2와 같이 정의될 수 있다.The sum of the difference values in the spatial domain between the image A and the image B may be defined as in Equation 2 below.

[수식 2][Equation 2]

영상 A 및 영상 B 간의 주파수 영역에서의 차이 값들의 합은 아래의 수식 3과 같이 정의될 수 있다.The sum of the difference values in the frequency domain between the image A and the image B may be defined as in Equation 3 below.

[수식 3][Equation 3]

MATD는 절대 변환된 차이의 평균(mean of absolute transformed differences)을 나타낼 수 있다.MATD may represent the mean of absolute transformed differences.

SATD는 절대 변환된 차이의 합(sum of absolute transformed differences)을 나타낼 수 있다.SATD may represent the sum of absolute transformed differences.

SSTE는 변환된 오류들의 제곱의 합(sum of squared transformed errors)을 나타낼 수 있다.SSTE may represent the sum of squared transformed errors.

Diff(i, j)는 아래의 수식 4와 같이 정의될 수 있다.Diff(i, j) may be defined as in Equation 4 below.

[수식 4][Equation 4]

DiffT(i, j)는 아래의 수식 5와 같이 정의될 수 있다.DiffT(i, j) can be defined as in Equation 5 below.

i 및 j는 공간 영역에서의 영상의 픽셀의 좌표들을 나타낼 수 있다. 영상(i, j)는 영상에서의 좌표들이 (i, j)인 픽셀의 값을 나타낼 수 있다.i and j may represent coordinates of pixels of an image in a spatial domain. The image (i, j) may represent a value of a pixel whose coordinates in the image are (i, j).

[수식 5][Equation 5]

i 및 j는 주파수 영역에서의 영상의 픽셀의 좌표들을 나타낼 수 있다.i and j may represent coordinates of pixels of an image in the frequency domain.
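The bodies of Equations 2 to 5 do not appear in this text. One reconstruction consistent with the surrounding definitions is given below as a sketch. The spatial-domain names SAD, MAD, and SSE are assumed counterparts of SATD, MATD, and SSTE, N is the assumed number of pixels (or coefficients), and T(·) is an assumed block transform such as the DCT.

```latex
\begin{align}
% Equation 2 (spatial domain): any one of the following sums
\mathrm{SAD}(A,B) &= \sum_{i}\sum_{j} \lvert \mathrm{Diff}(i,j) \rvert, &
\mathrm{MAD}(A,B) &= \frac{1}{N}\sum_{i}\sum_{j} \lvert \mathrm{Diff}(i,j) \rvert, &
\mathrm{SSE}(A,B) &= \sum_{i}\sum_{j} \mathrm{Diff}(i,j)^{2} \\
% Equation 3 (frequency domain)
\mathrm{SATD}(A,B) &= \sum_{i}\sum_{j} \lvert \mathrm{Diff}_{T}(i,j) \rvert, &
\mathrm{MATD}(A,B) &= \frac{1}{N}\sum_{i}\sum_{j} \lvert \mathrm{Diff}_{T}(i,j) \rvert, &
\mathrm{SSTE}(A,B) &= \sum_{i}\sum_{j} \mathrm{Diff}_{T}(i,j)^{2} \\
% Equation 4
\mathrm{Diff}(i,j) &= A(i,j) - B(i,j) \\
% Equation 5
\mathrm{Diff}_{T}(i,j) &= T(A)(i,j) - T(B)(i,j)
\end{align}
```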

도 10은 일 예에 따른 주파수 계수 위치에 따른 제1 JND 임계치를 나타낸다.FIG. 10 shows a first JND threshold according to frequency coefficient position, according to an example.

도 11은 일 예에 따른 주파수 계수 위치에 따른 제2 JND 임계치를 나타낸다.FIG. 11 shows a second JND threshold according to frequency coefficient position, according to an example.

주파수 도메인에 대해서는, 주파수 계수 별로 변환(transform)된 원본 영상 및 특정 JND에 해당하는 열화된 영상 간의 차이가 모델링될 수 있다. 각 JND 포지션에 대해서 원본 영상의 주파수 계수 및 열화된 영상 간의 주파수 계수 간의 차이 값이 상기의 특정 JND의 JND 포지션 및 JND 구간에 이르게 하는 JND 임계치로 설정될 수 있다.For the frequency domain, the difference between the transformed original image and a deteriorated image corresponding to a specific JND may be modeled for each frequency coefficient. For each JND position, the difference between the frequency coefficients of the original image and those of the deteriorated image may be set as the JND threshold that leads to the JND position and JND interval of that specific JND.

예를 들면, 원본 영상 및 특정 JND 포지션에 해당하는 영상의 주파수 영역에서의 대응하는 계수들 간의 차이가 단순 평균이나 선형회귀 같은 수학적 모델링을 통해 도 10 및 도 11과 같이 모델링될 수 있다.For example, the difference between the original image and corresponding coefficients in the frequency domain of the image corresponding to a specific JND position may be modeled as shown in FIGS. 10 and 11 through mathematical modeling such as a simple average or linear regression.
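The simple-average variant of this modeling can be sketched as follows. The function name `model_jnd_threshold` and the sample coefficient blocks are hypothetical. The sketch assumes that coefficient blocks of the original images and of the images at a given JND position are already available from a subjective evaluation.

```python
def model_jnd_threshold(original_blocks, jnd_blocks):
    """Average the absolute coefficient difference at each frequency position.

    `original_blocks` and `jnd_blocks` are equal-length lists of square
    coefficient blocks; the result is one threshold per frequency position.
    """
    count = len(original_blocks)
    size = len(original_blocks[0])
    return [
        [
            sum(abs(o[u][v] - d[u][v]) for o, d in zip(original_blocks, jnd_blocks)) / count
            for v in range(size)
        ]
        for u in range(size)
    ]


# Hypothetical 2x2 coefficient blocks for two training images.
originals = [[[10.0, 4.0], [4.0, 2.0]], [[20.0, 6.0], [2.0, 0.0]]]
at_first_jnd = [[[8.0, 1.0], [2.0, 0.0]], [[17.0, 2.0], [0.0, 0.0]]]
print(model_jnd_threshold(originals, at_first_jnd))
```

Running the same function on images at the second JND position would give a second threshold matrix, in the manner of FIG. 10 and FIG. 11.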

도 10은 원본 영상 및 제1 JND 포지션에 해당하는 영상의 대응하는 계수들 간의 차이를 나타낼 수 있다.FIG. 10 may represent the difference between corresponding coefficients of the original image and the image corresponding to the first JND position.

도 11은 원본 영상 및 제2 JND 포지션에 해당하는 영상의 대응하는 계수들 간의 차이를 나타낼 수 있다.FIG. 11 may represent the difference between corresponding coefficients of the original image and the image corresponding to the second JND position.

이러한 차이가 JND 포지션에 해당되는 JND 임계치라고 볼 수 있다.This difference can be seen as the JND threshold corresponding to the JND position.

만일, 원본 영상 및 열화된 영상(말하자면, 복원된 영상)의 주파수 영역에서의 대응하는 계수들 간의 차이가 앞서 정의된 제1 JND 임계치 보다 더 작다면 열화된 영상의 화질 또는 왜곡은 제1 JND 구간 내에 있다고 볼 수 있다.If the difference between the corresponding coefficients in the frequency domain of the original image and the deteriorated image (that is, the reconstructed image) is smaller than the first JND threshold defined above, the quality or distortion of the deteriorated image is the first JND section. It can be seen that it is within.
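The check described above can be sketched with a small orthonormal 2-D DCT-II. The choice of DCT as the transform, the function names, and the sample threshold values are assumptions. The comparison is made per coefficient against a first-JND threshold matrix.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block, computed as C * block * C^T."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    basis = [
        [scale(k) * math.cos((2 * x + 1) * k * math.pi / (2 * n)) for x in range(n)]
        for k in range(n)
    ]
    # Transform along the first axis, then along the second axis.
    partial = [
        [sum(basis[u][x] * block[x][y] for x in range(n)) for y in range(n)]
        for u in range(n)
    ]
    return [
        [sum(partial[u][y] * basis[v][y] for y in range(n)) for v in range(n)]
        for u in range(n)
    ]


def within_first_jnd_interval(original, degraded, first_thresholds):
    """True if every coefficient difference is below its first JND threshold."""
    coeff_o = dct_2d(original)
    coeff_d = dct_2d(degraded)
    n = len(original)
    return all(
        abs(coeff_o[u][v] - coeff_d[u][v]) < first_thresholds[u][v]
        for u in range(n)
        for v in range(n)
    )


original = [[100.0, 100.0], [100.0, 100.0]]
degraded = [[101.0, 101.0], [101.0, 101.0]]  # uniform +1 shift: only the DC term changes, by about 2
print(within_first_jnd_interval(original, degraded, [[3.0, 1.0], [1.0, 1.0]]))
```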

기계 학습을 사용하는 방법How to use machine learning

도 12는 일 예에 따른 기계 학습 기반의 네트워크 학습 단계를 나타낸다.FIG. 12 illustrates a training stage of a machine-learning-based network according to an example.

처리부(310)는 기계 학습 기반 네트워크를 포함할 수 있고, 또는 기계 학습 기반 네트워크를 운영할 수 있다.The processing unit 310 may include a machine-learning-based network or may operate a machine-learning-based network.

기계 학습 기반 네트워크는 원본 영상 및 다중 레벨 JND 포지션들에 해당하는 영상들을 사용하는 다양한 기계 학습을 수행할 수 있다.The machine learning-based network can perform various machine learning using the original image and images corresponding to multi-level JND positions.

도 13은 일 예에 따른 기계 학습 기반의 네트워크를 이용한 JND 구간의 파악을 나타낸다.FIG. 13 illustrates identification of a JND interval using a machine-learning-based network according to an example.

기계 학습 기반 네트워크가 도 12를 참조하여 설명된 것과 같은 학습을 수행하면, 기계 학습 기반 네트워크는 기계 학습 기반 네트워크에 입력된 영상에 대하여 다중 레벨 JND 포지션들 및 다중 레벨 JND 구간들 중 입력된 영상이 해당되는 JND 포지션 및/또는 JND 구간을 분류 및 출력할 수 있다.Once the machine-learning-based network has been trained as described with reference to FIG. 12, it may classify an image input to it and output the JND position and/or JND interval, among the multi-level JND positions and multi-level JND intervals, to which the input image corresponds.
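The train-then-classify flow can be illustrated with a deliberately simple stand-in. The text does not specify the network architecture, so the sketch below swaps in a nearest-centroid classifier over a single scalar feature (for example, a difference measure against the original image). The feature choice, function names, and sample values are all hypothetical.

```python
def train_centroids(features, levels):
    """Training stage: average the feature value observed for each JND level."""
    sums, counts = {}, {}
    for feature, level in zip(features, levels):
        sums[level] = sums.get(level, 0.0) + feature
        counts[level] = counts.get(level, 0) + 1
    return {level: sums[level] / counts[level] for level in sums}


def classify(feature, centroids):
    """Inference stage: return the JND level whose centroid is closest."""
    return min(centroids, key=lambda level: abs(feature - centroids[level]))


# Hypothetical training data: one feature per image, labeled with its JND level.
train_features = [1.0, 1.4, 3.0, 3.4, 6.0, 6.4]
train_levels = [1, 1, 2, 2, 3, 3]
centroids = train_centroids(train_features, train_levels)
print(classify(2.9, centroids), classify(5.0, centroids))
```

A learned network would replace both functions, but the interface stays the same: labeled images at known JND positions go in during training, and a JND level comes out at inference.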

입력 영상에 대한 특성 분석을 통한 JND 임계치 및 JND 구간의 결정Determination of JND threshold and JND section through characteristic analysis of input image

도 14는 일 예에 따른 입력 영상의 8x8 DCT 계수들을 나타낸다.FIG. 14 shows 8x8 DCT coefficients of an input image according to an example.

도 15는 일 예에 따른 입력 영상에 대한 인지 특성 분석을 나타낸다.FIG. 15 illustrates an analysis of cognitive characteristics of an input image according to an example.

도 8을 참조하여 전술된 단계(820)에서, 처리부(310)는 입력 영상 및 원본 영상 간의 차이 값을 도출할 수 있다.In step 820 described above with reference to FIG. 8, the processor 310 may derive a difference value between the input image and the original image.

처리부(310)는 입력 영상(즉, 재구축된 영상)의 JND 구간을 찾기 위해 차이 값을 계산할 수 있으며 아래와 같은 방식들을 사용할 수 있다.The processing unit 310 may calculate a difference value to find the JND section of the input image (ie, the reconstructed image), and may use the following methods.

처리부(310)는 차이 값들의 합을 사용하여 입력 영상의 JND 구간을 찾을 수 있다.The processor 310 may find the JND section of the input image by using the sum of the difference values.

When the input image is image A and the original image is image B, the sum of the difference values between image A and image B in the spatial domain may be defined as in Equation 6 below.

[Equation 6]

The sum of the difference values between image A and image B in the frequency domain may be defined as in Equation 7 below.

[Equation 7]

Diff(i, j) may be defined as in Equation 8 below.

[Equation 8]

DiffT(i, j) may be defined as in Equation 9 below.

[Equation 9]
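The unweighted sums of difference values in the spatial and frequency domains can be sketched as follows. This is a minimal Python sketch under stated assumptions: Diff(i, j) is taken to be the absolute pixel difference and DiffT(i, j) the absolute difference of DCT coefficients, which may differ from the exact forms of Equations 6 to 9. The function names and the choice of an orthonormal DCT-II are illustrative and are not taken from the embodiment.

```python
import math


def dct2(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def c(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = c(u) * c(v) * s
    return out


def sum_diff_spatial(a, b):
    """Sum of assumed Diff(i, j) = |A(i, j) - B(i, j)| over all pixels."""
    return sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def sum_diff_frequency(a, b):
    """Sum of assumed DiffT(i, j) = |DCT(A)(i, j) - DCT(B)(i, j)|."""
    ta, tb = dct2(a), dct2(b)
    return sum(abs(ca - cb) for ra, rb in zip(ta, tb) for ca, cb in zip(ra, rb))
```

For two constant blocks the frequency-domain sum is carried entirely by the DC coefficient, which makes the two measures easy to compare by hand.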

The processing unit 310 may find the JND interval of the input image by using a weighted sum of the difference values.

The processing unit 310 may add weight terms to Equations 5 to 9 described above, and may thereby define a weighted sum of the difference values.

For example, when difference values in the frequency domain are used, a smaller weight may be assigned to difference values in the high-frequency region than to difference values in the low-frequency region, on the basis that human perception is insensitive to high-frequency changes.

The weighted sum between image A and image B in the spatial domain may be defined as in Equation 10 below.

[Equation 10]

The weighted sum between image A and image B in the frequency domain may be defined as in Equation 11 below.

[Equation 11]
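A weighted sum of this kind can be sketched as follows. The weight 1 / (1 + i + j) is a hypothetical choice used only to make high-frequency positions count less than low-frequency ones; the actual weights are those of Equations 10 and 11, and the function names are illustrative.

```python
def frequency_weights(n):
    """Hypothetical weights that shrink as the frequency index (i + j) grows."""
    return [[1.0 / (1 + i + j) for j in range(n)] for i in range(n)]


def weighted_sum(diff, weights):
    """Sum of weight * |difference| over all positions of the block."""
    return sum(w * abs(d)
               for d_row, w_row in zip(diff, weights)
               for d, w in zip(d_row, w_row))
```

With these weights, the same error magnitude contributes more when it sits at the DC position than when it sits at the highest-frequency position.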

FIG. 16 shows an example of a JND threshold determined according to cognitive characteristic analysis.

FIG. 17 shows another example of a JND threshold determined according to cognitive characteristic analysis.

In step 820 described with reference to FIG. 8, the processing unit 310 may use the magnitude of the derived difference value between the input image and the original image to determine which JND interval and which JND threshold, among the multi-level JND intervals and multi-level JND thresholds modeled in step 810, the input image corresponds to.

The processing unit 310 may calculate the JND threshold by analyzing the cognitive characteristics of the input image, and may determine the JND interval of the input image according to the calculated threshold. In this manner, the JND interval and the JND threshold may be determined differently according to various cognitive characteristics.

The JND threshold JND_Threshold obtained through the analysis of the cognitive characteristics of the input image may be expressed as the product of JND_Basic and weights.

JND_Basic may be a default value of the JND threshold derived through subjective image quality evaluation.

The weights may be values derived by analyzing various cognitive characteristics of the input image, such as contrast sensitivity, masking characteristics, and attention and concentration characteristics.

For example, the JND threshold JND_Threshold may be expressed in the form of Equation 12 below.

[Equation 12]

JND_Basic is the default value of the JND threshold obtained through subjective image quality evaluation, and may be a threshold matrix.

W_Contrast is a weight obtained in consideration of the contrast sensitivity characteristic of the input image, and may be a weight matrix.

W_Masking is a weight obtained in consideration of the masking characteristic of the input image, and may be a weight matrix.

W_Attention is a weight obtained in consideration of the attention and concentration characteristics of the input image, and may be a weight matrix.

For example, when the masking characteristic of the input image is large, the value of the weight W_Masking for the masking characteristic may increase, and this increase in W_Masking increases the JND threshold.
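The product form of the JND threshold can be sketched as follows. This sketch assumes that the product of JND_Basic and the weight matrices is taken element by element; whether Equation 12 uses an element-wise or another form of product is not stated here, and the function name is illustrative.

```python
def jnd_threshold(jnd_basic, *weight_matrices):
    """Multiply the basic threshold matrix element-wise by each weight matrix."""
    result = [row[:] for row in jnd_basic]  # copy so the input is not modified
    for weights in weight_matrices:
        for i, row in enumerate(result):
            for j in range(len(row)):
                row[j] *= weights[i][j]
    return result
```

A call such as `jnd_threshold(jnd_basic, w_contrast, w_masking, w_attention)` then yields a threshold matrix in which a larger masking weight raises the threshold at the corresponding position.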

Main cognitive characteristics used in the analysis of the cognitive characteristics of the input image

The main cognitive characteristics used in the analysis of the cognitive characteristics of the input image are described below.

1) Contrast sensitivity characteristic: According to the cognitive characteristics of the human eye, the eye generally has high sensitivity to contrast changes in regions of low spatial frequency and low sensitivity to contrast changes in regions of high spatial frequency.

That is, in the frequency domain, people may perceive changes in low-frequency components more sensitively than changes in high-frequency components, and may hardly perceive contrast changes at frequencies at or above a specified magnitude.

Therefore, when the JND threshold is calculated using this cognitive characteristic, different weights may be assigned to different frequency regions. In addition, the magnitude of the weight multiplied by the JND threshold may vary according to the magnitude of the sum of the weights of the frequency regions for the difference values. In other words, the magnitude of the weight multiplied by the JND threshold may be determined based on the magnitude of the sum of the weights of the frequency regions for the difference values.

2) Contrast masking characteristic: The contrast masking characteristic may be a term indicating that human perception of the magnitude of distortion, or of a change in distortion, varies with the characteristics of the image. In general, a change in distortion in a flat region is readily perceived, whereas a change in distortion in a heavily textured region may not be readily perceived.

Accordingly, when a distortion value is calculated on the basis of this cognitive characteristic, the weights for blocks may differ from block to block according to the texture and/or edge characteristics of the blocks. In other words, the weight for a block may be determined based on the texture characteristic and/or edge characteristic of the block.

3) Temporal masking characteristic: The temporal masking characteristic may be a term indicating the phenomenon in which, as the temporal frequency of an image increases, distortion occurring on a fast-moving object becomes less perceptible. Based on this cognitive characteristic, the weights for blocks may differ from block to block according to the temporal frequency and the magnitude of motion in the image. In other words, the weight for a block may be determined based on the temporal frequency and the magnitude of motion in the image.

4) Attention and concentration characteristics: The attention and concentration characteristics may be a term referring to the phenomenon in which the human visual perception system concentrates on, or ignores, a specific environment or object. These characteristics may be broadly classified into the following two types according to the factor that attracts attention. In an embodiment, the weight applied may vary according to whether an attention and concentration factor is included in the image. In other words, the weight may be determined based on whether an attention and concentration factor is included in the image.

4-1) Bottom-up attention (also called reflexive attention or exogenous attention)

The factor of bottom-up attention may be referred to as a low-level salient feature or raw sensory input.

The factor of bottom-up attention may refer to a factor that causes attention to shift rapidly or involuntarily toward a feature of potential importance.

Bottom-up attention may mean that attention is drawn not by a factor created with a special intention, but by an abrupt change in, or the prominence of, factors such as color, shape, movement, contrast, and size.

4-2) Top-down attention (also called voluntary attention or endogenous attention)

The factor of top-down attention may be a goal-directed cognitive factor such as a sign or sign language, and may mean a factor that attracts attention through prior knowledge or a specific expectation.

Assigning weights according to cognitive characteristics

Specific methods of assigning weights using the above-described cognitive characteristics are described below.

1) Method using the contrast sensitivity characteristic

The weight may be assigned according to the magnitude of the spatial frequency, in consideration of the cognitive characteristics in the frequency domain.

A cognitively sensitive region may be determined based on a contrast sensitivity function (CSF) such as Equation 13 below.

[Equation 13]

ω_i,j may represent the magnitude of the spatial frequency at coordinates (i, j) in the frequency domain.

a, b and c may be predefined constants. For example, a may be 1.33, b may be 0.11, and c may be 0.1.

The 8x8 matrix H_i,j(ω_i,j) of Equation 14 below may be the result derived from Equation 13 above, normalized by its maximum value.

[Equation 14]

The 8x8 matrix H_i,j(ω_i,j) may be a weight matrix based on Equation 13, and may be a contrast sensitivity weight matrix.

As shown in the matrix H_i,j(ω_i,j), elements at positions with relatively smaller frequency magnitudes may have relatively large values.

The contrast sensitivity values derived as in the matrix H_i,j(ω_i,j) may be used as weights.

Equation 13 is one example for obtaining the contrast sensitivity; other forms of equations that reflect the contrast sensitivity characteristic may also be used to determine the contrast sensitivity.

A contrast sensitivity weight matrix such as the matrix H_i,j(ω_i,j) may be a weight matrix that is multiplied by the threshold matrix of basic JND thresholds.
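The construction of a max-normalized contrast sensitivity weight matrix can be sketched as follows. The functional form H(ω) = (a + b·ω)·exp(-c·ω) is an assumption chosen because it uses three constants and decreases with frequency for the example values of a, b and c; the exact form of Equation 13 may differ. The mapping from the index pair (i, j) to a spatial frequency through a hypothetical `scale` parameter is likewise illustrative.

```python
import math


def csf(omega, a=1.33, b=0.11, c=0.1):
    """Assumed CSF form; the constants are the example values from the text."""
    return (a + b * omega) * math.exp(-c * omega)


def csf_weight_matrix(n=8, scale=1.0):
    """n x n matrix of CSF values, normalized by the maximum value."""
    # Hypothetical frequency magnitude: scaled Euclidean distance from DC.
    raw = [[csf(scale * math.hypot(i, j)) for j in range(n)] for i in range(n)]
    peak = max(max(row) for row in raw)
    return [[value / peak for value in row] for row in raw]
```

Under these assumptions the largest weight sits at the DC position and the weights fall off toward the high-frequency corner, which matches the behaviour described for the matrix of Equation 14.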

2) Method using masking characteristics

The masking effect may mean a phenomenon in which a specific signal or stimulus makes another signal or stimulus less perceptible, or prevents it from being perceived at all.

In the spatial domain, the masking effect may mean a phenomenon in which an error (that is, a signal) occurring in a region with complex texture is harder to perceive than an error occurring in a smooth area.

In the temporal domain, the masking effect may mean a phenomenon in which the larger the luminance difference between consecutive frames, the less perceptible an error occurring in those frames becomes.

In an embodiment, the weight may be determined using the characteristics of luminance masking, contrast masking, and temporal masking.

FIG. 18 illustrates the luminance and weights of blocks of a picture according to an example.

2-1) Determination of weights using the luminance masking characteristic

As illustrated in FIG. 18, the input image may be divided into a plurality of blocks. In an embodiment, the input signal may be a block.

Each block may have an average luminance value, and a weight may be assigned to the block according to its average luminance value.

In consideration of the luminance adaptation characteristic of the block, a high weight may be assigned to a cognitively sensitive region.

As in Equation 15 below, a cognitively sensitive region may be determined based on the average luminance value of the block. In addition, the weight for the block may be determined according to the range in which the calculated average luminance value falls.

The weight for the block may be determined as in Equation 15 below.

[Equation 15]

The input to Equation 15 may be the average luminance value (average intensity value) of the block.

ω may be the weight for the block. ω_i,j may be a weight matrix for the block.

In constructing a weight or weight matrix for the calculation of distortion, a small weight may be assigned to a block whose average luminance value falls within the range of a dark region or a bright region.

In Equation 15, where ω_i,j is a weight or a weight matrix, the weight for the block may be 1 when the average luminance value of the block belongs to the middle region (that is, when the average luminance value is greater than 60 and less than or equal to 170). When the average luminance value of the block does not belong to the middle region, the weight may be defined by the expressions in Equation 15. For example, when the average luminance value of the block does not belong to the middle region, the weight for the block may be less than 1.

Equation 15 may be one example of determining the weight for a block. The method of determining the weight for a block may also be defined by other forms of equations that reflect the luminance adaptation characteristic.
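The luminance-based weighting can be sketched as follows. Only the middle range (greater than 60 and at most 170, weight 1) comes from the description; the linear ramps used for dark and bright blocks and the hypothetical `floor` parameter are placeholders for the expressions of Equation 15, which may differ.

```python
def mean_luminance(block):
    """Average intensity of a block given as a list of pixel rows."""
    pixels = [p for row in block for p in row]
    return sum(pixels) / len(pixels)


def luminance_weight(block, low=60.0, high=170.0, floor=0.5):
    """Weight 1 in the middle range, a smaller weight for dark or bright blocks."""
    avg = mean_luminance(block)
    if low < avg <= high:
        return 1.0
    if avg <= low:
        # Hypothetical ramp: from `floor` at 0 up to 1 at `low`.
        return floor + (1.0 - floor) * avg / low
    # Hypothetical ramp: from 1 at `high` down to `floor` at 255.
    return 1.0 - (1.0 - floor) * (avg - high) / (255.0 - high)
```

The design point the sketch preserves is that very dark and very bright blocks receive a smaller weight than mid-luminance blocks.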

2-2) Determination of weights using the contrast masking characteristic

FIG. 19 illustrates the block types and weights of blocks of a picture according to an example.

As illustrated in FIG. 19, the input image may be divided into a plurality of blocks. In an embodiment, the input signal may be a block.

A weight may be assigned to a block according to the contrast masking characteristic of the block. In consideration of the contrast masking characteristic of the block, a high weight may be assigned to a cognitively sensitive region.

The magnitude of the average edge pixel density may be defined, and a cognitively sensitive region may be determined based on the magnitude of the average edge pixel density. The edge pixel density may indicate the number of edge pixels of the block, or may be proportional to the number of edge pixels. In addition, the edge pixel density may be inversely proportional to the size of the block.

The average edge pixel density of the block may be determined as in Equation 16 below.

[Equation 16]

Edge pixels may be detected through an edge pixel detection operator such as Sobel or Canny.

The block type of a block may be determined based on the edge pixel density of the block. The block type may be determined according to the range in which the calculated average edge pixel density falls. The weight or weight matrix applied to the block may be determined based on the block type of the block.

The block type of the block may be determined as in Equation 17 below.

[Equation 17]

α and β may be real values.

As illustrated in Equation 17, the block type may be classified as one of plane, edge, and texture according to the result of comparing the average edge pixel density with predefined values.
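The edge-density computation and block-type classification can be sketched as follows. The sketch assumes that the density is the number of Sobel edge pixels divided by the number of pixels in the block, and that increasing density maps to plane, then edge, then texture; the gradient threshold and the values α = 0.1 and β = 0.2 are hypothetical, and the exact forms of Equations 16 and 17 may differ.

```python
import math


def sobel_edge_count(block, threshold=100.0):
    """Count interior pixels whose Sobel gradient magnitude exceeds threshold."""
    h, w = len(block), len(block[0])
    count = 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (block[y - 1][x + 1] + 2 * block[y][x + 1] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y][x - 1] - block[y + 1][x - 1])
            gy = (block[y + 1][x - 1] + 2 * block[y + 1][x] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y - 1][x] - block[y - 1][x + 1])
            if math.hypot(gx, gy) > threshold:
                count += 1
    return count


def edge_density(block, threshold=100.0):
    """Assumed density: edge pixel count divided by the block's pixel count."""
    return sobel_edge_count(block, threshold) / (len(block) * len(block[0]))


def block_type(density, alpha=0.1, beta=0.2):
    """Assumed ordering of the three classes by increasing edge density."""
    if density <= alpha:
        return "plane"
    if density <= beta:
        return "edge"
    return "texture"
```

For an 8x8 block, a uniform block has density 0, a single vertical step yields 12 edge pixels (density 0.1875), and alternating two-pixel stripes yield 36 edge pixels (density 0.5625).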

That is, in constructing the weight matrix, if the block that is the input signal contains a lot of texture, image quality degradation or distortion may become less perceptible. In this case, a person is judged to respond less sensitively to image quality degradation or distortion in the block, and a low weight may be set for the block. Conversely, if the block that is the input signal contains little texture, a high weight may be set for the block.

The weight for the block may be set as in Equation 18 below.

[Equation 18]

In Equation 18, the size of the block may be 8x8.

ω may be the weight for the block. Hω_i,j may be a weight matrix for the block.

Equation 18 may be one example in which the weight of a block is determined according to the density of edge pixels.

When the block type is texture and the number of edge pixels of the block exceeds a specified number (for example, 16), the block may be determined to be a block with an edge. When the block type is texture and the number of edge pixels of the block is less than or equal to the specified number, the block may be determined to be a block without an edge.

When the block type is texture, a higher weight may be assigned to a block with an edge than to a block without an edge. In other words, when the number of edge pixels of a block whose block type has been determined to be texture is less than or equal to the specified number, the block may be recognized as a texture block having a relatively low edge pixel density, and a low weight may accordingly be assigned to it.

Equation 17 and Equation 18 may be one example of determining the block type of a block and the weight according to the block type. The method of determining the block type of a block and the weight according to the block type may also be defined by other forms of equations that reflect the contrast masking characteristic.

2-3) Determination of weights using the temporal masking characteristic

The input image may be divided into a plurality of blocks. In an embodiment, the input signal may be a block.

A weight may be assigned to a block according to the temporal masking characteristic of the block. In consideration of the temporal masking characteristic of the block, a high weight may be assigned to a cognitively sensitive region.

A cognitively sensitive region may be determined through the correlation between the frame rate, which represents the temporal frequency characteristic, and the motion vector of a block.

That is, in constructing the weight matrix, the larger the frame rate of the input image and the absolute magnitude of the motion vector of the block, the less perceptible image quality degradation or distortion may become. In this case, a person is judged to respond less sensitively to image quality degradation or distortion in the block, and a low weight may be set for the block. Conversely, the smaller the frame rate of the input image and the absolute magnitude of the motion vector of the block, the higher the weight that may be set for the block.

Table 1 shows an example of weights according to the frame rate and the absolute magnitude of the motion vector of a block.

[Table 1]

Table 1 may indicate the weight for a block determined according to the frame rate and the absolute magnitude of the motion vector of the block.

As shown in Table 1, it may be judged that distortion or picture degradation becomes less perceptible as the frame rate increases, and a smaller weight may be assigned to the block according to this judgment. Likewise, it may be judged that distortion or picture degradation becomes less perceptible as the absolute magnitude of the motion vector increases, and a smaller weight may be assigned to the block according to this judgment.

Table 1 may be one example of determining a weight according to the frame rate and the absolute magnitude of the motion vector using the temporal masking characteristic. The method of determining the weight according to the frame rate and the absolute magnitude of the motion vector may be defined by other forms of equations that reflect the temporal masking characteristic, or may be expressed with other values.
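A table lookup of this kind can be sketched as follows. All class boundaries and weight values in the sketch are hypothetical placeholders and are not the values of Table 1; the only property carried over from the description is that the weight does not increase as the frame rate or the motion vector magnitude increases.

```python
import math

# Hypothetical weights: rows are frame-rate classes, columns are motion classes.
TEMPORAL_WEIGHTS = [
    [1.0, 0.9, 0.8],  # frame rate below 30
    [0.9, 0.8, 0.7],  # frame rate from 30 up to, but not including, 60
    [0.8, 0.7, 0.6],  # frame rate of 60 or more
]


def temporal_weight(frame_rate, mv_x, mv_y):
    """Look up a block weight from the frame rate and motion vector magnitude."""
    magnitude = math.hypot(mv_x, mv_y)
    row = 0 if frame_rate < 30 else 1 if frame_rate < 60 else 2
    col = 0 if magnitude < 4 else 1 if magnitude < 16 else 2  # hypothetical bins
    return TEMPORAL_WEIGHTS[row][col]
```

A static block in a 24 frames-per-second sequence thus keeps the full weight, while a fast-moving block in a 60 frames-per-second sequence receives the smallest weight.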

3) Determination of weights using the attention and concentration characteristics

The weight may be determined differently depending on whether the input image contains factors, such as color, shape, movement, contrast, and size, that were not created with a special intention but attract attention through their abrupt change or prominence.

The weight may also be determined differently depending on whether the input image contains goal-directed cognitive factors, such as signs and sign language, that attract attention through prior knowledge or a specific expectation.

For example, when such factors appear in an image, they may be regarded as perceptually important factors, and a low weight may accordingly be assigned to the image. When a low weight is assigned, the JND threshold may be lowered, and the magnitude of the additional quantization using the JND threshold may also be reduced.

4) Determination of weights using edge/texture information

The directionality of the dominant coefficients in the frequency domain often appears in the spatial domain as edges and textures oriented perpendicular to it. That is, when many vertical edge components appear in the spatial domain, large coefficient values (i.e., strong coefficients) may appear for the horizontal components in the frequency domain.

Using this property, the weight multiplied by the basic JND threshold may be determined differently according to the directionality of the edges and textures in the spatial domain.

For example, when vertical components appear strongly in the spatial domain, small weights may be assigned to the coefficients at the horizontal positions in the frequency domain. In other words, when the vertical component of the coefficients of the image appears strongly in the spatial domain, the JND threshold model may reduce the thresholds for the horizontal components in the frequency domain.

Through this processing, when quantization using the JND threshold is performed, less quantization is applied to the coefficients corresponding to the horizontal components in the frequency domain. As a result, the vertical components in the spatial domain are quantized less, and the image quality degradation of the vertical components may be reduced.

FIG. 20 illustrates a first JND threshold in the frequency domain according to an example.

FIG. 21 illustrates a second JND threshold in the frequency domain according to an example.

The first JND threshold and the second JND threshold in the frequency domain may be derived through the various criteria and principles described above.

The JND threshold may be identical to the quantization value for the additional quantization.

The JND thresholds shown in FIGS. 20 and 21 are the results of multiplying by weights according to the cognitive characteristics of the input image and the reconstructed image, and the magnitudes of the JND thresholds may differ from each other.

FIG. 22 shows additional quantization using a JND threshold according to an example.

As shown in FIG. 22, additional quantization using the JND threshold may be performed on the encoded image, and this additional quantization may reduce the bit rate of the encoded image while maintaining the perceived image quality.

도 23은 일 예에 따른 주파수 계수의 위치에 따른 양자화 값들을 도시한다.23 illustrates quantization values according to positions of frequency coefficients according to an example.

도 24는 일 예에 따른 재구축된 영상의 8x8 DCT 계수들을 도시한다.24 illustrates 8x8 DCT coefficients of a reconstructed image according to an example.

도 24의 재구축된 영상의 DCT 계수들은 도 14의 입력 영상의 DCT 계수들 및 도 23의 주파수 계수의 위치에 따른 양자화 값들 간의 차일 수 있다.The DCT coefficients of the reconstructed image of FIG. 24 may be a difference between the DCT coefficients of the input image of FIG. 14 and quantization values according to positions of the frequency coefficients of FIG. 23.

In the image encoding process, the processing unit 310 may calculate a difference value in order to determine the JND interval of the reconstructed image, and may compare the calculated difference value with the JND threshold.

As described above, the difference value compared with the JND threshold may be a simple sum or a weighted sum of the coefficients. The JND thresholds may likewise be summed in the same way and expressed as a single value.

When the summed difference value is smaller than the sum of the first JND thresholds, the image may belong to the first JND interval, and the first JND threshold may be used as the additional quantization value for the image.

When the summed difference value is greater than the sum of the first JND thresholds and smaller than the sum of the second JND thresholds, the image may belong to the second JND interval, and the second JND threshold may be used as the additional quantization value for the image.
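The interval decision described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function name, the flat-list layout of the coefficients, the use of absolute values, and the handling of a difference exactly equal to the first threshold sum are all assumptions.

```python
def select_jnd_interval(diff, jnd1, jnd2, weights=None):
    """Pick the JND interval for one block (hypothetical helper).

    diff    : per-coefficient difference values (flat list)
    jnd1    : first JND thresholds, same length as diff
    jnd2    : second JND thresholds, same length as diff
    weights : optional per-coefficient weights; None means a simple sum
    Returns (interval, thresholds_to_use) or (None, None) if neither fits.
    """
    if weights is None:
        weights = [1.0] * len(diff)
    # Assumption: magnitudes are summed so that signs do not cancel.
    d = sum(w * abs(x) for w, x in zip(weights, diff))
    # The thresholds are summed the same way as the differences.
    t1 = sum(w * t for w, t in zip(weights, jnd1))
    t2 = sum(w * t for w, t in zip(weights, jnd2))
    if d < t1:
        return 1, jnd1
    # Assumption: a difference exactly equal to t1 falls in the second interval.
    if d < t2:
        return 2, jnd2
    return None, None
```

Passing `weights=None` corresponds to the simple sum, and a weight list corresponds to the weighted sum; the returned thresholds are the values that would serve as the additional quantization values.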

FIG. 25 illustrates a process of improving perceived image quality according to an example.

The upper left of FIG. 25 shows the original image.

The upper middle of FIG. 25 shows a cognitively sensitive region map.

The upper right of FIG. 25 shows a deterioration region map generated by performing CU-based encoding.

The lower right of FIG. 25 shows a cognitive deterioration region map.

The lower left of FIG. 25 shows the image obtained by applying an image quality enhancement technique based on the perceived image quality deterioration map.

Detection of cognitively sensitive regions using randomness

FIG. 26 illustrates detection of randomness in units of pixels according to an example.

FIG. 27 illustrates detection of randomness in units of blocks according to an example.

In step 830 described above with reference to FIG. 8, the processing unit 310 may detect a cognitively sensitive region in the input image.

A cognitively sensitive region is a region to which viewers are perceptually sensitive. In general, when image quality deterioration occurs in a cognitively sensitive region, people may react more sensitively to the deterioration. In determining the cognitively sensitive region, an index for determining the region that requires image quality improvement may be used.

Randomness may be a measure of how irregular a given region is.

Randomness may be a value calculated by measuring how well the current region can be restored from its surrounding region. Here, the surrounding region may be a block or a pixel.

In general, in terms of visual perception, people tend to overlook or ignore deterioration that appears in objects or regions with high randomness. Because perceptual sensitivity to distortion is high in regions with low randomness, the randomness of a region may be considered first when determining the region for image quality enhancement.

Randomness may be detected in units of pixels or in units of blocks.
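As one way to picture block-unit randomness, the sketch below scores each block by how poorly it is predicted from its neighbors. It is only an illustration under stated assumptions: the predictor (the average of the left and upper neighboring blocks), the mean squared error score, and the function name are hypothetical choices, not the measure defined in the embodiment.

```python
def block_randomness(image, block=8):
    """Score each block by how poorly its neighbors predict it.

    image : 2D list of pixel values
    Returns a 2D list with one score per block; higher means more random.
    Assumption: the prediction is the average of the left and upper blocks.
    """
    rows, cols = len(image) // block, len(image[0]) // block

    def get_block(r, c):
        return [row[c * block:(c + 1) * block]
                for row in image[r * block:(r + 1) * block]]

    rmap = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbors = []
            if c > 0:
                neighbors.append(get_block(r, c - 1))
            if r > 0:
                neighbors.append(get_block(r - 1, c))
            if not neighbors:
                continue  # the top-left block has no neighbor here; score stays 0
            cur = get_block(r, c)
            err = 0.0
            for y in range(block):
                for x in range(block):
                    pred = sum(n[y][x] for n in neighbors) / len(neighbors)
                    err += (cur[y][x] - pred) ** 2
            rmap[r][c] = err / (block * block)  # mean squared prediction error
    return rmap
```

Blocks with a low score are well predicted by their surroundings, which matches the low-randomness regions where distortion is described as more noticeable.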

Detection of cognitively sensitive regions using masking characteristics

In step 830 described above with reference to FIG. 8, the processing unit 310 may use the masking characteristic in detecting the cognitively sensitive region.

As described above, various weight values for an image or a transformed image may be defined based on the masking characteristic. Here, a larger weight value means lower perceptual sensitivity. Accordingly, a region having a low weight may be determined to be a cognitively sensitive region.
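The rule that low masking weights mark sensitive regions can be sketched as a simple threshold on a weight map. The function name and the cutoff parameter `max_weight` are hypothetical; the text does not specify how low a weight must be.

```python
def sensitive_from_masking(weight_map, max_weight):
    """Mark units whose masking weight is at or below a cutoff.

    weight_map : 2D list of masking weights, one per unit (pixel or block)
    max_weight : hypothetical cutoff; lower weights mean higher sensitivity
    Returns a 2D list of booleans (True = cognitively sensitive).
    """
    return [[w <= max_weight for w in row] for row in weight_map]
```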

Detection of cognitively sensitive regions using attention and concentration characteristics

In step 830 described above with reference to FIG. 8, the processing unit 310 may use attention and concentration characteristics in detecting the cognitively sensitive region.

As with the masking characteristic, a perceptually conspicuous region may be determined to be a cognitively sensitive region based on the attention and concentration characteristics defined above.

Detection of image quality deterioration regions

In step 840 described above with reference to FIG. 8, the processing unit 310 may detect an image quality deterioration region in the input image.

Detection of image quality deterioration regions using boundary strength (BS)

FIG. 28 illustrates a model of a blocking artifact across a block boundary according to an example.

BS may be used to detect blocking artifacts between blocks caused by transform and quantization, and may be used in a deblocking filter.

Table 2 below lists block modes, conditions, and the pixels to be modified (that is, the pixels subject to filtering) according to the value of BS. As Table 2 illustrates, the larger the value of BS, the larger the blocking artifact between blocks may be.

[Table 2]

Accordingly, in detecting the image quality deterioration region, the processing unit 310 may designate a unit having a value less than or equal to a predefined BS value as an image quality deterioration region. Here, a unit may be one or more (adjacent) pixels and/or one or more (adjacent) blocks, and may have a specified shape such as a square or a rectangle.

Detection of image quality deterioration regions using JND level information

As explained in the definition of JND above, the higher the JND level of an image, the greater the deterioration of the image may generally be. Accordingly, when the JND level of a specified region of the image is greater than or equal to a predefined level, the processing unit 310 may determine that noticeable deterioration has occurred in the region and may designate the region as a deterioration region. Here, the region may be the unit described above.

The JND level of an image or region may be determined as follows.

1) The processing unit 310 may determine the JND level of an image or region through a discriminator obtained by machine learning trained on a database in which information on various JND levels is defined.

2) The processing unit 310 may determine the JND level of an image or region using a JND threshold modeled through subjective image quality evaluation.

The modeled JND threshold JND_k(i, j) may be used as in Equation 19 below.

[Equation 19]

k may denote the JND level.

i and j may denote the position of a coefficient in the frequency domain.

Δc(i, j) may be the difference between the transform coefficients of the original image and the deteriorated (or reconstructed) image.

JND_k(i, j) may be the JND threshold at JND level k. That is, JND_1(i, j) may be the threshold at position (i, j) in the first JND threshold model.

In Equation 19, if the quantity computed for JND level k is regarded as the perceptual error occurrence rate at JND level k, the processing unit 310 may determine that the corresponding JND level has been exceeded when this rate exceeds 1. That is, when the rate at the first JND level exceeds 1 and the rate at the second JND level is less than or equal to 1, the JND level of the corresponding image or region may be 2.
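The level decision just described can be sketched as follows. The rates are taken as inputs already computed according to Equation 19. The function name, the choice to return level 1 when the first rate does not exceed 1, and the fallback when every rate exceeds 1 are assumptions made for illustration.

```python
def jnd_level_from_rates(rates):
    """Return the JND level implied by per-level error occurrence rates.

    rates[k - 1] is the perceptual error occurrence rate at JND level k,
    computed elsewhere. A level is treated as exceeded when its rate is above 1.
    """
    for k, rate in enumerate(rates, start=1):
        if rate <= 1.0:
            return k  # first level that is not exceeded
    # Assumption: if every modeled level is exceeded, report the next level.
    return len(rates) + 1
```

For example, rates of 1.4 at level 1 and 0.8 at level 2 give a JND level of 2, matching the case described in the text.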

Detection of image quality deterioration regions using main edge information

In general, people may react more sensitively to changes in vertical and/or horizontal edges than to changes in diagonal edges (the so-called oblique effect). Therefore, if horizontal and/or vertical edges are prominent in an image or region and the corresponding edge components disappear in the reconstructed image or region, the processing unit 310 may determine that image or region to be a deterioration region.

For example, the processing unit 310 may use an edge operator to obtain the edge magnitude of a block in each of the horizontal and vertical directions, and may determine that a horizontal and/or vertical edge component is present in the block when this value is greater than or equal to a specified reference value. The processing unit 310 may determine that deterioration has occurred in the block when the magnitude of the corresponding horizontal and/or vertical edge in the reconstructed image, which has undergone transform and quantization, has decreased by the reference value or more.
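The edge-based check above can be sketched with a Sobel operator standing in for the unspecified edge operator. The choice of Sobel, the mean absolute response as the edge magnitude, the function names, and both threshold parameters are assumptions for illustration only.

```python
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))   # responds to vertical edges
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))   # responds to horizontal edges

def edge_strength(block, kernel):
    """Mean absolute 3x3 filter response over the interior of a block."""
    h, w = len(block), len(block[0])
    total = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            acc = 0.0
            for dy in range(3):
                for dx in range(3):
                    acc += kernel[dy][dx] * block[y + dy - 1][x + dx - 1]
            total += abs(acc)
    return total / ((h - 2) * (w - 2))

def edge_degraded(orig_block, recon_block, presence_thr, drop_thr):
    """True if a horizontal or vertical edge present in the original block
    has weakened by at least drop_thr in the reconstructed block.
    Both thresholds are hypothetical reference values."""
    for kernel in (SOBEL_X, SOBEL_Y):
        before = edge_strength(orig_block, kernel)
        after = edge_strength(recon_block, kernel)
        if before >= presence_thr and before - after >= drop_thr:
            return True
    return False
```

Only the two axis-aligned kernels are tested, which reflects the stated emphasis on horizontal and vertical edges over diagonal ones.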

FIG. 29 illustrates a process of determining a cognitive deterioration region according to an example.

FIG. 29 shows the process of determining a cognitive deterioration region using information on randomness and BS.

The upper left of FIG. 29 shows the original image.

The upper middle of FIG. 29 shows a randomness map.

The upper right of FIG. 29 shows a BS map generated by performing CU-based encoding.

The lower part of FIG. 29 shows a cognitive deterioration region map.

In determining the cognitive deterioration region, the processing unit 310 may use the information on the cognitively sensitive region and the information on the deterioration region described above. When deterioration has occurred in a cognitively sensitive region, the processing unit 310 may determine that cognitively sensitive region to be a cognitive deterioration region.
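Combining the two maps amounts to keeping the units that are both cognitively sensitive and deteriorated. A minimal sketch, assuming both maps are boolean grids over the same units (the function name and map layout are hypothetical):

```python
def cognitive_deterioration_map(sensitive, degraded):
    """Element-wise AND of two same-sized boolean maps.

    sensitive : 2D list, True where the unit is cognitively sensitive
    degraded  : 2D list, True where image quality deterioration was detected
    """
    return [[bool(s) and bool(d) for s, d in zip(srow, drow)]
            for srow, drow in zip(sensitive, degraded)]
```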

Improvement of perceived image quality

In step 860 described above with reference to FIG. 8, the processing unit 310 may improve the perceived image quality.

The processing unit 310 may improve the image quality of the cognitive deterioration region by using the coding bits saved through the JND suppression described above.

Improvement of perceived image quality through adjustment of the QP value

In an embodiment, the processing unit 310 may use the saved coding bits to adjust the value of the QP.

In encoding an image, the processing unit 310 may improve the perceived image quality of the cognitive deterioration region by assigning it a QP lower than the predetermined QP.

In determining the value of the QP, the processing unit 310 may 1) assign a specified value to the QP, 2) adjust the value of the QP until the JND level of the cognitive deterioration region is lowered, and 3) adjust the value of the QP until the cognitive deterioration disappears. Here, the adjustment may be constrained so that the number of bits added by adjusting the value of the QP is less than or equal to the number of bits saved through JND suppression.
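The budget-constrained QP adjustment can be sketched as a step-down search. The two callbacks stand in for an encoder and a JND level estimator that are not specified here; their names, the one-step decrement, and the `min_qp` bound are hypothetical.

```python
def lower_qp_within_budget(base_qp, saved_bits, extra_bits, jnd_level, min_qp=0):
    """Lower the QP of a cognitive deterioration region step by step.

    extra_bits(qp) : hypothetical callback, additional bits versus base_qp
    jnd_level(qp)  : hypothetical callback, JND level of the region at qp
    Stops when the JND level drops below its starting value, or when one more
    step would cost more bits than were saved through JND suppression.
    """
    target = jnd_level(base_qp)
    qp = base_qp
    while qp > min_qp:
        candidate = qp - 1
        if extra_bits(candidate) > saved_bits:
            break  # added bits must stay at or below the saved bits
        qp = candidate
        if jnd_level(qp) < target:
            break  # the JND level of the region has been lowered
    return qp
```

A usage example with toy callbacks: with `extra_bits = lambda qp: (30 - qp) * 100` and a JND level that drops from 3 to 2 at QP 26, a budget of 1000 bits lets the search reach QP 26, while a budget of 250 bits stops it at QP 28.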

Improvement of perceived image quality through noise removal based on machine learning

Various machine-learning-based noise removal techniques are currently being proposed. The processing unit 310 may apply a machine-learning-based noise removal technique to improve perceived image quality.

In applying a machine-learning-based noise removal technique, the processing unit 310 may use the previously determined cognitive deterioration regions as training data to generate a noise removal neural network optimized for that training data, and may use the noise removal neural network to improve perceived image quality.

In the embodiments described above, the methods are described on the basis of flowcharts as a series of steps or units, but the present invention is not limited to the order of the steps, and some steps may occur in a different order from, or simultaneously with, other steps described above. In addition, those of ordinary skill in the art will understand that the steps shown in a flowchart are not exclusive, that other steps may be included, and that one or more steps of a flowchart may be deleted without affecting the scope of the present invention.

The embodiments according to the present invention described above may be implemented in the form of program instructions that can be executed through various computer components and recorded in a computer-readable recording medium. The computer-readable recording medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded in the computer-readable recording medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the field of computer software.

The computer-readable recording medium may contain information used in the embodiments according to the present invention. For example, the computer-readable recording medium may include a bitstream, and the bitstream may include the information described in the embodiments according to the present invention.

The computer-readable recording medium may include a non-transitory computer-readable medium.

Examples of the computer-readable recording medium include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine language code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware device may be configured to operate as one or more software modules in order to perform the processing according to the present invention, and vice versa.

Although the present invention has been described above with reference to specific matters such as particular components, limited embodiments, and drawings, these are provided only to aid a more general understanding of the present invention. The present invention is not limited to the above embodiments, and those of ordinary skill in the technical field to which the present invention pertains can make various modifications and variations from this description.

Therefore, the spirit of the present invention should not be limited to the embodiments described above, and not only the claims set forth below but also all modifications equal or equivalent to those claims fall within the scope of the spirit of the present invention.

Claims

An image processing method comprising:
performing quantization based on a just noticeable difference (JND) on an input image; and
improving the perceived quality of the input image.