KR20210026015A

KR20210026015A - Fast block split method in video and encoding apparatus

Info

Publication number: KR20210026015A
Application number: KR1020190106205A
Authority: KR
Inventors: 강제원; 박상효
Original assignee: 이화여자대학교 산학협력단
Priority date: 2019-08-29
Filing date: 2019-08-29
Publication date: 2021-03-10

Abstract

A fast block split method for video includes the steps of: evaluating, by an encoder, binary tree (BT) splitting of a specific CU of a frame; and evaluating, by the encoder, ternary tree (TT) splitting of the specific CU. When the rate-distortion (RD) cost of the horizontal BT split is smaller than that of the vertical BT split, the encoder omits the evaluation of the vertical direction in the TT split process; when the RD cost of the horizontal BT split is greater than that of the vertical BT split, the encoder omits the evaluation of the horizontal direction in the TT split process. The method enables high-speed encoding by using the rate-distortion cost of blocks that have overlapping regions.

Description

High-speed segmentation method and encoder device for blocks in video {FAST BLOCK SPLIT METHOD IN VIDEO AND ENCODING APPARATUS}

The technique described below relates to video coding. In particular, it relates to a method of splitting the blocks of an image during encoding.

The Joint Video Experts Team (JVET), a collaborative team of ITU-T SG16 WP3 and ISO/IEC JTC 1/SC 29/WG 11 formed for the next-generation video coding standard (Future Video Coding), is standardizing VVC (Versatile Video Coding), which has more than twice the coding performance of HEVC (High Efficiency Video Coding)/H.265. VVC divides a frame into a plurality of blocks and performs coding in units of the divided blocks. VVC splits blocks using a new tree structure called MTT (Multi-Type Tree).

K. Choi, S.-H. Park, and E. S. Jang, "Coding tree pruning based CU early termination," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T and ISO/IEC, Document JCTVC-F092, Jul. 2019.

To exploit the diversity of MTT, the encoder must check the prediction technique and the transform/quantization technique for each block. As a result, a VVC encoder must test the encoding process for far more cases than an HEVC encoder.

The technique described below aims to omit the test process for specific blocks by using characteristics of the block structure. It aims to provide an encoding technique that shares the rate-distortion cost between blocks whose pixels overlap.

The fast split method for a block in an image includes a step in which an encoder evaluates binary tree (BT) splitting for a specific CU (coding unit) of a frame, and a step in which the encoder evaluates ternary tree (TT) splitting for the specific CU. When the rate-distortion (RD) cost of the horizontal direction in the BT split is smaller than the RD cost of the vertical direction, the encoder omits the evaluation of the vertical direction in the TT split process. When the RD cost of the horizontal direction is greater than the RD cost of the vertical direction, the encoder omits the evaluation of the horizontal direction in the TT split process.

The encoder device that performs fast splitting of a block in an image includes a storage device and a processor. The storage device stores information on a target block in a frame to be split, a binary tree (BT) split evaluation result for the target block, and code for splitting the target block. The processor executes the code to evaluate ternary tree (TT) splitting of the target block. The BT split evaluation result includes the rate-distortion (RD) cost of the horizontal BT split of the target block and the RD cost of the vertical BT split of the target block. Based on the horizontal RD cost and the vertical RD cost, the processor omits the evaluation of either the horizontal or the vertical direction in the TT split evaluation process.

The technique described below enables high-speed encoding by using the rate-distortion cost of blocks with overlapping regions among the blocks split according to MTT in VVC. It allows equipment with limited performance, such as a mobile device, to record VVC video, which has a higher complexity than HEVC, in real time.

FIG. 1 is an example of the multi-type tree (MTT) structure of VVC.
FIG. 2 is an example of the prior probability that the RD cost of TT computed during encoding is not optimal.
FIG. 3 is an example of the posterior probability that the RD cost of BT is valid data for TT in the same direction.
FIG. 4 is an example of a process of determining the split structure of a block in VVC.
FIG. 5 is an example of the configuration of an encoder device that determines the split structure of a block in VVC.

The technique described below may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail. This is not intended to limit the technique to a specific embodiment, and it should be understood to include all changes, equivalents, and substitutes that fall within its spirit and technical scope.

Terms such as first, second, A, and B may be used to describe various components, but the components are not limited by these terms; the terms are used only to distinguish one component from another. For example, without departing from the scope of rights of the technique described below, a first component may be referred to as a second component, and similarly a second component may be referred to as a first component. The term "and/or" includes a combination of a plurality of related listed items or any one of a plurality of related listed items.

As used in this specification, singular expressions should be understood to include plural expressions unless the context clearly indicates otherwise. Terms such as "includes" mean that the stated features, numbers, steps, operations, components, parts, or combinations thereof are present, and should be understood not to exclude the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Before the drawings are described in detail, it should be made clear that the components in this specification are distinguished only by the main function each one is responsible for. That is, two or more components described below may be combined into one component, or one component may be divided into two or more components by more finely subdivided functions. In addition, each component described below may additionally perform some or all of the functions of other components besides its own main function, and some of the main functions of a component may of course be performed exclusively by another component.

In addition, in performing a method or an operating method, the steps that make up the method may occur in an order different from the stated order unless the context clearly specifies a particular order. That is, the steps may occur in the stated order, may be performed substantially simultaneously, or may be performed in the reverse order.

The technique described below is a process of encoding VVC video. VVC video means video coded according to the VVC specification (standard). In the following, an encoder is described as performing the encoding. An encoder is a device that encodes video. A computing device such as a PC, mobile device, smart device, or server may be the encoder.

FIG. 1 is an example of the multi-type tree (MTT) structure of VVC.

MTT is a new tree structure applied to VVC. In addition to the conventional QT (quadtree), MTT uses BT (binary tree) and TT (ternary tree). MTT can split one block using QT, BT, or TT, and it allows one CU (coding unit) to be split into various sub-trees. In particular, BT and TT can exist only below a QT. A BT can have a BT or a TT as a sub-tree, and a TT can have a TT or a BT as a sub-tree. As shown in FIG. 1, BT and TT can split in the horizontal direction and in the vertical direction. Depending on the maximum tree depth and the minimum block size, one block can be split into even more varied shapes. That is, VVC can split blocks in various forms using various tree structures, and it provides improved coding efficiency by using MTT, which offers this variety of tree structures.
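
The split shapes of FIG. 1 can be illustrated with a short Python sketch. The function name `split_block`, the mode labels, and the `(x, y, width, height)` tuple are illustrative assumptions. The 1:2:1 ratio used for TT follows the VVC design rather than anything stated in this text, and block dimensions are assumed to be multiples of 4.

```python
from typing import List, Tuple

Block = Tuple[int, int, int, int]  # (x, y, width, height); hypothetical layout


def split_block(block: Block, mode: str) -> List[Block]:
    """Return the sub-blocks produced by one QT, BT, or TT split."""
    x, y, w, h = block
    if mode == "QT":  # four equal quadrants
        hw, hh = w // 2, h // 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if mode == "BT_HOR":  # horizontal split line: top and bottom halves
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    if mode == "BT_VER":  # vertical split line: left and right halves
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    if mode == "TT_HOR":  # three horizontal stripes in a 1:2:1 ratio
        q = h // 4
        return [(x, y, w, q), (x, y + q, w, 2 * q), (x, y + 3 * q, w, q)]
    if mode == "TT_VER":  # three vertical stripes in a 1:2:1 ratio
        q = w // 4
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    raise ValueError(f"unknown split mode: {mode}")
```

Calling `split_block` again on any returned sub-block models the nesting of BT and TT sub-trees described above.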

When splitting a CU based on MTT, the encoder computes the RD (rate-distortion) cost of every possible split structure. In this way the encoder determines the optimal tree structure for the CU. Basically, the encoder splits a specific CU so that it has the structure with the smallest RD cost. If the encoder determines that splitting the CU does not yield the optimal RD cost, it does not split that CU any further. As noted above, the encoder also takes the maximum tree depth and the minimum block size into account when splitting a CU.

Among the trees of MTT, TT has a relatively high encoding complexity. The technique described below reduces the number of RD cost evaluations for TT, and it does so on the basis of Bayesian probability.

Consider the RD cost computation that the VVC test model (VTM) performs for TT. FIG. 2 is an example of the prior probability that the RD cost of TT computed during encoding is not optimal. FIG. 2 covers the case in which the RD cost computed by VTM for a TT split is not optimal; this case is denoted τ. p(τ) is the probability that the RD cost of TT is not optimal. In other words, p(τ) can be regarded as the probability that the TT evaluation is not needed. In Bayes' theorem, the prior probability p(τ) is the probability that the hypothesis τ is true before any actual evidence (data) is observed.

FIG. 2 shows the results of a test on sample videos (sample sequences). The test used only the first frame of the sample sequence. It was run with two settings of the quantization parameter (QP): QP = 22 and QP = 37.

Referring to FIG. 2, the average prior probability p(τ_hor) for the horizontal TT split is 87.9% when QP = 22 and 89.0% when QP = 37. The subscript 'hor' denotes the horizontal direction. The average prior probability p(τ_ver) for the vertical TT split is 89.6% when QP = 22 and 89.1% when QP = 37. The subscript 'ver' denotes the vertical direction.

Following Bayes' theorem, we look for suitable evidence ε and examine the posterior probability p(τ|ε). Structurally, most of the pixels of TT and BT overlap, so the RD cost of BT can serve as suitable evidence ε for τ. BT shares a large portion of the pixels present in TT. For example, a horizontal TT shares only 25% of the pixels of the parent CU, but it shares 50% of its pixels with a horizontal BT. The RD cost of BT can therefore be used as evidence for the hypothesis.
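
One way to see the overlap between same-direction BT and TT sub-blocks is to compare their row ranges. The sketch below is illustrative only: the helper names are hypothetical, the 1:2:1 TT ratio is taken from the VVC design rather than from this text, and the CU height is assumed to be a multiple of 4.

```python
from typing import List, Tuple

RowRange = Tuple[int, int]  # half-open range of rows [start, end)


def row_overlap(a: RowRange, b: RowRange) -> int:
    """Number of rows shared by two half-open row ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def tt_bt_overlap_fractions(height: int) -> List[List[float]]:
    """For each horizontal-TT sub-block, the fraction of its rows that lies
    inside each horizontal-BT sub-block of the same parent CU."""
    q = height // 4
    tt_parts = [(0, q), (q, 3 * q), (3 * q, height)]     # 1:2:1 stripes
    bt_parts = [(0, height // 2), (height // 2, height)]  # two halves
    return [[row_overlap(t, b) / (t[1] - t[0]) for b in bt_parts]
            for t in tt_parts]
```

Under these assumptions, the centre TT stripe is divided evenly between the two BT halves, while each outer TT stripe lies entirely inside one BT half, so every TT pixel has already been covered by a same-direction BT evaluation.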

The evidence ε represents the situation in which the RD cost of BT in one direction is better (smaller) than in the other direction. In terms of a specific block: when the block is split by BT because the RD cost of BT in one direction is smaller than the RD cost in the other direction, the evidence indicates that the block is BT-split in that direction. Equation 1 defines the evidence ε for a specific direction δ.

δ denotes the horizontal or vertical direction of a BT split. The value of ε(δ) is determined by the RD cost J_BT of BT. ~δ denotes the direction other than δ. According to Equation 1, ε(δ) is 1 when J_BT(δ) < J_BT(~δ) and 0 otherwise. p(ε) is the probability that ε is 1.
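
Equation 1 itself is not reproduced in this text. From the definition just given it can be reconstructed as the following indicator function; the layout is an assumption, and only the condition and the two values come from the description.

```latex
\varepsilon(\delta) =
\begin{cases}
1, & \text{if } J_{BT}(\delta) < J_{BT}({\sim}\delta) \\
0, & \text{otherwise}
\end{cases}
\qquad \delta \in \{\mathrm{hor}, \mathrm{ver}\}
```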

FIG. 3 is an example of the posterior probability that the RD cost of BT is valid data for TT in the same direction. The posterior probability p(τ|ε) is the probability that the hypothesis τ is true after the evidence ε has been observed. Referring to FIG. 3, the posterior probability averages 95% for the horizontal direction and 94% for the vertical direction, so the posterior probability is higher than the prior probability. The results of FIG. 2 and FIG. 3 indicate that, in the TT evaluation process, it is sufficiently effective for the encoder to use the RD cost of BT in the same direction instead of computing a separate RD cost for TT.
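
The prior and posterior probabilities discussed above can be estimated empirically by counting. The following is a minimal sketch, assuming each tested CU is logged as a pair of booleans `(tau, eps)`: `tau` is true when the hypothesis τ held for that CU and `eps` is true when the evidence ε was observed. The function name and the logging format are hypothetical and are not taken from VTM.

```python
from typing import Iterable, Tuple


def prior_and_posterior(samples: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Estimate p(tau) and p(tau | eps) from per-CU observations."""
    observed = list(samples)
    if not observed:
        raise ValueError("no samples")
    # Prior: fraction of all CUs for which the hypothesis held.
    prior = sum(1 for tau, _ in observed if tau) / len(observed)
    # Posterior: same fraction, restricted to CUs where the evidence was seen.
    tau_given_eps = [tau for tau, eps in observed if eps]
    if not tau_given_eps:
        raise ValueError("evidence was never observed")
    posterior = sum(tau_given_eps) / len(tau_given_eps)
    return prior, posterior
```

A posterior that is clearly higher than the prior, as reported for FIG. 2 and FIG. 3, is what justifies using the BT result as a skip criterion.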

FIG. 4 is an example of a process (100) of determining the split structure of a block in VVC. Based on the pixel overlap between BT and TT, FIG. 4 uses the RD cost results of BT for the evaluation of TT. That is, FIG. 4 is a technique that lowers encoding complexity by terminating the TT evaluation early.

FIG. 4 is an example of the split process for a specific block. In VVC terms, FIG. 4 is an example of the split process for a specific CU. The block to be split or evaluated is called the target block, and the CU to be split or evaluated is called the target CU. The description below is given in terms of CUs.

In general, the encoder first performs intra prediction on the entire target CU (110). That is, the encoder evaluates the case in which the target CU has no child nodes. This prediction mode may be called the current block mode. The encoder can compute the RD cost J_cur of the current block.

The encoder may evaluate the QT split of the target CU (120). The encoder can compute the RD cost J_QT of the QT split of the target CU. The order of the QT split evaluation may differ from FIG. 4. For example, the encoder may evaluate the QT split after the BT split.

The encoder evaluates the BT split of the target CU (130). The encoder can compute the RD cost of the BT split of the target CU. A BT split has two directions (horizontal and vertical). The encoder can compute the RD cost J_BT_HOR of the horizontal BT split and the RD cost J_BT_VER of the vertical BT split. The encoder completes the BT evaluation before the TT evaluation. The encoder may store BT split evaluation information, which may include J_BT_HOR and J_BT_VER.

The encoder evaluates the TT split of the target CU (140). In this process the encoder can decide, based on the evidence ε(δ) described above, whether to perform the TT evaluation.

The encoder may first determine, based on the BT split evaluation information for the target CU, whether a TT split evaluation of the target CU is needed. The encoder compares J_BT_HOR and J_BT_VER (141).

If J_BT_HOR < J_BT_VER (YES at 141), the encoder can judge that TT for the target CU will show a similar tendency. In this case the encoder may omit the evaluation (test) of the vertical split and evaluate only the horizontal TT split (142). The encoder computes the RD cost J_TT_HOR of the horizontal TT split.

When J_BT_HOR < J_BT_VER does not hold (NO at 141), it is usually the case that J_BT_HOR > J_BT_VER. Here too the encoder can judge that TT for the target CU will show a similar tendency. In this case the encoder may omit the evaluation (test) of the horizontal split and evaluate only the vertical TT split (143). The encoder computes the RD cost J_TT_VER of the vertical TT split.

Finally, the encoder determines the split information for the target CU based on the computed RD cost information. For example, the encoder may compare the RD costs computed for the various split modes and split the target CU into the structure with the optimal (smallest) RD cost. The split information indicates the optimal split structure for the target CU.
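
Steps 110 to 143 and the final decision can be sketched as follows. This is a minimal illustration under stated assumptions, not the VTM implementation: `rd_cost` is a hypothetical callback that returns the RD cost of one split mode for the target CU, and the mode labels are invented for the example. As in FIG. 4, a tie between the two BT costs falls into the NO branch.

```python
from typing import Callable, Dict, Tuple


def choose_partition(rd_cost: Callable[[str], float]) -> Tuple[str, Dict[str, float]]:
    """Evaluate split modes for one target CU, skipping one TT direction."""
    costs: Dict[str, float] = {}
    costs["NO_SPLIT"] = rd_cost("NO_SPLIT")  # step 110: J_cur
    costs["QT"] = rd_cost("QT")              # step 120: J_QT
    costs["BT_HOR"] = rd_cost("BT_HOR")      # step 130: J_BT_HOR
    costs["BT_VER"] = rd_cost("BT_VER")      # step 130: J_BT_VER

    # Step 141: the BT comparison decides which TT direction is tested.
    if costs["BT_HOR"] < costs["BT_VER"]:
        costs["TT_HOR"] = rd_cost("TT_HOR")  # step 142: vertical TT skipped
    else:
        costs["TT_VER"] = rd_cost("TT_VER")  # step 143: horizontal TT skipped

    # Final decision: the evaluated mode with the smallest RD cost.
    best_mode = min(costs, key=costs.get)
    return best_mode, costs
```

Only five of the six candidate modes are evaluated per CU. The skipped TT direction is never costed, so the saving in computation is traded against the small chance, quantified by the posterior probability, that the skipped direction would have been the best.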

For each child CU of a split CU, the encoder can again determine whether to split and which split structure to use.

FIG. 5 is an example of the configuration of an encoder device (200) that determines the split structure of a block in VVC. For convenience of explanation, FIG. 5 shows only the parts of the encoder device used for block (or CU) splitting. That is, the encoder device (200) may be an internal component of an encoder that performs the complete encoding.

The encoder device (200) splits the blocks of a frame during encoding. The encoder device (200) may be implemented as an analog circuit or a digital circuit. FIG. 5 shows, as an example, a device in which code for block splitting is embedded, that is, an encoder device (200) in the form of a chip with embedded code or a program for block splitting. The encoder device (200) can perform block splitting on video that conforms to the VVC standard.

The encoder device (200) includes a storage device (210), a processor (220), and an interface device (230).

The storage device (210) stores information required for block splitting. It may store this information temporarily, in which case it may be a device such as a flash memory. It may also store code or a program for block splitting, in which case it may be a device such as a ROM. Accordingly, although FIG. 5 shows the storage device (210) as a single block, it may be divided into a memory that stores temporary information and a storage medium that stores the block-splitting code.

The storage device (210) can store various kinds of information required for block splitting. It can receive and store information such as frame information and block information. FIG. 5 shows it receiving information on a target block.

Furthermore, the storage device (210) can store intermediate results computed during block splitting. For example, it can store the RD costs J_cur, J_BT, J_TT, and so on. J_BT includes the RD cost J_BT_HOR of the horizontal split and the RD cost J_BT_VER of the vertical split.

The interface device (230) is the component that carries data and commands inside the encoder. It may include a physical device and a communication protocol for internal communication. It may consist of lines that carry information, components that control the transfer of information (such as switches), components that process information (such as adders), and the like.

The processor (220) is the component that processes given data or information. It is a kind of computing device. The processor (220) can perform block splitting on video using the code or program stored in the storage device (210).

The processor (220) performs block splitting for a specific block or CU. It can perform intra prediction on the entire target CU and compute the RD cost J_cur of the current block.

The processor (220) can evaluate the QT split of a specific block or CU. It can determine whether a QT split is possible, and it can compute the RD cost J_QT of the QT split.

The processor (220) evaluates the BT split of a specific block or CU. It can compute the RD cost J_BT_HOR of the horizontal BT split and the RD cost J_BT_VER of the vertical BT split. The storage device (210) can store BT split evaluation information that includes J_BT_HOR and J_BT_VER. FIG. 5 shows an example in which the BT split evaluation information computed by the processor (220) is stored.

The processor (220) evaluates the TT split of a specific block or CU. Based on the BT split evaluation information, it can determine whether a TT split evaluation of the block or CU is needed.

The processor (220) can compare J_BT_HOR and J_BT_VER. (1) If J_BT_HOR < J_BT_VER (YES at 141), the processor (220) can judge that TT for the block or CU will show a similar tendency. In this case the processor (220) may omit the evaluation (test) of the vertical split and evaluate only the horizontal TT split. The processor (220) computes the RD cost J_TT_HOR of the horizontal TT split. (2) When J_BT_HOR < J_BT_VER does not hold, it is usually the case that J_BT_HOR > J_BT_VER. In this case too the processor (220) can judge that TT for the block or CU will show a similar tendency. The processor (220) may omit the evaluation (test) of the horizontal split and evaluate only the vertical TT split. The processor (220) computes the RD cost J_TT_VER of the vertical TT split.
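
The division of labour between the storage device (210) and the processor (220) can be sketched in Python. The class names, the block key, and the method names below are hypothetical and chosen only for illustration; the patent does not prescribe a data layout.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

BlockKey = Tuple[int, int, int, int]  # (x, y, width, height); assumed key


@dataclass
class BTSplitResult:
    """BT split evaluation information for one target block."""
    j_bt_hor: float  # RD cost of the horizontal BT split
    j_bt_ver: float  # RD cost of the vertical BT split


@dataclass
class PartitionStore:
    """Stands in for the storage device: keeps intermediate BT results."""
    bt_results: Dict[BlockKey, BTSplitResult] = field(default_factory=dict)

    def save_bt(self, block: BlockKey, j_bt_hor: float, j_bt_ver: float) -> None:
        self.bt_results[block] = BTSplitResult(j_bt_hor, j_bt_ver)

    def tt_direction_to_evaluate(self, block: BlockKey) -> Optional[str]:
        """Return the single TT direction to test, or None if BT has not
        been evaluated for this block yet."""
        result = self.bt_results.get(block)
        if result is None:
            return None
        # Same comparison as step 141; the other direction is skipped.
        return "TT_HOR" if result.j_bt_hor < result.j_bt_ver else "TT_VER"
```

Keeping the BT costs keyed by block lets the TT stage read them back without recomputing anything, which matches the requirement that the BT evaluation finishes before the TT evaluation starts.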

The processor 220 determines split information for the specific block or CU based on the calculated RD cost information. The split information indicates the optimal split structure for the specific block or CU. The storage device 210 may store the split information finally determined for the specific block or CU.
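One way to read this final decision is as a minimum-RD-cost selection over the candidates that were actually evaluated. The sketch below assumes that reading; the function name `select_best_split` and the candidate labels are hypothetical, and the cost values are placeholders rather than measured results.

```python
from typing import Dict


def select_best_split(rd_costs: Dict[str, float]) -> str:
    """Return the label of the evaluated candidate with the lowest RD cost.

    Candidates that were skipped (for example, the omitted TT direction)
    are simply absent from rd_costs.
    """
    if not rd_costs:
        raise ValueError("no split candidate was evaluated")
    return min(rd_costs, key=rd_costs.get)


# Placeholder costs: vertical TT was skipped, so it has no entry.
candidates = {
    "NO_SPLIT": 150.0,
    "QT": 130.0,
    "BT_HOR": 120.0,
    "BT_VER": 140.0,
    "TT_HOR": 125.0,
}
best = select_best_split(candidates)
```

Because skipped candidates are left out of the dictionary rather than given a sentinel cost, the selection never has to distinguish "not evaluated" from "very expensive".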

In addition, the block split method or encoding method described above may be implemented as a program (or application) including an executable algorithm that can be run on a computer. The program may be stored in and provided on a non-transitory computer readable medium.

A non-transitory readable medium is a medium that stores data semi-permanently and can be read by a device, as opposed to a medium that stores data only briefly, such as a register, a cache, or a memory. Specifically, the various applications or programs described above may be stored in and provided on a non-transitory readable medium such as a CD, DVD, hard disk, Blu-ray disc, USB device, memory card, or ROM.

The present embodiments and the drawings attached to this specification merely illustrate part of the technical idea included in the technology described above. It is evident that all modifications and specific embodiments that those skilled in the art can readily infer within the scope of the technical idea contained in the specification and drawings fall within the scope of rights of the technology described above.

Claims

A fast block split method in an image, in which an encoder splits a coding unit (CU) of the image, the method comprising:
performing, by the encoder, an evaluation of binary tree (BT) splitting for a specific CU of a frame; and
performing, by the encoder, an evaluation of ternary tree (TT) splitting for the specific CU,
wherein the encoder omits the vertical-direction evaluation in the TT splitting process when the horizontal-direction rate-distortion (RD) cost in the BT splitting is smaller than the vertical-direction RD cost, and
wherein the encoder omits the horizontal-direction evaluation in the TT splitting process when the horizontal-direction RD cost is greater than the vertical-direction RD cost.

A fast block split method in an image, the method comprising:
performing, by an encoder, an intra prediction evaluation on the whole of a target block to be split;
performing, by the encoder, a quadtree (QT) split evaluation on the target block;
performing, by the encoder, a binary tree (BT) split evaluation on the target block;
performing, by the encoder, a ternary tree (TT) split evaluation on the target block; and
determining, by the encoder, a split structure for the target block based on the results of the split evaluations for the target block,
wherein the encoder omits the horizontal-direction or vertical-direction evaluation in the TT split evaluation process based on the horizontal-direction rate-distortion (RD) cost and the vertical-direction RD cost of the BT split for the target block.

The method of claim 2,
wherein the encoder splits the target block of a Versatile Video Coding (VVC) image.

The method of claim 2,
wherein, when the horizontal-direction RD cost is smaller than the vertical-direction RD cost in the BT split, the encoder omits the vertical-direction evaluation in the TT split process and calculates only the RD cost for the horizontal direction.

The method of claim 2,
wherein, when the horizontal-direction RD cost is greater than the vertical-direction RD cost, the encoder omits the horizontal-direction evaluation in the TT split process and calculates only the RD cost for the vertical direction.

An encoder device for performing fast block splitting in an image, the encoder device comprising:
a storage device that stores information on a target block to be split in a frame, a binary tree (BT) split evaluation result for the target block, and code for splitting the target block; and
a processor that executes the code to evaluate ternary tree (TT) splitting of the target block,
wherein the BT split evaluation result includes a horizontal-direction rate-distortion (RD) cost of the BT split for the target block and a vertical-direction RD cost of the BT split for the target block, and
wherein the processor omits the horizontal-direction or vertical-direction evaluation in the TT split evaluation process based on the horizontal-direction RD cost and the vertical-direction RD cost.

The encoder device of claim 6,
wherein the processor splits the target block of a Versatile Video Coding (VVC) image.

The encoder device of claim 6,
wherein, when the horizontal-direction RD cost is smaller than the vertical-direction RD cost in the BT split, the processor omits the vertical-direction evaluation in the TT split process and calculates only the RD cost for the horizontal direction.

The encoder device of claim 6,
wherein, when the horizontal-direction RD cost is greater than the vertical-direction RD cost, the processor omits the horizontal-direction evaluation in the TT split process and calculates only the RD cost for the vertical direction.

The encoder device of claim 6,
wherein the processor performs an intra prediction evaluation on the whole of the target block and performs a quadtree (QT) split evaluation on the target block, and
wherein the storage device stores the intra prediction result and the QT split evaluation result.

The encoder device of claim 10,
wherein the processor determines a split structure for the target block based on the intra prediction result, the QT split evaluation result, the BT split evaluation result, and the TT split evaluation result.